Rubryx – a blend of experience and knowledge
Download Rubryx 2.1, the new version of our text classification program
Rubryx Short Manual
Introduction
Rubryx is a program for pattern-based classification of web sites. It classifies large volumes of specialized text and generates web catalogs, electronic libraries, and reference systems using expert-supplied information and full-text analysis.
System requirements:
How to work with the program:
Main window. Use the "Add" and "Delete" buttons to make a list of classes.
Main window. Select a class and double-click it to open it. The "Selection of class patterns" dialog will appear. Use the "Add" button to list a few documents (4-6) that fully represent the class, then press OK. The program automatically generates the vocabulary from the selected patterns; this can take a few minutes. Do the same for each class.
The index ranges from 1 to 100 and is determined empirically. For its initial value, consult the Statistics button.
Practical hints
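As a rough illustration of how a threshold index between 1 and 100 can decide class membership, here is a minimal Python sketch. It is not Rubryx's actual algorithm: the manual does not describe how documents are scored, so the function `assign_classes`, the class names, and the scores below are all hypothetical. The sketch only assumes that a document joins a class when its score for that class reaches the threshold.

```python
def assign_classes(scores, k):
    """Return the classes whose score reaches threshold k.

    `scores` maps a class name to a hypothetical 1-100 similarity score;
    how Rubryx really scores documents is not described in the manual.
    """
    if not 1 <= k <= 100:
        raise ValueError("k must be between 1 and 100")
    return sorted(name for name, score in scores.items() if score >= k)


# Made-up scores for a single document against three classes.
doc_scores = {"physics": 72, "chemistry": 35, "history": 8}

for k in (5, 30, 90):
    print(k, assign_classes(doc_scores, k))
```

With these made-up numbers, a very low threshold admits the document into every class, while a very high one rejects it everywhere, which is why a few trial runs are usually needed to settle on a value.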
The aim of the program is to classify documents as efficiently as possible.
To do this well, the classes and the threshold value of index K must be
chosen carefully. Select the classes so that their overlap is minimal and
the bulk of the documents is covered. Choose index K so that unrelated
documents are not included in a class (which happens when K is too small)
and suitable documents are not rejected (which happens when K is too large).
Several preliminary classification runs may be required.
How to create a new dictionary
To tune the program to a new domain, you need to create a special dictionary. The dictionary is stored in three text files.
Where to buy the program
Rubryx is a shareware product.
Rubryx Community
KSU
Copyright © 2001-2006. All rights reserved