Relevance Feature Discovery for Text Mining
Our Price
₹3,500.00
10000 in stock
Description
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to use and update the discovered patterns effectively remains an open research issue, especially in text mining. Because most existing text mining methods adopt term-based approaches, they suffer from the problems of polysemy and synonymy. This system proposes an innovative technique for finding and classifying low-level terms based on both their appearances in higher-level features and their specificity in a training set. The input training documents are split into individual words. Each word in the training set is compared with the input query, the frequency for each query is computed, and the documents are divided into two classes on that basis. The support and specificity values are then calculated for each class. Next, the FClustering algorithm is employed to divide the input data into three classes: the maximum and minimum specificity of each class are identified, and the three categories are formed on that basis. Performance is measured in terms of weightage, the top 20 documents, F-beta values, mean average precision, breakpoint, and interpolated average precision.
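The steps described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the function names, the specificity formula (relevant documents containing a term minus non-relevant documents containing it, divided by the number of relevant documents), and the `low`/`high` thresholds are all assumptions made for illustration, and a fixed-threshold split stands in for the FClustering algorithm, which is not reproduced here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_by_query(docs, query, min_hits=1):
    """Divide documents into two classes by how often query terms occur.

    min_hits is a hypothetical cut-off, not a value from the description.
    """
    query_terms = set(tokenize(query))
    relevant, non_relevant = [], []
    for doc in docs:
        counts = Counter(tokenize(doc))
        hits = sum(counts[t] for t in query_terms)
        (relevant if hits >= min_hits else non_relevant).append(doc)
    return relevant, non_relevant


def document_frequency(docs):
    """Support of each term: the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df


def specificity(relevant, non_relevant):
    """Assumed specificity score for every term seen in either class."""
    pos_df = document_frequency(relevant)
    neg_df = document_frequency(non_relevant)
    n = max(len(relevant), 1)  # avoid division by zero
    terms = set(pos_df) | set(neg_df)
    return {t: (pos_df[t] - neg_df[t]) / n for t in terms}


def classify_terms(spec, low=-0.25, high=0.25):
    """Assign terms to three categories using hypothetical thresholds."""
    groups = {"positive": set(), "general": set(), "negative": set()}
    for term, score in spec.items():
        if score > high:
            groups["positive"].add(term)
        elif score < low:
            groups["negative"].add(term)
        else:
            groups["general"].add(term)
    return groups


if __name__ == "__main__":
    docs = [
        "text mining finds patterns in text",
        "text mining results for documents",
        "football match results and scores",
        "stock market prices today",
    ]
    relevant, non_relevant = split_by_query(docs, "text mining")
    spec = specificity(relevant, non_relevant)
    for name, terms in classify_terms(spec).items():
        print(name, sorted(terms))
```

In this toy corpus, a term that occurs only in the query-matching documents scores high, a term shared evenly by both classes scores near zero, and a term found only in the other class scores negative. A real system would derive the category boundaries from the observed minimum and maximum specificity rather than from fixed constants.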