Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics
Our Price
₹3,000.00
10000 in stock
Support
Ready to Ship
Description
Our objective is to overcome the limitations of high-dimensional sequence data by introducing a statistical metric for selecting the most discriminative, and therefore most informative, features from a protein sequence. The proposed technique selects the most significant features, which leads to improved classification results on sequences from various superfamilies obtained from publicly available benchmark datasets. The technique is similar to alignment-free sequence classification methods and has significant advantages over existing classification methods. Moreover, it is simple, fast, reliable, and robust, and it requires very little training time. The methodology begins with the selection of training data and then proceeds to the sequence encoding and feature selection modules. After the statistically significant features have been selected, different classifiers are trained on them. After successful training, the system is tested on unknown data and the classification performance of the proposed technique is evaluated.

The high dimensionality of sequence data makes sequence classification a more challenging task than classification on feature vectors. In this paper, we present a brief review of the existing work on sequence classification. We summarize sequence classification in terms of methodologies and application domains. We also review several extensions of the sequence classification problem, such as early classification on sequences and semi-supervised learning on sequences.
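The sequence encoding and feature selection steps described above can be illustrated with a minimal Java sketch. The description does not name the encoding or the statistical metric, so everything here is an assumption made for illustration: sequences are encoded as overlapping k-mer counts (a common alignment-free encoding), features are ranked with a chi-square score on k-mer presence against a binary superfamily label, and the class and method names (`KmerFeatureSketch`, `kmerCounts`, `chiSquare`, `selectTop`, `encode`) are hypothetical. The toy sequences in `main` are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: k-mer encoding plus chi-square ranking are
// assumed stand-ins for the unspecified encoding and statistical metric.
class KmerFeatureSketch {

    // Count the overlapping k-mers (length-k substrings) of one sequence.
    static Map<String, Integer> kmerCounts(String seq, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Chi-square score of a presence/absence feature against a binary label.
    // a = positives containing the k-mer, b = negatives containing it,
    // c = positives lacking it,          d = negatives lacking it.
    static double chiSquare(int a, int b, int c, int d) {
        double n = a + b + c + d;
        double denom = (double) (a + b) * (c + d) * (a + c) * (b + d);
        if (denom == 0) return 0.0; // feature or label is constant
        double diff = (double) a * d - (double) b * c;
        return n * diff * diff / denom;
    }

    // Rank every observed k-mer by chi-square and keep the top m.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> selectTop(List<String> seqs, List<Boolean> labels, int k, int m) {
        Map<String, int[]> present = new HashMap<>(); // k-mer -> {posCount, negCount}
        int pos = 0, neg = 0;
        for (int i = 0; i < seqs.size(); i++) {
            boolean isPos = labels.get(i);
            if (isPos) pos++; else neg++;
            for (String kmer : kmerCounts(seqs.get(i), k).keySet()) {
                present.computeIfAbsent(kmer, x -> new int[2])[isPos ? 0 : 1]++;
            }
        }
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, int[]> e : present.entrySet()) {
            int a = e.getValue()[0], b = e.getValue()[1];
            score.put(e.getKey(), chiSquare(a, b, pos - a, neg - b));
        }
        List<String> ranked = new ArrayList<>(score.keySet());
        ranked.sort((x, y) -> {
            int c = Double.compare(score.get(y), score.get(x)); // descending score
            return c != 0 ? c : x.compareTo(y);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(m, ranked.size())));
    }

    // Encode a sequence as counts of the selected k-mers only.
    static double[] encode(String seq, List<String> features, int k) {
        Map<String, Integer> counts = kmerCounts(seq, k);
        double[] v = new double[features.size()];
        for (int j = 0; j < v.length; j++) {
            v[j] = counts.getOrDefault(features.get(j), 0);
        }
        return v;
    }

    public static void main(String[] args) {
        // Made-up toy data: two "positive" and two "negative" sequences.
        List<String> seqs = List.of("AAK", "AAR", "GGK", "GGR");
        List<Boolean> labels = List.of(true, true, false, false);
        List<String> top = selectTop(seqs, labels, 2, 2);
        System.out.println(top);
        System.out.println(java.util.Arrays.toString(encode("AAGG", top, 2)));
    }
}
```

The reduced vectors returned by `encode` could then be passed to any classifier for training and testing. Restricting the encoding to the selected k-mers is what keeps the dimensionality, and with it the training time, small.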
Tags: 2014, Biomedical Projects, Java