EVOLVING DIVERSE ENSEMBLES USING GENETIC PROGRAMMING FOR CLASSIFICATION WITH UNBALANCED DATA
ABSTRACT
In classification, machine learning algorithms can suffer a performance bias when data sets are unbalanced. A data set is unbalanced when at least one class (the minority class) is represented by only a small number of training examples, while the other class makes up the majority. In this scenario, classifiers can have good accuracy on the majority class but very poor accuracy on the minority class. This paper proposes a multiobjective genetic programming (MOGP) approach to evolving accurate and diverse ensembles of genetic program classifiers with good performance on both the minority and majority classes. The evolved ensembles comprise the nondominated solutions in the population, whose individual members vote on class membership. The MOGP approach uses the minority and majority class accuracies as competing objectives in the learning process. Our first goal is to compare two popular Pareto-based fitness schemes in the MOGP algorithm, namely SPEA2 and NSGAII, across a number of classification tasks with unbalanced data. Recent work has shown that while NSGAII can be effective in evolving a good set of nondominated solutions in some tasks, its performance needs to be improved for difficult classification problems. We hypothesize that SPEA2 can evolve better-performing classifiers on these tasks, as this strategy is known to better exploit the middle region of the Pareto frontier, whereas NSGAII tends to reward exploration at the end regions. This highlights the importance of developing an effective fitness evaluation strategy in the underlying MOGP algorithm to evolve good ensemble members.
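The two competing objectives and the voting ensemble described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the hard-coded prediction lists stand in for evolved genetic program classifiers, the function names (`class_accuracies`, `dominates`, `non_dominated`, `majority_vote`) are hypothetical, and neither SPEA2 nor NSGAII fitness assignment is reproduced. It only shows how minority and majority class accuracy can define Pareto dominance, and how the nondominated members might vote.

```python
from typing import List, Sequence, Tuple


def class_accuracies(y_true: Sequence[int], y_pred: Sequence[int],
                     minority_label: int = 1) -> Tuple[float, float]:
    """Return (minority-class accuracy, majority-class accuracy)."""
    min_total = sum(1 for t in y_true if t == minority_label)
    maj_total = len(y_true) - min_total
    min_hits = sum(1 for t, p in zip(y_true, y_pred)
                   if t == minority_label and p == t)
    maj_hits = sum(1 for t, p in zip(y_true, y_pred)
                   if t != minority_label and p == t)
    return (min_hits / min_total if min_total else 0.0,
            maj_hits / maj_total if maj_total else 0.0)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b if it is no worse on both objectives and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated(objectives: List[Tuple[float, float]]) -> List[int]:
    """Indices of solutions that no other solution dominates."""
    return [i for i, a in enumerate(objectives)
            if not any(dominates(b, a)
                       for j, b in enumerate(objectives) if j != i)]


def majority_vote(predictions: List[Sequence[int]]) -> List[int]:
    """Binary vote: predict 1 only if more than half the members predict 1."""
    n = len(predictions)
    return [1 if 2 * sum(column) > n else 0 for column in zip(*predictions)]


# Toy unbalanced data (assumed): label 1 is the minority class (2 of 8).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical outputs of four classifiers on the data above.
population = [
    [1, 1, 1, 1, 0, 0, 0, 0],  # strong on minority, weaker on majority
    [0, 0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [1, 0, 0, 0, 0, 0, 0, 1],  # a trade-off between the two
    [0, 0, 1, 0, 0, 0, 0, 0],  # dominated by the second classifier
]

objectives = [class_accuracies(y_true, preds) for preds in population]
front = non_dominated(objectives)
ensemble = majority_vote([population[i] for i in front])
ensemble_objectives = class_accuracies(y_true, ensemble)
```

In this toy case the fourth classifier is excluded because the second one matches it on the minority class and beats it on the majority class. A simple unweighted vote is only one possible way to combine the members.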