Online ISSN: 2515-8260

Authors: A. SHANMUGAPRIYA, Dr. N. Tajunisha


STACKING ENSEMBLE LEARNING AND FEATURE SELECTION METHODS FOR DATA CLASSIFICATION

Dr. N. Tajunisha, A. SHANMUGAPRIYA

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 3, Pages 4970-4994

In data mining, classification is a major task that many modern applications have adopted. Recently, the Cascaded Fuzzy Relevance Vector Machine (FRVM) was developed to classify datasets. However, a single classifier generally does not achieve accuracy as high as that of multiple classifiers combined. To address this issue, this work introduces ensemble learning for the classification of samples. Ensemble learning combines various classifiers, such as the Enhanced Adaptive Neuro Fuzzy Inference System (EANFIS) and the Modified Convolutional Neural Network (MCNN), into a novel classifier that performs better than any constituent classifier. The work consists of four major steps. First, the data samples are processed in parallel for classification. Second, feature selection is performed using filter-based functions: the chi-squared filter, Euclidean distance, Pearson correlation coefficient, Correlation Based Feature Selection (CFS), Fast Correlation Based Filter (FCBF), and Information Gain (IG). Third, outliers are removed by the Fuzzy C-Means (FCM) clustering algorithm. Finally, in the classification model, the virtual pair is selected automatically using the Ant Colony Optimization (ACO) method, and Stacking Ensemble Learning (SEL) is then developed for classification to reduce the error rate and improve accuracy. SEL is a technique in which two classifiers are trained on a single training dataset; the training set is further divided using k-fold validation to form the resultant model. The performance of various classifiers is compared on two benchmark datasets, namely Wisconsin Diagnostic Breast Cancer (WDBC) and PD600.
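
The filter-then-stack idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: two deliberately simple base learners (a nearest-centroid rule and a 1-nearest-neighbour rule) stand in for EANFIS and MCNN, only the Pearson correlation filter is shown, the FCM outlier removal and ACO steps are omitted, and the data is a small synthetic set rather than WDBC or PD600. All function and class names are hypothetical.

```python
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def pearson(xs, ys):
    """Pearson correlation coefficient (0.0 if either input is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return 0.0 if var_x == 0 or var_y == 0 else cov / (var_x * var_y) ** 0.5

def select_features(X, y, top_k):
    """Filter step: indices of the top_k features ranked by |r| with the label."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:top_k])

class NearestCentroid:
    """Assumed stand-in base learner: predicts the class of the closest centroid."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self
    def predict(self, X):
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]

class OneNearestNeighbour:
    """Assumed stand-in base learner: predicts the label of the closest sample."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))]
                for x in X]

def fit_stacking(X, y, base_classes, meta_class, k=5):
    """Train base learners with k-fold out-of-fold predictions, then a meta-learner."""
    n = len(X)
    meta_X = [[None] * len(base_classes) for _ in range(n)]
    for fold in (list(range(f, n, k)) for f in range(k)):  # interleaved folds
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for b, cls in enumerate(base_classes):
            preds = cls().fit(X_tr, y_tr).predict([X[i] for i in fold])
            for i, p in zip(fold, preds):
                meta_X[i][b] = p  # prediction made without seeing sample i
    bases = [cls().fit(X, y) for cls in base_classes]  # refit on all data
    meta = meta_class().fit(meta_X, y)
    return bases, meta, meta_X

def predict_stacking(bases, meta, X):
    """Feed the base learners' predictions to the meta-learner."""
    columns = [base.predict(X) for base in bases]
    return meta.predict([list(row) for row in zip(*columns)])

# Synthetic data: features 0 and 1 track the label, feature 2 does not.
X, y = [], []
for i in range(20):
    label, offset = i % 2, (i // 2) * 0.1
    X.append([label * 10 + offset, label * 8 - offset, (i // 2) % 3])
    y.append(label)

kept = select_features(X, y, top_k=2)
X_sel = [[row[j] for j in kept] for row in X]
bases, meta, meta_X = fit_stacking(
    X_sel, y, [NearestCentroid, OneNearestNeighbour], NearestCentroid, k=5)
predictions = predict_stacking(bases, meta, X_sel)
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print("kept features:", kept, "training accuracy:", accuracy)
```

The point of the k-fold split is that the meta-learner is fitted on out-of-fold predictions, so it never sees a base-learner output for a sample that base learner was trained on. Using a nearest-centroid rule as the meta-learner is likewise an assumption made for brevity.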