An in Depth Analysis of Machine Learning Classifiers for Prediction of Student’s Performance
European Journal of Molecular & Clinical Medicine,
2020, Volume 7, Issue 8, Pages 2811-2825
Abstract
Machine learning algorithms are sensitive to the nature and dimensionality of the data fed into the model. Their performance can differ significantly depending on the dataset used for training and analysis, which makes it difficult to identify the best algorithm for a particular dataset. In the current work, we evaluate 24 state-of-the-art supervised machine learning algorithms to find the most suitable classifier for predicting the performance of students in our University. Of these 24 algorithms, we found Naïve Bayes (NB) and the Stabilized Nearest Neighbor Classifier (SNN) to be the most suitable for deployment, followed by K-Nearest Neighbors (KNN) and Cost-Sensitive C5.0 (C5.0Cost). We also determined that handling missing values using KNN improves the classification of minority classes. The classifiers were evaluated using the sensitivity, specificity, precision, kappa, and F-score metrics. We further establish that the performance metric “Accuracy” is misleading when dealing with an imbalanced dataset, and that balanced accuracy provides more reliable information about the model being developed.
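The contrast between accuracy and balanced accuracy on an imbalanced dataset can be illustrated with a minimal sketch. The labels below are hypothetical and are not taken from the study's data; the helper functions `accuracy` and `balanced_accuracy` are written here for illustration only and assume balanced accuracy is defined as the mean of per-class recall (for two classes, the average of sensitivity and specificity).

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # Fraction of all predictions that match the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall, so each class counts equally
    # regardless of how many samples it has.
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced data: 90 "pass" students, 10 "fail" students.
y_true = ["pass"] * 90 + ["fail"] * 10
# A trivial classifier that always predicts the majority class.
y_pred = ["pass"] * 100

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy = {acc:.2f}, balanced accuracy = {bal_acc:.2f}")
```

In this sketch the majority-class predictor scores high on accuracy while never identifying a single "fail" student; balanced accuracy exposes this because the recall of the minority class is zero.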