Online ISSN: 2515-8260

Optimized Feed Forward Neural Network For Classification Of Diabetes In Big Data Environment


N. V. Poornima, Dr. B. Srinivasan, Dr. P. Prabhusundhar

Abstract

Diabetes can become a life-threatening disease if it is not treated at an early stage. In women especially, the chance of developing diabetes is higher than in men because of hormonal changes during pregnancy. As a result, they may suffer from long-term diabetes as well as other diseases brought on by stress and daily chores. This can be prevented if the disease is diagnosed at an early stage. Confirming diabetes usually requires a trained doctor, manual work, and thorough domain knowledge. Several research works avoid this problem by using machine learning algorithms for classification, but those algorithms work effectively only on smaller datasets with a small number of attributes. To overcome this shortcoming, this paper proposes an optimization-based classifier that can process larger datasets, such as big data. The existing approach applies an improved K-means and logistic regression algorithm. It improves the classification rate, but the number of features used for training and the computational time are high. In the proposed approach, the data is first pre-processed to remove missing and non-available values. K-means clustering is then applied to remove outliers and to reduce the training time of the prediction process. Diabetes is predicted with a feed forward neural network. The inputs to the network are the dominant attributes of the dataset, selected by cuckoo search optimization, with an objective function that minimizes the misclassification rate of the classifier. This approach improves accuracy compared with the existing technique. The whole process is implemented in the MATLAB R2018a environment and evaluated in terms of accuracy, precision, recall, F-measure, and Matthews correlation coefficient. The approach outperforms the existing techniques, with an F-measure of 96.7%.
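
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal Python sketch of how they are commonly computed for a two-class problem from confusion-matrix counts. It is not the authors' MATLAB code; the function name `binary_metrics` and the example counts are assumptions made purely for illustration, and the misclassification rate is included because the abstract names it as the optimization objective.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts. The function name is hypothetical, chosen for
    this illustration only.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    misclassification_rate = (fp + fn) / total
    # Guard against empty denominators (e.g. no predicted positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "misclassification_rate": misclassification_rate,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Made-up counts for illustration only; these are not results from the paper.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Under this reading, a feature-selection search such as cuckoo search would score each candidate attribute subset by the `misclassification_rate` of a classifier trained on it and keep the subset with the lowest value.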
