Evolutionary computation technique combined with ensemble model for classification of diabetes

Machine learning (ML) is a rapidly advancing technology with a wide variety of applications, such as medical diagnosis, stock market trading, and email spam and malware filtering. ML algorithms train a computer to learn from past data and make predictions on unseen samples. This research focuses on predicting diabetes using the PIMA Indian diabetes dataset, taken from the UCI machine learning repository. The work was carried out in three stages. First, the AdaBoost technique was applied to all the features of the PIMA Indian diabetes dataset. Second, a correlation technique was applied for feature selection, and the selected features were trained and tested with AdaBoost. Third, a novel Hybrid Genetic Algorithm (HGA) was designed and developed for feature selection, and the selected features were trained and tested with AdaBoost. Although correlation identifies feature subsets based on statistical relevance, it fails to provide an optimal feature subset. The proposed HGA overcomes this drawback by selecting an optimal feature subset that can improve the performance of the AdaBoost model. A comparison of correlation and HGA was performed. HGA with AdaBoost outperformed both correlation with AdaBoost and AdaBoost alone in terms of accuracy. The proposed methods were also applied to other datasets, such as the Wisconsin breast cancer diagnostic and Cleveland heart disease datasets, to show their broader applicability. The HGA with AdaBoost outperformed other reported techniques for the PIMA Indian diabetes
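The three stages described in the abstract can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation. The data are synthetic stand-ins for the PIMA dataset. The correlation step is assumed to be a Pearson feature-to-label filter with an arbitrary threshold. A plain genetic algorithm (tournament selection, one-point crossover, bit-flip mutation, elitism) stands in for the HGA, whose details are not given here. Fitness is assumed to be the hold-out accuracy of a small stump-based AdaBoost. All function names and parameter values are hypothetical.

```python
import math
import random

def make_data(n, rng):
    """Synthetic stand-in data (assumption): features 0-1 carry signal, 2-5 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        X.append([label + rng.gauss(0, 1), 0.8 * label + rng.gauss(0, 1)]
                 + [rng.gauss(0, 1) for _ in range(4)])
        y.append(label)
    return X, y

def correlation_select(X, y, threshold=0.2):
    """Assumed filter: keep features with |Pearson r| to the label above threshold."""
    n, mask = len(y), []
    my = sum(y) / n
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    for f in range(len(X[0])):
        col = [row[f] for row in X]
        mx = sum(col) / n
        sx = math.sqrt(sum((a - mx) ** 2 for a in col))
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        r = cov / (sx * sy) if sx and sy else 0.0
        mask.append(1 if abs(r) > threshold else 0)
    return mask

def stump_predict(row, stump):
    f, t, pol = stump
    return pol if row[f] >= t else -pol

def adaboost_fit(X, y, features, rounds=5):
    """Discrete AdaBoost over decision stumps restricted to the given feature indices."""
    n = len(X)
    w, model = [1.0 / n] * n, []
    for _ in range(rounds):
        best = None
        for f in features:
            values = sorted(row[f] for row in X)
            for t in values[::max(1, n // 8)]:  # a handful of candidate thresholds
                for pol in (1, -1):
                    err = sum(wi for row, yi, wi in zip(X, y, w)
                              if stump_predict(row, (f, t, pol)) != yi)
                    if best is None or err < best[0]:
                        best = (err, (f, t, pol))
        err, stump = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(row, stump))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def accuracy(model, X, y):
    votes = [sum(a * stump_predict(row, s) for a, s in model) for row in X]
    return sum((1 if v >= 0 else -1) == yi for v, yi in zip(votes, y)) / len(y)

def fitness(mask, train, valid):
    """Hold-out accuracy of AdaBoost trained on the features switched on in mask."""
    features = [i for i, bit in enumerate(mask) if bit]
    if not features:
        return 0.0
    return accuracy(adaboost_fit(train[0], train[1], features), valid[0], valid[1])

def ga_select(train, valid, n_features, rng, pop_size=8, generations=6, p_mut=0.1):
    """Plain GA over bit masks (a stand-in, not the paper's HGA)."""
    cache = {}
    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, train, valid)
        return cache[key]
    pop = [[1] * n_features] + [[rng.randint(0, 1) for _ in range(n_features)]
                                for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randint(1, n_features - 1)     # one-point crossover
            child = [1 - b if rng.random() < p_mut else b for b in p1[:cut] + p2[cut:]]
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return best, score(best)

rng = random.Random(0)
X, y = make_data(120, rng)
train, valid = (X[:80], y[:80]), (X[80:], y[80:])
corr_mask = correlation_select(train[0], train[1])
ga_mask, ga_score = ga_select(train, valid, 6, rng)
print("all features:", fitness([1] * 6, train, valid))
print("correlation :", corr_mask, fitness(corr_mask, train, valid))
print("plain GA    :", ga_mask, ga_score)
```

Because the initial population is seeded with the all-features mask and the best mask is carried over in every generation, the GA's score in this sketch cannot fall below the all-features baseline on the same hold-out split. That guarantee applies only to the split used for fitness; a real study would estimate fitness with cross-validation and report accuracy on a separate test set.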
