Predictive Modeling For Classification Of Breast Cancer Data Set Using Feature Selection Techniques


S. Leena Nesamani, S. Nirmala Sugirtha Rajini, Ibeth Catherine Figueroa Sánchez, María Del Pilar Melgarejo Figueroa, Digna Amabilia Manrique De Lara Suárez, Oscar Felipe Carnero Fuentes

Abstract

Predictive modeling, or predictive analysis, is the process of predicting an outcome from data using machine learning models. The quality of the output depends predominantly on the quality of the data provided to the model. Selecting the best inputs for a machine learning model depends on a variety of criteria and is referred to as feature engineering. This work classifies breast cancer patients into either the recurrence or the non-recurrence category. A categorical breast cancer dataset is used, from which the best set of features is selected to make accurate predictions. Two feature selection techniques, the chi-squared technique and the mutual information technique, were applied. The selected features were then used by a logistic regression model to make the final prediction. Of the two, the mutual information technique proved more efficient and produced higher prediction accuracy.
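
The two scoring criteria named in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions of the chi-squared statistic and of mutual information for categorical variables, computed from a contingency table with the Python standard library only. It is not the authors' implementation: the column names (`informative`, `noisy`), the toy records, and the helper `select_top_k` are hypothetical and exist purely for illustration, and library implementations may define or normalize the scores differently.

```python
import math
from collections import Counter


def chi_squared_score(feature, target):
    """Pearson chi-squared statistic of the feature/target contingency table."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    f_counts = Counter(feature)
    t_counts = Counter(target)
    stat = 0.0
    for f, fc in f_counts.items():
        for t, tc in t_counts.items():
            expected = fc * tc / n          # count expected under independence
            observed = joint.get((f, t), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_information_score(feature, target):
    """Mutual information (in nats) between two discrete variables."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    f_counts = Counter(feature)
    t_counts = Counter(target)
    mi = 0.0
    for (f, t), c in joint.items():
        # p(f, t) * log( p(f, t) / (p(f) * p(t)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (f_counts[f] * t_counts[t]))
    return mi


def select_top_k(columns, target, score_fn, k):
    """Hypothetical helper: rank columns by score and keep the k best."""
    scores = {name: score_fn(values, target) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy, made-up records (NOT taken from the dataset used in the article).
target = ["recurrence"] * 4 + ["no-recurrence"] * 4
columns = {
    "informative": ["yes"] * 4 + ["no"] * 4,   # perfectly aligned with target
    "noisy": ["a", "b"] * 4,                   # independent of target
}

chi_selected, chi_scores = select_top_k(columns, target, chi_squared_score, k=1)
mi_selected, mi_scores = select_top_k(columns, target, mutual_information_score, k=1)
print("chi-squared:", chi_scores, "->", chi_selected)
print("mutual information:", mi_scores, "->", mi_selected)
```

On this toy data both criteria keep the `informative` column and give the `noisy` column a score of zero. On real data the two rankings can differ, which is the kind of difference the comparison in the abstract concerns. In a full pipeline, the selected columns would then be encoded numerically and passed to a classifier such as logistic regression.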
