Keywords: Breast cancer, SVM, C4.5, Naïve Bayes, WDBC
Breast cancer is one of the most common cancers among women and the second leading cause of death in women. Estimates of breast cancer risk in Africa indicate that 1 in 28 women develops breast cancer during her lifetime. The lifetime risk is higher in urban areas, at 1 in 22, than in rural areas, where it is considerably lower at 1 in 60. The aim of this study is to investigate the performance of different classification techniques on the Wisconsin breast cancer dataset (WDBC) from the UCI Machine Learning Repository. We compare three classification techniques: the C4.5 decision tree, Naïve Bayes (NB), and the Support Vector Machine (SVM). All learners were evaluated with both 10-fold and 20-fold cross-validation. The best results were achieved with 20-fold cross-validation, where NB reached an accuracy of 97.5%, SVM 97.2%, and C4.5 94.8%.
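The evaluation protocol described above (the same learner scored under 10-fold and 20-fold cross-validation) can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state which software was used, so the Gaussian Naïve Bayes class below is a hand-written stand-in, the data are synthetic two-feature points rather than WDBC records, and all names (`k_fold_indices`, `cross_val_accuracy`, `make_toy_data`) are hypothetical. Accuracy is pooled over all held-out predictions, which is one of several reasonable conventions.

```python
import math
import random
from statistics import mean, pvariance


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


class GaussianNB:
    """Hand-written Gaussian Naive Bayes (illustrative stand-in, not the paper's tool)."""

    def fit(self, X, y):
        self.priors, self.stats = {}, {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.priors[label] = len(rows) / len(X)
            # Per-feature mean and variance; the small floor avoids division by zero.
            self.stats[label] = [(mean(col), pvariance(col) + 1e-9)
                                 for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def log_posterior(label):
            total = math.log(self.priors[label])
            for value, (mu, var) in zip(x, self.stats[label]):
                total += -0.5 * math.log(2 * math.pi * var)
                total -= (value - mu) ** 2 / (2 * var)
            return total
        # Pick the class with the largest unnormalised log posterior.
        return max(self.priors, key=log_posterior)


def cross_val_accuracy(make_model, X, y, k, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool accuracy over all folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        correct += sum(model.predict_one(X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def make_toy_data(n_per_class=100, seed=1):
    """Synthetic two-class data (assumption: NOT the WDBC dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for k in (10, 20):
        print(f"{k}-fold accuracy: {cross_val_accuracy(GaussianNB, X, y, k):.3f}")
```

Any classifier exposing `fit` and `predict_one` could be passed in place of `GaussianNB`, so the same loop would serve for a decision tree or an SVM. Increasing `k` from 10 to 20 gives each model slightly more training data per fold at the cost of more training runs.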