Breast Cancer Detection Analysis Using Different Machine Learning Techniques: South Iraq Case Study

Published Feb 26, 2025
Salma Abdulbaki Mahmood, Myssar Jabbar Hammood Al-Battbootti, Saad Shaheen Hamadi, Iuliana Marin, Costin-Anton Boiangiu, Nicolae Goga

Abstract

Contemporary oncology has seen a growing interest in digital technologies, whose integration with extensive healthcare and clinical data has raised new expectations for managing patient profiles and organizing treatment plans. Among the most commonly used of these technologies are Machine Learning (ML) methods, which can perform tasks such as prediction, classification, and description quickly and with high precision, based on large volumes of previously stored data. This study aims to develop a predictive ML model for the early prediction of breast cancer based on a set of medically categorized risk factors. The locally collected database contained 415 instances from Al-Sadr Teaching Hospital in Basrah, Iraq, of which 219 (53%) were breast cancer patients and 196 (47%) were controls (non-patients). Seven machine learning methods were trained: Decision Tree (DT), Random Forest (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Multinomial Naïve Bayes (NB), and Gaussian NB. The dataset was cleaned and balanced before being used. The Decision Tree model performed best, with 96% accuracy, 96% sensitivity, and 96% specificity, followed by the Random Forest model with 94% accuracy, 100% sensitivity, and 87% specificity, and the SVM model with 92% accuracy, 96% sensitivity, and 87% specificity. The other models gave diverging results. The study concludes that modern technologies should be employed to raise awareness of diseases and help control them, and that adopting Electronic Health Records (EHR), which integrate clinical data of different types recorded for patients over time, would contribute to building accurate and reliable prediction models.
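The three metrics reported in the abstract (accuracy, sensitivity, and specificity) can all be derived from the counts in a binary confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function name `evaluate`, the `Metrics` container, the label encoding (1 = patient, 0 = control), and the toy labels are all assumptions made for illustration and do not come from the study's data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = breast cancer patient, 0 = control.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return Metrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=tp / (tp + fn),  # true positive rate
        specificity=tn / (tn + fp),  # true negative rate
    )


if __name__ == "__main__":
    # Toy labels only: every patient is detected, but two controls are flagged.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
    print(evaluate(y_true, y_pred))
```

The toy case shows how a model can reach perfect sensitivity while its specificity stays lower, the same pattern the abstract reports for the Random Forest model.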

Keywords

Breast Cancer Prediction, Machine Learning, Comparative Study, Naïve Bayes, K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forests, Logistic Regression

Section
Technical Briefs