Multi-Layer Perceptron For Diagnosing Stroke With The SMOTE Method In Overcoming Data Imbalances

M. Hafidz Ariansyah, Sri Winarno, Esmi Nur Fitri, Helynda Mulya Arga Retha

Abstract

Stroke is the sudden loss of brain function caused by an interruption of the blood supply to the brain. It is a dangerous disease that can be fatal. A stroke must be diagnosed quickly and accurately to increase the likelihood that the patient can return to a normal life. Several factors influence whether a patient is diagnosed with stroke, ranging from hypertension to heart disease. This study therefore builds a classification model for stroke diagnosis so that patients can be treated earlier and avoid prolonged illness. The data used is a stroke dataset with 4861 records labeled 0, indicating no stroke, and 249 records labeled 1, indicating a stroke diagnosis. The Synthetic Minority Over-sampling Technique (SMOTE) is applied together with the Multi-Layer Perceptron algorithm, and the performance of the resulting stroke diagnosis classification model is measured. SMOTE is used to balance the data given to the classification model, so that the Multi-Layer Perceptron predicts more accurately and avoids overfitting, and so that its accuracy in predicting stroke is better than that of an ordinary Multi-Layer Perceptron alone. The results of the confusion matrix analysis show that SMOTE can increase the prediction of stroke diagnosis from 12.5% to 84.89% in the optimal test.
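The abstract does not give implementation details, so the following is only a minimal sketch of the standard SMOTE interpolation step, not the authors' code. The function name `smote`, the neighbour count `k`, the seed, and the toy data points are all assumptions made for illustration; only the class counts (4861 and 249) come from the abstract. Training the Multi-Layer Perceptron on the balanced data is not shown.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Class counts reported in the abstract.
n_no_stroke, n_stroke = 4861, 249
# Synthetic stroke records needed for a 1:1 balance (assumed target ratio).
n_needed = n_no_stroke - n_stroke

# Toy, made-up minority points with two features, purely for illustration.
toy_minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],
                [1.2, 2.5], [1.8, 1.6], [2.1, 1.9]]
toy_synthetic = smote(toy_minority, n_new=10, k=3)

print(n_needed, len(toy_synthetic))  # 4612 10
```

With the counts in the abstract, a 1:1 balance would need 4612 synthetic stroke records. A common precaution, which the abstract neither confirms nor rules out, is to oversample only the training split so that evaluation still runs on real records.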

Full Text:

PDF (1-8)
