Cyber Threat Detection Using an Ensemble Model Approach for Phishing Website Identification

Dani Rofianto, Egi Safitri, Khusnatul Amaliah, Jaka Fitra, Astria Hijriani

Abstract

The growth of digital technology has affected many aspects of life and has been accompanied by a rise in cybersecurity threats, especially phishing attacks. Phishing is a form of cyber fraud in which an attacker poses as a trusted entity to manipulate victims into disclosing sensitive information. This research develops and evaluates several machine learning algorithms for detecting phishing websites. Random Forest, Extra Trees, Multilayer Perceptron, AdaBoost, and Decision Tree classifiers are applied to a website dataset containing the characteristics of phishing and non-phishing sites. Each algorithm is evaluated by its accuracy, precision, recall, and F1 score. In addition, a voting technique combines the predictions of the best-performing algorithms with the aim of improving overall detection accuracy. The results show that the voting technique outperforms any single algorithm, with significant improvements in accuracy and recall. These findings reinforce the importance of ensemble approaches in machine learning for phishing detection, which in turn contributes to improved cybersecurity.
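
The voting step and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes hard (majority) voting over binary labels, which the abstract does not specify, and it uses invented toy predictions in place of the paper's dataset and trained classifiers. The names `majority_vote`, `binary_metrics`, `model_a`, `model_b`, and `model_c` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard (majority) voting."""
    combined = []
    for votes in zip(*predictions_per_model):
        # Pick the most frequent label among the models for this sample.
        # With an odd number of binary voters, ties cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data only (assumption): 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 1, 1, 0]  # stand-in for one base classifier
model_b = [1, 0, 1, 0, 0, 0, 1, 1]  # stand-in for a second classifier
model_c = [0, 1, 1, 0, 1, 0, 1, 0]  # stand-in for a third classifier

ensemble = majority_vote([model_a, model_b, model_c])

for name, pred in [("model_a", model_a), ("model_b", model_b),
                   ("model_c", model_c), ("voting", ensemble)]:
    print(name, binary_metrics(y_true, pred))
```

The toy predictions are chosen so that each stand-in model errs on different samples, which is the situation in which majority voting helps most. They say nothing about the results reported in the paper.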

Full Text:

PDF (81-89)
