Comparison of Machine Learning Algorithms in Detecting Contaminants in Drinkable Water

Souhayla Elmeftahi, Maulana Decky Rakhman, Alam Rahmatulloh

Abstract

Water is a vital natural resource, and access to it is a fundamental human right, indispensable for a dignified life. Its quality, however, is often compromised by harmful substances, minerals, and contaminants originating from the industrial, agricultural, residential, and energy sectors. Traditional assessment methods such as WQI and STORET rely on manual inspection and are time-consuming. Machine learning offers a way to assess water quality more quickly.

Numerous studies have addressed this problem with various algorithms, but the abundance of existing methods makes a definitive comparison elusive. This research therefore evaluates seven algorithms to determine the best approach for water quality classification, using evaluation metric values as benchmarks. Random Forest is the most effective, with an accuracy of approximately 84.8%, followed by XGBoost at 82.9% and CatBoost at 80.2%. Next are Decision Tree at 77.3%, SVM at 72.3%, and K-NN at 70.6%, while AdaBoost has the lowest accuracy at 63.33%. This comparison offers guidance for choosing a classifier in water quality assessment.
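As a rough illustration of how classifiers can be ranked by accuracy, the following minimal sketch scores several sets of predictions against the same ground-truth labels and sorts them from best to worst. The labels, the model names (`model_a`, `model_b`, `model_c`), and the helper functions are hypothetical and invented for this example; they are not the dataset, models, or code used in the study.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_by_accuracy(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Score every model on the same labels and sort by accuracy, best first."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy labels (1 = potable, 0 = not potable). Invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from three placeholder models.
predictions = {
    "model_a": [1, 0, 1, 1, 0, 0, 1, 1],
    "model_b": [1, 1, 1, 0, 0, 0, 1, 1],
    "model_c": [0, 0, 1, 1, 0, 1, 1, 0],
}

ranking = rank_by_accuracy(y_true, predictions)
for name, score in ranking:
    print(f"{name}: {score:.1%}")
```

In practice each entry in `predictions` would come from a trained classifier evaluated on a held-out test set, so that all models are compared on identical data.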
