Comparative Analysis of Distance Functions on DBSCAN Algorithm: Mapping Malnourished Toddlers in Medan City, Indonesia

Authors

  • Ichwanul Muslim Karo Karo, Medan State University
  • Mohd Farhan Bin Md. Fudzee, Universiti Tun Hussein Onn Malaysia
  • Shahreen Binti Kasim, Universiti Tun Hussein Onn Malaysia
  • Azizul Azhar Ramli, Universiti Tun Hussein Onn Malaysia
  • Jemal H. Abawajy, Deakin University
  • Mohammad Syafwan Arshad, MZR Global Sdn Bhd, Shah Alam, Malaysia

Keywords:

Malnutrition toddler, DBSCAN, distance functions, silhouette index, Minpts

Abstract

Medan City, one of Indonesia's largest cities, faces fundamental challenges in addressing toddler malnutrition; its stunting prevalence was 19.9% in 2022. Such a high rate calls for a practical approach to identifying and managing high-risk areas. This study maps the districts of Medan City from spatial data on public health center locations and toddler malnutrition data, using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. DBSCAN is a popular clustering algorithm because it groups data by density and flags outliers as noise. However, the Euclidean distance function used in classical DBSCAN may not be appropriate for all geospatial cases. The novelty of this study lies in comparing five distance functions (Euclidean, Manhattan, Minkowski, Cosine, and Chebyshev) within DBSCAN to determine which produces the most meaningful clustering in a geospatial health context. The results show that DBSCAN with the Chebyshev distance function cannot effectively map toddler malnutrition, as indicated by a Silhouette index (SI) value below 0.25. The Minkowski and Cosine distance functions do not improve clustering quality over classical DBSCAN; all three produce weak, unclear cluster structures. The most effective mapping comes from the Manhattan distance function, which yields an SI value of 0.51045 and two clusters, with optimal parameters of Minpts = 6–9 and ε = 6.98–7.8. The first cluster comprises two districts (Medan Labuhan and Marelan), while the remaining districts form the second cluster. The comparison provides new insight into how the choice of distance measure influences DBSCAN clustering quality in a geospatial context. The grouping of similar districts is expected to inform decision-making on toddler malnutrition in Medan City.
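
The comparison described in the abstract can be illustrated with a minimal sketch in pure Python. It is not the authors' implementation and does not use the study's data: the coordinates, `eps`, and `min_pts` values are hypothetical toy inputs, the Minkowski order p = 3 is an assumption (the abstract does not state it), and excluding noise points from the silhouette average is a simplifying choice.

```python
import math

# Five distance functions named in the abstract.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):  # p = 3 is an assumption, not taken from the paper
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):  # cosine distance = 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def dbscan(points, eps, min_pts, dist):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # expand only from core points
                queue.extend(neighbors[j])
    return labels

def silhouette(points, labels, dist):
    """Mean silhouette over non-noise points; None if fewer than 2 clusters."""
    idx = [i for i, lab in enumerate(labels) if lab != -1]
    clusters = sorted({labels[i] for i in idx})
    if len(clusters) < 2:
        return None
    scores = []
    for i in idx:
        means = {}
        for c in clusters:
            members = [j for j in idx if labels[j] == c and j != i]
            if members:
                means[c] = sum(dist(points[i], points[j]) for j in members) / len(members)
        own = labels[i]
        if own not in means:  # singleton cluster
            scores.append(0.0)
            continue
        a = means[own]
        b = min(v for c, v in means.items() if c != own)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy coordinates: two compact, well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
metrics = {"euclidean": euclidean, "manhattan": manhattan,
           "minkowski": minkowski, "cosine": cosine, "chebyshev": chebyshev}

results = {}
for name, dist in metrics.items():
    labels = dbscan(points, eps=1.5, min_pts=3, dist=dist)
    results[name] = (labels, silhouette(points, labels, dist))
    print(name, results[name])
```

On this toy input the cosine distance places every point within `eps` of every other, so a single cluster forms and no silhouette value is defined. That reflects a general point rather than a finding of the study: `eps` lives on the scale of the chosen distance function and has to be tuned separately for each one.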

Published

30-06-2025

Section

Articles

How to Cite

Karo Karo, I. M., Bin Md. Fudzee, M. F., Binti Kasim, S., Ramli, A. A., Abawajy, J. H., & Arshad, M. S. (2025). Comparative Analysis of Distance Functions on DBSCAN Algorithm: Mapping Malnourished Toddlers in Medan City, Indonesia. Journal of Soft Computing and Data Mining, 6(1), 262-277. https://penerbit.uthm.edu.my/ojs/index.php/jscdm/article/view/19140