Optimization of k-means clustering using particle swarm optimization algorithm for human development index

Authors

  • Ufil Hidayatul Laili Universitas Islam Negeri Maulana Malik Ibrahim
  • Muhammad Faisal Universitas Islam Negeri Maulana Malik Ibrahim
  • Fachrul Kurniawan Universitas Islam Negeri Maulana Malik Ibrahim

DOI:

https://doi.org/10.31763/businta.v8i1.678

Keywords:

human development index, k-means clustering, particle swarm optimization

Abstract

The K-Means algorithm can be used to cluster Human Development Index (HDI) data for East Java. The hope is that human development will help resolve the problems that exist in the community, including poverty, unemployment, school dropouts, poor health, and social inequality. However, K-Means has a weakness: it is sensitive to the choice of initial centroids. Randomly chosen initial centroids reduce accuracy, often leave the algorithm stuck in a local optimum, and produce inconsistent solutions. Optimization algorithms such as particle swarm optimization (PSO) can overcome this by determining optimal initial centroids. The quality of the clusters produced by K-Means with and without PSO is measured using the average Silhouette Coefficient (SC). In this study, PSO-based K-Means achieved better accuracy than pure K-Means: pure K-Means obtained a value of 0.27%, while PSO-based K-Means obtained 0.34%. The HDI data set, covering the East Java region, was obtained from the official website of the Central Bureau of Statistics and used as secondary data in this study. In addition to supporting program planning for the following year, the clustering carried out for 2019 to 2022 also serves as an evaluation of the East Java Provincial Government's program targets implemented in those years, especially those related to the human quality-of-life development program.
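The pipeline described in the abstract (PSO chooses the initial centroids, K-Means refines them, and the average Silhouette Coefficient scores the result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The fitness function (sum of squared errors), the PSO parameters `w`, `c1`, `c2`, the swarm size, and all function names are assumptions chosen for illustration, and the data are synthetic blobs rather than the HDI data set.

```python
import numpy as np

def nearest(X, centroids):
    """Return (labels, squared distance to the nearest centroid) per point."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1)

def sse(X, centroids):
    """Sum of squared errors; assumed here as the PSO fitness to minimize."""
    return float(nearest(X, centroids)[1].sum())

def kmeans(X, centroids, n_iter=100):
    """Lloyd's K-Means iterations starting from the given centroids."""
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        labels = nearest(X, centroids)[0]
        updated = centroids.copy()
        for j in range(len(centroids)):
            if np.any(labels == j):  # an empty cluster keeps its old centroid
                updated[j] = X[labels == j].mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest(X, centroids)[0], centroids

def pso_initial_centroids(X, k, n_particles=20, n_iter=50,
                          w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; each particle is one candidate set of k centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    pos = X[rng.integers(0, len(X), size=(n_particles, k))]  # (P, k, d)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([sse(X, p) for p in pos])
    g = int(pbest_cost.argmin())
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # inertia + pull toward personal best + pull toward global best
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([sse(X, p) for p in pos])
        better = cost < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], cost[better]
        g = int(pbest_cost.argmin())
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest

def mean_silhouette(X, labels):
    """Average of s(i) = (b - a) / max(a, b) over all points."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # a singleton cluster scores 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        s[i] = (b - a) / denom if denom > 0 else 0.0
    return float(s.mean())

# Synthetic stand-in data (NOT the HDI data): three Gaussian blobs in 2-D.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((40, 2)) for c in centers])
k = 3

random_init = X[rng.choice(len(X), size=k, replace=False)]
labels_plain, _ = kmeans(X, random_init)
labels_pso, _ = kmeans(X, pso_initial_centroids(X, k))
print("plain K-Means SC:     ", mean_silhouette(X, labels_plain))
print("PSO-seeded K-Means SC:", mean_silhouette(X, labels_pso))
```

On real data, the hand-written `kmeans` and `mean_silhouette` could be swapped for scikit-learn's `KMeans` (passing the PSO result as the `init` array) and `silhouette_score`. The sketch keeps everything in NumPy only so that each step stays visible.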

Published

2024-05-07

How to Cite

Laili, U. H., Faisal, M., & Kurniawan, F. (2024). Optimization of k-means clustering using particle swarm optimization algorithm for human development index. Bulletin of Social Informatics Theory and Application, 8(1), 144–151. https://doi.org/10.31763/businta.v8i1.678