Handling Unbalanced Data with SMOTE Algorithm for Unemployment Classification in Lima Puluh Kota Regency Using CART Method

Authors

  • Aldwi Riandhoko, Universitas Negeri Padang
  • Nonong Amalita, Universitas Negeri Padang
  • Dodi Vionanda, Universitas Negeri Padang
  • Admi Salma, Universitas Negeri Padang

DOI:

https://doi.org/10.29244/ijsa.v8i2p166-177

Keywords:

AUC, CART, Lima Puluh Kota Regency, SMOTE, Unemployment

Abstract

Unemployment is a labor force problem, and high unemployment is attributed to low workforce skills. One region of West Sumatera that still faces an unemployment problem is Lima Puluh Kota Regency, where unemployment stems from human resources that lack the competence to meet labor market requirements. According to the August 2023 Sakernas survey, the employed labor force in Lima Puluh Kota Regency outnumbers the unemployed labor force, which produces unbalanced data. The Synthetic Minority Oversampling Technique (SMOTE) addresses unbalanced data by adding synthetic observations to the minority class until the class proportions are balanced. Handling the imbalance is necessary to improve the performance of the classification model. Classification and Regression Trees (CART) is a decision tree method that can identify the characteristics of each class. This research compares the CART model before and after applying SMOTE, using the Area Under Curve (AUC) as the performance measure, with a higher AUC indicating the better model. The AUC of the CART model is 62.1% before SMOTE is applied and 70.2% after SMOTE is applied. It can therefore be concluded that CART classification performs better after SMOTE is applied than before.
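The oversampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, the toy data, and the parameters `k` and `seed` are assumptions made for illustration, and the sketch assumes purely numeric features compared by Euclidean distance. Each synthetic observation is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 12 majority points and 4 minority points.
majority = [(float(i), float(i % 3)) for i in range(12)]
minority = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5), (2.5, 7.0)]

# Add enough synthetic minority points to match the majority class size.
new_points = smote(minority, len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two existing minority points, it always falls inside the range already covered by the minority class. Survey data with categorical predictors would need a variant that handles non-numeric features, which this sketch does not attempt.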

Author Biographies

Aldwi Riandhoko, Universitas Negeri Padang

Department of Statistics, Universitas Negeri Padang, Indonesia

Nonong Amalita, Universitas Negeri Padang

Department of Statistics, Universitas Negeri Padang, Indonesia

Dodi Vionanda, Universitas Negeri Padang

Department of Statistics, Universitas Negeri Padang, Indonesia

Admi Salma, Universitas Negeri Padang

Department of Statistics, Universitas Negeri Padang, Indonesia

Published

31-12-2024

How to Cite

Riandhoko, A., Amalita, N., Vionanda, D., & Salma, A. (2024). Handling Unbalanced Data with SMOTE Algorithm for Unemployment Classification in Lima Puluh Kota Regency Using CART Method. Indonesian Journal of Statistics and Its Applications, 8(2), 166–177. https://doi.org/10.29244/ijsa.v8i2p166-177

Issue

Section

Articles