Effectiveness of SMOTE-ENN to Reduce Complexity in Classification Model

Authors

  • Ines Riantika Student
  • Bagus Sartono IPB University
  • Khairil Anwar Notodiputro IPB University

DOI:

https://doi.org/10.29244/ijsa.v8i1p70-82

Keywords:

complexity measures, class imbalance, random forest, SMOTE-ENN

Abstract

Poor performance of a classification model may stem from characteristics of the dataset, such as overlap between classes and class imbalance. The higher the data complexity, the harder it is for an algorithm to find a good model, and the combination of class imbalance and overlap makes the problem more challenging still. To address this, this research applied a hybrid class-balancing technique named SMOTE-ENN. The technique first adds synthetic observations to the minority class to balance the class frequencies, then removes some observations to reduce the degree of overlap. The results show that SMOTE-ENN succeeds in both respects. We evaluated it with a random forest classifier: in 28 of the 46 cases investigated, the datasets generated by SMOTE-ENN produced models with higher accuracy.
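
The two stages described in the abstract, oversampling the minority class and then removing observations in overlapping regions, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`k_nearest`, `smote`, `enn`), the toy two-dimensional data, the neighbour counts (k=5 for SMOTE, k=3 for ENN), the binary-class setting, and the majority-vote removal rule are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def k_nearest(point, candidates, k):
    """Indices of the k candidates closest to `point` (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def smote(X, y, minority, k=5, rng=None):
    """Add synthetic minority points until the two classes are the same size.

    Assumes a binary problem with at least two minority points.
    """
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(y) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_new, 0)):
        base = rng.choice(minority_pts)
        # The closest candidate is the base point itself (distance 0), so drop it.
        neighbours = k_nearest(base, minority_pts, k + 1)[1:]
        mate = minority_pts[rng.choice(neighbours)]
        gap = rng.random()  # position on the segment between base and mate
        X_new.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
        y_new.append(minority)
    return X_new, y_new


def enn(X, y, k=3):
    """Drop each point whose label differs from the majority of its k neighbours."""
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if j != i]
        nearest = k_nearest(x, [X[j] for j in others], k)
        votes = Counter(y[others[j]] for j in nearest)
        if votes.most_common(1)[0][0] == label:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (an assumption): 90 majority and 10 minority points that overlap.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
X += [(rng.gauss(1.5, 1), rng.gauss(1.5, 1)) for _ in range(10)]
y = [0] * 90 + [1] * 10

X_bal, y_bal = smote(X, y, minority=1, rng=rng)  # stage 1: balance the classes
X_res, y_res = enn(X_bal, y_bal)                 # stage 2: clean overlapping points

print("original:  ", Counter(y))
print("after SMOTE:", Counter(y_bal))
print("after ENN:  ", Counter(y_res))
```

In practice a library implementation, such as `SMOTEENN` in the imbalanced-learn package, would normally be preferred; the sketch only makes the two stages explicit.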

Published

11-06-2024

How to Cite

Riantika, I., Sartono, B., & Anwar Notodiputro, K. (2024). Effectiveness of SMOTE-ENN to Reduce Complexity in Classification Model. Indonesian Journal of Statistics and Its Applications, 8(1), 70–82. https://doi.org/10.29244/ijsa.v8i1p70-82

Section

Articles
