Xplore: Journal of Statistics

Comparison of the K-Means, K-Medoids, and Hierarchical Methods for Clustering Districts/Cities in North Sumatra Based on Types of Violence against Women

Ardelle Albani — 2023-12-31

Violence is an act committed intentionally by a person or group with the aim of oppressing someone and making them suffer. A high rate of depression leads to acts of violence against those nearby, including women. This study compares the clustering of districts/cities in North Sumatra by type of violence against women using the K-Means, K-Medoids, and hierarchical methods, and examines the regional characteristics of the resulting clusters. The data are the numbers of victims of violence against women in 2021, taken from Sistem Informasi Online Perlindungan Perempuan dan Anak and Badan Pusat Statistik Sumatera, and cover 33 districts/cities and six variables. The clustering results are evaluated using the Davies-Bouldin Index and the cophenetic correlation. The optimal method is K-Medoids with 4 clusters. Cluster 1 has the lowest average for every type of violence. Cluster 2 has an above-average level of physical violence. Cluster 3 has above-average levels of psychological violence, trafficking, and neglect. Cluster 4 has the highest average for every type of violence except trafficking.
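
As an illustration of the evaluation step, the Davies-Bouldin Index can be computed directly from any partition of the data. The sketch below is a minimal pure-Python version run on made-up two-dimensional points, not the study's data or code. The function names `centroid` and `davies_bouldin` are hypothetical, and Euclidean distance to the cluster mean is assumed. Lower values indicate more compact, better separated clusters.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points)."""
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance from each member to its own cluster centre.
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst_ratios = []
    for i in range(k):
        # For cluster i, keep the most similar (worst separated) other cluster.
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(k) if j != i
        ))
    return sum(worst_ratios) / k


# Made-up toy partitions: the same cluster shapes, far apart and close together.
far_apart = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
close_by = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]
print(davies_bouldin(far_apart), davies_bouldin(close_by))
```

In a comparison such as the one described above, the index would be computed for each method and number of clusters, and the partition with the smallest value preferred.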

Analysis of the Status of the Village Development Index (Indeks Desa Membangun) in Buleleng Regency Using the CHAID Method

Grashella Clara Nesa Br Ginting — 2023-12-31

The SDGs are being integrated into the regional development agenda in an effort to accelerate national development. To make village development targets more accurate, the Ministry of Villages established the village development index (Indeks Desa Membangun, IDM). Bali Province had the highest average IDM score in 2021. Of the nine districts in Bali Province, Buleleng Regency has the lowest IDM value, at 0.7361. The purpose of this study is therefore to identify the variables that affect IDM in Buleleng Regency using the Chi-squared Automatic Interaction Detection (CHAID) method. The three categories of the response variable are imbalanced, and this is handled with the Synthetic Minority Oversampling Technique (SMOTE). Four CHAID models are developed: without SMOTE, with SMOTE, and with merged response categories both without and with SMOTE. The results indicate that the CHAID model with SMOTE and without merged response categories performs best in classifying the status of the village development index in Buleleng Regency, with an accuracy of 57.3%. The variables that influence IDM status in Buleleng Regency are population density, distance to the regency capital, and the number of recipients of direct cash assistance.
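
The core idea of SMOTE, creating synthetic minority cases by interpolating between a minority case and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration on made-up numeric points, not the study's implementation. The function name `smote_sketch` and its parameters are hypothetical, and purely numeric features with Euclidean distance are assumed.

```python
import random
from math import dist


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Made-up minority class with two numeric features.
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
for point in smote_sketch(minority, n_new=3):
    print(point)
```

Because every synthetic point lies on a segment between two real minority cases, oversampling is normally applied to the training data only, so that the reported accuracy reflects real observations.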

Identifying Factors Affecting the Open Unemployment Rate in West Java Using Panel Data Regression

Febrina Nurhijah — 2023-12-31

West Java was the province with the highest open unemployment rate in Indonesia in 2022, at 8.31%. This research aims to determine an appropriate panel data regression model for the Open Unemployment Rate (TPT) in West Java Province and to identify the variables that influence it. The data are secondary data obtained from BPS, covering five years (2018–2022) and 27 districts/cities. The response variable is the open unemployment rate of each district/city in West Java Province. The explanatory variables are the labor force participation rate, average school age, per capita output, population growth rate, percentage of the poor population, high school gross participation rate, gross regional domestic product, district/city minimum wage, foreign capital investment, and domestic capital investment. The results show that the panel data regression model best suited to describing the open unemployment rate in West Java is a two-way random effects model, with an R-squared of 41.71%. The variables that are significant at the 5% level are the labor force participation rate, population growth rate, and district/city minimum wage.

Analysis of Stock Exchanges, Mining Commodities, Exchange Rates, and the Energy Sector Stock Index Using a Vector Error Correction Model

Melati — 2023-12-31

The energy sector is one of the sectors with a significant impact on a country's overall economic growth. Economic growth is always linked to energy consumption, as increasing economic development leads to higher energy demand. This study therefore analyzes the factors influencing the energy sector stock index in Indonesia using a Vector Error Correction Model (VECM). The data include the energy sector stock index, crude oil prices, coal prices, gas prices, the Nikkei Index, the Shanghai Index, the Dow Jones Index, and exchange rates from January 2021 to March 2023. The VECM results indicate that in the short term, crude oil prices and coal prices have a significant impact on the energy sector stock index. In the long term, the significant factors are coal prices, gas prices, the Nikkei Index, and exchange rates. The Impulse Response Function (IRF) analysis shows that shocks to the energy sector stock index, crude oil prices, and coal prices can increase the energy sector stock index, whereas shocks to the Nikkei Index can decrease it. The Forecast Error Variance Decomposition (FEVD) results show that the energy sector stock index, crude oil prices, coal prices, and gas prices contribute significantly to explaining changes in the energy sector stock index.

Application of Binary Logistic Regression and Random Forest Models to the Prospects of Young Athletes in the 2019 DBL Basketball League

Marta Nur Muhammad — 2023-12-31

Athletic performance is a key indicator of the success of athlete development within a sports discipline, including basketball. Effective development requires a competitive and professional platform for talent identification, such as the DBL East Java Series basketball league for senior high school students. A well-organized competition supports positive athlete development, enabling the evaluation of individual prospects through game statistics. Athlete prospects reflect future potential arising from present performance and are categorized according to the Indonesia Emas (PRIMA) Program Guidelines established by the Indonesian National Sports Committee in 2015, which define the Pratama class for athletes competing at national or regional levels. This study develops classification models to predict athlete prospects using match-level statistics. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied, while k-fold cross-validation is used to obtain robust model estimates. The findings show that all constructed models achieve strong predictive performance based on the Area Under the Curve (AUC). Furthermore, the variables points scored—representing scoring ability—and assists—representing ball-handling and playmaking ability—are identified as the most influential predictors of young athlete prospects in both the binary logistic regression and random forest models.
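
The AUC used to assess the models can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on made-up labels and scores, not the study's data. The function name `auc_by_pairs` is hypothetical, and labels are assumed to be coded 1 (positive) and 0 (negative).

```python
def auc_by_pairs(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: four cases with predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auc_by_pairs(labels, scores))  # 3 of the 4 pairs are ordered correctly: 0.75
```

Under k-fold cross-validation, a value like this would be computed on each held-out fold and then averaged.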

Application of the C4.5 Algorithm and Random Forest to the Sales Level of Somethinc Serum on Shopee

Rismayanti — 2023-12-31

Online buying and selling activity in Indonesia is increasing. Shopee was the online marketplace with the most visits in Indonesia in the fourth quarter of 2022, and the category with the most transactions on Shopee is beauty products. Somethinc is a highly successful local beauty brand on Shopee, with the highest sales of serum products in Indonesia. This study applies the C4.5 and Random Forest classification methods to identify the important variables in the sales of Somethinc serum on Shopee. The variables come from store profiles and are the number of followers, number of products, chat performance, store rating, and time since joining. Continuous sales data are discretized using k-means into ordinal data with low, medium, and high levels. The sales classes are imbalanced, so the SMOTE technique is used. The C4.5 algorithm produces a decision tree that contains classification rules. Random Forest ranks variable importance by Mean Decrease Gini (MDG), which gives the following descending order: number of followers, number of products, chat performance, time since joining, and store rating.
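
The Mean Decrease Gini ranking rests on how much each split reduces Gini impurity: a variable's importance accumulates these reductions over the splits that use it and is averaged across trees. The sketch below shows the reduction for a single split on made-up class labels, not the study's data. The function names `gini` and `gini_decrease` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Made-up node with two sales classes and two candidate splits.
parent = ["low", "low", "high", "high"]
informative = gini_decrease(parent, ["low", "low"], ["high", "high"])
uninformative = gini_decrease(parent, ["low", "high"], ["low", "high"])
print(informative, uninformative)
```

A split that separates the classes completely removes all of the parent's impurity, while a split that leaves both children as mixed as the parent contributes nothing to the variable's importance.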