Poverty Level Prediction Based on E-Commerce Data Using Naïve Bayes Algorithm and Similarity-Based Feature Selection

Authors

  • Pramuko Aji, School of Applied Science, Telkom University, Bandung, Indonesia, http://orcid.org/0000-0003-4356-7157
  • Dedy Rahman Wijaya, School of Applied Science, Telkom University, Bandung, Indonesia, http://orcid.org/0000-0003-0351-7331
  • Elis Hernawati, School of Applied Science, Telkom University, Bandung, Indonesia, http://orcid.org/0000-0002-5855-3615
  • Sherla Yualinda, School of Applied Science, Telkom University, Bandung, Indonesia
  • Sherli Yualinda, School of Applied Science, Telkom University, Bandung, Indonesia
  • Muhammad Akbar Haikal Frasanta, School of Applied Science, Telkom University, Bandung, Indonesia
  • Rathimala Kannan, PSG Institute of Management, PSG College of Technology, Coimbatore, India

DOI:

https://doi.org/10.25124/ijait.v7i02.5374

Keywords:

Poverty, BPS, Naive Bayes, feature selection, e-commerce data

Abstract

The poverty rate is an important measure for any country because it indicates how well the economy is developing and how evenly economic prosperity is distributed among citizens. The Central Statistics Agency (BPS) measures poverty rates in Indonesia using the basic needs approach, that is, the ability to meet basic needs. Under this approach, spending is the measure of poverty, which is defined as the economic inability to satisfy food and non-food requirements. The poor are therefore individuals whose monthly per capita spending is below the poverty threshold. In this study, we propose a machine learning method that uses Naive Bayes with similarity-based feature selection on e-commerce data to predict the poverty level in Indonesia. The method is intended to complement the costly surveys and censuses conducted by BPS. Our experiments show little correspondence between the values predicted by the classifier and the actual poverty levels based on BPS data. A small number of features does not necessarily result in poor accuracy; however, high accuracy is not always achieved when many features are used.
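The general shape of the approach named in the abstract, feature selection followed by a Naive Bayes classifier, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the similarity measure (absolute Pearson correlation between each feature and the label), the Gaussian likelihood, the binary class labels, the function and class names, and the toy numbers, which are invented and are not BPS or e-commerce data.

```python
import math
from collections import defaultdict


def pearson_similarity(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return abs(cov / math.sqrt(var_x * var_y))


def select_features(rows, labels, k):
    """Return the (sorted) indices of the k features most similar to the labels."""
    n_features = len(rows[0])
    scores = [pearson_similarity([r[j] for r in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


class GaussianNaiveBayes:
    """Naive Bayes with an independent Gaussian likelihood per feature."""

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.priors, self.stats = {}, {}
        for label, members in grouped.items():
            self.priors[label] = len(members) / len(rows)
            self.stats[label] = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                # Small constant avoids division by zero for constant columns.
                var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
                self.stats[label].append((mean, var))
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, prior in self.priors.items():
            # Log prior plus the sum of per-feature Gaussian log-likelihoods.
            score = math.log(prior)
            for value, (mean, var) in zip(row, self.stats[label]):
                score += (-0.5 * math.log(2 * math.pi * var)
                          - (value - mean) ** 2 / (2 * var))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data (invented): three hypothetical features per region,
# label 0 or 1 standing in for two poverty classes.
rows = [
    [1.0, 5.0, 10.0], [1.2, 3.0, 11.0], [0.9, 4.0, 9.5],
    [3.0, 5.0, 20.0], [3.2, 3.0, 21.0], [2.9, 4.0, 19.5],
]
labels = [0, 0, 0, 1, 1, 1]

selected = select_features(rows, labels, k=2)
reduced = [[row[j] for j in selected] for row in rows]
model = GaussianNaiveBayes().fit(reduced, labels)
```

In this sketch the second feature is uncorrelated with the label, so selecting two features drops it before the classifier is trained. Changing `k` is one way to explore how the number of retained features affects accuracy on a given data set.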

Published

2023-10-20

Section

Articles