Machine Learning and Data Science for a
Household-Specific Poverty Level Prediction
Project Problem Statement
 The elimination of poverty worldwide is the first of 17 UN Sustainable Development Goals for the
year 2030.
 Despite some improvement, such as the fall in the share of households in poverty
between 2015 and 2016, 31.5 percent of households still suffered from poverty last year,
whether monetary, multidimensional, or of other types.
 Fighting these structural problems requires action from the authorities. Measuring poverty is
notoriously difficult, time-consuming, and expensive. Estimates usually rely on
complex household consumption surveys with several hundred different variables,
each of which can be useful when assessing different poverty levels.
 Machine learning offers new approaches for determining which variables are most informative.
These algorithms can identify the poverty trend and the most important features when data from a given
period is examined. The main task of this work is to use machine learning algorithms to determine which
households most need financial support from government agencies to improve their living conditions.
Goals and Technical Objectives
 The aim of this work is to build supervised inductive learning models which adopt
classification methods to predict the poverty level of a household. 
 In this work, we predict poverty from several features of the household.
For this purpose, we created and used the dataset “Household based
Poverty”.
 The inference task is to predict the poverty level of a new household using
attributes of the family home and other attributes found to be relevant by the
learning algorithm.
 We classify the poverty level into one of the following labels:
(1) Poor
(2) Non-poor
System Architecture
Block Diagram
Literature review
Sr. No | Title | Author | Journal | Year
1 | Machine Learning Approach for Bottom 40 Percent Households (B40) Poverty Classification | SANI, N. S., RAHMAN, M. A., BAKAR, A. A., SAHRAN, S., & SARIM, H. M. | International Journal on Advanced Science, Engineering and Information Technology | 2018
2 | Determinants of Poverty and Their Variation Across the Poverty Spectrum: Evidence from Hong Kong, a High-Income Society with a High Poverty Level | PENG, C., FANG, L., WANG, J. S., LAW, Y. W. | Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement | 2019
3 | Predicting City Poverty Using Satellite Imagery | PIAGGESI, S., GAUVIN, L., TIZZONI, M., CATTUTO, C. | Conference on Computer Vision and Pattern Recognition (CVPR) | 2019
4 | Combining satellite imagery and machine learning to predict poverty | JEAN, N., BURKE, M., XIE, M., DAVIS, W. M. | Science | 2016
5 | Using Machine Learning to Predict Visitors to Totally Protected Areas in Sarawak, Malaysia | Wan Fairos Wan Yaacob, Syerina Azlin Md Nasir, Serah Jaya, and Suhaili Mokhtar | MDPI | 2022
Requirement Analysis
 1. Dataset preparation and preprocessing
Data is the foundation of any machine learning project. The second stage of project
implementation is complex and involves data collection, selection, preprocessing, and
transformation.
 Data collection
The dataset can be taken from standard sources such as Kaggle. Each machine learning
problem is unique, so the number of attributes used to build a predictive model
depends on the attributes’ predictive value.
 Data visualization
A large amount of information is easier to understand and analyze when it is represented
in graphic form. Some companies specify that a data analyst must know how to create slides,
diagrams, charts, and templates.
 Data splitting
A dataset used for machine learning should be partitioned into three subsets: training,
validation, and test sets.
 Modelling
Numerous models are trained to determine which one provides the most accurate
predictions.
 Model deployment: Testing
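The data splitting step above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code; the function name `split_dataset`, the 70/15/15 proportions, and the toy list of household IDs are assumptions made for the example.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and cut them into training, validation, and test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test set")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is the test set
    return train, val, test


# Hypothetical household IDs standing in for real survey records.
households = list(range(100))
train, val, test = split_dataset(households)
print(len(train), len(val), len(test))
```

Shuffling before cutting avoids any ordering in the source file leaking into one subset, and the fixed seed lets the same split be reused when models are compared.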
Methodology/ Algorithms
 The proposed machine learning model consists of several steps. It starts by
preparing and pre-processing the dataset, which includes data cleaning and
encoding of categorical variables. This ensures that raw data from external
sources is transformed or encoded into a form suitable for predictive
modelling.
 The next step is feature engineering. It involves removing redundant features,
combining ordinal variable features, and selecting features with a feature
selection algorithm.
 A Linear Regression model is then trained and tested on the dataset.
 This study uses Mean Square Error (MSE), Root Mean Square Error (RMSE), and
R-Square (R2) as performance metrics for evaluation.
 The average score of each metric over all iterations is then calculated and
examined, followed by further explanation and analysis of the model.
 The expected outcome is to find out which features contribute significantly to
poverty and to predict the poverty level on one of the four scales.
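The preprocessing and feature engineering steps above can be illustrated with a small sketch. The slides do not name the feature selection algorithm, so a simple correlation filter is used here as a stand-in. The column names, the toy values, and the helper functions `one_hot`, `pearson`, and `rank_features` are all hypothetical.

```python
import math


def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    return {cat: [1.0 if v == cat else 0.0 for v in values] for cat in categories}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_features(features, target):
    """Sort feature names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: wall material, number of rooms, and a poverty score.
wall = ["brick", "wood", "brick", "zinc", "wood", "zinc"]
rooms = [4.0, 2.0, 5.0, 1.0, 2.0, 1.0]
poverty_score = [1.0, 3.0, 1.0, 4.0, 3.0, 4.0]

# Encode the categorical column, then add the numeric column unchanged.
features = {f"wall_{cat}": col for cat, col in one_hot(wall).items()}
features["rooms"] = rooms

ranking = rank_features(features, poverty_score)
print(ranking)
```

A filter of this kind looks at each feature in isolation, so it cannot detect redundancy between features; that would need a separate check, such as comparing features with each other.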
Methodology/ Algorithms
 Algorithm: Linear Regression – Supervised machine learning model
 Different regression models can be used in this study to carry out the analysis.
Linear Regression is one of the simplest regression techniques for prediction.
 It uses a linear approach, that is, a straight line, to show the relationship between
the independent variables, X, and the estimated dependent variable, Y.
 The model uses the equation Y = bX + C, where b is the slope of the
line (the regression coefficient) and C is the intercept or constant.
 When the model is being trained, it tries to fit the best regression line to predict
a given value by finding the best values of b and C.
 It can be used to determine whether the variables predict the outcome variable
well, and which variables are significant to the outcome variable.
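For a single independent variable, the best values of b and C in Y = bX + C have a closed form under ordinary least squares: b is the covariance of X and Y divided by the variance of X, and C is the mean of Y minus b times the mean of X. The sketch below shows this; the function names `fit_line` and `predict` and the toy data are illustrative assumptions, not part of the original project.

```python
def fit_line(xs, ys):
    """Return (b, c) minimizing the squared error of y = b * x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope (regression coefficient)
    c = mean_y - b * mean_x     # intercept
    return b, c


def predict(b, c, x):
    """Evaluate the fitted line at x."""
    return b * x + c


# Toy data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b, c = fit_line(xs, ys)
print(b, c, predict(b, c, 5.0))
```

With several independent variables the same idea applies, but b becomes a vector of coefficients and is usually obtained with a linear algebra routine rather than by hand.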
Implementation: Basic Details
 1. Selection of Regression model: Linear Regression is one of the simplest
regression techniques for prediction. It uses a linear approach or a straight line
for showing the relationship between independent variables, X and estimated
dependent variables, Y.
 2. Creating Dataset: The household dataset consists of a family’s observable household
characteristics, such as the material of the walls, roof, floor, and ceiling, or the
assets owned (e.g., computers or tablets), that can predict its level of poverty.
 3. Feature Selection: Feature selection is the process of selecting only a subset of
the available features. It plays an important role in improving the accuracy and
precision of the model.
 4. Performance Metrics: A well-fitting model is one in which the difference
between the actual values and the predicted values is small. In this study, the
metrics used to evaluate the performance of the models are Mean
Square Error (MSE), Root Mean Square Error (RMSE), and R-Square (R2).
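The three metrics can be computed directly from the actual and predicted values, as in the sketch below. It assumes the standard definitions: MSE is the mean of the squared errors, RMSE is its square root, and R2 is one minus the ratio of the residual sum of squares to the total sum of squares. The helper `average_metrics`, which averages the scores over several evaluation iterations, and the toy numbers are assumptions for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MSE, RMSE, and R-squared for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        # R-squared is undefined when the actual values are all identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


def average_metrics(per_iteration):
    """Average each metric over a list of per-iteration metric dictionaries."""
    return {
        key: sum(m[key] for m in per_iteration) / len(per_iteration)
        for key in per_iteration[0]
    }


# Toy example: one prediction out of four is off by one unit.
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(scores)
```

Lower MSE and RMSE indicate a closer fit, while R2 closer to 1 means the model explains more of the variance in the actual values.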
Project Design Diagrams:
Class-UML diagram
Use-case Diagram
Activity Diagram
System Architecture Diagram
Project Planning:
 Project Domain selection
 Preparing Synopsis, Abstract, Presentations
 Project Title, Objective and Methodology selection
 Algorithm studies for implementation
 Creation and preparation of the dataset
 Python coding preparation (GUI, application code, model training)
 Code testing in parts on IDE platforms such as PyCharm or Jupyter Notebook
 Code integration
 Testing Results
 Documentation: Reports
 Paper writing
 Paper publication
Conclusion:
 In conclusion, poverty is prevalent all over the world. To this day,
governments have made vast efforts to find the best way to measure poverty under
their own circumstances, so that their people receive sufficient
economic aid.
 However, since some survey data are not reliable, the information
obtained could lead to biased decision-making and to overlooking
people in extreme poverty who could not report their poverty status.
 Hence, with the help of a machine learning model, this work aims to predict
the poverty status of a household from the data collected (level of comfort
in the house, average education level, condition and hygiene of the house,
etc.) and to identify which features contribute most to poverty.
REFERENCES
 [1] SANI, N. S., RAHMAN, M. A., BAKAR, A. A., SAHRAN, S., & SARIM, H. M. (2018)
Machine Learning Approach for Bottom 40 Percent Households (B40) Poverty
Classification. International Journal on Advanced Science, Engineering and
Information Technology, 8(4-2), pp. 1698.
 [2] PENG, C., FANG, L., WANG, J. S., LAW, Y. W., et al. (2018) Determinants of
Poverty and Their Variation Across the Poverty Spectrum: Evidence from Hong Kong,
a High-Income Society with a High Poverty Level. Social Indicators Research, 144(1),
pp. 219-250.
 [3] PIAGGESI, S., GAUVIN, L., TIZZONI, M., CATTUTO, C., et al. (2019) Predicting
City Poverty Using Satellite Imagery. Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 90-96.
 [4] JEAN, N., BURKE, M., XIE, M., DAVIS, W. M., et al. (2016) Combining satellite
imagery and machine learning to predict poverty. Science, 353(6301), pp. 790-794.
 [5] SĄCZEWSKA-PIOTROWSKA, A. (2018) Determinants of the state of poverty using
logistic regression. Śląski Przegląd Statystyczny, 16(22), pp. 55-68.
THANK YOU
