1. Machine Learning and Data Science for a
Household-Specific Poverty Level Prediction
2. Project Problem Statement
The elimination of poverty worldwide is the first of 17 UN Sustainable Development Goals for the
year 2030.
Despite some improvements, such as the fall in the percentage of households in poverty
between 2015 and 2016, last year 31.5 percent of households suffered from poverty, whether monetary,
multidimensional, or of other types.
This calls for action from authorities to fight these structural problems. Measuring poverty is
notoriously difficult, time-consuming, and expensive. Usually, estimates are made by
collecting complex household consumption surveys with data consisting of several hundred
different variables, each of which can be useful when assessing different poverty levels.
Machine learning offers new approaches for determining which variables are most predictive.
These algorithms can reveal the poverty trend and the most important features when data from a given
period is examined. The main task of this work is to use machine learning algorithms to determine which
households most need financial support from government agencies to improve their living conditions.
3. Goals and Technical Objectives
The aim of this work is to build supervised inductive learning models which adopt
classification methods to predict the poverty level of a household.
In this work, we predict poverty using several features of the
household. For this purpose, we created and used the dataset “Household based
Poverty”.
The inference task is to predict the poverty level of a new household using
attributes of the family home and other attributes found to be relevant by the
learning algorithm.
We classify the poverty level into one of the following labels:
(1) Poor
(2) Non-poor
6. Literature review
Sr. No | Title | Author | Journal | Year
1 | Machine Learning Approach for Bottom 40 Percent Households (B40) Poverty Classification | SANI, N. S., RAHMAN, M. A., BAKAR, A. A., SAHRAN, S., & SARIM, H. M. | International Journal on Advanced Science, Engineering and Information Technology | 2018
2 | Determinants of Poverty and Their Variation Across the Poverty Spectrum: Evidence from Hong Kong, a High-Income Society with a High Poverty Level | PENG, C., FANG, L., WANG, J. S., LAW, Y. W. | An International and Interdisciplinary Journal for Quality-of-Life Measurement | 2019
3 | Predicting City Poverty Using Satellite Imagery | PIAGGESI, S., GAUVIN, L., TIZZONI, M., CATTUTO | Conference on Computer Vision and Pattern Recognition (CVPR) | 2019
4 | Combining satellite imagery and machine learning to predict poverty | JEAN, N., BURKE, M., XIE, M., DAVIS, W. M. | National Library of Medicine | 2016
5 | Using Machine Learning to Predict Visitors to Totally Protected Areas in Sarawak, Malaysia | Wan Fairos Wan Yaacob, Syerina Azlin Md Nasir, Serah Jaya and Suhaili Mokhtar | MDPI | 2022
7. Requirement Analysis
1. Dataset preparation and preprocessing
Data is the foundation for any machine learning project. The second stage of project
implementation is complex and involves data collection, selection, preprocessing, and
transformation.
Data collection
The dataset can be taken from standard sources such as Kaggle. Each machine learning
problem is unique, so the number of attributes used when building a
predictive model depends on the attributes’ predictive value.
Data visualization
A large amount of information represented in graphic form is easier to understand and
analyze. Some companies specify that a data analyst must know how to create slides,
diagrams, charts, and templates.
Data Splitting
A dataset used for machine learning should be partitioned into three subsets — training,
test, and validation sets.
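The three-way partition described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions made for the example, not values taken from the project.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and split them into training, validation, and test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything that remains
    return train, val, test

# Stand-in for 100 household records (assumed data, for illustration only).
households = list(range(100))
train, val, test = split_dataset(households)
print(len(train), len(val), len(test))
```

The model is fitted on the training set, tuned on the validation set, and the test set is held back for the final evaluation only.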
Modelling
In this stage, numerous models are trained to determine which one provides the most accurate
predictions.
Model deployment: Testing
8. Methodology/ Algorithms
The proposed machine learning model consists of several steps. It starts by
preparing and pre-processing the dataset, which includes data cleaning and
encoding of categorical variables. This is to ensure raw data from external
sources are transformed or encoded into a suitable form for predictive
modelling.
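As a rough sketch of the cleaning and encoding step, the example below drops records with missing values and one-hot encodes a categorical column. The field names (`rooms`, `wall_material`) and both helper functions are hypothetical; the actual dataset columns and cleaning rules are not given in these slides.

```python
def drop_incomplete(records):
    """Data cleaning: keep only records in which no field is missing (None)."""
    return [rec for rec in records if all(v is not None for v in rec.values())]

def one_hot_encode(records, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({rec[column] for rec in records})
    encoded = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != column}
        for cat in categories:
            row[f"{column}_{cat}"] = 1 if rec[column] == cat else 0
        encoded.append(row)
    return encoded

# Hypothetical raw records; field names are illustrative, not from the dataset.
raw = [
    {"rooms": 3, "wall_material": "brick"},
    {"rooms": 1, "wall_material": "wood"},
    {"rooms": None, "wall_material": "brick"},  # incomplete, will be dropped
    {"rooms": 2, "wall_material": "brick"},
]
clean = drop_incomplete(raw)
encoded = one_hot_encode(clean, "wall_material")
print(encoded[0])
```

Dropping incomplete rows is only one possible cleaning policy; imputing missing values is a common alternative when data is scarce.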
The next step is feature engineering. It involves removing redundant features,
combining ordinal variable features, and selecting features using a feature
selection algorithm.
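The slides do not name the feature selection algorithm. As one simple possibility, the sketch below ranks features by the absolute Pearson correlation with the target and keeps the top k. The feature names, the sample values, and the choice of correlation ranking are all assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def select_top_features(features, target, k):
    """Return the k feature names with the largest |correlation| to the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical feature columns and target; values are made up for the example.
features = {
    "rooms": [1, 2, 3, 4, 5],
    "has_computer": [0, 0, 1, 1, 1],
    "household_id_parity": [1, 0, 1, 0, 1],
}
poverty_score = [5.0, 4.1, 2.9, 2.2, 1.0]
selected = select_top_features(features, poverty_score, k=2)
print(selected)
```

Correlation ranking only captures linear, one-feature-at-a-time relationships, so it is a baseline rather than a replacement for the method actually used in the project.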
The model is then trained and tested on the dataset using Linear Regression.
This study uses Mean Square Error (MSE), Root Mean Square Error (RMSE), and
R-Square (R2) as performance metrics for evaluation.
The average score of each metric from all iterations will then be calculated and
examined. Further explanation and analysis of the model are done.
The expected outcome is to find out which features contribute significantly to
poverty and to predict the poverty level of each household.
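The slides do not say what the "iterations" are. Assuming they are the folds of a k-fold cross-validation, averaging a metric over all iterations could look like the sketch below. The single-feature model, the fold count, and the synthetic data are assumptions chosen to keep the example short.

```python
import math

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def fit_line(xs, ys):
    """Least-squares slope b and intercept c for y = b*x + c."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, mean_y - b * mean_x

def cross_validated_rmse(xs, ys, k=5):
    """Fit on k-1 folds, score RMSE on the held-out fold, and average."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        b, c = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared = [(ys[i] - (b * xs[i] + c)) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores), scores

# Synthetic data: a line y = 2x + 1 with a small alternating disturbance.
xs = list(range(10))
ys = [2 * x + 1 + (0.1 if x % 2 else -0.1) for x in xs]
average_rmse, fold_rmses = cross_validated_rmse(xs, ys, k=5)
print(average_rmse, fold_rmses)
```

In practice the records would usually be shuffled before being cut into folds; the contiguous folds here keep the sketch deterministic.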
9. Methodology/ Algorithms
Algorithm: Linear Regression – Supervised machine learning model
Different regression models can be used in this study to carry out the analysis.
Linear Regression is one of the simplest regression techniques for prediction.
It uses a linear approach or a straight line for showing the relationship between
independent variables, X, and the estimated dependent variable, Y.
This model uses the equation Y = bX + C, where b is the slope of the
line (the regression coefficient) and C is the intercept or constant.
When the model is being trained, it tries to fit the best regression line to predict
a given value by finding the best values of b and C.
It can be used to determine whether the variables predict the outcome variable
well, and which variables are significant to the outcome variable.
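For a single independent variable, the best-fitting values of b and C in Y = bX + C have a closed-form least-squares solution. The sketch below computes them in plain Python; the function names and sample data are illustrative, and a real project with many household features would more likely rely on a library implementation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns slope b and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope: covariance of X and Y over variance of X
    c = mean_y - b * mean_x    # intercept: the line passes through the means
    return b, c

def predict(x, b, c):
    """Evaluate the fitted line Y = bX + C at x."""
    return b * x + c

# Sample points lying on the line y = 2x + 1 (made-up data).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
b, c = fit_line(xs, ys)
print(b, c, predict(5, b, c))
```

The slope is the ratio of the covariance of X and Y to the variance of X, which is the choice that minimises the sum of squared differences between actual and predicted Y.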
10. Implementation: Basic Details
1. Selection of Regression model: Linear Regression is one of the simplest
regression techniques for prediction. It uses a linear approach or a straight line
for showing the relationship between independent variables, X and estimated
dependent variables, Y.
2. Creating Dataset: The household dataset consists of a family’s observable household
characteristics, such as the material of the walls, roof, floor, and ceiling, or the
assets owned (e.g., computers or tablets), that can predict its level of poverty.
3. Feature Selection: Feature Selection plays an important role in improving the
accuracy of the model. It is the process of selecting only a few features from the
set of all features available. This process can improve the accuracy and precision
of the model.
4. Performance Metrics: A well-fitting model is one in which the difference
between the actual values and predicted values is small. In this study, the
machine learning metrics used to evaluate the performance of the models are Mean
Square Error (MSE), Root Mean Square Error (RMSE), and R-Square (R2).
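The three metrics named above can be computed directly from their standard definitions. The following is a minimal sketch in plain Python; the function names and the sample values are made up for illustration.

```python
import math

def mse(actual, predicted):
    """Mean Square Error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: square root of MSE, in the units of the target."""
    return math.sqrt(mse(actual, predicted))

def r_squared(actual, predicted):
    """R-Square: 1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-Square is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot

# Made-up actual and predicted values.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]
print(mse(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

Lower MSE and RMSE indicate a closer fit, while R-Square approaches 1 as the model explains more of the variance in the actual values.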
14. Project Planning:
Project Domain selection
Preparing Synopsis, Abstract, Presentations
Project Title, Objective and Methodology selection
Algorithms Studies for Implementation
Creating and preparation of dataset
Python coding preparation (GUI, Application code, model training)
Code testing in parts on IDE platforms such as PyCharm or Jupyter Notebook
Code integrations
Testing Results
Documentation: Reports
Paper writing
Paper publication
15. Conclusion:
In conclusion, poverty is prevalent all over the world. To this day,
governments have made considerable efforts to find the best way to measure poverty under their
own circumstances, so as to ensure that their
people receive sufficient economic aid.
However, since some survey data are not very reliable, the information
obtained could lead to biased decision-making and to overlooking
people in extreme poverty who could not report their poverty status.
Hence, with the help of a machine learning model, this work aims to predict
the poverty status of a household based on the data collected (comfort of the
house, average education level, condition and hygiene of the house,
etc.) and to recognize which features contribute most to poverty.
16. REFERENCES
[1] SANI, N. S., RAHMAN, M. A., BAKAR, A. A., SAHRAN, S., & SARIM, H. M. (2018)
Machine Learning Approach for Bottom 40 Percent Households (B40) Poverty
Classification. International Journal on Advanced Science, Engineering and
Information Technology, 8(4-2), pp.1698.
[2] PENG, C., FANG, L., WANG, J. S., LAW, Y. W., et al. (2018) Determinants of
Poverty and Their Variation Across the Poverty Spectrum: Evidence from Hong Kong,
a High-Income Society with a High Poverty Level. Social Indicators Research, 144(1),
pp. 219-250.
[3] PIAGGESI, S., GAUVIN, L., TIZZONI, M., CATTUTO, C., et al. (2019) Predicting
City Poverty Using Satellite Imagery. Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 90-96
[4] JEAN, N., BURKE, M., XIE, M., DAVIS, W. M., et al. (2016) Combining satellite
imagery and machine learning to predict poverty. Science, 353(6301), pp. 790-794.
[5] SĄCZEWSKA-PIOTROWSKA, A. (2018) Determinants of the state of poverty using
logistic regression. Śląski Przegląd Statystyczny, 16(22), pp. 55-68.