PUMP IT UP
TERNARY CLASSIFICATION MODEL
Bao Tram Duong
Tanzania, the largest country in East
Africa, is suffering from a water crisis
• 4 million people lack access to an
improved source of safe water
• 30 million people lack access to
improved sanitation
• Water-related illnesses, such as
malaria and cholera, account for
over half of the diseases affecting
the population
OSEMN
STEP 1: OBTAIN
STEP 2: SCRUB
STEP 3: EXPLORE
STEP 4: MODEL
STEP 5: INTERPRET
STEP 1: OBTAIN
Using data provided by The Tanzanian Water
Ministry and Taarifa, we will build a classification
system to predict whether or not a given water
source is working correctly.
• 59,400 water points
• 40 features
Class distribution: 54% functional, 38% non-functional, 7% functional needs repair
STEP 2: SCRUB
Challenges
1. Replace missing values with the mean or
median, or classify them as ‘other’
2. Remove redundant features
3. Fix misspellings and variations
4. For features with a high number of
unique values, keep the 20 most common
values and categorize the rest as ‘other’
5. Correct class imbalance with SMOTE
(Synthetic Minority Over-Sampling
Technique)
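The scrubbing steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the author's notebook code; the helper names and the sample `installer` values are hypothetical, and in practice the same steps would more likely be done with pandas.

```python
from collections import Counter
from statistics import median

def keep_top_categories(values, top_n=20, other="other"):
    """Normalize spelling variants, then keep the top_n labels and map the rest to `other`."""
    # Lower-casing and stripping merges variants such as "DWE" and "dwe ";
    # missing or empty entries are classified as `other`.
    cleaned = [v.strip().lower() if isinstance(v, str) and v.strip() else other
               for v in values]
    counts = Counter(v for v in cleaned if v != other)
    keep = {label for label, _ in counts.most_common(top_n)}
    return [v if v in keep else other for v in cleaned]

def fill_with_median(values):
    """Replace None entries in a numeric column with the column median."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column values, not taken from the real dataset.
installer = ["DWE", "dwe ", "Gov", None, "Gov", "DWE", "Community", ""]
print(keep_top_categories(installer, top_n=2))
print(fill_with_median([10, None, 30, 50]))
```

Capping each high-cardinality feature at 20 values plus ‘other’ keeps the number of dummy columns created later manageable.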
STEP 3: EXPLORE
Recommendations
1. Focus on sustainability: adopt an early, preventative strategy rather than
letting water points break down
2. Decentralized management: we need to restructure authority so that
there is a system of co-responsibility between the central, regional and
local levels.
3. Improved payment system:
• A local payment system should be put in place so that each user group
can be independently responsible for its own water points
• Direct funding from international donors to the village level should also
be implemented, instead of going through the long
bureaucratic process in which money gets lost along the way between
the ministry and the district level.
STEP 4: MODEL
1. First, we convert the labels from strings to integers:
• Non-functional = 0
• Functional = 1
• Functional-needs-repair = 2
2. Then, we dummy-encode (one-hot encode) all categorical variables
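A minimal sketch of these two encoding steps, using plain Python dictionaries as rows. The label spellings follow the slide and may differ from the raw data; `encode_labels`, `one_hot`, and the sample rows are illustrative assumptions rather than the author's code.

```python
# Integer codes as listed on the slide.
LABELS = {"Non-functional": 0, "Functional": 1, "Functional-needs-repair": 2}

def encode_labels(status_values):
    """Convert string labels to their integer codes."""
    return [LABELS[s] for s in status_values]

def one_hot(rows, column):
    """Replace one categorical column with 0/1 indicator (dummy) columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded

# Hypothetical rows for illustration only.
rows = [{"quantity": "enough", "gps_height": 1390},
        {"quantity": "dry", "gps_height": 0}]
print(encode_labels(["Functional", "Non-functional", "Functional-needs-repair"]))
print(one_hot(rows, "quantity"))
```

The integer codes are only identifiers for the three classes; their order carries no meaning for the classifier.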
Modeling pipeline (from the slide diagram):
• Clean data
• Split: 80% train, 20% test
• Scale: MinMaxScaler
• (Balance: SMOTE)
• Fit: baseline, then GridSearchCV
• Test
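The split and scaling stages of this pipeline can be sketched as follows. This is a standard-library stand-in for scikit-learn's `train_test_split` and `MinMaxScaler`, written under the assumption (usual practice, not stated on the slide) that the scaler is fitted on the training rows only and then applied to the test rows; the class and function below are illustrative.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and hold out `test_fraction` of the rows for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

class MinMax:
    """Rescale each column to [0, 1] using the min and max seen during fit."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.lo = [min(c) for c in columns]
        self.hi = [max(c) for c in columns]
        return self

    def transform(self, rows):
        # Constant columns (hi == lo) are mapped to 0.0 to avoid dividing by zero.
        return [[(x - lo) / (hi - lo) if hi > lo else 0.0
                 for x, lo, hi in zip(row, self.lo, self.hi)]
                for row in rows]

# Hypothetical numeric rows for illustration only.
rows = [[float(i), float(i * 10)] for i in range(10)]
train, test = train_test_split(rows)
scaler = MinMax().fit(train)          # fit on the training rows only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)  # test values may fall slightly outside [0, 1]
print(len(train), len(test))
```

Fitting the scaler on the training rows alone keeps information about the test set from leaking into the model.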
Best Model
Gradient boosting is a family of machine learning algorithms that
combine many weak learning models, added one after another, to create a
strong predictive model.
The weak learner used is a shallow decision tree, the simplest tree-based
method. In scikit-learn's Gradient Boosting Classifier each tree is a
regression tree fitted to the errors of the trees before it.
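To make the idea of combining weak learners concrete, here is a deliberately simplified sketch: one numeric feature, a 0/1 target, squared-error loss, and one-split trees (stumps). It is not the model from the slides, which uses the library's multi-class implementation; `fit_stump`, `gradient_boost`, and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (squared error)."""
    best = None
    # Skip the largest value so the right-hand side of a split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(ys)
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Toy data: the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
print(round(model(2), 3), round(model(5), 3))
```

Each stump alone is a poor model, but because every round corrects part of the remaining error, the sum moves steadily toward the targets.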
STEP 5: INTERPRET
Accuracy
81.30%
• Train accuracy: 99.33%
• Test accuracy: 81.30%
Imbalanced Gradient Boost Model
81.30%
Balanced Gradient Boost Model
79.86%
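For context on what the balanced model was trained on, the core of SMOTE can be sketched as below: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified standard-library illustration, not the imbalanced-learn implementation the slides presumably relied on, and the function name and sample points are hypothetical.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        point = rng.choice(minority)
        # The k nearest minority samples to `point` (excluding itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not point),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, point)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(point, neighbour)])
    return synthetic

# Hypothetical minority-class samples (e.g. scaled features of "needs repair" pumps).
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
new_points = smote(minority, n_new=5)
print(len(new_points))
```

Oversampling is normally applied to the training split only, so the test accuracy still reflects the original class proportions.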
Future Work
1. Since correcting class imbalance did not improve the model, we can try
model stacking, i.e. build one binary classifier for functional vs.
non-functional and another binary classifier for functional vs.
functional needs repair.
2. Tune more hyperparameters, over a wider range of options
3. Work to reduce overfitting while maintaining and/or improving the
accuracy score
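One way to read the first item is as a two-stage cascade: a first binary model decides whether a pump works at all, and a second one, consulted only for working pumps, decides whether it needs repair. The sketch below assumes that reading and takes the two stages as arbitrary callables; the rule-based stand-ins and feature names are hypothetical placeholders, not findings from the data.

```python
def make_cascade(is_working, needs_repair):
    """Combine two binary classifiers into one three-class predictor.

    Returns the integer codes used in the modeling step:
    0 = non-functional, 1 = functional, 2 = functional needs repair.
    """
    def predict(features):
        if not is_working(features):
            return 0
        return 2 if needs_repair(features) else 1
    return predict

# Hypothetical stand-ins for two trained binary models.
def toy_is_working(features):
    return features["quantity"] != "dry"

def toy_needs_repair(features):
    return features["age"] > 20

predict = make_cascade(toy_is_working, toy_needs_repair)
print(predict({"quantity": "dry", "age": 5}))
print(predict({"quantity": "enough", "age": 5}))
print(predict({"quantity": "enough", "age": 30}))
```

A possible benefit of this design is that the second stage trains only on working pumps, where the needs-repair class is less outnumbered than in the full dataset.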
THANK YOU
APPENDIX
Problems
1. Sustainability: Although the Water Sector Development Program (WSDP)
has run hundreds of millions of dollars over budget and years past its
original deadline, local governments and communities still find themselves
unable to raise the money to fix and maintain their water points.
2. Power struggle: Full responsibility for operating, maintaining and
sustaining water points sits at the village level. However,
disbursement of funds and reporting of functionality must follow a long
bureaucratic process all the way from the village, to the district, and,
finally, to the Ministry of Water. The problem is not only
miscommunication but also a power struggle over roles,
responsibilities, and accountability between the many levels of
government.
• Within the first year, 30% of water points become non-functional.
• Only 54% of water points are working 15 years after installation.
Problem: Although the Water Sector Development Programme (WSDP) has run hundreds
of millions of dollars over budget and years past its original deadline, local
governments and communities still find themselves unable to raise the money to fix and
maintain their water points.
• Under NAWAPO (National Water Policy), user groups are to take full
responsibility for operating, maintaining and sustaining water points at the village
level.
• Data published by the ministry, which is based on the coverage reported by
districts, is not reliable.
• Disbursement of funds must follow a long bureaucratic process of accountability,
requiring upwards (vertical) reporting at each level of government, all the way
from the village, to the district, and, finally, to the Ministry of Water.
Problem: Power struggle around roles, responsibilities, and accountability between
different levels of government
• If the water point management charges for water, the water point is more likely
to be well maintained and kept functional.
• Regular payment, used for regular maintenance and upkeep, is a better,
preventative approach than trying to secure a large
amount of funds once the system breaks down.
[Figures: results for the balanced model and for the imbalanced model]
Metrics
• Accuracy: Out of all predictions, how many are correct? It should
be as high as possible.
• Precision: Out of all samples that we predicted as positive, how many are
actually positive? i.e. TP / (TP + FP). It should be as high as possible.
• Recall: Out of all actual positives, how many did we predict as positive? i.e. TP / (TP +
FN). It should be as high as possible.
• F1: It is difficult to compare two models when one has low precision and high recall and
the other the reverse. It is often convenient to combine precision and recall into a single
metric called the F1 score, their harmonic mean: 2 × precision × recall / (precision + recall).
We want to maximize the F1 score as well.
• Cross validation: the cross-validated score should be as close as possible to the model’s
score on the training data, or else we have overfit our model.
• Macro average is the unweighted mean of the per-class precision/recall/f1-score.
• Weighted average is the mean of the per-class precision/recall/f1-score, weighted by
the number of true samples in each class.
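The definitions above can be checked on a small three-class example. This is a standard-library sketch of what a classification report computes, treating each class in turn as the "positive" class; the function name and the toy labels are made up for illustration.

```python
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Return accuracy, per-class (precision, recall, f1), macro F1 and weighted F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    per_class = {}
    for label in labels:
        # One-vs-rest counts: `label` plays the role of the positive class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = (precision, recall, f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro_f1 = sum(f1 for _, _, f1 in per_class.values()) / len(labels)
    weighted_f1 = sum(per_class[l][2] * support[l] for l in labels) / len(y_true)
    return accuracy, per_class, macro_f1, weighted_f1

# Toy labels: 0 = non-functional, 1 = functional, 2 = functional needs repair.
y_true = [1, 1, 1, 1, 0, 0, 0, 2]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
accuracy, per_class, macro_f1, weighted_f1 = classification_metrics(y_true, y_pred)
print(accuracy, macro_f1, weighted_f1)
```

In this toy case the rare class 2 is never predicted, so its F1 is zero; that pulls the macro average well below the weighted average, which is why both are worth reporting on an imbalanced dataset.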
funder
installer
lga
scheme management
ward
management group
payment
permit
population
public meeting
water point names
basin
subvillage
region
region code
district code
age
amount tsh
gps height
extraction type
water quality
quantity
source
source class
water point type

Data Science vs. Pump It Up Competition
