Funding Education through Donors Choose
General Assembly 2016
Fernando Hidalgo
Problem Description
Task: Predict whether a Donors Choose project will get funded.
Experience: Donors Choose data from September 2002 to the present.
Performance: Classification accuracy, the number of correct predictions out of all predictions made.
The Data
Labels
Completed: 592,757
Expired: 261,536
Class Skewness: Use the F1 score to keep recall and precision in check.
Baseline: 0.69
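The baseline figure can be reproduced from the two label counts. The sketch below assumes the baseline is the majority-class rate, that is, the accuracy of always predicting "completed"; the slides do not say how it was computed, but this reading matches the reported value.

```python
# Label counts as reported on the slide.
completed = 592_757
expired = 261_536

total = completed + expired

# Assumption: the baseline is the share of the majority class,
# i.e. the accuracy of a model that always predicts "completed".
baseline = max(completed, expired) / total

print(f"total projects: {total}")          # total projects: 854293
print(f"baseline accuracy: {baseline:.2f}")  # baseline accuracy: 0.69
```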
Original Features
Feature: Description (type)
total_price_excluding_optional_support: Total price of the project, in dollars (integer)
students_reached: Number of students the project reaches (integer)
school_type: Type of school: Charter, magnet, year_round, nlns, kipp, Charter_ready_promise (categorical)
date_posted: Day the project was posted (categorical)
resource_type: Type of resources the project asks for (categorical)
grade_level: Grade level of the project (categorical)
poverty_level: Poverty level (categorical)
school_state: State the project was posted from (categorical)
Eligible_double_your_impact_match: Whether the project was eligible to be matched (categorical)
teacher_prefix: Prefix of the teacher posting the project (categorical)
primary_focus_area: The project's primary area of focus (categorical)
primary_focus_subject: The project's primary subject of focus (categorical)
Feature Engineering
New feature: Description
price_per_student: total_price / students_reached
project_length: date_expiration - date_posted
month_posted: extracted from date_posted
day_posted: extracted from date_posted
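The four engineered features could be derived with pandas roughly as follows. This is a minimal sketch, not the author's code: the two sample rows are invented, `add_engineered_features` is a hypothetical helper, `day_posted` is assumed to mean day of the month, and projects with zero students are assumed to get a missing `price_per_student` rather than an infinite one.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the slides.
projects = pd.DataFrame({
    "total_price_excluding_optional_support": [500.0, 1200.0],
    "students_reached": [25, 0],
    "date_posted": ["2015-09-01", "2015-11-15"],
    "date_expiration": ["2016-01-01", "2016-03-15"],
})


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the four engineered columns added."""
    out = df.copy()
    posted = pd.to_datetime(out["date_posted"])
    expires = pd.to_datetime(out["date_expiration"])

    # Assumption: treat zero students as missing to avoid dividing by zero.
    students = out["students_reached"].where(out["students_reached"] > 0)
    out["price_per_student"] = (
        out["total_price_excluding_optional_support"] / students
    )

    # Project length in whole days between posting and expiration.
    out["project_length"] = (expires - posted).dt.days

    out["month_posted"] = posted.dt.month
    # Assumption: "day" means day of the month, not day of the week.
    out["day_posted"] = posted.dt.day
    return out


features = add_engineered_features(projects)
print(features[["price_per_student", "project_length",
                "month_posted", "day_posted"]])
```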
Visualizations
Rate of Projects Funded to Total Projects per Resource
Rate of Projects Funded to Total Projects per Month
Rate of Projects Funded to Total Projects per Grade Level
Rate of Projects Funded to Total Projects per Primary Focus Area
Rate of Projects Funded to Total Projects per Teacher Prefix
Rate of Projects Funded to Total Projects per Poverty Level
Relationship Between Project Length and Funding
Relationship Between Project Price and Funding
Relationship Between Price per Student and Funding
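Each "rate of projects funded" chart listed above reduces to a grouped mean of a funded flag. The sketch below shows one way to compute it; the `funding_status` column, its `"completed"` value, the sample rows, and the `funded_rate_by` helper are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample; the status column name and values are assumptions.
sample = pd.DataFrame({
    "resource_type": ["Books", "Books", "Technology",
                      "Technology", "Technology", "Supplies"],
    "funding_status": ["completed", "expired", "completed",
                       "expired", "expired", "completed"],
})


def funded_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Share of funded projects within each category of `column`."""
    funded = df["funding_status"].eq("completed")
    # The mean of a boolean flag is the fraction of True values per group.
    return funded.groupby(df[column]).mean().sort_values(ascending=False)


rates = funded_rate_by(sample, "resource_type")
print(rates)
```

The same helper could be pointed at any of the categorical columns (month, grade level, teacher prefix, poverty level) to produce the numbers behind the other charts.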
Predictive Model
The 3 Models:
1. AdaBoost
2. Random Forest
3. Logistic Regression
GridSearch
Accuracy Scores using F1 Score Metric
Model                 Accuracy   Best Parameter
Random Forest         0.759      Criterion: Entropy
AdaBoost              0.7676     N_estimators: 60
Logistic Regression   0.811      Penalty: L2
Simplest Model with Best Score: Logistic Regression
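A grid search over the three models with F1 as the scoring metric might look like the sketch below, using scikit-learn's `GridSearchCV`. The synthetic data and the parameter grids are assumptions: the slides report only the winning value for each model, not the grids that were searched or the cross-validation setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced stand-in for the project data (assumption).
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.3, 0.7], random_state=0)

# Hypothetical grids built around the parameters named on the slide.
candidates = {
    "Random Forest": (
        RandomForestClassifier(n_estimators=25, random_state=0),
        {"criterion": ["gini", "entropy"]},
    ),
    "AdaBoost": (
        AdaBoostClassifier(random_state=0),
        {"n_estimators": [20, 40, 60]},
    ),
    "Logistic Regression": (
        # liblinear supports both the L1 and L2 penalties.
        LogisticRegression(solver="liblinear"),
        {"penalty": ["l1", "l2"]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # scoring="f1" ranks parameter settings by cross-validated F1 score.
    search = GridSearchCV(model, grid, scoring="f1", cv=3)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)
    print(f"{name}: F1={search.best_score_:.3f}, best={search.best_params_}")
```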
Checking Feature Significance: Using Random Forest Classifier
The top 5 features seem to have most of the predictive power.
Using Only the 5 Most Significant Features
1. Total_price_excluding_optional_support
2. Eligible_double_your_impact_match
3. Resource_Type_Books
4. Resource_Type_Technology
5. price_per_student
New Score with Logistic Regression: 0.8171
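The two steps, ranking features with a random forest and refitting logistic regression on the top five, can be sketched as follows. The data is synthetic and the feature names are placeholders, so the scores will not match the slide; the estimator settings and the 5-fold cross-validation are also assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with placeholder feature names (assumption).
X, y = make_classification(n_samples=400, n_features=12,
                           n_informative=4, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Step 1: rank features by the forest's impurity-based importances.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
order = np.argsort(forest.feature_importances_)[::-1]
top_k = order[:5]
top_names = [feature_names[i] for i in top_k]

# Step 2: compare logistic regression on all features vs. the top 5.
model = LogisticRegression(max_iter=1000)
f1_all = cross_val_score(model, X, y, scoring="f1", cv=5).mean()
f1_top = cross_val_score(model, X[:, top_k], y, scoring="f1", cv=5).mean()

print("top 5 features:", top_names)
print(f"F1 with all features: {f1_all:.3f}")
print(f"F1 with top 5 only:   {f1_top:.3f}")
```

For simplicity the ranking here uses the full dataset before cross-validation; a stricter setup would select features inside each training fold so the held-out data does not influence the selection.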
Overview
● Model improvement of 0.1271 over the baseline using Logistic Regression with F1 score.
● Most of the predictive power lies in 5 features.
● Ethical implications:
○ The features with the most predictive power are not ones that can be changed without fabrication.
Model Improvements
Add Prescriptive Data:
Project Essays
Project Materials
Use Data Based on Location:
Census
Skewed Data:
Find Reasons
Methods
