Data-Driven College Counseling
Michael Discenza
Senior Data Scientist - SchooLinks
SchooLinks | A personalized college and career readiness solution
About Me
● Statistics B.A. + M.A. @ Columbia
● Data & Accountability Team @ Success Academies Charter
Network in NYC
● Data Science @ JPMorgan, @ RUN Ads (digital ad targeting)
● Currently Data Science @ SchooLinks
● Why am I here and what do I know about college
planning/counseling?
What should you take away?
1) How being data driven can help you in
counseling
2) Process/Framework to follow
3) Exposure and working knowledge of more advanced
techniques
Why data-driven?
● “Big data”
● It helps businesses figure out what they need to do, or what actions
they need to take to meet their goals - predicting the future? Super
powers?
● Why does it work? It boils down to the scientific method and the
availability of data.
● Businesses saw that this framework for thinking about the world
was helpful for them.
What is data useful for in college
counseling?
Conduct research yourself if you’re in academia or want to publish
a paper about college counseling…
But more likely:
○ Learn how to be more effective individually,
○ Be more effective as a department, or
○ Justify the investment to use a new curriculum/ new
approach (be sure it has a positive outcome)
What are your goals?
● Examples from folks that we work with:
○ Increase the number of students who have meaningful
post graduation plans
○ Increase college going rate
○ Close achievement gap in your school/district
○ College retention
Quantify these Goals
● Goal Metrics
○ Outcomes (matriculation, retention)
○ Process metrics (setting up for success… completion date of
applications)
● KPI - intermediate/progress tracking
○ FAFSA Completion
○ PSAT/SAT/ACT, etc completion rates
The Process
1) PLAN
2) Get data
3) Prepare/Analyze data
4) Improve based on the learnings from your data
1. Plan
● Background research to make sure your question is a good one for
primary research: i.e. specific to your school and can’t be more
efficiently answered by reading it in a book or elsewhere
● Write down your question
● Make sure it is important and tied to outcomes you care about…
and will yield actionable insights
● Lab notebook (folder on your computer) etc.
● Ensure data access, i.e. that you will be able to complete the actual
“research”
● List out assumptions
2. Getting Data
● Two main types of data:
○ Outcome data (dependent variable) - college going rate,
students who were accepted into their top 3 choices
(combination of the KPI and the goal metrics we talked about
before)
○ Treatment data (independent variable) - curriculum they used,
programs/extracurricular at schools, sentiment as reported by
surveys
● Sources of data (we’ll talk about strategies for each):
○ Existing data
○ Data you collect
2.1 Getting Data - Using Existing Sources of
Student Data
● Where do you find it/how do you access?
○ SIS data - grades, attendance, participation -> CSV export
○ College tools such as SchooLinks and Naviance, National
Student Clearinghouse data
● What does the data mean, “data generating process”
● FERPA?
2.2 Getting Data - Collecting Your Own Data
● Sources:
○ Structured activities/curriculum exposure (what they do)
○ Surveys/Questionnaires (what they say they do and what
they think)
● Best Practices:
○ Organization is essential: lab notebook, dates/timestamps
○ Pay attention to ID space - your ability to analyze data is
actually really tied to your ability to tie outcome data to
treatment data with a key
3. Preparing Data
● Combining data from different data sets: ID space (key)
○ Vlookup (Excel, Google sheets, Apple Numbers)
○ Joins - SQL, python, etc.
● Messy/missing data, outliers - what to include and not to
include?
● Visualizing data
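The key-based combining above (VLOOKUP or a join) can be sketched in plain Python. This is a minimal illustration, not a prescribed workflow: the student IDs, column names, and values are all made up, and a real SIS or college-tool export will have its own layout.

```python
# Hypothetical rows, as if read from two CSV exports with csv.DictReader.
treatment_rows = [
    {"student_id": "1001", "curriculum": "A"},
    {"student_id": "1002", "curriculum": "B"},
    {"student_id": "1003", "curriculum": "B"},
]
outcome_rows = [
    {"student_id": "1001", "fafsa_complete": "yes"},
    {"student_id": "1003", "fafsa_complete": "no"},
    {"student_id": "1004", "fafsa_complete": "yes"},
]


def inner_join(left, right, key):
    """Combine rows that share a key value (like VLOOKUP or a SQL inner join)."""
    # Assumes at most one row per key on the right-hand side.
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined


def unmatched_keys(left, right, key):
    """Keys that appear in only one data set; review these before analyzing."""
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    return sorted(left_keys ^ right_keys)


joined = inner_join(treatment_rows, outcome_rows, "student_id")
missing = unmatched_keys(treatment_rows, outcome_rows, "student_id")
print(joined)
print("IDs present in only one file:", missing)
```

Listing the unmatched IDs is one way to surface the messy/missing data question early: students who drop out of the join silently also drop out of the analysis.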
4. Analyzing Data
● All about the relationship between the treatment data and the
outcome data.
● Conditional probability is the most complicated math you’ll
need, and most of these dynamics are really easily visualized with
graphs
● Background: ASCA actually has a pretty useful book - a review
of percentages/probability, etc. focused on giving counselors the
background to do this work - you might be able to pick it up
here if there’s a bookstore
Sample Data Prep
Sample Data Analysis
Students who had B achieved success at
33% whereas A achieved success at
22%
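The comparison above is a conditional probability: the share of students who succeeded, given the treatment they received. A minimal sketch follows. The counts are invented (9 students per group) purely so the rates round to the 22% and 33% quoted above; they are not the data behind the slide.

```python
from collections import defaultdict

# Hypothetical records: (treatment, achieved_success). Counts are invented
# so that the rates round to 22% for A and 33% for B.
records = (
    [("A", True)] * 2 + [("A", False)] * 7
    + [("B", True)] * 3 + [("B", False)] * 6
)


def success_rate_by_treatment(rows):
    """Estimate P(success | treatment) as successes / students in each group."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for treatment, success in rows:
        totals[treatment] += 1
        if success:
            successes[treatment] += 1
    return {t: successes[t] / totals[t] for t in totals}


rates = success_rate_by_treatment(records)
for treatment in sorted(rates):
    print(f"P(success | {treatment}) = {rates[treatment]:.0%}")
```

The same calculation is a pivot table in a spreadsheet: count of successes divided by count of students, grouped by treatment.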
What to do with your findings:
● Apply them yourself
● Share them - if they’re worthwhile for you, they’re probably
worthwhile for the rest of your dept (ideally “generalizable”)
● Communicate them - for larger adoption across a dept or
funding
○ Graphs, writing, speaking
○ Keep it simple
Pretty simple… what’s the big deal?
Concepts you should be aware of:
● Regression
● Classification
● Multivariate Analysis
● Causal Analysis
● Statistical Confidence (p values)
● Machine Learning
Goal: know how these are useful
Classification
● Determining the class or group of a
case
● Outcome is the probability of a case being
part of a certain group
● Most common use case is binary
classification
● Many different statistical methods
(“families of models”) can be used
Example:
Predicting whether a student will fill out the FAFSA
by a certain date based on academic performance
Regression
● Predicting continuous outcomes
● Similar to our exercise but instead
of the probability, we’re looking at
the average score for particular
treatment groups or the average
change in one variable for a unit
change in the other “slope”
Example:
Predicting number of AP classes by hours of
extracurricular activity per week
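The “slope” idea can be sketched with ordinary least squares on one variable. This is only an illustration: the hours and AP counts below are invented, and a spreadsheet's SLOPE function or a statistics package would do the same job.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: weekly extracurricular hours and AP classes taken.
hours = [0, 2, 4, 6, 8]
ap_classes = [1, 1, 2, 3, 3]

slope, intercept = least_squares_line(hours, ap_classes)
print(f"Each extra hour per week goes with {slope:.2f} more AP classes on average")
```

The slope describes an association in the data, not proof that more activity hours cause more AP enrollment.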
Multivariate Analysis
● Incorporating more than one
independent variable, still only one
response variable
● Can think of it as data in more than two
dimensions
● Think about the effect of one variable
controlling for all others
● Could be in classification setting or
regression setting
http://metabolomicsplatform.com/projects/gc-ms/
Example:
Predicting whether a student will fill out the FAFSA by a
certain date based on academic performance,
demographics, and survey data
Causal Analysis
● Accounting for the fact that example cases
aren’t always assigned to treatment in a
randomized way
● Many techniques, usually require a lot of data,
simplest is Propensity Score Matching (PSM)
● 1) build model to understand probability of
assignment to treatment/control
● 2) Pick groups of subjects in treatment and
control groups that had the same chance of
being assigned to the treatment or control
based on all of the other data (controls for bias)
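The two PSM steps can be sketched on a toy data set. This is a heavily simplified illustration under stated assumptions: there is a single made-up covariate (a GPA band), the propensity “model” is just the treated share within each band, and every row is invented. Real PSM usually fits a model such as logistic regression over many covariates and needs far more data.

```python
from collections import defaultdict

# Hypothetical students: (gpa_band, treated, success). All values invented.
students = [
    ("high", True, True), ("high", True, True), ("high", True, True),
    ("high", False, True),
    ("low", True, False),
    ("low", False, False), ("low", False, False), ("low", False, True),
]


def propensity_by_band(rows):
    """Step 1: estimate P(treated | gpa_band) as the treated share of each band."""
    counts = defaultdict(lambda: [0, 0])  # band -> [treated, total]
    for band, treated, _ in rows:
        counts[band][0] += int(treated)
        counts[band][1] += 1
    return {band: t / n for band, (t, n) in counts.items()}


def match_pairs(rows, scores):
    """Step 2: pair each treated student with an unused control student
    that has the same propensity score; unmatched students are dropped."""
    controls = defaultdict(list)
    for row in rows:
        if not row[1]:
            controls[scores[row[0]]].append(row)
    pairs = []
    for row in rows:
        if row[1] and controls[scores[row[0]]]:
            pairs.append((row, controls[scores[row[0]]].pop(0)))
    return pairs


def success_rate(rows):
    return sum(r[2] for r in rows) / len(rows)


scores = propensity_by_band(students)
pairs = match_pairs(students, scores)

naive_gap = (success_rate([s for s in students if s[1]])
             - success_rate([s for s in students if not s[1]]))
matched_gap = (success_rate([t for t, _ in pairs])
               - success_rate([c for _, c in pairs]))
print(f"naive gap: {naive_gap:.2f}, matched gap: {matched_gap:.2f}")
```

On these invented rows the naive gap is 0.25 and the matched gap is 0.00: the apparent benefit comes from high-GPA students being more likely to get the treatment, which is the kind of bias matching is meant to control for.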
P-values
● All about quantifying how certain
you are that your finding is a real
finding and not just “random
variation” or statistical noise
● Dependent on sample size and
variability of your data
● Given that there was no true
difference between groups, how
likely would you be to find the
two groups at least as different as
you did in your analysis
http://uk.cochrane.org/news/key-statistical-result-interpretation-p-value-plain-english
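One way to make the “no true difference” question concrete is a permutation test: shuffle the group labels many times and count how often the shuffled groups differ at least as much as the real ones did. This is a sketch of one technique among several (it is not named in the slides), and the group sizes and success counts are invented.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of random relabelings whose gap in success rates is at least
    as large as the observed gap (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        gap = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if gap >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / n_shuffles


# Hypothetical outcomes (1 = success): 20 of 90 for A, 30 of 90 for B.
group_a = [1] * 20 + [0] * 70
group_b = [1] * 30 + [0] * 60

p_value = permutation_p_value(group_a, group_b)
print(f"p-value: {p_value:.3f}")
```

Rerunning with smaller groups at the same success rates tends to give a larger p-value, which is the sample-size dependence noted above.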
Machine Learning
● Figuring out how to encode knowledge
and patterns into structures we can use
● All about predictive accuracy vs.
statistics which is more about
assembling knowledge of the
underlying patterns that we study
● Supervised vs. Unsupervised Learning
● Many different methodologies:
decision trees, Bayesian learning,
deep learning, clustering, expectation
maximization
Additional Resources
Concluding Remarks
● Data skills - more about logic, domain knowledge, and posing
good questions rather than hard technical skills
● Get 80% of the way there with conditional probability
● Use tools to automate workflow and save time
● Ask questions... of your data, your vendors, colleagues, the
internet
Questions?
Contact Info:
Mike@schoolinks.com
SchooLinks | A personalized college and career readiness solution
