Introduction to ROC Curves 
Data Science Basics Series 
May 14, 2014
What is ROC? 
Receiver Operating Characteristic 
Systematically trade off detection against false alarm
“You woke me up at 3 am!!!”
“Wake up, you’re late for class!!!”
CIVITAS LEARNING, INC. – CONFIDENTIAL INFORMATION
A Brief History of ROC Curves 
• Developed by electrical engineers and radar operators during WWII to distinguish enemy airplanes from geese
• Illustrates the performance of binary classifiers, which divide the elements of a set into two groups
• Compares trade-offs between detection rate and false alarm rate
• Now used in many fields
• Psychology
• Medicine and biometrics
• More recently in machine learning and data mining
Detection vs. False Alarm 
• Detection / sensitivity / true positive rate measures the fraction of actual positive cases that are correctly detected
• False alarm / false positive rate (equal to 1 − specificity) measures the fraction of actual negative cases that are incorrectly flagged
• Tradeoff: Usually can optimize for one but not both
• Example: Disease detection
• Accept more false alarms in exchange for higher detection if the cost of a missed detection is alarmingly high
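The two rates on this slide can be computed directly from confusion-matrix counts. The following is a minimal sketch; the screening counts are hypothetical and chosen only to illustrate the arithmetic, and the function name `rates` is an assumption, not something from the deck.

```python
def rates(tp, fn, fp, tn):
    """Return (detection rate, false alarm rate) from confusion-matrix counts.

    tp: actual positives that were flagged
    fn: actual positives that were missed
    fp: actual negatives that were flagged (false alarms)
    tn: actual negatives that were left alone
    """
    detection_rate = tp / (tp + fn)      # true positive rate, a.k.a. sensitivity
    false_alarm_rate = fp / (fp + tn)    # false positive rate = 1 - specificity
    return detection_rate, false_alarm_rate


# Hypothetical screening result: 100 actual positives, 900 actual negatives.
tpr, fpr = rates(tp=80, fn=20, fp=90, tn=810)
print(f"detection rate   = {tpr:.2f}")
print(f"false alarm rate = {fpr:.2f}")
print(f"specificity      = {1 - fpr:.2f}")
```

Note that each rate is normalized by its own class size, so neither depends on how rare the positive class is.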
How is ROC Generated? 
Features à Scores à PDF à ROC 
Model 
GPA 
Activities 
Courses 
Financial aid 
SAT/ACT 
High school 
CIVITAS 
LEARNING, 
INC. 
– 
CONFIDENTIAL 
INFORMATION 
Probability of detection 
Optimal point on 
the ROC curve 
depends on reach 
capacity and ROI 
Probability of false alarm 
Predicted 
risk Score
How is ROC Generated? 
Features à Scores à PDF à ROC 
Model 
Cutoff 
threshold 
GPA 
Activities 
Courses 
Financial aid 
SAT/ACT 
High school 
CIVITAS 
LEARNING, 
INC. 
– 
CONFIDENTIAL 
INFORMATION 
Probability of detection 
Optimal point on 
the ROC curve 
depends on reach 
capacity and ROI 
Probability of false alarm 
Predicted 
risk Score
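The cutoff-threshold idea can be sketched in a few lines: rank cases by predicted risk score, then lower the cutoff one score at a time and record the detection and false alarm rates at each step. This is an illustrative sketch only; it works on raw scores rather than estimating the score distributions (the PDF step) explicitly, and the scores, labels, and function name `roc_points` are hypothetical.

```python
def roc_points(scores, labels):
    """Sweep a cutoff from the highest score downward.

    scores: predicted risk scores (higher = more likely positive)
    labels: 1 for an actual positive, 0 for an actual negative
    Returns a list of (false_alarm_rate, detection_rate) points.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]  # cutoff above every score: nothing is flagged
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all cases tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores for 8 cases (4 positives, 4 negatives).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
for far, dr in roc_points(scores, labels):
    print(f"false alarm rate {far:.2f} -> detection rate {dr:.2f}")
```

Every cutoff yields one point, and the curve always runs from (0, 0), where nothing is flagged, to (1, 1), where everything is.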
Model Performance 
Overlap between the score distributions of the two outcomes is a measure of the model’s ability to separate success from failure.
With a strong model, there is little overlap, so you can be confident of assigning a particular score to an outcome category.
With a weaker model, there is a large amount of overlap, so a particular score could mean that an outcome is either good or bad with nearly equal probability.
[Figure: predicted risk score distributions and the corresponding ROC curves for a STRONG MODEL and a WEAK MODEL]
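The effect of overlap can be illustrated with simulated scores. The sketch below assumes, purely for illustration, that scores for each outcome are normally distributed with unit variance, and it summarizes separation with the area under the ROC curve (AUC), a standard summary that the slides themselves do not use. The function names and separation values are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive case
    gets the higher score; ties count as half. This equals the area
    under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def simulate(separation, n=500, seed=0):
    """Draw n scores per outcome; `separation` is the gap between the
    two distribution means, in standard deviations."""
    rng = random.Random(seed)
    neg = [rng.gauss(0.0, 1.0) for _ in range(n)]
    pos = [rng.gauss(separation, 1.0) for _ in range(n)]
    return auc(pos, neg)


strong = simulate(separation=3.0)   # little overlap
weak = simulate(separation=0.5)     # heavy overlap
print(f"strong model AUC: {strong:.3f}")
print(f"weak model AUC:   {weak:.3f}")
```

A value near 1.0 means the two distributions barely overlap; a value near 0.5 means the scores carry almost no information about the outcome.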
Parts of a ROC Curve 
[Figure: ROC curve, False Alarm Rate vs. Detection Rate; curves: Civitas Model, Random Ordering]
Parts of a ROC Curve 
[Figure: ROC curve, False Alarm Rate vs. Detection Rate]
Total Population:
• 10,000 students
• 9,000 continued
• 1,000 did not continue
Point on Line:
• 1,250 students
• 1,125 continued
• 125 did not continue
ROC Information
• Correct identification rate of non-continuing students = 125/1,250 = 10%
Parts of a ROC Curve 
[Figure: ROC curve, False Alarm Rate vs. Detection Rate]
Total Population:
• 10,000 students
• 9,000 continued
• 1,000 did not continue
Point on Line:
• 7,500 students
• 6,750 continued
• 750 did not continue
ROC Information
• Correct identification rate of non-continuing students = 750/7,500 = 10%
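The two random-ordering examples (1,250 and 7,500 students) can be turned into ROC coordinates with a little arithmetic. The sketch below uses only the counts from the slides; the helper name `roc_point` is an assumption made for illustration.

```python
def roc_point(selected_pos, selected_neg, total_pos, total_neg):
    """Return (detection rate, false alarm rate, correct identification rate)
    for a group of selected students."""
    detection_rate = selected_pos / total_pos
    false_alarm_rate = selected_neg / total_neg
    correct_identification = selected_pos / (selected_pos + selected_neg)
    return detection_rate, false_alarm_rate, correct_identification


TOTAL_NOT_CONTINUED = 1_000   # positives: students who did not continue
TOTAL_CONTINUED = 9_000       # negatives: students who continued

# (did not continue, continued) among the randomly selected students
for not_continued, continued in [(125, 1_125), (750, 6_750)]:
    dr, far, correct = roc_point(
        not_continued, continued, TOTAL_NOT_CONTINUED, TOTAL_CONTINUED
    )
    print(
        f"{not_continued + continued:>5} students: "
        f"detection {dr:.1%}, false alarm {far:.1%}, "
        f"correct identification {correct:.1%}"
    )
```

In both cases the detection rate equals the false alarm rate (12.5% and 75%), which is why random ordering traces the diagonal, while the correct identification rate stays at the 10% base rate.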
Tradeoffs: Without the model, more advisors are needed to reach more students who will not persist.
[Figure: ROC curve, False Alarm Rate vs. Detection Rate]
As you go up and to the right along the random-ordering line, you reach out to more at-risk students (higher detection rate), but more interventions require more advising time and resources, since the correct identification rate of non-continuing students remains at the same 10%.
Model Performance: With the model, the same number of advisors can reach out to about 5X more students who will not persist.
[Figure: ROC curve, False Alarm Rate vs. Detection Rate; curves: Civitas Model, Random Ordering; gap marked ~5X]
Total Population:
• 10,000 students
• 9,000 continued
• 1,000 did not continue
Point on Line (Random Ordering):
• 1,250 students
• 1,125 continued
• 125 did not continue
• Correct identification rate of non-continuing students = 125/1,250 = 10.0%
ROC Information (Civitas Model):
• 1,250 students
• 650 continued
• 600 did not continue
• Correct identification rate of non-continuing students = 600/1,250 = 48.0%
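The ~5X figure follows from comparing the two selections at the same outreach capacity. The sketch below redoes that arithmetic from the slide's counts; the variable and function names are chosen for illustration.

```python
CAPACITY = 1_250              # students the advisors can reach
TOTAL_NOT_CONTINUED = 1_000   # students who did not continue
TOTAL_CONTINUED = 9_000       # students who continued


def summarize(hits):
    """Rates for a selection of CAPACITY students containing `hits`
    students who did not continue."""
    false_alarms = CAPACITY - hits
    return {
        "detection_rate": hits / TOTAL_NOT_CONTINUED,
        "false_alarm_rate": false_alarms / TOTAL_CONTINUED,
        "correct_identification": hits / CAPACITY,
    }


random_hits = 125   # random ordering
model_hits = 600    # ordering by predicted risk score

random_summary = summarize(random_hits)
model_summary = summarize(model_hits)
lift = model_hits / random_hits

for name, summary in [("random", random_summary), ("model", model_summary)]:
    print(name, {key: f"{value:.1%}" for key, value in summary.items()})
print(f"improvement: {lift:.1f}x")
```

With the model, the detection rate at this capacity rises from 12.5% to 60% while the false alarm rate falls from 12.5% to about 7.2%; the ratio 600/125 = 4.8 is what the slide rounds to ~5X.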
Model Evaluation 
With a stronger predictive model
• Detection rate improves
• False alarm rate decreases
• Correctness increases at every student threshold
[Figure: ROC curve, False Alarm Rate vs. Detection Rate; curves: Civitas Model, Random Ordering]
ACCURACY VS. ROC CURVES 
Why is accuracy an incomplete and likely misleading measure of a predictive model?
Accuracy vs. ROC Curves 
Case: You use an algorithm to identify students who are at risk of not continuing to the next term.
Following the case study, 10% of students do not persist.
You test your predictive model on the data and find that you made correct predictions 92% of the time.
Accuracy vs. ROC Curves 
A crackpot scientist tells you, “I could’ve gotten 90% accuracy just by predicting everyone will persist. After all the math, you gained only 2%?!”
Don’t give up yet! Your predictive model is still helpful.
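The scientist's objection can be checked numerically. The slides do not give the confusion matrix behind the 92% figure, so the sketch below instead compares the "everyone persists" baseline with the model at the 1,250-student cutoff used elsewhere in the deck (600 non-continuing and 650 continuing students flagged). The function name is hypothetical.

```python
def accuracy_and_detection(tp, fp, fn, tn):
    """Return (accuracy, detection rate) from confusion-matrix counts,
    where a 'positive' is a student who does not persist."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    detection_rate = tp / (tp + fn)
    return accuracy, detection_rate


# Baseline: predict that all 10,000 students persist (nobody is flagged).
baseline = accuracy_and_detection(tp=0, fp=0, fn=1_000, tn=9_000)

# Model at the 1,250-student cutoff: 600 true alarms, 650 false alarms.
model = accuracy_and_detection(tp=600, fp=650, fn=400, tn=8_350)

print(f"baseline: accuracy {baseline[0]:.1%}, detection rate {baseline[1]:.1%}")
print(f"model:    accuracy {model[0]:.1%}, detection rate {model[1]:.1%}")
```

The baseline scores 90% accuracy while detecting none of the students who need help. At this cutoff the model's accuracy is a similar 89.5%, yet it identifies 60% of them, which is why accuracy alone says little about usefulness when one class is rare.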
Accuracy vs. ROC Curves 
You have a team of advisors, and they have time to reach out to 1,250 students to suggest ways they can increase their likelihood of persisting.
[Figure legend: one icon = 100 students]
Accuracy vs. ROC Curves 
Without the predictive model, you have to pick 1,250 students at random to assist. If 10% of them are expected to not persist, only 125 students would be likely to benefit from the intervention.
Accuracy vs. ROC Curves 
With the predictive model, you can choose the 1,250 students by ordering them by the highest predicted risk score.
The test case reveals 600 of these students are at risk and would be most likely to benefit from the right intervention at the right time.
The ROC Curve Tradeoff 
Students most likely to benefit from an intervention
[Figure: WITHOUT PREDICTIVE MODEL vs. WITH PREDICTIVE MODEL, ~5x improvement]
THANK YOU 
VIEW this webinar on-demand on our LinkedIn Page 
FOLLOW @CivitasLearning to continue the conversation on Twitter 
SHARE comments and ideas for future webinars on the Civitas Learning Space 
linkedin.com/company/Civitas-Learning 
twitter.com/CivitasLearning 
civitaslearningspace.com
