SlideShare a Scribd company logo
1 of 50
Download to read offline
ITERATING OVER STATISTICAL MODELS
NCAA TOURNAMENT EDITION
Daniel Lee
WHY ARE YOU LISTENING TO ME?
▸ Stan developer

http://mc-stan.org

▸ Researcher at Columbia

▸ Co-founder of Stan Group

training / statistical support /
consulting






▸ email: bearlee@alum.mit.edu

twitter: @djsyclik / @mcmc_stan

web: http://syclik.com
MARCH MADNESS
WHAT IS MARCH MADNESS?
WHAT IS MARCH MADNESS?
CHAMPIONSHIP GAME: 

VILLANOVA VS NORTH CAROLINA
P(VILLANOVA WIN)?
CHAMPIONSHIP GAME: 

VILLANOVA VS NORTH CAROLINA
P(VILLANOVA WIN)?
VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS
▸ 1:00. 70 - 69
▸ 0:35. 72 - 69
▸ 0:23. 72 - 71
P(VILLANOVA WIN)?
VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS
▸ 1:00. 70 - 69
▸ 0:35. 72 - 69
▸ 0:23. 72 - 71
▸ 0:13. 74 - 71
P(VILLANOVA WIN)?
VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS
▸ 1:00. 70 - 69
▸ 0:35. 72 - 69
▸ 0:23. 72 - 71
▸ 0:13. 74 - 71
▸ 0:06. 74 - 74
P(VILLANOVA WIN)?
VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS
▸ 1:00. 70 - 69
▸ 0:35. 72 - 69
▸ 0:23. 72 - 71
▸ 0:13. 74 - 71
▸ 0:06. 74 - 74
▸ 0:00. 77 - 74. Villanova wins.
STATISTICAL MODELS
TREAT
AS CODE
WRITING CODE IS A DISCIPLINE
WRITING CODE IS A DISCIPLINE
▸ design patterns
▸ testing
▸ code review
▸ maintenance
▸ modularity
▸ collaboration
CAN BE
IS STATISTICAL MODELING A DISCIPLINE?
▸ Art or science?

▸ Models have names

▸ Statistical model vs implementation

▸ Collaboration on the statistical model?
TREAT STATISTICAL MODELS AS CODE
WHAT DO WE NEED TO DO
▸ Elevate statistical models: first class entity

▸ Modularize

▸ Language

▸ Discuss subtle details

▸ Collaborate
TREAT STATISTICAL MODELS AS CODE
STAN GETS US CLOSE
▸ statistical modeling language
▸ domain-specific language

has its own grammar; not R or BUGS!
▸ rstan

shinystan, RStudio integration, rstanarm
▸ open-source

core libraries are new BSD
▸ Stan program
▸ plain text
▸ plays nicely with source
repositories
▸ imperative language
Note: Stan isn’t the only thing you can do this with
MODELING
BASKETBALL
MARCH MADNESS
BASKETBALL HISTORY
▸ 1891
▸ Dr. James Naismith. Springfield, MA
▸ Non-contact conditioning
▸ 13 simple rules
▸ Peach basket





MARCH MADNESS
BASKETBALL HISTORY
▸ 1891
▸ Dr. James Naismith. Springfield, MA
▸ Non-contact conditioning
▸ 13 simple rules
▸ Peach basket

10. The umpire shall be judge of the men and shall note the fouls and notify the
referee when three consecutive fouls have been made. He shall have power to
disqualify men according to Rule 5.
MARCH MADNESS
BASKETBALL NOW
▸ 2 x 20 min half
▸ Increasing score
▸ Points increment by 2, 3, and 1
▸ 5 players, unlimited substitutions
▸ player DQ: 5th foul
▸ Bonus: 7 team fouls

Double bonus: 10 team fouls
MARCH MADNESS
DATA
▸ 351 NCAA Division 1 Men’s basketball teams
▸ 33 conferences
▸ 5421 games
▸ 24 - 35 games per team
▸ Max 3 observations
IS THIS BIG DATA?
TALL DATA VS WIDE DATA
▸ Tall data

lots of replications



▸ Wide data

lots of fields



day, home, score, ot, fgm, fga, 3pm, 3pa, 3a, fta, ftm, or, dr, ast, to, stl, blk, pf
THREE STEPS OF
BAYESIAN DATA ANALYSIS
ANDREW GELMAN IN BDA
THE THREE STEPS OF BAYESIAN DATA ANALYSIS
1. Set up full probability model



2. Condition on observed data



3. Evaluate the fit of the model

ANDREW GELMAN IN BDA
THE THREE STEPS OF BAYESIAN DATA ANALYSIS
1. Set up full probability model

Write a Stan program

2. Condition on observed data



3. Evaluate the fit of the model

ANDREW GELMAN IN BDA
THE THREE STEPS OF BAYESIAN DATA ANALYSIS
1. Set up full probability model

Write a Stan program

2. Condition on observed data

Run RStan

3. Evaluate the fit of the model

ANDREW GELMAN IN BDA
THE THREE STEPS OF BAYESIAN DATA ANALYSIS
1. Set up full probability model

Write a Stan program

2. Condition on observed data

Run RStan

3. Evaluate the fit of the model

R, ShinyStan, posterior predictive checks

ITERATING OVER
MODELS
ITERATING OVER STATISTICAL MODELS
STATISTICAL MODEL #1
▸ Only 2015-2016 matters
▸ Teams have a latent ability
▸ “logistic regression”
▸ “Bradley-Terry model”
y ⇠ bernoulli(logit 1
(✓1 ✓2))
P(VILLANOVA > UNC | DATA) = 0.73
ITERATING OVER STATISTICAL MODELS
TREAT STATISTICAL MODELS AS CODE
▸ Statistical model in a separate file
▸ Git
▸ Testing
▸ Inspection of fit
▸ Backtest on historical data
▸ Priors
▸ “Model #1”
IN WRITING, YOU MUST KILL ALL YOUR
DARLINGS.
Willliam Faulkner
ITERATING OVER STATISTICAL MODELS
STATISTICAL MODEL #2
▸ Home court advantage!

▸ Teams have a latent ability
▸ “logistic regression”
▸ “Bradley-Terry model”
y ⇠ bernoulli(logit 1
(↵ + ✓1 ✓2))
P(VILLANOVA > UNC | DATA) = 0.52
ITERATING OVER STATISTICAL MODELS
STATISTICAL MODEL #3
▸ Assumptions:
▸ Only 2015-2016 matters
▸ Teams have a latent ability
▸ Model points
▸ Add a home court advantage
ITERATING OVER STATISTICAL MODELS
HOW DID WE DO?
▸ Kaggle: 87 / 608
▸ What went wrong?
▸ What’s next?
STATISTICAL MODELS
TREAT
AS CODE
THANKS
▸ Collaborative effort
▸ NCAA modeling:

Rob Trangucci
▸ Stan team

Andrew Gelman, Bob Carpenter, Matt
Hoffman, Ben Goodrich, Michael Betancourt,
Marcus Brubaker, Jiqiang Guo, Peter Li, Allen
Riddell, Marco Ignacio, Jeff Arnold, Mitzi
Morris, Rob Goedman, Brian Lau, Jonah
Gabry, Alp Kucukelbir, Robert Grant, Dustin
Tran, Krzysztof Sakrejda, Aki Vehtari, Rayleigh
Lei, Sebastian Weber
HELP
▸ http://mc-stan.org
▸ stan-users mailing list

▸ Stan Group Inc.

http://stan.fit

training / statistical support / consulting



▸ bearlee@alum.mit.edu / @djsyclik / @mcmc_stan

More Related Content

Viewers also liked

Analyzing NYC Transit Data
Analyzing NYC Transit DataAnalyzing NYC Transit Data
Analyzing NYC Transit DataWork-Bench
 
The Political Impact of Social Penumbras
The Political Impact of Social PenumbrasThe Political Impact of Social Penumbras
The Political Impact of Social PenumbrasWork-Bench
 
Reflection on the Data Science Profession in NYC
Reflection on the Data Science Profession in NYCReflection on the Data Science Profession in NYC
Reflection on the Data Science Profession in NYCWork-Bench
 
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...Work-Bench
 
Dr. Datascience or: How I Learned to Stop Munging and Love Tests
Dr. Datascience or: How I Learned to Stop Munging and Love TestsDr. Datascience or: How I Learned to Stop Munging and Love Tests
Dr. Datascience or: How I Learned to Stop Munging and Love TestsWork-Bench
 
Julia + R for Data Science
Julia + R for Data ScienceJulia + R for Data Science
Julia + R for Data ScienceWork-Bench
 
R for Everything
R for EverythingR for Everything
R for EverythingWork-Bench
 
Using R at NYT Graphics
Using R at NYT GraphicsUsing R at NYT Graphics
Using R at NYT GraphicsWork-Bench
 
Thinking Small About Big Data
Thinking Small About Big DataThinking Small About Big Data
Thinking Small About Big DataWork-Bench
 
Improving Data Interoperability for Python and R
Improving Data Interoperability for Python and RImproving Data Interoperability for Python and R
Improving Data Interoperability for Python and RWork-Bench
 
Building Scalable Prediction Services in R
Building Scalable Prediction Services in RBuilding Scalable Prediction Services in R
Building Scalable Prediction Services in RWork-Bench
 
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWhat We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWork-Bench
 
High-Performance Python
High-Performance PythonHigh-Performance Python
High-Performance PythonWork-Bench
 
Inside the R Consortium
Inside the R ConsortiumInside the R Consortium
Inside the R ConsortiumWork-Bench
 
Scaling Analysis Responsibly
Scaling Analysis ResponsiblyScaling Analysis Responsibly
Scaling Analysis ResponsiblyWork-Bench
 

Viewers also liked (15)

Analyzing NYC Transit Data
Analyzing NYC Transit DataAnalyzing NYC Transit Data
Analyzing NYC Transit Data
 
The Political Impact of Social Penumbras
The Political Impact of Social PenumbrasThe Political Impact of Social Penumbras
The Political Impact of Social Penumbras
 
Reflection on the Data Science Profession in NYC
Reflection on the Data Science Profession in NYCReflection on the Data Science Profession in NYC
Reflection on the Data Science Profession in NYC
 
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...
A Statistician Walks into a Tech Company: R at a Rapidly Scaling Healthcare S...
 
Dr. Datascience or: How I Learned to Stop Munging and Love Tests
Dr. Datascience or: How I Learned to Stop Munging and Love TestsDr. Datascience or: How I Learned to Stop Munging and Love Tests
Dr. Datascience or: How I Learned to Stop Munging and Love Tests
 
Julia + R for Data Science
Julia + R for Data ScienceJulia + R for Data Science
Julia + R for Data Science
 
R for Everything
R for EverythingR for Everything
R for Everything
 
Using R at NYT Graphics
Using R at NYT GraphicsUsing R at NYT Graphics
Using R at NYT Graphics
 
Thinking Small About Big Data
Thinking Small About Big DataThinking Small About Big Data
Thinking Small About Big Data
 
Improving Data Interoperability for Python and R
Improving Data Interoperability for Python and RImproving Data Interoperability for Python and R
Improving Data Interoperability for Python and R
 
Building Scalable Prediction Services in R
Building Scalable Prediction Services in RBuilding Scalable Prediction Services in R
Building Scalable Prediction Services in R
 
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWhat We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
 
High-Performance Python
High-Performance PythonHigh-Performance Python
High-Performance Python
 
Inside the R Consortium
Inside the R ConsortiumInside the R Consortium
Inside the R Consortium
 
Scaling Analysis Responsibly
Scaling Analysis ResponsiblyScaling Analysis Responsibly
Scaling Analysis Responsibly
 

More from Work-Bench

2017 Enterprise Almanac
2017 Enterprise Almanac2017 Enterprise Almanac
2017 Enterprise AlmanacWork-Bench
 
AI to Enable Next Generation of People Managers
AI to Enable Next Generation of People ManagersAI to Enable Next Generation of People Managers
AI to Enable Next Generation of People ManagersWork-Bench
 
Startup Recruiting Workbook: Sourcing and Interview Process
Startup Recruiting Workbook: Sourcing and Interview ProcessStartup Recruiting Workbook: Sourcing and Interview Process
Startup Recruiting Workbook: Sourcing and Interview ProcessWork-Bench
 
Cloud Native Infrastructure Management Solutions Compared
Cloud Native Infrastructure Management Solutions ComparedCloud Native Infrastructure Management Solutions Compared
Cloud Native Infrastructure Management Solutions ComparedWork-Bench
 
Building a Demand Generation Machine at MongoDB
Building a Demand Generation Machine at MongoDBBuilding a Demand Generation Machine at MongoDB
Building a Demand Generation Machine at MongoDBWork-Bench
 
How to Market Your Startup to the Enterprise
How to Market Your Startup to the EnterpriseHow to Market Your Startup to the Enterprise
How to Market Your Startup to the EnterpriseWork-Bench
 
Marketing & Design for the Enterprise
Marketing & Design for the EnterpriseMarketing & Design for the Enterprise
Marketing & Design for the EnterpriseWork-Bench
 
Playing the Marketing Long Game
Playing the Marketing Long GamePlaying the Marketing Long Game
Playing the Marketing Long GameWork-Bench
 

More from Work-Bench (8)

2017 Enterprise Almanac
2017 Enterprise Almanac2017 Enterprise Almanac
2017 Enterprise Almanac
 
AI to Enable Next Generation of People Managers
AI to Enable Next Generation of People ManagersAI to Enable Next Generation of People Managers
AI to Enable Next Generation of People Managers
 
Startup Recruiting Workbook: Sourcing and Interview Process
Startup Recruiting Workbook: Sourcing and Interview ProcessStartup Recruiting Workbook: Sourcing and Interview Process
Startup Recruiting Workbook: Sourcing and Interview Process
 
Cloud Native Infrastructure Management Solutions Compared
Cloud Native Infrastructure Management Solutions ComparedCloud Native Infrastructure Management Solutions Compared
Cloud Native Infrastructure Management Solutions Compared
 
Building a Demand Generation Machine at MongoDB
Building a Demand Generation Machine at MongoDBBuilding a Demand Generation Machine at MongoDB
Building a Demand Generation Machine at MongoDB
 
How to Market Your Startup to the Enterprise
How to Market Your Startup to the EnterpriseHow to Market Your Startup to the Enterprise
How to Market Your Startup to the Enterprise
 
Marketing & Design for the Enterprise
Marketing & Design for the EnterpriseMarketing & Design for the Enterprise
Marketing & Design for the Enterprise
 
Playing the Marketing Long Game
Playing the Marketing Long GamePlaying the Marketing Long Game
Playing the Marketing Long Game
 

Recently uploaded

Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 

Recently uploaded (20)

Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 

Iterating over statistical models: NCAA tournament edition

  • 1. ITERATING OVER STATISTICAL MODELS NCAA TOURNAMENT EDITION
  • 2. Daniel Lee WHY ARE YOU LISTENING TO ME? ▸ Stan developer
 http://mc-stan.org
 ▸ Researcher at Columbia
 ▸ Co-founder of Stan Group
 training / statistical support / consulting 
 
 
 ▸ email: bearlee@alum.mit.edu
 twitter: @djsyclik / @mcmc_stan
 web: http://syclik.com
  • 4. WHAT IS MARCH MADNESS?
  • 5. WHAT IS MARCH MADNESS?
  • 7. P(VILLANOVA WIN)? CHAMPIONSHIP GAME: 
 VILLANOVA VS NORTH CAROLINA
  • 8. P(VILLANOVA WIN)? VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS ▸ 1:00. 70 - 69 ▸ 0:35. 72 - 69 ▸ 0:23. 72 - 71
  • 9. P(VILLANOVA WIN)? VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS ▸ 1:00. 70 - 69 ▸ 0:35. 72 - 69 ▸ 0:23. 72 - 71 ▸ 0:13. 74 - 71
  • 10. P(VILLANOVA WIN)? VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS ▸ 1:00. 70 - 69 ▸ 0:35. 72 - 69 ▸ 0:23. 72 - 71 ▸ 0:13. 74 - 71 ▸ 0:06. 74 - 74
  • 11. P(VILLANOVA WIN)? VILLANOVA WILDCATS VS. NORTH CAROLINA TAR HEELS ▸ 1:00. 70 - 69 ▸ 0:35. 72 - 69 ▸ 0:23. 72 - 71 ▸ 0:13. 74 - 71 ▸ 0:06. 74 - 74 ▸ 0:00. 77 - 74. Villanova wins.
  • 12.
  • 14. WRITING CODE IS A DISCIPLINE
  • 15. WRITING CODE IS A DISCIPLINE ▸ design patterns ▸ testing ▸ code review ▸ maintenance ▸ modularity ▸ collaboration CAN BE
  • 16. IS STATISTICAL MODELING A DISCIPLINE? ▸ Art or science?
 ▸ Models have names
 ▸ Statistical model vs implementation
 ▸ Collaboration on the statistical model?
  • 17. TREAT STATISTICAL MODELS AS CODE WHAT DO WE NEED TO DO ▸ Elevate statistical models: first class entity
 ▸ Modularize
 ▸ Language
 ▸ Discuss subtle details
 ▸ Collaborate
  • 18. TREAT STATISTICAL MODELS AS CODE STAN GETS US CLOSE ▸ statistical modeling language ▸ domain-specific language
 has its own grammar; not R or BUGS! ▸ rstan
 shinystan, RStudio integration, rstanarm ▸ open-source
 core libraries are new BSD ▸ Stan program ▸ plain text ▸ plays nicely with source repositories ▸ imperative language Note: Stan isn’t the only thing you can do this with
  • 20. MARCH MADNESS BASKETBALL HISTORY ▸ 1891 ▸ Dr. James Naismith. Springfield, MA ▸ Non-contact conditioning ▸ 13 simple rules ▸ Peach basket
 
 

  • 21. MARCH MADNESS BASKETBALL HISTORY ▸ 1891 ▸ Dr. James Naismith. Springfield, MA ▸ Non-contact conditioning ▸ 13 simple rules ▸ Peach basket
 10. The umpire shall be judge of the men and shall note the fouls and notify the referee when three consecutive fouls have been made. He shall have power to disqualify men according to Rule 5.
  • 22. MARCH MADNESS BASKETBALL NOW ▸ 2 x 20 min half ▸ Increasing score ▸ Points increment by 2, 3, and 1 ▸ 5 players, unlimited substitutions ▸ player DQ: 5th foul ▸ Bonus: 7 team fouls
 Double bonus: 10 team fouls
  • 23. MARCH MADNESS DATA ▸ 351 NCAA Division 1 Men’s basketball teams ▸ 33 conferences ▸ 5421 games ▸ 24 - 35 games per team ▸ Max 3 observations
  • 24.
  • 25. IS THIS BIG DATA?
  • 26. TALL DATA VS WIDE DATA ▸ Tall data
 lots of replications
 
 ▸ Wide data
 lots of fields
 
 day, home, score, ot, fgm, fga, 3pm, 3pa, 3a, fta, ftm, or, dr, ast, to, stl, blk, pf
  • 27. THREE STEPS OF BAYESIAN DATA ANALYSIS
  • 28. ANDREW GELMAN IN BDA THE THREE STEPS OF BAYESIAN DATA ANALYSIS 1. Set up full probability model
 
 2. Condition on observed data
 
 3. Evaluate the fit of the model

  • 29. ANDREW GELMAN IN BDA THE THREE STEPS OF BAYESIAN DATA ANALYSIS 1. Set up full probability model
 Write a Stan program
 2. Condition on observed data
 
 3. Evaluate the fit of the model

  • 30. ANDREW GELMAN IN BDA THE THREE STEPS OF BAYESIAN DATA ANALYSIS 1. Set up full probability model
 Write a Stan program
 2. Condition on observed data
 Run RStan
 3. Evaluate the fit of the model

  • 31. ANDREW GELMAN IN BDA THE THREE STEPS OF BAYESIAN DATA ANALYSIS 1. Set up full probability model
 Write a Stan program
 2. Condition on observed data
 Run RStan
 3. Evaluate the fit of the model
 R, ShinyStan, posterior predictive checks

  • 33. ITERATING OVER STATISTICAL MODELS STATISTICAL MODEL #1 ▸ Only 2015-2016 matters ▸ Teams have a latent ability ▸ “logistic regression” ▸ “Bradley-Terry model” y ⇠ bernoulli(logit 1 (✓1 ✓2))
  • 34.
  • 35. P(VILLANOVA > UNC | DATA) = 0.73
  • 36.
  • 37.
  • 38. ITERATING OVER STATISTICAL MODELS TREAT STATISTICAL MODELS AS CODE ▸ Statistical model in a separate file ▸ Git ▸ Testing ▸ Inspection of fit ▸ Backtest on historical data ▸ Priors ▸ “Model #1”
  • 39. IN WRITING, YOU MUST KILL ALL YOUR DARLINGS. Willliam Faulkner
  • 40. ITERATING OVER STATISTICAL MODELS STATISTICAL MODEL #2 ▸ Home court advantage!
 ▸ Teams have a latent ability ▸ “logistic regression” ▸ “Bradley-Terry model” y ⇠ bernoulli(logit 1 (↵ + ✓1 ✓2))
  • 41.
  • 42. P(VILLANOVA > UNC | DATA) = 0.52
  • 43.
  • 44. ITERATING OVER STATISTICAL MODELS STATISTICAL MODEL #3 ▸ Assumptions: ▸ Only 2015-2016 matters ▸ Teams have a latent ability ▸ Model points ▸ Add a home court advantage
  • 45.
  • 46.
  • 47.
  • 48. ITERATING OVER STATISTICAL MODELS HOW DID WE DO? ▸ Kaggle: 87 / 608 ▸ What went wrong? ▸ What’s next?
  • 50. THANKS ▸ Collaborative effort ▸ NCAA modeling:
 Rob Trangucci ▸ Stan team
 Andrew Gelman, Bob Carpenter, Matt Hoffman, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, Allen Riddell, Marco Ignacio, Jeff Arnold, Mitzi Morris, Rob Goedman, Brian Lau, Jonah Gabry, Alp Kucukelbir, Robert Grant, Dustin Tran, Krzysztof Sakrejda, Aki Vehtari, Rayleigh Lei, Sebastian Weber HELP ▸ http://mc-stan.org ▸ stan-users mailing list
 ▸ Stan Group Inc.
 http://stan.fit
 training / statistical support / consulting
 
 ▸ bearlee@alum.mit.edu / @djsyclik / @mcmc_stan