Probabilistic Programming
A Brief introduction to
Probabilistic Programming and Python
EuroSciPy - University of Cambridge August 2015
peadarcoyle@googlemail.com
All opinions my own
Who am I?
I work as a Data Scientist for a large Telecommunications Company
Masters in Mathematics
Interned at Amazon
Was a consultant for a while
Occasional contributor to Pandas and other projects
Co-organizer of the Data Science Meetup in Luxembourg
Member of Royal Statistical Society and NumFOCUS
@springcoil
What is Probabilistic Programming?
Basically using random variables instead of variables
Allows you to create a generative story rather than a black box
A different tool to Machine Learning
A different paradigm to frequentist statistics
Forces you to be explicit about your 'subjective' assumptions
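To make "random variables instead of variables" concrete, here is a minimal sketch using only the Python standard library. The conversion-rate scenario, the Beta(10, 90) belief and all numbers are assumptions chosen for illustration; they are not from the talk.

```python
import random
import statistics

random.seed(42)

# Ordinary programming: a quantity is a single fixed value.
conversion_rate = 0.10
visitors = 1000
expected_sales = conversion_rate * visitors  # one number

# Probabilistic programming: the same quantity is a random variable,
# represented here by many draws from an assumed Beta(10, 90) belief.
rate_draws = [random.betavariate(10, 90) for _ in range(10_000)]
sales_draws = [r * visitors for r in rate_draws]

sales_mean = statistics.mean(sales_draws)
sales_sorted = sorted(sales_draws)
low, high = sales_sorted[250], sales_sorted[9750]  # central 95% interval

print(f"point estimate: {expected_sales:.0f}")
print(f"distribution: mean {sales_mean:.0f}, 95% interval ({low:.0f}, {high:.0f})")
```

The second half carries the uncertainty through the calculation, so the answer is a distribution with an interval rather than a single number.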
Source: Olivier Grisel
Bayesian Statistics
I studied Mathematics and encountered Bayesian statistics in textbooks
This is a hard area to do by pen and paper, and most integrals can't be
solved in exact form
Thankfully, Monte Carlo simulations were invented
These simulations are used to approximate your posterior distribution
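As a rough illustration of how simulation stands in for an integral, here is a small random-walk Metropolis sampler for the bias of a coin. The data (7 heads in 10 tosses), the flat prior and the step size are assumptions for this sketch. The example is chosen because the exact posterior, Beta(8, 4), is known, so the Monte Carlo answer can be checked.

```python
import math
import random

random.seed(0)

heads, tosses = 7, 10  # made-up data for illustration


def log_posterior(p):
    """Unnormalised log posterior: flat prior on (0, 1) times binomial likelihood."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return heads * math.log(p) + (tosses - heads) * math.log(1.0 - p)


def metropolis(n_samples, step=0.2, start=0.5):
    """Random-walk Metropolis sampler over p."""
    samples, current = [], start
    current_lp = log_posterior(current)
    for _ in range(n_samples):
        proposal = current + random.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if random.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


draws = metropolis(50_000)[5_000:]  # drop the first draws as burn-in
mc_mean = sum(draws) / len(draws)
exact_mean = (heads + 1) / (tosses + 2)  # mean of Beta(8, 4)

print(f"Monte Carlo posterior mean: {mc_mean:.3f}, exact: {exact_mean:.3f}")
```

The sampler never needs the normalising integral; it only compares posterior values at two points, which is why the approach scales to models with no closed-form solution.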
Some terminology
Attribution: Quantopian blog
How do you pick your prior?
This is a bit of an art
You generally base the prior on experience
As you add more data, the choice of prior matters less and less
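A quick way to see the prior washing out is to take two very different priors and feed both the same data. The sketch below uses the conjugate Beta-Binomial update. The two priors, the true rate of 0.3 and the idealised noise-free counts are all assumptions for illustration.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Mean of the Beta posterior for a binomial rate under a Beta(a, b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


true_rate = 0.3
priors = {"sceptical Beta(2, 18)": (2, 18), "optimistic Beta(18, 2)": (18, 2)}

gaps = []
for trials in (10, 100, 1_000, 10_000):
    successes = round(true_rate * trials)  # idealised data, no sampling noise
    means = [posterior_mean(a, b, successes, trials) for a, b in priors.values()]
    gaps.append(abs(means[0] - means[1]))
    print(f"n={trials:>6}: " + ", ".join(f"{m:.3f}" for m in means))
```

The gap between the two posterior means is 16 / (20 + n) in this setup, so it shrinks steadily as the number of observations n grows.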
Huh, but isn't Probabilistic Programming just Stan and BUGS?
No, in Python you have PyMC3
A complete rewrite of PyMC2 now in 'Beta' status
Based upon Theano
Computational techniques for handling gradients
Automatic Differentiation and GPU speedup
Theano is also used in deep learning!
Currently there is a project to port 'BMH' from PyMC2 to PyMC3
I gave a thorough tutorial on this: see my github
Key authors: John Salvatier, Thomas Wiecki, Chris Fonnesbeck
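To give a feel for what "automatic differentiation" buys you, here is a toy forward-mode version built on dual numbers. This is a deliberately simple stand-in and an assumption of this sketch: Theano differentiates a symbolic computation graph, which is a different mechanism. The Poisson log-likelihood and the counts are made up for the example.

```python
import math


class Dual:
    """A number carrying its value and its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def log(x):
    """Natural logarithm of a Dual, using the chain rule for the derivative."""
    return Dual(math.log(x.value), x.deriv / x.value)


def poisson_loglik(rate, counts):
    """Poisson log-likelihood in the rate, dropping the constant log(k!) terms."""
    total = Dual(0.0)
    for k in counts:
        total = total + k * log(rate) + (-1.0) * rate
    return total


counts = [2, 3, 1, 4]  # made-up scores
result = poisson_loglik(Dual(2.0, 1.0), counts)  # seed: d(rate)/d(rate) = 1
analytic = sum(counts) / 2.0 - len(counts)       # sum(k) / rate - n

print(result.deriv, analytic)
```

The gradient comes out of the same code that computes the log-likelihood, with no hand-derived formula. Gradient-based samplers rely on exactly this kind of machinery.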
Case study: Rugby Analytics
I wanted to do a model of the Six Nations last year.
I wanted to build an understandable model to predict the winner
Key Info: Inferring the 'strength' of each team.
We only have scoring data, which is noisy, hence Bayesian stats
What did I do?
1. I picked Gamma as a prior for all teams
2. I used a Hierarchical Model because I wanted home advantage to be
stronger for stronger teams based
3. From this I was able to create a novel model based only on historical
results and scoring intensity
4. I sampled from the posterior distribution using MCMC
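The generative story behind a model like this can be sketched as a forward simulation: each side's points are drawn from a distribution whose rate (the scoring intensity) depends on team strengths and home advantage. Everything specific below is an assumption for illustration, not the fitted model from the talk: the Poisson likelihood, the log-linear form, and every parameter value are hypothetical. Poisson is also a crude fit for rugby points, which arrive in chunks of 3, 5 and 7.

```python
import math
import random

random.seed(1)


def poisson(rate):
    """Draw one Poisson sample using Knuth's multiplication method."""
    threshold, k, product = math.exp(-rate), 0, random.random()
    while product > threshold:
        k += 1
        product *= random.random()
    return k


# Hypothetical parameters on the log scale; a fitted model would infer these.
intercept = 3.0
home_advantage = 0.2
attack = {"England": 0.15, "Ireland": 0.10, "Italy": -0.25}
defence = {"England": 0.10, "Ireland": 0.15, "Italy": -0.25}


def simulate_match(home, away):
    """Simulate one match: each side's points are Poisson with its own intensity."""
    home_rate = math.exp(intercept + home_advantage + attack[home] - defence[away])
    away_rate = math.exp(intercept + attack[away] - defence[home])
    return poisson(home_rate), poisson(away_rate)


results = [simulate_match("Ireland", "Italy") for _ in range(2_000)]
home_win_share = sum(h > a for h, a in results) / len(results)

print(f"Ireland beat Italy at home in {home_win_share:.0%} of simulated matches")
```

Inference runs this story backwards: given the observed scores, MCMC finds the strengths and home advantage that plausibly produced them.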
Run the model
What actually happened
The model incorrectly predicted that England would come out on top.
Ireland actually won on points difference, by 6 points.
It really came down to the wire!
"Prediction is difficult especially about the future"
One of the problems is what we call 'over-shrinkage'. You can
delve into the results to see what the errors are; my model was within
the errors.
Hat tip: Thanks to Abraham Flaxman and the PyMC3 team for helping me
port this from PyMC2 to PyMC3
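For readers unfamiliar with shrinkage, a hierarchical model pulls each team's estimate toward the group average, and 'over-shrinkage' is when that pull is too strong for the extreme teams. The sketch below uses the textbook normal-normal formula with known variances. It is an assumption chosen for simplicity, not the rugby model itself, and all numbers are hypothetical.

```python
def shrunk_estimate(team_mean, group_mean, n_games, noise_var, between_team_var):
    """Posterior mean of a team effect in a normal-normal hierarchical model."""
    weight = n_games / (n_games + noise_var / between_team_var)
    return weight * team_mean + (1.0 - weight) * group_mean


group_mean = 20.0  # hypothetical league-wide average points per game
team_mean = 30.0   # hypothetical average for one strong team over 5 games

# Teams assumed to differ a lot: the estimate stays fairly close to the data.
mild = shrunk_estimate(team_mean, group_mean, n_games=5,
                       noise_var=100.0, between_team_var=25.0)

# Teams assumed to be nearly identical: the strong team is pulled hard to the mean.
strong = shrunk_estimate(team_mean, group_mean, n_games=5,
                         noise_var=100.0, between_team_var=4.0)

print(f"mild shrinkage: {mild:.2f}, strong shrinkage: {strong:.2f}")
```

When the between-team variance is underestimated, the best and worst teams end up looking more average than they are, which is one way a favourite can be mispredicted.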
Lessons learned
I can build an explainable model using PyMC2 and PyMC3
Generative stories help you build up interest with your colleagues
Communication is the 'last mile' problem of Data Science
PyMC3 is cool: please use it, and please contribute
Wanna learn more?
BMH
Jake VanDerPlas
PyMC3
peadarcoyle@googlemail.com
Probabilistic Programming in Python
