Striving to Demystify
Bayesian Computational
Modelling
UNIGE R Lunch
Speaker: Marco Wirthlin
@marcowirthlin
Abstract and Context of the Talk
Abstract
Bayesian approaches to computational modelling have experienced a slow but steady gain in recognition and
usage in academia and industry alike, accompanying the growing availability of ever more powerful computing
platforms at shrinking costs. Why would one use such techniques? How are those models conceived and
implemented? What is the recommended workflow? Why make life hard when there are p-values?
In his talk, Marco Wirthlin will first give an introduction to the statistical notions supporting Bayesian computation and
explain how it differs from the Frequentist framework. In the second half, an example of a recommended workflow is
outlined on a simple toy model, with simulated data. Live coding will be used as much as possible to illustrate
concepts on an implementational level in the R language. Ample literature and media references for self-learning
will be provided during the talk.
Context and Licence
This talk was given in the context of the “R Lunch” on the 29th of October 2019 at the University of Geneva and
was organized by Elise Tancoigne (@tancoigne) & Xavier Adam (@xvrdm). Many thanks for inviting me! :D
Code (if any) is licensed under the BSD 3-Clause licence, while the text is licensed under CC BY-NC 4.0. Any derived work
has been cited. Please contact me if you see non-attributed work (marco.wirthlin@gmail.com).
ONLINE VERSION
The difference between Bayesian and Frequentist statistics is how
probability theory is applied to achieve their respective goals.
[Diagram] Computational Models: Relationship between Variables → Mathematical Description (Generative Structure, e.g. Y = a * X + b) → Solution Algorithm → Implementation (Simulation). Goal: Conceptual Hygiene.
What is a generative model?
A hypothesis about the underlying mechanisms that produced the data, AKA “learning the class”. No shortcuts!
D = [2, 8, ..., 9] (data); θ, μ, ξ, ... (parameters) → Model → Bayesian Inference
● Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in Neural Information Processing Systems, pages 841–848.
● Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
Golf Example (Gelman 2019)
Andrew Gelman: https://mc-stan.org/users/documentation/case-studies/golf.html
Let's think about how golf works!
Computational Models can have different Goals
Decision making: Fisherian (frequentist) statistics, hypothesis tests, p-values...
Quantifying how well we understand a phenomenon: Bayesian statistics, generative models, simulations...
Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
Frequentist statistics: parameter “fixed”
Bayesian statistics: parameter “variable”
Frequentist statistics: maximum likelihood
Bayesian statistics: likelihood
Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
Bayesian Inference (BI)
[Diagram] BI builds on Likelihoods, Frequentist Statistics and Graphical Models (background), and on Probabilistic Programming (implementation).
Likelihoods
Normal distribution: x ~ N(μ, σ²) — “X is normally distributed”
L = p(D | θ), here L = p(D | μ, σ²) — “the probability of D under a distribution with mean μ and SD σ”
PDF: fix parameters, vary data
L: fix data, vary parameters
Dashboard link: https://seneketh.shinyapps.io/Likelihood_Intuition
● https://www.youtube.com/watch?v=ScduwntrMzc
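The distinction above (PDF: fix parameters, vary data; likelihood: fix data, vary parameters) can be sketched in a few lines of R. The data here are made up for illustration:

```r
# Likelihood intuition sketch (toy data): for fixed data D,
# evaluate p(D | mu, sigma) over a grid of candidate mu values.
D <- c(2, 8, 9)                      # fixed (toy) data
mu_grid <- seq(0, 12, by = 0.1)      # varying parameter
sigma <- 2                           # SD fixed for simplicity

# log-likelihood of D for each candidate mu
loglik <- sapply(mu_grid, function(mu) {
  sum(dnorm(D, mean = mu, sd = sigma, log = TRUE))
})

mu_grid[which.max(loglik)]           # peaks close to mean(D)
plot(mu_grid, loglik, type = "l",
     xlab = expression(mu), ylab = "log-likelihood")
```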
Frequentist Inference
Y = [7, ..., 2]
X = [2, ..., 9]
Y = a * X + b
Y ~ N(a * X + b, σ²)
L = p(D | a, b, σ²)
MLE: argmax over a, b, σ² of Σ ln(p(D | a, b, σ²))
“True” Population?
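A minimal MLE sketch in R, on simulated data (the true parameter values below are made up for illustration):

```r
# MLE sketch for Y = a*X + b with Gaussian noise (simulated data).
set.seed(42)
X <- runif(100, 0, 10)
Y <- 1.5 * X + 4 + rnorm(100, sd = 2)   # "true" a = 1.5, b = 4, sigma = 2

# negative log-likelihood: -sum(ln p(D | a, b, sigma))
nll <- function(par) {
  a <- par[1]; b <- par[2]
  sigma <- exp(par[3])                  # optimise log-sigma to keep sigma > 0
  -sum(dnorm(Y, mean = a * X + b, sd = sigma, log = TRUE))
}

fit <- optim(c(0, 0, 0), nll)
c(a = fit$par[1], b = fit$par[2], sigma = exp(fit$par[3]))  # near 1.5, 4, 2
```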
“True” Population → Sample: N = 3, D = [7, 3, 2]
Sampling Distribution (e.g. F distribution): test statistic = inter-group variance / intra-group variance
As the number of samples → ∞, the Central Limit Theorem gives the sampling distribution of the mean under H0.
“Long-run” probability.
● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
Frequentist Inference
● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
“When a frequentist says that the probability for
"heads" in a coin toss is 0.5 (50%), she means that in
infinitely many such coin tosses, 50% of the coins
will show "heads".”
Frequentist Inference
If H0 is very unlikely “in the long run”, we reject it in favour of H1.
Bayesian Inference
1) Build a generative model that imitates the structure of the data. (model)
2) Assign principled probabilities to your parameters. (priors)
3) Update those probabilities by incorporating data. (posteriors)
Probabilities: Epistemic/ontological uncertainty.
How well do we understand our phenomenon?
https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
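The three steps above can be sketched with a grid approximation in R. This is a toy coin-toss model (not the talk's linear model), chosen because a single parameter keeps the grid small:

```r
# Grid-approximation sketch of the three steps: theta = P(heads).
theta <- seq(0, 1, length.out = 1000)       # parameter grid
prior <- dbeta(theta, 2, 2)                 # principled prior (step 2)
heads <- 7; tosses <- 10                    # observed (toy) data
lik <- dbinom(heads, tosses, theta)         # generative model (step 1)

posterior <- lik * prior                    # update with data (step 3)
posterior <- posterior / sum(posterior)     # normalise over the grid

theta[which.max(posterior)]                 # posterior mode, near 0.67
```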
Y = a * X + b
Y ~ N(a * X + b, σ²)
Assign probabilities to the parameters:
p(D, a, b, σ²) = p(D | a, b, σ²) * p(a) * p(b) * p(σ²)
Priors: a ~ N(1, 0.1), b ~ N(4, 0.5), σ² ~ G(1, 0.1)
p(a), p(b), p(σ²) are the priors; p(D | a, b, σ²) is the likelihood, generically p(D | θ).
From model to code
Y = a * X + b
a ~ N(1, 0.1)
b ~ N(4, 0.5)
σ² ~ G(1, 0.1)
Y ~ N(a * X + b, σ²)
● More examples: https://mc-stan.org/users/documentation/case-studies
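One possible Stan translation of the model above (a sketch; note that Stan's `normal` is parameterised by the standard deviation, so σ² enters through `sqrt()`):

```stan
data {
  int<lower=1> N;
  vector[N] X;
  vector[N] Y;
}
parameters {
  real a;
  real b;
  real<lower=0> sigma2;
}
model {
  a ~ normal(1, 0.1);                   // prior on slope
  b ~ normal(4, 0.5);                   // prior on intercept
  sigma2 ~ gamma(1, 0.1);               // prior on variance
  Y ~ normal(a * X + b, sqrt(sigma2));  // likelihood
}
```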
Implementation in R
Stan
https://mc-stan.org/
https://dougj892.github.io/bayes4ie2/texts/3_intro_to_stan/
Golf Again!
https://github.com/stan-dev/example-models/tree/master/knitr/golf
Bayesian Workflow
1) Build generative model
2) Assign principled probabilities to parameters
3) Simulate data
4) Check if assumptions are OK with data
5) Expose model to data
6) Simulate data with fitted parameters
● https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
● Toward a principled Bayesian workflow in cognitive science: https://arxiv.org/abs/1904.12765
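The "simulate data / check assumptions" steps can be sketched as a prior predictive simulation in R, using the priors from the linear model above (the X grid and number of simulations are arbitrary choices):

```r
# Prior predictive simulation sketch: draw parameters from the priors
# and simulate data to check whether the assumptions are plausible.
set.seed(1)
n_sims <- 100
X <- seq(0, 10, length.out = 50)

sim_data <- replicate(n_sims, {
  a <- rnorm(1, mean = 1, sd = 0.1)
  b <- rnorm(1, mean = 4, sd = 0.5)
  sigma2 <- rgamma(1, shape = 1, rate = 0.1)   # assuming G(1, 0.1) is shape/rate
  rnorm(length(X), mean = a * X + b, sd = sqrt(sigma2))
})

# if these simulated Ys look implausible for the phenomenon,
# revisit the priors before ever touching real data
matplot(X, sim_data, type = "l", lty = 1,
        col = rgb(0, 0, 0, 0.1), ylab = "simulated Y")
```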
Short demo.
Thank you for your attention… and endurance!
Additional Slides
Sources, links and more!
Who to follow on Twitter?
● Chris Fonnesbeck @fonnesbeck (pyMC3)
● Thomas Wiecki @twiecki (pyMC3)
Blog: https://twiecki.io/ (nice intros)
● Bayes Dose @BayesDose (general info and papers)
● Richard McElreath @rlmcelreath (ecology, Bayesian statistics expert)
All his lectures: https://www.youtube.com/channel/UCNJK6_DZvcMqNSzQdEkzvzA
● Michael Betancourt @betanalpha (Stan)
Blog: https://betanalpha.github.io/writing/
Specifically: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
● Rasmus Bååth @rabaath
Great video series: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-one/
● Frank Harrell @f2harrell (statistics sage)
Great Blog: http://www.fharrell.com/
● Andrew Gelman @StatModeling (statistics sage)
https://statmodeling.stat.columbia.edu/
● Judea Pearl @yudapearl
Book of Why: http://bayes.cs.ucla.edu/WHY/ (more about causality, BN and DAG)
● AND MANY MORE!
All sources in one place!
About Generative vs. Discriminative models:
Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison
of logistic regression and naive bayes. In Advances in neural information processing systems,
pages 841–848.
Rasmus Bååth:
Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?:
https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
When to use ML vs. Statistical Modelling:
Frank Harrell's Blog:
http://www.fharrell.com/post/stat-ml/
http://www.fharrell.com/post/stat-ml2/
Frequentist approach: How do sampling distributions work (applet):
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
Bayesian inference and computation:
John Kruschke:
Doing Bayesian Data Analysis:
A Tutorial with R, JAGS, and Stan Chapter 5
Rasmus Bååth:
http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/
Richard McElreath:
Statistical Rethinking book and lectures
(https://www.youtube.com/watch?v=4WVelCswXo4 )
Many model examples in Stan:
https://mc-stan.org/users/documentation/case-studies
About Bayesian Neural Networks:
https://alexgkendall.com/computer_vision/bayesian_deep_learning_for_safe_ai/
https://twiecki.io/blog/2018/08/13/hierarchical_bayesian_neural_network/
Volatility Examples:
Hidden Markov Models:
https://github.com/luisdamiano/rfinance17
Volatility Garch Model and Bayesian Workflow:
https://luisdamiano.github.io/personal/volatility_stan2018.pdf
Dictionary: Stats ↔ ML
https://ubc-mds.github.io/resources_pages/terminology/
The Bayesian Workflow:
https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
Algorithm explanation applet for MCMC exploration of the parameter space:
http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/
Probabilistic Programming Conference Talks:
https://www.youtube.com/watch?v=crvNIGyqGSU
Dictionary: Stats ↔ ML
Check https://ubc-mds.github.io/resources_pages/terminology/ for more terminology.
Statistics ↔ Machine learning / AI
Estimation/Fitting ~ Learning
Hypothesis ~ Classification rule
Data Point ~ Example / Instance
Regression ~ Supervised Learning
Classification ~ Supervised Learning
Covariates ~ Features
Parameters ~ Features
Response ~ Label
Factor ~ Factor (categorical variables)
Likelihood ~ Cost Function (sometimes)
Slides about Bayesian Computation
Bayesian Inference
● John Kruschke: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan Chapter 5
● Rasmus Bååth: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/
● Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
Discrete values: just sum it up! :)
Continuous values: integration over the complete parameter space... :(
Averaging over the complete
parameter space via integration is
impractical!
Solution: we sample from the
posterior distribution with
smart MCMC algorithms!
(Subject of another talk)
Bayesian Inference
● http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/
Let's compute this and sample from it!
Principles of Bayesian Modeling
[Diagram] Model/Hypothesis: what we know about a cognitive phenomenon before the measurement (to-date scientific knowledge) → Data: knowledge gained by the measurement → Updated beliefs: knowledge gained after the measurement.
Bayesian Formulations: MCMC Requirements
● If: the prior distribution is specified by a function that is easily evaluated — i.e. if you specify a value for θ, the value of p(θ) is easily determined;
● And if: the value of the likelihood function, p(D | θ), can be computed for any specified values of D and θ;
● Then: the method produces an approximation of the posterior distribution, p(θ | D), in the form of a large number of θ values sampled from that distribution (just as we would sample people from a given population).
Metropolis Algorithm Steps (true parameter: 0.7) — Source: Kruschke, J. (2014).
HMC Step Sizes — Source: Kruschke, J. (2014).
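The Metropolis steps illustrated in Kruschke (2014) can be sketched in a few lines of R. This is a toy coin-toss target (7 heads in 10 tosses with a flat prior), not the figure's exact setup:

```r
# Minimal Metropolis sketch: sample theta = P(heads) from the
# unnormalised posterior for 7 heads in 10 tosses, flat prior on [0, 1].
set.seed(7)
target <- function(theta) {
  if (theta < 0 || theta > 1) return(0)   # prior is zero outside [0, 1]
  dbinom(7, 10, theta)                    # likelihood * flat prior
}

n_iter <- 10000
chain <- numeric(n_iter)
chain[1] <- 0.5
for (i in 2:n_iter) {
  proposal <- chain[i - 1] + rnorm(1, 0, 0.1)          # symmetric proposal
  accept_prob <- min(1, target(proposal) / target(chain[i - 1]))
  chain[i] <- if (runif(1) < accept_prob) proposal else chain[i - 1]
}

mean(chain)   # close to the analytic posterior mean 8/12 ~ 0.67
```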

More Related Content

Similar to Striving to Demystify Bayesian Computational Modelling

Recommandation sociale : filtrage collaboratif et par le contenu
Recommandation sociale : filtrage collaboratif et par le contenuRecommandation sociale : filtrage collaboratif et par le contenu
Recommandation sociale : filtrage collaboratif et par le contenu
Patrice Bellot - Aix-Marseille Université / CNRS (LIS, INS2I)
 
深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////
alicejiang7888
 
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
diannepatricia
 
Lecture20 xing
Lecture20 xingLecture20 xing
Lecture20 xing
Tianlu Wang
 
Towards reproducibility and maximally-open data
Towards reproducibility and maximally-open dataTowards reproducibility and maximally-open data
Towards reproducibility and maximally-open data
Pablo Bernabeu
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
Universitat Politècnica de Catalunya
 
Reproducible Science and Deep Software Variability
Reproducible Science and Deep Software VariabilityReproducible Science and Deep Software Variability
Reproducible Science and Deep Software Variability
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
ANChor: A powerful approach to scientific communication
ANChor: A powerful approach to scientific communicationANChor: A powerful approach to scientific communication
ANChor: A powerful approach to scientific communication
Josh Inouye
 
download
downloaddownload
downloadbutest
 
download
downloaddownload
downloadbutest
 
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
The Statistical and Applied Mathematical Sciences Institute
 
Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph Maintenance
Paul Groth
 
Seminar CCC
Seminar CCCSeminar CCC
Seminar CCC
Antonio Lieto
 
A Survey of Annotation Tools for Lecture Materials
A Survey of Annotation Tools for Lecture MaterialsA Survey of Annotation Tools for Lecture Materials
A Survey of Annotation Tools for Lecture Materials
Kai Michael Höver
 
Big data and SP Theory of Intelligence
Big data and SP Theory of IntelligenceBig data and SP Theory of Intelligence
Big data and SP Theory of Intelligence
Varsha Prabhakar
 
What are we? Statistical Ecologists or Ecological Statisticians?
What are we?  Statistical Ecologists or Ecological Statisticians?What are we?  Statistical Ecologists or Ecological Statisticians?
What are we? Statistical Ecologists or Ecological Statisticians?
Bob O'Hara
 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scale
Andy Petrella
 
Lecture14 xing fei-fei
Lecture14 xing fei-feiLecture14 xing fei-fei
Lecture14 xing fei-fei
Tianlu Wang
 
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
Maryam Farooq
 

Similar to Striving to Demystify Bayesian Computational Modelling (20)

Recommandation sociale : filtrage collaboratif et par le contenu
Recommandation sociale : filtrage collaboratif et par le contenuRecommandation sociale : filtrage collaboratif et par le contenu
Recommandation sociale : filtrage collaboratif et par le contenu
 
深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////
 
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
 
Lecture20 xing
Lecture20 xingLecture20 xing
Lecture20 xing
 
Towards reproducibility and maximally-open data
Towards reproducibility and maximally-open dataTowards reproducibility and maximally-open data
Towards reproducibility and maximally-open data
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
 
Reproducible Science and Deep Software Variability
Reproducible Science and Deep Software VariabilityReproducible Science and Deep Software Variability
Reproducible Science and Deep Software Variability
 
ANChor: A powerful approach to scientific communication
ANChor: A powerful approach to scientific communicationANChor: A powerful approach to scientific communication
ANChor: A powerful approach to scientific communication
 
download
downloaddownload
download
 
download
downloaddownload
download
 
Montpellier
MontpellierMontpellier
Montpellier
 
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
QMC: Transition Workshop - Selected Highlights from the Probabilistic Numeric...
 
Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph Maintenance
 
Seminar CCC
Seminar CCCSeminar CCC
Seminar CCC
 
A Survey of Annotation Tools for Lecture Materials
A Survey of Annotation Tools for Lecture MaterialsA Survey of Annotation Tools for Lecture Materials
A Survey of Annotation Tools for Lecture Materials
 
Big data and SP Theory of Intelligence
Big data and SP Theory of IntelligenceBig data and SP Theory of Intelligence
Big data and SP Theory of Intelligence
 
What are we? Statistical Ecologists or Ecological Statisticians?
What are we?  Statistical Ecologists or Ecological Statisticians?What are we?  Statistical Ecologists or Ecological Statisticians?
What are we? Statistical Ecologists or Ecological Statisticians?
 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scale
 
Lecture14 xing fei-fei
Lecture14 xing fei-feiLecture14 xing fei-fei
Lecture14 xing fei-fei
 
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
 

Recently uploaded

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 

Recently uploaded (20)

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 

Striving to Demystify Bayesian Computational Modelling

  • 1. Striving to Demystify Bayesian Computational Modelling UNIGE R Lunch Speaker: Marco Wirthlin @marcowirthlin
  • 2. Abstract and Context of the Talk Abstract Bayesian approaches to computational modelling have experienced a slow, but steady gain in recognition and usage in academia and industry alike, accompanying the growing availability of evermore powerful computing platforms at shrinking costs. Why would one use such techniques? How are those models conceived and implemented? Which is the recommended workflow? Why make life hard when there are P-values? In his talk, Marco Wirthlin will first attempt an introduction to statistical notions supporting Bayesian computation and explain the difference to the Frequentist framework. In the second half, an example of a recommended workflow is outlined on a simple toy model, with simulated data. Live coding will be used as much as possible to illustrate concepts on an implementational level in the R language. Ample literature and media references for self-learning will be provided during the talk. Context and Licence This talk was performed in the context of the “R Lunch” on the 29 of October 2019 at the University of Geneva and was organized by Elise Tancoigne (@tancoigne) & Xavier Adam (@xvrdm). Many thanks for inviting me! :D Code (if any) is licenced under the BSD (3 clause), while the text licence is CC BY-NC 4.0. Any derived work has been cited. Please contact me if you see non-attributed work (marco.wirthlin@gmail.com). ONLINE VERSION
  • 3. The difference between Bayesian and Frequentist statistics is how probability theory is applied to achieve their respective goals.
  • 4.
  • 6.
  • 8. What is a generative model? ● Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Advances in neural information processing systems, pages 841–848. ● Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o Hypothesis of underlying mechanisms AKA: “Learning the class” No shorcuts! D = [2, 8, ..., 9] θ, μ, ξ, ... Parameters Model Bayesian Inference
  • 9. Golf Example (Gelman 2019) Andrew Gelman: https://mc-stan.org/users/documentation/case-studies/golf.html
  • 16. Computational Models can have different Goals Decision making Quantifying how well we understand a phenomenon Fisherian (frequentist) statistics, hypothesis tests, p-values... Bayesian statistics, generative models, simulations... Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
  • 17. Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
  • 18. Frequentist statistics: parameter “fixed” Bayesian statistics: parameter “variable” 乁 ( )⁰⁰ ĹĹ ⁰⁰ ㄏ Frequentist statistics: maximum likelihood Bayesian statistics: likelihood ┐('д')┌ ¯_( _ )_/¯⊙ ʖ⊙ Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
  • 20. Likelihoods Normal Distribution L = p(D | θ) x ~ N(μ, σ2 ) “The probability that D belongs to a distribution with mean μ and SD σ” L = p(D | μ, σ2 ) “X is normally distributed” PDF: Fix parameters, vary data L: Fix data, vary parameters Dashboard link: https://seneketh.shinyapps.io/Likelihood_Intuition ● https://www.youtube.com/watch?v=ScduwntrMzc
  • 21. Frequentist Inference Y = [7, ..., 2] X = [2, ..., 9] Y = a * X + b Y ~ N(a * X + b, σ2 ) L = p(D | a, b, σ2 ) argmax(Σln(p(D | a, b, σ2 )) a b σ2 MLE “True” Population ?
  • 22. “True” Population D = [7, 3, 2] Sample: N=3 Sampling Distribution e.g. F distribution Test statistic Inter-group var./ Intra-group var. ∞ H0 mean Central Limit Theorem “Long range” probability ● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html Frequentist Inference
  • 23. ● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html “When a frequentist says that the probability for "heads" in a coin toss is 0.5 (50%) she means that in infinitively many such coin tosses, 50% of the coins will show "head"”. Frequentist Inference If the H0 is very unlikely “in the long run”, we accept H1 .
  • 24. Bayesian Inference 1) Build a Generative model, imitate the data structure. (model) 2) Assign principled probabilities to your parameters. (priors) 3) Update those probabilities by incorporating data. (posteriors) Probabilities: Epistemic/ontological uncertainty. How well do we understand our phenomenon? https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
  • 25. Y = a * X + b Y ~ N(a * X + b, σ2 ) Assign probabilities to the parameters p(D, a, b, σ2 ) = p(D | a, b, σ2 )*p(a)*p(b)*p(σ2 ) a ~ N(1, 0.1) b ~ N(4, 0.5) σ2 ~ G(1, 0.1) p(a) p(b) p(σ2 ) p(D | a, b, σ2 ) p(D | θ)
  • 26. From model to code. The generative model, Y ~ N(a * X + b, σ²) with priors a ~ N(1, 0.1), b ~ N(4, 0.5), σ² ~ G(1, 0.1), translates almost line by line into probabilistic-programming code. ● More examples: https://mc-stan.org/users/documentation/case-studies
  • 33. Bayesian Workflow: 1) Build the generative model. 2) Assign principled probabilities (priors) to the parameters. 3) Simulate data from the model. 4) Check whether the model's assumptions are consistent with the data. 5) Expose the model to the data. 6) Simulate data with the fitted parameters. ● https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html ● Toward a principled Bayesian workflow in cognitive science: https://arxiv.org/abs/1904.12765
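The "simulate data" step of the workflow can be sketched in R: draw fake data from the generative model with known parameter values and verify that they can be recovered. All values are illustrative, and the `lm` fit is just a quick non-Bayesian sanity check:

```r
# Workflow steps 3-4: simulate data from the generative model with
# KNOWN parameter values, then check that they can be recovered.
set.seed(3)
n <- 50
X <- runif(n, min = 0, max = 10)
a <- 2; b <- 1; sigma <- 0.5          # known "true" values

Y_sim <- rnorm(n, mean = a * X + b, sd = sigma)

# Quick sanity check before any Bayesian fitting: if even a plain
# least-squares fit cannot recover a and b, the model is misspecified.
coef(lm(Y_sim ~ X))                   # intercept near b, slope near a
```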
  • 35. Thank you for your attention… and endurance!
  • 37. Who to follow on Twitter? ● Chris Fonnesbeck @fonnesbeck (pyMC3) ● Thomas Wiecki @twiecki (pyMC3) Blog: https://twiecki.io/ (nice intros) ● Bayes Dose @BayesDose (general info and papers) ● Richard McElreath @rlmcelreath (ecology, Bayesian statistics expert) All his lectures: https://www.youtube.com/channel/UCNJK6_DZvcMqNSzQdEkzvzA ● Michael Betancourt @betanalpha (Stan) Blog: https://betanalpha.github.io/writing/ Specifically: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html ● Rasmus Bååth @rabaath Great video series: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-one/ ● Frank Harrell @f2harrell (statistics sage) Great Blog: http://www.fharrell.com/ ● Andrew Gelman @StatModeling (statistics sage) https://statmodeling.stat.columbia.edu/ ● Judea Pearl @yudapearl Book of Why: http://bayes.cs.ucla.edu/WHY/ (more about causality, BN and DAG) ● AND MANY MORE!
  • 38. All sources in one place! About Generative vs. Discriminative models: Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Advances in neural information processing systems, pages 841–848. Rasmus Bååth: Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o When to use ML vs. Statistical Modelling: Frank Harrell's Blog: http://www.fharrell.com/post/stat-ml/ http://www.fharrell.com/post/stat-ml2/ Frequentist approach: How do sampling distributions work (applet): http://onlinestatbook.com/stat_sim/sampling_dist/index.html Bayesian inference and computation: John Kruschke: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan Chapter 5 Rasmus Bååth: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/ Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4 ) Many model examples in Stan: https://mc-stan.org/users/documentation/case-studies About Bayesian Neural Networks: https://alexgkendall.com/computer_vision/bayesian_deep_learning_for_safe_ai/ https://twiecki.io/blog/2018/08/13/hierarchical_bayesian_neural_network/ Volatility Examples: Hidden Markov Models: https://github.com/luisdamiano/rfinance17 Volatility Garch Model and Bayesian Workflow: https://luisdamiano.github.io/personal/volatility_stan2018.pdf Dictionary: Stats ↔ ML https://ubc-mds.github.io/resources_pages/terminology/ The Bayesian Workflow: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html Algorithm explanation applet for MCMC exploration of the parameter space: http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/ Probabilistic Programming Conference Talks: https://www.youtube.com/watch?v=crvNIGyqGSU
  • 39. Dictionary: Stats ↔ ML. Check https://ubc-mds.github.io/resources_pages/terminology/ for more terminology. Statistics ~ Machine learning / AI:
  ● Estimation/Fitting ~ Learning
  ● Hypothesis ~ Classification rule
  ● Data point ~ Example/Instance
  ● Regression ~ Supervised learning
  ● Classification ~ Supervised learning
  ● Covariates ~ Features
  ● Parameters ~ Features
  ● Response ~ Label
  ● Factor ~ Factor (categorical variables)
  ● Likelihood ~ Cost function (sometimes)
  • 40. Slides about Bayesian Computation
  • 41. Bayesian Inference ● John Kruschke: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan Chapter 5 ● Rasmus Bååth: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/ ● Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
  • 43. Bayesian Inference. Discrete parameter values: just sum it up! :) Continuous values: integration over the complete parameter space... :(
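"Just sum it up" is exactly what grid approximation does: discretize the parameter and the normalizing constant becomes a finite sum. A hypothetical coin-flip sketch in R (data and grid resolution chosen for illustration):

```r
# Grid approximation for a single parameter theta.
# Hypothetical coin: 10 heads in 15 tosses.
theta <- seq(0, 1, by = 0.01)        # discretized parameter grid
prior <- rep(1, length(theta))       # flat prior
lik   <- dbinom(10, size = 15, prob = theta)

unnorm    <- lik * prior
posterior <- unnorm / sum(unnorm)    # normalize by summing over the grid

theta[which.max(posterior)]          # posterior mode, near 10/15
```

This works nicely for one or two parameters; the grid explodes combinatorially in higher dimensions, which is precisely why the continuous case is painful.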
  • 44. Bayesian Inference. Averaging over the complete parameter space via integration is impractical! Solution: we sample from the posterior distribution with smart MCMC algorithms! (Subject of another talk)
  • 45. Bayesian Inference. Let's compute this and sample from it! ● http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/
  • 46. Principles of Bayesian Modeling. Prior: what we know about a cognitive phenomenon before the measurement (to-date scientific knowledge). Likelihood (model/hypothesis + data): knowledge gained by the measurement. Posterior: updated beliefs, i.e. knowledge after the measurement.
  • 48. MCMC Requirements: ● If the prior distribution is specified by a function that is easily evaluated (i.e. if you specify a value for θ, the value of p(θ) is easily determined), ● and if the value of the likelihood function p(D | θ) can be computed for any specified values of D and θ, ● then the method produces an approximation of the posterior distribution p(θ | D) in the form of a large number of θ values sampled from that distribution (just as we would sample people from a defined population).
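These two requirements are all a Metropolis sampler needs: the ability to evaluate the prior and the likelihood at any proposed θ. A minimal random-walk Metropolis sketch in R for a hypothetical coin-flip model (all numbers, including the proposal width, are illustrative):

```r
# Random-walk Metropolis: only point-wise evaluation of the prior and
# the likelihood is needed. Hypothetical data: 14 heads in 20 tosses.
set.seed(123)
log_post <- function(theta) {
  if (theta <= 0 || theta >= 1) return(-Inf)          # outside support
  dbinom(14, size = 20, prob = theta, log = TRUE) +   # log-likelihood
    dbeta(theta, 1, 1, log = TRUE)                    # flat Beta(1,1) prior
}

n_iter  <- 20000
samples <- numeric(n_iter)
theta   <- 0.5                              # starting value
for (i in 1:n_iter) {
  proposal <- theta + rnorm(1, sd = 0.1)    # symmetric random-walk proposal
  # Accept with probability min(1, posterior ratio)
  if (log(runif(1)) < log_post(proposal) - log_post(theta)) {
    theta <- proposal
  }
  samples[i] <- theta
}

mean(samples[-(1:1000)])   # posterior mean, near (14 + 1) / (20 + 2)
```

Discarding the first 1000 draws is a crude burn-in; production samplers such as Stan's HMC/NUTS automate tuning, warmup and convergence diagnostics.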
  • 51. HMC step sizes (true parameter: 0.7). Source: Kruschke, J. (2014).