Introduction to Model-Based Machine Learning for Transportation

Introduction to Model-Based Machine Learning
A Webinar to TRB ADB40 Big Data Initiative
by Daniel Emaasit
Ph.D. Student
Department of Civil and Environmental Engineering
University of Nevada, Las Vegas, USA
emaasit@unlv.nevada.edu
September 27, 2016

Acknowledgments
Prof. Francisco C. Pereira, Dr. Filipe Rodrigues
Machine Learning for Mobility group, DTU: tutorial from the Summer School on Big Data, Mobility Patterns, Transport Analytics, July 1-3, 2016, by Filipe Rodrigues and Francisco Pereira

Introduction

Current Challenges in Adopting Machine Learning
Common challenges in adopting ML:
An overwhelming number of traditional ML methods to learn
Deciding which algorithm to use, and why
Some custom problems may not fit any existing algorithm

What is Model-Based Machine Learning?
A different viewpoint for machine learning, proposed by Bishop (2013) and Winn et al. (2015)
Goal: provide a single development framework that supports the creation of a wide range of bespoke models
Core idea: all assumptions about the problem domain are made explicit in the form of a model
References:
Bishop, C. M. (2013). Model-Based Machine Learning. Philosophical Transactions of the Royal Society A, 371, pp. 1–17.
Winn, J., Bishop, C. M., Diethe, T. (2015). Model-Based Machine Learning. Microsoft Research Cambridge. http://www.mbmlbook.com.

What is a Model in MBML?
A model:
is a set of assumptions, expressed in mathematical/graphical form
expresses all parameters and variables as random variables
shows the dependencies between variables
Figure 2: Description of a model

Key Ideas of MBML
MBML is built upon 3 key ideas:
the use of Probabilistic Graphical Models (PGM)
the adoption of Bayesian ML
the application of fast, deterministic inference algorithms

Key Idea 1: Probabilistic Graphical Models
Combine probability theory with graphs (e.g. factor graphs)

Key Idea 2: Bayesian Machine Learning
Everything follows from two simple rules of probability theory
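The slide does not spell out the two rules. Assuming they are the sum and product rules (the usual starting point in Bishop's treatment), and using x and y as generic placeholder variables that do not appear in the slides, they can be written as:

$$\text{Sum rule: } p(x) = \sum_{y} p(x, y) \qquad \text{Product rule: } p(x, y) = p(y \mid x)\, p(x)$$

Combining the two gives Bayes' theorem, which is what turns a prior and a likelihood into a posterior:

$$p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}$$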

Key Idea 3: Inference Algorithms
The application of fast, approximate inference algorithms based on local message passing:
Variational Bayes
Belief Propagation, Loopy Belief Propagation
Expectation Propagation
Learning by local message passing
Figure 3: Inference algorithms (MCMC vs. approximate methods)

Stages of MBML
3 stages of MBML:
Build the model: joint probability distribution of all the relevant variables (e.g. as a graph)
Incorporate the observed data
Perform inference to learn the parameters of the latent variables
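The three stages can be illustrated with a minimal sketch. It is not taken from the webinar: the example problem (estimating the probability that a commute is delayed), the function names, and the data are all hypothetical. Inference is done by brute-force grid approximation rather than message passing, just to keep the code self-contained.

```python
# Hypothetical example: estimate theta = P(commute is delayed).
# Grid approximation is used here for simplicity; it is an assumption of
# this sketch, not the inference method discussed in the slides.

def build_model(grid_size=101):
    """Stage 1: uniform prior over theta on a discrete grid in [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size
    return thetas, prior

def likelihood(theta, observations):
    """Bernoulli likelihood of the observations (1 = delayed, 0 = on time)."""
    p = 1.0
    for delayed in observations:
        p *= theta if delayed else (1.0 - theta)
    return p

def infer(thetas, prior, observations):
    """Stage 3: posterior is proportional to prior x likelihood, then normalized."""
    unnorm = [pr * likelihood(th, observations) for th, pr in zip(thetas, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

thetas, prior = build_model()                    # Stage 1: build the model
observations = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]    # Stage 2: observed data (made up)
posterior = infer(thetas, prior, observations)   # Stage 3: perform inference

posterior_mean = sum(th * p for th, p in zip(thetas, posterior))
map_theta = thetas[posterior.index(max(posterior))]
print(f"posterior mean of P(delay): {posterior_mean:.3f}")
print(f"most probable value of P(delay): {map_theta:.2f}")
```

Changing the assumptions (a different prior, a different likelihood) only touches stage 1; the data and the inference step stay the same, which is the separation MBML aims for.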

Special cases of MBML
Special cases
For sequential data

Benefits of MBML
Potential benefits of this approach:
Provides a systematic process for creating ML solutions
Allows for incorporation of prior knowledge
Allows for handling uncertainty in a principled manner
Is less prone to overfitting, because inference averages over parameters rather than committing to a single point estimate
Custom solutions are built for specific problems
Allows for quick building of several alternative models
Easy to compare those alternatives
It’s general purpose: no need to learn the 1000s of existing ML algorithms
Separates model from inference/training code

Probabilistic Programming

What is Probabilistic Programming?
A software package that takes the model and then automatically generates inference routines (even source code!) to solve a wide variety of models
Takes programming languages and adds support for:
random variables
constraints on variables
inference
Examples of PP software packages:
Infer.NET (C#, C++)
Stan (R, Python, C++)
BUGS
Church
PyMC (Python)

How Probabilistic Programming works
Figure 5: How Infer.NET works

Case Study

A Bicyclist’s Daily Travel
Analysing the distribution of an individual cyclist’s daily travel time to work

Identify the variables of interest
tt_n - travel time on the nth day
at - average travel time
tu - uncertainty
Figure: graphical model relating tt_n, at and tu over N days

Specify relationships between variables
Joint distribution is given by
$$p(\mathit{tt}, \mathit{at}, \mathit{tu}) = \underbrace{p(\mathit{at})\, p(\mathit{tu})}_{\text{priors}} \times \underbrace{\prod_{n=1}^{N} p(\mathit{tt}_n \mid \mathit{at}, \mathit{tu})}_{\text{likelihood}}$$

How should we define the likelihood p(tt_n | at, tu)?

The distribution’s mean is the cyclist’s average travel time
The distribution’s variance determines how much the travel time varies from day to day (e.g. variations in traffic conditions)

What distributions should p(at) and p(tu) have?
Conjugate priors!

Likelihood given by
$$p(\mathit{tt}_n \mid \mathit{at}, \mathit{tu}) = \mathcal{N}(\mathit{tt}_n \mid \mathit{at}, \mathit{tu})$$
We now know what distribution forms to assign to the priors...

Priors given by
$$p(\mathit{at}) = \mathcal{N}(\mathit{at} \mid \mu, \sigma^2)$$
$$p(\mathit{tu}) = \mathrm{Cauchy}(\mathit{tu} \mid \mu, \sigma^2)$$
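As a rough illustration of how the priors and likelihood combine into the joint distribution of the case study, the sketch below evaluates its logarithm in plain Python. Several choices are assumptions of this sketch rather than details from the slides: tu is treated as a standard deviation, a half-Cauchy (restricted to positive values) is swapped in for the Cauchy prior, and the hyperparameter values and travel times are made up.

```python
import math

def log_normal_pdf(x, mean, std):
    """Log density of a Normal(mean, std^2) distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * std**2) - (x - mean)**2 / (2.0 * std**2)

def log_half_cauchy_pdf(x, scale):
    """Log density of a half-Cauchy(scale) distribution at x >= 0."""
    if x < 0:
        return -math.inf
    return math.log(2.0 / (math.pi * scale)) - math.log1p((x / scale)**2)

def log_joint(tt, at, tu, prior_mean=20.0, prior_std=10.0, tu_scale=5.0):
    """log p(tt, at, tu) = log p(at) + log p(tu) + sum_n log p(tt_n | at, tu).

    The default hyperparameters are hypothetical values chosen for illustration.
    """
    if tu <= 0:
        return -math.inf  # tu is used as a standard deviation, so it must be positive
    log_prior = log_normal_pdf(at, prior_mean, prior_std) + log_half_cauchy_pdf(tu, tu_scale)
    log_likelihood = sum(log_normal_pdf(t, at, tu) for t in tt)
    return log_prior + log_likelihood

travel_times = [18.0, 22.5, 19.0, 25.0, 21.0]  # minutes, made-up observations
plausible = log_joint(travel_times, at=21.0, tu=3.0)
implausible = log_joint(travel_times, at=40.0, tu=3.0)
print(plausible > implausible)  # an average near the data scores higher
```

A probabilistic programming package would take a model written at roughly this level of detail and derive the inference routine automatically; here only the joint density is evaluated.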
The choice of the initial parameters of the prior is significant only if you have a small number of observations
As the number of observations increases, the influence of the initial prior on the posterior declines
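The declining influence of the prior can be checked numerically in a simplified version of the model. The sketch below assumes that tu is known and fixed, so that the Normal prior on at is conjugate and the posterior mean has a closed form; this simplification, the function name, and all numbers are assumptions made for illustration.

```python
def posterior_mean_at(tt, prior_mean, prior_std, tu):
    """Posterior mean of `at` for a Normal prior and a Normal likelihood
    with known standard deviation `tu` (conjugate case).

    Returns the posterior mean and the weight given to the prior mean.
    """
    n = len(tt)
    prior_precision = 1.0 / prior_std**2
    data_precision = n / tu**2
    sample_mean = sum(tt) / n
    weight_on_prior = prior_precision / (prior_precision + data_precision)
    post_mean = weight_on_prior * prior_mean + (1.0 - weight_on_prior) * sample_mean
    return post_mean, weight_on_prior

# Made-up commute times in minutes: a fixed pattern with mean 21, repeated.
pattern = [18.0, 24.0, 20.0, 22.0]
for n in (4, 40, 400):
    tt = pattern * (n // 4)
    for prior_mean in (10.0, 40.0):  # two deliberately different priors
        mean, weight = posterior_mean_at(tt, prior_mean, prior_std=5.0, tu=3.0)
        print(f"N={n:3d}  prior mean={prior_mean:4.1f}  "
              f"posterior mean={mean:6.2f}  prior weight={weight:.4f}")
```

With few observations the two priors give visibly different posterior means; as N grows, both are pulled toward the sample mean and the weight on the prior shrinks toward zero.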
