Change Point Analysis
Problem Statement
• The data are a sample of observations (counts) from a
Poisson process, where events occur randomly over a
given time period.
• A chart representation typically shows time on the
horizontal axis and the number of events on the
vertical axis.
• We want to identify the point(s) in time at which the
rate of event occurrences changes, that is, where events
start to occur more or less frequently.
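To make the problem concrete, here is a minimal Python sketch that simulates this kind of data. The function names, the rates (3.0 and 1.0), and the change after period 40 are all assumptions chosen for illustration; they are not taken from the slides' data set.

```python
import math
import random

def poisson_draw(rate, rng):
    """Draw one Poisson(rate) count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_counts(n, change_at, rate_before, rate_after, seed=0):
    """Counts for periods 1..n; the rate switches after period `change_at`."""
    rng = random.Random(seed)
    return [
        poisson_draw(rate_before if t <= change_at else rate_after, rng)
        for t in range(1, n + 1)
    ]

# Hypothetical settings: 100 periods, rate 3.0 up to period 40, then 1.0.
counts = simulate_counts(n=100, change_at=40, rate_before=3.0, rate_after=1.0)
```

Plotting `counts` against the period index gives the kind of chart described above, with a visible drop in the level of the series after the change.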
Poisson Distribution
/pwäˈsän ˌdistrəˌbyo͞oSH(ə)n/
noun: Poisson distribution; plural noun: Poisson distributions
A discrete frequency distribution that gives the probability of a
number of independent events occurring in a fixed time.
Origin:
Named after the French mathematical physicist Siméon-Denis
Poisson (1781–1840).
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=poisson%20distribution
Poisson Distribution
[https://en.wikipedia.org/wiki/Poisson_distribution]
The horizontal axis is the index k, the number of occurrences. λ is the
expected value.
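The probability behind that chart is P(X = k) = exp(-λ) λ^k / k!. A short Python sketch of the formula follows; the function name `poisson_pmf` and the value λ = 4.0 are illustrative choices, not something the slides specify.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Probabilities for k = 0..20 occurrences when the expected value is 4.0.
probs = [poisson_pmf(k, 4.0) for k in range(21)]
```

The probabilities sum to 1 over all k, and the weighted average of k equals λ, which is why λ is called the expected value.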
Classic Example
Coal Mining Disasters
Calculations
• Estimate the rate at which events occur
before the shift (μ).
• Estimate the rate at which events occur
after the shift (λ).
• Estimate the time when the actual shift
happens (k).
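As a rough point of comparison for these three estimates, the sketch below uses plain maximum likelihood rather than the MCMC approach the slides go on to describe: it tries every split point, estimates each rate by the segment mean, and keeps the split with the highest Poisson log-likelihood. The function names and the sample counts are made up for illustration.

```python
import math

def poisson_loglik(segment, rate):
    """Log-likelihood of counts under Poisson(rate), dropping the y! terms."""
    total = sum(segment)
    if rate == 0.0:
        return 0.0 if total == 0 else float("-inf")
    return total * math.log(rate) - rate * len(segment)

def best_split(counts):
    """Return (k, mu_hat, lambda_hat) maximizing the two-segment likelihood."""
    best = None
    for k in range(1, len(counts)):          # both segments stay non-empty
        before, after = counts[:k], counts[k:]
        mu_hat = sum(before) / len(before)   # MLE of a Poisson rate is the mean
        lam_hat = sum(after) / len(after)
        score = poisson_loglik(before, mu_hat) + poisson_loglik(after, lam_hat)
        if best is None or score > best[0]:
            best = (score, k, mu_hat, lam_hat)
    return best[1:]

counts = [3, 4, 2, 3, 5, 3, 1, 0, 1, 2, 0, 1]   # made-up illustrative counts
k, mu_hat, lam_hat = best_split(counts)
```

This gives single point estimates only. The Bayesian sampler discussed next instead produces a whole distribution for each of μ, λ, and k, which is what makes the uncertainty about the change point visible.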
Methodology
From the article:
“Apply a Markov Chain Monte Carlo (MCMC) sampling
approach to estimate the population parameters at each
possible k, from the beginning of your data set to the end of it.
The values you get at each time step will be dependent only on
the values you computed at the previous time step (that’s where
the Markov Chain part of this problem comes in). There are lots
of different ways to hop around the parameter space, and each
hopping strategy has a fancy name (e.g. Metropolis-Hastings,
Gibbs Sampler, “reversible jump”).”
Markov Chain Monte Carlo
(MCMC)
From Wikipedia:
“In statistics, Markov chain Monte Carlo (MCMC) methods are a
class of algorithms for sampling from a probability distribution based
on constructing a Markov chain that has the desired distribution as
its equilibrium distribution. The state of the chain after a number of
steps is then used as a sample of the desired distribution. The
quality of the sample improves as a function of the number of steps.”
http://stats.stackexchange.com/questions/165/how-would-you-explain-markov-chain-monte-carlo-mcmc-to-a-layperson
MCMC, continued
“Convergence of the Metropolis-Hastings algorithm. MCMC attempts
to approximate the blue distribution with the orange distribution”
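A minimal random-walk Metropolis sketch in Python shows the idea of a chain settling onto a target distribution. The standard normal target, the starting point of 5.0, the step size, and the burn-in of 2000 draws are all assumptions picked for illustration; none of them come from the figure.

```python
import math
import random

def metropolis(log_target, start, steps, step_size=1.0, seed=0):
    """Random-walk Metropolis: symmetric Gaussian proposals, so the
    Hastings ratio reduces to target(proposal) / target(current)."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    chain = []
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)      # each state depends only on the previous one
    return chain

def log_std_normal(x):
    """Unnormalized log-density of a standard normal (hypothetical target)."""
    return -0.5 * x * x

chain = metropolis(log_std_normal, start=5.0, steps=20000)
kept = chain[2000:]                 # drop early draws taken far from the target
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
```

With enough steps, the sample mean and variance of `kept` should come out close to 0 and 1, the values for the target distribution.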
Back to the Example…
Rizzo, author of Statistical Computing with R
https://www.crcpress.com/Statistical-Computing-with-R/Rizzo/9781584885450
From the article: “Here are the models for prior (hypothesized) distributions that Rizzo uses, based on the
Gibbs Sampler approach:
• mu comes from a Gamma distribution with shape parameter of (0.5 + the sum of all your frequencies
UP TO the point in time, k, you’re currently at) and a rate of (k + b1)
• lambda comes from a Gamma distribution with shape parameter of (0.5 + the sum of all your
frequencies after the point in time, k, you’re currently at) and a rate of (n – k + [b2]) where n is [the
total number of observations in the series]
• b1 comes from a Gamma distribution with a shape parameter of 0.5 and a rate of (mu + 1)
• b2 comes from a Gamma distribution with a shape parameter of 0.5 and a rate of (lambda + 1)
• L is a likelihood function based on k, mu, lambda, and the sum of all the frequencies up until that point
in time, k”
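Written out in symbols, the bullets describe roughly the following model. This is a sketch, and the notation is introduced here rather than taken from the article: y_1, ..., y_n are the yearly counts, n is the total number of observations, and S_k is the sum of the first k counts. The rate for lambda is written with b2, the hyperparameter whose own update depends on lambda.

```latex
\begin{align*}
Y_i &\sim \mathrm{Poisson}(\mu), \quad i = 1, \dots, k, \\
Y_i &\sim \mathrm{Poisson}(\lambda), \quad i = k+1, \dots, n, \\
\mu \mid \cdot &\sim \mathrm{Gamma}\bigl(0.5 + S_k,\; k + b_1\bigr), \\
\lambda \mid \cdot &\sim \mathrm{Gamma}\bigl(0.5 + S_n - S_k,\; n - k + b_2\bigr), \\
b_1 \mid \cdot &\sim \mathrm{Gamma}(0.5,\; \mu + 1), \qquad
b_2 \mid \cdot \sim \mathrm{Gamma}(0.5,\; \lambda + 1), \\
L(k) &\propto e^{(\lambda - \mu)k} \left(\frac{\mu}{\lambda}\right)^{S_k},
\qquad S_k = \sum_{i=1}^{k} y_i .
\end{align*}
```

The form of L(k) follows from multiplying the two Poisson likelihoods, e^{-k\mu}\mu^{S_k} for the first k counts and e^{-(n-k)\lambda}\lambda^{S_n - S_k} for the rest, and dropping every factor that does not depend on k.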
Gibbs Sampler Code
“At each iteration, you pick a value of k to represent a
point in time where a change might have occurred. You
slice your data into two chunks: the chunk that
happened BEFORE this point in time, and the chunk that
happened AFTER this point in time. Using your data,
you apply a Poisson Process with a (Hypothesized)
Gamma Distributed Rate as your model. This is a pretty
common model for this particular type of problem. It’s
like randomly cutting a deck of cards and taking the
average of the values in each of the two cuts… then
doing the same thing again… a thousand times.”
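The loop described here can be sketched in Python as follows. This is an illustrative port under stated assumptions, not Rizzo's R code: the function name, the midpoint starting value for k, the starting values b1 = b2 = 1, the tiny floor on the rate draws, and the synthetic data are all choices made for this sketch. Lambda's rate uses b2, the hyperparameter paired with lambda.

```python
import math
import random

def gibbs_change_point(y, iterations=1000, seed=1):
    """Gibbs sampler sketch for a single Poisson change point."""
    rng = random.Random(seed)
    n = len(y)
    prefix = [0]
    for count in y:                       # prefix[j] = sum of the first j counts
        prefix.append(prefix[-1] + count)

    k, b1, b2 = n // 2, 1.0, 1.0          # arbitrary starting values (assumed)
    tiny = 1e-300                         # floor so log() never sees 0.0
    ks, mus, lams = [], [], []
    for _ in range(iterations):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        mu = max(rng.gammavariate(0.5 + prefix[k], 1.0 / (k + b1)), tiny)
        lam = max(
            rng.gammavariate(0.5 + prefix[n] - prefix[k], 1.0 / (n - k + b2)),
            tiny,
        )
        b1 = rng.gammavariate(0.5, 1.0 / (mu + 1.0))
        b2 = rng.gammavariate(0.5, 1.0 / (lam + 1.0))
        # log L(j) = (lam - mu) * j + S_j * log(mu / lam), for j = 1..n.
        log_ratio = math.log(mu) - math.log(lam)
        log_l = [(lam - mu) * j + prefix[j] * log_ratio for j in range(1, n + 1)]
        top = max(log_l)                  # subtract the max to avoid overflow
        weights = [math.exp(v - top) for v in log_l]
        k = rng.choices(range(1, n + 1), weights=weights)[0]
        ks.append(k)
        mus.append(mu)
        lams.append(lam)
    return ks, mus, lams

y = [3] * 40 + [1] * 60                   # synthetic counts, not the coal data
ks, mus, lams = gibbs_change_point(y)
burn_in = 200
k_hat = sum(ks[burn_in:]) / len(ks[burn_in:])
mu_hat = sum(mus[burn_in:]) / len(mus[burn_in:])
lam_hat = sum(lams[burn_in:]) / len(lams[burn_in:])
```

Working with log L and subtracting the maximum before exponentiating keeps the weights in a safe numeric range; the direct product form can overflow or underflow on longer series.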
Simulation Results
“Knowing the distributions of mu, lambda, and k from
hopping around our data will help us estimate values for
the true population parameters. At the end of the
simulation, we have an array of 1000 values of k, an
array of 1000 values of mu, and an array of 1000 values
of lambda — we use these to estimate the real values of
the population parameters. Typically, algorithms that do
this automatically throw out a whole bunch of them in the
beginning (the “burn-in” period) — Rizzo tosses out 200
observations — even though some statisticians (e.g.
Geyer) say that the burn-in period is unnecessary.”
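Discarding the burn-in and summarizing what remains takes only a few lines. The sketch below is one possible way to do it; the `summarize` helper and the toy chain are hypothetical, and only the burn-in length of 200 comes from the text above.

```python
from collections import Counter
from statistics import mean

def summarize(chain, burn_in=200):
    """Drop the first `burn_in` draws and report simple posterior summaries."""
    kept = chain[burn_in:]
    if not kept:
        raise ValueError("burn_in removes every draw")
    return {
        "mean": mean(kept),
        # The mode is mainly useful for a discrete chain such as k.
        "mode": Counter(kept).most_common(1)[0][0],
        "n": len(kept),
    }

# Toy chain of k values: 200 early draws stuck at 1, then draws near 40.
toy_k_chain = [1] * 200 + [40] * 500 + [39] * 300
summary = summarize(toy_k_chain)
```

In the toy chain the first 200 draws sit far from the rest, so including them would pull the mean well below the value the chain settles on.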
Simulation Results, continued
Histograms
Demo
• Source Code from Rizzo’s Book
• Run in RStudio
• “The change point happened between the 39th
and 40th observations, the arrival rate before the
change point was 3.14 arrivals per unit time, and
the rate after the change point was 0.93 arrivals per
unit time. (Cool!)”
Results
• Observation 1 is the year 1851, so observation 40 is 1890
• The lower rate applies from about 1891 (1851 + 40) => Change Point @ 1891
Random Thoughts
• How soon can we detect change points? It can be
quite easy to locate one in retrospect, but what about
real-time change detection?
• Identifying changes in trend is difficult, so why is
change point detection better than traditional time
series analysis?
• My recurring thought has been that if someone
showed you this chart, then you could just point to the
change point and save some compute cycles. Duh!
