Changepoint Detection with Bayesian Inference

An overview of the application of Bayesian Inference in the detection of changepoints in noisy time series data, applied to three different and diverse domains.

Published in: Data & Analytics

  1. Change Point Detection with Bayesian Inference
     By Frank Kelly
     PyData, 6th January 2015
  2. Overview
     • Nigeria, oil wells & drilling
     • Noisy data
     • Some maths
     • Python implementation
     • Examples in different domains
  3. FPSO (oil platform picture)
  4. Mud pulse telemetry
     • Information is encoded digitally and transmitted via pressure pulses through the mud fluid.
     • Used to alert drillers that they have reached oil, to detect rock types, and for general monitoring.
  5. The problem
     • Poor bit rate and resolution
     • Time-consuming analysis
  6. Approaches to statistics
     • Frequentist
       – Data gathered is a repeatable random sample. “Frequency”
       – Underlying parameters are constant
       – Fisher’s 0.05
     • Bayesian
       – Data are fixed and observed from the realised sample
       – Parameters unknown and described probabilistically
       – Introduce “subjectivity”
  7. Frequentist vs. Bayesian
  8. The Theory: Bayesian inference
     • Methodology of mathematical inference:
       – Choosing between several possible models
       – Extracting parameters for these models
     • Bayes’ Theorem (Rev Thomas Bayes, 1702-1761):
       $$p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)}$$
       where p(w | D) is the posterior probability, p(D | w) the likelihood, p(w) the prior probability and p(D) the evidence.
       – Remove nuisance parameters by marginalisation
       – Interesting ones remain
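Bayes' Theorem can be checked numerically on a grid. The sketch below is only an illustration, not code from the talk: the data, the known noise level `sigma`, the flat prior and the grid of candidate values `w` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical data: 50 noisy readings of a constant level (true value 1.5).
rng = np.random.default_rng(1)
sigma = 0.3                                   # noise S.D., assumed known here
D = 1.5 + rng.normal(0.0, sigma, size=50)

w = np.linspace(0.0, 3.0, 301)                # candidate values of the parameter w
prior = np.full(w.size, 1.0 / w.size)         # flat prior p(w) (an assumption)

# Gaussian log-likelihood log p(D | w) for every candidate w (constants dropped)
log_like = -0.5 * np.sum((D[:, None] - w[None, :]) ** 2, axis=0) / sigma**2
like = np.exp(log_like - log_like.max())      # rescaled to avoid underflow

evidence = np.sum(like * prior)               # p(D), up to the same rescaling
posterior = like * prior / evidence           # p(w | D); sums to 1 over the grid

w_map = w[np.argmax(posterior)]               # most probable w on the grid
```

With a flat prior the most probable grid value sits next to the sample mean of `D`, which is what one would expect for Gaussian noise.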
  9. Modelling the problem
     [Figure: diagram labelled with µ1, µ2, m and N]
  10. data = model + noise
      [Figure: plot of the data; x-axis 0 to 200, y-axis 0.5 to 2.5; labelled with µ1, µ2, m and N]
      • A sequence of N samples of data from a piecewise constant source with added Gaussian noise.
      • Noise is independent of the mean and identically distributed, with S.D. = σ
      • Heterogeneous: divide into two homogeneous segments
      $$d_i = \begin{cases} \mu_1 + e_i, & i \le m \\ \mu_2 + e_i, & m < i \le N \end{cases}$$
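A data set of this kind is easy to simulate. The following is a minimal sketch; the function name `make_step_data` and the default values (N = 200, m = 100, µ1 = 1, µ2 = 2, σ = 0.3) are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def make_step_data(n=200, m=100, mu1=1.0, mu2=2.0, sigma=0.3, seed=0):
    """Piecewise constant signal plus Gaussian noise.

    Samples 1..m have mean mu1, samples m+1..n have mean mu2; the noise is
    independent, identically distributed, with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    i = np.arange(1, n + 1)                   # sample index i = 1..N
    model = np.where(i <= m, mu1, mu2)        # the "model" part
    noise = rng.normal(0.0, sigma, size=n)    # the "noise" part e_i
    return model + noise                      # data = model + noise

d = make_step_data()
```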
  11. Single changepoint detector: How does it work?
      • Substitute likelihood into Bayes’ Law
        – Simple model: consider Ockham’s Razor
      • Interested in changepoint location m, integrate w.r.t. the nuisance parameters (µ1, µ2 and σ)… rearrange this…
      • …get a BIG expression for p({m}|dI), code in Python
      • On running obtain most likely changepoint location
      Ockham’s razor: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
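The steps above can be sketched in a few lines of NumPy. This is not the code from the talk's repository: it assumes flat priors on µ1, µ2 and m and a Jeffreys prior 1/σ on σ, under which the marginal posterior for m is proportional to [m(N-m)]^(-1/2) · S(m)^(-(N-2)/2), where S(m) is the residual sum of squares about the two segment means. The function name and the test signal are hypothetical.

```python
import numpy as np

def changepoint_posterior(d):
    """Posterior over the changepoint location m = 1..N-1 for a single step.

    Assumes flat priors on mu1, mu2 and m and a Jeffreys prior on sigma.
    Noise-free data give S(m) = 0 at the true m and are not handled here.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    m = np.arange(1, n)                        # candidate changepoints
    csum = np.cumsum(d)
    left = csum[:-1]                           # sum of the first m samples
    right = csum[-1] - left                    # sum of the remaining N - m samples
    s = np.sum(d**2) - left**2 / m - right**2 / (n - m)   # S(m)
    log_post = -0.5 * np.log(m * (n - m)) - 0.5 * (n - 2) * np.log(s)
    post = np.exp(log_post - log_post.max())   # work in logs to avoid underflow
    return m, post / post.sum()

# Hypothetical test signal: step from 1.0 to 2.0 after sample 100, noise S.D. 0.3
rng = np.random.default_rng(0)
d = np.where(np.arange(1, 201) <= 100, 1.0, 2.0) + rng.normal(0.0, 0.3, 200)

m, post = changepoint_posterior(d)
m_hat = m[np.argmax(post)]                     # most likely changepoint location
```

Working with the log of the posterior matters in practice: with N in the hundreds the unnormalised values span hundreds of orders of magnitude.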
  12. The maths
  13. More maths
      • Integrate w.r.t. (and thereby remove) nuisance parameters
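The transcript does not preserve the integrals themselves. As a sketch of how the marginalisation can go for the single-step model (d_i = µ1 + e_i for i ≤ m, d_i = µ2 + e_i for m < i ≤ N), assume flat priors on µ1, µ2 and m and a Jeffreys prior p(σ) ∝ 1/σ; these priors are assumptions made here, not statements from the slide.

```latex
\begin{align*}
p(d \mid m, \mu_1, \mu_2, \sigma)
  &= (2\pi\sigma^2)^{-N/2}
     \exp\!\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{m}(d_i-\mu_1)^2
       + \sum_{i=m+1}^{N}(d_i-\mu_2)^2\right)\right] \\
\int\!\!\int p(d \mid m, \mu_1, \mu_2, \sigma)\, d\mu_1\, d\mu_2
  &= \frac{(2\pi\sigma^2)^{-(N-2)/2}}{\sqrt{m(N-m)}}
     \exp\!\left[-\frac{S(m)}{2\sigma^2}\right] \\
S(m) &= \sum_{i=1}^{N} d_i^2
       - \frac{1}{m}\left(\sum_{i=1}^{m} d_i\right)^2
       - \frac{1}{N-m}\left(\sum_{i=m+1}^{N} d_i\right)^2 \\
p(m \mid d) &\propto \int_0^\infty \frac{1}{\sigma}\,
     \frac{\sigma^{-(N-2)}}{\sqrt{m(N-m)}}
     \exp\!\left[-\frac{S(m)}{2\sigma^2}\right] d\sigma
  \;\propto\; \frac{S(m)^{-(N-2)/2}}{\sqrt{m(N-m)}},
  \qquad 1 \le m \le N-1
\end{align*}
```

Each Gaussian integral over a mean contributes a factor of sqrt(2πσ²/m) or sqrt(2πσ²/(N-m)), which is where the [m(N-m)]^(-1/2) term comes from; only m is left once σ is integrated out.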
  14. Other applications…
  15. http://moz.com/google-algorithm-change
  16. “Google’s algorithm is the “secret sauce recipe” that has enabled it to dominate search.”
      - FT.com, 16th Sept 2014
      http://www.ft.com/cms/s/0/9615661c-3ce1-11e4-9733-00144feabdc0.html?siteedition=uk#axzz3DSwXYAW8
      Any business with an online presence today often struggles to accurately evaluate:
      ● The quality of their website and associated linking pages, as perceived by Google
      ● The robustness of their website to a sudden change in Google’s search algorithm
  17. Web traffic
      [Figure: raw daily Google search-sourced pageviews; y-axis 30000 to 60000]
  18. Web traffic (2)
      [Figure: smoothed data using moving average; y-axis 30000 to 60000]
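The slide does not state the window used for the moving average. A minimal sketch, assuming a simple unweighted window of 7 days (both the function name and the window length are assumptions):

```python
import numpy as np

def moving_average(x, window=7):
    """Unweighted moving average; returns len(x) - window + 1 values."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window          # equal weight for each sample
    return np.convolve(x, kernel, mode="valid")

smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```

Using `mode="valid"` drops the edges instead of padding them with zeros, so the smoothed series is shorter than the input but has no artificial dip at either end.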
  19. Web traffic (3)
      [Figure: smoothed data with cyclicality removed; y-axis 30000 to 60000]
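How the cyclicality was removed is not described on the slide. One common approach, shown here purely as an assumption, is to subtract the average value for each day of the week (a period of 7) and add back the overall mean so the level of the series is preserved:

```python
import numpy as np

def remove_cycle(x, period=7):
    """Subtract the mean of each phase of the cycle, keeping the overall level.

    Assumes the series covers at least one full period.
    """
    x = np.asarray(x, dtype=float)
    phase = np.arange(x.size) % period         # e.g. day of week for period=7
    phase_means = np.array([x[phase == k].mean() for k in range(period)])
    return x - phase_means[phase] + x.mean()

# Hypothetical series: constant level of 100 plus a repeating weekly pattern
weekly = 100.0 + np.tile(np.arange(7.0), 4)
flat = remove_cycle(weekly)
```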
  20. Web traffic (4)
      [Figure: likelihood of change in data plotted over time; series "day removed" and "likelihood CP"; axes 30000 to 60000 and -838 to -833]
  21. Number of tropical storms per year in the North Atlantic
      Data obtained from ibtracs database: https://www.ncdc.noaa.gov/ibtracs/
  22. "Amo timeseries 1856-present" by Rosentod, Marsupilami - http://www.cdc.noaa.gov/Correlation/amon.us.long.data. Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Amo_timeseries_1856-present.svg#mediaviewer/File:Amo_timeseries_1856-present.svg
  23. Other applications / possibilities
      • Financial markets and political events
      • Combine with frequentist statistical methods:
        – Use of GLR in online (moving window) detection application
      • Your own data / ideas!
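The slides only name the GLR (generalised likelihood ratio) idea, so the following is a hedged sketch of one way a moving-window detector could look, not the speaker's method. It assumes Gaussian noise with a known S.D. `sigma`, tests "one constant mean" against "two means split at k" inside each window, and raises an alarm when the statistic crosses a threshold. The function names, window length and threshold are all hypothetical.

```python
import numpy as np

def glr_mean_shift(window, sigma):
    """GLR statistic (2 x log likelihood ratio) for one shift in mean.

    Maximised over every split point k inside the window.
    """
    x = np.asarray(window, dtype=float)
    n = x.size
    k = np.arange(1, n)                        # candidate split points
    csum = np.cumsum(x)
    left_mean = csum[:-1] / k
    right_mean = (csum[-1] - csum[:-1]) / (n - k)
    stat = k * (n - k) / n * (left_mean - right_mean) ** 2 / sigma**2
    return stat.max()

def first_alarm(x, sigma, window=40, threshold=30.0):
    """Number of samples seen when the statistic first exceeds the threshold."""
    for t in range(window, len(x) + 1):
        if glr_mean_shift(x[t - window:t], sigma) > threshold:
            return t
    return None                                # no alarm raised

# Hypothetical stream: mean steps from 1.0 to 2.0 after the 60th sample
rng = np.random.default_rng(0)
x = np.concatenate([np.full(60, 1.0), np.full(60, 2.0)]) + rng.normal(0.0, 0.3, 120)
alarm = first_alarm(x, sigma=0.3)
```

The threshold trades detection delay against false alarms; the value used here was picked for this synthetic example only.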
  24. Thank you
      • Link to Python code on github: https://github.com/swhustla/pydata-bayes-changepoint
        – Single changepoint detector (as seen tonight)
        – Dual changepoint detector
        – Ramp detector
      • Further reading:
        – Numerical Bayesian Methods Applied to Signal Processing (Statistics and Computing) by Fitzgerald, O’Ruanaidh, 1996: http://www.amazon.co.uk/Numerical-Bayesian-Processing-Statistics-Computing/dp/0387946292
        – Bayesian Inference on Change Point Problems (2007): http://www.cs.ubc.ca/~murphyk/Students/Xuan_MSc07.pdf
      Twitter: @norhustla
      Email: frank.kelly@cantab.net
  25. Thank you
      • Additional links:
        – Google Algo updates: http://moz.com/google-algorithm-change
        – Mathsight -> insights into algorithm changes: http://mathsight.org
        – Atlantic multi-decadal oscillation spatial pattern: http://commons.wikimedia.org/wiki/File:AMO_Pattern.png
        – National climatic data center: https://www.ncdc.noaa.gov/ibtracs/
        – Ockham’s Razor and Bayesian Inference: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
        – Converting from Matlab to Python: http://mathesaurus.sourceforge.net/matlab-numpy.html
      Twitter: @norhustla
      Email: frank.kelly@cantab.net