Change Point Detection with Bayesian Inference
By Frank Kelly
PyData
6th January 2015
  
Overview
•  Nigeria, oil wells & drilling
•  Noisy data
•  Some maths
•  Python implementation
•  Examples in different domains
  
FPSO (oil platform picture)
  
Mud pulse telemetry
•  Information is encoded digitally and transmitted via pressure pulses through the mud fluid.
•  Used to alert drillers that they have reached oil, to detect rock types and for general monitoring.
  
The problem
•  Poor bit rate and resolution
•  Time-consuming analysis
  
Approaches to statistics
•  Frequentist
–  Data gathered is a repeatable random sample (“frequency”)
–  Underlying parameters are constant
–  Fisher’s 0.05 significance level
•  Bayesian
–  Data are fixed and observed from the realised sample
–  Parameters are unknown and described probabilistically
–  Introduces “subjectivity”
  
	
  
Frequentist vs. Bayesian
  
The Theory: Bayesian inference
•  Methodology of mathematical inference:
–  Choosing between several possible models
–  Extracting parameters for these models
•  Bayes’ Theorem (Rev Thomas Bayes, 1702-1761):
\[
p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)}
\]
–  p(w | D): posterior probability
–  p(D | w): likelihood
–  p(w): prior probability
–  p(D): evidence
•  Remove nuisance parameters by marginalisation; the interesting ones remain
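
To make the theorem and the marginalisation step concrete, here is a minimal sketch that evaluates a posterior on a grid and then sums out a nuisance parameter. The Gaussian likelihood, the flat prior over the grid, the example data and the names `grid_posterior`, `mu_grid` and `sigma_grid` are all assumptions chosen for illustration; they are not taken from the talk.

```python
import numpy as np

def grid_posterior(data, mu_grid, sigma_grid):
    """Posterior p(mu, sigma | D) on a grid, assuming Gaussian data
    and a flat prior over the grid (an illustrative assumption)."""
    d = np.asarray(data, dtype=float)
    mu = mu_grid[:, None, None]
    sigma = sigma_grid[None, :, None]
    # Log-likelihood log p(D | mu, sigma), summed over the samples
    log_like = np.sum(-np.log(sigma) - 0.5 * ((d - mu) / sigma) ** 2, axis=2)
    log_prior = np.zeros_like(log_like)          # flat prior
    log_post = log_like + log_prior
    post = np.exp(log_post - log_post.max())     # avoid underflow
    return post / post.sum()                     # dividing by the evidence p(D)

data = [1.9, 2.1, 2.0, 2.2, 1.8]
mu_grid = np.linspace(0.0, 4.0, 81)
sigma_grid = np.linspace(0.05, 1.0, 20)

post = grid_posterior(data, mu_grid, sigma_grid)
# Marginalisation: sum over the nuisance parameter sigma, keep mu
p_mu = post.sum(axis=1)
print("most probable mu:", mu_grid[np.argmax(p_mu)])
```

Summing over the `sigma` axis is the discrete counterpart of integrating out a nuisance parameter: what is left is a distribution over the parameter of interest only.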
  
Modelling the problem
[Figure: example data series plotted against sample index (0 to 200), values between 0.5 and 2.5, annotated with µ1, µ2, m and N]
data = model + noise
•  A sequence of N samples of data from a piecewise constant source with added Gaussian noise.
•  Noise is independent of the mean and identically distributed, with standard deviation σ.
•  Heterogeneous: divide into two homogeneous segments
  
\[
d_i =
\begin{cases}
\mu_1 + e_i, & i \le m \\
\mu_2 + e_i, & m < i \le N
\end{cases}
\]
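
The model above can be simulated directly, which is useful for testing a detector on data where the answer is known. This is a minimal sketch: the function name `simulate_step` and the default values for N, m, µ1, µ2 and σ are illustrative assumptions, loosely matching the axis ranges of the figure rather than any values quoted in the talk.

```python
import numpy as np

def simulate_step(n=200, m=120, mu1=1.0, mu2=2.0, sigma=0.2, seed=0):
    """Draw d_i = mu1 + e_i for i <= m and d_i = mu2 + e_i for m < i <= n,
    with e_i independent Gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, n + 1)                  # 1-based sample index
    model = np.where(i <= m, mu1, mu2)       # piecewise constant mean
    noise = rng.normal(0.0, sigma, size=n)   # independent of the mean
    return model + noise

d = simulate_step()
print(d[:120].mean(), d[120:].mean())
```
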
Single changepoint detector: How does it work?
•  Substitute likelihood into Bayes’ Law
–  Simple model: consider Ockham’s Razor
•  Interested in changepoint location m, so integrate w.r.t. the nuisance parameters (µ1, µ2 and σ)… rearrange this…
•  …get a BIG expression for p({m}|dI), code it in Python
•  On running, obtain the most likely changepoint location
Ockham’s razor: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
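
The steps above can be sketched in a few lines of NumPy. This is not the code from the talk: it assumes flat priors on µ1, µ2 and m and a 1/σ prior on σ, under which the posterior reduces to p(m | d) ∝ [m(N − m)]^(−1/2) · S(m)^(−(N−2)/2), where S(m) is the residual sum of squares about the two segment means. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def changepoint_log_posterior(d):
    """Unnormalised log posterior over changepoint locations m = 1 .. N-1.

    Assumes flat priors on mu1, mu2 and m, and a 1/sigma prior on sigma.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    m = np.arange(1, n)                  # samples 1..m form the first segment
    csum = np.cumsum(d)
    sum1 = csum[:-1]                     # sum of the first m samples
    sum2 = csum[-1] - sum1               # sum of the remaining N - m samples
    # Residual sum of squares about the two segment means
    s = np.sum(d ** 2) - sum1 ** 2 / m - sum2 ** 2 / (n - m)
    log_post = -0.5 * np.log(m * (n - m)) - 0.5 * (n - 2) * np.log(s)
    return m, log_post

def normalise(log_post):
    """Turn log posterior values into probabilities that sum to one."""
    w = np.exp(log_post - np.max(log_post))
    return w / w.sum()

# Synthetic data with a known changepoint (illustrative values only)
rng = np.random.default_rng(0)
true_m = 120
data = np.concatenate([np.full(true_m, 1.0), np.full(200 - true_m, 2.0)])
data = data + rng.normal(0.0, 0.2, size=data.size)

m, log_post = changepoint_log_posterior(data)
posterior = normalise(log_post)
m_hat = int(m[np.argmax(posterior)])
print("most likely changepoint:", m_hat)
```

Working in logs keeps the computation stable: with N = 200 the exponent (N − 2)/2 is 99, so the raw posterior values would span many orders of magnitude.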
  	
  
The maths

More maths
•  Integrate w.r.t. (and thereby remove) nuisance parameters
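
The expressions shown on these slides did not survive in this text. As a sketch of how the marginalisation can go, assume flat priors on µ1, µ2 and m and a prior p(σ) ∝ 1/σ; these priors are assumptions, not taken from the slides. With the two-segment model d_i = µ1 + e_i (i ≤ m), d_i = µ2 + e_i (m < i ≤ N), the nuisance parameters integrate out as follows:

```latex
\begin{align}
p(d \mid m, \mu_1, \mu_2, \sigma)
  &= (2\pi\sigma^2)^{-N/2}
     \exp\!\left[-\frac{1}{2\sigma^2}\left(
       \sum_{i \le m} (d_i - \mu_1)^2 + \sum_{m < i \le N} (d_i - \mu_2)^2
     \right)\right] \\
\iint p(d \mid m, \mu_1, \mu_2, \sigma)\, d\mu_1\, d\mu_2
  &= (2\pi\sigma^2)^{-(N-2)/2}\, [m(N-m)]^{-1/2}
     \exp\!\left[-\frac{S(m)}{2\sigma^2}\right] \\
S(m)
  &= \sum_{i=1}^{N} d_i^2
     - \frac{1}{m}\Big(\sum_{i \le m} d_i\Big)^2
     - \frac{1}{N-m}\Big(\sum_{m < i \le N} d_i\Big)^2 \\
p(m \mid d, I)
  &\propto \int_0^\infty \frac{1}{\sigma}\,
     \sigma^{-(N-2)} \exp\!\left[-\frac{S(m)}{2\sigma^2}\right] d\sigma
   \;\propto\; [m(N-m)]^{-1/2}\, S(m)^{-(N-2)/2}
\end{align}
```

The result holds for 1 ≤ m ≤ N − 1. Each Gaussian integral over a mean contributes a factor of σ and of m^(−1/2) or (N − m)^(−1/2), and the final integral over σ leaves a power of the residual sum of squares S(m).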
  
Other applications…
http://moz.com/google-algorithm-change
  
“Google’s algorithm is the “secret sauce recipe” that has enabled it to dominate search.”
- FT.com, 16th Sept 2014
http://www.ft.com/cms/s/0/9615661c-3ce1-11e4-9733-00144feabdc0.html?siteedition=uk#axzz3DSwXYAW8
Any business with an online presence today often struggles to accurately evaluate:
●  The quality of their website and associated linking pages, as perceived by Google
●  The robustness of their website to a sudden change in Google’s search algorithm
  
Web traffic
[Figure: raw daily Google search-sourced pageviews; y-axis from 30,000 to 60,000]
  
Web traffic (2)
[Figure: smoothed data using a moving average; y-axis from 30,000 to 60,000]
  
Web traffic (3)
[Figure: smoothed data with cyclicality removed; y-axis from 30,000 to 60,000]
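
The two preprocessing steps named on these slides can be sketched as below. The slides do not say which window length or which method for removing cyclicality was used, so the 7-day window and the subtraction of day-of-week means are assumptions, as are the function names.

```python
import numpy as np

def moving_average(x, window=7):
    """Simple moving average; returns len(x) - window + 1 values."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

def remove_weekly_cycle(x, period=7):
    """Subtract each day-of-week's mean offset from the overall mean."""
    x = np.asarray(x, dtype=float)
    phase = np.arange(x.size) % period       # position within the cycle
    out = x.copy()
    for p in range(period):
        out[phase == p] -= x[phase == p].mean() - x.mean()
    return out

# Four weeks of made-up pageviews with a pure weekly pattern
pattern = np.array([0.0, 500.0, 1000.0, 500.0, 0.0, -1000.0, -1000.0])
x = 40000 + np.tile(pattern, 4)
print(moving_average(x)[:3])
print(remove_weekly_cycle(x)[:7])
```

A moving average whose window equals the cycle length already suppresses that cycle, at the cost of blurring any sharp step; subtracting per-day means keeps the step sharp, which matters if the goal is to locate a changepoint.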
  
Web traffic (4)
[Figure: likelihood of change in data plotted over time; two y-axes, one from -838 to -833 and one from 30,000 to 60,000; series: "day removed" and "likelihood CP"]
  
Number of tropical storms per year in the North Atlantic
Data obtained from the IBTrACS database: https://www.ncdc.noaa.gov/ibtracs/

"Amo timeseries 1856-present" by Rosentod, Marsupilami - http://www.cdc.noaa.gov/Correlation/amon.us.long.data. Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Amo_timeseries_1856-present.svg#mediaviewer/File:Amo_timeseries_1856-present.svg
  
Other applications / possibilities
•  Financial markets and political events
•  Combine with frequentist statistical methods:
–  Use of the generalised likelihood ratio (GLR) in an online (moving window) detection application
•  Your own data / ideas!
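
As a rough illustration of the GLR idea mentioned above, the sketch below slides a fixed-length window over a stream and raises an alarm when the likelihood ratio for "two means" against "one mean" exceeds a threshold. It assumes Gaussian noise with a known σ; the window length, threshold and function names are arbitrary illustrative choices, not anything specified in the talk.

```python
import numpy as np

def glr_statistic(window, sigma):
    """Max over split points k of 2*log likelihood ratio for a mean change.

    For a split after k of n samples the statistic is
    k * (n - k) / n * (mean1 - mean2)**2 / sigma**2.
    """
    w = np.asarray(window, dtype=float)
    n = w.size
    k = np.arange(1, n)
    csum = np.cumsum(w)
    mean1 = csum[:-1] / k
    mean2 = (csum[-1] - csum[:-1]) / (n - k)
    stat = k * (n - k) / n * (mean1 - mean2) ** 2 / sigma ** 2
    best = int(np.argmax(stat))
    return float(stat[best]), int(k[best])

def online_glr(stream, window=50, sigma=1.0, threshold=20.0):
    """Return (alarm index, estimated first index after the change) or None."""
    buf = []
    for t, x in enumerate(stream):
        buf.append(x)
        if len(buf) > window:
            buf.pop(0)                       # keep only the latest samples
        if len(buf) == window:
            stat, k = glr_statistic(buf, sigma)
            if stat > threshold:
                return t, t - window + 1 + k
    return None

# Noise-free step from 0 to 1 at index 100, to show the detection delay
stream = [0.0] * 100 + [1.0] * 100
print(online_glr(stream, window=50, sigma=1.0, threshold=5.0))
```

The threshold trades detection delay against false alarms; in this noise-free example the alarm fires a few samples after the step, once enough post-change samples are inside the window.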
  
Thank you
•  Link to Python code on GitHub: https://github.com/swhustla/pydata-bayes-changepoint
–  Single changepoint detector (as seen tonight)
–  Dual changepoint detector
–  Ramp detector
•  Further reading:
–  Numerical Bayesian Methods Applied to Signal Processing (Statistics and Computing) by Fitzgerald, O’Ruanaidh, 1996: http://www.amazon.co.uk/Numerical-Bayesian-Processing-Statistics-Computing/dp/0387946292
–  Bayesian Inference on Change Point Problems (2007): http://www.cs.ubc.ca/~murphyk/Students/Xuan_MSc07.pdf

Twitter: @norhustla
Email: frank.kelly@cantab.net
  
Thank you
•  Additional links:
–  Google Algo updates: http://moz.com/google-algorithm-change
–  Mathsight -> insights into algorithm changes: http://mathsight.org
–  Atlantic multi-decadal oscillation spatial pattern: http://commons.wikimedia.org/wiki/File:AMO_Pattern.png
–  National Climatic Data Center: https://www.ncdc.noaa.gov/ibtracs/
–  Ockham’s Razor and Bayesian Inference: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
–  Converting from Matlab to Python: http://mathesaurus.sourceforge.net/matlab-numpy.html
  	
  
	
  
  
