SlideShare a Scribd company logo
Kalman Filters and Adaptive Windows for Learning
                 in Data Streams

                             Albert Bifet          Ricard Gavaldà

                              Universitat Politècnica de Catalunya


                                     DS’06 Barcelona




A. Bifet, R. Gavaldà (UPC)     Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                          DS’06             1 / 29
Outline

1   Introduction

2   The Kalman Filter and the CUSUM Test

3   The ADWIN Algorithm

4   General Framework

5   K-ADWIN

6   Experimental Validation of K-ADWIN

7   Conclusions


    A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                            DS’06             2 / 29
Introduction


Introduction


    Data Streams
            Sequence potentially infinite
            High amount of data:
                    sublinear space
            High Speed of arrival:
                    small constant time per example
    Estimation and prediction
    Distribution and concept drift
    K-ADWIN : Combination
            Kalman filter
            ADWIN : Adaptive window of recently seen data items.




  A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                          DS’06             3 / 29
Introduction


Introduction




Problem
Given an input sequence x1 , x2 , . . . , xt , . . . we want to output at instant t
a prediction xt+1 minimizing prediction error:

                                           |xt+1 − xt+1 |

considering distribution changes overtime.




   A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                           DS’06             4 / 29
Introduction


Time Change Detectors and Predictors: A General
Framework


                                                                                             Estimation
                                                                                           -
 xt
           -
                     Estimator




      A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                              DS’06             5 / 29
Introduction


Time Change Detectors and Predictors: A General
Framework


                                                                                             Estimation
                                                                                           -
 xt
           -
                     Estimator                                                               Alarm
                                                 -                                         -
                                                     Change Detect.




      A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                              DS’06             5 / 29
Introduction


Time Change Detectors and Predictors: A General
Framework


                                                                                              Estimation
                                                                                            -
 xt
           -
                     Estimator                                                                Alarm
                                                  -                                         -
                                                      Change Detect.
                              6
                                                           6
                                                           ?
                    -
                                   Memory




      A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                               DS’06             5 / 29
Introduction


Introduction



                                                 Our particular proposal: K-ADWIN
  Our generic proposal:
                                                        Kalman filter as estimator
        Use change detector
                                                        Use ADWIN as change
        Use memory
                                                        detector with memory [BG06]

Application
    Estimate statistics from data streams
    In Data Mining Algorithms based on counters, replace them for
    estimators.




  A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                          DS’06             6 / 29
Introduction


Data Mining Algorithms with Concept Drift

 No Concept Drift                                      Concept drift

                DM Algorithm
                                                                        DM Algorithm
  input                               output              input                                     output
          -
                    Counter5               -                      -                                         -

                    Counter4                                             Static Model

                    Counter3                                                        6

                    Counter2
                    Counter1                                      -
                                                                      Change Detect.
                                                                                                  




  A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                          DS’06                 7 / 29
Introduction


Data Mining Algorithms with Concept Drift

 No Concept Drift                                      Concept Drift

                DM Algorithm                                            DM Algorithm
  input                               output              input                                     output
          -
                    Counter5               -                      -      Estimator5                         -

                    Counter4                                             Estimator4
                    Counter3                                             Estimator3
                    Counter2                                             Estimator2
                    Counter1                                             Estimator1




  A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                          DS’06                 7 / 29
The Kalman Filter and the CUSUM Test


The Kalman Filter
    Optimal recursive algorithm
    Minimum mean-square error estimator
    Estimate the state x ∈ n of a discrete-time controlled process

                                      xk = Axk −1 + Buk + wk −1

    with a measurement z ∈                       m   that is

                                               Zk = Hxk + vk .

    The random variables wk and vk represent the process and
    measurement noise (respectively). They are assumed to be
    independent (of each other), white, and with normal probability
    distributions

                               p(w) ∼ N(0, Q)                   p(v ) ∼ N(0, R).

  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                              DS’06             8 / 29
The Kalman Filter and the CUSUM Test


The Kalman Filter



The difference equation of our discrete-time controlled process is

                                     Kk = Pk −1 /(Pk −1 + R)
                                 Xk = Xk −1 + Kk (zk − Xk −1 )
                                      Pk = Pk (1 − Kk ) + Q




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                              DS’06             9 / 29
The Kalman Filter and the CUSUM Test


The Kalman Filter


The difference equation of our discrete-time controlled process is

                                     Kk = Pk −1 /(Pk −1 + R)
                                 Xk = Xk −1 + Kk (zk − Xk −1 )
                                      Pk = Pk (1 − Kk ) + Q
 The performance of the Kalman filter depends on the accuracy of the
a-priori assumptions:
    linearity of the difference stochastic equation
    estimation of covariances Q and R, assumed to be fixed, known,
    and follow normal distributions with zero mean.



  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona
                                                                                              DS’06             9 / 29
The Kalman Filter and the CUSUM Test


The CUSUM Test



    The cumulative sum (CUSUM algorithm),is a change detection
    algorithm that gives an alarm when the mean of the input data is
    significantly different from zero.
    The CUSUM test is memoryless, and its accuracy depends on the
    choice of parameters υ and h. It is as follows:

                               g0 = 0,          gt = max (0, gt−1 +                 t   − υ)

                                  if gt  h then alarm and gt = 0




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   10 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]

Example
W = 101010110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 1 W1 = 01010110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 10 W1 = 1010110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 101 W1 = 010110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 1010 W1 = 10110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 10101 W1 = 0110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 101010 W1 = 110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 1010101 W1 = 10111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111
W0 = 10101011 W1 = 0111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111 |ˆW0 − µW1 | ≥
                     µ     ˆ                                   c   : CHANGE DETECTED!
W0 = 101010110 W1 = 111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 101010110111111 Drop elements from the tail of W
W0 = 101010110 W1 = 111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]
Example
W = 01010110111111 Drop elements from the tail of W
W0 = 101010110 W1 = 111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t  0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |ˆW0 − µW1 | ≥ c holds
                     µ     ˆ
6               for every split of W into W = W0 · W1
7           Output µW
                    ˆ


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   11 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     1 01010110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     10 1010110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     101 010110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     1010 10110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     10101 0110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     101010 110111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     1010101 10111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     10101011 0111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     101010110 111111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     1010101101 11111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     10101011011 1111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     101010110111 111



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models

                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     1010101101111 11



   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Window Management Models


                                W = 101010110111111

Equal  fixed size subwindows                            Total window against subwindow

         1010 1011011 1111                                            10101011011 1111
D. Kifer, S. Ben-David, and J.                          J. Gama, P. Medas, G. Castillo,
Gehrke. Detecting change in data                        and P. Rodrigues. Learning with
streams. 2004                                           drift detection. 2004

ADWIN: All Adjacent subwindows

                                     10101011011111 1


   A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                           DS’06 Barcelona   12 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]




ADWIN has rigorous guarantees
    On ratio of false positives
    On ratio of false negatives
    On the relation of the size of the current window and change rates




  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   13 / 29
The ADWIN Algorithm


Algorithm ADWIN [BG06]

Theorem
At every time step we have:
 1     (Few false positives guarantee) If µt remains constant within W ,
       the probability that ADWIN shrinks the window at this step is at
       most δ.
 2     (Few false negatives guarantee) If for any partition W in two parts
       W0 W1 (where W1 contains the most recent items) we have
       |µW0 − µW1 |  , and if

                                                 3 max{µW0 , µW1 } 4n
                                    ≥4·                           ln
                                                    min{n0 , n1 }    δ

       then with probability 1 − δ ADWIN shrinks W to W1 , or shorter.

     A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   14 / 29
The ADWIN Algorithm


Data Streams Algorithm ADWIN2 [BG06]

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                  Sliding Window Model

                 1010101 101 11 1 1
Content:                  4      2         2     1 1
Capacity:                  7      3        2     1 1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   15 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                       Insert new Item

                 1010101 101 11 1 1
Content:                  4      2         2     1 1
Capacity:                  7      3        2     1 1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   16 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                       Insert new Item

                 1010101 101 11 1 1                          1
Content:                  4      2         2     1 1         1
Capacity:                  7      3        2     1 1         1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   16 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                  Compressing Buckets

                 1010101 101 11 1 1                          1
Content:                  4      2         2     1 1         1
Capacity:                  7      3        2     1 1         1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   16 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                  Compressing Buckets

                 1010101 101 11 11 1
Content:                  4      2         2      2      1
Capacity:                  7      3        2      2      1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   17 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                  Compressing Buckets

                 1010101 101 11 11 1
Content:                  4      2         2      2      1
Capacity:                  7      3        2      2      1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   17 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                                  Compressing Buckets

                 1010101 10111 11 1
Content:                  4          4         2     1
Capacity:                  7         5         2     1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   18 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                         Detecting Change: Delete last Bucket

                 1010101 10111 11 1
Content:                 4           4         2     1
Capacity:                7           5         2     1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   19 / 29
The ADWIN Algorithm


Algorithm ADWIN2

ADWIN2 using a Data Stream Sliding Window Model,
    can provide the exact counts of 1’s in O(1) time per point.
    tries O(log W ) cutpoints
    uses O( 1 log W ) memory words
    the processing time per example is O(log W ) (amortized) and
    O(log2 W ) (worst-case).

                        Detecting Change: Delete last Bucket

                  10111 11 1
Content:                4      2    1
Capacity:               5      2    1


  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   19 / 29
General Framework


General Framework Time Change Detectors and
Predictors : Type I

Example (Kalman Filter)
                                                                                             Estimation
                                                                                           -
 xt
           -
                     Estimator




      A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   20 / 29
General Framework


General Framework Time Change Detectors and
Predictors : Type II

Example (Kalman Filter + CUSUM)
                                                                                             Estimation
                                                                                           -
 xt
           -
                     Estimator                                                               Alarm
                                                 -                                         -
                                                     Change Detect.




      A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   20 / 29
General Framework


General Framework Time Change Detectors and
Predictors : Type III

Example (Adaptive Kalman Filter)
                                                                                              Estimation
                                                                                            -
 xt
           -
                     Estimator

                              6


                    -
                                   Memory




      A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                              DS’06 Barcelona   20 / 29
General Framework


General Framework Time Change Detectors and
Predictors : Type IV

Example (ADWIN, Kalman Filter+ADWIN)
                                                                                              Estimation
                                                                                            -
 xt
           -
                     Estimator                                                                Alarm
                                                  -                                         -
                                                      Change Detect.
                              6
                                                          6
                                                          ?
                    -
                                   Memory




      A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                              DS’06 Barcelona   20 / 29
General Framework


Time Change Detectors and Predictors: A General
Framework


                                    No memory                                       Memory


   No Change                          Type I                                   Type III
    Detector                       Kalman Filter                        Adaptive Kalman Filter


     Change                           Type II                                 Type IV
     Detector                  Kalman Filter + CUSUM                          ADWIN
                                                                        Kalman Filter + ADWIN



  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   21 / 29
General Framework


Time Change Detectors and Predictors: A General
Framework



                               No memory                                          Memory


 No Change                        Type I                                  Type III
  Detector                     Kalman Filter                       Adaptive Kalman Filter
                                                                 Q,R estimated from window

  Change                       Type II                                    Type IV
  Detector              Kalman Filter + CUSUM                             ADWIN
                                                                   Kalman Filter + ADWIN
                                                                 Q,R estimated from window

  A. Bifet, R. Gavaldà (UPC)    Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                          DS’06 Barcelona   21 / 29
K-ADWIN


K-ADWIN = ADWIN + Kalman Filtering

                                                                                             Estimation
                                                                                           -
 xt
           -
                       Kalman                                                                Alarm
                                                 -                                         -
                                                             ADWIN
                             6
                                                        6
                                                        ?
                   -
                              ADWIN Memory


R = W 2 /50 and Q = 200/W , where W is the length of the window
maintained by ADWIN.


      A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   22 / 29
Experimental Validation of K-ADWIN


Tracking Experiments



KALMAN: R=1000;Q=1 Error= 854.97




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   23 / 29
Experimental Validation of K-ADWIN


Tracking Experiments
ADWIN : Error= 674.66




KALMAN: R=1000;Q=1 Error= 854.97




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   23 / 29
Experimental Validation of K-ADWIN


Tracking Experiments
K-ADWIN Error= 530.13




ADWIN : Error= 674.66




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   23 / 29
Experimental Validation of K-ADWIN


Naïve Bayes

                                   Example
 Data set that                       outlook              temp.       humidity            windy          play
 describes the                        sunny                 hot         high              false           no
 weather                              sunny                 hot         high               true           no
 conditions for                      overcast               hot         high              false          yes
 playing some                          rainy               mild         high              false          yes
 game.                                 rainy               cool        normal             false          yes
                                       rainy               cool        normal              true           no
                                     overcast              cool        normal              true          yes


          Assume we have to classify the following new instance:
              outlook temp. humidity windy play
               sunny    cool        high       true     ?


  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   24 / 29
Experimental Validation of K-ADWIN


Naïve Bayes

    Assume we have to classify the following new instance:
     outlook temp. humidity windy play
      sunny   cool       high        true     ?

    We classify the new instance:

    νNB = arg             max        P(νj )P(sunny |νj )P(cool|νj )P(high|νj )P(true|νj )
                      ν∈{yes,no}

    Conditional probabilities can be estimated directly as frequencies:

                               number of instances with attribute ai and class νj
          P(ai |νj ) =
                                total number of training instances with class νj

    Create one estimator for each frequence that needs estimation

  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   24 / 29
Experimental Validation of K-ADWIN


Experimental Validation of K-ADWIN



    We test Naïve Bayes Predictor and k-means clustering
    Method: replace counters by estimators
    Synthetic data where change is controllable
    Naïve Bayes: We compare accuracy of
            Static model: Training of 1000 samples every instant
            Dynamic model: replace probabilities counters by estimators
                                      %Dynamic
    computing the ratio                Static        with tests using 2000 samples.




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   25 / 29
Experimental Validation of K-ADWIN


Naïve Bayes Predictor


                                          Width           %Static      %Dynamic             % Dynamic/Static
         ADWIN                                            83,36%          80,30%                    96,33%
 Kalman Q = 1, R = 1000                                   83,22%          71,13%                    85,48%
  Kalman Q = 1, R = 1                                     83,21%          56,91%                    68,39%
 Kalman Q = .25, R = .25                                  83,26%          56,91%                    68,35%
    Adaptive Kalman                                       83,24%          76,21%                    91,56%
     CUSUM Kalman                                         83,30%          50,65%                    60,81%
       K-ADWIN                                            83,24%          81,39%                    97,77%
   Fixed-sized Window                        32           83,28%          67,64%                    81,22%
   Fixed-sized Window                       128           83,30%          75,40%                    90,52%
   Fixed-sized Window                       512           83,28%          80,47%                    96,62%
   Fixed-sized Window                      2048           83,24%          82,19%                    98,73%




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   26 / 29
Experimental Validation of K-ADWIN


k -means Clustering

                                                              σ = 0.15
                                          Width           Static Dynamic
         ADWIN                                            9,72         21,54
 Kalman Q = 1, R = 1000                                   9,72         19,72
 Kalman Q = 1, R = 100                                    9,71         17,60
 Kalman Q = .25, R = .25                                  9,71         22,63
    Adaptive Kalman                                       9,72         18,98
     CUSUM Kalman                                         9,72         18,29
        K-ADWIN                                           9,72         17,30
   Fixed-sized Window                       32            9,72         25,70
   Fixed-sized Window                      128            9,72         36,42
   Fixed-sized Window                      512            9,72         38,75
   Fixed-sized Window                      2048           9,72         39,64
   Fixed-sized Window                      8192           9,72         43,39
   Fixed-sized Window                     32768           9,72         53,82


  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   27 / 29
Experimental Validation of K-ADWIN


Results



    No estimator ever does much better than K-ADWIN
    K-ADWIN does much better than every other estimators in at least
    one context.
    Tracking problem K-ADWIN and ADWIN automatically do about as
    well as the Kalman filter with the best set of fixed covariance
    parameters.
    Naïve Bayes and k -means: K-ADWIN does somewhat better than
    ADWIN and far better than any memoryless Kalman filter.




  A. Bifet, R. Gavaldà (UPC)       Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                             DS’06 Barcelona   28 / 29
Conclusions


Conclusions and Future Work



    K-ADWIN tunes itself to the data stream at hand, with no need for
    the user to hardwire or precompute parameters.
    Better results than either memoryless Kalman Filtering or sliding
    windows with linear estimators.
    Future work :
            Tests on real-world, not only synthetic data.
            Other learning algorithms: algorithms for induction of decision
            trees.




  A. Bifet, R. Gavaldà (UPC)   Kalman Filters and Adaptive Windows for Learning in Data Streams
                                                                                         DS’06 Barcelona   29 / 29

More Related Content

More from Albert Bifet

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
Albert Bifet
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
Albert Bifet
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
Albert Bifet
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
Albert Bifet
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
Albert Bifet
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Albert Bifet
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
Albert Bifet
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
Albert Bifet
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
Albert Bifet
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
Albert Bifet
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labels
Albert Bifet
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid them
Albert Bifet
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
Albert Bifet
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Albert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
Albert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
Albert Bifet
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
Albert Bifet
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
Albert Bifet
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming Data
Albert Bifet
 

More from Albert Bifet (20)

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labels
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid them
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive Windows
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming Data
 

Recently uploaded

GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Zilliz
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 

Recently uploaded (20)

GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 

Kalman Filters and Adaptive Windows for Learning in Data Streams

  • 1. Kalman Filters and Adaptive Windows for Learning in Data Streams Albert Bifet Ricard Gavaldà Universitat Politècnica de Catalunya DS’06 Barcelona A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 1 / 29
  • 2. Outline 1 Introduction 2 The Kalman Filter and the CUSUM Test 3 The ADWIN Algorithm 4 General Framework 5 K-ADWIN 6 Experimental Validation of K-ADWIN 7 Conclusions A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 2 / 29
  • 3. Introduction Introduction Data Streams Sequence potentially infinite High amount of data: sublinear space High Speed of arrival: small constant time per example Estimation and prediction Distribution and concept drift K-ADWIN : Combination Kalman filter ADWIN : Adaptive window of recently seen data items. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 3 / 29
  • 4. Introduction Introduction Problem Given an input sequence x1 , x2 , . . . , xt , . . . we want to output at instant t a prediction xt+1 minimizing prediction error: |xt+1 − xt+1 | considering distribution changes overtime. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 4 / 29
  • 5. Introduction Time Change Detectors and Predictors: A General Framework Estimation - xt - Estimator A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 5 / 29
  • 6. Introduction Time Change Detectors and Predictors: A General Framework Estimation - xt - Estimator Alarm - - Change Detect. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 5 / 29
  • 7. Introduction Time Change Detectors and Predictors: A General Framework Estimation - xt - Estimator Alarm - - Change Detect. 6 6 ? - Memory A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 5 / 29
  • 8. Introduction Introduction Our particular proposal: K-ADWIN Our generic proposal: Kalman filter as estimator Use change detector Use ADWIN as change Use memory detector with memory [BG06] Application Estimate statistics from data streams In Data Mining Algorithms based on counters, replace them for estimators. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 6 / 29
  • 9. Introduction Data Mining Algorithms with Concept Drift No Concept Drift Concept drift DM Algorithm DM Algorithm input output input output - Counter5 - - - Counter4 Static Model Counter3 6 Counter2 Counter1 - Change Detect. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 7 / 29
  • 10. Introduction Data Mining Algorithms with Concept Drift No Concept Drift Concept Drift DM Algorithm DM Algorithm input output input output - Counter5 - - Estimator5 - Counter4 Estimator4 Counter3 Estimator3 Counter2 Estimator2 Counter1 Estimator1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 7 / 29
  • 11. The Kalman Filter and the CUSUM Test The Kalman Filter Optimal recursive algorithm Minimum mean-square error estimator Estimate the state x ∈ n of a discrete-time controlled process xk = Axk −1 + Buk + wk −1 with a measurement z ∈ m that is Zk = Hxk + vk . The random variables wk and vk represent the process and measurement noise (respectively). They are assumed to be independent (of each other), white, and with normal probability distributions p(w) ∼ N(0, Q) p(v ) ∼ N(0, R). A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 8 / 29
  • 12. The Kalman Filter and the CUSUM Test The Kalman Filter The difference equation of our discrete-time controlled process is Kk = Pk −1 /(Pk −1 + R) Xk = Xk −1 + Kk (zk − Xk −1 ) Pk = Pk (1 − Kk ) + Q A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 9 / 29
  • 13. The Kalman Filter and the CUSUM Test The Kalman Filter The difference equation of our discrete-time controlled process is Kk = Pk −1 /(Pk −1 + R) Xk = Xk −1 + Kk (zk − Xk −1 ) Pk = Pk (1 − Kk ) + Q The performance of the Kalman filter depends on the accuracy of the a-priori assumptions: linearity of the difference stochastic equation estimation of covariances Q and R, assumed to be fixed, known, and follow normal distributions with zero mean. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams Barcelona DS’06 9 / 29
  • 14. The Kalman Filter and the CUSUM Test The CUSUM Test The cumulative sum (CUSUM algorithm),is a change detection algorithm that gives an alarm when the mean of the input data is significantly different from zero. The CUSUM test is memoryless, and its accuracy depends on the choice of parameters υ and h. It is as follows: g0 = 0, gt = max (0, gt−1 + t − υ) if gt h then alarm and gt = 0 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 10 / 29
  • 15. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 16. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 1 W1 = 01010110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 17. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 10 W1 = 1010110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 18. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 101 W1 = 010110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 19. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 1010 W1 = 10110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 20. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 10101 W1 = 0110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 21. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 101010 W1 = 110111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 22. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 1010101 W1 = 10111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 23. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 W0 = 10101011 W1 = 0111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 24. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 |ˆW0 − µW1 | ≥ µ ˆ c : CHANGE DETECTED! W0 = 101010110 W1 = 111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 25. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 101010110111111 Drop elements from the tail of W W0 = 101010110 W1 = 111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 26. The ADWIN Algorithm Algorithm ADWIN [BG06] Example W = 01010110111111 Drop elements from the tail of W W0 = 101010110 W1 = 111111 ADWIN: A DAPTIVE W INDOWING A LGORITHM 1 Initialize Window W 2 for each t 0 3 do W ← W ∪ {xt } (i.e., add xt to the head of W ) 4 repeat Drop elements from the tail of W 5 until |ˆW0 − µW1 | ≥ c holds µ ˆ 6 for every split of W into W = W0 · W1 7 Output µW ˆ A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 11 / 29
  • 27. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 1 01010110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 28. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 10 1010110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 29. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 101 010110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 30. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 1010 10110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 31. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 10101 0110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 32. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 101010 110111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 33. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 1010101 10111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 34. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 10101011 0111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 35. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 101010110 111111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 36. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 1010101101 11111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 37. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 10101011011 1111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 38. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 101010110111 111 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 39. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 1010101101111 11 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 40. The ADWIN Algorithm Window Management Models W = 101010110111111 Equal fixed size subwindows Total window against subwindow 1010 1011011 1111 10101011011 1111 D. Kifer, S. Ben-David, and J. J. Gama, P. Medas, G. Castillo, Gehrke. Detecting change in data and P. Rodrigues. Learning with streams. 2004 drift detection. 2004 ADWIN: All Adjacent subwindows 10101011011111 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 12 / 29
  • 41. The ADWIN Algorithm Algorithm ADWIN [BG06] ADWIN has rigorous guarantees On ratio of false positives On ratio of false negatives On the relation of the size of the current window and change rates A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 13 / 29
  • 42. The ADWIN Algorithm Algorithm ADWIN [BG06] Theorem At every time step we have: 1 (Few false positives guarantee) If µt remains constant within W , the probability that ADWIN shrinks the window at this step is at most δ. 2 (Few false negatives guarantee) If for any partition W in two parts W0 W1 (where W1 contains the most recent items) we have |µW0 − µW1 | , and if 3 max{µW0 , µW1 } 4n ≥4· ln min{n0 , n1 } δ then with probability 1 − δ ADWIN shrinks W to W1 , or shorter. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 14 / 29
  • 43. The ADWIN Algorithm Data Streams Algorithm ADWIN2 [BG06] ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Sliding Window Model 1010101 101 11 1 1 Content: 4 2 2 1 1 Capacity: 7 3 2 1 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 15 / 29
  • 44. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Insert new Item 1010101 101 11 1 1 Content: 4 2 2 1 1 Capacity: 7 3 2 1 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 16 / 29
  • 45. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Insert new Item 1010101 101 11 1 1 1 Content: 4 2 2 1 1 1 Capacity: 7 3 2 1 1 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 16 / 29
  • 46. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Compressing Buckets 1010101 101 11 1 1 1 Content: 4 2 2 1 1 1 Capacity: 7 3 2 1 1 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 16 / 29
  • 47. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Compressing Buckets 1010101 101 11 11 1 Content: 4 2 2 2 1 Capacity: 7 3 2 2 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 17 / 29
  • 48. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Compressing Buckets 1010101 101 11 11 1 Content: 4 2 2 2 1 Capacity: 7 3 2 2 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 17 / 29
  • 49. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Compressing Buckets 1010101 10111 11 1 Content: 4 4 2 1 Capacity: 7 5 2 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 18 / 29
  • 50. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Detecting Change: Delete last Bucket 1010101 10111 11 1 Content: 4 4 2 1 Capacity: 7 5 2 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 19 / 29
  • 51. The ADWIN Algorithm Algorithm ADWIN2 ADWIN2 using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. tries O(log W ) cutpoints uses O( 1 log W ) memory words the processing time per example is O(log W ) (amortized) and O(log2 W ) (worst-case). Detecting Change: Delete last Bucket 10111 11 1 Content: 4 2 1 Capacity: 5 2 1 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 19 / 29
  • 52. General Framework General Framework Time Change Detectors and Predictors : Type I Example (Kalman Filter) Estimation - xt - Estimator A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 20 / 29
  • 53. General Framework General Framework Time Change Detectors and Predictors : Type II Example (Kalman Filter + CUSUM) Estimation - xt - Estimator Alarm - - Change Detect. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 20 / 29
  • 54. General Framework General Framework Time Change Detectors and Predictors : Type III Example (Adaptive Kalman Filter) Estimation - xt - Estimator 6 - Memory A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 20 / 29
  • 55. General Framework General Framework Time Change Detectors and Predictors : Type IV Example (ADWIN, Kalman Filter+ADWIN) Estimation - xt - Estimator Alarm - - Change Detect. 6 6 ? - Memory A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 20 / 29
  • 56. General Framework Time Change Detectors and Predictors: A General Framework No memory Memory No Change Type I Type III Detector Kalman Filter Adaptive Kalman Filter Change Type II Type IV Detector Kalman Filter + CUSUM ADWIN Kalman Filter + ADWIN A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 21 / 29
  • 57. General Framework Time Change Detectors and Predictors: A General Framework No memory Memory No Change Type I Type III Detector Kalman Filter Adaptive Kalman Filter Q,R estimated from window Change Type II Type IV Detector Kalman Filter + CUSUM ADWIN Kalman Filter + ADWIN Q,R estimated from window A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 21 / 29
  • 58. K-ADWIN K-ADWIN = ADWIN + Kalman Filtering Estimation - xt - Kalman Alarm - - ADWIN 6 6 ? - ADWIN Memory R = W 2 /50 and Q = 200/W , where W is the length of the window maintained by ADWIN. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 22 / 29
  • 59. Experimental Validation of K-ADWIN Tracking Experiments KALMAN: R=1000;Q=1 Error= 854.97 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 23 / 29
  • 60. Experimental Validation of K-ADWIN Tracking Experiments ADWIN : Error= 674.66 KALMAN: R=1000;Q=1 Error= 854.97 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 23 / 29
  • 61. Experimental Validation of K-ADWIN Tracking Experiments K-ADWIN Error= 530.13 ADWIN : Error= 674.66 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 23 / 29
  • 62. Experimental Validation of K-ADWIN Naïve Bayes Example Data set that outlook temp. humidity windy play describes the sunny hot high false no weather sunny hot high true no conditions for overcast hot high false yes playing some rainy mild high false yes game. rainy cool normal false yes rainy cool normal true no overcast cool normal true yes Assume we have to classify the following new instance: outlook temp. humidity windy play sunny cool high true ? A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 24 / 29
  • 63. Experimental Validation of K-ADWIN Naïve Bayes Assume we have to classify the following new instance: outlook temp. humidity windy play sunny cool high true ? We classify the new instance: νNB = arg max P(νj )P(sunny |νj )P(cool|νj )P(high|νj )P(true|νj ) ν∈{yes,no} Conditional probabilities can be estimated directly as frequencies: number of instances with attribute ai and class νj P(ai |νj ) = total number of training instances with class νj Create one estimator for each frequence that needs estimation A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 24 / 29
  • 64. Experimental Validation of K-ADWIN Experimental Validation of K-ADWIN We test Naïve Bayes Predictor and k-means clustering Method: replace counters by estimators Synthetic data where change is controllable Naïve Bayes: We compare accuracy of Static model: Training of 1000 samples every instant Dynamic model: replace probabilities counters by estimators %Dynamic computing the ratio Static with tests using 2000 samples. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 25 / 29
  • 65. Experimental Validation of K-ADWIN Naïve Bayes Predictor Width %Static %Dynamic % Dynamic/Static ADWIN 83,36% 80,30% 96,33% Kalman Q = 1, R = 1000 83,22% 71,13% 85,48% Kalman Q = 1, R = 1 83,21% 56,91% 68,39% Kalman Q = .25, R = .25 83,26% 56,91% 68,35% Adaptive Kalman 83,24% 76,21% 91,56% CUSUM Kalman 83,30% 50,65% 60,81% K-ADWIN 83,24% 81,39% 97,77% Fixed-sized Window 32 83,28% 67,64% 81,22% Fixed-sized Window 128 83,30% 75,40% 90,52% Fixed-sized Window 512 83,28% 80,47% 96,62% Fixed-sized Window 2048 83,24% 82,19% 98,73% A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 26 / 29
  • 66. Experimental Validation of K-ADWIN k -means Clustering σ = 0.15 Width Static Dynamic ADWIN 9,72 21,54 Kalman Q = 1, R = 1000 9,72 19,72 Kalman Q = 1, R = 100 9,71 17,60 Kalman Q = .25, R = .25 9,71 22,63 Adaptive Kalman 9,72 18,98 CUSUM Kalman 9,72 18,29 K-ADWIN 9,72 17,30 Fixed-sized Window 32 9,72 25,70 Fixed-sized Window 128 9,72 36,42 Fixed-sized Window 512 9,72 38,75 Fixed-sized Window 2048 9,72 39,64 Fixed-sized Window 8192 9,72 43,39 Fixed-sized Window 32768 9,72 53,82 A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 27 / 29
  • 67. Experimental Validation of K-ADWIN Results No estimator ever does much better than K-ADWIN K-ADWIN does much better than every other estimators in at least one context. Tracking problem K-ADWIN and ADWIN automatically do about as well as the Kalman filter with the best set of fixed covariance parameters. Naïve Bayes and k -means: K-ADWIN does somewhat better than ADWIN and far better than any memoryless Kalman filter. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 28 / 29
  • 68. Conclusions Conclusions and Future Work K-ADWIN tunes itself to the data stream at hand, with no need for the user to hardwire or precompute parameters. Better results than either memoryless Kalman Filtering or sliding windows with linear estimators. Future work : Tests on real-world, not only synthetic data. Other learning algorithms: algorithms for induction of decision trees. A. Bifet, R. Gavaldà (UPC) Kalman Filters and Adaptive Windows for Learning in Data Streams DS’06 Barcelona 29 / 29