October
2019
How to use a Kalman Filter in Brand Tracking?
Jane Tang
Webinar, 4
October 2019
#NewMR Sponsors
Communication
Gold
Silver
October
2019
How to use a Kalman Filter in Brand Tracking?
Jane Tang
jane.tang@bootstrapanalytics.com
3
FLY ME TO THE MOON:
The Application Of Kalman Filter To Tracking Data
Jane Tang, Andrew Grenville & Karen Buros
SAMPLING IN THE RIVER TWITTER:
A DIY Tool for Social Media Tracking via the REST API
Jack Horne & Jane Tang
Advanced Research Techniques Forum 2016 & 2017
4
- “Smooth” trajectory
- Periodical observations
- Random measurement errors
- Simple and flexible tool to optimally separate the signal from the
noise in the data
5
Tracking movement of
a spacecraft going to the moon
https://ieeexplore.ieee.org/document/5466132
Apollo Space Program
Tracking perception of a brand in the
minds of consumers
Brand Tracking
How Kalman Filter Works
http://bilgin.esme.org/BitsBytes/KalmanFilterforDummies.aspx
http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf
!𝑋# = 𝐾# & 𝑋# + 1 − 𝐾# & !𝑋#*+
estimation at time t
Kalman gain
measured value estimation at time (t – 1)
State/prior equation
,𝑥# = .𝑥#*+ + 𝑤# → 𝑝 𝑤 ~𝑁 0, 𝑄
Measurement/update equation
.𝑥# = 𝑥# + 𝑣# → 𝑝 𝑣 ~𝑁 0, 𝑅
6
Equations Simplify to a Random Walk
!𝑋# = 𝐾# & 𝑋# + 1 − 𝐾# & !𝑋#*+
estimation at time t
Kalman gain
measured value estimation at time (t – 1)
Time update
“prediction”
,𝑥# = .𝑥#*+
(1) Prior for subsequent state
(2) Prior error covariance
9𝑃# = 𝑃#*+ + 𝑄
external shock
Measurement update
“correction”
random walk
(1) Compute the Kalman gain
𝐾# = 9𝑃#
9𝑃# + 𝑅
*+
random measurement error
(2) Update estimate
.𝑥# = ,𝑥# + 𝐾# 𝑥# − ,𝑥#
(3) Update error covariance
𝑃# = 9𝑃# 1 − 𝐾#random walk
7
A Simple Example
𝑡 𝑥# ,𝑥#
9𝑃# Time update Measurement update .𝑥# 𝑃#
1 0.390 0.000 1.000 ,𝑥< = 0
9𝑃< = 1
𝐾# = 9𝑃#
9𝑃# + 𝑅
*+
=1 1 + 0.1 *+
= 0.909
.𝑥# = ,𝑥# + 𝐾# 𝑥# − ,𝑥#
= 0 + 0.909 0.390 − 0 = 0.355
𝑃# = 9𝑃# 1 − 𝐾#
=1 1 − 0.909 = 0.091
0.355 0.091
2 0.500 0.355 0.092 ,𝑥# = .𝑥#*+ = 0.355
9𝑃# = 𝑃#*+ + 𝑄 = 0.092
𝐾# = 0.092 0.092 + 0.1 *+
= 0.479
.𝑥# = 0.355 + 0.479 0.5 − 0.355 = 0.424
𝑃# = 0.092 1 − 0.479 = 0.048
0.424 0.048
3 0.480 0.424 0.049 0.443 0.033
4 0.290 0.443 0.034 0.404 0.025
𝑅 = 0.1 variance around observations (best guess) 𝑄 = 0.001 external shock variance
8
A Simple Example
Value(x)
Time (t)
9
Size of the Measurement Error (R)
Value(x)
Time (t)
10
Detecting Trend
Value(x)
Time (t)
11
12
† We have large, long-term (6+ years) tracking programs, carefully controlled wave-to-
wave, with trend lines like these. How clear is the story?
Continuing to gain ground?
Year 1.1Year 1.2Year 2.1 Year2.2 Year 3.1Year 3.2Year 4.1Year 4.2Year 5.1Year 5.2Year 6.1Year 6.2Year 7.1
Likelihood to Recommend
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
Yr 1A Yr 1B Yr 2A Yr 2B Yr 3A Yr 3B Yr 4A Yr 4B Yr 5A Yr 5B Yr 6A Yr 6B Yr 7A
Favorite
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
Stabilizing?
Losing steam?
The Truth: I Dread Tracking Data
13
First, let’s standardize the results to the
baseline measurement
Then we can apply the Kalman Filter
80
90
100
110
120
130
140
150
160
Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1
Kalman Estimate Likely to Recommend: Ratio to Baseline
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
80
90
100
110
120
130
140
150
160
Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1
Observed Likely to Recommend: Ratio to Baseline
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
What’s happening here?
Can the Kalman Filter Clarify the Story?
14
A More Detailed The Effect of the Filter?
† Kalman Filter for Competitors 2 and 3 – Softening the extremes
80
90
100
110
120
130
140
150
160
Year
1.1
Year
1.2
Year
2.1
Year2.2 Year
3.1
Year
3.2
Year
4.1
Year
4.2
Year
5.1
Year
5.2
Year
6.1
Year
6.2
Year
7.1
Likelihood to Recommend: Competitors 2 and 3
Observed 2
Estimated 2
Observed 3
Estimated 3
15
Taking a Look at Favorite Brand…
We standardize the results to the baseline
measurement
Then apply the Kalman Filter
40
60
80
100
120
140
160
180
Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1
Observed Favorite: Ratio to Baseline
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
40
60
80
100
120
140
160
180
Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1
Kalman Estimate Favorite: Ratio to Baseline
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
Competitor 4 Suffers
16
A More Detailed Look at Competitors 4 and 5
† Again, softening the extremes
40
60
80
100
120
140
160
180
Year
1.1
Year
1.2
Year
2.1
Year2.2
Year
3.1
Year
3.2
Year
4.1
Year
4.2
Year
5.1
Year
5.2
Year
6.1
Year
6.2
Year
7.1
Favorite: Competitors 4 and 5
Observed 4
Estimated 4
Observed 5
Estimated 5
KF provides a way to
simultaneously account
for ‘time’ (as an
alternative to ‘three
month rolling average’)
and measurement error.
17
Can We Apply This Approach to Social Media Data?
† Social positive sentiment over the tracking
period
† Application of the Kalman Filter smooths the
trend lines a bit. Greater variation across time is
likely due to changing social media audience
creating the ‘sentiment’ values.
0.40
0.45
0.50
0.55
0.60
0.65
0.70
Year 1.1Year 1.2Year 2.1 Year2.2 Year 3.1Year 3.2Year 4.1Year 4.2Year 5.1Year 5.2Year 6.1Year 6.2Year 7.1
Social Media: Positive Sentiment
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
0.40
0.45
0.50
0.55
0.60
0.65
0.70
Year
1.1
Year
1.2
Year
2.1
Year2.2 Year
3.1
Year
3.2
Year
4.1
Year
4.2
Year
5.1
Year
5.2
Year
6.1
Year
6.2
Year
7.1
Kalman Estimate: Positive Sentiment
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5
Competitor 6
18
Running Average vs. Kalman Filter
Royal Bank, Olympic Sponsorship Awareness, Jan-Jun 2008 (pre), Jan-Jun 2010 (during), and Jan-Jun 2011 (post)
Incorporating Known System Instability
𝑡 𝑥# ,𝑥#
9𝑃# .𝑥# 𝑃#
1 0.390 0.000 1.000 0.355 0.091
𝑅 = 0.1 variance around observations (best guess) 𝑄 = 𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒
𝑞#
0.001
2 0.500 0.355 0.092 0.424 0.0480.001
3 0.480 0.424 0.049 0.443 0.0330.001
4 0.290 0.443 0.034 0.404 0.0250.001
5 0.250 0.404 0.026 0.356 0.0310.020
6 0.320 0.356 0.031 0.340 0.0450.050
7 0.340 0.340 0.045 0.340 0.0590.100
8 0.480 0.340 0.059 0.397 0.0410.010
9 0.410 0.397 0.041 0.401 0.0300.001
9𝑃# = 𝑃#*+ + 𝑄
external shock
𝐾# = 9𝑃#
9𝑃# + 𝑅
*+
random measurement error
𝐾#
0.909
0.479
0.328
0.253
0.312
0.448
0.592
0.409
0.295
𝑃# = 9𝑃# 1 − 𝐾#
19
Incorporating Known System Instability
0.000
0.100
0.200
0.300
0.400
0.500
0.600
1 2 3 4 5 6 7 8 9 10
Observed Estimated with System Shock
Value(x)
Time (t)
Q
20
K
20K
40K
60K
19 26 2 9 16 23 2 9 16 23 30 6 13 20 27 4 11 18 25 1
NumberofTweetsmentioning“trump”
0
January 2017 February March April May
Incorporating Twitter Brand Mention in Brand Tracking
Twitter Brand mention is used as an indicator of
system instability.
“Buzz” brand equity volatility∝
9𝑃# = 𝑃#*+ + 𝑄
June
21
Tracking the “Trump” Brand
January 2017 February March April May
Trumpapprovalrating(SBA)
Total US Sample
June
38%
40%
42%
44%
46%
48%
50%
25 27 30 3 6 10 15 17 27 1 10 13 15 17 22 27 29 5 7 12 19 21 24 26 28 5 8 17 24 2 7
Raw KF - Constant KF w Twitter Updates
22
“Trump” Brand Tracking in Subpopulation
Trumpapprovalrating(SBA)
Midwest Female
January 2017 February March April May June
20%
30%
40%
50%
60%
25 27 30 3 6 10 15 17 27 1 10 13 15 17 22 27 29 5 7 12 19 21 24 26 28 5 8 17 24 2 7
Raw KF - Constant KF w Twitter Updates
23
THANK YOU!
jane.tang@bootstrapanalytics.com
24
25
Q & A
Ray Poynter
Chief Research Officer
Potentiate
Jane Tang
Principal
Bootstrap Analytics
#NewMR Sponsors
Communication
Gold
Silver
October
2019

How to use a Kalman Filter in Brand Tracking?

  • 1.
    October 2019 How to usea Kalman Filter in Brand Tracking? Jane Tang Webinar, 4 October 2019
  • 2.
  • 3.
    How to usea Kalman Filter in Brand Tracking? Jane Tang jane.tang@bootstrapanalytics.com 3
  • 4.
    FLY ME TOTHE MOON: The Application Of Kalman Filter To Tracking Data Jane Tang, Andrew Grenville & Karen Buros SAMPLING IN THE RIVER TWITTER: A DIY Tool for Social Media Tracking via the REST API Jack Horne & Jane Tang Advanced Research Techniques Forum 2016 & 2017 4
  • 5.
    - “Smooth” trajectory -Periodical observations - Random measurement errors - Simple and flexible tool to optimally separate the signal from the noise in the data 5 Tracking movement of a spacecraft going to the moon https://ieeexplore.ieee.org/document/5466132 Apollo Space Program Tracking perception of a brand in the minds of consumers Brand Tracking
  • 6.
    How Kalman FilterWorks http://bilgin.esme.org/BitsBytes/KalmanFilterforDummies.aspx http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf !𝑋# = 𝐾# & 𝑋# + 1 − 𝐾# & !𝑋#*+ estimation at time t Kalman gain measured value estimation at time (t – 1) State/prior equation ,𝑥# = .𝑥#*+ + 𝑤# → 𝑝 𝑤 ~𝑁 0, 𝑄 Measurement/update equation .𝑥# = 𝑥# + 𝑣# → 𝑝 𝑣 ~𝑁 0, 𝑅 6
  • 7.
    Equations Simplify toa Random Walk !𝑋# = 𝐾# & 𝑋# + 1 − 𝐾# & !𝑋#*+ estimation at time t Kalman gain measured value estimation at time (t – 1) Time update “prediction” ,𝑥# = .𝑥#*+ (1) Prior for subsequent state (2) Prior error covariance 9𝑃# = 𝑃#*+ + 𝑄 external shock Measurement update “correction” random walk (1) Compute the Kalman gain 𝐾# = 9𝑃# 9𝑃# + 𝑅 *+ random measurement error (2) Update estimate .𝑥# = ,𝑥# + 𝐾# 𝑥# − ,𝑥# (3) Update error covariance 𝑃# = 9𝑃# 1 − 𝐾#random walk 7
  • 8.
    A Simple Example 𝑡𝑥# ,𝑥# 9𝑃# Time update Measurement update .𝑥# 𝑃# 1 0.390 0.000 1.000 ,𝑥< = 0 9𝑃< = 1 𝐾# = 9𝑃# 9𝑃# + 𝑅 *+ =1 1 + 0.1 *+ = 0.909 .𝑥# = ,𝑥# + 𝐾# 𝑥# − ,𝑥# = 0 + 0.909 0.390 − 0 = 0.355 𝑃# = 9𝑃# 1 − 𝐾# =1 1 − 0.909 = 0.091 0.355 0.091 2 0.500 0.355 0.092 ,𝑥# = .𝑥#*+ = 0.355 9𝑃# = 𝑃#*+ + 𝑄 = 0.092 𝐾# = 0.092 0.092 + 0.1 *+ = 0.479 .𝑥# = 0.355 + 0.479 0.5 − 0.355 = 0.424 𝑃# = 0.092 1 − 0.479 = 0.048 0.424 0.048 3 0.480 0.424 0.049 0.443 0.033 4 0.290 0.443 0.034 0.404 0.025 𝑅 = 0.1 variance around observations (best guess) 𝑄 = 0.001 external shock variance 8
  • 9.
  • 10.
    Size of theMeasurement Error (R) Value(x) Time (t) 10
  • 11.
  • 12.
    12 † We havelarge, long-term (6+ years) tracking programs, carefully controlled wave-to- wave, with trend lines like these. How clear is the story? Continuing to gain ground? Year 1.1Year 1.2Year 2.1 Year2.2 Year 3.1Year 3.2Year 4.1Year 4.2Year 5.1Year 5.2Year 6.1Year 6.2Year 7.1 Likelihood to Recommend Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 Yr 1A Yr 1B Yr 2A Yr 2B Yr 3A Yr 3B Yr 4A Yr 4B Yr 5A Yr 5B Yr 6A Yr 6B Yr 7A Favorite Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 Stabilizing? Losing steam? The Truth: I Dread Tracking Data
  • 13.
    13 First, let’s standardizethe results to the baseline measurement Then we can apply the Kalman Filter 80 90 100 110 120 130 140 150 160 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Kalman Estimate Likely to Recommend: Ratio to Baseline Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 80 90 100 110 120 130 140 150 160 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Observed Likely to Recommend: Ratio to Baseline Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 What’s happening here? Can the Kalman Filter Clarify the Story?
  • 14.
    14 A More DetailedThe Effect of the Filter? † Kalman Filter for Competitors 2 and 3 – Softening the extremes 80 90 100 110 120 130 140 150 160 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Likelihood to Recommend: Competitors 2 and 3 Observed 2 Estimated 2 Observed 3 Estimated 3
  • 15.
    15 Taking a Lookat Favorite Brand… We standardize the results to the baseline measurement Then apply the Kalman Filter 40 60 80 100 120 140 160 180 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Observed Favorite: Ratio to Baseline Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 40 60 80 100 120 140 160 180 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Kalman Estimate Favorite: Ratio to Baseline Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 Competitor 4 Suffers
  • 16.
    16 A More DetailedLook at Competitors 4 and 5 † Again, softening the extremes 40 60 80 100 120 140 160 180 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Favorite: Competitors 4 and 5 Observed 4 Estimated 4 Observed 5 Estimated 5 KF provides a way to simultaneously account for ‘time’ (as an alternative to ‘three month rolling average’) and measurement error.
  • 17.
    17 Can We ApplyThis Approach to Social Media Data? † Social positive sentiment over the tracking period † Application of the Kalman Filter smooths the trend lines a bit. Greater variation across time is likely due to changing social media audience creating the ‘sentiment’ values. 0.40 0.45 0.50 0.55 0.60 0.65 0.70 Year 1.1Year 1.2Year 2.1 Year2.2 Year 3.1Year 3.2Year 4.1Year 4.2Year 5.1Year 5.2Year 6.1Year 6.2Year 7.1 Social Media: Positive Sentiment Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6 0.40 0.45 0.50 0.55 0.60 0.65 0.70 Year 1.1 Year 1.2 Year 2.1 Year2.2 Year 3.1 Year 3.2 Year 4.1 Year 4.2 Year 5.1 Year 5.2 Year 6.1 Year 6.2 Year 7.1 Kalman Estimate: Positive Sentiment Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 Competitor 6
  • 18.
    18 Running Average vs.Kalman Filter Royal Bank, Olympic Sponsorship Awareness, Jan-Jun 2008 (pre), Jan-Jun 2010 (during), and Jan-Jun 2011 (post)
  • 19.
    Incorporating Known SystemInstability 𝑡 𝑥# ,𝑥# 9𝑃# .𝑥# 𝑃# 1 0.390 0.000 1.000 0.355 0.091 𝑅 = 0.1 variance around observations (best guess) 𝑄 = 𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒 𝑞# 0.001 2 0.500 0.355 0.092 0.424 0.0480.001 3 0.480 0.424 0.049 0.443 0.0330.001 4 0.290 0.443 0.034 0.404 0.0250.001 5 0.250 0.404 0.026 0.356 0.0310.020 6 0.320 0.356 0.031 0.340 0.0450.050 7 0.340 0.340 0.045 0.340 0.0590.100 8 0.480 0.340 0.059 0.397 0.0410.010 9 0.410 0.397 0.041 0.401 0.0300.001 9𝑃# = 𝑃#*+ + 𝑄 external shock 𝐾# = 9𝑃# 9𝑃# + 𝑅 *+ random measurement error 𝐾# 0.909 0.479 0.328 0.253 0.312 0.448 0.592 0.409 0.295 𝑃# = 9𝑃# 1 − 𝐾# 19
  • 20.
    Incorporating Known SystemInstability 0.000 0.100 0.200 0.300 0.400 0.500 0.600 1 2 3 4 5 6 7 8 9 10 Observed Estimated with System Shock Value(x) Time (t) Q 20
  • 21.
    K 20K 40K 60K 19 26 29 16 23 2 9 16 23 30 6 13 20 27 4 11 18 25 1 NumberofTweetsmentioning“trump” 0 January 2017 February March April May Incorporating Twitter Brand Mention in Brand Tracking Twitter Brand mention is used as an indicator of system instability. “Buzz” brand equity volatility∝ 9𝑃# = 𝑃#*+ + 𝑄 June 21
  • 22.
    Tracking the “Trump”Brand January 2017 February March April May Trumpapprovalrating(SBA) Total US Sample June 38% 40% 42% 44% 46% 48% 50% 25 27 30 3 6 10 15 17 27 1 10 13 15 17 22 27 29 5 7 12 19 21 24 26 28 5 8 17 24 2 7 Raw KF - Constant KF w Twitter Updates 22
  • 23.
    “Trump” Brand Trackingin Subpopulation Trumpapprovalrating(SBA) Midwest Female January 2017 February March April May June 20% 30% 40% 50% 60% 25 27 30 3 6 10 15 17 27 1 10 13 15 17 22 27 29 5 7 12 19 21 24 26 28 5 8 17 24 2 7 Raw KF - Constant KF w Twitter Updates 23
  • 24.
  • 25.
    25 Q & A RayPoynter Chief Research Officer Potentiate Jane Tang Principal Bootstrap Analytics
  • 26.