This document summarizes Ryan Kirk's presentation at StampedeCon 2016 about predicting outcomes in cloud IoT environments. The presentation covers IoT and cloud computing landscapes, challenges with prediction in different business domains, and lessons learned from data science projects. It discusses stages of a prediction lifecycle model and how different domains like business, engineering and research are involved in each stage. Key challenges and solutions addressed include developing a domain model, approaches for handling variability and uncertainty, techniques for anomaly detection, and the importance of feedback loops and training data evaluation.
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyzing - StampedeCon 2016
2. Things we will cover
26-Jul-16, presented by Ryan Kirk at StampedeCon 2016
GOAL
Explain Cloud IoT, its challenges, and a principled, agile approach to prediction amidst uncertainty in such a way that people from a broad audience can (hopefully) relate.
WILL
► IoT, Cloud landscape, and CTL
► Prediction Lifecycle
► Challenges by business domain
► Data Science Lessons Learned
WILL NOT
► Big Data
► Architecture
► Algorithms
► Technology
4. Who I am
I am interested in creating intelligent systems through incorporating humans and machines in an active learning loop.
► Decision Scientist with PhD in HCI from Iowa State
► Principal Data Scientist for CenturyLink Cloud
► Curricular Design, Educational Technology, Online Advertising, Online Retail, Big Data UX, Cloud, IoT, Physics
► Hiking, Data journalism, Stocks, Horse Racing
ryankirk.info
5. Who we are: CenturyLink Cloud
CLOUD + COLOCATION + NETWORK + MANAGED SERVICES
6. What is IoT
Human desire to connect ourselves to each other via technology
► Modern plumbing…
► Telegraph → Telephone
► Telephone → Dial-up
► Dial-up → HSN
► HSN → WAN
► WAN → IoT
Human desire to connect ourselves to each other via technology to empower each other
7. Internet growth > Hardware growth
motherboard.vice.com
newscientist.com
8. CenturyLink Cloud IoT Advantage
► 37 states
► 550,000 miles of network
► Innovative Gigabit fiber network
► 25MM+ consumer endpoints
► 60+ DCs
10. Problem statement:
► Prevent incidents through early detection
► Reduce MTTR by facilitating root-cause analytics
► Facilitate domain experts and harvest their knowledge
11. GOAL
Build a real-time artificial intelligence capable of analyzing all incoming streams of data in order to know which actions our machines need to automatically take.
It’s simple, really… build Skynet
13. Prediction Adoption Model
Stage I: INTRODUCTION
1. Design
2. Measure
Stage II: GROWTH
3. Describe
4. Detect
Stage III: MATURITY
5. Predict
6. Act
Stage IV: DECLINE
7. Feedback
8. Obsolescence
(chart: sophistication rising over time through INTRO, GROWTH, MATURITY, then DECLINE)
14. Prediction Adoption Model (actual)
(chart: the same sophistication-vs.-time curve, relabeled per stage)
Stage I: CHECK THIS OUT
1. It runs
2. Results are promising
Stage II: OH NO, OH NO, OH NO!
3. It works but it’s terrible
4. It will never scale
Stage III: HAHA, IT WORKED!
5. I surprise myself sometimes
6. I found a shortcut to scale it
Stage IV: I NEVER SAID IT WOULD…
7. How do I prove it is still working?
8. There is no way to apply it to this scenario
15. Stage I: INTRODUCTION
1. Design
► What should we measure?
► What are the core business processes?
► What is the unit of analysis?
► What are our research questions/hypotheses?
2. Measure
► Do we push or pull?
► How often should we measure?
► How long do we need the data?
► How do we represent the data schema?
16. Stage II: GROWTH
3. Describe
► Which metrics relate to our outcomes of interest?
► What is the typical value of each metric?
► How do you visualize each metric?
4. Detect
► What do we expect to happen?
► Which values/events are unexpected?
► When should we alert?
► How will we scale our analysis?
17. Stage III: MATURITY
5. Predict
► Are there patterns?
► Are there more complex relationships?
► What is going to happen?
► How do we get training data?
6. Act
► What actions should we take?
► How can we incorporate new outcomes into the current model?
18. Stage IV: DECLINE
7. Feedback
► Is my model primarily basing its decisions upon its previous decisions?
► Can I separate the model from its parameters?
► Can I still evaluate accuracy?
8. Obsolescence
► Are my business scenarios still grounded?
► Do my model assumptions still hold?
► Does it still scale?
► Is the intervention still needed?
19. Domain process involvement
BUSINESS
► Is involved early in defining requirements
ENGINEERING
► Builds MVP
► Solidifies solution
RESEARCH
► Builds prototype and suggests solution
21. Working backwards
ITEM
1 Skynet
2 Action mapping
3 Action landscape
4 Prediction
5 Categorical learning
6 Training Data
7 Feedback loop
8 High SNR
9 Unsupervised learning
10 Anomaly Detection
11 Normalization
12 Retention
13 Sampling
14 Collection
15 Approach
16 Domain model
“In life, unless you’re more gifted than Einstein, inversion [i.e. working backwards] will help you solve problems.”
Charlie Munger
22. Working backwards (cont.)
ITEM STAGE
1 Skynet ACT
2 Action mapping ACT
3 Action landscape ACT
4 Prediction PREDICT
5 Categorical learning PREDICT
6 Training Data PREDICT
7 Feedback loop PREDICT
8 High SNR DETECT
9 Unsupervised learning DETECT
10 Anomaly Detection DETECT
11 Normalization DESCRIBE
12 Retention DESCRIBE
13 Sampling MEASURE
14 Collection MEASURE
15 Approach DESIGN
16 Domain model DESIGN
23. Working backwards (cont.)
ITEM STAGE PRIMARY DOMAIN
1 Skynet ACT ENGINEERING
2 Action mapping ACT BUSINESS
3 Action landscape ACT RESEARCH
4 Prediction PREDICT RESEARCH
5 Categorical learning PREDICT RESEARCH
6 Training Data PREDICT ENGINEERING
7 Feedback loop PREDICT BUSINESS
8 High SNR DETECT RESEARCH
9 Unsupervised learning DETECT RESEARCH
10 Anomaly Detection DETECT RESEARCH
11 Normalization DESCRIBE RESEARCH
12 Retention DESCRIBE ENGINEERING
13 Sampling MEASURE RESEARCH
14 Collection MEASURE ENGINEERING
15 Approach DESIGN RESEARCH
16 Domain model DESIGN BUSINESS
24. This is a WIP
ITEM STAGE PRIMARY DOMAIN
1 Skynet ACT ENGINEERING
2 Action mapping ACT BUSINESS
3 Action landscape ACT RESEARCH
4 Prediction PREDICT RESEARCH
5 Categorical learning PREDICT RESEARCH
6 Training Data PREDICT ENGINEERING
7 Feedback loop PREDICT BUSINESS
8 High SNR DETECT RESEARCH
9 Unsupervised learning DETECT RESEARCH
10 Anomaly Detection DETECT RESEARCH
11 Normalization DESCRIBE RESEARCH
12 Sampling MEASURE RESEARCH
13 Collection MEASURE ENGINEERING
14 Domain model DESIGN BUSINESS
Status: PRODUCTION / WORKING / QUEUED (StampedeCon 2017?)
26. 16. DOMAIN MODEL
► 938,076 metrics
► Verify the unique stream of data across systems
► Key-based
DESIGN
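As an illustration of what "key-based" might mean in practice, each metric stream can be identified by one composite, hashable key that stays stable across systems. The field names below are invented for illustration, not CenturyLink's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricKey:
    """One unique, hashable key per metric stream (hypothetical fields)."""
    datacenter: str
    host: str
    service: str
    metric: str

    def __str__(self):
        return ".".join((self.datacenter, self.host, self.service, self.metric))

key = MetricKey("dc1", "host42", "hypervisor", "cpu_percent")
print(key)  # dc1.host42.hypervisor.cpu_percent
```

Because the key is frozen and hashable, the same identifier can index per-stream state across the collection, retention, and detection stages.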
27. 15. APPROACH
VARIABILITY
► Changes in observed state
► Plan for variability
UNCERTAINTY
► Unobserved state(s)
► Design for uncertainty
DESIGN (cont.)
28. 14. COLLECTION
► Agreement of signals
► Cacophony of signals
► How often should we measure?
► We have no labeled training data
► An approach we can build upon in the future
MEASURE
29. 13. SAMPLING
Shannon-Nyquist Paradox
► The more often you measure something, the more it varies
► Bias related to time and variability
► E.g., “the temperature yesterday was 68 degrees” depends on when and how often it was sampled
MEASURE (cont.)
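The "how often should we measure?" question has a classical floor: the Nyquist-Shannon sampling theorem says a signal must be sampled at more than twice its highest frequency, or it aliases into a different-looking signal. A minimal sketch of the effect (my own illustration, not from the talk):

```python
import numpy as np

def sample(signal_hz, sample_hz, seconds=1.0):
    """Sample a sine wave of signal_hz at sample_hz samples per second."""
    n = int(sample_hz * seconds)
    t = np.arange(n) / sample_hz
    return np.sin(2 * np.pi * signal_hz * t)

faithful = sample(10, 100)  # 100 Hz is well above 2 x 10 Hz: recoverable
aliased = sample(10, 12)    # 12 Hz is below the Nyquist rate of 20 Hz

# The undersampled 10 Hz wave is indistinguishable from a 2 Hz wave
# (sign-flipped), because 10 Hz aliases to 10 - 12 = -2 Hz.
assert np.allclose(aliased, -sample(2, 12))
```

For metrics, the same bias appears in reverse: sample a volatile signal too slowly and you will report a smooth, misleading summary of it.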
30. 12. RETENTION
► Recall that precision relates to sampling consistency
► Not all metrics are created equal
► Coverage remains problematic
DESCRIBE
31. 11. NORMALIZATION
Kievit, R. A., Frankenhuis, W. E., et al. (2013). Simpson’s paradox in psychological science. Frontiers in Psychology.
Simpson’s Paradox
► aggregate trend != sum of individual trends
► Applies to all aggregates: sums, averages, correlations, etc.
► What is the unit of analysis?
DESCRIBE (cont.)
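Simpson's paradox is easy to reproduce with toy counts. In the classic pattern below (illustrative numbers, not from the talk), variant A wins inside each subgroup yet loses overall, because the group sizes are skewed:

```python
# (successes, trials) per variant, per subgroup; sizes deliberately skewed.
groups = {
    "small": {"A": (81, 87),   "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

# A beats B inside every subgroup...
for arms in groups.values():
    assert rate(*arms["A"]) > rate(*arms["B"])

# ...yet B beats A in the aggregate: 273/350 = 78% vs. 289/350 = 83%.
total_a = (81 + 192, 87 + 263)
total_b = (234 + 55, 270 + 80)
assert rate(*total_a) < rate(*total_b)
```

This is why the unit-of-analysis question matters: any aggregate can point the opposite way from every one of its subgroups.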
32. (figure: predicted values vs. actual boundary)
10. ANOMALY DETECTION
► Capture the time series data for each piece of connected platform technology
► Find implicit anomalies within a time series vector
► Values that are surprising
► Highly scalable
DETECT
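The talk does not name its detection algorithm; one common, highly scalable baseline for flagging "values that are surprising" is a rolling z-score computed independently per stream. A sketch under that assumption, not the production system:

```python
import numpy as np

def zscore_anomalies(series, window=30, threshold=3.0):
    """Flag indices whose value sits more than `threshold` standard
    deviations from the mean of the preceding `window` observations."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = history.mean(), history.std()
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

rng = np.random.default_rng(0)
cpu = rng.normal(50, 2, size=200)  # a steady metric stream
cpu[150] = 90                      # inject one spike
flags = zscore_anomalies(cpu)      # index 150 is among the flagged points
```

Because each stream needs only its own trailing window of state, this style of detector parallelizes trivially across hundreds of thousands of metrics.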
33. 9. UNSUPERVISED LEARNING + 8. HIGH SNR
DETECT (cont.)
► Time series data shows the context behind anomalies that co-occur
► Group anomalous vectors based upon structural properties and co-occurrence
► Up-level anomalies into higher-order alerts using contextual information
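One simple way to realize the co-occurrence idea: bucket anomaly events whose timestamps fall within a short window of each other, and treat each bucket as a candidate higher-order alert. The event format (timestamp, metric name) and the window size are invented for illustration:

```python
def group_by_cooccurrence(anomalies, window_s=60):
    """Group (timestamp, metric) events that occur within window_s
    seconds of the previous event in the same group."""
    events = sorted(anomalies)
    groups, current = [], [events[0]]
    for event in events[1:]:
        if event[0] - current[-1][0] <= window_s:
            current.append(event)
        else:
            groups.append(current)
            current = [event]
    groups.append(current)
    return groups

events = [(0, "cpu"), (12, "disk_io"), (15, "net"), (500, "cpu")]
groups = group_by_cooccurrence(events)
# two groups: the first three events co-occur; the last stands alone
```

Condensing many raw anomalies into a few contextual groups is what raises the signal-to-noise ratio of the alerts humans actually see.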
34. 7. FEEDBACK LOOP
PREDICT
► We have also built a search engine for time series data that allows us to build cool-looking graphs in real time
► We basically do all of this to empower Slack alerts
► Allows tags to propagate forwards
35. 6. TRAINING DATA
► Evaluate ALL assumptions with regard to training data
► Ideally use an active learning approach, or risk becoming tautological
PREDICT (cont.)
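A minimal uncertainty-sampling loop, the usual shape of active learning: ask the expert to label only the points the model is least confident about, then refit. The toy 1-D model and oracle below are invented for illustration:

```python
class ThresholdModel:
    """Toy 1-D classifier: predicts 1 when x > cut."""
    def __init__(self, cut=0.5):
        self.cut = cut

    def confidence(self, x):
        return abs(x - self.cut)  # far from the boundary = confident

    def fit(self, labeled):
        ones = [x for x, y in labeled if y == 1]
        zeros = [x for x, y in labeled if y == 0]
        if ones and zeros:  # place the cut midway between the classes
            self.cut = (max(zeros) + min(ones)) / 2

def active_learning_round(model, pool, oracle, labeled, budget=3):
    """Query the `budget` least-confident points, then refit."""
    queries = sorted(pool, key=model.confidence)[:budget]
    labeled += [(x, oracle(x)) for x in queries]
    model.fit(labeled)
    return [x for x in pool if x not in queries]

oracle = lambda x: int(x > 0.7)  # hidden ground truth the expert knows
pool, labeled = [0.1, 0.4, 0.55, 0.6, 0.72, 0.9], []
model = ThresholdModel()
for _ in range(2):
    pool = active_learning_round(model, pool, oracle, labeled)
# model.cut has moved from 0.5 toward the true boundary at 0.7
```

The safeguard against tautology is structural: every label comes from an outside oracle, never from the model's own previous decisions.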
37. Prediction Results
► 38,392,438 predictions every 24hr.
► Anomaly rate < 0.01% (0.0001): ~3K anomalies/day
► Accuracy is ~90%
► Prediction latency ~3.0 seconds
► ~30 Higher order alerts/day
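These headline figures are mutually consistent; a quick arithmetic check (my own, not from the deck):

```python
predictions_per_day = 38_392_438
anomaly_rate_cap = 0.0001  # the slide's "< 0.01%" upper bound

# ~3.8K/day upper bound, in line with the quoted "~3K anomalies/day",
# which ~30 higher-order alerts then condense by roughly 100x.
anomalies_cap = predictions_per_day * anomaly_rate_cap
print(round(anomalies_cap))
```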
38. Want to join me?
Let’s connect:
► @ryan_kirk
Try CenturyLink Cloud free:
► ctl.io
We are hiring
► ctl.io/careers/jobs
Thanks to:
► StampedeCon2016
► pixabay.com