BIG DATA AGILE ANALYTICS
Ken Collier, PhD
Director, Agile Analytics
@theagilist #thoughtworks
Axes: Value, Complexity
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
Traditional
Business Intelligence
Advanced Analytics
Agile
Analytics
Big Data
Solutions
Thinking
Ethics
Agile
Delivery
Lean
Learning
Impact
Volume
Velocity
Variety
NoSQL
Complexity
Polyglot
Persistence
Big Data Analytics Pipeline
Modeling Data
Operational Data
External Data
Data Integration
Reporting Engine
Dimension Mapping
Clean Data
Reports
Dimensional Data
Data Sampling
Feature Selection
Data Partitioning
Test Data
Training Data
Analytical Modeling
Candidate Model
Model Validation
Accepted Model
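The modeling half of the pipeline (data sampling, feature selection, data partitioning, analytical modeling, model validation, accepted model) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the one-feature threshold "model", the `label` field, and the `min_accuracy` acceptance rule are all hypothetical and do not come from the slides. Feature selection is reduced here to naming a single column.

```python
import random

def sample(rows, fraction, rng):
    """Data sampling: keep a random fraction of the clean rows."""
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)

def partition(rows, train_fraction, rng):
    """Data partitioning: shuffle a copy, then split into training and test data."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_threshold_model(training, feature):
    """Analytical modeling: a deliberately trivial candidate model (an assumption).
    It predicts label 1 when the feature exceeds the midpoint of the two class means."""
    pos = [r[feature] for r in training if r["label"] == 1]
    neg = [r[feature] for r in training if r["label"] == 0]
    if not pos or not neg:
        raise ValueError("training data must contain both classes")
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda row: 1 if row[feature] > threshold else 0

def validate(model, test):
    """Model validation: accuracy of the candidate model on held-out test data."""
    correct = sum(1 for r in test if model(r) == r["label"])
    return correct / len(test)

def run_pipeline(rows, feature, min_accuracy=0.8, seed=0):
    """Run sampling, partitioning, modeling and validation; accept or reject the model."""
    rng = random.Random(seed)
    sampled = sample(rows, 0.5, rng)
    training, test = partition(sampled, 0.7, rng)
    candidate = fit_threshold_model(training, feature)
    accuracy = validate(candidate, test)
    return accuracy >= min_accuracy, accuracy

# Made-up, cleanly separable toy rows, used only to exercise the sketch.
rows = [{"spend": (i % 10) + (100 if i % 2 else 0), "label": i % 2} for i in range(100)]
accepted, accuracy = run_pipeline(rows, "spend")
```

A rejected candidate would send the team back to sampling or feature selection with a different hypothesis, which is where the short build, measure, learn cycles come in.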
Discover &
Explore
Analyze & Act
Data Convergence Analytical Divergence
Discover
Harvest
Filter
Integrate Augment
Analyze
Act
Analytical Opportunities
How Advanced Analytics Works
If we knew X,
we could do Y
Typical Timeline
3-6 months 2 months 2-4 months
Data Convergence Analytical Divergence
Discover
Harvest
Filter
Integrate Augment
Analyze
Act
Analytical Opportunities
Traditional Analytics
If we knew X,
we could do Y
Continuous
Integration
Collaboration
Evolve
Continuous
Delivery
Hypothesis
Build
Learn
Measure
Analytical Divergence
Analytical Opportunities
If we knew X,
we could do Y
Data Convergence
Discover
Harvest
Filter
Integrate Augment
Analyze
Act
Repeat this cycle solving small problems every few days
LEARN
MEASURE
BUILD
Agility in Analytics
Retain high value
customers
High value business
goal
Like this example…
What’s the
smallest, simplest
thing we can do?
Retain high value
customers
Like this example…
Common features of
defectors?
Is it useful &
actionable?
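A first, small iteration on "Common features of defectors?" might simply compare defection rates across the values of one customer attribute. The sketch below is a hypothetical illustration: the `channel` and `defected` fields, the `min_lift` cutoff, and the toy records are assumptions made up for the example, not data from the talk.

```python
from collections import defaultdict

def defection_rate_by_value(customers, feature):
    """Return {feature value: share of customers with that value who defected}."""
    totals = defaultdict(int)
    defected = defaultdict(int)
    for c in customers:
        totals[c[feature]] += 1
        if c["defected"]:
            defected[c[feature]] += 1
    return {value: defected[value] / totals[value] for value in totals}

def standout_values(customers, feature, min_lift=1.5):
    """Feature values whose defection rate is at least min_lift times the overall rate."""
    overall = sum(1 for c in customers if c["defected"]) / len(customers)
    if overall == 0:
        return {}
    rates = defection_rate_by_value(customers, feature)
    return {value: rate for value, rate in rates.items() if rate >= min_lift * overall}

# Made-up toy records, used only to exercise the sketch.
customers = (
    [{"channel": "web", "defected": True}] * 3
    + [{"channel": "web", "defected": False}] * 1
    + [{"channel": "store", "defected": True}] * 1
    + [{"channel": "store", "defected": False}] * 3
)
flagged = standout_values(customers, "channel")
```

Whether a flagged value is "useful & actionable" is still a business judgment; the code only narrows where to look next.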
Retain high value
customers
Like this example…
Common features of
defectors?
Repeat!
Retain high value
customers
Like this example…
Common features of
defectors?
Shopping behaviors of
defectors?
Retain high value
customers
Like this example…
Common features of
defectors?
What leads to customers
leaving?
Shopping behaviors of
defectors?
What do defectors say
about us?
Customers’ sentiment
before defecting?
What encourages
customers to stay?
Do incentives reduce
defection rates?
Problem
solved or
continue?
What leads to customers
leaving?
Like this example…
Common features of
defectors?
Shopping behaviors of
defectors?
What do defectors say
about us?
Customers’ sentiment
before defecting?
What encourages
customers to stay?
Do incentives reduce
defection rates?
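A question such as "Do incentives reduce defection rates?" can be framed as a hypothesis and measured with a simple experiment. One possible sketch, assuming a control group and an incentive group and using a one-sided two-proportion z-test (a technique chosen here for illustration, not one named in the slides), is shown below. The function name and the counts are hypothetical.

```python
import math

def two_proportion_z(defect_control, n_control, defect_incentive, n_incentive):
    """One-sided two-proportion z-test.
    Null hypothesis: the incentive group defects at the same rate as control.
    Returns (z, p_value), where p_value is P(Z >= z) under the null."""
    p_control = defect_control / n_control
    p_incentive = defect_incentive / n_incentive
    pooled = (defect_control + defect_incentive) / (n_control + n_incentive)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_incentive))
    if se == 0:
        raise ValueError("pooled defection rate is 0 or 1; the test is undefined")
    z = (p_control - p_incentive) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
    return z, p_value

# Made-up counts: 30 of 100 control customers defect, 15 of 100 incentivised ones do.
z, p_value = two_proportion_z(30, 100, 15, 100)
incentive_seems_to_help = p_value < 0.05
```

A small p-value supports continuing with incentives; a large one suggests the problem is not solved and the next hypothesis should be tried.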
Data
Science
Machine
Learning
Statistics
THE “DATA SCIENTIST”
Machine Learning
Statistical Modeling
Artificial Neural Networks
Decision Tree Learning
Support Vector Machines
Clustering
…and many more…
Bayesian Classification
Monte Carlo Simulation
Logistic Regression
K-Nearest Neighbor
…and many more…
Domain Knowledge
Data Semantics
Business Understanding
Business Communication
Programming Skills
Functional Programming
Data “Wrangling”
Map/Reduce, SQL, & NoSQL
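Of the techniques listed for the data scientist, K-Nearest Neighbor is among the simplest to sketch. The example below is a minimal illustration, assuming Euclidean distance and majority voting; the function name, the feature vectors, and the "stay"/"leave" labels are made up for the example.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `training` is a list of (feature_vector, label) pairs."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up toy points: two well-separated groups of customers.
training = [
    ((0, 0), "stay"), ((0, 1), "stay"), ((1, 0), "stay"),
    ((5, 5), "leave"), ((5, 6), "leave"), ((6, 5), "leave"),
]
prediction = knn_predict(training, (0.5, 0.5))
```

In practice the features would usually be scaled first, since raw Euclidean distance lets the attribute with the largest range dominate the vote.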
Visual
Storytelling
drones.pitchinteractive.com
Data Visualization
Data
Reduction
Objective Truth
Discoverable Truth
Uninterpretable
Irrelevant
Noise
Not
Actionable
Impactful
New Insights
“Little Data”
Insight
Knowledge
Action
Disruption
Business vs. IT
Focus vs. Platform
Monitor & Measure
Privacy Controls
Radical Transparency
Data Democracy
Open Data
Ken Collier, Director, Agile Analytics
kcollier@thoughtworks.com
Cool New Technologies
+
Sophisticated Analytics
+
Lean Learning Principles
+
Fast Agile Delivery
=
Value Creation

Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks