Agile Data Science
Alexander Bauer
Lead Data Scientist @ Lidl
Frankfurt Analytics Meetup, 2017/02/24
Agenda
• Data Science
• Challenges
• Agile Data Science Projects
• Case Study
What is Data Science?
• Data science, also known as data-driven
science, is an interdisciplinary field about
scientific methods, processes and systems
to extract knowledge or insights from data
in various forms, either structured or
unstructured – Wikipedia
Business Goals
Why do companies hire data scientists?
• Reduce costs
• Increase revenue
• Reduce risk
• Create innovation
Deliverables
How do data scientists deliver?
• Actionable insights (reports)
• Data products
• New product features
• Trials, A/B Testing
Challenges
Why do many data science projects fail?
• Lack of Business Understanding
• Data Access (Security, Privacy)
• Deployment and Operation (Scalability, Acceptance)
• Time to market (Competition, Budget)
Case Study: Data Science for Sales Department
"I want a recommender system for my Sales Reps."
"Sure, we can use Alternating Least Squares Singular Value Decomposition!"
Case Study: Data Science for Sales Department
"Show me what you can do with Deep Learning."
"Cool, we can do something with TensorFlow on your data."
Case Study: Data Science for Sales Department
"I want a dashboard of sales by country and product."
"Well, we can do visualizations, but that's actually not my job!"
Typical pitfalls during project execution
Project phases: Modeling, Trial/Pilot, Operationalization
Timeline: 12 months
• No access to data
• Not enough signal
• Model does not scale
• Users don't accept solution
• Fails to meet business objective
• Out of budget
Solution: Iterative Approach
CRISP-DM
Agile Data Science
How can we implement CRISP-DM in practice?
• Agile Product Management
• Agile Development
• Data Science Platform / Data Lake
Agile Product Management – The Product Vision Statement¹
Vision: "Leverage data science to increase sales team productivity"
Target Group
• Sales Reps
• Sales Manager
Needs
• Close deals
• Prioritize leads
• Prevent churn
• Acquire new leads
• Up-sell
• Cross-sell
Product
• ?
Business Goals
• Increase conversion rate
• Increase average basket size
• Reduce churn rate
• Grow customer base
¹ Roman Pichler: Agile Product Management with Scrum
User Stories – Bridging the gap between
algorithms and business needs
Association Rules:
As a sales rep, I need to understand which products are often bought together, so that I
can recommend additional products during sales calls and increase up-selling.
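A minimal sketch of what a first "good enough" version of this story could look like, assuming purchase data is available as one list of products per order. The function name `pair_rules`, the `min_support` threshold and the toy baskets are illustrative assumptions, not part of the talk; only pairs of products are considered.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.2):
    """Return (antecedent, consequent, support, confidence) for product pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))          # ignore duplicates within one order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of orders containing both items
        if support < min_support:
            continue
        # confidence(a -> b) = P(b in order | a in order)
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules, key=lambda r: r[3], reverse=True)

# Hypothetical toy data, for illustration only
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "jam"],
    ["butter"],
]
rules = pair_rules(baskets, min_support=0.5)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

A sales rep could read the top rule as "customers who buy the first product usually also buy the second", which maps directly onto the story above.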
Churn Factor Analysis:
As a sales rep, I need to understand the factors that drive churn so that I can select
customers to call, make sure they are satisfied with our products and reduce churn.
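As a rough illustration, a very first iteration of this story could simply compare churn rates across the levels of one candidate factor. This is a simple stand-in, not a statistical factor analysis or a predictive model; the field names `open_ticket` and `churned` and the toy records are hypothetical.

```python
from collections import defaultdict

def churn_rate_by_level(customers, factor):
    """Return {factor level: share of customers at that level who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for customer in customers:
        level = customer[factor]
        totals[level] += 1
        churned[level] += int(customer["churned"])
    return {level: churned[level] / totals[level] for level in totals}

# Hypothetical toy data, for illustration only
customers = [
    {"open_ticket": True, "churned": True},
    {"open_ticket": True, "churned": True},
    {"open_ticket": True, "churned": False},
    {"open_ticket": False, "churned": False},
    {"open_ticket": False, "churned": True},
    {"open_ticket": False, "churned": False},
    {"open_ticket": False, "churned": False},
]
rates = churn_rate_by_level(customers, "open_ticket")
for level, rate in rates.items():
    print(f"open_ticket={level}: churn rate {rate:.2f}")
```

A large gap between levels is only a hint that the factor deserves a closer look; it does not establish that the factor causes churn.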
Recommender system:
As a sales rep, for each customer I need to understand which products were bought by
customers with similar purchase history, so that I can make personalized
recommendations and increase up-selling.
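One possible way to sketch this story is an item-item recommender based on cosine similarity over binary purchase data. The talk does not specify the similarity measure, so that choice is an assumption, as are the function names `item_similarities` and `recommend` and the toy data.

```python
from collections import defaultdict
from math import sqrt

def item_similarities(purchases):
    """purchases: {customer: set of products}. Returns {(a, b): cosine similarity}."""
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)
    sims = {}
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                # cosine similarity of two binary "who bought it" vectors
                sims[(a, b)] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def recommend(customer, purchases, sims, top_n=3):
    """Rank products the customer does not own by summed similarity to owned ones."""
    owned = purchases[customer]
    scores = defaultdict(float)
    for (a, b), similarity in sims.items():
        if a in owned and b not in owned:
            scores[b] += similarity
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Hypothetical toy data, for illustration only
purchases = {
    "c1": {"laptop", "mouse"},
    "c2": {"laptop", "mouse", "dock"},
    "c3": {"laptop", "dock"},
    "c4": {"mouse"},
}
sims = item_similarities(purchases)
print(recommend("c4", purchases, sims))
```

The output is a "Top N items per customer" list, which is the form in which a sales rep would actually consume the model.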
Story Mapping and Release Planning
Releases: Release 1, Release 2, Release 3
Up/Cross-Selling
• Association Rules
• Item-Item Recommender
• Content-based recommender for cold-start (incl. CRM data)
Churn Prevention
• Factor Analysis
• Simple predictive model for churn (sales history data)
• Improved predictive model for churn (incl. CRM data)
Leads Prioritization
• Conversion Factor Analysis
User Interface/Deployment
• Viz: Top N items per customer
• Viz: Top N customers likely to churn
• A/B Testing
• A/B Testing
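The A/B Testing items in the story map could be evaluated with a two-proportion z-test on conversion counts. This is a sketch under that assumption; the talk does not say which test was used, and the function name and the example numbers are hypothetical.

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between groups A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # standard normal CDF expressed via the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value

# Hypothetical example: control group A vs. sales reps using recommendations (B)
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Running the test on each release keeps the discussion in Sprint Reviews focused on whether the increment moved a business metric.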
Agile Development with Scrum
Data Science is a Team Sport
Data Lake / Agile Platform
Data sources: CRM, Purchase Data, Call Center Tickets, Legacy Systems
Platform Layer: Unstructured Data, Structured Data, Scalable Job Execution / Query Engine, ETL, Scheduling, Security/Auth, Auditing, Monitoring
Application Layer: Docker/VMs, Apps (REST), Query Interface/Notebooks, Visualization Tools
Users: Business Users, Analysts/Data Scientists
Summary / Call for Action
• Data science projects rarely fail because of insufficient modeling skills
• Focus on business value; deliver "good enough" models first
• Deliver in small increments that already provide end-to-end value, and present them to all stakeholders in Sprint Reviews
• Manage stakeholders using a clear product vision, a user story backlog and release plans
• Deploy as early as possible to ensure user acceptance; label early releases as "beta"
• Build an infrastructure that enables agile development
Thank you! Questions?
