Overcoming DataOps hurdles for ML in Production
August 2020
SANDEEP UTTAMCHANDANI
CHIEF DATA OFFICER and VP OF ENGINEERING
sandeep@unraveldata.com
Behind the scenes of an ML Model in Production
DataOps: from DATA to ML Model in Production
Stages: Discover, Prep, Build, Operationalize
Top 10 DataOps Battlescars
1. “I thought the attribute means something else”
Battlescar: Incorrect assumptions about what an attribute means, whether it is the source of truth, who its owner and common users are, how it is versioned, and whether the dataset is trustworthy.
Metric: Time to Interpret
Building a Self-Service Metadata Catalog
Levels of Automation: Gather technical metadata; Gather operational metadata; Aggregate tribal knowledge
Intuit
2. “Where is the dataset I need for my model?”
Battlescar: Building a customer support forecasting model: data was siloed across business units, and it took 4+ months of connecting with data stewards to locate the data attributes required to build the model.
Metric: Time to Find
Building a Self-Service Search Service
Levels of Automation: Indexing of datasets & artifacts; Search relevance ranking; Access control of search results
3. “1000 rows in source database -- why only 50 rows in data lake?”
Battlescar: Issues with correctness, completeness, and timeliness when moving data daily/hourly from transactional datastores to a centralized data lake.
Metric: Time to Move
Building a Self-Service Data Movement Service
Levels of Automation: Data Ingestion Configuration; Data Transformation; Change Management
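The completeness issue in this battlescar can be illustrated with a row-count reconciliation between source and lake. This is a minimal sketch, not the service described in the talk: the `CountCheck` class, the table names, and the 99% threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CountCheck:
    """Row counts for one table, measured at the source and in the data lake."""
    table: str
    source_rows: int
    lake_rows: int

    @property
    def completeness(self) -> float:
        # Fraction of source rows that arrived in the lake (1.0 for an empty source).
        if self.source_rows == 0:
            return 1.0
        return self.lake_rows / self.source_rows


def failed_checks(checks, min_completeness=0.99):
    """Return the checks whose completeness is below the (assumed) threshold."""
    return [c for c in checks if c.completeness < min_completeness]


# Hypothetical counts mirroring the "1000 rows vs 50 rows" quote.
checks = [
    CountCheck("orders", source_rows=1000, lake_rows=50),
    CountCheck("customers", source_rows=200, lake_rows=200),
]
for c in failed_checks(checks):
    print(f"{c.table}: {c.lake_rows}/{c.source_rows} rows arrived")
```

Running such a check right after each ingestion, rather than when a dashboard looks wrong, is one way to shorten Time to Move.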
4. “Job completed but dashboard graphs have data missing?”
Battlescar: Jobs are orchestrated using schedulers (such as Airflow, Oozie). On several occasions the job dependencies were incorrect, causing reporting or model training jobs to be triggered prematurely.
Metric: Time to Orchestrate
Building a Self-Service Orchestration Service
Levels of Automation: Defining Job Dependencies; Robust Job Execution; Production Monitoring
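To make "defining job dependencies" concrete, the sketch below declares dependencies explicitly and derives a run order from them using Python's standard-library `graphlib` (Python 3.9+). It is not Airflow or Oozie code; the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it must wait for.
dependencies = {
    "ingest_orders": set(),
    "ingest_customers": set(),
    "build_features": {"ingest_orders", "ingest_customers"},
    "train_model": {"build_features"},
    "refresh_dashboard": {"build_features"},
}


def execution_order(deps):
    """Return a valid run order; graphlib raises CycleError if dependencies loop."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

With the dependencies declared this way, `train_model` and `refresh_dashboard` can never be placed ahead of `build_features`, which is the premature-trigger failure the battlescar describes.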
5. “Data processing was supposed to complete at 8 am. It's 4 pm and my model retraining job is still waiting?”
Battlescar: Writing efficient Big Data processing applications is non-trivial. With a plethora of technologies, gaining broad expertise is difficult even for expert data engineers.
Metric: Time to Optimize
Building a Self-Service Query Optimization Service
Levels of Automation: Aggregating query, cluster, and resource stats; Analyzing & correlating stats; Tuning Jobs
6. “Customer changed preference to no marketing emails. Why are we still including them in email campaigns?”
Battlescar: Without a consistent primary key to identify the customer across data silos, recurring issues arise. Emerging data rights regulations such as GDPR and CCPA require complying with customer preferences on what data is collected, how it is used, and whether it is deleted on request.
Metric: Time to Comply
Building a Self-Service Data Rights Governance Service
Levels of Automation: Tracking customer data lifecycle and preferences; Executing customer’s data rights requests; Use-case based access control
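The email-campaign quote can be sketched as a suppression filter keyed on a shared customer identifier. The sketch assumes the silos have already been reconciled onto one hypothetical `customer_id`, and it treats a missing preference as "do not email", which is a conservative assumption and not something stated in the talk.

```python
# Hypothetical preference store, keyed by a customer_id shared across silos.
preferences = {
    "c1": {"marketing_email": False},  # opted out
    "c2": {"marketing_email": True},   # opted in
}

# Hypothetical campaign audience pulled from a different silo.
campaign_audience = ["c1", "c2", "c3"]


def eligible_recipients(audience, prefs):
    """Keep only customers with an explicit opt-in; unknown customers are dropped."""
    return [
        cid for cid in audience
        if prefs.get(cid, {}).get("marketing_email") is True
    ]


recipients = eligible_recipients(campaign_audience, preferences)
print(recipients)
```

The filter only works when every silo resolves to the same key, which is why the battlescar singles out the missing consistent primary key.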
7. “Job pipeline ran for 15 hours and now we detect a data quality issue upon completion -- could we be proactive?”
Battlescar: Data issues in a long-running, business-critical job lead to missing insights. Only when the results don't look correct do we realize there is an issue.
Metric: Time to Insights Quality
Building a Self-Service Data Observability Service
Levels of Automation: Verify accuracy of data; Detect anomalies; Avoid data quality issues
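One way to be proactive is to check a simple statistic, such as the input row count, before the long-running job starts. The sketch below flags a value that sits far from its recent history. The three-sigma rule, the function name, and the sample counts are assumptions for illustration, not the method used in the talk.

```python
from statistics import mean, stdev


def is_anomalous(history, today, max_sigma=3.0):
    """Flag today's value if it is more than max_sigma standard deviations from the mean."""
    mu = mean(history)
    sigma = stdev(history)  # needs at least two historical values
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > max_sigma


# Hypothetical daily row counts for the pipeline's input table.
daily_row_counts = [1000, 1020, 980, 1010, 990]

for todays_count in (1005, 50):
    status = "anomalous" if is_anomalous(daily_row_counts, todays_count) else "ok"
    print(todays_count, status)
```

Gating the pipeline on such a check turns a 15-hour wait into a failure at the first stage.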
8. “Using the best polyglot datastores -- how do I now write queries effectively across this data?”
Battlescar: Significant time is spent planning, designing, and writing queries that process data across datastores.
Metric: Time to Virtualize Datastores
Building a Self-Service Data Virtualization Service
Levels of Automation: Automatic query routing; Managing datastore-specific queries; Joining across transactional sources
9. “I ran an A/B experiment -- need to build time-consuming data pipelines to now analyze the data”
Battlescar: Analyzing experimental results in a consistent fashion is a nightmare. There are no consistent definitions between the metrics used for experimental analysis and those used for business reporting.
Metric: Time to A/B Test
Building a Self-Service A/B Testing Service
10. “Data processing jobs last week cost us 30% more. Why?”
Battlescar: Especially in the cloud, dollar cost scales linearly with usage. Tracking budgets and spend well enough to optimize them requires non-trivial effort.
Metric: Time to Cost Governance
Building a Self-Service Cost Governance Service
Levels of Automation: Expenditure Observability; Matching Supply-Demand; Continuous Cost Optimization
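Expenditure observability starts with being able to answer "which job caused the increase?". The sketch below compares per-job spend across two weeks. The job names and dollar figures are invented so that the total rises by 30%, matching the quote.

```python
def cost_changes(last_week, this_week):
    """Return (job, delta) pairs for every job, largest cost increase first."""
    jobs = set(last_week) | set(this_week)
    deltas = {j: this_week.get(j, 0.0) - last_week.get(j, 0.0) for j in jobs}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


def total_increase(last_week, this_week):
    """Relative change in total spend, e.g. 0.3 for a 30% increase."""
    before = sum(last_week.values())
    after = sum(this_week.values())
    return (after - before) / before


# Hypothetical weekly spend per job, in dollars.
last_week = {"etl_orders": 400.0, "train_model": 500.0, "reports": 100.0}
this_week = {"etl_orders": 410.0, "train_model": 790.0, "reports": 100.0}

top_job, top_delta = cost_changes(last_week, this_week)[0]
print(f"total change: {total_increase(last_week, this_week):.0%}")
print(f"largest contributor: {top_job} (+${top_delta:.2f})")
```

A per-job breakdown like this only covers the first level of automation listed on the slide. Matching supply to demand and continuous optimization need usage data as well.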
Wrap up: Advice on Managing your DataOps
DataOps hurdles vary and depend on...
People, Process, Technology
Self-Service has levels (not binary)
Measuring Current DataOps: Time-to-Insight Metric
TIME-TO-INSIGHT spans the stages from DATA onward: Discover, Prep, Build, Operationalize
Time-to-Insight Scorecard
Stages: Discover, Prep, Build, Operationalize
Creating Your Time-to-Insight Scorecard
Stages: Discover, Prep, Build, Operationalize
Legend: Weeks, Days, Hours
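A scorecard of this kind can be sketched by bucketing each measured metric into the legend's three levels. The bucket thresholds (under one day, under one week) and the sample durations are assumptions for illustration. The slide gives only the legend labels.

```python
from datetime import timedelta


def legend_bucket(duration):
    """Map a measured duration to the scorecard legend: Hours, Days, or Weeks."""
    if duration < timedelta(days=1):
        return "Hours"
    if duration < timedelta(weeks=1):
        return "Days"
    return "Weeks"


# Hypothetical measurements for three of the metrics named in the talk.
measurements = {
    "Time to Find": timedelta(weeks=16),
    "Time to Interpret": timedelta(days=3),
    "Time to Orchestrate": timedelta(hours=6),
}

scorecard = {metric: legend_bucket(d) for metric, d in measurements.items()}
for metric, bucket in scorecard.items():
    print(f"{metric}: {bucket}")
```

Metrics that land in the "Weeks" bucket are natural candidates for the shortlist of one or two metrics to automate first.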
Call for Action: Making DataOps Self-Service
1. Measure: Create your Time-to-Insight Scorecard
2. Learn: Shortlist 1-2 scorecard metrics to improve level of automation
3. Build: Implement well-known design patterns in your data platform to make the metrics self-service
Upcoming Book: The Self-Service Data Roadmap
Available Sept’20
Early Release Available on O’Reilly:
https://www.oreilly.com/library/view/the-self-service-data/9781492075240/
CONTACT US TO SCHEDULE A DATA OPERATIONS HEALTH CHECK TODAY
hello@unraveldata.com
