Accelerating the ML Lifecycle with an Enterprise-Grade Feature Store

Productionizing real-time ML models poses unique data engineering challenges for enterprises that are coming from batch-oriented analytics. Enterprise data, which has traditionally been centralized in data warehouses and optimized for BI use cases, must now be transformed into features that provide meaningful predictive signals to our ML models.

  1. Accelerating the ML Lifecycle with an Enterprise-Grade Feature Store. Mike Del Balso, Tecton; Geoff Sims, Atlassian. Spark+AI Summit, Wednesday, 6/24, 3:40pm PDT.
  2. About Us. Mike Del Balso: Product Manager / Ads ML; Product Manager / Created Michelangelo Platform; Co-founder and CEO / Data Platform for ML. Geoff Sims: Principal Data Scientist.
  3. Machine learning today often falls short of its potential: limited predictive data; long development cycles; painful path to production.
  4. Operational ML powers automated business decisions and customer experiences at scale. Common use cases: fraud detection, CTR prediction, pricing, customer support, recommendation, search, segmentation, chat bots. Characteristics: customer-facing impact; time-sensitive; production SLAs; subject to regulation; cross-functional stakeholders.
  5. Building operational ML applications is very complex. Data is at the core of that complexity. Diagram labels: Configuration, Automation, Data Collection, Data Verification, Feature Engineering, Testing and Debugging, Resource Management, Metadata Management, Process Management, Serving Infrastructure, Monitoring, ML Code, Model Analysis; highlight label: Data-Related. Source: Sculley, David, et al. "Machine learning: The high interest credit card of technical debt." (2014).
  6. Features are the signals we extract from data and are a critical part of any ML application. “Applied machine learning is basically feature engineering.” – Andrew Ng. Example feature table (user_id, user_clicks_last7d, user_in_target_cou..., ...): (1001, 2, 1, ...); (1002, 13, 1, ...); (1003, 0, 0, ...); ...
  7. Tooling for managing features is almost non-existent. Diagram labels, each row spanning Development to Production: Apps: DevOps Pipeline (Build, Run); Models: MLOps Pipeline (Model Training, Model Serving); Features / Data: ? (Feature Engineering, Feature Serving).
  8. Tecton is a data platform for ML applications: faster development cycles; lower time-to-production; lower operational cost; easier adoption of ML across teams.
  9. Critical problems solved by data platforms for ML: managing sprawling feature transform logic; building high-quality training sets from messy data; deploying to production and moving beyond batch to real-time.
  10. Challenge 1: Managing sprawling and disconnected feature transform logic. Features are some of the most highly curated and refined data in a business, yet they are also some of the most poorly managed assets. Diagram: Raw Data → Features → Model.
  11. Challenge 1: Managing sprawling and disconnected feature transform logic.
  12. Solution: Tecton centrally manages features as software assets. Diagram: Raw Data → Feature Store → Model.
  13. Solution: Easily contribute new features to the feature store. 1. Define a feature (feature_repo/ads_behavior/user_clicks.py). 2. Save it to the feature store with the Tecton CLI ($ tecton apply).
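
The idea on slide 13 of treating a feature as a software asset can be illustrated with a minimal sketch. This is not Tecton's API: `FeatureDefinition`, `FeatureRegistry`, and `count_clicks` are hypothetical names invented here, and the in-memory registry only stands in for a real feature store. The click rows are made-up sample data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass(frozen=True)
class FeatureDefinition:
    """A feature declared as code: a name, its entity key, and a transform."""
    name: str
    entity_key: str
    transform: Callable[[List[Row]], Dict[int, int]]


class FeatureRegistry:
    """Hypothetical in-memory stand-in for a central feature store."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDefinition] = {}

    def apply(self, feature: FeatureDefinition) -> None:
        # Refuse duplicate names so each feature has exactly one definition.
        if feature.name in self._features:
            raise ValueError(f"feature already registered: {feature.name}")
        self._features[feature.name] = feature

    def names(self) -> List[str]:
        return sorted(self._features)

    def compute(self, name: str, rows: List[Row]) -> Dict[int, int]:
        # Look up the single shared definition and run its transform.
        return self._features[name].transform(rows)


def count_clicks(rows: List[Row]) -> Dict[int, int]:
    """Count click rows per user_id."""
    counts: Dict[int, int] = {}
    for row in rows:
        counts[row["user_id"]] = counts.get(row["user_id"], 0) + 1
    return counts


registry = FeatureRegistry()
registry.apply(FeatureDefinition("user_clicks", "user_id", count_clicks))

# Made-up click log for illustration only.
clicks = [
    {"user_id": 1001, "ad_id": 444},
    {"user_id": 1001, "ad_id": 555},
    {"user_id": 1002, "ad_id": 444},
]
result = registry.compute("user_clicks", clicks)
```

Because every consumer looks the feature up by name, there is one place to review, version, and change the transform logic.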
  15. Challenge 2: Building high-quality training sets from messy data. Common challenges assembling training data: stitching multiple data pipelines / data sources together; feature backfills; time travel + point-in-time correctness; data leakage; delivering training data to training jobs.
  16. Solution: Configuration-based training data set generation through simple APIs. 1. Configure what features you want in a training dataset. 2. Get training data for the events of interest.
  17. Solution: Built-in row-level time travel for accurate training data. 1. Modeler provides event time and entity IDs (user_id, ad_id, timestamp): (111, 444, 2020-02-01...); (222, 555, 2020-02-02...); (333, 666, 2020-02-01...); ... 2. Tecton returns point-in-time correct feature values (distinct_users, user_clicks_last7d, user_in_target_cou..., ...): (24, 2, 1, ...); (21, 13, 1, ...); (20, 0, 0, ...); ...
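
Point-in-time correctness, as described on slide 17, means each training row only sees feature values that were known at the event time. The sketch below shows one simple way to do that lookup with the standard library. It is an illustration under stated assumptions, not how Tecton implements it: `value_as_of` is a hypothetical helper, each entity's history is assumed to be sorted by timestamp, and all values are made up.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Per-entity feature history: (timestamp the value became known, value),
# assumed sorted by timestamp.
FeatureHistory = Dict[int, List[Tuple[datetime, int]]]


def value_as_of(history: FeatureHistory, entity_id: int,
                event_time: datetime) -> Optional[int]:
    """Return the latest value recorded at or before event_time, else None."""
    records = history.get(entity_id, [])
    timestamps = [ts for ts, _ in records]
    # Index of the first record strictly after event_time.
    idx = bisect_right(timestamps, event_time)
    return records[idx - 1][1] if idx > 0 else None


# Made-up history of user_clicks_last7d for two users.
history: FeatureHistory = {
    111: [
        (datetime(2020, 1, 25), 1),
        (datetime(2020, 2, 1), 2),
        (datetime(2020, 2, 5), 9),  # later than the event below: must not leak
    ],
    222: [(datetime(2020, 2, 3), 13)],
}

# Events of interest: (entity ID, event time).
events = [
    (111, datetime(2020, 2, 1, 12, 0)),
    (222, datetime(2020, 2, 2)),
]

training_rows = [(uid, value_as_of(history, uid, ts)) for uid, ts in events]
```

User 111 gets the value from 2020-02-01 rather than the later one from 2020-02-05, and user 222 gets no value because nothing was known yet. Using the latest value regardless of event time would be the data leakage listed on slide 15.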
  18. Challenge 3: Deploying to production and moving from batch to real-time. Common challenges when deploying to production: throwing it over the wall for reimplementation in the production environment; infrastructure provisioning; freshness vs. cost; train-serve skew; drift and data quality monitoring.
  19. Challenge 3: Today, moving to production requires reimplementation. Diagram: Raw Data → Training Implementation → Training Features → Training.
  20. Challenge 3: Today, moving to production requires reimplementation. Diagram: Raw Data → Training Implementation → Training Features → Training; Production Data → Serving Implementation → Serving Features → Prediction?
  21. Challenge 3: Duplicate transform implementations are error-prone. Diagram: Raw Data → Training Implementation → Training Features → Training; Production Data → Serving Implementation → Serving Features → Prediction.
  22. Challenge 3: Duplicate transform implementations lead to train-serve skew. Differences in implementations or data across training and serving can break models. Diagram (with markers A and B): Raw Data → Training Implementation → Training Features → Training; Production Data → Serving Implementation → Serving Features → Prediction.
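
A small, hypothetical example of the train-serve skew on slide 22: two teams implement the "same" 7-day click count, but one includes the window boundary and the other does not. The function names and data are invented for illustration and do not come from the talk.

```python
from datetime import datetime, timedelta
from typing import List

WINDOW = timedelta(days=7)


def clicks_last7d_training(click_times: List[datetime], as_of: datetime) -> int:
    """Batch version: window start is inclusive."""
    start = as_of - WINDOW
    return sum(1 for t in click_times if start <= t <= as_of)


def clicks_last7d_serving(click_times: List[datetime], as_of: datetime) -> int:
    """Reimplemented online version: window start is exclusive."""
    start = as_of - WINDOW
    return sum(1 for t in click_times if start < t <= as_of)


as_of = datetime(2020, 2, 8)
click_times = [
    datetime(2020, 2, 1),  # exactly 7 days before as_of: the boundary case
    datetime(2020, 2, 3),
    datetime(2020, 2, 7),
]

training_value = clicks_last7d_training(click_times, as_of)
serving_value = clicks_last7d_serving(click_times, as_of)
skew_detected = training_value != serving_value
```

The two functions agree on most inputs and disagree only on the boundary click, which is why this kind of divergence tends to survive casual testing and show up later as degraded model quality.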
  23. Solution: Unified train/predict pipelines ensure online/offline consistency. Diagram: Raw Data and Production Data → Training + Serving Implementation → Training Features → Training, and Serving Features → Prediction.
  24. Solution: Tecton delivers those features “online” for real-time predictions. Serving Features (for Prediction): single row; delivered in milliseconds; REST API. Training Features (for Training): millions of rows; Dataframe API. Diagram: Virtual Data Source → Training + Serving Implementation → Training Features and Serving Features.
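
The unified pipeline on slides 23 and 24 can be sketched as one transform reused by a batch path and a single-row path. This is a plain-Python illustration, not Tecton's Dataframe or REST API: `offline_features` and `online_features` are hypothetical names, and the click log is made up.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

WINDOW = timedelta(days=7)
ClickLog = Dict[int, List[datetime]]


def user_clicks_last7d(click_times: List[datetime], as_of: datetime) -> int:
    """The single shared definition of the feature."""
    start = as_of - WINDOW
    return sum(1 for t in click_times if start < t <= as_of)


def offline_features(events: List[Tuple[int, datetime]],
                     clicks: ClickLog) -> List[Dict[str, object]]:
    """Batch path: one row per (user_id, event time) for training."""
    return [
        {
            "user_id": uid,
            "timestamp": ts,
            "user_clicks_last7d": user_clicks_last7d(clicks.get(uid, []), ts),
        }
        for uid, ts in events
    ]


def online_features(user_id: int, clicks: ClickLog,
                    now: datetime) -> Dict[str, object]:
    """Online path: a single row for one prediction request."""
    return {
        "user_id": user_id,
        "user_clicks_last7d": user_clicks_last7d(clicks.get(user_id, []), now),
    }


# Made-up click log.
clicks: ClickLog = {
    1001: [datetime(2020, 1, 30), datetime(2020, 2, 1)],
    1002: [datetime(2020, 1, 1)],
}
events = [(1001, datetime(2020, 2, 2)), (1002, datetime(2020, 2, 2))]

training_rows = offline_features(events, clicks)
online_row = online_features(1001, clicks, datetime(2020, 2, 2))
```

Since both paths call `user_clicks_last7d`, a training row and an online request for the same user at the same time cannot disagree about the feature value.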
  25. End-to-End Feature Lifecycle Management
  26. How Atlassian Uses Tecton to Get ML Into Production. Geoff Sims, Principal Data Scientist.
  27. Example: Automated Content Categorization in Jira. Solution: Automated labeling is needed for every Jira issue.
  31. Thank you
