LEAN AI: SIX STEPS FOR IMPLEMENTING
AI IN THE ENTERPRISE
DR. SOURAV DEY | @resdntalien
DATAWORKS SUMMIT
JUNE 20, 2018
© 2018 Manifold. All rights reserved. | 2
Manifold is an engineering services firm that
accelerates AI development for Global 500
and high-growth companies.
ABOUT US
THE POINT OF THIS TALK
•Share our Lean AI mental model for applied AI
•Make it real using some case studies from our work
LEAN AI PLAYBOOK
#TackleBigRisksEarly
UNDERSTAND
AI VALUE ≤ BUSINESS VALUE × DATA QUALITY × PREDICTIVE SIGNAL
AI UNCERTAINTY PRINCIPLE
Multiplicative! If any term goes to 0, value goes to 0!
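A minimal numeric sketch of the multiplicative bound. It assumes each term is scored on a 0-to-1 scale; that scale, the function name `ai_value_upper_bound`, and the example scores are illustrative assumptions, not part of the talk.

```python
def ai_value_upper_bound(business_value, data_quality, predictive_signal):
    """Upper bound on AI value as the product of three scores in [0, 1]."""
    for score in (business_value, data_quality, predictive_signal):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
    return business_value * data_quality * predictive_signal

# Hypothetical scores: a strong case, and the same case with no predictive signal.
strong_case = ai_value_upper_bound(0.9, 0.8, 0.7)
no_signal = ai_value_upper_bound(0.9, 0.8, 0.0)
```

Because the terms multiply rather than add, a high business value cannot compensate for a missing signal: `no_signal` is zero regardless of the other two scores.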
BATTLE UNCERTAINTY WITH AN AI SPEC
Business Problem Workshop:
• What is the ROI?
• If this could be predicted,
how would it help your
company?
• Etc.
Data Audit:
• Where are your data
sources?
• How much data?
• How rare is the event?
• Is the data labelled well?
• Is it joinable?
• Is it trustworthy?
• Etc.
CASE STUDY #1
“We want to use AI to be more customer centric. 80% of revenue
comes from 20% of customers. How can we identify our most loyal
customers and deliver more value to them?”
LEADING BABY REGISTRY IN THE U.S.
CASE STUDY #1: AI SPECIFICATION
• We will create models to predict
whether a customer will adopt
and activate, and their $LTV
after 9 months
• We will generate this prediction 1
day after signup, 2 days, 7 days,
30 days, N days
• We will use the transactional DB
as the only data source; there
seems to be enough data in there
to create meaningful features
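One way to read "prediction N days after signup" is that each model may only see events recorded up to that cutoff. The sketch below illustrates this under assumptions: the event types, dates, and the helper `features_at` are hypothetical and do not come from the case study.

```python
from datetime import date, timedelta

def features_at(signup_date, events, days_after_signup):
    """Count events per type using only what was known N days after signup."""
    cutoff = signup_date + timedelta(days=days_after_signup)
    counts = {}
    for event_date, event_type in events:
        if signup_date <= event_date <= cutoff:
            counts[event_type] = counts.get(event_type, 0) + 1
    return counts

# Hypothetical transactional events for one customer.
signup = date(2018, 1, 1)
events = [
    (date(2018, 1, 1), "registry_created"),
    (date(2018, 1, 3), "item_added"),
    (date(2018, 1, 20), "item_added"),
]

# One feature snapshot per prediction point named in the spec.
snapshots = {n: features_at(signup, events, n) for n in (1, 2, 7, 30)}
```

Building features from a dated cutoff keeps later activity from leaking into an early prediction.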
LEADING BABY REGISTRY IN THE U.S.
CASE STUDY #2
“We want to use AI to be more efficient across our operations.
The vision is to create a system for making better decisions.”
LEADING INDUSTRIAL SERVICES COMPANY
CASE STUDY #2: AI SPECIFICATION
Lookback = 2 days
Horizon = 5 days
LEADING INDUSTRIAL SERVICES COMPANY
• Predict major faults where machine is continuously
down for >2 hours.
• Major faults almost always lead to customer calls,
truck rolls, and downtime.
• Predict whether a major fault will happen over a horizon
of 1, 2, …, 5 days.
• Use machine-generated data as input features, e.g.
~30 continuous time series, ~20 discrete time series.
• Use demographic data about machines, e.g. unit type,
location, etc.
• Do not use human-generated service data because of
data quality issues.
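The lookback and horizon in this spec can be sketched as a windowing step over one machine's daily history. This is a simplified illustration, not the pipeline used in the case study: it assumes a single daily reading per machine and a daily 0/1 major-fault flag, and `build_examples` is a hypothetical helper.

```python
def build_examples(daily_readings, daily_major_fault, lookback=2, horizon=5):
    """Return (features, label) pairs for one machine.

    features: readings from the last `lookback` days, up to and including day t
    label: 1 if any major fault occurs in days t+1 .. t+horizon, else 0
    """
    examples = []
    n_days = len(daily_readings)
    # Stop early enough that the full horizon is observed for every example.
    for t in range(lookback - 1, n_days - horizon):
        window = daily_readings[t - lookback + 1 : t + 1]
        future = daily_major_fault[t + 1 : t + 1 + horizon]
        examples.append((window, int(any(future))))
    return examples

# Hypothetical 10-day history with one major fault on day index 7.
readings = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
faults = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
examples = build_examples(readings, faults)
```

Calling the helper with `horizon=1` through `horizon=5` would give one labelled dataset per prediction horizon listed above.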
MODEL
MODELLING IS JUST THE TIP OF THE PYRAMID
Source: Monica Rogati
BUILD A BASELINE MODEL, NO EXCEPTIONS
It’s all about learning!
Then iterate, iterate, iterate.
• classification > regression
• class errors are easier to understand and learn from
• even for continuous targets, you may want to do a binary (or multiclass)
classifier before regression
• random forest > gradient boosted trees > deep learning
• few parameters to tune, robust to overfitting, quick to train
• interpretable feature importance to learn from
• pick a few features to start, then create more features
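The "classification before regression" and "pick a few features to start" points can be sketched with the standard library alone. Note that this is a deliberately simpler stand-in for the random forest recommended above: it binarizes a continuous target and uses the positive rate per category as the baseline score. The feature values and helper names are hypothetical.

```python
from collections import defaultdict

def binarize(values, threshold=0.0):
    """Turn a continuous target into a binary one: 1 if value > threshold."""
    return [1 if v > threshold else 0 for v in values]

def fit_rate_baseline(categories, labels):
    """Learn the positive rate per category, plus the overall rate as a fallback."""
    counts = defaultdict(lambda: [0, 0])  # category -> [positives, total]
    for cat, y in zip(categories, labels):
        counts[cat][0] += y
        counts[cat][1] += 1
    overall = sum(labels) / len(labels)
    rates = {cat: pos / total for cat, (pos, total) in counts.items()}
    return rates, overall

def predict_proba(rates, overall, category):
    """Score a row with its category's rate, or the overall rate if unseen."""
    return rates.get(category, overall)

# Hypothetical data: one categorical feature and a continuous target.
platform = ["ios", "ios", "web", "web", "android", "ios"]
ltv = [12.5, 0.0, 0.0, 0.0, 30.0, 8.0]
labels = binarize(ltv)
rates, overall = fit_rate_baseline(platform, labels)
```

A baseline this small gives a score to beat and makes class errors easy to inspect before moving on to tree models.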
EVALUATE TO LEARN
• Aggregate Metrics
• Cross-Validated ROC and AUC = your
score to improve by iterative modelling
• feature importance done properly
• Individual Metrics
• prediction probability distribution
• “Four corners and the middle analysis”
• most accurate negatives
• most accurate positives
• least accurate negatives
• least accurate positives
• least certain estimates
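Both kinds of metric can be sketched in a few lines. The code below computes AUC on a single set of predictions (cross-validation is omitted for brevity) and then selects the rows for a "four corners and the middle" review. The function names and the toy labels and scores are assumptions made for illustration.

```python
def auc(labels, scores):
    """ROC AUC: probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def four_corners_and_middle(labels, scores, k=1):
    """Return the k row indices in each group worth inspecting by hand."""
    idx = range(len(labels))
    pos = [i for i in idx if labels[i] == 1]
    neg = [i for i in idx if labels[i] == 0]
    return {
        "most_accurate_positives": sorted(pos, key=lambda i: -scores[i])[:k],
        "least_accurate_positives": sorted(pos, key=lambda i: scores[i])[:k],
        "most_accurate_negatives": sorted(neg, key=lambda i: scores[i])[:k],
        "least_accurate_negatives": sorted(neg, key=lambda i: -scores[i])[:k],
        "least_certain": sorted(idx, key=lambda i: abs(scores[i] - 0.5))[:k],
    }

# Hypothetical true labels and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.95, 0.60, 0.20, 0.05, 0.45, 0.80]
score = auc(labels, scores)
groups = four_corners_and_middle(labels, scores)
```

The aggregate score says whether an iteration helped; the selected rows show where the model is confidently wrong, which is usually where the next feature idea comes from.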
CASE STUDY #1: MODEL ITERATION
NO NEED FOR HYPER OPTIMIZATION
Predict LTV > 0 at 7 days after signup using only the 2 easiest
features to engineer, “platform used to sign up” and “referrer”: AUC = 0.65
Predict LTV > 0 at 7 days after signup using only the 11 most
important features, which include early activity features: AUC = 0.90
CASE STUDY #2: MODEL ITERATION
DIMINISHING RETURNS
[Diagram: Feature Matrix; Feature Engineering; Deep Learning (CNNs); Tree Methods (RF and GBT); Mixed Effects Models (MERF); labels #1, #2, #3]
USER FEEDBACK
GET USER FEEDBACK ASAP
• Multiple structured sessions with
final end users
• Use prototype tooling — e.g.,
nothing, Excel, Jupyter notebooks
• Observe their workflow and how
they integrate predictions
TRUST NOBODY, ESPECIALLY MODELS
• Does the aggregate predicted
failure rate for a daily cohort
match the historical average I’m
familiar with?
• If sensor A goes above X psi, the
likelihood of failure goes up. What
does the model say?
• Sensitivity analysis can show that
modelling is not magic; it’s
heuristics you know codified into a
mathematical model.
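The two sanity checks above can be sketched as small helpers. Everything here is illustrative: `toy_model` is a hypothetical stand-in for a trained model, the 100 psi threshold is made up, and the tolerance is an arbitrary choice.

```python
def cohort_rate_check(predicted_probs, historical_rate, tolerance=0.05):
    """Compare a cohort's mean predicted failure probability with a known historical rate."""
    predicted_rate = sum(predicted_probs) / len(predicted_probs)
    return predicted_rate, abs(predicted_rate - historical_rate) <= tolerance

def sensitivity(model, base_input, feature, values):
    """Vary one feature while holding the others fixed and record the model output."""
    return [model({**base_input, feature: v}) for v in values]

def toy_model(x):
    # Hypothetical stand-in for a trained model: risk jumps above 100 psi.
    return 0.1 if x["sensor_a_psi"] <= 100 else 0.6

# Check 1: does the daily cohort's aggregate rate match what users expect?
rate, ok = cohort_rate_check([0.1, 0.2, 0.1, 0.4], historical_rate=0.2)

# Check 2: does the prediction move the way a domain expert says it should?
curve = sensitivity(
    toy_model, {"sensor_a_psi": 80, "unit_type": "A"}, "sensor_a_psi", [80, 100, 120]
)
```

Showing users a curve like `curve` next to the heuristic they already trust is one concrete way for a model to earn that trust.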
MODELS HAVE TO EARN TRUST
DELIVER SOLUTIONS, NOT MODELS
• The raw predictions almost
always need post processing
before they are useful.
• It is our job as AI engineers to
create workflow tools or APIs that
help users derive value from the
AI.
• BUILD THE UI FOR THE AI
CASE STUDY #1: PRODUCT CHANGE DECISION
Problem
I want to change the product, so that if a user takes certain key actions (that you helped
me identify), they will get a free box full of goodies. I want to run this promotion for a few
weeks and use your model to determine if this is a promotion worth running long term.
I.e., is the cost of the box worth it? Will the higher CAC be offset by higher LTV?
Solution
• Temporal AB test with predicted LTV
• Need to use a model without the features that are incentivized, i.e. need to remove
confounding features.
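The talk does not spell out the calculation, so the following is only a sketch of how such a comparison might be set up. It assumes predicted LTV has already been scored for a pre-promotion cohort and a promotion cohort by a model that excludes the incentivized features. The redemption rate, box cost, and feature names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def drop_incentivized(features, incentivized):
    """Remove features the promotion directly rewards before scoring either cohort."""
    return {k: v for k, v in features.items() if k not in incentivized}

def promotion_net_lift(control_pred_ltv, promo_pred_ltv, box_cost, redemption_rate):
    """Per-user change in predicted LTV minus the expected per-user cost of the box."""
    ltv_lift = mean(promo_pred_ltv) - mean(control_pred_ltv)
    added_cac = box_cost * redemption_rate
    return ltv_lift - added_cac

# Hypothetical row: the promotion rewards adding items, so that feature is dropped.
row = drop_incentivized({"referrer": "blog", "added_10_items": 1}, {"added_10_items"})

# Hypothetical predicted LTV for users who signed up before and during the promotion.
net = promotion_net_lift(
    control_pred_ltv=[40, 50, 60],
    promo_pred_ltv=[55, 65, 75],
    box_cost=20,
    redemption_rate=0.5,
)
```

A positive `net` suggests the added acquisition cost is offset by higher predicted LTV. Because the two cohorts are separated in time rather than randomized, seasonality remains a possible confounder.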
CASE STUDY #2: DIRECTED TRIAGE
Problem
Most units with a high probability of fault are already known to be stressed, so just looking at raw
predictions leads to many false alarms and erosion of trust.
Also, the human can spend lots of time looking at data and may not be able to see what
the AI sees. We want triage to be directed.
Solution
• Rules on historical predictions to find “interesting events”, e.g. day on day % prob
change
• Explainable AI using TreeSHAP that identifies which factors are driving the
increased probability of failure
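A rule of the "day-on-day % change" kind could look like the sketch below. The thresholds and the helper name `interesting_events` are assumptions chosen for illustration. The TreeSHAP step is not shown.

```python
def interesting_events(daily_probs, min_pct_change=50.0, min_prob=0.2):
    """Flag days where predicted fault probability jumps sharply versus the previous day."""
    events = []
    for day in range(1, len(daily_probs)):
        prev, curr = daily_probs[day - 1], daily_probs[day]
        if prev <= 0:
            continue  # percent change is undefined from a zero baseline
        pct_change = 100.0 * (curr - prev) / prev
        if pct_change >= min_pct_change and curr >= min_prob:
            events.append((day, round(pct_change, 1)))
    return events

# Hypothetical prediction histories for two units.
stressed_unit = [0.70, 0.72, 0.71, 0.73]  # always high, nothing new to triage
changing_unit = [0.10, 0.11, 0.30, 0.32]  # sharp rise on day index 2

stressed_events = interesting_events(stressed_unit)
changing_events = interesting_events(changing_unit)
```

The known stressed unit produces no alert despite its high raw probability, while the unit whose risk changed is surfaced for review.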
LOTS MORE WE CAN’T COVER TODAY
Don’t be a pirate, be the Navy.
Embed high cardinality
categorical variables!
Use Docker, damnit.
THANK YOU
Dr. Sourav Dey | sdey@manifold.ai | @resdntalien

