Train & Sustain: Why data leaders
need to pay attention to HITL
May 9, 2023
©2023 CloudFactory Confidential
Hi, I’m Matt
Originally from Cambridge, Matt now
helps clients move to a data-centric ML
approach as a Senior Solutions
Consultant at CloudFactory. He’s worked
with clients across autonomous vehicles,
green energy and fintech whilst
providing meaningful work in the
developing world.
About CloudFactory
CloudFactory is a global leader in combining people
and technology to support the AI development
lifecycle, from data curation and annotation, to
quality assurance and model optimization. Our
human-in-the-loop AI solutions are trusted by AI
leaders at 700+ companies globally.
40M+ project hours delivered
7,000+ data analysts
64 Client NPS score
$78M in growth funding
Deep Learning Needs
Humans-in-the-Loop
Deep Learning and ILSVRC 2012
In 2012, computer vision pioneer Geoffrey Hinton and his team won the
important ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012)
with deep learning.
Conventional wisdom since then was that more
layers (and more data) were better
[Figure: network depth, from shallow to 152 layers. Source: ResearchGate]
“One thing ImageNet changed in the field of AI is suddenly
people realized the thankless work of making a dataset
was at the core of AI research.”
Fei-Fei Li
Professor, Computer Science
Co-Director, Stanford Institute for Human-Centered AI (HAI)
Stanford University
But at a certain point, that approach stopped being as
effective and attention turned to the data instead
2017: “20% of activities are automatable by AI advances in areas like visual object recognition.” - McKinsey
2019: “87% of applied ML projects never make it to production” - Gartner
2021: “A small group of teams emerges succeeding with applied ML due to a data-centric approach.” - Andrew Ng
When you focus on having humans label the right things,
you can dramatically increase model performance
Active learning used
to prioritize workflow
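
One common way to "label the right things" is uncertainty sampling, a simple form of active learning. The sketch below is an illustration only, not CloudFactory's implementation: it ranks unlabeled images by the entropy of a model's predicted class probabilities and sends the most uncertain ones to human labelers first. The item ids, the probabilities, and the `select_for_labeling` helper are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` items the model is least sure about.

    `predictions` maps a (hypothetical) item id to the model's predicted
    class probabilities. Higher entropy means more uncertainty.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Made-up model outputs for four unlabeled images and three classes.
predictions = {
    "img_001": [0.98, 0.01, 0.01],
    "img_002": [0.40, 0.35, 0.25],
    "img_003": [0.55, 0.44, 0.01],
    "img_004": [0.90, 0.05, 0.05],
}

queue = select_for_labeling(predictions, budget=2)
print(queue)  # the two most ambiguous images go to human labelers first
```

Entropy is only one possible uncertainty score; margin or least-confidence scores would slot into the same ranking step.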
Deep Learning Isn’t Just About
Training Data
Historically, companies have used HITL primarily for
labeling training data
TRAIN: Label Data, QA Data, Train Model
SUSTAIN: Evaluate Model, Deploy Model, Monitor Model
Though automation can speed the initial labeling process,
it can’t do everything without human help
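
A minimal sketch of how automation and human help can be combined, assuming a model that returns a confidence score with each pre-label: confident pre-labels are accepted automatically and the rest are queued for a person. The `route_predictions` helper, the 0.9 threshold, and the sample data are assumptions for illustration, not part of the original deck.

```python
def route_predictions(predictions, threshold=0.9):
    """Split model pre-labels into auto-accepted and human-review queues.

    `predictions` is a list of (item_id, label, confidence) tuples.
    The threshold is an assumed value; in practice it would be tuned
    against a human-verified sample.
    """
    auto_accepted, needs_review = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_accepted.append((item_id, label))
        else:
            needs_review.append((item_id, label))
    return auto_accepted, needs_review


# Made-up pre-labels from an automated annotation model.
pre_labels = [
    ("img_001", "car", 0.97),
    ("img_002", "truck", 0.62),
    ("img_003", "car", 0.91),
    ("img_004", "cyclist", 0.48),
]

auto_accepted, needs_review = route_predictions(pre_labels, threshold=0.9)
print("auto-accepted:", auto_accepted)
print("sent to humans:", needs_review)
```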
You still need both humans and automation to get
AI to production reliably
Input data*: images via API
Level 1: Out-of-the-Box
Annotation with out-of-the-box assistants
Level 2: Custom Trained
Annotation with adaptive, custom assistants
Level 3: Automated
Annotation with custom trained model
Active learning ensures only useful images are labeled
Output data: images + labels via API
The future is to use humans for both train and sustain
TRAIN: Label Data, QA Data, Train Model
SUSTAIN: Evaluate Model, Deploy Model, Monitor Model
Getting your AI to production reliably
Input data*: images via API
Level 1: Out-of-the-Box
Annotation with out-of-the-box assistants
Level 2: Custom Trained
Annotation with adaptive, custom assistants
Level 3: Automated
Annotation with custom trained model
Active learning ensures only useful images are labeled
AI Consensus scoring: Review to ensure quality and resolve ambiguity
Data insight report: Capture insights and progress for review
Output data: images + labels via API
Example labels: cabbage, potato, grain
AI Consensus Scoring
Using Bayesian Neural Networks and Confident Learning to check every label at least twice.
Edge cases are reviewed by a human. The results generate quality reports.
Our AI checks the data for a variety of error types:
Wrong class
Low IoU
Artefacts
Missing labels
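
Two of these error types can be illustrated with a short sketch. It is a simplified stand-in, not CloudFactory's AI Consensus Scoring: `find_suspect_labels` applies the per-class threshold idea from confident learning to flag likely wrong-class labels, and `iou` computes intersection over union for two axis-aligned boxes. The function names, the toy data, and the 0.5 IoU cut-off are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def class_thresholds(given_labels, pred_probs, num_classes):
    """Per-class threshold: mean predicted probability of class k over
    the examples whose given label is k."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for label, probs in zip(given_labels, pred_probs):
        sums[label] += probs[label]
        counts[label] += 1
    return [sums[k] / counts[k] if counts[k] else 1.0 for k in range(num_classes)]


def find_suspect_labels(given_labels, pred_probs, num_classes):
    """Flag examples the model confidently assigns to a class other than
    the given label (candidates for human review, not proven errors)."""
    thresholds = class_thresholds(given_labels, pred_probs, num_classes)
    suspects = []
    for index, (label, probs) in enumerate(zip(given_labels, pred_probs)):
        confident = [k for k in range(num_classes) if probs[k] >= thresholds[k]]
        if confident and max(confident, key=lambda k: probs[k]) != label:
            suspects.append(index)
    return suspects


# Toy data: class 0 = "dog", class 1 = "cat". Example 2 is labeled "dog"
# but the model is confident it is a "cat".
given_labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
    [0.2, 0.8], [0.1, 0.9], [0.3, 0.7],
]
suspects = find_suspect_labels(given_labels, pred_probs, num_classes=2)
print("possible wrong class at indices:", suspects)

# Low IoU check: compare an annotated box with a reference box.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
low_iou = overlap < 0.5  # assumed cut-off
print("IoU:", round(overlap, 3), "low IoU:", low_iou)
```

In a pipeline like the one described, the flagged indices and low-overlap boxes would be the edge cases routed to a human reviewer.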
Inference Validation Example
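
As a rough illustration of what inference validation can involve, the sketch below compares a sample of production predictions against human-verified labels and flags the model for attention when agreement drops. The helper names, the 0.9 agreement target, and the sample reviews are assumptions, not details from the deck.

```python
def agreement_rate(reviews):
    """Fraction of reviewed predictions where the human agreed with the model.

    `reviews` is a list of (model_label, human_label) pairs.
    Returns None when there is nothing to review.
    """
    if not reviews:
        return None
    matches = sum(1 for model_label, human_label in reviews if model_label == human_label)
    return matches / len(reviews)


def needs_attention(reviews, min_agreement=0.9):
    """True when human reviewers disagree with the model more often than
    the assumed target allows."""
    rate = agreement_rate(reviews)
    return rate is not None and rate < min_agreement


# Made-up sample of production predictions checked by human reviewers.
reviews = [
    ("grain", "grain"),
    ("potato", "potato"),
    ("cabbage", "potato"),
    ("grain", "grain"),
]

rate = agreement_rate(reviews)
flagged = needs_attention(reviews, min_agreement=0.9)
print("agreement:", rate, "needs attention:", flagged)
```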
Finding the right HITL expertise at
the right time
Criteria for selecting the right HITL
The decision to add humans-in-the-loop depends on your place in the AI lifecycle: incubation,
scale training, or production. When you’re ready, here are four things to consider:
#1 Use case experience
#2 Flexibility (new use cases, speed, etc.)
#3 Workforce management technology
#4 Ethics and risk management
Questions?
