The Uncanny Valley of ML
Dr June Andrews Delphi Data Nov 2019
Human Decision Systems
Simple Paradigm Represents:
• Judges setting bail
• Doctors processing images
• DMV clerks renewing licenses
• Muni train drivers stopping & going
• Administrators admitting students
• … you coming to MLConf
Information In, Decision Out, Works Pretty Well
Hype: Technology Will Replace People Overnight
???
Far More Likely Progression of Technology
Machine Learning Will Augment Decision Making with Recommendations
Template Design for ML + Decision Systems
Recommendation: Yes
Feature A is high
Keep the Chain of Responsibility Intact
Integrate as soon as ML Accuracy ≈ Human Accuracy
Pressures for Introducing ML into Decision Systems
• Decreasing cost of ML
• Increasing data from revolutions in sensors, records & infrastructure
• Fewer experts graduating in ‘older fields’
• Increasing number of decisions created by more people
Ideal Result of ML + Human Decisions
As ML Accuracy Approaches Human Accuracy, System Performance Improves
[Figure: Accuracy and Time & Cost (low to high) vs. ML Accuracy (terrible, human, perfect)]
The Uncanny Valley of ML
As ML Accuracy Approaches Human Accuracy, System Performance Degrades
[Figure: Accuracy and Time & Cost (low to high) vs. ML Accuracy (terrible, human, perfect)]
Finding the Uncanny Valley of ML
Finding An Uncanny Valley of ML
Use Test Environments To Avoid The Uncanny Valley in Production
1. Create a simple labeling task with ground truth labels
2. Measure human accuracy & speed
3.
1. Add a recommended decision from a ‘Model’
2. Simulate models of different accuracy near human accuracy by perturbing the ground truth labels
3. Assign each person to a simulated model and run test labels for normalization
4. Measure system accuracy & speed as a function of ML accuracy
5.
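The "simulate models by perturbing the ground truth labels" step can be sketched as below. This is a minimal illustration, not the code used for the talk: the function names, the off-by-one error style, and the uniform choice of which items to corrupt are all assumptions.

```python
import random


def simulate_model_labels(ground_truth, target_accuracy, rng):
    """Return labels that match ground_truth on round(target_accuracy * n) items.

    Assumption: a wrong label is the true count plus or minus one
    (never below zero), and the items to corrupt are chosen uniformly.
    """
    n = len(ground_truth)
    n_wrong = n - round(target_accuracy * n)
    wrong_idx = set(rng.sample(range(n), n_wrong))
    labels = []
    for i, truth in enumerate(ground_truth):
        if i in wrong_idx:
            # A true count of zero can only be perturbed upward.
            step = rng.choice([-1, 1]) if truth > 0 else 1
            labels.append(truth + step)
        else:
            labels.append(truth)
    return labels


def accuracy(labels, ground_truth):
    """Fraction of labels that agree with the ground truth."""
    return sum(a == b for a, b in zip(labels, ground_truth)) / len(ground_truth)
```

Calling `simulate_model_labels(truth, 0.90, random.Random(0))` on 100 items yields a simulated 'Model' that is wrong on exactly 10 of them, so models just below and just above human accuracy can be generated from the same ground truth.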
Easy Street - How many coffee mugs do you see?
Throwback to the first demos of Neural Nets for Computer Vision @ Cornell
100+ photos hand labeled with the number of coffee mugs for ground truth
Label quality is then perturbed to simulate different ML accuracies, with a bias to perturbing the images with many mugs
…used Amazon Mechanical Turk workers
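The "bias to perturbing the images with many mugs" could be implemented as weighted sampling without replacement. The sketch below is an assumption-laden illustration: the weight `1 + mug count`, the function names, and the "add one mug" error are hypothetical choices, since the slides do not say how the bias was applied.

```python
import random


def pick_biased_indices(mug_counts, n_wrong, rng):
    """Choose n_wrong distinct photo indices, favoring photos with more mugs.

    Uses Efraimidis-Spirakis keys: key = u ** (1 / weight), keep the largest.
    Assumption: weight = 1 + mug count.
    """
    keys = [(rng.random() ** (1.0 / (1 + c)), i) for i, c in enumerate(mug_counts)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:n_wrong])


def perturb_biased(mug_counts, target_accuracy, rng):
    """Corrupt just enough labels to reach target_accuracy, biased to busy photos."""
    n = len(mug_counts)
    n_wrong = n - round(target_accuracy * n)
    wrong = set(pick_biased_indices(mug_counts, n_wrong, rng))
    # Assumption: a corrupted label over-counts by one mug.
    return [c + 1 if i in wrong else c for i, c in enumerate(mug_counts)]
```

The overall accuracy is still fixed by `target_accuracy`; only the choice of which photos carry the errors shifts toward the photos with many mugs.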
2 mugs
1 mug
Ideal Results
[Figure: Accuracy and Time & Cost (low to high) vs. ML Accuracy (terrible, human, perfect)]
Actual System Behavior
[Figure: System Accuracy (low to high) vs. ML Accuracy (terrible, human (94%), perfect); legend: 2 mugs, 1 mug]
Actual System Behavior
[Figure: Decision Time (low to high) vs. ML Accuracy (terrible, human (94%), perfect); legend: 2 mugs, 1 mug]
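System accuracy and decision time per simulated ML accuracy could be tabulated along these lines. This is a sketch under assumptions: the trial tuple layout, the function name, and the use of the median for decision time are illustrative and not taken from the talk.

```python
from collections import defaultdict
from statistics import median


def summarize_by_ml_accuracy(trials):
    """Group trials by simulated ML accuracy and summarize each group.

    trials: iterable of (ml_accuracy, human_answer, truth, seconds),
    one tuple per labeled photo (hypothetical record layout).
    """
    groups = defaultdict(list)
    for ml_acc, answer, truth, seconds in trials:
        groups[ml_acc].append((answer == truth, seconds))

    summary = {}
    for ml_acc in sorted(groups):
        rows = groups[ml_acc]
        summary[ml_acc] = {
            "n": len(rows),
            # Accuracy of the human + recommendation system, not of the model.
            "system_accuracy": sum(ok for ok, _ in rows) / len(rows),
            "median_seconds": median(s for _, s in rows),
        }
    return summary
```

Plotting `system_accuracy` and `median_seconds` against the keys of the summary gives the two curves; a dip in accuracy or a rise in time near human accuracy is the valley being looked for.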
Uncanny Valley for ML in Counting Coffee Mugs?
Do people want machines to be wrong?
We trust machines more than we trust ourselves when they are near but not over human accuracy?
We’re lazy and want to defer decision making (people varied from the ML when it was correct)
[Figure legend: 2 mugs, 1 mug]
The Uncanny Valley of ML in the Judicial System
Pressures for Introducing ML into Court Systems
• Decreasing cost of ML
• Increasing data from records & social media
• Fewer experts graduating in ‘older fields’
• Increasing number of decisions created by more people
Estonia is building a ‘Robot Judge’ to settle disputes under $8,000 - DailyMail
Broader initiative of e-government. France wants to match Estonia's level by 2022
160,000 parking tickets overturned in the UK & US with a chatbot - Guardian on DoNotPay
Risk Score Print Outs in Cleveland. Includes features like ‘how often are you bored’ - Quartz
Note - arraignment hearings are often under 5 minutes.
ML has a Growing Presence in Courts
Countries are Comparing Notes & Learning How to Use AI in the Courts
Locally ML is used in the Judicial System for Bail
In California 49 of 58 counties use a Pretrial Assessment System (yes SF is one) [courts.ca.gov]
SB 10 signed in 2018 would make it mandatory in October of 2019, but a 2020 referendum contradicting SB 10 has created a temporary pause
Just a sec, does Bail Matter?
• 20% of jail inmates in the US are awaiting trial
• Misdemeanors can take several months for trial, felonies can take years. Average wait time in the Bronx is 642 days for a non-jury trial and 827 days for a jury trial.
• Pretrial detention leads to a 13% increase in plea agreements, a 42% increase in length of sentence and a 41% increase in court fees - Stevenson, The Journal of Law, Economics, and Organization
8th Amendment (Bill of Rights)
‘Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.’
Finding An Uncanny Valley of ML
Unknown System Accuracy, Show Manipulation of a Single Label
1. Take a Real Case
2.
1. Simulate different UIs & different model deliverables
2. Compare label distribution with actual outcome
Finding An Uncanny Valley of ML
Unknown System Accuracy, Instead Show Manipulation of a Single Label
Details Taken from Machine Bias by ProPublica - 2016
Summary - high schooler stole a bike for a few blocks, had a High Risk COMPAS Score by Equivant. Bail was set at $1000
400+ Survey Participants From Amazon MTurk
[Figure: Proportion of Responses (0% to 100%) choosing $0 Bail, $1000 Bail, or No Release under each condition: No ML; ML - Low Risk; ML - Medium Risk; ML - High Risk; ML - High Risk Positive Support; ML - High Risk Negative Support]
Power of Suggestion of High Risk ML (no reason) results in +14% in bail denied
Power of Suggestion of Low Risk ML (no reason) results in +14% in $0 bail
High Risk ML with Negative Features results in +40% in denying bail
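Shifts like the ones above can be computed by comparing each condition's response distribution against the No ML baseline. The sketch below is illustrative only: the function names are hypothetical, and it reports percentage-point differences, which is an assumption because the slides do not say whether "+14%" is absolute or relative.

```python
from collections import Counter

# The three answer options shown to survey participants.
OPTIONS = ("$0 Bail", "$1000 Bail", "No Release")


def proportions(responses):
    """Share of responses per option, with missing options counted as zero."""
    counts = Counter(responses)
    total = len(responses)
    return {opt: counts[opt] / total for opt in OPTIONS}


def shift_vs_baseline(baseline, condition):
    """Percentage-point change per option relative to the baseline group."""
    base = proportions(baseline)
    cond = proportions(condition)
    return {opt: round(100 * (cond[opt] - base[opt]), 1) for opt in OPTIONS}
```

Here `baseline` would hold the answers from the No ML group and `condition` the answers from one ML variant, each as a list of option strings.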
104 Survey Participants From June’s Network
[Figure: Proportion of Responses (0% to 100%) choosing $0 Bail, $1000 Bail, or No Release under each condition: No ML; ML - Positive Support; ML - Negative Support]
While an overall more forgiving group, still a +40% increase in denying bail
A Higher Level of Machine Learning Knowledge Does NOT Change the Trend
Fundamental Design Flaw in SB 10 & COMPAS Scores
• Need to allow for the ML system to return ‘Uncertain, not enough data’
• The Bureau of Justice Statistics has a warning of ‘Interpret data with caution. Estimate based on 10 or fewer sample cases’ for someone with Brisha’s details
• … also, effectiveness of requiring ML in the California courts is not slated to be measured until 2023, 4 years after release
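An ML system that is allowed to return ‘Uncertain, not enough data’ might look like the sketch below. Everything here is hypothetical: the `Recommendation` type, the 0.5 score threshold, and the use of the "10 or fewer sample cases" wording as the abstention cutoff are assumptions for illustration, not how COMPAS or any deployed tool works.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: abstain when the estimate rests on 10 or fewer comparable cases.
MIN_SUPPORT = 11


@dataclass
class Recommendation:
    label: str              # "Low Risk", "High Risk", or the abstention text
    score: Optional[float]  # None when the system abstains
    support: int            # number of comparable cases behind the score


def recommend(score, support, threshold=0.5, min_support=MIN_SUPPORT):
    """Return a risk label, or abstain when there is too little data."""
    if support < min_support:
        return Recommendation("Uncertain, not enough data", None, support)
    label = "High Risk" if score >= threshold else "Low Risk"
    return Recommendation(label, score, support)
```

With this shape, `recommend(0.8, support=10)` abstains instead of printing a High Risk score, which leaves the decision with the person who holds responsibility for it.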
Aside: The Uncanny Valley of ML - Why does it exist?
Uncanny Valley of AI
Discovered by Masahiro Mori (1970)
Box office success of movies is potentially related to the Uncanny Valley:
• Final Fantasy
• Polar Express
• Beowulf
• The Incredible Hulk
Uncanny Valley of AI
Why it exists is an open field of research:
• Mismatch between expectations and observations [Tinwell]
• Difficult to classify objects that move between the boundaries of categories [Looser & Wheatley]
• Recognizing a similar cognitive [Gray & Wegner]
• Ambiguity about the presence of threat [McAndrew]
When it exists is also a debate.
2013 Activision Animation
Uncanny Valley of ML
Additional Theories to Consider:
• People excuse biased decisions on the machine
• People want machines to be wrong
• Disagreeing is a different skill set than analyzing
• Providing explanations of reasoning suppresses intuitive decisions
When it exists should also be studied further.
Thought Experiment
Yes / No
Imagine if the ML system was another person, who wasn’t quite as bright as the first person
It would take longer, the bright person would question themselves more
Small Group Communication: A Theoretical Approach has additional details of when groups underperform individuals
Lean on the field of Team Research to Bootstrap Expectations on Integration
Crossing the Chasm - Avoiding the Uncanny Valley
Self-Driving Cars - Headed for Uncanny Valley
People viewed as backups who would stay behind the wheel and intervene to avoid accidents in unpredictable or computer-confusing instants. Self-driving option should be included as soon as possible for competitive advantage.
[Diagram labels: Left, No-op, Right]
• Decreasing cost of ML
• Increasing data from revolutions in sensors & records
• Aging population
• Increasing number of drivers, commutes increasing
Best Practice: Bet on the Power of ML
[Diagram labels: Left, No-op, Right]
Volvo changed from targeting Level 2 to Levels {4, 5} after including executives in a simulation of driving a Level 2 car [Wired]
Delaying Release Until Performance Crosses the Valley
Best Practice: Build Simulators
Nuclear Power Plants, Aviation, Moon Landings … all use simulators to refine product designs before launch
**include actual judges/experts in simulations
Identify location and impact of the Valley before building
Best Practice: Avoid by Redefining Success
Repurpose - ML designed for 1 system, may work well for another
Reset expectations
Relabel Bad Labels
The Uncanny Valley Doesn’t Always Exist
First UI, people completely ignored the ML suggestion
You could design a system no one uses to avoid the Valley
[Figure: System Accuracy vs. ML Accuracy (terrible, human, perfect); legend: 2 mugs, 1 mug]
Call to Action:
Source a New Field of Research ‘ML Integration’
• HCI, Team Research, Data Science, AI, Psychology, User Research & Application Fields are all trying to understand integrating ML into Human Decision Systems … independently and slowly
• Binding efforts into a single discipline will rapidly increase development and possibly meet demand
* I am not qualified to give this talk … but who is?
Bootstrapping ML Integration
Funding Sources {Military, Accenture, Academia, …?}
Initial Areas of Research:
• How to calculate the speed and accuracy of large distributed human + ML decision systems
• How to safely train and roll out new decision processes to experts
• How to fairly explain an ML decision. Beyond explainable AI.
• Design and run experiments in these systems
[Diagram labels: Machine Learning, DS, AI, Politics, Team Research, Effective ML Integration]
The Hard Part - Does The Uncanny Valley matter?
[Figure: Accuracy and Time & Cost (low to high) vs. ML Accuracy (terrible, human, perfect)]
Delivering Results in the Uncanny Valley Can Lead to Early Project Termination
• These are critical systems where a 5% drop in accuracy undoes years of research and investments. Mistakes are not treated kindly in these fields.
• Funding is much more tightly controlled and hard to obtain after failed launches.
• Legal modifications may become barriers
Call to Action: Implement Guardrail Metrics for Vulnerable Members of the Population
• Guardrail Metrics are used to allow ML to optimize as much as it can within a specified business boundary
• Let’s define a boundary in tech to not make systems worse for black girls
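A guardrail of this kind can be expressed as a release check: a candidate system is accepted only if it is no worse than the current one for every protected group. The sketch below is a minimal illustration; the function name, the per-group accuracy dictionaries, and the `tolerance` parameter are assumptions, and the choice of metric and groups is left to the team defining the boundary.

```python
def passes_guardrails(baseline, candidate, guardrail_groups, tolerance=0.0):
    """Check that the candidate is not worse for any guardrail group.

    baseline, candidate: dict mapping group name -> metric (higher is better).
    Returns (ok, violations), where violations maps each failing group
    to its (baseline, candidate) metric pair.
    """
    violations = {
        group: (baseline[group], candidate[group])
        for group in guardrail_groups
        if candidate[group] < baseline[group] - tolerance
    }
    return len(violations) == 0, violations
```

The optimizer stays free to improve the overall metric; the check only blocks a launch that would make outcomes worse for a group named in `guardrail_groups`.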
Thank You.
Slides at /drandrews

More Related Content

Similar to June Andrews - The Uncanny Valley of ML

Tech essentials for Product managers
Tech essentials for Product managersTech essentials for Product managers
Tech essentials for Product managersNitin T Bhat
 
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)Amazon Web Services
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedKrishnaram Kenthapadi
 
The Human Side of Data By Colin Strong
The Human Side of Data By Colin StrongThe Human Side of Data By Colin Strong
The Human Side of Data By Colin StrongMarTech Conference
 
Future of data science as a profession
Future of data science as a professionFuture of data science as a profession
Future of data science as a professionJose Quesada
 
Gary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you ThinkGary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you ThinkSaratoga
 
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...Analytics India Magazine
 
Enterprise Machine Learning Governance
Enterprise Machine Learning Governance Enterprise Machine Learning Governance
Enterprise Machine Learning Governance Terence Siganakis
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsArpana Awasthi
 
Analytics in business
Analytics in businessAnalytics in business
Analytics in businessNiko Vuokko
 
The Dangers of Machine Learning
The Dangers of Machine LearningThe Dangers of Machine Learning
The Dangers of Machine LearningtothepointIT
 
Automated by Omniconvert
Automated by Omniconvert Automated by Omniconvert
Automated by Omniconvert Omniconvert
 
Deep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDeep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDatabricks
 
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...InterCon
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)Julien SIMON
 
Fraud Management_CAS_Presentation_Oct2016
Fraud Management_CAS_Presentation_Oct2016Fraud Management_CAS_Presentation_Oct2016
Fraud Management_CAS_Presentation_Oct2016Mark Jones
 
Is Bigger Data Really Better? 10 Facts from Theory and Practice
Is Bigger Data Really Better? 10 Facts from Theory and PracticeIs Bigger Data Really Better? 10 Facts from Theory and Practice
Is Bigger Data Really Better? 10 Facts from Theory and PracticeDataWorks Summit
 
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...Sri Ambati
 
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise SuccessAltimeter, a Prophet Company
 

Similar to June Andrews - The Uncanny Valley of ML (20)

Tech essentials for Product managers
Tech essentials for Product managersTech essentials for Product managers
Tech essentials for Product managers
 
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)
AWS re:Invent 2016: Getting to Ground Truth with Amazon Mechanical Turk (MAC201)
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons Learned
 
The Human Side of Data By Colin Strong
The Human Side of Data By Colin StrongThe Human Side of Data By Colin Strong
The Human Side of Data By Colin Strong
 
Future of data science as a profession
Future of data science as a professionFuture of data science as a profession
Future of data science as a profession
 
Gary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you ThinkGary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you Think
 
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno...
 
Enterprise Machine Learning Governance
Enterprise Machine Learning Governance Enterprise Machine Learning Governance
Enterprise Machine Learning Governance
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
 
Analytics in business
Analytics in businessAnalytics in business
Analytics in business
 
The Dangers of Machine Learning
The Dangers of Machine LearningThe Dangers of Machine Learning
The Dangers of Machine Learning
 
inte
inteinte
inte
 
Automated by Omniconvert
Automated by Omniconvert Automated by Omniconvert
Automated by Omniconvert
 
Deep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDeep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle Grove
 
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Fraud Management_CAS_Presentation_Oct2016
Fraud Management_CAS_Presentation_Oct2016Fraud Management_CAS_Presentation_Oct2016
Fraud Management_CAS_Presentation_Oct2016
 
Is Bigger Data Really Better? 10 Facts from Theory and Practice
Is Bigger Data Really Better? 10 Facts from Theory and PracticeIs Bigger Data Really Better? 10 Facts from Theory and Practice
Is Bigger Data Really Better? 10 Facts from Theory and Practice
 
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...
Explaining Black-Box Machine Learning Predictions - Sameer Singh, Assistant P...
 
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success
[REPORT PREVIEW] The AI Maturity Playbook: Five Pillars of Enterprise Success
 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...MLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...
Madalina Fiterau - Hybrid Machine Learning Methods for the Interpretation and...
 

June Andrews - The Uncanny Valley of ML

  • 1. The Uncanny Valley of ML Dr June Andrews Delphi Data Nov 2019
  • 2. Human Decision Systems Simple Paradigm Represents: • Judges setting bail • Doctors processing images • DMV clerks renewing licenses • Muni train drivers stopping & going • Administrators admitting students • … you coming to MLConf Information In, Decision Out, Works Pretty Well
  • 3. Hype: Technology Will Replace People Overnight
  • 4. ??? Far More Likely Progression of Technology Machine Learning will Augment Decision Making with Recommendations
  • 5. Template Design for ML + Decision Systems Recommendation: Yes Feature A is high Keep the Chain of Responsibility Intact Integrate as soon as ML Accuracy ≈ Human Accuracy
  • 6. Decreasing cost of ML Pressures for Introducing ML into Decision Systems Increasing data from revolutions in sensors, records & infrastructure Fewer experts graduating in ‘older fields’ Increasing number of decisions created by more people
  • 7. Ideal Result of ML + Human Decisions ML Accuracy Terrible Human Perfect Accuracy Time & Cost As ML Accuracy Approaches Human Accuracy, System Performance Improves Low High
  • 8. The Uncanny Valley of ML As ML Accuracy Approaches Human Accuracy, System Performance Degrades ML Accuracy Terrible Human Perfect Accuracy Time & Cost Low High
  • 10. Finding An Uncanny Valley of ML Use Test Environments To Avoid The Uncanny Valley in Production 1. Create a simple labeling task with ground truth labels 2. Measure human accuracy & speed 1. Add a recommended decision from a ‘Model’ 2. Simulate models of different accuracy near human accuracy by perturbing the ground truth labels 3. Assign each person to a simulated model and run test labels for normalization 4. Measure system accuracy & speed as a function of ML accuracy
  • 11. Easy Street - How many coffee mugs do you see? Throwback to the first demos of Neural Nets for Computer Vision @ Cornell 100+ photos hand labeled with the number of coffee mugs for ground truth Label quality is then perturbed to simulate different ML accuracies, with a bias to perturbing the images with many mugs
  • 12. 100+ photos hand labeled with the number of coffee mugs for ground truth Label quality is then perturbed to simulate different ML accuracies, with a bias to perturbing the images with many mugs …used Amazon Mechanical Turk workers Easy Street - How many coffee mugs do you see? Throwback to the first demos of Neural Nets for Computer Vision @ Cornell
  • 13. 2 mugs 1 mug Ideal Results ML Accuracy Terrible Human Perfect Accuracy Time & Cost Low High
  • 14. Actual System Behavior System Accuracy ML Accuracy Terrible Human (94%) Perfect 2 mugs 1 mug Low High
  • 15. Actual System Behavior Decision Time ML Accuracy Terrible Human (94%) Perfect 2 mugs 1 mug Low High
  • 16. Uncanny Valley for ML in Counting Coffee Mugs? Do people want machines to be wrong? We trust machines more than we trust ourselves when they are near but not over human accuracy? We’re lazy and want to defer decision making (people varied from the ML when it was correct) 2 mugs 1 mug
  • 17. The Uncanny Valley of ML in the Judicial System
  • 18. Decreasing cost of ML Pressures for Introducing ML into Court Systems Increasing data from records & social media Fewer experts graduating in ‘older fields’ Increasing number of decisions created by more people
  • 19. Estonia is building a ‘Robot Judge’ to settle disputes under $8,000 - DailyMail Broader initiative of e-government. France wants to match Estonia's level by 2022 160,000 parking tickets overturned in the UK & US with a chatbot -Guardian on DoNotPay Risk Score Print Outs in Cleveland. Includes features like ‘how often are you bored’ -Quartz Note - arraignment hearings are often under 5 minutes. ML has a Growing Presence in Courts Countries are Comparing Notes & Learning How to Use AI in the Courts
  • 20. Locally ML is used in the Judicial System for Bail In California 49 of 58 counties use a Pretrial Assessment System (yes SF is one) [courts.ca.gov] SB 10 signed in 2018 would make it mandatory in October of 2019, but a 2020 referendum contradicting SB 10 has created a temporary pause
  • 21. Just a sec, does Bail Matter? • 20% of jail inmates in US are awaiting trial • Misdemeanors can take several months for trial, felonies can take years. Average wait time in the Bronx is 642 days for a non-jury trial and 827 days for a jury trial. • Pretrial detention leads to 13% increase in plea agreements, 42% increase in length of sentence and 41% increase in court fees -Stevenson, The Journal of Law, Economics, and Organization 8th Amendment (Bill of Rights) ‘Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.’
  • 22. Finding An Uncanny Valley of ML Unknown System Accuracy, Show Manipulation of a Single Label 1. Take a Real Case 1. Simulate different UIs & different model deliverables 2. Compare label distribution with actual outcome
  • 23. Finding An Uncanny Valley of ML Unknown System Accuracy, Instead Show Manipulation of a Single Label Details Taken from Machine Bias by ProPublica -2016 Summary - high schooler stole a bike for a few blocks, had a High Risk COMPAS Score by Equivant. Bail was set at $1000
  • 24. 400+ Survey Participants From Amazon MTurk No ML ML - Low Risk ML - Medium Risk ML - High Risk ML - High Risk Positive Support ML - High Risk Negative Support Proportion of Responses 0% 33% 67% 100% $0 Bail $1000 Bail No Release Power of Suggestion of High Risk ML (no reason) results in +14% in bail denied Power of Suggestion of Low Risk ML (no reason) results in +14% in $0 bail High Risk ML with Negative Features results in +40% in denying bail
  • 25. 104 Survey Participants From June’s Network No ML ML - Positive Support ML - Negative Support Proportion of Responses 0% 33% 67% 100% $0 Bail $1000 Bail No Release While overall a more forgiving group, still a +40% increase in denying bail A Higher Level of Machine Learning Knowledge Does NOT Change the Trend
  • 26. Fundamental Design Flaw in SB10 & COMPAS Scores • Need to allow for the ML system to return ‘Uncertain, not enough data’ • The Bureau of Justice Statistics has a warning of ‘Interpret data with caution. Estimate based on 10 or fewer sample cases’ for someone with Brisha’s details • … also, effectiveness of requiring ML in the California courts is not slated to be measured until 2023, 4 years after release Aside:
  • 28. Uncanny Valley of AI Discovered by Masahiro Mori (1970) Box office success of movies is potentially related to the Uncanny Valley: • Final Fantasy • Polar Express • Beowulf • The Incredible Hulk
  • 29. Uncanny Valley of AI Why it exists is an open field of research: • Mismatch between expectations and observations [Tinwell] • Difficult to classify objects that move between the boundaries of categories [Looser & Wheatley] • Recognizing a similar cognitive [Gray & Wegner] • Ambiguity about the presence of threat [McAndrews] When it exists is also a debate. 2013 Activision Animation
  • 30. Uncanny Valley of ML Additional Theories to Consider: • People excuse biased decisions on the machine • People want machines to be wrong • Disagreeing is a different skill set than analyzing • Providing explanations of reasoning suppresses intuitive decisions When it exists should also be studied further.
  • 31. Thought Experiment Yes / No Imagine if the ML system was another person, who wasn’t quite as bright as the first person It would take longer, the bright person would question themselves more Small Group Communication: A Theoretical Approach has additional details of when groups underperform individuals Lean on the field of Team Research to Bootstrap Expectations on Integration
  • 33. Self-Driving Cars - Headed for Uncanny Valley People viewed as backups who would stay behind the wheel and intervene to avoid accidents in unpredictable or computer-confusing instances. Self-driving option should be included as soon as possible for competitive advantage. Left No-op Right Left Right Decreasing cost of ML Increasing data from revolutions in sensors & records Aging population Increasing number of drivers, commutes increasing
  • 34. Best Practice: Bet on the Power of ML Left No-op Right Left Right Volvo changed from targeting Level 2 to Levels {4, 5} after including executives in a simulation of driving a Level 2 car [Wired] Delaying Release Until Performance Crosses the Valley
  • 35. Best Practice: Build Simulators Nuclear Power Plants, Aviation, Moon Landings … all use simulators to refine product designs before launch **include actual judges/experts in simulations Identify location and impact of the Valley before building
  • 36. Best Practice: Avoid by Redefining Success Repurpose - ML designed for 1 system, may work well for another Reset expectations Relabel Bad Labels
  • 37. The Uncanny Valley Doesn’t Always Exist First UI, people completely ignored the ML suggestion You could design a system no one uses to avoid the Valley System Accuracy ML Accuracy Terrible Human Perfect 2 mugs 1 mug
  • 38. Call to Action: Source a New Field of Research ‘ML Integration’ • HCI, Team Research, Data Science, AI, Psychology, User Research & Application Fields are all trying to understand integrating ML into Human Decision Systems … independently and slowly • Binding efforts into a single discipline will rapidly increase development and possibly meet demand * I am not qualified to give this talk … but who is?
  • 39. Bootstrapping ML Integration Funding Sources {Military, Accenture, Academia, …?} Initial Areas of Research: • How to calculate the speed and accuracy of large distributed human + ML decision systems • How to safely train and roll out new decision processes to experts • How to fairly explain an ML decision. Beyond explainable AI. • Design and run experiments in these systems Machine Learning DS AI Politics Team Research Effective ML Integration
  • 40. The Hard Part - Does The Uncanny Valley matter? ML Accuracy Terrible Human Perfect Accuracy Time & Cost Low High
  • 41. Delivering Results in the Uncanny Valley Can Lead to Early Project Termination • These are critical systems where a 5% drop in accuracy undoes years of research and investments. Mistakes are not treated kindly in these fields. • Funding is much more tightly controlled and hard to obtain after failed launches. • Legal modifications may become barriers
  • 42. Call to Action: Implement Guardrail Metrics for Vulnerable Members of the Population • Guardrail Metrics are used to allow ML to optimize as much as it can within a specified business boundary • Let’s define a boundary in tech to not make systems worse for black girls
  • 43. Thank You. Slides at /drandrews
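
The simulation step described in slides 10 to 12 (perturbing ground-truth labels to mimic models of different accuracy, with a bias toward images with many mugs) could be sketched roughly as follows. This is a minimal illustration, not the talk's actual code: the function names, the `count + 1` weighting, and the off-by-one perturbation are all assumptions made for the example.

```python
import random


def simulate_model_labels(ground_truth, target_accuracy, seed=0):
    """Return 'model' labels that match ground truth on roughly
    target_accuracy of the items.

    Assumption: items with more mugs are more likely to be perturbed,
    using weight = mug count + 1 (hypothetical choice for illustration).
    """
    rng = random.Random(seed)
    n = len(ground_truth)
    n_wrong = round(n * (1.0 - target_accuracy))

    # Weighted sampling without replacement: give each item a random key
    # u ** (1 / weight) and keep the items with the largest keys.
    keyed = []
    for i, count in enumerate(ground_truth):
        weight = count + 1
        keyed.append((rng.random() ** (1.0 / weight), i))
    keyed.sort(reverse=True)
    to_perturb = {i for _, i in keyed[:n_wrong]}

    labels = []
    for i, count in enumerate(ground_truth):
        if i in to_perturb:
            # Assumption: a wrong label is off by one mug, never negative.
            step = 1 if count == 0 else rng.choice([-1, 1])
            labels.append(count + step)
        else:
            labels.append(count)
    return labels


def accuracy(predicted, truth):
    """Fraction of items where the predicted label equals the truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    # Hypothetical ground truth: mug counts for 100 photos.
    truth = [0, 1, 2, 3, 1, 2, 5, 1, 0, 2] * 10
    for target in (0.85, 0.90, 0.94, 0.98):
        simulated = simulate_model_labels(truth, target, seed=42)
        print(target, accuracy(simulated, truth))
```

Each participant would then be assigned one simulated model, and system accuracy and decision time would be measured against the simulated ML accuracy.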