Trusting AI with important decisions
@louisdorard
March 26, 2016
AI is everywhere
Amazon for David Jones (@d_jones, see source)
Lars Trieloff
@trieloff
(see source)
@louisdorard
ChurnSpotter.io
AI Startup Battle at PAPIs.io
• Startups pitch
• AI asks questions live to each startup
• AI assigns score
• Startup with highest score wins 100,000 €
Preseries
How does it work?
Data + Machine Learning
Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
3         1          1012             1951        house
2         1.5        968              1976        townhouse  447,000
4                    1315             1950        house      648,000
3         2          1599             1964        house
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4                                     2001        house      855,000
3         2.5        1472             2005        house
4         3.5        1714             2005        townhouse
2         2          1113             1999        condo
1                    769              1999        condo      315,000
ML is a set of AI techniques where “intelligence” is built from examples
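To make "built from examples" concrete, here is a minimal sketch that estimates a price from the priced rows of the table above. It uses a nearest-neighbour average, which is only one possible technique and not necessarily what the talk had in mind; the function names and the choice of k are assumptions made for illustration, and only the rows with no missing value are used.

```python
# Rows of the table that have every feature and a known price: the "examples".
# Features: (bedrooms, bathrooms, surface in square feet, year built).
EXAMPLES = [
    ((3, 1.0, 860, 1950), 565_000),
    ((2, 1.5, 968, 1976), 447_000),
    ((3, 2.0, 987, 1951), 790_000),
    ((1, 1.0, 530, 2007), 122_000),
    ((4, 2.0, 1574, 1964), 835_000),
]


def feature_ranges(examples):
    """Range (max - min) of each feature, used to put features on a common scale."""
    columns = list(zip(*(features for features, _ in examples)))
    return [(max(column) - min(column)) or 1.0 for column in columns]


def predict_price(query, examples=EXAMPLES, k=2):
    """Average the prices of the k examples most similar to the query."""
    ranges = feature_ranges(examples)

    def squared_distance(features):
        return sum(((a - b) / r) ** 2 for a, b, r in zip(features, query, ranges))

    nearest = sorted(examples, key=lambda example: squared_distance(example[0]))[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Second row of the table, which has no price yet.
estimate = predict_price((3, 1.0, 1012, 1951))
```

No pricing rule is written by hand here: the estimate comes entirely from the examples, and it changes as soon as examples are added or removed.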
Use cases
• Real-estate: property price (Zillow)
• Spam filtering: email spam indicator (Gmail)
• City bikes: location & context #bikes (V3 predict)
• Startup competition: startup success indicator (Preseries)
• Reduce churn: customer churn indicator (ChurnSpotter)
• Optimize pricing: product & price #sales (Amazon)
• Anticipate demand: context demand (Blue Yonder)
RULES
“Weak AI” vs. “Strong AI”
Decisions from predictions
Phases of data analysis
1. Descriptive
2. Predictive
3. Prescriptive
Churn analysis
1. Show churn rate against time
2. Predict which customers will churn next
3. Suggest what to do about each customer (e.g. propose to switch plan, send promotional offer, etc.)
Churn analysis
“Suggest what to do about each customer” → prioritised list of actions, based on…
• Customer representation + context
• Churn prediction & action prediction
• Uncertainty in predictions
• Revenue brought by customer & Cost of actions
• Constraints on frequency of solicitations
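One way to turn these ingredients into a prioritised list is sketched below. The scoring formula (churn probability × probability that the action works × revenue, minus the cost of the action) is an assumption made for illustration, as are all names; uncertainty in the predictions is not modelled here, and the constraint on solicitations is reduced to a single flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    customer: str
    action: str
    churn_probability: float    # predicted probability that the customer churns
    success_probability: float  # predicted probability that the action prevents churn
    revenue: float              # revenue brought by the customer if retained
    cost: float                 # cost of carrying out the action
    recently_contacted: bool    # True if the solicitation limit is already reached


def expected_gain(candidate):
    """Expected revenue saved by the action, minus what the action costs."""
    saved = candidate.churn_probability * candidate.success_probability * candidate.revenue
    return saved - candidate.cost


def prioritise(candidates):
    """Best worthwhile action per eligible customer, highest expected gain first."""
    best = {}
    for c in candidates:
        if c.recently_contacted or expected_gain(c) <= 0:
            continue  # constraint violated, or action not worth its cost
        current = best.get(c.customer)
        if current is None or expected_gain(c) > expected_gain(current):
            best[c.customer] = c
    return sorted(best.values(), key=expected_gain, reverse=True)


todo = prioritise([
    Candidate("alice", "promotional offer", 0.8, 0.5, 1000, 50, False),
    Candidate("alice", "switch plan", 0.8, 0.3, 1000, 10, False),
    Candidate("bob", "promotional offer", 0.2, 0.5, 200, 50, False),
    Candidate("carol", "switch plan", 0.9, 0.4, 2000, 10, True),
    Candidate("dave", "switch plan", 0.5, 0.4, 3000, 10, False),
])
```

In this made-up data, bob is dropped because the offer costs more than it is expected to save, and carol is dropped because she was contacted too recently.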
Pricing optimisation
Again, from David Jones (@d_jones, see source)
Decide price given product and context…
• For several price candidates (within constrained range):
  • Predict # sales given product, context, price
  • Multiply by price to estimate revenue
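The procedure above can be sketched as a small search over candidate prices, keeping the one with the highest estimated revenue. The demand model below is a made-up stand-in for a trained ML model (a linear demand curve with a context-dependent base level); its name, its parameters and the candidate prices are all assumptions.

```python
def predict_sales(product, context, price):
    """Hypothetical stand-in for a trained model predicting # sales."""
    base_demand = {"weekday": 100.0, "weekend": 140.0}[context]
    return max(0.0, base_demand - 2.0 * price)


def best_price(product, context, candidates):
    """Candidate price with the highest estimated revenue (price x predicted sales)."""
    return max(candidates, key=lambda price: price * predict_sales(product, context, price))


# Candidate prices within the allowed (constrained) range.
CANDIDATES = [10, 15, 20, 25, 30, 35, 40]

weekday_price = best_price("book", "weekday", CANDIDATES)
weekend_price = best_price("book", "weekend", CANDIDATES)
```

The same product gets a different price in a different context, because the predicted number of sales depends on the context.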
Pricing optimisation
Decide price given product and context…
• For several price candidates (within constrained range):
  • Predict 95%-confidence lower bound on # sales given product, context, price
  • Multiply by price to estimate revenue
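Using a lower bound instead of a point prediction makes the choice more cautious. The sketch below assumes the model returns a mean and a standard deviation, and that errors are roughly normal, so the one-sided 95% lower bound is the mean minus 1.645 standard deviations. The demand model, and the idea that it is less certain far from a usual price of 20, are invented for illustration.

```python
Z_95_ONE_SIDED = 1.645  # normal quantile for a one-sided 95% bound


def predict_sales_with_uncertainty(price):
    """Hypothetical model output: (mean, standard deviation) of # sales."""
    mean = max(0.0, 100.0 - 2.0 * price)
    std = 2.0 + 1.5 * abs(price - 20.0)  # assumed: less certain far from the usual price
    return mean, std


def lower_bound_sales(price):
    """95%-confidence lower bound on # sales, under a normal approximation."""
    mean, std = predict_sales_with_uncertainty(price)
    return max(0.0, mean - Z_95_ONE_SIDED * std)


def point_estimate_price(candidates):
    """Price maximising revenue computed from the mean prediction only."""
    return max(candidates, key=lambda p: p * predict_sales_with_uncertainty(p)[0])


def cautious_price(candidates):
    """Price maximising revenue computed from the lower bound on sales."""
    return max(candidates, key=lambda p: p * lower_bound_sales(p))


CANDIDATES = [10, 15, 20, 25, 30, 35, 40]
optimistic = point_estimate_price(CANDIDATES)
cautious = cautious_price(CANDIDATES)
```

With these made-up numbers the point estimate favours a higher price, while the lower bound favours the price where the model is most certain of its prediction.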
Replenishment
1. Show past demand against calendar
2. Predict demand for [product] at [store] in next 2 days
3. Suggest how much to ship
  • Trade-off: cost of storage vs risk of lost sales
  • Constraints on order size, truck volume, and capacity of staff stocking the shelves
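The trade-off in step 3 can be sketched as choosing the quantity that minimises expected cost over a few demand scenarios, subject to a maximum order size. The scenarios, their probabilities and the two unit costs are invented; in practice they would come from the demand predictions and from the business.

```python
def expected_cost(order, demand_scenarios, storage_cost, lost_sale_cost):
    """Expected cost of shipping `order` units, averaged over demand scenarios."""
    total = 0.0
    for demand, probability in demand_scenarios:
        leftover = max(0, order - demand)   # units that must be stored
        shortage = max(0, demand - order)   # sales lost because shelves were empty
        total += probability * (storage_cost * leftover + lost_sale_cost * shortage)
    return total


def best_order(demand_scenarios, storage_cost, lost_sale_cost, max_order):
    """Order size in 0..max_order with the lowest expected cost."""
    return min(
        range(max_order + 1),
        key=lambda order: expected_cost(order, demand_scenarios, storage_cost, lost_sale_cost),
    )


# Hypothetical predicted demand for the next 2 days: (units, probability).
SCENARIOS = [(80, 0.25), (100, 0.5), (120, 0.25)]

unconstrained = best_order(SCENARIOS, storage_cost=1, lost_sale_cost=4, max_order=200)
truck_limited = best_order(SCENARIOS, storage_cost=1, lost_sale_cost=4, max_order=110)
```

Because a lost sale is assumed to cost four times as much as storing a unit, the suggestion leans towards the high-demand scenario, until the truck capacity caps it.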
AI vs humans
Who performs better?
Star Wars: The Flat Awakens, by Filipe de Carvalho
AI performs better: Chess
AI performs better: Go
AI + Human perform better: Chess
Humans perform better: football
AI performs better: replenishment
AI alone performs better: replenishment
Decisions are faster, cheaper, and better
Again, from Lars Trieloff @trieloff (see source)
(Chart: decision quality across Status Quo, Predictive, Prescriptive, Automation)
Beyond prescriptive analysis
1. Descriptive analysis
2. Predictive analysis
3. Prescriptive analysis
4. Automated decisions
Can we trust AI to be autonomous?
Autonomous decision-making systems
• Spam filter → decide to skip inbox
• Autonomous Vehicles → decide who to kill
“Tool AI” vs. “High-stakes autonomous AI”
Autonomous Vehicles
• Morality in decision-making algorithm:
  • Minimize loss of life
  • Account for probabilities of survival, age of occupants… → optimal formula?
• Sacrifice owner?
• “People are in favor of cars that sacrifice the occupant to save other lives, as long as they don’t have to drive one themselves.”
High-stakes autonomous AIs
• Need wide acceptance to get adoption and provide benefit (e.g. save lives with AVs)
• “The public is much more likely to go along with a scenario that aligns with their own views”
• What will the public tolerate? → experimental ethics
• Similar issues whenever AI decides for us and impacts many
Additional rules in decision making
Performance guarantees?
95%-accurate scene description
• “construction worker in orange safety vest is working on road”
• “black and white dog jumps over bar”
• “a young boy is holding a baseball bat”
• “a young boy is holding a baseball bat” → weapon → SIR, DROP THE WEAPON!
Defining desired and acceptable behavior
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
Ensuring performance of autonomous AI systems
• Performance of predictions → monitor accuracy
• Decisions
  • Monitor AI with other AI (e.g. anomaly detection)
  • Define desired and acceptable behavior → objectives and constraints/bounds
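Two of these safeguards are simple enough to sketch: tracking prediction accuracy over a sliding window, and clamping an automated decision to acceptable bounds. This is a plain threshold check, not the anomaly-detection approach mentioned above, and the class name, window size and thresholds are assumptions chosen for illustration.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions whose outcome is known."""

    def __init__(self, window=100, min_accuracy=0.9):
        self.outcomes = deque(maxlen=window)  # True where the prediction was right
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing observed yet
        return sum(self.outcomes) / len(self.outcomes)

    def healthy(self):
        """False once recent accuracy falls below the acceptable minimum."""
        accuracy = self.accuracy()
        return accuracy is None or accuracy >= self.min_accuracy


def bounded_decision(proposed, lower, upper):
    """Keep an automated decision within the bounds defined as acceptable."""
    return min(max(proposed, lower), upper)


monitor = AccuracyMonitor(window=4, min_accuracy=0.75)
for predicted, actual in [("spam", "spam"), ("ham", "ham"), ("spam", "spam"), ("spam", "ham")]:
    monitor.record(predicted, actual)
still_ok = monitor.healthy()      # 3 of the last 4 predictions were right
monitor.record("ham", "spam")     # another miss pushes the oldest hit out of the window
degraded = not monitor.healthy()
```

When the monitor reports degraded accuracy, a system could fall back to human review or to a safe default; which fallback is appropriate depends on the stakes of the decision.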
Decisions are based on…
• Context
• Predictions
• Uncertainty in predictions
• Constraints (i.e. acceptable behavior)
• Costs / benefits
• Objectives (i.e. desired behavior)
Other issues
• Trusting decisions when we can’t even interpret them
• Who is responsible when things go wrong?
• …
• Issues are not linked to the AI being weak or strong!
Learn more
Original article at stories.papis.io (with references and links)
meetup.com/Bordeaux-Machine-Learning-Meetup/
@louisdorard
Thank you!
