MANAGING UNCERTAINTY
IN AI PERFORMANCE
TARGET SETTING
Key methods
• Monte Carlo Simulation
• AI model ‘calibration’
HOW MUCH ACCURACY
DOES YOUR PRODUCT
REQUIRE?
WORKING BACKWARD FROM
PRODUCT REQUIREMENTS
•In classic AI benchmarks, such as CIFAR and MNIST, Kaggle contests
and other online challenges, we are accustomed to hearing about
algorithms with accuracy in the 95-99%+ range
•Frequently, predictive solutions have built upon one another over a
period of years, sometimes decades, with new state-of-the-art models
improving performance by a fraction of a percentage point
FOR MANY APPLICATIONS WITH NEW
DATASETS, THE INVESTMENT AND
WAITING PERIOD REQUIRED TO
REACH 99% ACCURACY ARE
PROHIBITIVE
• New datasets require long periods of data exploration to determine and eliminate errors in the data and
the data collection process
• Frequently, basic models can reach acceptable baseline levels of accuracy within a time period that is
acceptable for early prototype development
• Not all applications require the highest level of accuracy
• Models with higher and higher levels of accuracy can have diminishing returns, as they take longer
and are more expensive to train, especially when using cloud services such as AWS
• Simple models such as MLP and KNN can be implemented quickly using tools like scikit-learn and give
decent results
RESULTS OF SIMPLE MODELS
APPLIED TO MNIST
HOW CAN YOU DETERMINE
WHEN YOU NEED TO INVEST
IN DEVELOPING MORE
COMPLEX METHODS?
HIGHEST RISK VS. LOWER RISK
APPLICATIONS
•For many applications the risks are clear and the lowest levels of
error possible are desirable
• Self-driving cars
• Medical applications
•For many other applications, the risk associated with an error is not fatal,
and the costs associated with 99%+ accuracy are large. In
some cases, decision boundaries are not clear to human observers
and/or labels (such as appraisal values) are not agreed upon. These
types of applications are frequent when business, financial or
economic subject matter is the target of a prediction problem, but
can also appear in other low-risk applications such as chatbots,
where occasional errors may not dissuade prospects from converting
to sales.
IDEA – USE MONTE CARLO
SIMULATION
• Use Monte Carlo to simulate algorithm performance on real data before developing the algorithm
• For example, you can assess the impact of different levels of accuracy on your product performance before
investing time and money into developing an AI algorithm
PROJECTED PERFORMANCE
SIMULATION
•Select a percentage p of known labels (in this hypothetical case, “buy”
recommendations for hypothetical stocks with returns above a threshold)
and create an “AI”-selected dataset by randomly sampling a fraction 1 – p of
negative examples to be mislabeled by the hypothetical AI
•Create a model of your product or business performance
•Simulate the performance of the product or business using Monte Carlo
trials. In this case, a portfolio of 50 hypothetical stocks was chosen by the AI
and compared to one chosen by a hypothetical human, with some
information, from the same universe of stocks
•The probability distribution of false identifications in feature space can be
specified, and distributions with the same mean precision can be tested
•In this hypothetical example, an algorithm with 99% accuracy would be a
good target, but one should consider whether or not 80-90% would be sufficient
•Also: one should consider what level of accuracy is possible (for example, by
considering variability between human experts, in light of Big Data) and
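The simulation steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the author's actual model: the normally distributed returns, the 10% "buy" threshold, the universe size and the function names are all assumptions made for the example. Here p is treated as the precision of the hypothetical AI, so a portfolio of 50 picks contains about 50·p true "buys" and the rest are mislabeled negatives.

```python
import random
import statistics


def make_universe(n_stocks, rng):
    # Hypothetical returns; the normal distribution and its parameters are assumptions.
    return [rng.gauss(0.05, 0.20) for _ in range(n_stocks)]


def portfolio_return(returns, threshold, precision, size, rng):
    """Mean return of `size` picks, a fraction `precision` of which are true "buys"."""
    positives = [r for r in returns if r > threshold]
    negatives = [r for r in returns if r <= threshold]
    n_correct = round(precision * size)
    # The remaining picks are negatives that the hypothetical AI mislabels as "buy".
    picks = rng.sample(positives, n_correct) + rng.sample(negatives, size - n_correct)
    return statistics.mean(picks)


def simulate(precision, trials=500, n_stocks=500, size=50, threshold=0.10, seed=0):
    """Monte Carlo estimate of the mean and spread of portfolio return."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        universe = make_universe(n_stocks, rng)
        results.append(portfolio_return(universe, threshold, precision, size, rng))
    return statistics.mean(results), statistics.stdev(results)


def sweep(precisions=(0.6, 0.8, 0.9, 0.99)):
    # Compare candidate accuracy targets before building any model.
    return {p: simulate(p) for p in precisions}
```

Calling `sweep()` returns the simulated mean and standard deviation of portfolio return for each candidate precision. A lower precision value (for example 0.6, chosen arbitrarily here) could stand in for the hypothetical human with some information, so that the gap between the rows indicates how much a more accurate algorithm would be worth.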
INCOMPLETE DATA
•Another question companies frequently face is whether or not the
cost and time required to gather additional data will significantly
improve model performance
•Concept: use simulation on existing data to estimate performance
improvements
•Sub-sample the existing data to simulate missing data, in either feature
or label space
•Eliminate field entries or entire examples and track the degradation of
algorithm performance
•If performance does not decrease significantly, then more data is
unlikely to be helpful
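A rough sketch of the sub-sampling idea, removing entire examples and tracking accuracy on a fixed held-out set. The synthetic two-class data and the nearest-centroid classifier are stand-ins chosen for illustration only; in practice the existing dataset and the prototype model would take their place.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic two-class data with two features (an assumption for illustration)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        features = (rng.gauss(center, 1.5), rng.gauss(center, 1.5))
        data.append((features, label))
    return data


def fit_centroids(train):
    """Nearest-centroid classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = tuple(statistics.mean(col) for col in zip(*rows))
    return centroids


def predict(centroids, x):
    # Assign the class whose centroid is closest in squared Euclidean distance.
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, test):
    return sum(predict(centroids, x) == y for x, y in test) / len(test)


def degradation_curve(fractions, n_train=2000, n_test=2000, repeats=10, seed=0):
    """Mean test accuracy when only a fraction of the training examples is kept."""
    rng = random.Random(seed)
    train, test = make_data(n_train, rng), make_data(n_test, rng)
    curve = {}
    for frac in fractions:
        k = int(frac * n_train)
        scores = [accuracy(fit_centroids(rng.sample(train, k)), test)
                  for _ in range(repeats)]
        curve[frac] = statistics.mean(scores)
    return curve
```

For example, `degradation_curve((0.02, 0.1, 0.5, 1.0))` gives one accuracy value per retained fraction. If the values barely change between 50% and 100% of the data, that is a hint that gathering more examples of the same kind may not pay off for this model.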
NEW DATA
Some population data may be available for target populations at a high level,
but predicting labels for individuals from the population requires data to be gathered and significant
predictive features. Companies need to decide if the investment is worth it.
For example: should we gather data to predict income in the U.S. or Canada first? We can simulate performance to
determine which country would be more profitable to predict on a per-capita basis, given the product or business model.
For example, targeted advertising based on predicted income. Different distribution assumptions can
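One way such a comparison might be set up is sketched below. Everything numeric here is a placeholder: the log-normal income distribution, its parameters, the income threshold, the revenue and cost per advertisement, and the function name `profit_per_capita` are hypothetical and do not describe any real country. The sketch only shows how a per-capita profit could be simulated for a given distribution assumption and a given model accuracy.

```python
import random


def profit_per_capita(mu, sigma, accuracy, threshold=80_000,
                      revenue=5.0, ad_cost=1.0, n_people=20_000, seed=0):
    """Simulated profit per person from ads targeted at predicted high earners.

    mu, sigma: parameters of an assumed log-normal income distribution.
    accuracy:  probability that the hypothetical model labels a person correctly.
    """
    rng = random.Random(seed)
    profit = 0.0
    for _ in range(n_people):
        income = rng.lognormvariate(mu, sigma)
        is_high = income > threshold
        # The hypothetical model flips the true label with probability 1 - accuracy.
        predicted_high = is_high if rng.random() < accuracy else not is_high
        if predicted_high:
            profit -= ad_cost          # every targeted person costs one ad
            if is_high:
                profit += revenue      # only true high earners generate revenue
    return profit / n_people


def compare(populations, accuracy):
    # populations: name -> (mu, sigma), one entry per distribution assumption.
    return {name: profit_per_capita(mu, sigma, accuracy)
            for name, (mu, sigma) in populations.items()}
```

A call such as `compare({"population_a": (11.0, 0.6), "population_b": (10.8, 0.7)}, accuracy=0.8)` returns one simulated per-capita profit per assumed distribution, and repeating it with other parameters shows how sensitive the ranking is to those assumptions.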
ALL OF THIS CAN BE
ACCOMPLISHED BEFORE AN
ALGORITHM IS DEVELOPED OR
DURING EARLY STAGES OF
ALGORITHM DEVELOPMENT
As the error rates of these simple algorithms on MNIST data show, such models
can be rapidly prototyped using existing packages. A product prototype can therefore be
built while we weigh the added benefit of further development on the dataset we
need to work with, by simulating performance within our business or product
model
MODEL CALIBRATION
Guo, C., Pleiss, G., Sun, Y., Weinberger, K. (2017). On Calibration of Modern Neural Networks.
Proceedings of the 34th International Conference on Machine Learning, 70, pp. 1321-1330
IMPORTANCE OF CALIBRATION
• Useful when decisions need to be made or risks need to be assessed at the level of single predictions
• For example, in human-AI collaboration paradigms in which human assistance is requested for cases in which
machine confidence falls below a threshold
• Investors buying single art works require risk assessments on a per-item basis
• Current calibration methods as reviewed in the referenced article assess calibration across all featur
• However, there is no reason to assume that an algorithm is equally well calibrated across all subsets
• For example, there have been many cases in which facial recognition, sentiment analysis fail for pro
subgroups
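To make the subgroup concern concrete, the sketch below computes the expected calibration error (ECE) used in the referenced article, first over all predictions and then separately per subgroup. ECE bins predictions by confidence and takes the weighted average of |accuracy − confidence| over the bins. The helper names and the tiny dataset in the usage note are made up for illustration, and the bins here are half-open on the right, which is a simplification.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted mean gap between accuracy and confidence over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, hit))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        acc = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(acc - avg_conf)
    return ece


def ece_by_group(confidences, correct, groups, n_bins=10):
    """Compute ECE separately for each subgroup label in `groups`."""
    by_group = {}
    for conf, hit, group in zip(confidences, correct, groups):
        confs, hits = by_group.setdefault(group, ([], []))
        confs.append(conf)
        hits.append(hit)
    return {g: expected_calibration_error(c, h, n_bins)
            for g, (c, h) in by_group.items()}
```

As a usage example, take twenty predictions that all carry confidence 0.9, of which subgroup "a" gets 9 of 10 right and subgroup "b" only 5 of 10. The overall ECE is then about 0.2, while the per-subgroup values are about 0.0 and 0.4, so a single aggregate number hides the fact that one subgroup is far worse calibrated than the other.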
