Machine learning done right
An approach to successfully building AI products
Pablo Ribalta Lorenzo
R&D Lead Engineer
pribalta@future-processing.com
Building a Machine Learning product
Machine learning done right: An approach to building successful products
• Choosing your metric
• Building your dataset
• Tuning your parameters
• Comparing your results
Choosing your metric
Manual MRI segmentation: MRI scan → doctor’s prediction
Automatic MRI segmentation: MRI scan + ground truth → training → ML-system prediction
Ground truth vs ML-system prediction
Approach #0: Pixelwise comparison
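A minimal sketch of what a pixelwise comparison could look like, assuming binary masks stored as nested lists of 0/1. The function name and the 10x10 toy mask are hypothetical and not taken from the talk. The example also hints at why this approach is only numbered #0: when the lesion covers few pixels, a prediction that finds nothing still agrees with the ground truth almost everywhere.

```python
def pixelwise_accuracy(truth, pred):
    """Fraction of pixels on which two equally sized binary masks agree."""
    flat_truth = [p for row in truth for p in row]
    flat_pred = [p for row in pred for p in row]
    matches = sum(t == p for t, p in zip(flat_truth, flat_pred))
    return matches / len(flat_truth)


# Hypothetical 10x10 ground truth containing a 2x2 lesion (4 positive pixels).
truth = [[0] * 10 for _ in range(10)]
for r in (4, 5):
    for c in (4, 5):
        truth[r][c] = 1

# A prediction that segments nothing at all.
empty_pred = [[0] * 10 for _ in range(10)]

print(pixelwise_accuracy(truth, empty_pred))  # 0.96
```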
Approach #1: Exploiting confusion matrices
Confusion matrix cells: true positives (TP), false positives (FP), false negatives (FN), true negatives (TN)
Relevant elements: TP + FN. Selected elements: TP + FP.
$\text{Precision} = \frac{TP}{TP + FP} \in [0, 1]$: what is our tendency to oversegment?
$\text{Recall} = \frac{TP}{TP + FN} \in [0, 1]$: what is our tendency to miss items?
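The two ratios can be sketched in a few lines of Python. The helper names and the toy labels are illustrative assumptions, and returning 0.0 when a denominator is empty is a convention chosen here, not something the slides specify.

```python
def confusion_counts(truth, pred):
    """Count TP, FP, FN, TN for two flat sequences of 0/1 labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision(tp, fp):
    """TP / (TP + FP); low values mean the model oversegments."""
    return tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0


def recall(tp, fn):
    """TP / (TP + FN); low values mean the model misses items."""
    return tp / (tp + fn) if (tp + fn) else 0.0  # assumed convention for 0/0


# Hypothetical flattened masks: 4 lesion pixels, 6 background pixels.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(truth, pred)
print(tp, fp, fn, tn)     # 3 2 1 4
print(precision(tp, fp))  # 0.6
print(recall(tp, fn))     # 0.75
```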
Ultimate goal: Single metric
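$F_1\ \text{score} = 2 \cdot \frac{1}{\frac{1}{\text{recall}} + \frac{1}{\text{precision}}} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \in [0, 1]$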
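As a quick check that the two forms of the formula agree, here is a small sketch with hypothetical function names. The second example shows why a harmonic mean suits a single metric: one weak component pulls the score down much more than an arithmetic mean would.

```python
def f1_score(precision, recall):
    """F1 as 2 * P * R / (P + R); defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_harmonic(precision, recall):
    """The same quantity written as the harmonic mean of P and R."""
    if precision == 0 or recall == 0:
        return 0.0
    return 2 / (1 / recall + 1 / precision)


# Both forms give the same value (about 0.667 for P = 0.6, R = 0.75).
print(f1_score(0.6, 0.75), f1_harmonic(0.6, 0.75))

# Perfect precision with poor recall: the arithmetic mean would be 0.55,
# while F1 stays close to the weaker of the two values (about 0.18).
print(f1_score(1.0, 0.1))
```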
Choosing your metric: Summary
• Like business requirements, a good metric comes from understanding the needs and expectations of the model’s users
• A model can be excellent on one metric but very poor on others
• Train using the metric you plan to judge the model with
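One common way to follow the last point when the target metric is F1 is to train with a soft Dice loss, a differentiable stand-in for 1 − F1. The slides do not say which loss was used, so this is a swapped-in illustration rather than the author's method. The function name and the `eps` smoothing term are assumptions.

```python
def soft_dice_loss(truth, probs, eps=1e-7):
    """1 - soft Dice between 0/1 labels and predicted probabilities.

    With hard 0/1 predictions, Dice equals 2*TP / (2*TP + FP + FN),
    which is the F1 score.
    """
    intersection = sum(t * p for t, p in zip(truth, probs))
    total = sum(truth) + sum(probs)
    # eps (an assumed smoothing constant) avoids 0/0 on empty masks.
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hard predictions with TP=3, FP=2, FN=1: F1 = 6/9, so the loss is about 1/3.
hard_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
print(soft_dice_loss(truth, hard_pred))

# Soft predictions give a smooth value that can guide gradient-based training.
soft_pred = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.1, 0.1, 0.0, 0.0]
print(soft_dice_loss(truth, soft_pred))
```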
Building your dataset
How much data can we collect?
Dealing with data scarcity
Medical records: only a few patients
Secret sauce: Data augmentation
Augmentations shown (original vs deformed image): rotation by 0°, 45° and 90°, horizontal flip, vertical flip
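A rough sketch of how such augmentations can multiply a small dataset, assuming images and masks stored as nested lists. All function names are hypothetical. Only the 0° and 90° rotations are covered, because a 45° rotation needs interpolation and is left out. For segmentation, the same transform has to be applied to the image and to its ground-truth mask.

```python
def horizontal_flip(image):
    """Mirror left-right by reversing every row."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror top-bottom by reversing the order of the rows."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate by 90 degrees counter-clockwise (transpose, then reverse rows)."""
    return [list(row) for row in zip(*image)][::-1]


def augment(image, mask):
    """Return (image, mask) pairs for 2 rotations x 3 flip settings."""
    rotations = [lambda x: x, rotate_90]
    flips = [lambda x: x, horizontal_flip, vertical_flip]
    pairs = []
    for rot in rotations:
        for flip in flips:
            # Identical transform for the scan and its ground truth.
            pairs.append((flip(rot(image)), flip(rot(mask))))
    return pairs


# Hypothetical 2x2 "scan" and its mask.
image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]

for aug_image, aug_mask in augment(image, mask):
    print(aug_image, aug_mask)
```

With these settings, one annotated scan yields six training pairs. Whether every variant is anatomically plausible depends on the data and has to be judged case by case.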
Building your dataset: Summary
• There are many approaches to augmenting data
• We must ensure that our dataset is balanced and correctly describes the data’s statistical distribution
• Although not covered here, splitting the dataset into training, validation and test sets is fundamental for correct training and evaluation of the results
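The split mentioned in the last bullet can be sketched as follows. The fractions, the seed and the patient identifiers are illustrative assumptions. Splitting by patient rather than by image is also a choice made here: with only a few patients, it keeps scans of the same person from landing in both training and test sets.

```python
import random


def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical patient identifiers; each patient may own several scans.
patient_ids = [f"patient_{i:02d}" for i in range(20)]
train, val, test = split_dataset(patient_ids)

print(len(train), len(val), len(test))  # 14 3 3
```

The test set should be set aside until the very end, and augmentation should be applied only after the split so that variants of one image never cross partitions.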
Tuning your model
Hyperparameter optimisation
Automatic hyper-parameter selection: Particle Swarm Optimization
• Pablo Ribalta Lorenzo, Jakub Nalepa, Michal Kawulok, Luciano Sanchez Ramos, and José
Ranilla Pastor. 2017. Particle swarm optimization for hyper-parameter selection in deep
neural networks. In Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO '17). ACM, New York, NY, USA, 481-488.
• Pablo Ribalta Lorenzo, Jakub Nalepa, Luciano Sanchez Ramos, and José Ranilla Pastor.
2017. Hyper-parameter selection in deep neural networks using parallel particle swarm
optimization. In Proceedings of the Genetic and Evolutionary Computation Conference
Companion (GECCO '17). ACM, New York, NY, USA, 1864-1871.
When possible, go automatic
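To give a feel for automatic selection, here is a minimal textbook-style particle swarm optimizer (global best, with an inertia weight). It is not the algorithm from the papers cited above. The coefficient values, the function names and the toy objective standing in for a validation error are all assumptions made for illustration. In practice each objective call would train and validate a model.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iters=60,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimise objective(params) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]          # personal bests
    best_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_pos[i][d] - positions[i][d])
                    + social * r2 * (global_pos[d] - positions[i][d])
                )
                # Move and clamp to the allowed range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < global_val:
                    global_pos, global_val = positions[i][:], value
    return global_pos, global_val


def toy_validation_error(params):
    """Hypothetical stand-in: best at log10(learning rate) = -3, dropout = 0.3."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


best_params, best_error = particle_swarm_minimize(
    toy_validation_error, bounds=[(-6.0, 0.0), (0.0, 0.9)])
print(best_params, best_error)
```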
Tuning your model: Summary
• Hyper-parameter optimization is probably the most time-consuming aspect of building a Machine Learning product
• We need to be confident that our selected settings will carry over well to the majority of cases
• Use automatic approaches when possible
Comparing your results
Surpassing human performance in medical classification
• Typical human performance: 3% error
• Typical doctor performance: 1% error
• Experienced doctor performance: 0.7% error
• Team of experienced doctors performance: 0.5% error
What is human performance?
F1 scores shown on the slide: 0.817, 0.845, 0.545, 0.801
Comparing with the state of the art
• Superpixel segmentation algorithm
3x state-of-the-art performance for single stage lesions
2x state-of-the-art performance for multiple stage lesions
Comparing your results: Summary
• Comparing with human performance is hard and, most of the time, can be misleading
• We have to strive for statistically significant results across different subsets of our data
• Comparing with the state of the art is always a good idea, but we must ensure a fair comparison
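One possible way to check significance across subsets is an exact paired permutation (sign-flip) test on per-subset F1 scores. The talk does not name a specific test, so this is only an illustrative choice. The scores below are made up for the example and are not results from the project.

```python
from itertools import product


def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided paired permutation test on the summed differences.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign pattern is enumerated and the fraction
    that is at least as extreme as the observed one is returned.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Made-up F1 scores on 8 data subsets, evaluated on identical splits.
ours = [0.82, 0.79, 0.85, 0.80, 0.77, 0.84, 0.81, 0.83]
baseline = [0.78, 0.76, 0.80, 0.79, 0.74, 0.80, 0.78, 0.79]

print(paired_sign_flip_p_value(ours, baseline))  # 0.0078125
```

Evaluating both methods on the same subsets is also part of the fair comparison the last bullet asks for. Full enumeration grows as 2^n, so larger numbers of subsets would need random sampling of sign patterns instead.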
About us
ECONIB in numbers
• 18 months ongoing
• 8 publications
• Featured in social media
• Healthcare and research partnership
• NVIDIA Inception member
• Still more research in progress
Conclusions
• Building ML products is possible with a rigorous scientific approach
• Maximising the performance of our model is a nuanced process that requires a thorough understanding of the problem and the theory behind it
• It is not only about the model, but also what’s around it
Machine learning done right
An approach to building successful ML projects
pribalta@future-processing.com
www.future-processing.pl

More Related Content

Similar to [FDD 2017] Pablo Ribalta - Machine learning done right

Similar to [FDD 2017] Pablo Ribalta - Machine learning done right (20)

Designing for Safety by Lyft Product Lead
Designing for Safety by Lyft Product LeadDesigning for Safety by Lyft Product Lead
Designing for Safety by Lyft Product Lead
 
How to Avoid Over-Optimizing in Product by Trunk Club Sr PM
How to Avoid Over-Optimizing in Product by Trunk Club Sr PMHow to Avoid Over-Optimizing in Product by Trunk Club Sr PM
How to Avoid Over-Optimizing in Product by Trunk Club Sr PM
 
Are You Making These 7 'Testing Metric' Mistakes? Webinar - Mark Bentsen, Phi...
Are You Making These 7 'Testing Metric' Mistakes? Webinar - Mark Bentsen, Phi...Are You Making These 7 'Testing Metric' Mistakes? Webinar - Mark Bentsen, Phi...
Are You Making These 7 'Testing Metric' Mistakes? Webinar - Mark Bentsen, Phi...
 
How reimagining customer journeys can build brand advocacy and customer loyalty
How reimagining customer journeys can build brand advocacy and customer loyaltyHow reimagining customer journeys can build brand advocacy and customer loyalty
How reimagining customer journeys can build brand advocacy and customer loyalty
 
[Webinar] Innovate Faster by Adopting The Modern Growth Stack
[Webinar] Innovate Faster by Adopting The Modern Growth Stack[Webinar] Innovate Faster by Adopting The Modern Growth Stack
[Webinar] Innovate Faster by Adopting The Modern Growth Stack
 
Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017
 
Intro to Data Analytics with Oscar's Director of Product
 Intro to Data Analytics with Oscar's Director of Product Intro to Data Analytics with Oscar's Director of Product
Intro to Data Analytics with Oscar's Director of Product
 
Shopify Sales strategy for Retail Entrepreneurs: kachingkings.co - PDF
Shopify Sales strategy for Retail Entrepreneurs: kachingkings.co - PDFShopify Sales strategy for Retail Entrepreneurs: kachingkings.co - PDF
Shopify Sales strategy for Retail Entrepreneurs: kachingkings.co - PDF
 
Lean Startup Tools for Scrum Product Owners
Lean Startup Tools for Scrum Product OwnersLean Startup Tools for Scrum Product Owners
Lean Startup Tools for Scrum Product Owners
 
Clario Webinar 7-29-09
Clario Webinar 7-29-09Clario Webinar 7-29-09
Clario Webinar 7-29-09
 
PSU Web 2013: User Research Power Tool: Pareto Principle Based User Research
PSU Web 2013: User Research Power Tool: Pareto Principle Based User ResearchPSU Web 2013: User Research Power Tool: Pareto Principle Based User Research
PSU Web 2013: User Research Power Tool: Pareto Principle Based User Research
 
I Love APIs Europe 2015: Business Sessions
I Love APIs Europe 2015: Business SessionsI Love APIs Europe 2015: Business Sessions
I Love APIs Europe 2015: Business Sessions
 
6 Guidelines for A/B Testing
6 Guidelines for A/B Testing6 Guidelines for A/B Testing
6 Guidelines for A/B Testing
 
PM in AI-First Organizations by eBay AI Product Leader
PM in AI-First Organizations by eBay AI Product LeaderPM in AI-First Organizations by eBay AI Product Leader
PM in AI-First Organizations by eBay AI Product Leader
 
PM in AI-First Organizations by eBay AI Product Leader
PM in AI-First Organizations by eBay AI Product LeaderPM in AI-First Organizations by eBay AI Product Leader
PM in AI-First Organizations by eBay AI Product Leader
 
Stilnest.com: Wie beste Magento-Shop-Performance für zufriedene Kunden sorgt
Stilnest.com: Wie beste Magento-Shop-Performance für zufriedene Kunden sorgtStilnest.com: Wie beste Magento-Shop-Performance für zufriedene Kunden sorgt
Stilnest.com: Wie beste Magento-Shop-Performance für zufriedene Kunden sorgt
 
UX STRAT 2013: Josh Seiden, Lean UX + UX STRAT
UX STRAT 2013: Josh Seiden, Lean UX + UX STRATUX STRAT 2013: Josh Seiden, Lean UX + UX STRAT
UX STRAT 2013: Josh Seiden, Lean UX + UX STRAT
 
Growth Hacking 101: Growth Metrics, Lean Analytics & Growth Culture
Growth Hacking 101: Growth Metrics, Lean Analytics & Growth CultureGrowth Hacking 101: Growth Metrics, Lean Analytics & Growth Culture
Growth Hacking 101: Growth Metrics, Lean Analytics & Growth Culture
 
Clover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive ExperimentationClover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive Experimentation
 
20 million users and 10 million projects, how to scale like Freelancer.com
20 million users and 10 million projects, how to scale like Freelancer.com20 million users and 10 million projects, how to scale like Freelancer.com
20 million users and 10 million projects, how to scale like Freelancer.com
 

More from Future Processing

More from Future Processing (20)

DPTO_Inżynieria oprogramowania to proces uczenia się.pdf
DPTO_Inżynieria oprogramowania to proces uczenia się.pdfDPTO_Inżynieria oprogramowania to proces uczenia się.pdf
DPTO_Inżynieria oprogramowania to proces uczenia się.pdf
 
DPTO_QA w świecie wartości biznesowych.pdf
DPTO_QA w świecie wartości biznesowych.pdfDPTO_QA w świecie wartości biznesowych.pdf
DPTO_QA w świecie wartości biznesowych.pdf
 
DPTO_Hello_Clean_Architekture.pdf
DPTO_Hello_Clean_Architekture.pdfDPTO_Hello_Clean_Architekture.pdf
DPTO_Hello_Clean_Architekture.pdf
 
[Quality Meetup #20] Michał Górski - Continuous Deployment w chmurze
[Quality Meetup #20] Michał Górski - Continuous Deployment w chmurze[Quality Meetup #20] Michał Górski - Continuous Deployment w chmurze
[Quality Meetup #20] Michał Górski - Continuous Deployment w chmurze
 
[Quality Meetup #20] Dorota Tadych - Hyperion - wystarczy jeden shake
[Quality Meetup #20] Dorota Tadych - Hyperion - wystarczy jeden shake[Quality Meetup #20] Dorota Tadych - Hyperion - wystarczy jeden shake
[Quality Meetup #20] Dorota Tadych - Hyperion - wystarczy jeden shake
 
[Quality Meetup #19] Magdalena Drechsler-Nowak - Tester w pułapce myślenia
[Quality Meetup #19] Magdalena Drechsler-Nowak - Tester w pułapce myślenia[Quality Meetup #19] Magdalena Drechsler-Nowak - Tester w pułapce myślenia
[Quality Meetup #19] Magdalena Drechsler-Nowak - Tester w pułapce myślenia
 
[Quality Meetup #19] Adrian Gonciarz - Testerska ruletka
[Quality Meetup #19] Adrian Gonciarz - Testerska ruletka[Quality Meetup #19] Adrian Gonciarz - Testerska ruletka
[Quality Meetup #19] Adrian Gonciarz - Testerska ruletka
 
[FDD 2018] Krzysztof Sikora - Jak Service Fabric rozwiąże twoje problemy z mi...
[FDD 2018] Krzysztof Sikora - Jak Service Fabric rozwiąże twoje problemy z mi...[FDD 2018] Krzysztof Sikora - Jak Service Fabric rozwiąże twoje problemy z mi...
[FDD 2018] Krzysztof Sikora - Jak Service Fabric rozwiąże twoje problemy z mi...
 
[FDD 2018] Ł. Turchan, A. Hulist, M. Duchnowski - CUDA - results over coffee ...
[FDD 2018] Ł. Turchan, A. Hulist, M. Duchnowski - CUDA - results over coffee ...[FDD 2018] Ł. Turchan, A. Hulist, M. Duchnowski - CUDA - results over coffee ...
[FDD 2018] Ł. Turchan, A. Hulist, M. Duchnowski - CUDA - results over coffee ...
 
[FDD 2018] Lech Kalinowski - Prywatny Blockchain
[FDD 2018] Lech Kalinowski - Prywatny Blockchain[FDD 2018] Lech Kalinowski - Prywatny Blockchain
[FDD 2018] Lech Kalinowski - Prywatny Blockchain
 
[FDD 2018] W. Malara, K. Kotowski - Autoenkodery – czyli zalety funkcji F(X)≈X
[FDD 2018] W. Malara, K. Kotowski - Autoenkodery – czyli zalety funkcji F(X)≈X[FDD 2018] W. Malara, K. Kotowski - Autoenkodery – czyli zalety funkcji F(X)≈X
[FDD 2018] W. Malara, K. Kotowski - Autoenkodery – czyli zalety funkcji F(X)≈X
 
[FDD 2018] Jarosław Ogiegło - Ludzie, zabezpieczajcie się! Wprowadzenie do OA...
[FDD 2018] Jarosław Ogiegło - Ludzie, zabezpieczajcie się! Wprowadzenie do OA...[FDD 2018] Jarosław Ogiegło - Ludzie, zabezpieczajcie się! Wprowadzenie do OA...
[FDD 2018] Jarosław Ogiegło - Ludzie, zabezpieczajcie się! Wprowadzenie do OA...
 
[JuraSIC! Meetup] Krzysztof Sikora- Jak Service Fabric rozwiąże twoje problem...
[JuraSIC! Meetup] Krzysztof Sikora- Jak Service Fabric rozwiąże twoje problem...[JuraSIC! Meetup] Krzysztof Sikora- Jak Service Fabric rozwiąże twoje problem...
[JuraSIC! Meetup] Krzysztof Sikora- Jak Service Fabric rozwiąże twoje problem...
 
[JuraSIC! Meetup] Mateusz Stasch - Monady w .NET
[JuraSIC! Meetup] Mateusz Stasch - Monady w .NET[JuraSIC! Meetup] Mateusz Stasch - Monady w .NET
[JuraSIC! Meetup] Mateusz Stasch - Monady w .NET
 
[QE 2018] Aleksandra Kornecka – Kognitywne podejście do testowania aplikacji ...
[QE 2018] Aleksandra Kornecka – Kognitywne podejście do testowania aplikacji ...[QE 2018] Aleksandra Kornecka – Kognitywne podejście do testowania aplikacji ...
[QE 2018] Aleksandra Kornecka – Kognitywne podejście do testowania aplikacji ...
 
[QE 2018] Adam Stasiak – Nadchodzi React Native – czyli o testowaniu mobilnyc...
[QE 2018] Adam Stasiak – Nadchodzi React Native – czyli o testowaniu mobilnyc...[QE 2018] Adam Stasiak – Nadchodzi React Native – czyli o testowaniu mobilnyc...
[QE 2018] Adam Stasiak – Nadchodzi React Native – czyli o testowaniu mobilnyc...
 
[QE 2018] Łukasz Gawron – Testing Batch and Streaming Spark Applications
[QE 2018] Łukasz Gawron – Testing Batch and Streaming Spark Applications[QE 2018] Łukasz Gawron – Testing Batch and Streaming Spark Applications
[QE 2018] Łukasz Gawron – Testing Batch and Streaming Spark Applications
 
[QE 2018] Marek Puchalski – Web Application Security Test Automation
[QE 2018] Marek Puchalski – Web Application Security Test Automation[QE 2018] Marek Puchalski – Web Application Security Test Automation
[QE 2018] Marek Puchalski – Web Application Security Test Automation
 
[QE 2018] Rob Lambert – How to Thrive as a Software Tester
[QE 2018] Rob Lambert – How to Thrive as a Software Tester[QE 2018] Rob Lambert – How to Thrive as a Software Tester
[QE 2018] Rob Lambert – How to Thrive as a Software Tester
 
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
 

Recently uploaded

一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
pyhepag
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
pyhepag
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
pyhepag
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
cyebo
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
DilipVasan
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertainty
RafigAliyev2
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 

Recently uploaded (20)

一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdf
 
basics of data science with application areas.pdf
basics of data science with application areas.pdfbasics of data science with application areas.pdf
basics of data science with application areas.pdf
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertainty
 
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfGenerative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp online
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prison
 
The Significance of Transliteration Enhancing
The Significance of Transliteration EnhancingThe Significance of Transliteration Enhancing
The Significance of Transliteration Enhancing
 

[FDD 2017] Pablo Ribalta - Machine learning done right

  • 1. Machine learning done right An approach to successfully building AI products Pablo Ribalta Lorenzo R&D Lead Engineer pribalta@future-processing.com
  • 4. Pablo Ribalta Lorenzo Building a Machine Learning product Machine learning done right: An approach to building successful products Choosing your metric Building your dataset Tuning your parameters Comparing your results
  • 5. Pablo Ribalta Lorenzo Building a Machine Learning product Machine learning done right: An approach to building successful products Choosing your metric Building your dataset Tuning your parameters Comparing your results
  • 6. Pablo Ribalta Lorenzo Building a Machine Learning product Machine learning done right: An approach to building successful products Choosing your metric Building your dataset Tuning your parameters Comparing your results
  • 7. Pablo Ribalta Lorenzo Building a Machine Learning product Machine learning done right: An approach to building successful products Choosing your metric Building your dataset Tuning your parameters Comparing your results
  • 8. Pablo Ribalta Lorenzo Building a Machine Learning product Machine learning done right: An approach to building successful products Choosing your metric Building your dataset Tuning your parameters Comparing your results
  • 9. Pablo Ribalta Lorenzo Choosing your metric Machine learning done right: An approach to building successful products
  • 10. MRI scan Manual MRI segmentation
  • 11. MRI scan Doctor’s prediction Manual MRI segmentation
  • 12. MRI scan Automatic MRI segmentation
  • 13. MRI scan ? Automatic MRI segmentation
  • 15. Ground truth MRI scan ML-system prediction Training Automatic MRI segmentation
  • 19. Approach #1: Exploiting confusion matrices True positives False positives True negativesFalse negatives
  • 20. Approach #1: Exploiting confusion matrices True positives False positives True negativesFalse negatives
  • 21. Approach #1: Exploiting confusion matrices True positives False positives True negativesFalse negatives Relevant elements Selected elements
  • 22. Precision = FPTP TP Recall = TP FN TP What is our tendency to oversegment? What is our tendency to miss items? [0, 1] [0, 1]
  • 23. Pablo Ribalta Lorenzo Ultimate goal: Single metric Machine learning done right: An approach to building successful products 𝐹1 𝑠𝑐𝑜𝑟𝑒 = 2 ∗ 1 1 𝑟𝑒𝑐𝑎𝑙𝑙 + 1 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛 = 2 ∗ 𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 ∗ 𝑟𝑒𝑐𝑎𝑙𝑙 𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 + 𝑟𝑒𝑐𝑎𝑙𝑙 [0, 1]
  • 24. Pablo Ribalta Lorenzo Choosing your metric: Summary Machine learning done right: An approach to building successful products • Like business requirements, choosing a good metric comes as result of understanding the needs and expectations of the model’s users • A model can be excellent in one metric, but very poor in others • Train using the metric you plan on judging the model with
  • 25. Pablo Ribalta Lorenzo Building your dataset Machine learning done right: An approach to building successful products
  • 26. Pablo Ribalta Lorenzo How much data can we collect? Machine learning done right: An approach to building successful products
  • 27. Pablo Ribalta Lorenzo How much data can we collect? Machine learning done right: An approach to building successful products
  • 28. Pablo Ribalta Lorenzo Dealing with data scarcity Machine learning done right: An approach to building successful products Medical records
  • 29. Pablo Ribalta Lorenzo Dealing with data scarcity Machine learning done right: An approach to building successful products Medical records Only few patients
  • 30. Pablo Ribalta Lorenzo Machine learning done right: An approach to building successful products Secret sauce: Data augmentation
  • 32. Pablo Ribalta Lorenzo Rotation Deformed Original 0° 45° 90°
  • 33. Pablo Ribalta Lorenzo Rotation Horizontal flip Deformed Original 0° 45° 90° Yes Yes Yes
  • 34. Pablo Ribalta Lorenzo Rotation Horizontal flip Deformed Original 0° 45° 90° Yes Yes Yes Vertical flip Yes Yes Yes YesYes Yes
  • 35. Pablo Ribalta Lorenzo Building your dataset: Summary Machine learning done right: An approach to building successful products • Many approaches to augmenting data • We must ensure that our dataset is balanced and correctly describes the data’s statistical distribution • Although not mentioned, splitting a dataset into Training, Validation and Test is fundamental for a correct training and evaluation of the results
  • 36. Pablo Ribalta Lorenzo Tuning your model Machine learning done right: An approach to building successful products
  • 37. Pablo Ribalta Lorenzo Hyperparameter optimisation Machine learning done right: An approach to building successful products
  • 38. Pablo Ribalta Lorenzo Hyperparameter optimisation Machine learning done right: An approach to building successful products
  • 39. Pablo Ribalta Lorenzo Hyperparameter optimisation Machine learning done right: An approach to building successful products
  • 40. Pablo Ribalta Lorenzo Hyperparameter optimisation Machine learning done right: An approach to building successful products
  • 41. Pablo Ribalta Lorenzo Machine learning done right: An approach to building successful products
  • 42. Pablo Ribalta Lorenzo Machine learning done right: An approach to building successful products Automatic hyper-parameter selection: Particle Swarm Optimization
  • 43. Pablo Ribalta Lorenzo Machine learning done right: An approach to building successful products Automatic hyper-parameter selection: Particle Swarm Optimization
  • 44. Pablo Ribalta Lorenzo Machine learning done right: An approach to building successful products • Pablo Ribalta Lorenzo, Jakub Nalepa, Michal Kawulok, Luciano Sanchez Ramos, and José Ranilla Pastor. 2017. Particle swarm optimization for hyper-parameter selection in deep neural networks. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '17). ACM, New York, NY, USA, 481-488. • Pablo Ribalta Lorenzo, Jakub Nalepa, Luciano Sanchez Ramos, and José Ranilla Pastor. 2017. Hyper-parameter selection in deep neural networks using parallel particle swarm optimization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '17). ACM, New York, NY, USA, 1864-1871. When possible, go automatic
  • 45. Tuning your model: Summary • Hyper-parameter optimization is probably the most time-consuming aspect of building a Machine Learning product • We need to be confident that our selected settings will carry over well to the majority of cases • Use automatic approaches when possible
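To make "go automatic" concrete, here is a minimal sketch of a textbook global-best particle swarm optimiser. It is not the algorithm from the publications cited in this deck. The function `pso_minimise`, its default coefficients, and the `toy_validation_error` objective (a stand-in for training a network and measuring validation error) are all assumptions made for illustration.

```python
import random


def pso_minimise(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimise `objective` over a box given as a list of (low, high) bounds."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide (global) best.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_validation_error(params):
    """Hypothetical objective with its minimum at log10(lr) = -3, dropout = 0.3."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


best_params, best_error = pso_minimise(
    toy_validation_error, bounds=[(-6.0, 0.0), (0.0, 0.9)])
print(best_params, best_error)
```

In a real setting each call to the objective would train and validate a model, so the number of particles and iterations is bounded by the compute budget.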
  • 46. Comparing your results
  • 52. Surpassing human performance in medical classification • Typical human performance: 3% error • Typical doctor performance: 1% error • Experienced doctor performance: 0.7% error • Team of experienced doctors performance: 0.5% error • What is human performance?
  • 53. [Figure: four panels labelled F1 score = 0.817, F1 score = 0.845, F1 score = 0.545 and F1 score = 0.801]
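For reference, an F1 score like the ones quoted on this slide can be computed from a predicted mask and its ground truth as sketched below. The flattened 0/1 mask representation and the example values are assumptions for illustration; they are not the data behind the figures above.

```python
def f1_score(prediction, ground_truth):
    """F1 of two equally long binary masks given as flat sequences of 0/1."""
    pairs = list(zip(prediction, ground_truth))
    tp = sum(1 for p, g in pairs if p and g)        # true positives
    fp = sum(1 for p, g in pairs if p and not g)    # false positives
    fn = sum(1 for p, g in pairs if not p and g)    # false negatives
    if tp == 0:
        # Convention chosen here: no true positives means a score of 0.
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical 8-pixel example: 2 true positives, 1 false positive, 1 false negative.
ground_truth = [1, 1, 1, 0, 0, 0, 0, 0]
prediction = [1, 1, 0, 1, 0, 0, 0, 0]
score = f1_score(prediction, ground_truth)
print(round(score, 3))
```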
  • 55. Comparing with the state of the art • Superpixel segmentation algorithm • 3x state-of-the-art performance for single stage lesions • 2x state-of-the-art performance for multiple stage lesions
  • 57. Comparing your results: Summary • Comparing with human performance is hard, and most of the time the comparison can be misleading • We should strive for statistically significant results across different subsets of our data • Comparing with the state of the art is always a good idea, but we must ensure a fair comparison
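One way to check significance across subsets is a paired permutation (sign-flip) test on per-subset scores. The sketch below swaps in this test as an illustration; the talk does not say which test was used, and the scores are invented for the example. It enumerates every sign pattern, so it is only practical for a small number of subsets.

```python
from itertools import product


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact sign-flip test on paired per-subset scores.

    Returns the fraction of sign patterns whose absolute summed difference
    is at least as large as the observed one (the p-value).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        n_total += 1
        permuted = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point noise does not hide exact ties.
        if permuted >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical F1 scores of two methods on the same 8 data subsets.
ours = [0.82, 0.85, 0.79, 0.81, 0.84, 0.80, 0.83, 0.86]
baseline = [0.80, 0.81, 0.78, 0.79, 0.80, 0.77, 0.82, 0.83]
p_value = paired_permutation_test(ours, baseline)
print(p_value)
```

Because both methods are scored on the same subsets, the test is paired, which is also part of keeping the comparison with the state of the art fair.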
  • 58. About us
  • 59. ECONIB in numbers • 18 months ongoing • 8 publications • Featured in social media • Healthcare and research partnership • NVIDIA Inception member • Still more research in progress
  • 60. Conclusions • Building ML products is possible with a rigorous scientific approach • Maximising the performance of our model is a nuanced process that requires a thorough understanding of the problem and the theory behind it • It is not only about the model, but also about what surrounds it
  • 61. Pablo Ribalta Lorenzo Machine learning done right An approach to building successful ML projects pribalta@future-processing.com www.future-processing.pl