4. Pablo Ribalta Lorenzo
Building a Machine Learning product
Machine learning done right: An approach to building successful products
• Choosing your metric
• Building your dataset
• Tuning your parameters
• Comparing your results
23. Pablo Ribalta Lorenzo
Ultimate goal: Single metric
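$$
F_1 = 2 \cdot \frac{1}{\frac{1}{\mathrm{recall}} + \frac{1}{\mathrm{precision}}} = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \qquad F_1 \in [0, 1]
$$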
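As a small worked example of the F1 formula, the sketch below computes precision, recall and F1 from confusion counts and checks that the harmonic-mean form and the product form agree. The function name and the counts are hypothetical, chosen only for illustration.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return F1 from true positives, false positives and false negatives."""
    if tp == 0:
        # With no true positives, precision or recall is zero (or undefined),
        # so F1 is conventionally reported as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Product form of the formula.
    product_form = 2 * precision * recall / (precision + recall)
    # Harmonic-mean form of the same formula.
    harmonic_form = 2 / (1 / recall + 1 / precision)
    assert abs(product_form - harmonic_form) < 1e-12
    return product_form


# Hypothetical counts: 8 correct detections, 2 false alarms, 4 misses.
score = f1_from_counts(tp=8, fp=2, fn=4)
print(f"F1 = {score:.3f}")
```

Here precision is 0.8 and recall is about 0.667, so F1 sits between the two but closer to the lower one, which is the point of using a harmonic mean as a single metric.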
24. Pablo Ribalta Lorenzo
Choosing your metric: Summary
• Like business requirements, a good metric comes from
understanding the needs and expectations of the
model’s users
• A model can be excellent on one metric but very poor on others
• Train using the metric you plan to judge the model with
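To illustrate how a model can look excellent on one metric and very poor on another, here is a minimal sketch on a made-up, heavily imbalanced dataset. The data and the "always predict negative" model are assumptions for illustration, not results from the talk.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    # Recall is defined as 0 here when there are no positive samples at all.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical dataset: 95 healthy cases (0) and 5 sick cases (1).
y_true = [0] * 95 + [1] * 5
# A trivial model that always predicts "healthy".
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, recall = {rec:.2f}")
```

The trivial model reaches 95% accuracy while finding none of the positive cases, which is why the metric has to reflect what the model's users actually need.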
26. Pablo Ribalta Lorenzo
How much data can we collect?
29. Pablo Ribalta Lorenzo
Dealing with data scarcity
Medical records: only a few patients
30. Pablo Ribalta Lorenzo
Secret sauce: Data augmentation
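The slide does not say which augmentations were used, so the following is only a sketch of the general idea under simple assumptions: an image is a list of pixel rows, and flips and a 90-degree rotation are assumed to preserve the label. That assumption does not hold for every medical imaging task and should be checked per problem. All function names are hypothetical.

```python
def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img):
    """Return the original image plus three transformed copies."""
    return [img, hflip(img), vflip(img), rot90(img)]


# A tiny 2x2 "image" stands in for a real scan.
image = [[1, 2],
         [3, 4]]
augmented = augment(image)
print(len(augmented), "samples from 1 original")
```

Each original sample yields four training samples here; real pipelines typically add random crops, intensity changes or elastic deformations on top.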
35. Pablo Ribalta Lorenzo
Building your dataset: Summary
• There are many approaches to augmenting data
• We must ensure that our dataset is balanced and correctly
describes the data’s statistical distribution
• Although not mentioned earlier, splitting the dataset into training,
validation and test sets is fundamental to correct training and
evaluation of the results
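One way to combine the last two points is a stratified split, which keeps the class proportions roughly equal across the training, validation and test sets. The sketch below is a minimal pure-Python version; the function name, the 60/20/20 fractions and the label counts are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains of this class goes to the test set.
        test += indices[n_train + n_val:]
    return train, val, test


# Hypothetical labels: 80 negatives and 20 positives.
labels = [0] * 80 + [1] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class rather than over the whole dataset avoids ending up with a validation or test set that contains almost no positive cases.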
36. Pablo Ribalta Lorenzo
Tuning your model
42. Pablo Ribalta Lorenzo
Automatic hyper-parameter selection: Particle Swarm Optimization
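For readers unfamiliar with the technique, here is a minimal sketch of a basic global-best particle swarm optimizer. It is a textbook variant, not the method from the author's papers, and the objective is a toy stand-in for a real validation loss: in practice each evaluation would train and validate a network. The names `pso` and `toy_validation_loss`, the search bounds and the coefficients `w`, `c1`, `c2` are all assumptions for illustration.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside box `bounds` with a global-best swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]      # and the loss it achieved there
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # best position of the whole swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_validation_loss(params):
    """Toy stand-in: pretend the best setting is log10(lr) = -3, dropout = 0.3."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


best_params, best_loss = pso(toy_validation_loss,
                             bounds=[(-5.0, -1.0), (0.0, 0.9)])
print(best_params, best_loss)
```

Because every objective call is independent within an iteration, the evaluations can be run in parallel, which matters when each one is a full training run.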
44. Pablo Ribalta Lorenzo
• Pablo Ribalta Lorenzo, Jakub Nalepa, Michal Kawulok, Luciano Sanchez Ramos, and José
Ranilla Pastor. 2017. Particle swarm optimization for hyper-parameter selection in deep
neural networks. In Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO '17). ACM, New York, NY, USA, 481-488.
• Pablo Ribalta Lorenzo, Jakub Nalepa, Luciano Sanchez Ramos, and José Ranilla Pastor.
2017. Hyper-parameter selection in deep neural networks using parallel particle swarm
optimization. In Proceedings of the Genetic and Evolutionary Computation Conference
Companion (GECCO '17). ACM, New York, NY, USA, 1864-1871.
When possible, go automatic
45. Pablo Ribalta Lorenzo
Tuning your model: Summary
• Hyper-parameter optimization is probably the most
time-consuming aspect of building a Machine Learning product
• We need to be confident that our selected settings will
translate well to the majority of cases
• Use automatic approaches when possible
52. Pablo Ribalta Lorenzo
Surpassing human performance in medical classification
• Typical human performance: 3% error
• Typical doctor performance: 1% error
• Experienced doctor performance: 0.7% error
• Team of experienced doctors performance: 0.5% error
What is human performance?
53. Figure (four panels): F1 score = 0.817, F1 score = 0.845, F1 score = 0.545, F1 score = 0.801
55. Pablo Ribalta Lorenzo
Comparing with the state of the art
• Superpixel segmentation algorithm
3x state-of-the-art performance for single-stage lesions
2x state-of-the-art performance for multiple-stage lesions
57. Pablo Ribalta Lorenzo
Comparing your results: Summary
• Comparing with human performance is hard and, most of the
time, can be misleading
• We have to strive for statistically significant results
across different subsets of our data
• Comparing with the state of the art is always a good idea, but
we must ensure a fair comparison
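One simple way to check significance across subsets is a paired permutation (sign-flip) test on per-subset scores of two models evaluated on the same subsets. This is a sketch of one possible test, not the procedure used in the talk; the function name and the per-subset F1 scores are hypothetical.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided p-value for the mean paired difference under random sign flips."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical F1 scores of two models on the same eight data subsets.
ours = [0.82, 0.85, 0.80, 0.84, 0.83, 0.81, 0.86, 0.82]
baseline = [0.78, 0.80, 0.79, 0.80, 0.79, 0.78, 0.81, 0.80]

p_value = paired_permutation_test(ours, baseline)
print(f"p = {p_value:.4f}")
```

Evaluating both models on identical subsets is also part of keeping the comparison fair: a difference measured on different data splits cannot be attributed to the models alone.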
59. Pablo Ribalta Lorenzo
ECONIB in numbers
• 18 months ongoing
• 8 publications
• Featured in social media
• Healthcare and research partnership
• NVIDIA Inception member
• Still more research in progress
60. Pablo Ribalta Lorenzo
Conclusions
• Building ML products is possible with a rigorous scientific approach
• Maximising the performance of our model is a nuanced process that
requires a thorough understanding of the problem and the theory
behind it
• It is not only about the model, but also about what’s around it
61. Pablo Ribalta Lorenzo
Machine learning done right
An approach to building successful ML projects
pribalta@future-processing.com
www.future-processing.pl