Data Science | Design | Technology
https://www.meetup.com/DSDTMTL
April 28, 2021
Please don't forget to mute yourself
JL Maréchaux
DSDT Co-Organizer
(Google Montreal)
Simon Dagenais
Lead Data Scientist
Snitch AI
Agenda
3:45 - 4:00 Arrival & Networking 
4:00 - 4:15 News & Intro
4:15 - 5:15 How to QA your ML models
5:15 - 5:30 Virtual Snack & Networking
DSDT Meetup - April 28, 2021
A special thanks to our contributors…
Thanks
Merci
The (virtual) venue sponsor & snacks
The brains
...
Virtual Meetups
Until we can do in-person events
again in Montreal…
Past (and future) presentations
available on Slideshare.
http://www.slideshare.net/DSDT_MTL
Survey: http://bit.ly/DSDTsurvey2021
Which topics should be considered for 2021 meetups (select all that apply)
Monthly cadence, on Wednesdays.
Incredible sessions already planned for May, June and July.
Contact us with your expectations & ideas.
What is coming in 2021
Topics: ML Validation, Reinforcement Learning, Explainable AI, RNN & Time Series
Dates: April 28, May 26, June 16, July 21
Your ideas,
your meetup.
http://bit.ly/DSDTsurvey2021
"Autonomous navigation of stratospheric balloons
using reinforcement learning"
Google Brain
May 26
4:00 pm - 5:30 pm
Based on a paper published in Nature in December 2020
It's time to start a new
collaboration and give
back to the community.
Our donations will help
fight against poverty
and social exclusion.
Let's build a stronger
Greater Montreal
together.
Data Science.
Design.
Technology.
More information soon…
How to QA your ML models
Simon Dagenais
The genesis of an AI system
The failure of an AI system
The end of an AI system
How could we have prevented that?
● By making sure the model’s performance would not degrade once in production
● By earning management's trust and willingness to pursue the effort
Why are there no systematic QA approaches in ML?
After all, ML models are:
● Subject to unexpected inputs
● Built to work alongside other software components
● Expected to be consistent, reliable and usable
How should we perform QA on ML models?
● We should uncover and understand the model's core functions
● We should gain insight into how the model responds to altered inputs
● We should also constantly validate the input to our model
An efficient framework for validation
● Deriving feature explainability
● Testing robustness to random and targeted alterations of the data
● Detecting data drift
● Other tests
Feature explainability related tests (1)
Risk
Errors due to a complex data pipeline, with data coming from multiple sources and APIs
Test
Check whether many features are unimportant in creating the prediction
Action
Prune the model and the dataset
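One common way to measure how much each feature matters is permutation importance: shuffle one feature at a time and record how much the error grows. The sketch below is only an illustration under stated assumptions; `toy_model`, the synthetic data and the `threshold` value are hypothetical and do not come from the talk.

```python
import random

def mean_abs_error(model, rows, targets):
    """Average absolute error of `model` over a dataset."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_abs_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model: relies on features 0 and 1 and ignores feature 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]

data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
threshold = 0.01  # assumed cut-off; it has to be tuned for each problem
prune_candidates = [j for j, imp in enumerate(importances) if imp < threshold]
print(importances, prune_candidates)
```

Features listed in `prune_candidates` are candidates for removal from both the dataset and the model, which shortens the pipeline that can introduce errors.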
Feature explainability related tests (2)
Risk
The model learned erroneous and non-replicable patterns
Test
Check whether features that are weakly correlated with the target, or have no causal relationship to it, contribute strongly to the output
Action
● Adversarial training
● Data augmentation
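The correlation half of this test can be sketched by comparing each feature's Pearson correlation with the target against its importance score. Everything in the example is an assumption made for illustration: the feature names, the importance values (which would normally come from an explainability method) and both thresholds. A causal check cannot be automated this way and still needs domain knowledge.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

def suspicious_features(columns, target, importances,
                        max_corr=0.2, min_importance=0.3):
    """Features the model leans on although they barely correlate with the target."""
    flagged = []
    for name, values in columns.items():
        weak = abs(pearson(values, target)) < max_corr
        if weak and importances[name] > min_importance:
            flagged.append(name)
    return flagged

# Hypothetical data: `record_id` carries no signal about the target.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = {
    "usage_hours": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "record_id": [3.0, 1.0, 2.0, 2.0, 1.0, 3.0],
}
# Hypothetical importance scores, e.g. taken from an explainability tool.
importances = {"usage_hours": 0.4, "record_id": 0.6}

flagged = suspicious_features(columns, target, importances)
print(flagged)
```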
Feature explainability related tests (3)
Risk
Concept drift
Test
Check for changes in feature importance over time
Action
● Model re-training
● Learning changes
● Pre-processing
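A change in feature importance over time can be tracked by comparing normalized importance scores between two periods. This is a minimal sketch; the two periods, the feature names, the scores and the 0.15 threshold are all hypothetical.

```python
def normalize(importances):
    """Rescale importance scores so that they sum to 1."""
    total = sum(importances.values())
    return {name: value / total for name, value in importances.items()}

def importance_shift(reference, current):
    """Change in each feature's share of total importance between two periods."""
    ref, cur = normalize(reference), normalize(current)
    return {name: cur[name] - ref[name] for name in ref}

# Hypothetical importance scores computed on two time windows.
january = {"age": 0.50, "income": 0.30, "region": 0.20}
june = {"age": 0.20, "income": 0.30, "region": 0.50}

shift = importance_shift(january, june)
max_shift = 0.15  # assumed tolerance before raising an alert
flagged = sorted(name for name, delta in shift.items() if abs(delta) > max_shift)
print(shift, flagged)
```

A flagged feature suggests that the relationship the model learned may have changed, which is a cue to consider the actions listed above.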
Robustness to random and targeted noise
Risk
The model’s output varies widely in response to slight variations in input.
Test
Evaluate the model’s performance under random or targeted transformations of the input
Action
Data augmentation, adversarial training
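The random half of this test can be sketched by adding Gaussian noise of increasing strength to the inputs and tracking accuracy. The classifier, the data and the noise levels below are hypothetical; targeted (adversarial) perturbations need a model-specific attack and are not covered here.

```python
import random

def accuracy(model, rows, labels):
    """Share of rows for which the model predicts the expected label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def add_gaussian_noise(rows, sigma, rng):
    """Return a copy of `rows` with N(0, sigma) noise added to every value."""
    return [[x + rng.gauss(0.0, sigma) for x in r] for r in rows]

def robustness_curve(model, rows, labels, sigmas, seed=0):
    """Accuracy of the model at each noise level in `sigmas`."""
    rng = random.Random(seed)
    return {s: accuracy(model, add_gaussian_noise(rows, s, rng), labels)
            for s in sigmas}

# Hypothetical classifier: predicts 1 when the two features sum to more than 0.
def toy_classifier(row):
    return int(row[0] + row[1] > 0)

data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(500)]
labels = [toy_classifier(r) for r in rows]

curve = robustness_curve(toy_classifier, rows, labels, sigmas=[0.0, 0.1, 0.5])
print(curve)
```

A steep drop in accuracy at small noise levels is the symptom described under Risk.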
Data drift
Risk
The distribution of incoming data differs from that of the training data
Test
Evaluate whether the distribution of each feature is similar in the training data and the production data
Action
Re-train on non-drifting features; train on the data most similar to in-production input (the most recent)
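One way to compare a feature's distribution between training and production is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. The talk does not name a specific test, so this choice is an assumption, as are the synthetic data and the 5% significance level. The critical value uses the usual large-sample approximation.

```python
import math
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    max_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / len(a) - j / len(b)))
    return max_gap

def ks_critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the two-sample KS test."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))

# Hypothetical feature values: production data shifted by 0.5 versus training.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(1000)]
production_stable = [rng.gauss(0.0, 1.0) for _ in range(1000)]
production_drifted = [rng.gauss(0.5, 1.0) for _ in range(1000)]

critical = ks_critical_value(len(training), len(production_drifted))
ks_stable = ks_statistic(training, production_stable)
ks_drifted = ks_statistic(training, production_drifted)
drift_detected = ks_drifted > critical
print(ks_stable, ks_drifted, critical, drift_detected)
```

Running the check per feature shows which features drift, which is the information needed to re-train on the non-drifting ones.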
Other tests
● Data Leakage
● Model Simplification
● Overfitting
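Two of these checks are simple enough to sketch. The example covers only one form of data leakage (identical rows present in both the training and test sets) and measures overfitting as the gap between training and test scores. The rows, the scores and the 0.10 tolerance are hypothetical.

```python
def leaked_rows(train_rows, test_rows):
    """Test rows that also appear, value for value, in the training set."""
    train_set = {tuple(r) for r in train_rows}
    return [r for r in test_rows if tuple(r) in train_set]

def overfitting_gap(train_score, test_score):
    """Difference between training and test scores; a large gap hints at overfitting."""
    return train_score - test_score

# Hypothetical datasets: one test row duplicates a training row.
train_rows = [[1, 0.2], [2, 0.5], [3, 0.9]]
test_rows = [[2, 0.5], [4, 0.1]]
leaked = leaked_rows(train_rows, test_rows)

# Hypothetical accuracy scores for the same model.
gap = overfitting_gap(train_score=0.98, test_score=0.81)
max_gap = 0.10  # assumed tolerance
overfitting_suspected = gap > max_gap
print(leaked, gap, overfitting_suspected)
```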
The alternate fate of an AI system
The data science team builds a robust model.
On top of that, stakeholders understand:
● On what basis the model makes its predictions
● The risk associated with using the model
● That proper due diligence was conducted by the team
Automated scientific validation for your
ML models in a few clicks, without the
need to become an expert.
Questions?
P.S.: We’re hiring DS!
Simon Dagenais
Merci / Thank You
@DsdtMtl
(Check for next DSDT meetup at https://www.meetup.com/DSDTmtl)
http://bit.ly/dsdtmtl-in
