Enterprise Machine Learning
Governance
7th August 2019
About me: Terence Siganakis
• BSc (Computer Science) / MSc (Bioinformatics)
• Former: Cancer research at Peter Mac (Genome analysis)
• Former: CTO of Gooroo (ASX: GOO)
• Current: CEO of Growing Data
• Data Science / Engineering consultancy
• 20 Data Scientists, Software Engineers
• Work with Enterprises like ANZ, CSL, DHHS, Metricon
Enterprise Machine Learning Governance
• Governance & why it matters
• Machine Learning in the enterprise
• Well Governed Machine Learning
• Machine Learning Governance Architecture
Governance
(& why it matters)
Governance
Governance encompasses the system by which an organisation
is controlled and operates, and the mechanisms by which it, and
its people, are held to account.
Governance Institute of Australia
Data Governance: Why bother?
• Reputation
• Data breaches
• Legislation
• Privacy Act
• GDPR, Sarbanes-Oxley
• Regulation
• APRA, ASIC, etc
ML Governance: Why bother?
It’s the price of admission into
meaningful problems
(& it makes development easier and faster)
Machine Learning
in the Enterprise
Machine Learning → Decisions
• Predictions relate to decisions
• What movie should I recommend? (Video store clerk)
• What should their credit rating be? (Credit analyst)
• Often these decisions were made by people you could
train and (if need be) fire.
• Someone owns the risk & is accountable
• ML lets us make decisions at scale
• Based on more data, much faster (larger impact of failures)
• What happens when it all goes wrong?
• Who owns the risk?
• Impact on future projects / buy in for ML generally
Machine Learning Risks
• Inaccurate predictions
• Poorly performing models when deployed
• “Good” predictions gone bad
• Models that perform well, but are
problematic
• Bias based on protected features
• Sexist, Racist
• Feedback loops
• Predictions based on bias reinforce bias
Enterprise Machine Learning
• Enterprises are risk averse
• More to lose
• Regulatory risk, Reputational risk, Financial risk
• Enterprises have more checks and balances
• More people to convince the solution works
• More people to convince the solution won’t break
• More people to convince the solution won’t get them fired!
• Who owns the risk?
• Who gets fired if there is a problem?
Well Governed
Machine Learning
Goals for ML Governance:
Deployments should be:
• Testable
• Reliable
• Monitored
Training should be:
• Reproducible
• Traceable
• Explainable
• Documented
Reproducible
It should be possible to easily regenerate a model and its
predictions from the same source data
• “But it works on my laptop” is never acceptable
• Track source code versions (git SHA), package versions using Docker
• Track the data that was used to train the model
• Requires storing a lot of “versioned” data
• Store random seeds!
It makes debugging easier! (I can reproduce the broken model, locally)
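One way to act on the points above is to write a small manifest alongside every trained model. The sketch below is illustrative only: the function names, the placeholder git SHA and the toy "training" step are assumptions, not part of the talk. It records the code version, a hash of the training data, the random seed and the interpreter version.

```python
import hashlib
import json
import random
import sys


def data_fingerprint(raw: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact training data."""
    return hashlib.sha256(raw).hexdigest()


def build_manifest(git_sha: str, raw_data: bytes, seed: int) -> dict:
    """Collect what is needed to regenerate a training run."""
    return {
        "git_sha": git_sha,                          # source code version
        "data_sha256": data_fingerprint(raw_data),   # training data version
        "seed": seed,                                # random seed
        "python": sys.version.split()[0],            # interpreter version
    }


def train(raw_data: bytes, seed: int) -> list:
    """Hypothetical stand-in for training: a seeded shuffle of the rows."""
    rows = raw_data.decode().splitlines()
    rng = random.Random(seed)  # local generator, so no hidden global state
    rng.shuffle(rows)
    return rows


# "abc1234" is a placeholder; in practice it would come from the repository.
manifest = build_manifest("abc1234", b"a,1\nb,2\nc,3\n", seed=42)
print(json.dumps(manifest, indent=2))
```

Storing the manifest next to the model artefact means a broken model can be regenerated locally from the same code, data and seed.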
Traceable
It should be possible to track the origins of data, and all the
processing steps which have been applied to the data.
• Micro-service based architectures lead to large numbers of data silos
• Data pipelines / ETL are increasingly complex
• Broken pipelines lead to Broken models
• Identifying the origin of data related errors is extremely time consuming
It makes debugging easier! (I can track issues back to systems)
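A minimal sketch of traceability, assuming data moves through the pipeline as in-memory rows: every dataset carries a lineage tuple naming its source system and each processing step applied to it. The class name `Traced`, the source name `crm.customers` and the step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Traced:
    """Rows plus the ordered list of steps that produced them."""
    rows: Tuple[dict, ...]
    lineage: Tuple[str, ...] = ()

    @classmethod
    def from_source(cls, source: str, rows: Iterable[dict]) -> "Traced":
        # The first lineage entry always names the originating system.
        return cls(rows=tuple(rows), lineage=(f"source:{source}",))

    def apply(self, step: str, fn: Callable[[List[dict]], List[dict]]) -> "Traced":
        # Return a new object so earlier stages keep their own lineage.
        return Traced(rows=tuple(fn(list(self.rows))), lineage=self.lineage + (step,))


raw = Traced.from_source("crm.customers", [{"income": 50000}, {"income": None}])
clean = raw.apply(
    "drop_null_income",
    lambda rows: [r for r in rows if r["income"] is not None],
)
scaled = clean.apply(
    "income_to_thousands",
    lambda rows: [{"income": r["income"] / 1000} for r in rows],
)
print(scaled.lineage)
```

When a model misbehaves, the lineage attached to its training data points back to the system and step where the problem may have been introduced.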
Explainable
It should be easy to understand why the model made a
certain prediction
• Management & Regulators are often not trusting people
• Visualization makes it easier to communicate why a new model is better
than an old one
• Ensure visualizations are outputs of model training
It makes debugging easier! (I can see where it went wrong)
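For a linear scoring model, an explanation can be as simple as listing each feature's contribution (weight times value) to the score. The sketch below assumes such a model; the feature names and weights are made up for illustration, and non-linear models would need a different attribution technique.

```python
def explain_linear(weights: dict, bias: float, features: dict):
    """Return the score and per-feature contributions, largest effect first."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = bias + sum(contributions.values())
    # Rank by absolute size so the most influential features come first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return score, ranked


# Hypothetical credit-style example.
weights = {"income": 0.5, "missed_payments": -2.0}
features = {"income": 4.0, "missed_payments": 3.0}
score, ranked = explain_linear(weights, bias=1.0, features=features)

for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
print(f"score: {score:+.2f}")
```

Emitting this ranking (or a chart of it) as an output of training gives reviewers something concrete to inspect for each prediction.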
Documented
ML Models are inherently complex and need to be
documented so that they can be understood by others.
• Compliance: Document that the process is being followed
• Education: A colleague should be productive relying only on docs
• Versioned: Documentation needs to be living (and history is useful)
• Service Level Objectives
• How long should it take to train?
• How long should predictions take?
• What level of accuracy, across what measures is acceptable?
• Service Level Indicators
• Metrics for whether or not we are meeting SLOs
It makes debugging easier! (I can get up to speed faster)
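Service Level Objectives can live in the repository as versioned, machine-checkable definitions rather than only in prose. The following is a sketch under assumed names and thresholds (none of them come from the talk): each SLO states a target, and measured SLIs are compared against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A documented objective for one measurable indicator."""
    name: str
    threshold: float
    higher_is_better: bool

    def is_met(self, sli_value: float) -> bool:
        if self.higher_is_better:
            return sli_value >= self.threshold
        return sli_value <= self.threshold


def evaluate(slos, slis: dict) -> dict:
    """Map each SLO name to whether its measured SLI meets the objective."""
    return {slo.name: slo.is_met(slis[slo.name]) for slo in slos}


# Hypothetical objectives: training time, prediction latency, accuracy.
slos = [
    SLO("training_minutes", threshold=60, higher_is_better=False),
    SLO("p95_latency_ms", threshold=200, higher_is_better=False),
    SLO("accuracy", threshold=0.90, higher_is_better=True),
]
# Hypothetical measurements from the latest release.
slis = {"training_minutes": 45, "p95_latency_ms": 250, "accuracy": 0.93}

report = evaluate(slos, slis)
print(report)
```

Because the definitions are code, changes to an objective show up in version history alongside the model changes that prompted them.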
Testable
Each machine learning model should have a suite of tests,
ensuring that it not only scores well, but is resistant to bias
and can handle extreme values
• Performance testing against truly unseen data
• Validation of inputs
• Checks to ensure they are not biased
• Checks to prevent feedback loops
It makes debugging easier! (I can isolate errors, prevent regressions)
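Two of the checks above can be sketched as plain functions that a test suite calls: input validation against expected ranges, and a simple bias check. The bias measure used here, the gap in positive-prediction rate between groups, is one possible choice among many and is an assumption of this example, as are the field names and bounds.

```python
def validate_input(record: dict, bounds: dict) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for name, (low, high) in bounds.items():
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not (low <= value <= high):
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    return problems


def positive_rate_gap(predictions, groups) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


# Hypothetical bounds and predictions for illustration.
bounds = {"age": (0, 120), "income": (0, 10_000_000)}
problems = validate_input({"age": 250}, bounds)
print(problems)

predictions = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = positive_rate_gap(predictions, groups)
print(gap)
```

A test can then assert that the gap stays under an agreed limit, so a regression in fairness fails the build in the same way as a regression in accuracy.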
Reliable
Predictions should be reliable in the face of unseen data,
even where the unseen data is hostile
• ML Models may handle unseen / improbable data poorly (Black Swans)
• Consider adversarial examples which may lead to poor predictions
• The envelope of reliability should be well understood, and predictions
outside of it should be avoided
Fewer edge cases to consider! (Predictions are only made when confident)
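A crude way to make the envelope explicit is to record the range of each feature seen in training and abstain whenever an input falls outside it. This is a sketch under that assumption; per-feature min/max ignores interactions between features, and the `Envelope` class, the toy model and the data are all hypothetical.

```python
class Envelope:
    """Per-feature min/max bounds observed in the training data."""

    def __init__(self, training_rows):
        names = training_rows[0].keys()
        self.bounds = {
            n: (min(r[n] for r in training_rows), max(r[n] for r in training_rows))
            for n in names
        }

    def contains(self, row: dict) -> bool:
        return all(lo <= row[n] <= hi for n, (lo, hi) in self.bounds.items())


def guarded_predict(model, envelope: Envelope, row: dict):
    """Return the model's prediction, or None to abstain outside the envelope."""
    return model(row) if envelope.contains(row) else None


def toy_model(row: dict) -> bool:
    """Hypothetical model: approve when income exceeds a fixed cut-off."""
    return row["income"] > 50


training = [{"age": 20, "income": 30}, {"age": 60, "income": 90}]
envelope = Envelope(training)

inside = guarded_predict(toy_model, envelope, {"age": 30, "income": 70})
outside = guarded_predict(toy_model, envelope, {"age": 95, "income": 70})
print(inside, outside)
```

Abstentions can be routed to a person or a fallback rule, which keeps the model out of situations it was never trained for.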
Monitored
Machine Learning models need to be monitored to ensure
that their performance meets expectations
• The performance of models will change over time
• Have usage patterns / source data changed?
• Has a dependent system changed?
• Has data processing changed?
It makes debugging easier! (I can see when the model broke and why)
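One simple monitoring signal is how far the live mean of a feature has moved from its training baseline, measured in baseline standard deviations. The sketch below assumes that measure and an alert threshold of 3, both chosen for illustration; real deployments often use richer drift statistics.

```python
import statistics


def drift_score(baseline, live) -> float:
    """Shift of the live mean from the baseline mean, in baseline std devs."""
    sigma = statistics.stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variance; drift score is undefined")
    return abs(statistics.mean(live) - statistics.mean(baseline)) / sigma


def has_drifted(baseline, live, threshold: float = 3.0) -> bool:
    """Flag drift when the score exceeds the (assumed) alert threshold."""
    return drift_score(baseline, live) > threshold


# Hypothetical feature values at training time and in production.
baseline = [10, 12, 11, 13, 9]
steady = [11, 12, 10]
shifted = [20, 21, 19]

print(has_drifted(baseline, steady))
print(has_drifted(baseline, shifted))
```

Logging the score per feature over time helps show when the inputs changed, which narrows down whether usage patterns, an upstream system or data processing is responsible.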
Machine Learning
Governance
Architecture
Governance Architecture
Architecture: Development
Architecture: Build & Test
Architecture: Train, Tune & Test
Architecture: Deploy & Monitor
Improved Governance is a Journey
• Improve governance with each release
• Create more controls
• Create more test cases
• Improve monitoring
• Improve process compliance
Thank you!
Please reach out to me for a coffee if you would like
to discuss further
• terence@growingdata.com.au
• https://growingdata.com.au
