Introduction to MLOps in Daily Apps
Google: Trust and Safety
To prevent phishing @ scale with Data Analytics and ML
Proprietary + Confidential
Today’s Agenda
1. [Summary] What is Machine Learning?
2. Why ML Ops + Case study?
3. ML Ops Implementation
4. TensorFlow Extended (TFX)
5. Q&A
What is Machine Learning?
Machine Learning
Computational methods using experience to improve performance
Machine Learning
Using computers and data to achieve an objective
● Computer → Algorithms, complexity analysis, theoretical guarantees
● Data analysis → Statistics, probability
● Achieve objective → Understanding the problem, simulation, evaluation, etc.
Real World Impact
What are the applications of AI and ML?
Important Resources (Teachable Machine)
https://teachablemachine.withgoogle.com/train/image
Street View
[Figure: Street View image annotated with street name, street number, sign, business facade, business name, traffic light, and traffic sign]
Google Translate
Frame the Problem
What is your goal?
Who are your stakeholders?
How do you add value to them?
ML Pipeline
Data Collection + Preprocessing
Model Training and Evaluation
Machine Learning Operations (MLOps)
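The pipeline stages above can be sketched end to end in a few lines. This is only a toy illustration, not the pipeline used in the talk: the data, the single feature (number of links in an email), and the threshold "model" are all hypothetical stand-ins.

```python
from statistics import mean

def collect():
    # Hypothetical toy data: (links_in_email, is_phishing). None marks a missing value.
    return [(0, 0), (1, 0), (2, 0), (1, 0), (7, 1), (9, 1),
            (None, 1), (8, 1), (6, 1), (3, 0)]

def preprocess(rows):
    # Preprocessing: drop rows whose feature is missing.
    return [(x, y) for x, y in rows if x is not None]

def train(rows):
    # Training: place the decision threshold halfway between the class means.
    phishing = mean(x for x, y in rows if y == 1)
    benign = mean(x for x, y in rows if y == 0)
    return (phishing + benign) / 2

def evaluate(threshold, rows):
    # Evaluation: accuracy of "predict phishing when x > threshold".
    hits = sum((x > threshold) == bool(y) for x, y in rows)
    return hits / len(rows)

rows = preprocess(collect())
train_rows, test_rows = rows[:6], rows[6:]  # simple ordered split for the sketch
threshold = train(train_rows)
accuracy = evaluate(threshold, test_rows)
print(threshold, accuracy)
```

Everything after evaluation (pushing, validating, and monitoring the model) is what the MLOps stage adds on top of this.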
Why ML Ops?
ML Ops
Operating real ML for real use cases
1. Model Push
2. Model Validation
3. Monitoring/Anomaly Detection
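As a rough sketch of how these three steps can fit together, the snippet below pushes a candidate model only if it does at least as well as the current one on a holdout set, then flags an anomaly when the share of positive predictions drifts from what is expected. The function names, the `min_gain` and `tolerance` parameters, and the threshold values are assumptions made for illustration.

```python
def accuracy(model, examples):
    # Fraction of (feature, label) pairs the model labels correctly.
    return sum(model(x) == y for x, y in examples) / len(examples)

def validate_and_push(candidate, production, holdout, min_gain=0.0):
    # Model validation + push: serve the candidate only if it is not worse
    # than the production model by the required margin.
    cand_acc = accuracy(candidate, holdout)
    prod_acc = accuracy(production, holdout)
    if cand_acc >= prod_acc + min_gain:
        return candidate, cand_acc
    return production, prod_acc

def is_anomalous(predictions, expected_rate, tolerance=0.2):
    # Monitoring: flag when the positive-prediction rate strays from expectation.
    rate = sum(predictions) / len(predictions)
    return abs(rate - expected_rate) > tolerance

# Hypothetical models: flag an email as phishing above a link-count threshold.
production = lambda x: int(x > 7)
candidate = lambda x: int(x > 4)
holdout = [(1, 0), (2, 0), (6, 1), (8, 1)]

serving, serving_acc = validate_and_push(candidate, production, holdout)
print(serving is candidate, serving_acc)
```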
End Goal of ML Values
Implement Ideas to Reality
“More than 87% of data
science projects never make it
into production”
Source: VentureBeat
End Goal of ML Values
Implement Ideas to Reality (Source)
Organization - Managing different ecosystems, such as programming languages
Compute - Continuous and reliable compute resources (dedicated servers or a cloud-only option)
Portability - Heavy dependencies on legacy systems
Seasonality - ML workloads run in patches; this needs autoscaling capabilities
End Goal of ML Values
Implement to Reality
End Goal of ML Values
Paying Technical Debt
Technical debt is the ongoing cost of expedient decisions made during code implementation. It is
caused by shortcuts taken to give short-term benefit for earlier software releases and faster
time-to-market. Technical debt tends to compound.
Deferring the work to pay it off results in increasing costs, system brittleness, and reduced rates
of innovation. – D. Sculley, in “Hidden Technical Debt in Machine Learning Systems”
End Goal of ML Values
Paying Technical Debt
1. Entanglement
Changing Anything Changes Everything (CACE)
2. Complex pipelines
Many different pipelines interlinked across multiple disciplines
3. Hidden feedback loops
ML models affect reality, and reality affects ML models.
4. Fragile data dependencies
Upstream and downstream dependencies (e.g., a data schema error)
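One way to catch the data schema errors mentioned above is to check each incoming record against an expected schema before it reaches the model. The sketch below is a minimal hand-rolled check; the field names and types in `EXPECTED_SCHEMA` are hypothetical.

```python
# Hypothetical schema for one training record.
EXPECTED_SCHEMA = {"url": str, "num_links": int, "label": int}

def schema_errors(record, schema=EXPECTED_SCHEMA):
    # Return a list of human-readable problems; an empty list means the record fits.
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            got = type(record[field]).__name__
            errors.append(f"{field}: expected {expected_type.__name__}, got {got}")
    # Fields the upstream producer added without telling downstream consumers.
    for field in sorted(record.keys() - schema.keys()):
        errors.append(f"unexpected field: {field}")
    return errors

good = {"url": "http://example.com", "num_links": 3, "label": 0}
bad = {"url": "http://example.com", "num_links": "3"}  # upstream changed a type
print(schema_errors(good))
print(schema_errors(bad))
```

Failing fast at the boundary keeps an upstream change from silently degrading a downstream model.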
End Goal of ML Values
You don’t realize it exists.
ML Ops Implementation
Data
ML Model
Dev Ops
ML Ops = Data + ML Model + Dev Ops
Data
Data Eng Pipeline:
● Extract
● Transform
● Load
Data Prep:
● Wrangling
● Cleaning
Statistical Treatment:
● Data Distribution
● Sampling
● Concept Drift
ML Model
Training:
● Information theory
● Latency
● Interpretability
Testing:
● Feedback Loops
● Data Leakage
● Evaluation metrics
Tuning:
● Optimizer
● Assumptions
Dev Ops
Continuous dev:
● Automated testing
● Version Control
● Deployment environments
● Config Control
Continuous integration:
● Version Manager
● Package Manager
Monitoring:
● Log streaming (Kafka)
● Audit
● Dashboard Viz
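The Data Distribution and Concept Drift items listed under Statistical Treatment can be monitored with a simple distribution comparison. The sketch below computes the Population Stability Index (PSI) between a baseline sample and a current sample. Note that PSI measures a shift in the input distribution, which is only a proxy for concept drift, and the bin edges and the 0.25 alert threshold (a common rule of thumb) are assumptions.

```python
import math

def psi(baseline, current, edges):
    # Population Stability Index between two samples over fixed bin edges.
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v >= e for e in edges)] += 1  # index of the bin v falls in
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

baseline = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical feature values at training time
shifted = [6, 7, 8, 9, 7, 8, 9, 6]    # hypothetical values seen in production
edges = [3, 6]                        # assumed bin boundaries

score = psi(baseline, shifted, edges)
print("drift suspected" if score > 0.25 else "stable")
```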
But Wait
Data + ML Model + Dev Ops + Compliance
Compliance
Security Testing:
● Penetration Testing
● Data Mart Testing
● Malware protection
Social Engineering Protection:
● Employee Training
● Access Control
Threat Handling:
● Monitoring
● Kill switch
● Dashboard Viz
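The kill switch listed under Threat Handling can be as simple as a wrapper that stops routing requests to the model and falls back to a safe default. This is a minimal sketch; the class name `GuardedModel` and the "manual review" fallback are hypothetical.

```python
class GuardedModel:
    # Wraps a model so operators can switch it off without a redeploy.
    def __init__(self, model, fallback):
        self.model = model
        self.fallback = fallback
        self.killed = False

    def kill(self):
        # Flip the switch: all later predictions use the fallback.
        self.killed = True

    def predict(self, x):
        if self.killed:
            return self.fallback(x)
        return self.model(x)

# Hypothetical usage: route everything to manual review once the switch is flipped.
guarded = GuardedModel(
    model=lambda x: "phishing" if x > 4 else "benign",
    fallback=lambda x: "manual review",
)
before = guarded.predict(9)
guarded.kill()
after = guarded.predict(9)
print(before, after)
```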
The solution: TensorFlow Extended (TFX)
Data + ML Model + Dev Ops + Compliance
● Data
● ML
● DevOps
● Metadata Store Audit
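To make the metadata store audit idea concrete, here is a toy analogue built on the standard-library `sqlite3` module. It is not TFX's ML Metadata API; the table layout, the component names, and the artifact URIs are all hypothetical. It records which component produced which artifact, so the lineage of a model can be traced backwards.

```python
import sqlite3

def open_store(path=":memory:"):
    # One row per component run: what went in and what came out.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "run_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "component TEXT NOT NULL, "
        "input_uri TEXT, "
        "output_uri TEXT, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn

def record_run(conn, component, input_uri, output_uri):
    cur = conn.execute(
        "INSERT INTO runs (component, input_uri, output_uri) VALUES (?, ?, ?)",
        (component, input_uri, output_uri),
    )
    conn.commit()
    return cur.lastrowid

def lineage(conn, output_uri):
    # Walk backwards from an artifact to the components that produced it.
    chain = []
    while output_uri is not None:
        row = conn.execute(
            "SELECT component, input_uri FROM runs WHERE output_uri = ?",
            (output_uri,),
        ).fetchone()
        if row is None:
            break
        chain.append(row[0])
        output_uri = row[1]
    return chain

store = open_store()
record_run(store, "ingest", None, "data/v1")
record_run(store, "train", "data/v1", "model/v1")
print(lineage(store, "model/v1"))
```

An audit question such as "which data produced this model?" then becomes a query rather than an investigation.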
The solution: TensorFlow Extended (TFX)
1. Dedicated workforce
a. separation of concerns
2. Fast information transfer
a. control
3. Proactive Reaction to Feedback
a. concept drift handling
My articles
Reach out to me :)
Medium : towardsdatascience.com/@vincentkernn
Linkedin : linkedin.com/in/vincenttatan/
Survey: tiny.cc/vincent-survey

Editor's Notes

  • #2 Welcome the audience. Introduce yourself. Tell them broadly what you are going to talk about. Transition to video.
  • #5 The material does not need to go too deep; an overview is enough. Because this is an intro to machine learning and the participants are beginners, the content should cover roughly: 1. What is machine learning? 2. Its uses 3. Types (supervised, unsupervised) 4. Supervised and unsupervised algorithms 5. An example application of each algorithm 6. The workflow of a machine learning project (e.g., data preprocessing, modelling, tuning, deployment, monitoring) 7. The most commonly used libraries 8. The basic skills that need to be prepared
  • #9 ML has already made a huge impact in the world especially in the areas of science and health care. ML is impacting almost every industry from Manufacturing to sales and Marketing and from Agriculture to Astronomy.
  • #10 The simple code I am going to talk about uses this material from Google Colab. In case you don’t know what Google Colab is, it is an impressive tool where you can run a GPU for free in an interactive notebook environment. So if you want to run your machine learning model quickly using TensorFlow, Keras, and more, but you don’t want to invest a lot, you can come to this environment. It is easy. If you are still unsure, let me know. For now, just know that we are using this training tutorial as our simple intro to CNNs.
  • #11 Google Maps has created Street View-style visual guides for step-by-step directions overlaid onto the real world, as viewed through the smartphone camera. Further, Google plans to integrate its Assistant, equipped with the computer vision platform Google Lens, into Maps. That way, you’ll be able to pan over a city street and see pop-ups highlighting restaurants and other locations in real time.
  • #12 Google is now offering offline downloads for its AI-powered translator. So if you don’t have unlimited data or you have a plan that doesn’t work internationally, you can now download neural machine translation from Google’s Android and iOS apps. Google Translate’s offline AI translations will first be available in 59 languages, including English, Arabic, Chinese, German, and Hindi, to name a few. They’ll take about 35MB per language, so they won’t use up too much of your device’s storage. Lower-specced phones should also be able to support the new update, as Google says it wants users in all markets to have access to the feature.
  • #15 5 real-world examples 4 Google products