Data engineering track module 2

Data Engineering Track
Pranav Prakash
Module 2

Ensuring your model stays relevant
• ML Life cycle

• Edge Deployment

• Model Feedback

• Online Learning

• AB Experiment

https://towardsdatascience.com/the-machine-learning-lifecycle-in-2021-473717c633bc

Production Grade Tools
• MLFlow, DVC

• Kubeflow

Model Feedback Loop
• Types

• Interactive, Reinforcement

• Direct, Hidden

• Biases in Models (or Ghosts in Machines)

• Ethnicity, Gender examples

Edge Deployments
• Mobile

• iOS

• Android

• RPi, Arduino

• Industrial AI

• Automotive

Mobile Devices
• iOS - CoreML, coremltools python package

• Android - TF Lite, PyTorch

• ML Kit - On Device ML from Google

• Top Considerations

• Energy

• Resources

• Real Time

• Internet/Connectivity

Architecture
• Using Messaging Systems

• ML Model versioning etc
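
One way to picture the two bullets above together is to put the model version inside every message, so downstream consumers can group feedback per version. The sketch below is only illustrative: `queue.Queue` stands in for a real messaging system, and `PredictionEvent`, `publish`, `consume_by_version`, and the field names are hypothetical, not taken from any specific tool.

```python
import json
import queue
from dataclasses import dataclass, asdict


@dataclass
class PredictionEvent:
    """One prediction, tagged with the model version that produced it (hypothetical schema)."""
    session_id: str
    model_name: str
    model_version: str
    features: dict
    prediction: float


def publish(bus, event):
    """Serialise the event to JSON and put it on the bus (stand-in for a broker producer)."""
    bus.put(json.dumps(asdict(event)))


def consume_by_version(bus):
    """Drain the bus and group the decoded events by model version."""
    by_version = {}
    while not bus.empty():
        record = json.loads(bus.get())
        by_version.setdefault(record["model_version"], []).append(record)
    return by_version


if __name__ == "__main__":
    bus = queue.Queue()  # assumption: in-process queue instead of a real message broker
    publish(bus, PredictionEvent("s-1", "ranker", "1.4.0", {"query_len": 3}, 0.71))
    publish(bus, PredictionEvent("s-2", "ranker", "1.5.0-rc1", {"query_len": 5}, 0.64))
    for version, events in consume_by_version(bus).items():
        print(version, len(events))
```

Carrying the version in the payload, rather than inferring it from deployment time, keeps feedback attributable even when two versions serve traffic side by side.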

AB Testing
• Controlled randomised experiment to establish causality

• How does a model contribute to “business objective”?

• Define “Overall Evaluation Criterion”

AB Testing Architecture
• Define experiment parameters (sample size, duration)

• Power (1 − probability of a false negative) and significance level (probability of a false positive)

• Search Session = <User, Query, (Time Window)>

• Associate UUID with each session

• Split sessions between baseline (85-95%) and experimental models.

• Capture feedback
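
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a production design: the sample size uses the standard normal approximation for comparing two proportions with equal-sized arms (so with an uneven split it is only a rough guide for the smaller arm), and the names `sample_size_per_arm`, `assign_arm`, the salt string, and the 10% experimental share are hypothetical.

```python
import hashlib
import math
import uuid
from statistics import NormalDist


def sample_size_per_arm(p_baseline, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate sessions per arm for a two-sided two-proportion z-test.

    alpha is the level (false positive rate); power is 1 - false negative rate.
    """
    p1 = p_baseline
    p2 = p_baseline + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def new_session_id():
    """Associate a UUID with each <user, query, time window> search session."""
    return str(uuid.uuid4())


def assign_arm(session_id, experiment_share=0.10, salt="search-ranker-exp"):
    """Deterministically map a session to 'baseline' or 'experimental'.

    Hashing the salted session id gives a stable, roughly uniform bucket in [0, 1),
    so the same session always sees the same model.
    """
    digest = hashlib.sha256(f"{salt}:{session_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return "experimental" if bucket < experiment_share else "baseline"


if __name__ == "__main__":
    print("sessions per arm:", sample_size_per_arm(0.10, 0.02))
    session_id = new_session_id()
    print(session_id, "->", assign_arm(session_id))
```

Changing the salt per experiment reshuffles the buckets, which helps avoid the same sessions always landing in the experimental group across experiments.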

Lessons learned
• More data or better data

• Simple models are better than complex, but complex models are sometimes needed

• Biases in data

• Evaluation approach

• ML Engineering & Data Science

• https://www.wired.com/story/the-toxic-potential-of-youtubes-feedback-loop/

• https://win-vector.com/2014/05/03/a-clear-picture-of-power-and-significance-in-ab-tests/

• https://signalvnoise.com/posts/3004-ab-testing-tech-note-determining-sample-size

• https://abtestguide.com/abtestsize/


• https://twitter.com/xpranavprakash

• x@pranavprakash.in
