Machine Learning
In Real Life
Richard Ackon
@esquire_gh
About Me
● Software Engineer @ mPharma
● Technical Reviewer, Data Science @ Packt
● Writer @ AnalyticsVidhya
Disclaimer
I’m not going to talk about:
● What machine learning is
● Applications of machine learning
● What algorithms to use
● What Frameworks or libraries to use
What I will talk about
● The stuff that most tutorials don’t cover
○ Version Control
○ Testing
○ Performance Metrics
○ Reproducibility
○ Going to Production
○ Ethics
● Lessons learned from building and deploying models to production
● My 2 pesewas on how to get started
Overly Simplified Diagram
Image by: Jade Abbott, @alienelf
Version Control
Recording changes to certain components of the machine learning process so you can recall
specific versions later.
What to Version?
● The general idea is to try versioning anything that requires iteration and continuous
improvement
● Most important components to version
○ Code
○ Data
○ Models
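A minimal sketch of the idea, assuming each component can be read as bytes: a content hash serves as a version identifier, and a small manifest records which code, data, and model belong together. The names `fingerprint` and `build_manifest` are hypothetical, and dedicated versioning tools do far more than this.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return a short SHA-256 content hash to use as a version identifier."""
    return hashlib.sha256(payload).hexdigest()[:12]


def build_manifest(code: bytes, data: bytes, model: bytes) -> dict:
    """Record the version of each component that went into one experiment."""
    return {
        "code": fingerprint(code),
        "data": fingerprint(data),
        "model": fingerprint(model),
    }


# Illustrative stand-ins for real files (hypothetical contents).
manifest_v1 = build_manifest(b"def train(): ...", b"age,label\n31,1\n", b"weights-v1")
manifest_v2 = build_manifest(b"def train(): ...", b"age,label\n31,1\n45,0\n", b"weights-v2")

# Components whose hash differs between the two runs.
changed = [name for name in manifest_v1 if manifest_v1[name] != manifest_v2[name]]

print(json.dumps(manifest_v1, indent=2))
print("changed components:", changed)
```

Storing the manifest next to each trained model makes it possible to tell later which data and code produced it.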
Versioning tools
Testing
● Data cleaning, modelling, and deployment are all done with code, so treat them as code: test them.
● Test your Data if possible
An idea for the brave
● Continuous Integration for data
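One way to test data, sketched under assumptions: the columns `age` and `label` and their valid ranges are hypothetical, and `validate_rows` is an illustrative helper rather than part of any library. A check like this can run in a CI job every time a new batch of data arrives.

```python
def validate_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for i, row in enumerate(rows):
        age = row.get("age")
        if age is None:
            problems.append(f"row {i}: missing age")
        elif not 0 <= age <= 120:
            problems.append(f"row {i}: age {age} out of range")
        if row.get("label") not in (0, 1):
            problems.append(f"row {i}: label must be 0 or 1")
    return problems


good = [{"age": 34, "label": 1}, {"age": 58, "label": 0}]
bad = [{"age": -3, "label": 1}, {"age": None, "label": 2}]

print(validate_rows(good))
for problem in validate_rows(bad):
    print(problem)
```

An empty list means the batch passed; anything else can fail the build before the data reaches training.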
Performance Metrics
It’s always good to have one number that tells you how good your model is.
But,
In some cases, you need to select your evaluation metric based on some amount of
domain expertise.
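A small illustration of why the choice of metric matters, using made-up labels: on an imbalanced dataset, a model that always predicts the majority class scores well on accuracy while finding none of the positive cases. Whether recall, precision, or something else is the right single number depends on the domain.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives that the model actually found."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 5 positive cases out of 100.
y_true = [1] * 5 + [0] * 95
# A "model" that always predicts the majority class.
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))  # 0.95
print("recall:", recall(y_true, y_pred))      # 0.0
```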
Reproducibility
The ability to replicate a data science experiment using the same data and code running in the
same environment, producing the same results.
“Non-reproducible single occurrences are of no significance to science.” - Karl Popper
Reproducibility - How
Docker
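A container pins the environment; the code itself also has to be deterministic. A minimal sketch, assuming the only source of randomness is a shuffle before the train/test split (`run_experiment` is a hypothetical stand-in for a real experiment):

```python
import random


def run_experiment(seed: int, n: int = 10):
    """Shuffle n sample indices with a fixed seed and split them 80/20."""
    rng = random.Random(seed)  # local generator, independent of global state
    indices = list(range(n))
    rng.shuffle(indices)
    split = int(0.8 * n)
    return indices[:split], indices[split:]


first = run_experiment(seed=42)
second = run_experiment(seed=42)

# Same seed, same code, same environment: identical split.
print(first == second)
```

Recording the seed alongside the code and data versions lets someone else rerun the experiment and get the same split.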
Interpretability
Going to Production
Production means getting your application used by its intended audience in a real-world
situation.
Requirements:
● Accessibility
● Performance
● Fault Tolerance
● Scalability
● Maintenance
ETHICS
Some General
Lessons Learned
Not everything is a machine learning problem
Sometimes bad data is all you have
……… and that’s OK!
Choosing ML libraries and Frameworks
● Focus on people over tools
● Think of stability in production
● If you’re still tied, follow the crowd
Choosing Deep Learning Architectures
● A good place to start: Research papers
● General Advice: Try to overfit, and add regularization to generalize
My 2 pesewas on how to get started in ML/DS
● Understand what data science is and how it can be used
● Learn the basics
○ Data Science from Scratch (O’Reilly)
○ Doing Data Science by Cathy O’Neil and Rachel Schutt
● Work on projects
○ Kaggle
○ Zindi
● Read other people’s work
○ Paperswithcode
○ Medium
○ ArXiv
● Attend events like this and continue solving more problems
● Learn the rest as you go
Thank You!

Day 2 (Lecture 5): A Practitioner's Perspective on Building Machine Product in Real Life
