Introduction to Machine Learning & Data Analytics
Shishir Choudhary

Introduction to machine learning and deep learning

If you are curious what ML is all about, this is a gentle introduction to Machine Learning and Deep Learning. It covers why you would use ML, Data Analytics and Deep Learning, gives an intuitive understanding of how they work, and looks at some models in detail. It closes with some useful resources to get started.

Published in: Data & Analytics

  1. Introduction to Machine Learning & Data Analytics, Shishir Choudhary
  2. Why ML?
     Inputs to an ML prediction system:
     • What happened with the user at today's meeting?
     • What the user has bought in the past
     • What happened at the user's house this morning?
     • What people with similar interests bought
     • What the user has recently shown interest in through search
     You want to make a good decision, and:
     • You can't ask a human to make the decision because of the time and dollar cost at the scale of the problem
     • Your current decisions are not good anyway, because they are random, hard-coded or based on complex rules
     • You want to reduce cost or verify your own decision
     Making a good decision is neither easy nor perfect, but ML makes it less error-prone than simpler, older alternatives.
  3. Why Data Analytics?
     Understanding your business/industry/users better.
     You want to improve your business, industry or users, e.g. What segments of users do you have, and how well does each segment like your product? How well are different players in the industry doing in the different product segments they sell? Which products are growing fast?
     It helps to remove bias and opinion and gives more factual insights. This is also called data-driven decision making.
     It typically involves visualisation, clustering and, at times, regression (ML).
     Sometimes this means analysing a "big" amount of data, typically from the wider world, and sometimes a "small" amount of data from your own company.
  4. Why deep learning?
     • Better results with large amounts of data
     • Very high dimensions in data, or time series / sequence data, e.g. image, voice, text, time series
     • End to end vs part of workflow
  5. What is ML = Optimisation Problem
     Given past data, what product should be recommended to the user?
     Inputs to the prediction system (imagine 20 inputs):
     • Parameter that can't be obtained
     • What the user has bought in the past
     • Parameter that can't be obtained
     • What people with similar interests bought
     • What the user has recently shown interest in through search
     Each input carries a weight for the decision: 0.2? 0.3? 0.5?
     ML = optimise the weights using past purchase data.
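The weighted decision on this slide can be sketched as a weighted sum. This is a minimal illustration, not the author's system: the signal names, their values, and the pairing of each weight (0.2, 0.3, 0.5) with a particular input are assumptions made for the example.

```python
# Hypothetical signals for one candidate product, each scaled to [0, 1].
signals = {
    "past_purchases": 0.9,
    "similar_users": 0.4,
    "recent_searches": 0.7,
}

# The three weights shown on the slide. Which input gets which weight is an
# assumption here; in ML these values are learned from past purchase data
# instead of being set by hand.
weights = {
    "past_purchases": 0.2,
    "similar_users": 0.3,
    "recent_searches": 0.5,
}


def score(signals, weights):
    """Return the weighted sum of the input signals."""
    return sum(weights[name] * value for name, value in signals.items())


recommendation_score = score(signals, weights)
```

"Optimising the weights" then means searching for the weight values whose scores best match what users actually bought.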
  6. What is ML = Inductive Reasoning Problem
     • Guess the function
     • f(1) = 1
     • f(2) = 4
     • f(3) = 9
     • f(4) = 16
     • What is f(x)?
     This involves a leap of faith. Whether training generalises well is the crucial question.
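One way to see the leap of faith: more than one function agrees with all four observations. The sketch below compares the obvious guess with a rival function that is made up purely for illustration; both fit the data perfectly yet disagree on the next input.

```python
# The four observations from the slide.
observations = {1: 1, 2: 4, 3: 9, 4: 16}


def square(x):
    """The obvious guess: f(x) = x^2."""
    return x ** 2


def rival(x):
    """A made-up alternative that also fits the data.

    The extra product term is zero at x = 1, 2, 3 and 4, so this function
    agrees with square() on every observation.
    """
    return x ** 2 + (x - 1) * (x - 2) * (x - 3) * (x - 4)


fits_square = all(square(x) == y for x, y in observations.items())
fits_rival = all(rival(x) == y for x, y in observations.items())

# Both hypotheses explain the training data, but they diverge on unseen input.
prediction_gap_at_5 = rival(5) - square(5)
```

The training data alone cannot tell the two apart, which is why generalisation has to be checked on data the model has not seen.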
  7. Alternative = Deductive Reasoning = Rules-Based System
     A simple attempt at predicting gender by rules:
     • Men => short hair
     • Women => medium/long hair
     • Arun is a man => Arun has short hair
     Quite quickly, rules and their exceptions can become too complex to manage.
  8. Most Popular ML Problem Statements
     • Classification
       • Given a lot of examples, learn to predict function(Input) -> Output, where the output is one of a fixed set of categories
       • E.g. spam or not
       • Good vs bad investment
       • Image classification / labelling
     • Regression
       • Given a lot of examples, learn to predict function(Input) -> Output, where the output is a continuous value
       • House price estimation
       • Pricing a stock
       • Time for food to arrive in a food delivery app
  9. Classification or Regression?
     [Two charts against house size (80 sqm to 140 sqm): "House Pricing - Classification", with price bands $100-120k, $120-140k, $140-160k and $160-180k as Class 1 to Class 4, and "House Pricing - Regression", with price as a continuous value (axis ticks 100 to 160).]
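The difference between the two framings of house pricing can be sketched in a few lines. The price bands come from the slide; the linear size-to-price rule is an assumption chosen only so the example lines up with the axis ticks, not a real pricing model.

```python
# Price bands from the slide, in thousands of dollars.
PRICE_BANDS = [
    (100, 120, "Class 1"),
    (120, 140, "Class 2"),
    (140, 160, "Class 3"),
    (160, 180, "Class 4"),
]


def predict_price_k(size_sqm):
    """Regression framing: return a continuous price in $k.

    Assumed rule for illustration only: 80 sqm -> $100k, 140 sqm -> $160k.
    """
    return size_sqm + 20.0


def predict_price_class(size_sqm):
    """Classification framing: return one of a fixed set of price bands."""
    price = predict_price_k(size_sqm)
    for low, high, label in PRICE_BANDS:
        if low <= price < high:
            return label
    return None  # price falls outside every known band


regression_output = predict_price_k(95)          # a number
classification_output = predict_price_class(95)  # a category
```

The same input produces a number in one framing and a label in the other; which framing fits depends on what the decision needs.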
  10. Crucial Challenges
     • Bias error, variance error and irreducible error
     ML models try to balance bias against variance. Some models have more bias by default and some have more variance by default. With some parameters, this default leaning can be adjusted.
     DL can reduce both bias and variance error, given a huge amount of data and a high number of features/parameters.
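For squared-error loss, the three error terms named on this slide add up as shown below. The notation is an assumption introduced for this note: the data is taken to follow y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model learned from a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The last term does not depend on the model, which is why no choice of model can remove it.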
  11. Crucial Challenges
     • Curse of dimensionality
       • The amount of training data needed increases exponentially with the number of features used for prediction
       • Hence dimensionality reduction techniques
     • Bad and missing data
       • A lot of real-world data is of poor quality, with wrong or missing values. Good data is therefore a valuable asset today.
     • Feature engineering
       • In ML you need to select the right features to use for prediction. This is a tough problem, solved partly by domain knowledge and partly by data analysis.
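A rough way to see the exponential growth behind the curse of dimensionality: split every feature into a fixed number of bins and count the cells needed to cover all combinations. The choice of 10 bins per feature is an arbitrary assumption for illustration.

```python
def cells_to_cover(bins_per_feature, n_features):
    """Number of distinct cells when every feature is split into equal bins."""
    return bins_per_feature ** n_features


# With 10 bins per feature, coverage needs 10x more cells for each added feature.
growth = [cells_to_cover(10, d) for d in (1, 2, 3, 6)]
```

If each cell needs at least a few training examples, the data requirement grows at the same rate, which is the motivation for dimensionality reduction.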
  12. ML vs DL
     [Chart: accuracy (0 to 100) against training data size for ML, small DL and large DL models.]
     With small amounts of training data, ML, small DL models and large DL models all perform quite similarly. With larger amounts, DL is better able to make use of the additional training data. The latest image-recognition DL models achieve error rates lower than those of humans.
  13. Models!
     "Model" is a fancy word for a mathematical function that is used to generalise from training data.
  14. Prediction Optimisation
     Most learning algorithms work in some variation of the following steps:
     1. Based on the inputs and the initial model parameters, make a prediction
     2. Check what the actual answer was and calculate the error
     3. Try to minimise the error by adjusting the model's parameters in the direction that reduces the error
     4. Repeat until convergence (the error stops reducing and is low)
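The four steps on this slide can be sketched with plain gradient descent on a one-parameter model. This is one common instance of the loop, not the only one; the toy data, the learning rate and the stopping threshold are assumptions chosen so the example converges quickly.

```python
# Toy training data generated by y = 2x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def mean_squared_error(w):
    """Steps 1 and 2: predict w * x for every input and measure the error."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial model parameter
learning_rate = 0.01  # assumed step size
previous_error = mean_squared_error(w)

for step in range(1000):
    # Step 3: move the parameter in the direction that reduces the error.
    w -= learning_rate * gradient(w)
    error = mean_squared_error(w)
    # Step 4: stop once the error has stopped improving.
    if previous_error - error < 1e-12:
        break
    previous_error = error
```

After the loop, `w` sits very close to 2, the value that generated the data.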
  15. Regression
     Makes the simplifying assumption that the prediction has a linear (or low-degree polynomial, e.g. quadratic or cubic) relation to the input(s).
     E.g. height of a plant by age, or the time required to deliver food in a food app.
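As a minimal sketch of the linear case, the plant-height example can be fitted with ordinary least squares in closed form. The ages and heights below are made-up numbers for illustration, not measurements from the talk.

```python
# Hypothetical measurements: plant age in weeks and height in centimetres.
ages_weeks = [1.0, 2.0, 3.0, 4.0, 5.0]
heights_cm = [2.1, 3.9, 6.2, 8.1, 9.7]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages_weeks, heights_cm)

# Use the fitted line to predict the height at an age not in the data.
predicted_height_week_6 = slope * 6.0 + intercept
```

The fitted slope is the estimated growth per week; the linear assumption is what lets five points stand in for every other age.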
  16. Decision Trees
  17. Perceptron
  18. Neural Network
  19. Good resources
     • Udacity (completely) free Machine Learning course
     • Josh Gordon's YouTube videos on Introduction to ML
     • Libraries
       • Scikit Learn in Python - Machine Learning
       • Tensorflow - Deep Learning
       • Gensim - Natural Language Processing
     • http://learney.sg/#machine learning
     • http://learney.sg/#data science
     • http://learney.sg/#deep learning
     • http://learney.sg/#natural language processing
  20. Summary & Q&A
