Machine Learning
Overview
Let’s attempt a definition ...
“Algorithms for inferring unknowns from knowns”
What type of inference are we talking about?
Statistical Inference
Where do I spot Machine Learning?
● Spam Identification
● Handwriting Recognition
● Image Recognition
● Speech Recognition
● Recommendation Systems
● Climate Modelling
Can I group these applications into abstract categories?
● Supervised Learning
● Unsupervised Learning
Supervised Learning
● Classification
● Regression
Unsupervised Learning
● Clustering
● Density Estimation
● Dimensionality Reduction
More abstract categories ...
● Semi-supervised Learning
● Active Learning
● Reinforcement Learning
Generative vs Discriminative Models
Generative models contrast with discriminative models: a generative model is a full probabilistic
model of all variables, whereas a discriminative model models only the target variable(s)
conditional on the observed variables.
Discriminative model: uses P(y|x)
Generative model: uses P(x,y)
P(x,y) = P(x|y) * P(y) = f(x|y) * P(y)
       = P(y|x) * P(x) = P(y|x) * f(x)
Here f is written in place of P where x is continuous and has a density.
Thus a generative model can be used, for example, to simulate (i.e. generate) values of any variable in
the model, whereas a discriminative model allows only sampling of the target variables conditional on
the observed quantities.
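The two factorizations of P(x,y) can be checked on a small discrete example. This is a minimal sketch: the joint table and the helper names (`marginal_x`, `cond_y_given_x`, and so on) are made up for illustration and are not from the slides.

```python
from fractions import Fraction

# Hypothetical joint table P(x, y) for x in {0, 1} and y in {0, 1}.
# Exact fractions are used so the equalities below hold without rounding.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}


def marginal_x(x):
    """P(x): sum the joint over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def marginal_y(y):
    """P(y): sum the joint over all values of x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def cond_y_given_x(y, x):
    """P(y|x): the quantity a discriminative model works with."""
    return joint[(x, y)] / marginal_x(x)


def cond_x_given_y(x, y):
    """P(x|y): the class-conditional part of a generative model."""
    return joint[(x, y)] / marginal_y(y)


# Both factorizations recover the same joint probability.
for (x, y), p in joint.items():
    assert cond_x_given_y(x, y) * marginal_y(y) == p
    assert cond_y_given_x(y, x) * marginal_x(x) == p
```

The joint table gives access to every marginal and conditional, which is why a generative model can simulate any variable; P(y|x) alone does not determine P(x).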
Generative and Discriminative in Classification
Generative model:
Typically more flexible than discriminative models in expressing dependencies in complex
learning tasks.
More powerful in that it models all variables, not only the target.
Estimating densities takes a lot of data and the densities can be hard to model, so performance on
the prediction task can be worse.
Examples: Naive Bayes, Hidden Markov Model
Discriminative model:
For tasks such as classification and regression that do not require the joint distribution,
discriminative models can yield superior performance.
Examples: Linear Regression, Logistic Regression
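As an illustration of the generative route to classification, here is a minimal sketch that assumes one-dimensional inputs and a Gaussian density f(x|y) for each class. The Gaussian assumption, the toy data, and the function names are all illustrative choices, not part of the slides.

```python
import math
import random
import statistics


def fit_generative(data):
    """Estimate P(y) and a Gaussian f(x|y) for each class from (x, y) pairs."""
    n = len(data)
    model = {}
    for label in sorted({y for _, y in data}):
        xs = [x for x, y in data if y == label]
        model[label] = {
            "prior": len(xs) / n,             # P(y)
            "mean": statistics.fmean(xs),     # mean of f(x|y)
            "std": statistics.pstdev(xs),     # std of f(x|y)
        }
    return model


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def posterior(model, x):
    """P(y|x) via Bayes' rule: f(x|y) * P(y), normalized over y."""
    joint = {
        y: p["prior"] * gaussian_pdf(x, p["mean"], p["std"])
        for y, p in model.items()
    }
    total = sum(joint.values())
    return {y: v / total for y, v in joint.items()}


def sample(model, rng):
    """Generate a new (x, y) pair: draw y from P(y), then x from f(x|y)."""
    labels = list(model)
    weights = [model[label]["prior"] for label in labels]
    y = rng.choices(labels, weights=weights)[0]
    x = rng.gauss(model[y]["mean"], model[y]["std"])
    return x, y


# Toy data: class 0 clusters around 2, class 1 around 8.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
model = fit_generative(data)
```

`posterior(model, x)` does the classification, while `sample(model, random.Random(0))` produces synthetic data, something a model of P(y|x) alone cannot do.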
k Nearest Neighbour
D = {(x1,y1); (x2,y2); …; (xn,yn)}
where each xi belongs to R^d and each yi is 0 or 1 // binary classification.
Classifies a new point x by majority vote of the k nearest points in D.
Requires a distance metric d(xi, xj), for example Euclidean distance.
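The majority-vote rule above can be sketched in a few lines. This is a brute-force illustration with Euclidean distance; the function names and the toy dataset are assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(D, x, k, dist=euclidean):
    """Classify x by majority vote among the k points of D nearest to x.

    D is a list of (point, label) pairs. Sorting the whole dataset costs
    O(n log n) per query, which is fine for a sketch but not for large n.
    """
    neighbours = sorted(D, key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy dataset in R^2 with binary labels.
D = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
    ((5.0, 5.0), 1), ((5.0, 6.0), 1), ((6.0, 5.0), 1),
]
```

With binary labels, an odd k avoids tied votes; this sketch does not handle ties in any principled way.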
Probabilistic Interpretation
For some fixed parameter k,
Y is a random variable whose pmf is defined as
P(y) = P(y | x, D) = fraction of points xi in Nk(x) such that yi = y,
where Nk(x) is the set of the k points in D nearest to x.
y_est = arg max over y of P(y | x, D)
This is a discriminative model, as we don’t have any distribution for generating x.
The parameter k controls the bias-variance trade-off (small k: low bias, high variance; large k: the
reverse) and should be chosen by cross-validation or a similar technique.
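The pmf and the choice of k can be sketched as follows. Leave-one-out cross-validation is used here as one possible cross-validation technique; it is an assumption of this example, as are the function names and the toy dataset.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_pmf(D, x, k):
    """P(y | x, D): fraction of the k nearest neighbours of x with label y."""
    neighbours = sorted(D, key=lambda pair: euclidean(pair[0], x))[:k]
    labels = [label for _, label in neighbours]
    return {y: labels.count(y) / len(labels) for y in (0, 1)}


def knn_estimate(D, x, k):
    """y_est: the label that maximizes P(y | x, D)."""
    pmf = knn_pmf(D, x, k)
    return max(pmf, key=pmf.get)


def loo_error(D, k):
    """Leave-one-out error: predict each point from all the other points."""
    mistakes = 0
    for i, (x, y) in enumerate(D):
        rest = D[:i] + D[i + 1:]
        if knn_estimate(rest, x, k) != y:
            mistakes += 1
    return mistakes / len(D)


def choose_k(D, candidates):
    """Pick the candidate k with the lowest leave-one-out error.

    On ties, the earliest candidate in the list wins.
    """
    return min(candidates, key=lambda k: loo_error(D, k))


# Toy dataset in R^2 with binary labels.
D = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
    ((5.0, 5.0), 1), ((5.0, 6.0), 1), ((6.0, 5.0), 1),
]
```

On this toy data, k = 5 is too large: once a point is held out, its own class is outnumbered among the remaining five points, so every held-out point is misclassified. That is the high-bias end of the trade-off in miniature.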

Editor's Notes

  • #4 Talk about F = m* (dv/dt) and black box systems. We can’t describe the system in complete mathematical detail. Hence, statistical inference.
  • #6 Start with a mathematical formulation and then proceed ahead.
  • #7 Classification: symptoms and diseases. Regression: property rates and year built.
  • #8 Clustering: shopping verbs and user clusters. Density Estimation: data and Gaussian kernels. Dimensionality Reduction: it’s completely unsupervised; you try to view the data in lower dimensions and find a perfect viewpoint for the shadow.
  • #9 Semi-supervised Learning: Active Learning: uncertainty sampling, query by committee. A lot of it depends on the problem, the kind of data available, and the costs associated with getting the data.