Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Ajay Ramaseshan
MachinePulse.
Machine Learning & Real-world Applications
 Why Machine Learning?
 Supervised Vs Unsupervised learning.
 Machine learning methods.
 Real world applications.
Agen...
Why Machine Learning?
 Volume of data collected growing day by day.
 Data production will be 44 times greater in 2020 than in 2009.
 Every da...
 Supervised learning – class labels/ target variable known
Overview of Machine Learning Methods
 Unsupervised learning – no class labels provided, need to
detect clusters of similar items in the data.
Overview (Contd)
 Parametric Models:
Assumes prob. distribution for data, and
learn parameters from data
E.g. Naïve Bayes classifier, line...
 Generative model – learns model for generating data, given
some hidden parameters.
 Learns the joint probability distri...
 Classification – Supervised learning.
 Commonly used Methods for Classification –
Naïve Bayes
Decision tree
K neares...
 Regression – Predicting an output variable given input
variables.
 Algorithms used –
Ordinary least squares
 Partial ...
 Clustering:
Group data into clusters using similarity measures.
 Algorithms:
K-means clustering
 Density based EM alg...
 Naïve Bayes classifier
Assumes conditional independence among features.
 P(xi,xj,xk|C) = P(xi|C)P(xj|C)P(xk|C)
Naïve Ba...
 K-nearest neighbors:
Classifies the data point with the class of the k
nearest neighbors.
 Value of k decided using cro...
 Leaves indicate classes.
 Non-terminal nodes – decisions on attribute values
 Algorithms used for decision tree learni...
 Artificial Neural Networks
Modeled after the
human brain
 Consists of an input layer,
many hidden layers, and an
output...
Validation methods
 Cross validation techniques
K-fold cross validation
Leave one out Cross Validation
 ROC curve (fo...
Some real life applications of machine learning:
 Recommender systems – suggesting similar people on
Facebook/LinkedIn, s...
Wisconsin Breast Cancer dataset:
 Instances : 569
 Features : 32
 Class variable : malignant, or benign.
 Steps to cla...
Thank you!
Upcoming SlideShare
Loading in …5
×

Machine Learning and Real-World Applications

18,423 views

Published on

This presentation was created by Ajay, Machine Learning Scientist at MachinePulse, to present at a Meetup on Jan. 30, 2015. These slides provide an overview of widely used machine learning algorithms. The slides conclude with examples of real world applications.

Ajay Ramaseshan, is a Machine Learning Scientist at MachinePulse. He holds a Bachelors degree in Computer Science from NITK, Suratkhal and a Master in Machine Learning and Data Mining from Aalto University School of Science, Finland. He has extensive experience in the machine learning domain and has dealt with various real world problems.

Published in: Technology
  • Be the first to comment

Machine Learning and Real-World Applications

  1. 1. Ajay Ramaseshan MachinePulse. Machine Learning & Real-world Applications
  2. 2.  Why Machine Learning?  Supervised Vs Unsupervised learning.  Machine learning methods.  Real world applications. Agenda
  3. 3. Why Machine Learning?
  4. 4.  Volume of data collected growing day by day.  Data production will be 44 times greater in 2020 than in 2009.  Every day, 2.5 quintillion bytes of data are created, with 90 percent of the world's data created in the past two years.  Very little data will ever be looked at by a human.  Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.  Knowledge Discovery is NEEDED to make sense and use of data.  Machine Learning is a technique in which computers learn from data to obtain insight and help in knowledge discovery. Why Machine Learning?
  5. 5.  Supervised learning – class labels/ target variable known Overview of Machine Learning Methods
  6. 6.  Unsupervised learning – no class labels provided, need to detect clusters of similar items in the data. Overview (Contd)
  7. 7.  Parametric Models: Assumes prob. distribution for data, and learn parameters from data E.g. Naïve Bayes classifier, linear regression etc.  Non-parametric Models: No fixed number of parameters. E.g. K-NN, histograms etc. Parametric Vs Nonparametric
  8. 8.  Generative model – learns model for generating data, given some hidden parameters.  Learns the joint probability distribution p(x,y). e.g. HMM, GMM, Naïve Bayes etc.  Discriminative model – learns dependence of unobserved variable y on observed variable x.  Tries to model the separation between classes.  Learns the conditional probability distribution p(y|x). e.g. Logistic Regression, SVM, Neural networks etc. Generative Vs Discriminative
  9. 9.  Classification – Supervised learning.  Commonly used Methods for Classification – Naïve Bayes Decision tree K nearest neighbors Neural Networks Support Vector Machines. Classification
  10. 10.  Regression – Predicting an output variable given input variables.  Algorithms used – Ordinary least squares  Partial least squares  Logistic Regression  Stepwise Regression  Support Vector Regression Neural Networks Regression
  11. 11.  Clustering: Group data into clusters using similarity measures.  Algorithms: K-means clustering  Density based EM algorithm  Hierarchical clustering.  Spectral Clustering Clustering
  12. 12.  Naïve Bayes classifier Assumes conditional independence among features.  P(xi,xj,xk|C) = P(xi|C)P(xj|C)P(xk|C) Naïve Bayes
  13. 13.  K-nearest neighbors: Classifies the data point with the class of the k nearest neighbors.  Value of k decided using cross validation K-Nearest Neighbors
  14. 14.  Leaves indicate classes.  Non-terminal nodes – decisions on attribute values  Algorithms used for decision tree learning  C4.5  ID3  CART. Decision trees
  15. 15.  Artificial Neural Networks Modeled after the human brain  Consists of an input layer, many hidden layers, and an output layer.  Multi-Layer Perceptrons, Radial Basis Functions, Kohonen Networks etc. Neural Networks
  16. 16. Validation methods  Cross validation techniques K-fold cross validation Leave one out Cross Validation  ROC curve (for binary classifier)  Confusion Matrix Evaluation of Machine Learning Methods
  17. 17. Some real life applications of machine learning:  Recommender systems – suggesting similar people on Facebook/LinkedIn, similar movies/ books etc. on Amazon,  Business applications – Customer segmentation, Customer retention, Targeted Marketing etc.  Medical applications – Disease diagnosis,  Banking – Credit card issue, fraud detection etc.  Language translation, text to speech or vice versa. Real life applications
  18. 18. Wisconsin Breast Cancer dataset:  Instances : 569  Features : 32  Class variable : malignant, or benign.  Steps to classify using k-NN  Load data into R  Data normalization  Split into training and test datasets  Train model  Evaluate model performance Breast Cancer Dataset and k-NN
  19. 19. Thank you!

×