Lecture #1: Introduction to machine learning (ML)

Hardware speed and capability increase at a faster rate than software, and the gap grows daily. Programs still need to be crafted (handmade) by programmers. Since the 1950s, computer scientists have tried to give computers the ability to learn.
ML (Mitchell): subfield of AI concerned with computer programs that learn from experience.
ML is building computer programs that improve their performance at some task using observed data or past experience.
An ML program (learner) tries to learn from the observed data (examples) and generates a model that can respond to (predict) future data or describe the data seen.
A model is, then, a structure that represents or summarizes some data.
Example: an ML program gets a set of patient cases with their diagnoses. The program will either:
    Predict a disease present in future patients, or
    Describe the relationship between diseases and symptoms
    (A small sketch of this learn-then-predict idea follows below.)
ML is like searching a very large space of hypotheses to find the one that best fits the observed data and that generalizes well to observations outside the training set.
Great goal of ML
    Tell the computer what task we want it to perform and make it learn to perform that task efficiently.
ML: emphasis on learning, in contrast to expert systems: emphasis on expert knowledge
    Expert systems don't learn from experience.
    They encode expert knowledge about how to make particular kinds of decisions.
ML is an interdisciplinary field using principles and concepts from Statistics, Computer Science, Applied Mathematics, Cognitive Science, Engineering, Economics, and Neuroscience.
ML includes algorithms and techniques found in Data Mining, Pattern Recognition, Neural Networks ...
ML: When?
    When expertise does not exist (navigating on Mars)
    When the solution cannot be expressed by a deterministic equation (face recognition)
    When the solution changes in time (routing on a computer network)
    When the solution needs to be adapted to particular cases (user biometrics)
ML: Applications
    Medical diagnosis
    Market basket analysis
    Image/text retrieval
    Automatic speech recognition
    Object, face, or handwriting recognition
    Financial prediction
    Bioinformatics (e.g., protein structure prediction)
    Robotics
Types of Learning
Supervised learning
    Occurs when the observed data includes the correct or expected output.
    Learning is called:
        Detection if the output is binary (Y/N, 0/1, True/False). Example: fraud detection
        Classification if the output is one of several classes (e.g., the output is either low, medium, or high). Example: credit scoring
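Before the credit-scoring example below, here is a minimal Python sketch of the learn-then-predict idea described above: a learner is given observed examples (patient cases with diagnoses) and predicts the diagnosis of a new case. The 1-nearest-neighbour rule, the symptom encoding, and the toy records are illustrative assumptions, not part of the lecture.

    # A toy learner: the "observed data" is a list of (features, diagnosis) pairs,
    # and the "model" predicts a new case by copying the diagnosis of the most
    # similar past case (1-nearest-neighbour). Features and diagnoses are made up.

    def distance(a, b):
        """Squared Euclidean distance between two feature vectors."""
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def predict(training_set, new_case):
        """Return the diagnosis of the training example closest to new_case."""
        _, diagnosis = min(training_set, key=lambda ex: distance(ex[0], new_case))
        return diagnosis

    # Past patient cases: [fever, cough, fatigue] severity scores (0-3) and a diagnosis.
    training_set = [
        ([3, 2, 1], "flu"),
        ([0, 3, 0], "cold"),
        ([1, 0, 3], "anaemia"),
    ]

    print(predict(training_set, [2, 3, 1]))   # -> "flu" (closest to the first case)

Here the "model" is just the stored training set plus the distance rule; the classifiers discussed next replace it with a discriminant or a density estimate.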
Two classes of customers asking for a loan: low-risk and high-risk.
    Input features are their income and savings.
    Classifier using a discriminant:
        IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
    Finding the right values for θ1 and θ2 is part of learning. (A small sketch of this search follows below.)
    Other classifiers use density estimation functions (instead of finding a discriminant), where each class is represented by a probability density function (e.g., a Gaussian).
    Several classification applications: face recognition, character recognition, speech recognition, medical diagnosis ...
Regression if the output is a real value. Example: determining the price of a car.
    x: car attributes, y: price, with y = wx + w0
    Finding the right values for the w parameters and the regression model (e.g., linear, quadratic) is learning.
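A minimal Python sketch of the discriminant classifier above, with "learning" reduced to a brute-force search for θ1 and θ2 that minimizes training error. The customer records, the units, and the grid of candidate thresholds are invented for illustration; a real learner would use a more principled fitting procedure (or the density-estimation approach mentioned above).

    # The discriminant rule from the notes:
    # IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk.
    # Learning = choosing theta1, theta2 that minimize errors on the observed data.

    def classify(income, savings, theta1, theta2):
        return "low-risk" if income > theta1 and savings > theta2 else "high-risk"

    # Observed customers: (income, savings, true class), in arbitrary units (made up).
    data = [
        (60, 20, "low-risk"), (75, 30, "low-risk"), (80, 15, "low-risk"),
        (30, 25, "high-risk"), (55, 5, "high-risk"), (40, 10, "high-risk"),
    ]

    def training_error(theta1, theta2):
        """Number of misclassified training examples for these thresholds."""
        return sum(classify(inc, sav, theta1, theta2) != label for inc, sav, label in data)

    # Naive grid search over candidate thresholds (the "learning" step).
    candidates = range(0, 101, 5)
    errors, theta1, theta2 = min(
        (training_error(t1, t2), t1, t2) for t1 in candidates for t2 in candidates
    )
    print("learned thresholds:", theta1, theta2, "training errors:", errors)

Fitting the w parameters of the regression model y = wx + w0 by minimizing an error function is the analogous learning step for the car-price example above.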
Unsupervised learning
    When the correct output is not given with the observed data.
    ML tries to learn relations or patterns in the data components (also called attributes or features).
    The ML program can group the observed data into classes and assign the observations to these classes.
        Learning is called clustering.
        Finding the right number of classes and their centers or discriminants is learning.
        Clustering is used for customer segmentation in CRM, and for learning motifs (sequences of amino acids that occur in proteins) in Bioinformatics.
    Other types of unsupervised learners will be introduced later.
Reinforcement learning
    When the correct output is a sequence of actions, and not a single action or output.
    The model produces actions and receives rewards (or punishments).
    The goal is to find a set of actions that maximizes rewards (and minimizes punishments).
    Example: game playing, where a single move by itself is not important. ML evaluates a sequence of moves and determines how good the game-playing policy is.
    Other applications: robot navigation.
Concept learning
    Learn to predict the value of some concept (e.g., playing some sport) given values of some attributes (e.g., temperature, humidity, wind speed, sky outlook) for some past observations or examples.
    Values of a past example: outlook=sunny, temperature=hot, humidity=high, windy=false, play=NO
Other types of learning: concept learning, instance-based learning, explanation-based learning, Bayesian learning, case-based learning, statistical learning
Generalization
    A machine learner uses a collection of observations (called the training set) for learning.
    Good generalization requires reducing the error measured when the learner is evaluated on a testing set.
    Avoid model overfitting, which happens when the training error is low and the generalization error is high.
    For example, in regression you can find a polynomial of order n-1 that exactly fits n points. Training error: 0. This does not mean that the model will perform well on unseen data. (See the sketch below.)
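Before the learning-process steps that follow, here is a small numpy sketch of the overfitting example just given: a polynomial of order n-1 interpolates n noisy points, so its training error is essentially zero, yet its error on unseen inputs is typically much worse than that of a simpler model. The underlying linear relation, the noise level, and the test grid are assumptions made for illustration.

    # Overfitting sketch: 6 noisy observations of a simple relation, fitted by
    # a degree-5 polynomial (interpolates every point) and by a straight line.
    import numpy as np

    rng = np.random.default_rng(0)
    x_train = np.linspace(0.0, 1.0, 6)                                      # n = 6 observed points
    y_train = 2.0 * x_train + 1.0 + rng.normal(0, 0.3, size=x_train.shape)  # noisy line (assumed)

    interp = np.polyfit(x_train, y_train, deg=5)   # order n-1: fits the training points exactly
    line   = np.polyfit(x_train, y_train, deg=1)   # a simpler model

    x_test = np.linspace(0.0, 1.0, 200)            # unseen inputs
    y_test = 2.0 * x_test + 1.0                    # noise-free ground truth

    def mse(coeffs, x, y):
        """Mean squared error of the polynomial `coeffs` on (x, y)."""
        return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

    print("deg 5  train MSE:", mse(interp, x_train, y_train))   # ~0 by construction
    print("deg 5  test  MSE:", mse(interp, x_test, y_test))
    print("deg 1  test  MSE:", mse(line,   x_test, y_test))     # typically much lower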
Learning process
    Learning is a process that consists of:
    1. Data selection
        Data may need to be cleaned and preprocessed.
    2. Feature selection
        Size (dimensionality) of the data can be big.
            Document classification may have too many words.
        Use features that are easier to extract and less sensitive to noise.
        Divide the dataset into a training dataset and a testing dataset.
    3. Model selection
        A lot of guessing here: select the model (or model set) and the error function.
        Select the simplest model first, then try another class of model.
        Avoid overfitting.
    4. Learning
        Train the learner or model: find the parameter values by minimizing the error function.
    5. Evaluation
        The learner is evaluated on the testing dataset.
        You may need to select another model, or switch to a different set of features.
    6. Application
        Apply the learned model. For example, perform prediction on new, unseen data using the learned model.
        (A small end-to-end sketch of steps 2-6 follows below.)
In this course you will learn the following:
    Different learning problems and their solutions
    Choosing the right model for a learning problem
    Finding good parameter values for the models
    Selecting good features or parameters as input to a model
    Evaluating a machine learner
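A compact end-to-end sketch of steps 2-6 above, using scikit-learn. The bundled iris dataset, the decision-tree model, and accuracy as the evaluation measure are stand-ins chosen for illustration, not choices made in the lecture.

    # Steps 2-6 of the learning process on a stand-in dataset.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # 1-2. Data and feature selection: take the dataset as given and split it
    #      into a training dataset and a testing dataset.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # 3. Model selection: start with a simple model (a shallow decision tree).
    model = DecisionTreeClassifier(max_depth=2, random_state=0)

    # 4. Learning: fit the model's parameters to the training data.
    model.fit(X_train, y_train)

    # 5. Evaluation: measure error on data the learner has not seen.
    print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

    # 6. Application: predict the class of a new, unseen observation.
    print("prediction for a new sample:", model.predict(X_test[:1]))

In practice the loop between steps 3 and 5 repeats: if the evaluation on the testing dataset is poor, another model or a different set of features is selected, as noted above.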
