Machine Learning



  1. 1. Introduction to Machine Learning Algorithms
  2. 2. What is Artificial Intelligence (AI)? <ul><li>Design and study of computer programs that behave intelligently. </li></ul><ul><li>Designing computer programs to make computers smarter. </li></ul><ul><li>Study of how to make computers do things at which, at the moment, people are better. </li></ul>
  3. 3. Research Areas and Approaches (Artificial Intelligence) <ul><li>Paradigm: Rationalism (Logical), Empiricism (Statistical), Connectionism (Neural), Evolutionary (Genetic), Biological (Molecular) </li></ul><ul><li>Application: Intelligent Agents, Information Retrieval, Electronic Commerce, Data Mining, Bioinformatics, Natural Language Proc., Expert Systems </li></ul><ul><li>Research: Learning Algorithms, Inference Mechanisms, Knowledge Representation, Intelligent System Architecture </li></ul>
  4. 4. Concept of Machine Learning
  5. 5.                                               
  6. 6. Context <ul><li>Machine Learning in the context of: Information Theory, Computer Science (AI), Cognitive Science, Statistics </li></ul>
  7. 7. Why Machine Learning? <ul><li>Recent progress in algorithms and theory </li></ul><ul><li>Growing flood of online data </li></ul><ul><li>Computational power is available </li></ul><ul><li>Budding industry </li></ul><ul><li>Three niches for machine learning </li></ul><ul><li>Data mining: using historical data to improve decisions </li></ul><ul><ul><li>Medical records --> medical knowledge </li></ul></ul><ul><li>Software applications we can’t program by hand </li></ul><ul><ul><li>Autonomous driving </li></ul></ul><ul><ul><li>Speech recognition </li></ul></ul><ul><li>Self-customizing programs </li></ul><ul><ul><li>Newsreader that learns user interests </li></ul></ul>
  8. 8. Learning: Definition <ul><li>Definition </li></ul><ul><ul><li>Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment. </li></ul></ul><ul><ul><li>Alternatively: the improvement of behavior on some performance task through acquisition of knowledge based on partial task experience. </li></ul></ul>
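  9. 9. A Learning Problem: EnjoySport <ul><li>Attributes (in order): Sky, Temp, Humid, Wind, Water, Forecast; target: EnjoySport </li></ul><ul><ul><li>Sunny, Warm, Normal, Strong, Warm, Same: Yes </li></ul></ul><ul><ul><li>Sunny, Warm, High, Strong, Warm, Same: Yes </li></ul></ul><ul><ul><li>Rainy, Cold, High, Strong, Warm, Change: No </li></ul></ul><ul><ul><li>Sunny, Warm, High, Strong, Cool, Change: Yes </li></ul></ul><ul><li>What is the general concept? </li></ul>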
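One way to approach the question "What is the general concept?" is a most-specific-hypothesis search. The sketch below uses the Find-S procedure, which the slide does not name; it is an assumption chosen for illustration. The function names `find_s` and `matches` are hypothetical, and "?" stands for "any value".

```python
# Training examples copied from the slide: six attribute values plus the label.
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives."""
    hypothesis = None
    for attributes, label in examples:
        if not label:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start from the first positive example
        else:
            # Generalize each attribute that disagrees with the new positive example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


def matches(hypothesis, attributes):
    """True if the hypothesis classifies the attribute tuple as positive."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))


concept = find_s(EXAMPLES)
print(concept)  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

Under this sketch the learned concept rejects the single negative example (Rainy, Cold, ...), although Find-S itself never checks negatives.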
  10. 10. Metaphors and Methods <ul><li>Metaphors: Neurobiology, Biological Evolution, Heuristic Search, Statistical Inference, Memory and Retrieval </li></ul><ul><li>Methods: Connectionist Learning, Genetic Learning, Tree / Rule Induction, Case-Based Learning, Probabilistic Induction </li></ul>
  11. 11. What is the Learning Problem? <ul><li>Learning = improving with experience at some task </li></ul><ul><ul><li>Improve over task T, </li></ul></ul><ul><ul><li>With respect to performance measure P, </li></ul></ul><ul><ul><li>Based on experience E. </li></ul></ul><ul><li>E.g., Learn to play checkers </li></ul><ul><li>T: Play checkers </li></ul><ul><li>P: % of games won in world tournament </li></ul><ul><li>E: opportunity to play against self </li></ul>
  12. 12. Machine Learning: Tasks <ul><li>Supervised Learning </li></ul><ul><ul><li>Estimate an unknown mapping from known input-output pairs </li></ul></ul><ul><ul><li>Learn f_w from training set D = {(x, y)} s.t. f_w(x) ≈ y </li></ul></ul><ul><ul><li>Classification: y is discrete </li></ul></ul><ul><ul><li>Regression: y is continuous </li></ul></ul><ul><li>Unsupervised Learning </li></ul><ul><ul><li>Only input values are provided </li></ul></ul><ul><ul><li>Learn f_w from D = {(x)} s.t. </li></ul></ul><ul><ul><li>Compression </li></ul></ul><ul><ul><li>Clustering </li></ul></ul><ul><li>Reinforcement Learning </li></ul>
  13. 13. Machine Learning: Strategies <ul><li>Rote learning </li></ul><ul><li>Concept learning </li></ul><ul><li>Learning from examples </li></ul><ul><li>Learning by instruction </li></ul><ul><li>Inductive learning </li></ul><ul><li>Deductive learning </li></ul><ul><li>Explanation-based learning (EBL) </li></ul><ul><li>Learning by analogy </li></ul><ul><li>Learning by observation </li></ul>
  14. 14. Supervised Learning <ul><li>Given a sequence of input/output pairs of the form <x_i, y_i>, where x_i is a possible input and y_i is the output associated with x_i. </li></ul><ul><li>Learn a function f that accounts for the examples seen so far, f(x_i) = y_i for all i, and that makes a good guess for the outputs of the inputs that it has not seen. </li></ul>
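A minimal sketch of this definition, assuming a 1-nearest-neighbor rule as the learner (the slide does not prescribe any particular method). The data points and the name `nearest_neighbor_learner` are hypothetical. The returned `f` reproduces every training pair exactly and guesses the label of the closest seen input for anything new.

```python
def nearest_neighbor_learner(examples):
    """Build f from (x, y) pairs; f(x) returns the y of the closest seen x."""

    def f(x):
        # Squared Euclidean distance is enough for picking the closest example.
        _, y = min(
            examples,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        return y

    return f


# Hypothetical training pairs <x_i, y_i>: 2-D inputs with a discrete output.
examples = [
    ((0.0, 0.0), "no"),
    ((0.0, 1.0), "no"),
    ((5.0, 5.0), "yes"),
    ((6.0, 5.0), "yes"),
]

f = nearest_neighbor_learner(examples)

# f accounts for every example seen so far: f(x_i) == y_i for all i.
consistent = all(f(x) == y for x, y in examples)

# A guess for an input that was never seen during training.
guess = f((5.2, 4.0))
print(consistent, guess)
```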
  15. 15. Examples of Input-Output Pairs <ul><li>Recognition: inputs are descriptions of objects; outputs are the classes that the objects belong to </li></ul><ul><li>Action: inputs are descriptions of situations; outputs are actions or predictions </li></ul><ul><li>Janitor robot problem: inputs are descriptions of offices (floor, prof’s office); outputs are Yes or No (indicating whether or not the office contains a recycling bin) </li></ul>
  16. 16. Unsupervised Learning <ul><li>Clustering </li></ul><ul><ul><li>A clustering algorithm partitions the inputs into a fixed number of subsets or clusters so that inputs in the same cluster are close to one another. </li></ul></ul><ul><li>Discovery learning </li></ul><ul><ul><li>The objective is to uncover new relations in the data. </li></ul></ul>
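The clustering description above can be sketched with k-means, which is one common algorithm of this kind and is an assumption here, since the slide names none. The points are hypothetical, and the deterministic "first k points" initialization is chosen only to keep the example reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100):
    """Partition points into k clusters; returns (assignment, centroids)."""
    centroids = [points[i] for i in range(k)]  # deterministic init: first k points
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Hypothetical inputs: two visually separate groups, no labels provided.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

The number of clusters is fixed in advance (`k=2`), matching the "fixed number of subsets" in the slide.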
  17. 17. Online and Batch Learning <ul><li>Batch methods </li></ul><ul><ul><li>Process large sets of examples all at once. </li></ul></ul><ul><li>Online (incremental) methods </li></ul><ul><ul><li>Process examples one at a time. </li></ul></ul>
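To make the contrast concrete, here is a sketch that fits the one-parameter model y ≈ w·x in both styles. The model, the data, the learning rate, and the function names are all assumptions for illustration: `fit_batch` uses the closed-form least-squares solution over the whole set, while `fit_online` applies a small gradient-style correction after each example.

```python
def fit_batch(examples):
    """Least-squares slope for y = w * x, computed from all examples at once."""
    sxy = sum(x * y for x, y in examples)
    sxx = sum(x * x for x, _ in examples)
    return sxy / sxx


def fit_online(stream, learning_rate=0.1):
    """Update w after every single example (least-mean-squares style rule)."""
    w = 0.0
    for x, y in stream:
        error = y - w * x
        w += learning_rate * error * x  # nudge w to reduce this example's error
    return w


# Hypothetical data generated by y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w_batch = fit_batch(data)         # sees the full set in one computation
w_online = fit_online(data * 20)  # sees the same examples one at a time, 20 passes
print(w_batch, w_online)
```

The online learner never needs the whole data set in memory, at the cost of a step size that has to be chosen with care.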
  18. 18. Machine Learning Algorithms and Applications
  19. 19. Machine Learning Algorithms <ul><li>Neural Learning </li></ul><ul><ul><li>Multilayer Perceptrons (MLPs) </li></ul></ul><ul><ul><li>Self-Organizing Maps (SOMs) </li></ul></ul><ul><li>Evolutionary Learning </li></ul><ul><ul><li>Genetic Algorithms </li></ul></ul><ul><li>Probabilistic Learning </li></ul><ul><ul><li>Bayesian Networks (BNs) </li></ul></ul><ul><li>Other Machine Learning Methods </li></ul><ul><ul><li>Decision Trees (DTs) </li></ul></ul>
  20. 20. Neural Nets for Handwritten Digit Recognition [Figure: digit images pass through pre-processing into input units, hidden units, and output units labeled 0, 1, 2, 3, ..., 9; shown once for training and once for test, where the output is unknown (?)]
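The input, hidden, and output layers named in this slide can be sketched as a forward pass. Everything specific below is an assumption: the 64-value input (for example an 8 x 8 pre-processed image), the 16 hidden units, the sigmoid activation, and the random untrained weights. The network here has not learned anything, so its prediction is arbitrary; only the data flow is illustrated.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(weights, inputs):
    # zip stops at len(inputs), so the trailing bias weight is added separately.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1]) for row in weights
    ]


def predict_digit(hidden_layer, output_layer, pixels):
    hidden = forward_layer(hidden_layer, pixels)
    outputs = forward_layer(output_layer, hidden)  # one activation per digit 0..9
    digit = max(range(len(outputs)), key=lambda d: outputs[d])
    return digit, outputs


rng = random.Random(0)
hidden_layer = make_layer(64, 16, rng)  # assumed sizes, not from the slide
output_layer = make_layer(16, 10, rng)

pixels = [rng.random() for _ in range(64)]  # stand-in for a pre-processed image
digit, outputs = predict_digit(hidden_layer, output_layer, pixels)
print(digit, len(outputs))
```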
  21. 21. ALVINN System: Neural Network Learning to Steer an Autonomous Vehicle
  22. 22. Learning to Navigate a Vehicle by Observing a Human Expert (1/2) <ul><li>Inputs </li></ul><ul><ul><li>The images produced by a camera mounted on the vehicle </li></ul></ul><ul><li>Outputs </li></ul><ul><ul><li>The actions taken by the human driver to steer the vehicle or adjust its speed. </li></ul></ul><ul><li>Result of learning </li></ul><ul><ul><li>A function mapping images to control actions </li></ul></ul>
  23. 23. Learning to Navigate a Vehicle by Observing a Human Expert (2/2)
  24. 24. Data Recorrection by a Hopfield Network [Figure panels: original target data; corrupted input data; recorrected data after 10 iterations; recorrected data after 20 iterations; fully recorrected data after 35 iterations]
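The recovery of a stored pattern from a corrupted input can be sketched with a tiny Hopfield network. This is an illustration under assumptions: Hebbian outer-product weights, asynchronous sign updates in a fixed order, and two hypothetical 8-unit patterns. It is far smaller than the image data in the slide and converges in far fewer iterations.

```python
def train_hopfield(patterns):
    """Hebbian weights: sum of outer products of +1/-1 patterns, zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_sweeps=50):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            # Keep the current value when the local field is exactly zero.
            new_value = state[i] if field == 0 else (1 if field > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical stored patterns (they are orthogonal to each other).
target = [1, 1, 1, 1, -1, -1, -1, -1]
other = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([target, other])

corrupted = list(target)
corrupted[0] = -corrupted[0]  # flip one unit to simulate corrupted input data

restored = recall(weights, corrupted)
print(restored == target)  # True
```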
  25. 25. ANN for Face Recognition <ul><li>A 960 x 3 x 4 network is trained on gray-level images of faces to predict whether a person is looking to their left, right, ahead, or up. </li></ul>
  26. 26. Data Mining [Figure: Database/data warehouse -> Selection & Sampling -> Target data -> Preprocessing & Cleaning -> Cleaned data -> Transformation & reduction -> Transformed data -> Data Mining -> Patterns/model -> Interpretation/Evaluation -> Knowledge; the diagram also shows a performance system]
  27. 27. Hot Water Flashing Nozzle with Evolutionary Algorithms <ul><li>Hans-Paul Schwefel performed the original experiments. </li></ul> [Figure labels: start; hot water entering; at throat: Mach 1 and onset of flashing; steam and droplets at exit]
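The slide gives no details of the nozzle's encoding or objective, so the sketch below only illustrates the general idea of an evolutionary algorithm. It assumes a (1+1) evolution strategy with a fixed mutation step and a toy objective (`sphere`); the names, the step size, and the seed are all hypothetical choices.

```python
import random


def one_plus_one_es(objective, start, sigma=0.3, generations=200, seed=0):
    """Minimize objective: mutate the parent, keep the child if it is no worse."""
    rng = random.Random(seed)
    parent = list(start)
    parent_cost = objective(parent)
    for _ in range(generations):
        # Mutation: add Gaussian noise to every design parameter.
        child = [x + rng.gauss(0.0, sigma) for x in parent]
        child_cost = objective(child)
        # Selection: the better of parent and child survives.
        if child_cost <= parent_cost:
            parent, parent_cost = child, child_cost
    return parent, parent_cost


def sphere(v):
    """Toy stand-in for a real design objective; its minimum is at the origin."""
    return sum(x * x for x in v)


start = [3.0, -2.0]
best, best_cost = one_plus_one_es(sphere, start)
print(sphere(start), best_cost)
```

Because selection never accepts a worse child, the cost of the surviving design cannot increase from one generation to the next.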
  28. 28. Machine Learning Applications in Bioinformatics
  29. 29. Bayesian Networks for Gene Expression Analysis <ul><li>Learning </li></ul><ul><ul><li>Data -> Preprocessing -> Processed data -> Learning algorithm -> network over Gene A, Gene B, Gene C, Gene D, and Target </li></ul></ul><ul><li>Inference </li></ul><ul><ul><li>The values of Gene C and Gene B are given. </li></ul></ul><ul><ul><li>Belief propagation </li></ul></ul><ul><ul><li>The probability for the target is computed. </li></ul></ul>
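The inference step (values of Gene C and Gene B given, probability of the target computed) can be sketched on a toy network. The structure used here (Gene C -> Gene A -> Target <- Gene B) and every probability are hypothetical, since the slide gives neither, and Gene D is left out for brevity. The code sums over the unobserved Gene A by brute-force enumeration rather than belief propagation, which yields the same answer on a network this small.

```python
# Hypothetical conditional probability tables (all genes are binary: 1 = on).
P_A_ON_GIVEN_C = {1: 0.8, 0: 0.2}  # P(Gene A = 1 | Gene C)
P_TARGET_ON_GIVEN_A_B = {          # P(Target = 1 | Gene A, Gene B)
    (1, 1): 0.9,
    (1, 0): 0.6,
    (0, 1): 0.5,
    (0, 0): 0.1,
}


def p_target_on(gene_b, gene_c):
    """P(Target = 1 | Gene B, Gene C), marginalizing over the hidden Gene A."""
    total = 0.0
    for gene_a in (0, 1):
        p_a = P_A_ON_GIVEN_C[gene_c] if gene_a == 1 else 1.0 - P_A_ON_GIVEN_C[gene_c]
        total += p_a * P_TARGET_ON_GIVEN_A_B[(gene_a, gene_b)]
    return total


# Evidence: Gene B = 1 and Gene C = 1.
probability = p_target_on(gene_b=1, gene_c=1)
print(round(probability, 2))  # 0.82
```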
  30. 30. Multilayer Perceptrons for Gene Finding and Prediction [Figure labels: bases; sequence; coding potential value; GC composition; length; donor; acceptor; intron vocabulary; score; discrete exon score (0 to 1)]
  31. 31. Self-Organizing Maps for DNA Microarray Data Analysis [Figure labels: input; bundle of synaptic connections; two-dimensional array of postsynaptic neurons; winning neurons]
  32. 32. Biological Information Extraction [Figure labels: text data; data analysis & field identification; data classification & field extraction; field property identification & learning; information extraction; database template filling; DB record with fields such as location and date; DB]
  33. 33. Biomolecular Computing [Figure: a binary string (011001101010001) shown alongside a DNA sequence (ATGCTCGAAGCT)]