Machine learning Lecture 1

Machine learning lecture series by Ravi Gupta, AU-KBC Research Centre, MIT Campus, Anna University.

Transcript

  • 1. Lecture Series on Machine Learning Ravi Gupta and G. Bharadwaja Kumar AU-KBC Research Centre, MIT Campus, Anna University
  • 2. Books Machine Learning by Tom M. Mitchell
  • 3. Lecture No. 1 Ravi Gupta AU-KBC Research Centre, MIT Campus, Anna University Date: 5.03.2008
  • 4. What is Machine Learning ??? Machine Learning (ML) is concerned with the question of how to construct computer programs that automatically improve with experience.
  • 5. When Machine Learning ??? • When computers are applied to solve a complex problem • When no known method for computing the output exists • When computation is expensive
  • 6. When Machine Learning ??? • Classification of protein types based on amino acid sequences • Screening of credit applicants (who will default and who will not) • Modeling a complex chemical reaction, when the precise interaction of the different reactants is unknown
  • 7. What Machine Learning does??? Decision Function / Hypothesis
  • 8. What Machine Learning does??? Training Decision Function Examples / Hypothesis
  • 9. Supervised Learning Apple Decision Function / Hypothesis Orange Supervised Classification
  • 10. Unsupervised Learning Decision Function / Hypothesis Unsupervised Classification
  • 11. Binary Classification Decision Function / Hypothesis
  • 12. Multiclass Classification Decision Function / Hypothesis
  • 13. Applications of Machine Learning • Speech and Handwriting Recognition • Robotics (Robot locomotion) • Search Engines (Information Retrieval) • Learning to classify new astronomical structures • Medical Diagnosis • Learning to drive an autonomous vehicle • Computational Biology/Bioinformatics • Computer Vision (Object Detection algorithms) • Detecting credit card fraud • Stock Market analysis • Game playing • …
  • 14. Search Engines
  • 15. Sky Image Cataloging and Analysis Tool (SKICAT) to classify new astronomical structures
  • 16. ALVINN (Autonomous Land Vehicle In a Neural Network)
  • 17. SPHINX - A speech recognizer
  • 18. Disciplines that contribute to the development of the ML field
  • 19. Definition (Learning) A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Tom Mitchell, 1997)
  • 20. Checker Learning Problem A computer program that learns to play checkers might improve its performance, as measured by its ability to win at the class of tasks involving playing checkers games, through experience obtained by playing games against itself • Task T: playing checkers • Performance measure P: % of games won against opponents • Training experience E: playing practice games against itself
  • 21. A Handwriting Recognition Problem • Task T: recognizing and classifying handwritten words within images • Performance measure P: % of words correctly classified • Training experience E: a database of handwritten words with given classifications
  • 22. A Robot Driving Learning Problem • Task T: driving on public four-lane highways using vision sensors • Performance measure P: average distance traveled before an error (as judged by a human overseer) • Training experience E: a sequence of images and steering commands recorded while observing a human driver
  • 23. Designing a Learning System
  • 24. Checker Learning Problem • Task T: playing checkers • Performance measure P: % of games won against opponents • Training experience E: playing practice games against itself
  • 25. Checker
  • 26. Choosing the Training Experience The type of training experience E available to a system can have a significant impact on the success or failure of the learning system.
  • 27. Choosing the Training Experience One key attribute is whether the training experience provides direct or indirect feedback regarding the choices made by the performance system.
  • 28. Direct Learning
  • 29. Indirect Learning Next Move Win / Loss
  • 30. A second attribute is the degree to which the learner controls the sequence of training examples.
  • 31. The training examples may be selected by a teacher/expert who proposes the next move, or generated by the player itself choosing its next move.
  • 32. A third attribute is how well the training experience represents the distribution of examples over which the final system performance P must be measured. (Learning will be most reliable when the training examples follow a distribution similar to that of future test examples)
  • 33. V. Anand G. Kasparov
  • 34. Garry Kasparov playing chess with IBM's Deep Blue. Photo courtesy of IBM.
  • 35. Kasparov Monkey
  • 36. Geri’s Game by Pixar, in which an elderly man enjoys a game of chess with himself (available at http://www.pixar.com/shorts/gg/theater/index.html). It won an Academy Award for best animated short film.
  • 37. Assumptions • Let us assume that our system will train by playing games against itself. • And it is allowed to generate as much training data as time permits.
  • 38. Issues Related to Experience What type of knowledge/experience should one learn? How to represent the experience? (some mathematical representation of experience) What should be the learning mechanism?
  • 39. Main objective of the checker-playing program: choose the best next move
  • 40. Target Function ChooseMove: B → M ChooseMove is a function whose input B is the set of legal board states and whose output M is the set of legal moves
  • 41. Input: Board State Output: M (Moves)
  • 42. Target Function ChooseMove: B → M, i.e., M = ChooseMove(B)
  • 43. Alternative Target Function F: B → R F is the target function, input B is the board state, and output R denotes the set of real numbers
  • 44. Input: Board State Output: F(B) ∈ R, e.g., F = 10, F = 7.7, F = 14.3, F = 10, F = 10.8, F = 6.3, F = 9.4
  • 45. Representation of Target Function • x1: the number of white pieces on the board • x2: the number of red pieces on the board • x3: the number of white kings on the board • x4: the number of red kings on the board • x5: the number of white pieces threatened by red (i.e., which can be captured on red's next turn) • x6: the number of red pieces threatened by white F′(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
  • 46. • Task T: playing checkers • Performance measure P: % of games won in the world tournament • Training experience E: games played against itself • Target function: F: Board → R • Target function representation F′(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6 The problem of learning a checkers strategy reduces to the problem of learning values for the coefficients w0 through w6 in the target function representation
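To make the representation concrete, here is a minimal Python sketch of this linear evaluation function. The weight and feature values below are made-up numbers for illustration, not values from the lecture:

```python
def evaluate(weights, features):
    """Linear evaluation F'(b) = w0 + w1*x1 + ... + w6*x6.

    weights  -- [w0, w1, ..., w6]
    features -- [x1, ..., x6] describing board b
    """
    w0, ws = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(ws, features))

# Hypothetical weights and an opening position: 12 white and 12 red
# pieces, no kings, nothing threatened.
weights = [0.5, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
features = [12, 12, 0, 0, 0, 0]
print(evaluate(weights, features))   # 0.5
```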
  • 47. Adjustment of Weights E ≡ ∑⟨b, Ftrain(b)⟩ ∈ training examples (Ftrain(b) − F′(b))²
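Chapter 1 of Mitchell's book minimizes this squared error with the LMS (least mean squares) rule, which nudges each weight a small step in the direction that reduces the error on the current example. A sketch, reusing evaluate from the sketch above; the learning rate lr is an arbitrary choice:

```python
def lms_update(weights, features, f_train, lr=0.01):
    """One LMS step: w_i <- w_i + lr * (Ftrain(b) - F'(b)) * x_i."""
    error = f_train - evaluate(weights, features)
    weights[0] += lr * error                 # bias term (x0 is implicitly 1)
    for i, x in enumerate(features, start=1):
        weights[i] += lr * error * x
    return weights
```

Repeated over many ⟨b, Ftrain(b)⟩ pairs, these small steps drive the total error E downward.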
  • 48. Generic issues of Machine Learning • What algorithms exist for learning general target functions from specific training examples? • In what settings will particular algorithms converge to the desired function, given sufficient training data? • Which algorithms perform best for which types of problems and representations? • How much training data is sufficient? • What is the best way to reduce the learning task to one or more function approximation problems?
  • 49. Generic issues of Machine Learning • When and how can prior knowledge held by the learner guide the process of generalizing from examples? • Can prior knowledge be helpful even when it is only approximately correct? • What is the best strategy for choosing a useful next training experience, and how does the choice of this strategy alter the complexity of the learning problem? • How can the learner automatically alter its representation to improve its ability to represent and learn the target function?
  • 50. Concept Learning
  • 51. Concept Learning A task of acquiring a potential hypothesis (solution) that best fits the training examples
  • 52. Concept Learning Task Objective is to learn EnjoySport: {Sky, AirTemp, Humidity, Wind, Water, Forecast} → EnjoySport, i.e., the days on which Tom enjoys his favorite water sport
  • 53. Concept Learning Task Objective is to learn EnjoySport {Sky, AirTemp, Humidity, Wind, Water, Forecast} → EnjoySport Input variables Output <x1, x2, x3, x4, x5, x6> → <y>
  • 54. Notations
  • 55. Notations Instances (X) Target Concept (C) Training examples (D)
  • 56. Concept Learning Acquiring the definition of a general category from a given set of positive and negative training examples of the category. The goal is to find a hypothesis h in H such that h(x) = c(x) for all x in X, where c is the target concept.
  • 57. Instance Space Suppose that the target hypothesis that we want to learn for the current problem is represented as a conjunction of all the attributes: IF Sky = … AND AirTemp = … AND Humidity = … AND Wind = … AND Water = … AND Forecast = … THEN EnjoySport = …
  • 58. Instance Space Suppose the attribute Sky has three possible values, and that AirTemp, Humidity, Wind, Water, and Forecast each have two possible values Instance Space Possible distinct instances = 3 * 2 * 2 * 2 * 2 * 2 = 96
  • 59. Hypothesis Space Hypothesis Space: the set of all possible hypotheses. Possible syntactically distinct hypotheses for EnjoySport = 5 * 4 * 4 * 4 * 4 * 4 = 5120 • Sky has three possible values • A fourth value is don't care (?) • A fifth value is the empty set Ø • Each remaining attribute likewise has its two values plus ? and Ø
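Both counts are easy to verify; a quick sanity check in Python (math.prod requires Python 3.8+):

```python
from math import prod

# Distinct instances: Sky has 3 values; the other five attributes have 2 each.
instances = prod([3, 2, 2, 2, 2, 2])
print(instances)        # 96

# Syntactically distinct hypotheses: each attribute additionally allows
# "?" (don't care) and the empty constraint Ø, giving 5 choices for Sky
# and 4 for each of the remaining attributes.
hypotheses = prod([5, 4, 4, 4, 4, 4])
print(hypotheses)       # 5120
```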
  • 60. Hypothesis Space h = <Sunny, Warm, ?, Strong, Warm, Same> (the ? for Humidity covers both Normal and High)
  • 61. Hypothesis Space h = <Ø, Warm, ?, Strong, Warm, Same>
  • 62. Concept Learning as Search Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation. The goal of the concept learning search is to find the hypothesis that best fits the training examples.
  • 63. General-to-Specific Learning Suppose every day Tom enjoys his sport, i.e., there are only positive examples. Most General Hypothesis: h = <?, ?, ?, ?, ?, ?> Most Specific Hypothesis: h = <Ø, Ø, Ø, Ø, Ø, Ø>
  • 64. General-to-Specific Learning Consider the sets of instances that are classified positive by h1 and by h2. h2 imposes fewer constraints on the instance, thus it classifies more instances as positive. Any instance classified positive by h1 will also be classified positive by h2. Therefore, we say that h2 is more general than h1.
  • 65. Definition Given hypotheses hj and hk, hj is more_general_than_or_equal_to hk if and only if any instance that satisfies hk also satisfies hj. We can also say that hj is more_specific_than hk when hk is more_general_than hj.
  • 66. More_general_than relation
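The definition above translates directly into a test for the conjunctive hypotheses used here. In this minimal sketch (the names are my own, not from the lecture), a hypothesis is a tuple of attribute constraints, with "?" for don't-care and "Ø" for the empty constraint:

```python
EMPTY = "Ø"   # the empty constraint: matches no value

def more_general_or_equal(hj, hk):
    """True iff every instance that satisfies hk also satisfies hj."""
    if EMPTY in hk:      # hk matches no instance, so the claim holds vacuously
        return True
    return all(cj == "?" or cj == ck for cj, ck in zip(hj, hk))

h1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
h2 = ("Sunny", "?",    "?",      "Strong", "?",    "?")
print(more_general_or_equal(h2, h1))   # True:  h2 is more general than h1
print(more_general_or_equal(h1, h2))   # False: h1 is strictly more specific
```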
  • 67. FIND-S: Finding a Maximally Specific Hypothesis
  • 68. Step 1: FIND-S h0 = <Ø, Ø, Ø, Ø, Ø, Ø>
  • 69. Step 2: FIND-S h0 = <Ø, Ø, Ø, Ø, Ø, Ø> Iteration 1: x1 = <Sunny, Warm, Normal, Strong, Warm, Same> (attributes a1 … a6) h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
  • 70. Iteration 2: h1 = <Sunny, Warm, Normal, Strong, Warm, Same> x2 = <Sunny, Warm, High, Strong, Warm, Same> h2 = <Sunny, Warm, ?, Strong, Warm, Same>
  • 71. Iteration 3: x3 is a negative example, so FIND-S ignores it h3 = <Sunny, Warm, ?, Strong, Warm, Same>
  • 72. Iteration 4: h3 = <Sunny, Warm, ?, Strong, Warm, Same> x4 = <Sunny, Warm, High, Strong, Cool, Change> Step 3 Output: h4 = <Sunny, Warm, ?, Strong, ?, ?>
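Putting the steps together, here is a compact Python sketch of FIND-S for this representation. The positive examples are the x1, x2, x4 traced above; the negative example x3 is filled in from Mitchell's EnjoySport data, since the slides only note that it is ignored:

```python
def find_s(examples):
    """FIND-S: start from the most specific hypothesis <Ø,...,Ø> and
    minimally generalize it to cover each positive example."""
    h = None                         # None stands in for <Ø, Ø, Ø, Ø, Ø, Ø>
    for x, positive in examples:
        if not positive:             # FIND-S ignores negative examples
            continue
        if h is None:                # first positive example: copy it
            h = list(x)
        else:                        # generalize each disagreeing attribute to "?"
            h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]
    return tuple(h)

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(D))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Note that the negative example never changes the hypothesis, which is exactly what the questions on the following slides probe.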
  • 73. Key Properties of FIND-S Algorithm • Guaranteed to output the most specific hypothesis within H that is consistent with the positive training examples. • Final hypothesis will also be consistent with the negative examples provided the correct target concept is contained in H, and provided the training examples are correct.
  • 74. Unanswered Questions by FIND-S • Has the learner converged to the correct target concept? Although FIND-S will find a hypothesis consistent with the training data, it has no way to determine whether it has found the only hypothesis in H consistent with the data (i.e., the correct target concept). • Why prefer the most specific hypothesis? In case there are multiple hypotheses consistent with the training examples, FIND-S will find the most specific. It is unclear whether we should prefer this hypothesis over, say, the most general, or some other hypothesis of intermediate generality.
  • 75. Unanswered Questions by FIND-S • Are the training examples consistent? In most practical learning problems there is some chance that the training examples will contain at least some errors or noise. Such inconsistent sets of training examples can severely mislead FIND-S, given the fact that it ignores negative examples. We would prefer an algorithm that could at least detect when the training data is inconsistent and, preferably, accommodate such errors.
  • 76. Unanswered Questions by FIND-S • What if there are several maximally specific consistent hypotheses? In the hypothesis language H for the EnjoySport task, there is always a unique, most specific hypothesis consistent with any set of positive examples. However, for other hypothesis spaces (which will be discussed later) there can be several maximally specific hypotheses consistent with the data.
