  1. Supervised Learning (Introduction to Artificial Intelligence, COS302, Michael L. Littman, Fall 2001)
  2. Administration <ul><li>Exams graded! </li></ul><ul><li>http://www.cs.princeton.edu/courses/archive/fall01/cs302/whats-new.html </li></ul><ul><li>Project groups. </li></ul>
  3. Supervised Learning <ul><li>Most studied problem in machine learning. </li></ul><ul><li>http://www1.ics.uci.edu/~mlearn/MLRepository.html </li></ul><ul><li>Set of examples (usually numeric vectors), split into: </li></ul><ul><li>Training: allowed to see it </li></ul><ul><li>Test: want to minimize error here </li></ul>
  4. Another Significant App <ul><li>Name A B C D E F G </li></ul><ul><li>1. Jeffrey B. 1 0 1 0 1 0 1 - </li></ul><ul><li>2. Paul S. 0 1 1 0 0 0 1 - </li></ul><ul><li>3. Daniel C. 0 0 1 0 0 0 0 - </li></ul><ul><li>4. Gregory P. 1 0 1 0 1 0 0 - </li></ul><ul><li>5. Michael N. 0 0 1 1 0 0 0 - </li></ul><ul><li>6. Corinne N. 1 1 1 0 1 0 1 + </li></ul><ul><li>7. Mariyam M. 0 1 0 1 0 0 1 + </li></ul><ul><li>8. Stephany D. 1 1 1 1 1 1 1 + </li></ul><ul><li>9. Mary D. 1 1 1 1 1 1 1 + </li></ul><ul><li>10. Jamie F. 1 1 1 0 0 1 1 + </li></ul>
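For concreteness, the feature table above can be written down directly as data. A minimal sketch in Python (the variable names and the list-of-tuples encoding are my choice, not from the slides):

```python
# Training examples from the slide: (name, features A-G, class label).
DATA = [
    ("Jeffrey B.",  [1, 0, 1, 0, 1, 0, 1], "-"),
    ("Paul S.",     [0, 1, 1, 0, 0, 0, 1], "-"),
    ("Daniel C.",   [0, 0, 1, 0, 0, 0, 0], "-"),
    ("Gregory P.",  [1, 0, 1, 0, 1, 0, 0], "-"),
    ("Michael N.",  [0, 0, 1, 1, 0, 0, 0], "-"),
    ("Corinne N.",  [1, 1, 1, 0, 1, 0, 1], "+"),
    ("Mariyam M.",  [0, 1, 0, 1, 0, 0, 1], "+"),
    ("Stephany D.", [1, 1, 1, 1, 1, 1, 1], "+"),
    ("Mary D.",     [1, 1, 1, 1, 1, 1, 1], "+"),
    ("Jamie F.",    [1, 1, 1, 0, 0, 1, 1], "+"),
]
FEATURES = "ABCDEFG"  # column names, as defined on the next slide
```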
  5. Features <ul><li>A: First name ends in a vowel? </li></ul><ul><li>B: Neat handwriting? (Lisa test.) </li></ul><ul><li>C: Middle name listed? </li></ul><ul><li>D: Senior? </li></ul><ul><li>E: Got extra-extra credit? </li></ul><ul><li>F: Google brings up home page? </li></ul><ul><li>G: Google brings up reference? </li></ul>
  6. Decision Tree <ul><li>Internal nodes: features </li></ul><ul><li>Leaves: classification </li></ul>[Tree diagram: internal nodes test features F, A, D, A; 0/1 leaves classify example groups 8,9 / 2,3,7 / 1,4,5,6 / 10.] Error: 30%
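A tree with feature-testing internal nodes and class-labelled leaves can be represented very compactly. A minimal sketch, using a tuple encoding of my choosing (not the exact tree from the slide, whose full structure isn't recoverable here): a tree is either a leaf label, or a triple of (feature index, subtree when the feature is 0, subtree when it is 1).

```python
def classify(tree, x):
    """Walk from the root to a leaf, testing one feature per internal node."""
    while isinstance(tree, tuple):
        feat, if0, if1 = tree
        tree = if1 if x[feat] == 1 else if0
    return tree

# Example: a one-node tree testing feature B (index 1);
# if B=1 predict "+", otherwise "-".
tree = (1, "-", "+")
print(classify(tree, [1, 0, 1, 0, 1, 0, 1]))  # B=0, so "-"
```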
  7. Search <ul><li>Given a set of training data, pick a decision tree: search problem! </li></ul><ul><li>Challenges: </li></ul><ul><li>Scoring function? </li></ul><ul><li>Large space of trees. </li></ul>
  8. Scoring Function <ul><li>What’s a good tree? </li></ul><ul><li>Low error on training data </li></ul><ul><li>Small </li></ul><ul><li>A small tree is obviously not enough on its own; why isn’t low training error enough either? </li></ul>
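Both criteria are easy to compute. A sketch using a simple tuple encoding of trees (an illustrative assumption, not from the slides: a tree is a leaf label or (feature index, zero-subtree, one-subtree)):

```python
def classify(tree, x):
    """Follow feature tests from the root down to a leaf label."""
    while isinstance(tree, tuple):
        feat, if0, if1 = tree
        tree = if1 if x[feat] == 1 else if0
    return tree

def training_error(tree, data):
    """Fraction of (features, label) pairs the tree misclassifies."""
    wrong = sum(1 for x, y in data if classify(tree, x) != y)
    return wrong / len(data)

def size(tree):
    """Count internal nodes; leaves count as zero."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + size(tree[1]) + size(tree[2])
```

A scoring function would then trade these two numbers off against each other, e.g. error plus a penalty per node.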
  9. Low Error Not Enough [Tree diagram: a tree testing middle name? (C), EEC? (E), Neat? (B), and Google? (F) fits the training set exactly.] Training set error: 0% (can always do this?)
  10. Memorizing the Data [Tree diagram: a deep tree splitting on D, E, F and then repeatedly on A, B, C, with effectively one leaf per training example.]
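The memorization pathology is easy to reproduce without trees at all: a "learner" that just stores the training set gets 0% training error, yet says nothing principled about new inputs. A sketch (my own illustration of the slide's point):

```python
def memorizer(data):
    """'Learn' by storing every training example verbatim."""
    table = {tuple(x): y for x, y in data}
    def classify(x, default="-"):
        # Perfect on the training set; the default answer on anything
        # unseen is arbitrary, i.e. there is no generalization.
        return table.get(tuple(x), default)
    return classify

train = [([1, 0, 1], "+"), ([0, 1, 0], "-")]
f = memorizer(train)
print(f([1, 0, 1]))  # "+": memorized
print(f([1, 1, 1]))  # "-": just the default, not a prediction
```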
  11. “Learning Curve” [Plot: error as a function of tree size.]
  12. What’s the Problem? <ul><li>Memorization w/o generalization </li></ul><ul><li>Want a tree big enough to be correct, but not so big that it gets distracted by particulars. </li></ul><ul><li>But, how can we know? </li></ul><ul><li>(Weak) theoretical bounds exist. </li></ul>
  13. Cross-validation <ul><li>A simple, effective hack. </li></ul>[Diagram: Data split into Train and Test; Train further split into Train’ and C-V.]
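The split in the diagram (hold out a test set, then carve a cross-validation set out of the training data) might look like this; the split fractions and function name are my choice:

```python
import random

def split(data, test_frac=0.2, cv_frac=0.2, seed=0):
    """Data -> (train_prime, cv, test), as in the slide's diagram."""
    rng = random.Random(seed)      # fixed seed for reproducibility
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_cv = int(len(shuffled) * cv_frac)
    test = shuffled[:n_test]
    cv = shuffled[n_test:n_test + n_cv]
    train_prime = shuffled[n_test + n_cv:]
    return train_prime, cv, test
```

Train’ is used to fit, C-V to choose among candidate trees, and Test only for the final error estimate.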
  14. Concrete Idea: Pruning <ul><li>Use Train’ to find tree w/ no error. </li></ul><ul><li>Use C-V to score prunings of tree. </li></ul><ul><li>Return pruned tree w/ max score. </li></ul>
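One concrete version of this recipe, an illustrative sketch rather than the exact algorithm from lecture: generate every variant of the tree with one subtree collapsed to a leaf, score each on the C-V set, and keep the best. Trees use a tuple encoding of my choosing (a leaf label, or (feature index, zero-subtree, one-subtree)); for simplicity collapsed subtrees become the C-V majority label, where a fuller version would use the majority of training examples reaching each node.

```python
def classify(tree, x):
    while isinstance(tree, tuple):
        feat, if0, if1 = tree
        tree = if1 if x[feat] == 1 else if0
    return tree

def accuracy(tree, data):
    return sum(classify(tree, x) == y for x, y in data) / len(data)

def majority_label(data):
    labels = [y for _, y in data]
    return max(set(labels), key=labels.count)

def prunings(tree, leaf):
    """Yield the tree itself plus every variant with one subtree -> `leaf`."""
    yield tree
    if isinstance(tree, tuple):
        feat, if0, if1 = tree
        yield leaf  # collapse this whole subtree to a leaf
        for p in prunings(if0, leaf):
            yield (feat, p, if1)
        for p in prunings(if1, leaf):
            yield (feat, if0, p)

def prune(tree, cv_data):
    """Return the pruning with the highest C-V accuracy."""
    leaf = majority_label(cv_data)
    return max(prunings(tree, leaf), key=lambda t: accuracy(t, cv_data))
```

Iterating `prune` until the score stops improving would give multi-step pruning.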
  15. How to Find the Tree? <ul><li>Lots to choose from. </li></ul><ul><li>Could use local search. </li></ul><ul><li>Greedy search… </li></ul>
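The standard greedy heuristic (ID3-style; the slides don't spell it out) grows the tree top-down, always splitting on the feature with the highest information gain. A sketch on binary features, with trees encoded as a leaf label or (feature index, zero-subtree, one-subtree):

```python
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n)
                for c in (labels.count(l) for l in set(labels)) if c)

def info_gain(data, feat):
    """Entropy reduction from splitting on binary feature `feat`."""
    labels = [y for _, y in data]
    gain = entropy(labels)
    for v in (0, 1):
        sub = [y for x, y in data if x[feat] == v]
        if sub:
            gain -= len(sub) / len(data) * entropy(sub)
    return gain

def grow(data, feats):
    """Greedily build a tree: best-gain feature at the root, recurse."""
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    if len(set(labels)) == 1 or not feats:
        return majority                      # pure or out of features: leaf
    best = max(feats, key=lambda f: info_gain(data, f))
    rest = [f for f in feats if f != best]
    branches = []
    for v in (0, 1):
        sub = [(x, y) for x, y in data if x[best] == v]
        branches.append(grow(sub, rest) if sub else majority)
    return (best, branches[0], branches[1])
```

Greedy search never revisits a split, which is what keeps it cheap and what makes the failure cases on the next slide possible.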
  16. Why Might This Fail? <ul><li>No target function, just noise </li></ul><ul><li>Target function too complex (2^(2^n) possibilities, e.g. parity) </li></ul><ul><li>Training data doesn’t match target function (PAC bounds) </li></ul>
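Parity is the classic failure case for the greedy heuristic: every individual feature has zero information gain, so greedy split selection has nothing to go on. A quick check, using the usual entropy-based gain (my formulation):

```python
from itertools import product
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n)
                for c in (labels.count(l) for l in set(labels)) if c)

def info_gain(data, feat):
    labels = [y for _, y in data]
    gain = entropy(labels)
    for v in (0, 1):
        sub = [y for x, y in data if x[feat] == v]
        if sub:
            gain -= len(sub) / len(data) * entropy(sub)
    return gain

# 3-bit parity: label "+" iff an odd number of features are on.
parity = [(list(x), "+" if sum(x) % 2 else "-")
          for x in product([0, 1], repeat=3)]
print([round(info_gain(parity, f), 6) for f in range(3)])  # [0.0, 0.0, 0.0]
```

Fixing any one bit leaves the remaining labels perfectly balanced, so each split looks useless even though the function is deterministic.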
  17. Theory: PAC Learning <ul><li>Probably Approximately Correct </li></ul><ul><li>Training/testing examples drawn from the same distribution. </li></ul><ul><li>With probability 1-δ, the learned rule will have error smaller than ε. </li></ul><ul><li>Bounds on the size of the training set in terms of ε, δ, and the “dimensionality” of the target concept. </li></ul>
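For a finite hypothesis class H, the textbook bound (a standard PAC result, not stated explicitly on the slide) says that a learner returning any hypothesis consistent with the training data is probably (with probability at least 1-δ) approximately (to within error ε) correct once the number of training examples m satisfies:

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```

For infinite classes, the ln|H| term is replaced by a measure of capacity such as the VC dimension, which is the “dimensionality” the slide alludes to.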
  18. Classification <ul><li>Naïve Bayes classifier </li></ul><ul><li>Differentiation vs. modeling </li></ul><ul><li>More on this later. </li></ul>
  19. What to Learn <ul><li>Decision tree representation </li></ul><ul><li>Memorization problem: causes and cures (cross-validation, pruning) </li></ul><ul><li>Greedy heuristic for finding small trees with low error </li></ul>
  20. Homework 9 (due 12/5) <ul><li>Write a program that decides whether a pair of words are synonyms using WordNet. I’ll send you the list, you send me the answers. </li></ul><ul><li>Draw a decision tree that represents (a) f1+f2+…+fn (or), (b) f1f2…fn (and), (c) parity (odd number of features “on”). </li></ul><ul><li>More soon </li></ul>
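As a sanity check on part (2), the three target functions over n binary features are one-liners (a sketch; the homework of course asks for trees, not code). OR and AND each admit a linear-size tree, a chain that tests one feature per level, while parity forces a complete tree with 2^n leaves, since the answer depends on every feature along every path:

```python
def f_or(x):      # (a) f1 + f2 + ... + fn
    return int(any(x))

def f_and(x):     # (b) f1 f2 ... fn
    return int(all(x))

def f_parity(x):  # (c) odd number of features "on"
    return sum(x) % 2

print(f_or([0, 0, 1]), f_and([0, 0, 1]), f_parity([0, 1, 1]))  # 1 0 0
```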