1. Machine Learning: Version Spaces Learning
2. Introduction
   Very many approaches to machine learning:
   - Neural Net approaches
   - Symbolic approaches:
     - version spaces
     - decision trees
     - knowledge discovery
     - data mining
     - speed-up learning
     - inductive learning
     - ...
3. Version Spaces
   A concept learning technique based on refining models of the world.
4. Concept Learning
   Example: a student has the following observations about having an allergic reaction after meals:

   Restaurant | Meal      | Day      | Cost      | Reaction
   -----------|-----------|----------|-----------|---------
   Alma 3     | breakfast | Friday   | cheap     | Yes (+)
   De Moete   | lunch     | Friday   | expensive | No  (-)
   Alma 3     | lunch     | Saturday | cheap     | Yes (+)
   Sedes      | breakfast | Sunday   | cheap     | No  (-)
   Alma 3     | breakfast | Sunday   | expensive | No  (-)

   Concept to learn: under which circumstances do I get an allergic reaction after meals?
5. In general
   - There is a set of all possible events.
     Ex.: Restaurant × Meal × Day × Cost: 3 × 3 × 7 × 2 = 126 events.
   - There is a boolean function (implicitly) defined on this set.
     Ex.: Reaction: Restaurant × Meal × Day × Cost -> Bool.
   - We have the value of this function for SOME examples only.
   Find an inductive 'guess' of the concept that covers all the examples!
6. Pictured
   [Figure: the set of all possible events, with some examples labeled + and -.]
   Given some examples: find a concept that covers all the positive examples and none of the negative ones!
7. Non-determinism
   [Figure: the same set of events; many different concepts cover the + examples and exclude the - ones.]
   - Many different ways to solve this!
   - How to choose?
8. An obvious, but bad choice
   (Observations table as on slide 4.)
   The concept IS:
   - Alma 3 and breakfast and Friday and cheap, OR
   - Alma 3 and lunch and Saturday and cheap.
   This does NOT generalize the examples at all!
9. Pictured
   Only the positive examples are positive!
   [Figure: set of all possible events; exactly the four + points are classified positive.]
10. Equally bad is
    The concept is anything EXCEPT:
    - De Moete and lunch and Friday and expensive, AND
    - Sedes and breakfast and Sunday and cheap, AND
    - Alma 3 and breakfast and Sunday and expensive.
    (Observations table as on slide 4.)
11. Pictured
    Everything except the negative examples is positive:
    [Figure: set of all possible events; all but the four - points are classified positive.]
12. Solution: fix a language of hypotheses
    We introduce a fixed language of concept descriptions: the hypothesis space.
    The concept can only be identified as being one of the hypotheses in this language. This:
    - avoids the problem of having 'useless' conclusions;
    - forces some generalization/induction to cover more than just the given examples.
13. Reaction - Example
    (Observations table as on slide 4.)
    Every hypothesis is a 4-tuple:
    - maximally specific, ex.: [Sedes, lunch, Monday, cheap]
    - combinations of ? and values are allowed, ex.: [De Moete, ?, ?, expensive] or [?, lunch, ?, ?]
    - most general hypothesis: [?, ?, ?, ?]
    One more hypothesis: ⊥ (bottom: covers no example).
14. Hypotheses relate to sets of possible events
    Events:      x1 = <Alma 3, lunch, Monday, expensive>
                 x2 = <Sedes, lunch, Sunday, cheap>
    Hypotheses:  h1 = [?, lunch, Monday, ?]
                 h2 = [?, lunch, ?, cheap]
                 h3 = [?, lunch, ?, ?]
    h1 covers x1 but not x2; h2 covers x2 but not x1; h3 covers both.
    h3 is more general than h1 and h2: the specific <-> general ordering follows set-inclusion of the covered events.
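To make the ordering concrete, here is a minimal Python sketch of the 'covers' relation and the generality test for this 4-tuple language. The tuple encoding, the use of '?' as wildcard, and None standing for ⊥ are illustrative choices, not prescribed by the slides.

```python
# A minimal sketch of the covers relation for 4-tuple hypotheses.
# '?' is the wildcard; None stands for the bottom hypothesis (covers nothing).

def covers(h, x):
    """Does hypothesis h cover event x?"""
    if h is None:                     # bottom covers no event
        return False
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def more_general_or_equal(h1, h2):
    """h1 >= h2 in the generality ordering: h1 covers every event h2 covers."""
    if h2 is None:                    # everything is at least as general as bottom
        return True
    if h1 is None:
        return False
    return all(a == '?' or a == b for a, b in zip(h1, h2))

x1 = ('Alma 3', 'lunch', 'Monday', 'expensive')
x2 = ('Sedes', 'lunch', 'Sunday', 'cheap')
h1 = ('?', 'lunch', 'Monday', '?')
h2 = ('?', 'lunch', '?', 'cheap')
h3 = ('?', 'lunch', '?', '?')

assert covers(h1, x1) and not covers(h1, x2)
assert covers(h2, x2) and not covers(h2, x1)
assert covers(h3, x1) and covers(h3, x2)
assert more_general_or_equal(h3, h1) and more_general_or_equal(h3, h2)
```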
15. Expressive power of this hypothesis language
    Conjunctions of explicit, individual properties. Examples:
    - [?, lunch, ?, cheap]:   Meal = lunch ∧ Cost = cheap
    - [?, lunch, Monday, ?]:  Meal = lunch ∧ Day = Monday
    - [?, lunch, ?, ?]:       Meal = lunch
    In addition, the 2 special hypotheses:
    - [?, ?, ?, ?]: anything
    - ⊥: nothing
16. Other languages of hypotheses are allowed
    Example: identify the color of a given sequence of colored objects.
    A useful language of hypotheses is a generalization hierarchy:
    any_color at the top; mono_color and poly_color below it; the individual colors red, blue, green, orange, purple below mono_color; ⊥ at the bottom.
    The examples: red: +, purple: -, blue: +
17. Important about hypothesis languages
    They should have a specific <-> general ordering, corresponding to the set-inclusion of the events they cover.
18. Defining Concept Learning
    Given:
    - A set X of possible events.
      Ex.: Eat-events: <Restaurant, Meal, Day, Cost>
    - An (unknown) target function c: X -> {-, +}.
      Ex.: Reaction: Eat-events -> {-, +}
    - A language of hypotheses H.
      Ex.: conjunctions: [?, lunch, Monday, ?]
    - A set of training examples D, with their value under c.
      Ex.: (<Alma 3, breakfast, Friday, cheap>, +), ...
    Find: a hypothesis h in H such that for all x in D: x is covered by h ⟺ c(x) = +.
19. The inductive learning hypothesis
    If a hypothesis approximates the target function well over a sufficiently large number of examples, then the hypothesis will also approximate the target function well on other, unobserved examples.
20. Find-S: a naive algorithm
    Initialize: h := ⊥
    For each positive training example x in D do:
      If h does not cover x:
        Replace h by a minimal generalization of h that covers x
    Return h
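A runnable sketch of Find-S under the same illustrative encoding, reusing covers from the sketch after slide 14. For this conjunctive language the minimal generalization happens to be unique: replace every mismatching attribute by ?.

```python
def minimal_generalization(h, x):
    """Minimal generalization of h that covers x. For this conjunctive
    language it is unique: replace each mismatching attribute by '?'."""
    if h is None:                      # generalizing bottom yields the event itself
        return tuple(x)
    return tuple(hv if hv == xv else '?' for hv, xv in zip(h, x))

def find_s(examples):
    """examples: list of (event, label) pairs with label '+' or '-'."""
    h = None                           # h := bottom
    for x, label in examples:
        if label == '+' and not covers(h, x):
            h = minimal_generalization(h, x)
    return h

# The observations from slide 4:
D = [(('Alma 3', 'breakfast', 'Friday', 'cheap'),     '+'),
     (('De Moete', 'lunch', 'Friday', 'expensive'),   '-'),
     (('Alma 3', 'lunch', 'Saturday', 'cheap'),       '+'),
     (('Sedes', 'breakfast', 'Sunday', 'cheap'),      '-'),
     (('Alma 3', 'breakfast', 'Sunday', 'expensive'), '-')]

print(find_s(D))   # ('Alma 3', '?', '?', 'cheap'), as on the next slide
```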
21. Reaction example
    (Minimal generalizations of ⊥ = the individual events; generalization = replace something by ?.)
    Initially:            h = ⊥
    Positive example 1:   h = [Alma 3, breakfast, Friday, cheap]
    Positive example 2:   h = [Alma 3, ?, ?, cheap]
    No more positive examples: return h.
22. Properties of Find-S
    Non-deterministic: depending on H, there may be several minimal generalizations.
    [Figure: hierarchy H with concepts Hacker, Scientist, Footballplayer above instances Beth, Jo, Alex, and ⊥ at the bottom.]
    Beth can be minimally generalized in 2 ways to include a new example Jo.
23. Properties of Find-S (2)
    May pick an incorrect hypothesis (w.r.t. the negative examples):
    D: Beth +, Alex -, Jo +
    [Figure: the same hierarchy; a minimal generalization of Beth that also covers the negative example Alex is a wrong solution.]
24. Properties of Find-S (3)
    - Cannot detect inconsistency of the training data:
      D: Beth +, Jo +, Beth -
    - Nor the inability of the language H to learn the concept:
      D: Beth +, Alex -, Jo +
      [Figure: hierarchy H in which Scientist is the only hypothesis covering both Beth and Jo, but it also covers Alex.]
25. Nice about Find-S
    It doesn't have to remember previous examples!
    If the previous h already covered all previous examples, then a minimal generalization h' will too!
    [Figure: in H, h' lies above h; if h already covered the 20 first examples, then h' will as well.]
26. Dual Find-S
    Initialize: h := [?, ?, ..., ?]
    For each negative training example x in D do:
      If h does cover x:
        Replace h by a minimal specialization of h that does not cover x
    Return h
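A matching sketch of Dual Find-S, reusing covers and the data D from the earlier sketches. Here the attribute domains must be known, and a minimal specialization is not unique, so this version simply commits to the first candidate; the trace on the next slide corresponds to a different sequence of choices.

```python
# Attribute domains for the reaction example (an assumption made explicit).
DOMAINS = [('Alma 3', 'De Moete', 'Sedes'),
           ('breakfast', 'lunch', 'dinner'),
           ('Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday'),
           ('cheap', 'expensive')]

def minimal_specializations(h, x):
    """All minimal specializations of h that do not cover event x:
    fill in one '?' with any value different from x's value there."""
    out = []
    for i, hv in enumerate(h):
        if hv == '?':
            for v in DOMAINS[i]:
                if v != x[i]:
                    out.append(h[:i] + (v,) + h[i + 1:])
    return out

def dual_find_s(examples):
    h = ('?', '?', '?', '?')           # h := top
    for x, label in examples:
        if label == '-' and covers(h, x):
            # non-deterministic step: here we just take the first candidate
            h = minimal_specializations(h, x)[0]
    return h

print(dual_find_s(D))  # ('Alma 3', 'lunch', '?', '?'): one of several valid outcomes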
27. Reaction example
    (Observations table as on slide 4.)
    Initially:            h = [?, ?, ?, ?]
    Negative example 1:   h = [?, breakfast, ?, ?]
    Negative example 2:   h = [Alma 3, breakfast, ?, ?]
    Negative example 3:   h = [Alma 3, breakfast, ?, cheap]
28. Version Spaces: the idea
    Perform both Find-S and Dual Find-S:
    - Find-S deals with the positive examples
    - Dual Find-S deals with the negative examples
    BUT: do NOT select 1 minimal generalization or specialization at each step; keep track of ALL minimal generalizations or specializations.
29. Version spaces: initialization
    The boundary sets G and S are initialized to contain only the most general and the most specific hypotheses, respectively:
    G = { [?, ?, ..., ?] }
    S = { ⊥ }
30. Negative examples
    Replace the top hypothesis by ALL minimal specializations that DO NOT cover the negative example.
    G = { [?, ?, ..., ?] }  becomes  G = { h1, h2, ..., hn };  S = { ⊥ }
    Invariant: only the hypotheses more specific than the ones of G are still possible: they don't cover the negative example.
31. Positive examples
    Replace the bottom hypothesis by ALL minimal generalizations that DO cover the positive example.
    S = { ⊥ }  becomes  S = { h1', h2', ..., hm' };  G = { h1, h2, ..., hn }
    Invariant: only the hypotheses more general than the ones of S are still possible: they do cover the positive example.
32. Later: negative examples
    Replace all hypotheses in G that cover a next negative example by ALL minimal specializations that DO NOT cover the negative example.
    G = { h1, h2, ..., hn }  becomes  G = { h1, h21, h22, ..., hn }
    Invariant: only the hypotheses more specific than the ones of G are still possible: they don't cover the negative example.
33. Later: positive examples
    Replace all hypotheses in S that do not cover a next positive example by ALL minimal generalizations that DO cover the example.
    S = { h1', h2', ..., hm' }  becomes  S = { h11', h12', h13', h2', ..., hm' }
    Invariant: only the hypotheses more general than the ones of S are still possible: they do cover the positive example.
34. Optimization: negative
    Only consider specializations of elements in G that are still more general than some specific hypothesis (in S).
    The invariant on S is used!
35. Optimization: positive
    Only consider generalizations of elements in S that are still more specific than some general hypothesis (in G).
    The invariant on G is used!
36. Pruning: negative examples
    The new negative example can also be used to prune all the S-hypotheses that cover the negative example.
    (The invariant only guarantees consistency with the previous examples, not with the last one.)
37. Pruning: positive examples
    The new positive example can also be used to prune all the G-hypotheses that do not cover the positive example.
38. Eliminate redundant hypotheses
    If a hypothesis from G is more specific than another hypothesis from G: eliminate it!
    Reason: the invariant acts as a wave front: anything above G is not allowed. The most general elements of G define the real boundary.
    Obviously the dual holds for S!
39. Convergence
    Eventually G and S may get a common element: then Version Spaces has converged to a solution.
    Remaining examples still need to be verified against the solution.
40. Reaction example
    Initialization:
    G = { [?, ?, ?, ?] }   (most general)
    S = { ⊥ }              (most specific)
41. Alma3, breakfast, Friday, cheap: +
    Positive example: minimal generalization of ⊥:
    S = { [Alma3, breakfast, Friday, cheap] };  G = { [?, ?, ?, ?] }
42. DeMoete, lunch, Friday, expensive: -
    Negative example: minimal specialization of [?, ?, ?, ?]. 15 possible specializations!
    Specific model: [Alma3, breakfast, Friday, cheap].
    - Match the negative example (eliminated):
      [DeMoete, ?, ?, ?], [?, lunch, ?, ?], [?, ?, Friday, ?], [?, ?, ?, expensive]
    - Do not generalize the specific model (eliminated):
      [Sedes, ?, ?, ?], [?, dinner, ?, ?], [?, ?, Monday, ?], [?, ?, Tuesday, ?],
      [?, ?, Wednesday, ?], [?, ?, Thursday, ?], [?, ?, Saturday, ?], [?, ?, Sunday, ?]
    - Remain:
      [Alma3, ?, ?, ?], [?, breakfast, ?, ?], [?, ?, ?, cheap]
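The surviving hypotheses can be reproduced with the helpers sketched earlier: minimal_specializations already drops the 4 candidates that still cover the negative example, and the generality filter drops the 8 that do not generalize the specific model.

```python
top = ('?', '?', '?', '?')
neg = ('De Moete', 'lunch', 'Friday', 'expensive')
pos_model = ('Alma 3', 'breakfast', 'Friday', 'cheap')   # current S element

# 3 + 3 + 7 + 2 = 15 single-attribute specializations exist in total;
# minimal_specializations already excludes the 4 that cover the negative.
candidates = minimal_specializations(top, neg)
survivors = [g for g in candidates if more_general_or_equal(g, pos_model)]
print(survivors)
# [('Alma 3', '?', '?', '?'), ('?', 'breakfast', '?', '?'), ('?', '?', '?', 'cheap')]
```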
43. Result after example 2
    G = { [Alma3, ?, ?, ?], [?, breakfast, ?, ?], [?, ?, ?, cheap] }
    S = { [Alma3, breakfast, Friday, cheap] }
44. Alma3, lunch, Saturday, cheap: +
    Positive example: minimal generalization of [Alma3, breakfast, Friday, cheap]:
    S = { [Alma3, ?, ?, cheap] }
    [?, breakfast, ?, ?] does not match the new example: pruned from G.
    G = { [Alma3, ?, ?, ?], [?, ?, ?, cheap] }
45. Sedes, breakfast, Sunday, cheap: -
    Negative example: minimal specialization of the general models.
    [Alma3, ?, ?, ?] does not cover the example and stays; [?, ?, ?, cheap] covers it.
    The only specialization that is introduced, [Alma3, ?, ?, cheap], is pruned, because it is more specific than another general hypothesis ([Alma3, ?, ?, ?]).
    G = { [Alma3, ?, ?, ?] };  S = { [Alma3, ?, ?, cheap] }
46. Alma 3, breakfast, Sunday, expensive: -
    Negative example: minimal specialization of [Alma3, ?, ?, ?]:
    [Alma3, ?, ?, cheap]: the same hypothesis as in S!
    G and S share this element: converged. Cheap food at Alma 3 produces the allergy!
47. Version Space Algorithm
    Initially:
      G := { the hypothesis that covers everything }
      S := { ⊥ }
    For each new positive example:
      Generalize all hypotheses in S that do not cover the example yet, but ensure the following:
      - Only introduce minimal changes on the hypotheses.
      - Each new specific hypothesis is a specialization of some general hypothesis.
      - No new specific hypothesis is a generalization of some other specific hypothesis.
      Prune away all hypotheses in G that do not cover the example.
48. Version Space Algorithm (2)
    For each new negative example:
      Specialize all hypotheses in G that cover the example, but ensure the following:
      - Only introduce minimal changes on the hypotheses.
      - Each new general hypothesis is a generalization of some specific hypothesis.
      - No new general hypothesis is a specialization of some other general hypothesis.
      Prune away all hypotheses in S that cover the example.
    Until there are no more examples: report S and G.
    OR S or G becomes empty: report failure.
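Combining both halves gives the candidate elimination procedure. Below is a compact sketch, reusing covers, minimal_generalization, minimal_specializations, more_general_or_equal, and the data D from the earlier sketches. It exploits the fact that for this conjunctive language the minimal generalization of an S-element is unique, which simplifies the positive case.

```python
def remove_redundant(hs, keep_most_general):
    """Drop duplicates, then drop hypotheses subsumed within the set:
    for G keep only the most general ones, for S only the most specific."""
    hs = list(dict.fromkeys(hs))                      # remove duplicates
    def dominated(h):
        return any(other != h and
                   (more_general_or_equal(other, h) if keep_most_general
                    else more_general_or_equal(h, other))
                   for other in hs)
    return [h for h in hs if not dominated(h)]

def version_space(examples):
    G = [('?', '?', '?', '?')]                        # most general
    S = [None]                                        # most specific (bottom)
    for x, label in examples:
        if label == '+':
            G = [g for g in G if covers(g, x)]        # prune G
            new_S = []
            for s in S:
                if covers(s, x):
                    new_S.append(s)
                else:
                    # unique minimal generalization in this language
                    h = minimal_generalization(s, x)
                    if any(more_general_or_equal(g, h) for g in G):
                        new_S.append(h)
            S = remove_redundant(new_S, keep_most_general=False)
        else:
            S = [s for s in S if not covers(s, x)]    # prune S
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                else:
                    for h in minimal_specializations(g, x):
                        if any(more_general_or_equal(h, s) for s in S):
                            new_G.append(h)
            G = remove_redundant(new_G, keep_most_general=True)
        if not S or not G:
            raise ValueError('inconsistent data, or concept not in H')
    return S, G

S, G = version_space(D)
print(S, G)   # both become [('Alma 3', '?', '?', 'cheap')]: converged
```

On the reaction data this reproduces the trace of slides 40-46, with S and G both ending at [Alma3, ?, ?, cheap].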
49. Properties of VS
    - Symmetry: positive and negative examples are dealt with in a completely dual way.
    - Does not need to remember previous examples.
    - Noise: VS cannot deal with noise!
      If a positive example is mistakenly given as negative, then VS eliminates the desired hypothesis from the version space!
50. Termination
    If it terminates because of "no more examples", then all remaining hypotheses, and all intermediate hypotheses, are still correct descriptions covering the training data.
    Example (spaces on termination):
      G = { [Alma3, ?, ?, ?], [?, ?, Monday, ?] }
      S = { [Alma3, ?, Monday, cheap] }
      intermediate: [Alma3, ?, ?, cheap], [?, ?, Monday, cheap], [Alma3, ?, Monday, ?]
    VS makes NO unnecessary choices!
51. Termination (2)
    If it terminates because of S or G being empty, then either:
    - the data is inconsistent (noise?), or
    - the target concept cannot be represented in the hypothesis language H.
    Example: the target concept is
      [Alma3, breakfast, ?, cheap] ∨ [Alma3, lunch, ?, cheap]
    Given examples like:
      <Alma3, dinner, Sunday, cheap>     -
      <Alma3, breakfast, Sunday, cheap>  +
      <Alma3, lunch, Sunday, cheap>      +
    this cannot be learned in our language H.
52. Which example next?
    VS can decide itself which example would be most useful next: it can 'query' a user for the most relevant additional classification!
    Example: with the spaces of slide 50, <Alma3, lunch, Monday, expensive> is classified positive by 3 hypotheses and negative by 3 hypotheses: it is the most informative new example.
53. Use of partially learned concepts
    Example: <Alma3, lunch, Monday, cheap> (with the spaces of slide 50)
    - can be classified as positive:
      - it is covered by all remaining hypotheses!
      - it is enough to check that it is covered by the hypotheses in S! (all others generalize these)
54. Use of partially learned concepts (2)
    Example: <Sedes, lunch, Sunday, cheap>
    - can be classified as negative:
      - it is not covered by any remaining hypothesis!
      - it is enough to check that it is not covered by any hypothesis in G! (all others specialize these)
55. Use of partially learned concepts (3)
    Example: <Alma3, lunch, Monday, expensive>
    - cannot be classified:
      - it is covered by 3 and not covered by 3 of the hypotheses: no conclusion.
56. Use of partially learned concepts (4)
    Example: <Sedes, lunch, Monday, expensive>
    - can only be classified with a certain degree of precision:
      - it is covered by 1 and not covered by 5 of the hypotheses;
      - it probably does not belong to the concept: ratio 1/6.
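The four cases can be checked mechanically: count how many of the six remaining hypotheses (copied from slide 50) cover the event. A short sketch reusing covers:

```python
remaining = [('Alma 3', '?', '?', '?'),     ('?', '?', 'Monday', '?'),
             ('Alma 3', '?', '?', 'cheap'), ('?', '?', 'Monday', 'cheap'),
             ('Alma 3', '?', 'Monday', '?'), ('Alma 3', '?', 'Monday', 'cheap')]

def classify(x, hypotheses):
    pos = sum(covers(h, x) for h in hypotheses)
    n = len(hypotheses)
    if pos == n:
        return '+'                     # covered by all: certainly positive
    if pos == 0:
        return '-'                     # covered by none: certainly negative
    return f'? ({pos}/{n} hypotheses say +)'

print(classify(('Alma 3', 'lunch', 'Monday', 'cheap'), remaining))      # +
print(classify(('Sedes', 'lunch', 'Sunday', 'cheap'), remaining))       # -
print(classify(('Alma 3', 'lunch', 'Monday', 'expensive'), remaining))  # ? (3/6)
print(classify(('Sedes', 'lunch', 'Monday', 'expensive'), remaining))   # ? (1/6)
```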
57. The relevance of inductive BIAS: choosing H
    - Our hypothesis language H fails to learn some concepts (see the example on slide 51: [Alma3, breakfast, ?, cheap] ∨ [Alma3, lunch, ?, cheap]).
    - What about choosing a more expressive language H'?
    - Assume H':
      - allows conjunctions (as before)
      - allows disjunction and negation too!
        Ex.: Restaurant = Alma3 ∧ ~(Day = Monday)
58. Inductive BIAS (2)
    - This language H' allows representing ANY subset of the complete set of all events X.
    - But X has 126 elements (Restaurant × Meal × Day × Cost: 3 × 3 × 7 × 2 = 126),
      so we can now express 2^126 different hypotheses!
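The count is easy to verify; for contrast (an addition, not on the slides), the conjunctive language of slide 13 contains only 4 × 4 × 8 × 3 + 1 = 385 hypotheses: each attribute takes one of its values or ?, plus ⊥.

```python
num_events = 3 * 3 * 7 * 2    # Restaurant x Meal x Day x Cost
print(num_events)              # 126
print(2 ** num_events)         # 85070591730234615865843651857942052864 hypotheses in H'
print(4 * 4 * 8 * 3 + 1)       # 385 hypotheses in the conjunctive language H
```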
59. Inductive BIAS (3)
    Version Spaces using H' (on the observations of slide 4):
    S evolves:
      { [Alma 3, breakfast, Friday, cheap] }
      { [Alma 3, breakfast, Friday, cheap] ∨ [Alma 3, lunch, Saturday, cheap] }
    G evolves:
      { ~[De Moete, lunch, Friday, expensive] }
      { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] }
      { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] ∧ ~[Alma 3, breakfast, Sunday, expensive] }
60. Inductive BIAS (4)
    Resulting Version Spaces:
      G = { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] ∧ ~[Alma 3, breakfast, Sunday, expensive] }
      S = { [Alma 3, breakfast, Friday, cheap] ∨ [Alma 3, lunch, Saturday, cheap] }
    We haven't learned anything: we merely restated our positive and negative examples!
    In general: in order to be able to learn, we need an inductive BIAS (= assumption).
    Ex.: "The desired concept CAN be described as a conjunction of features."
61. Shift of Bias
    Practical approach to the bias problem:
    - Start VS with a very weak hypothesis language.
    - If the concept is learned: o.k.
    - Else: refine your language and restart VS.
    This avoids the choice, and gives the most general concept that can be learned.
