Covering (Rules-based) Algorithm


  1. Chapter 8: Covering (Rules-based) Algorithm (Data Mining Technology)
  2. Chapter 8: Covering (Rules-based) Algorithm. Written by Shakhina Pulatova; presented by Zhao Xinyou ([email_address]), 2007.11.13. Some materials (examples) are taken from websites.
  3. Contents
     - What is the Covering (Rule-based) algorithm?
     - Classification Rules- Straightforward
       1. If-Then rule
       2. Generating rules from Decision Tree
     - Rule-based Algorithms
       1. The 1R Algorithm / Learn One Rule
       2. The PRISM Algorithm
       3. Other Algorithms
     - Application of the Covering algorithm
     - Discussion on e/m-learning application
  4. Introduction-App-1 (PP87-88)
     Training data consists of records with attributes, from which rules are derived.
     - Rules given by people
     - Rules generated by computer
     Example setting:
     1. (0, 1.75) -> short
     2. [1.75, 1.95) -> medium
     3. [1.95, ∞) -> tall
  5. Introduction-App-2 (PP87-88)
     How do we get all tall people from B, based on A?
     [Figure: training data A plus new data B]
  6. What is a Rule-based Algorithm? (PP87-88)
     Definition: each classification method uses an algorithm to generate rules from the sample data; these rules are then applied to new data.
     Rule-based algorithms provide mechanisms that generate rules by
     1. concentrating on a specific class at a time, and
     2. maximizing the probability of the desired classification.
     The rules should be compact, easy to interpret, and accurate.
  7. Classification Rules- Straightforward (PP88-89)
     - If-Then rule
     - Generating rules from Decision Tree
  8. Formal Specification of Rule-based Algorithms (PP88)
     A classification rule, r = <a, c>, consists of:
     - a (antecedent/precondition): a series of tests that are evaluated as true or false;
     - c (consequent/conclusion): the class or classes that apply to instances covered by rule r.
     [Figure: a decision tree over attributes a and b (values 0/1) with yes/no branches and leaf classes X and Y]
  9. Remarks on Straightforward classification (PP88-89)
     - The antecedent contains a predicate that can be evaluated as true or false against each tuple in the database.
     - These rules relate directly to a corresponding decision tree (DT) that could be created.
     - A DT can always be used to generate rules, but the two are not equivalent.
     Differences:
     - The tree has an implied order in which the splitting is performed; rules have no order.
     - A tree is created based on looking at all classes; a rule need examine only one class at a time.
  10. If-Then rule (PP88)
     A straightforward way to perform classification is to generate if-then rules that cover all cases.
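To make this concrete, here is a minimal sketch (mine, not from the slides) that encodes each rule r = <a, c> as an (antecedent, consequent) pair and classifies a tuple with the first rule whose antecedent holds; the height intervals reuse the example setting from the introduction slide.

```python
# A minimal sketch of classification rules r = <a, c>: each rule pairs an
# antecedent (a test over a tuple) with a consequent (a class label).

rules = [
    (lambda t: t["height"] < 1.75, "short"),
    (lambda t: 1.75 <= t["height"] < 1.95, "medium"),
    (lambda t: t["height"] >= 1.95, "tall"),
]

def classify(tuple_, rules):
    """Return the consequent of the first rule whose antecedent is true."""
    for antecedent, consequent in rules:
        if antecedent(tuple_):
            return consequent
    return None  # no rule covers this tuple

print(classify({"height": 1.98}, rules))  # -> tall
```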
  11. Generating rules from Decision Tree -1-Con'
     [Figure: an example decision tree]
  12. Generating rules from Decision Tree -2-Con'
     [Figure: a decision tree with tests a, b, c, d, branch labels y/n, and leaf classes x and y]
  13. Generating rules from Decision Tree -3-Con'
     [Figure: the if-then rules read off the tree, one per root-to-leaf path]
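As a rough illustration of the path-to-rule idea (a sketch of my own, not the slides' code), the function below walks a toy decision tree, encoded as a nested dict, and emits one if-then rule per root-to-leaf path. The tree itself is a guess at the two-attribute example from the formal-specification slide.

```python
# A sketch of generating rules from a decision tree. Internal nodes are dicts
# {"test": <condition string>, True: <subtree>, False: <subtree>}; leaves are
# class labels. One rule is produced for every root-to-leaf path.

def tree_to_rules(node, conditions=()):
    if not isinstance(node, dict):  # leaf: the path's tests form the antecedent
        return [(" and ".join(conditions) or "true", node)]
    yes_branch = tree_to_rules(node[True], conditions + (node["test"],))
    no_branch = tree_to_rules(node[False], conditions + ("not " + node["test"],))
    return yes_branch + no_branch

# A guess at the toy tree from the formal-specification slide.
tree = {"test": "a=0",
        True:  {"test": "b=0", True: "X", False: "Y"},
        False: {"test": "b=0", True: "Y", False: "X"}}

for antecedent, consequent in tree_to_rules(tree):
    print(f"if {antecedent} then class = {consequent}")
```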
  14. Remarks (PP89-90)
     - Rules generated from a DT may be more complex and less comprehensible than the tree itself.
     - A new test or rule requires reshaping the whole tree.
     - Rules obtained without decision trees are more compact and accurate.
     - For these reasons, many other covering algorithms have been proposed.
     [Figure: a tree containing duplicate subtrees (the c/d tests repeated under several branches) next to the equivalent compact rule, e.g. if a=1 and c=0 then Y]
  15. Rule-based Classification (PP90)
     Generate rules:
     - The 1R Algorithm / Learn One Rule
     - The PRISM Algorithm
     - Other Algorithms
  16. Generating rules without Decision Trees-1-con'
     - Goal: find rules that identify the instances of a specific class.
     - Generate the "best" rule possible by optimizing the desired classification probability.
     - Usually, the "best" attribute-value pair is chosen.
     Remark: these techniques are also called covering algorithms because they attempt to generate rules that exactly cover a specific class.
  17. Generate Rules-Example-2-Con' (PP90)
     Example 3. Question: we want to generate a rule to classify persons as tall. The basic format of the rule:
     if ? then class = tall
     Goal: replace "?" with predicates that can be used to obtain the "best" probability of being tall.
  18. Generate Rules-Algorithms-3-Con' (PP90)
     1. Generate rule R on training data S;
     2. Remove the training data covered by rule R;
     3. Repeat the process.
     (A Python sketch of this loop follows the next slide.)
  19. Generate Rules-Example-4-Con'
     Sequential Covering
     [Figure: (i) original data; (ii) step 1, r = NULL; (iii) step 2, rule R1 learned, r = R1; (iv) step 3, r = R1 ∪ R2; (v) step 4, r = R1 ∪ R2 ∪ R3, with a few instances still assigned the wrong class]
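The loop from slide 18 fits in a few lines of Python. This is a minimal sketch under stated assumptions: learn_one_rule is a stand-in for whatever rule learner is used (1R, PRISM, ...) and is assumed to return an (antecedent, consequent) pair for the target class, or None when no acceptable rule remains.

```python
# A minimal sketch of sequential covering: learn one rule, remove the tuples
# it covers, and repeat. learn_one_rule is assumed, not defined here.

def sequential_covering(data, learn_one_rule):
    rules = []
    while data:
        rule = learn_one_rule(data)        # step 1: generate rule R on S
        if rule is None:                   # no acceptable rule remains
            break
        antecedent, _consequent = rule
        remaining = [t for t in data if not antecedent(t)]
        if len(remaining) == len(data):    # R covers nothing: stop
            break
        rules.append(rule)                 # r = R1 U R2 U ...
        data = remaining                   # step 2: remove covered tuples
    return rules
```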
  20. 1R Algorithm/ Learn One Rule-Con' (PP91)
     - A simple and cheap method: it generates only a one-level decision tree.
     - It classifies an object on the basis of a single attribute.
     - Idea: rules are constructed that test a single attribute and branch for every value of that attribute. For each branch, the class assigned is the one that occurs most often in the training data.
  21. 1R Algorithm/ Learn One Rule-Con' (PP91)
     Steps:
     1. Construct rules that test a single attribute and branch for every value of that attribute.
     2. For each branch, find the class that occurs most often in the training data.
     3. Use that majority class as the rule's prediction.
     4. Evaluate the error rate of each attribute's rule set.
     5. Choose the attribute with the minimum error rate.
     Example counts (S = short, M = medium, T = tall):
     Gender:  F: S=2, M=5, T=1   ->  F->M, Error = 3
              M: S=1, M=4, T=10  ->  M->T, Error = 5
              Total Error = 8
     A2: Total Error = 3
     ...
     An: Total Error = ...
  22. 1R Algorithm
     Input:
       D  // training data
       T  // attributes to consider for rules
       C  // classes
     Output:
       R  // rules
     Algorithm:
       R = Φ;
       for all A in T do
         R_A = Φ;
         for all possible values, v, of A do
           for all C_j ∈ C do
             find count(C_j)
           end for
           let C_m be the class with the largest count;
           R_A = R_A ∪ {(A=v) -> (class = C_m)};
         end for
         ERR_A = number of tuples incorrectly classified by R_A;
       end for
       R = R_A where ERR_A is minimum
     Example: T = {Gender, Height}, with domains {F, M} and (0, ∞).
     Training data counts for Gender:
       F: Short=3, Medium=6, Tall=0  ->  R1 = F->medium
       M: Short=1, Medium=2, Tall=3  ->  R2 = M->tall
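The pseudocode translates almost line for line into Python. A minimal sketch, assuming discrete attribute values and a single class attribute per tuple:

```python
# 1R: for each attribute, map every value to its majority class, count the
# misclassified tuples, and keep the attribute with the minimum total error.
from collections import Counter

def one_r(data, attributes, class_attr):
    """Return ((best_attribute, {value: class}), error_count)."""
    best_rules, best_error = None, None
    for attr in attributes:
        rules, errors = {}, 0
        for value in {t[attr] for t in data}:
            counts = Counter(t[class_attr] for t in data if t[attr] == value)
            majority, hits = counts.most_common(1)[0]
            rules[value] = majority                 # (A = v) -> (class = Cm)
            errors += sum(counts.values()) - hits   # tuples this rule misses
        if best_error is None or errors < best_error:
            best_rules, best_error = (attr, rules), errors
    return best_rules, best_error
```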
  23. Example 5 – 1R-3-Con'
     Option  Attribute           Rules                 Error   Total Error
     1       Gender              F->medium             3/9
                                 M->tall               3/6     6/15
     2       Height (Step=0.1)   (0, 1.6]->short       0/2
                                 (1.6, 1.7]->short     0/2
                                 (1.7, 1.8]->medium    0/3
                                 (1.8, 1.9]->medium    0/4
                                 (1.9, 2.0]->medium    1/2
                                 (2.0, ∞]->tall        0/2     1/15
     The rules based on Height, with the minimum total error of 1/15, are chosen.
  24. Example 6 -1R (PP92-93)
        Attribute      Rules            Error   Total Error
     1  outlook        Sunny->no        2/5
                       Overcast->yes    0/4
                       Rainy->yes       2/5     4/14
     2  temperature    Hot->no          2/4
                       Mild->yes        2/6
                       Cool->yes        1/4     5/14
     3  humidity       High->no         3/7
                       Normal->yes      1/7     4/14
     4  windy          False->yes       2/8
                       True->no         3/6     5/14
     Chosen (tie at 4/14): rules based on humidity (High->no, Normal->yes) OR rules based on outlook (Sunny->no, Overcast->yes, Rainy->yes).
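As a check, running the one_r sketch above on the standard 14-row weather data this example uses reproduces the table: outlook and humidity tie at 4/14 errors, and the code returns outlook because it is examined first.

```python
# The classic weather dataset from Example 6.
weather = [
    {"outlook": "sunny",    "temperature": "hot",  "humidity": "high",   "windy": False, "play": "no"},
    {"outlook": "sunny",    "temperature": "hot",  "humidity": "high",   "windy": True,  "play": "no"},
    {"outlook": "overcast", "temperature": "hot",  "humidity": "high",   "windy": False, "play": "yes"},
    {"outlook": "rainy",    "temperature": "mild", "humidity": "high",   "windy": False, "play": "yes"},
    {"outlook": "rainy",    "temperature": "cool", "humidity": "normal", "windy": False, "play": "yes"},
    {"outlook": "rainy",    "temperature": "cool", "humidity": "normal", "windy": True,  "play": "no"},
    {"outlook": "overcast", "temperature": "cool", "humidity": "normal", "windy": True,  "play": "yes"},
    {"outlook": "sunny",    "temperature": "mild", "humidity": "high",   "windy": False, "play": "no"},
    {"outlook": "sunny",    "temperature": "cool", "humidity": "normal", "windy": False, "play": "yes"},
    {"outlook": "rainy",    "temperature": "mild", "humidity": "normal", "windy": False, "play": "yes"},
    {"outlook": "sunny",    "temperature": "mild", "humidity": "normal", "windy": True,  "play": "yes"},
    {"outlook": "overcast", "temperature": "mild", "humidity": "high",   "windy": True,  "play": "yes"},
    {"outlook": "overcast", "temperature": "hot",  "humidity": "normal", "windy": False, "play": "yes"},
    {"outlook": "rainy",    "temperature": "mild", "humidity": "high",   "windy": True,  "play": "no"},
]

(attr, rule_map), errors = one_r(
    weather, ["outlook", "temperature", "humidity", "windy"], "play")
print(attr, rule_map, f"errors: {errors}/14")
# -> outlook wins with 4/14 errors; the order of values in rule_map may vary.
```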
  25. PRISM Algorithm-Con'
     - PRISM generates rules for each class by looking at the training data and adding rules that completely describe all tuples in that class.
     - It generates only correct or "perfect" rules: the accuracy of the rules so constructed is 100% on the training data.
     - It measures the success of a rule by p/t, where
       p is the number of positive instances covered by the rule, and
       t is the total number of instances covered by the rule.
     Example counts (S = short, M = medium, T = tall):
       F: S=2, M=5, T=1   ->  Gender=Female: p=1,  t=8
       M: S=0, M=0, T=10  ->  Gender=Male:   p=10, t=10
     R = (Gender = Male) ...
  26. PRISM Algorithm
     Input:
       D  // training data
       C  // classes
     Output:
       R  // rules
     Steps:
     1. Compute p/t for every (Attribute -> Value) pair for the class.
     2. Find one or more (Attribute -> Value) pairs with p/t = 100%.
     3. Select such an (Attribute -> Value) pair as a rule.
     4. Repeat 1-3 until no data remain in D.
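A compact PRISM-style sketch of steps 1-4, assuming categorical attributes: it greedily adds the (attribute, value) test with the highest p/t until the rule is perfect on the tuples still covered, then removes the covered tuples and repeats. Real PRISM breaks p/t ties by preferring larger p; that refinement is omitted here.

```python
def ratio(data, av, class_attr, target):
    """p/t for the test (attribute == value) on the given data."""
    a, v = av
    t_count = sum(1 for t in data if t[a] == v)
    p_count = sum(1 for t in data if t[a] == v and t[class_attr] == target)
    return p_count / t_count if t_count else 0.0

def prism(data, attributes, class_attr, target):
    """Learn perfect rules for one target class by repeated covering."""
    rules = []
    while any(t[class_attr] == target for t in data):
        covered, tests, unused = list(data), [], list(attributes)
        # refine the current rule until it covers only the target class
        while unused and any(t[class_attr] != target for t in covered):
            best = max(((a, v) for a in unused for v in {t[a] for t in covered}),
                       key=lambda av: ratio(covered, av, class_attr, target))
            tests.append(best)
            unused.remove(best[0])
            covered = [t for t in covered if t[best[0]] == best[1]]
        rules.append((tests, target))          # p/t = 100% on training data
        data = [t for t in data if not all(t[a] == v for a, v in tests)]
    return rules
```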
  27. Example 8-Con'-Which class may be tall? (PP94-95)
     Compute the value p/t. Which one is 100%?
     Num  (Attribute, value)     p/t
     1    Gender = F             0/9
     2    Gender = M             3/6
     3    Height ≤ 1.6           0/2
     4    1.6 < Height ≤ 1.7     0/2
     5    1.7 < Height ≤ 1.8     0/3
     6    1.8 < Height ≤ 1.9     0/4
     7    1.9 < Height ≤ 2.0     1/2
     8    2.0 < Height           2/2
     R1 = (2.0 < Height)
  28. R2 = (1.95 < Height ≤ 2.0); R = R1 ∪ R2 (PP94-96)
     Num  (Attribute, value)      p/t
     ...  ...                     ...
          1.9 < Height ≤ 1.95     0/1
          1.95 < Height ≤ 2.0     1/1
  29. Example 9-Con'-Which days may play?
     Compute the value p/t. The predicate outlook=overcast correctly implies play=yes on all four rows, so
     R1 = if outlook=overcast, then play=yes
  30. Example 9-Con'
     R2 = if humidity=normal and windy=false, then play=yes
  31. Example 9-Con'
     R3 = ...
     R = R1 ∪ R2 ∪ R3 ∪ ...
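Applying the prism sketch to the weather data defined earlier confirms the start of Example 9: the first perfect rule it finds for play = yes is outlook = overcast (p/t = 4/4). Later rules may differ from the slides' R2, since several candidate tests can tie at p/t = 100% and the sketch breaks ties arbitrarily.

```python
# Continuing with the weather data and prism sketch above.
rules = prism(weather, ["outlook", "temperature", "humidity", "windy"],
              "play", "yes")
tests, target = rules[0]
print("if", " and ".join(f"{a}={v}" for a, v in tests), f"then play={target}")
# -> if outlook=overcast then play=yes
```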
  32. Application of Covering Algorithm
     - Deriving classification rules for diagnosing illness, business planning, banking, and government.
     - Machine learning.
     - Text classification (applying it to photos, however, is difficult...).
     - And so on.
  33. Application on E-learning/M-learning
     - Adaptive and personalized learning materials
     - Virtual Group Classification
     [Flow: collect the initial learner's information -> classify learning styles (rule-based algorithm; similarity, Bayesian, and other methods from Chapters 2 and 3) -> provide adaptive and personalized materials -> collect learning-style feedback]
  34. Discussion
