# GrC2011(M1 大木基至)_11.11.08

Published in: Education, Technology

1. Decision Rule Visualization for Knowledge Discovery by Means of Rough Set Approach (GrC2011, 11/09/2011)
    - Motoyuki Ohki, Masahiro Inuiguchi, Toshinobu Harada
    - Graduate School of Engineering Science, Osaka University
    - Faculty of Systems Engineering, Wakayama University
2. 00. Outline
    - 01. Background and Purpose
    - 02. Algorithm for Decision Rule Visualization
    - 03. Visualization System
    - 04. Evaluation Experiment
    - 05. Summary and Future Work
3. 01. Background
    - Rough set approach: attribute reduction and induction of decision rules
    - Application to various fields
4. 01. Background
    - A decision table:

    | Sample | Color (a) | Shape (b) | The number of doors (c) | Type (d) | Preference |
    | --- | --- | --- | --- | --- | --- |
    | car1 | colored (a1) | nature (b1) | two (c1) | personal (d1) | I'd like to buy (1) |
    | car2 | colored (a1) | rounded (b2) | four (c2) | sporty (d2) | I don't know (2) |
    | car3 | monochrome (a2) | rounded (b2) | four (c2) | formal (d3) | I don't know (2) |
    | car4 | monochrome (a2) | nature (b1) | four (c2) | personal (d1) | I'd like to buy (1) |
    | car5 | monochrome (a2) | rounded (b2) | two (c1) | personal (d1) | I don't know (2) |
    | car6 | colored (a1) | rounded (b2) | two (c1) | sporty (d2) | I'd like to buy (1) |

    - Decision rule: If “b1” then “1”
    - Decision rule: If “a1 and d2” then “1”
    - We select useful decision rules from among the many induced rules and apply them to actual problems.
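As a minimal sketch (not the authors' implementation), the decision table on this slide can be encoded to see which samples the condition part of a rule covers and whether all covered samples share the stated conclusion. The names `DECISION_TABLE`, `matching_samples`, and `rule_holds` are illustrative assumptions, and the example checks only the rule If “b1” then “1”.

```python
# Decision table from the slide: sample -> (condition attribute values, conclusion).
DECISION_TABLE = {
    "car1": ({"a1", "b1", "c1", "d1"}, 1),
    "car2": ({"a1", "b2", "c2", "d2"}, 2),
    "car3": ({"a2", "b2", "c2", "d3"}, 2),
    "car4": ({"a2", "b1", "c2", "d1"}, 1),
    "car5": ({"a2", "b2", "c1", "d1"}, 2),
    "car6": ({"a1", "b2", "c1", "d2"}, 1),
}


def matching_samples(condition):
    """Return the samples whose attribute values include every value in `condition`."""
    return sorted(
        name for name, (values, _) in DECISION_TABLE.items() if condition <= values
    )


def rule_holds(condition, conclusion):
    """True if the condition covers at least one sample and all covered samples share `conclusion`."""
    matched = matching_samples(condition)
    return bool(matched) and all(
        DECISION_TABLE[name][1] == conclusion for name in matched
    )


print(matching_samples({"b1"}))  # samples covered by the condition "b1"
print(rule_holds({"b1"}, 1))     # whether every covered sample has conclusion 1
```

Representing each row as a set of attribute values keeps the coverage test a single subset comparison, which is enough for a table of this size.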
5. 01. Background
    - Technical issues: the induced rules are difficult to interpret, and the result depends on the analyst
    - Consequently, it is difficult to find useful decision rules
    - Figure: an example of inducing decision rules [1]
    - [1] HOLON CREATE, Rough Sets Analysis Program, http://www.holon.com/program.html
6. 01. Purpose
    - Propose an algorithm for visualizing decision rules in the rough set approach
    - Support the discovery of useful decision rules
    - Examples of visual data mining [1, 2, 3]
    - [1] SOM (self-organizing maps), http://www.mindware-jp.com/Viscovery/self-organizing-maps.html
    - [2] Purple Insight MineSet, http://journal.mycom.co.jp/news/2006/06/28/347.html
    - [3] Natto View, http://www.holon.com/program.html
7. 02. Methods used in the proposed visualization
    - Three methods:
        - (i) Decision matrix-based rule induction
        - (ii) Calculation of co-occurrence rates
        - (iii) Hayashi’s Quantification Method Ⅳ
    - We quantitatively evaluate the dependencies between attribute values and conclusions.
8. 02. Co-occurrence Rate
    - Definition: the degree of dependency “between attribute values” and “between attribute values and the conclusion”
    - Measured by the Jaccard coefficient
    - Formula: $J(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}$, where $|X|$ is the cardinality of set $X$
9. 02. Co-occurrence Rate
    - Calculation example, using the decision table of slide 4 (car1 to car6)
    - The rate between “a1” and “b1”: a1 occurs in car1, car2, and car6, and b1 occurs in car1 and car4, so $J(a1, b1) = \frac{|\{car1\}|}{|\{car1, car2, car4, car6\}|} = \frac{1}{4}$
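The calculation on this slide can be sketched for every pair of attribute values at once. This is an illustrative sketch rather than the authors' code: `ROWS`, `occurrence_sets`, and `jaccard` are assumed names, and each attribute value is represented by the set of samples in which it occurs.

```python
from itertools import combinations

# Condition attribute values of each sample, taken from the decision table.
ROWS = {
    "car1": ["a1", "b1", "c1", "d1"],
    "car2": ["a1", "b2", "c2", "d2"],
    "car3": ["a2", "b2", "c2", "d3"],
    "car4": ["a2", "b1", "c2", "d1"],
    "car5": ["a2", "b2", "c1", "d1"],
    "car6": ["a1", "b2", "c1", "d2"],
}


def occurrence_sets(rows):
    """Map each attribute value to the set of samples that contain it."""
    sets = {}
    for sample, values in rows.items():
        for value in values:
            sets.setdefault(value, set()).add(sample)
    return sets


def jaccard(x, y):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y| (defined as 0 for two empty sets)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


sets = occurrence_sets(ROWS)
# Co-occurrence rate for every unordered pair of attribute values.
rates = {
    (p, q): jaccard(sets[p], sets[q]) for p, q in combinations(sorted(sets), 2)
}
print(rates[("a1", "b1")])  # 0.25
```

Values of the same attribute, such as a1 and a2, never co-occur in one sample, so their rate is 0.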
10. 02. Hayashi’s Quantification Method Ⅳ
    - Definition: a kind of multi-dimensional scaling
    - Plots all objects in a two-dimensional coordinate system
    - Algorithm: (formula omitted in the transcript)
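Since the algorithm itself did not survive in the transcript, the following is only a sketch of one common textbook formulation of Quantification Method Ⅳ, not necessarily the variant used by the authors. It assumes non-negative similarities $e_{ij}$ over a connected set of objects and places objects so that $-\sum_{i \ne j} e_{ij}(x_i - x_j)^2$ is maximized under a zero-mean, fixed-norm constraint, which reduces to an eigenvector problem. The function name `quantification_iv` and the toy similarity matrix are hypothetical.

```python
import numpy as np


def quantification_iv(similarity, dims=2):
    """Place objects on `dims` axes from a pairwise similarity matrix (sketch)."""
    e = np.asarray(similarity, dtype=float)
    # Symmetrize: h_ij = e_ij + e_ji for i != j.
    h = e + e.T
    np.fill_diagonal(h, 0.0)
    # Diagonal: h_ii = -(sum of the off-diagonal entries in row i).
    np.fill_diagonal(h, -h.sum(axis=1))
    eigenvalues, eigenvectors = np.linalg.eigh(h)
    order = np.argsort(eigenvalues)[::-1]
    # Under the stated assumptions the largest eigenvalue is 0 with a constant
    # eigenvector, which carries no layout information, so it is skipped.
    chosen = order[1:1 + dims]
    return eigenvectors[:, chosen]


# Hypothetical similarities: objects 0 and 1 are alike, as are objects 2 and 3.
toy = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]
coords = quantification_iv(toy, dims=1)
print(coords[:, 0])  # objects 0 and 1 share one side of the axis, 2 and 3 the other
```

In the flow described on the following slides, the similarity matrix would hold the Jaccard coefficients between attribute values, and two axes would give the X-Y locations.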
11. 02. Flow of the Decision Rule Visualization
    - Input: a decision table
    - Analysis: calculate the Jaccard coefficients between attribute values, then apply Hayashi’s quantification method
    - Step 1: we obtain the locations of the attribute values in the X-Y coordinates
    - Output: attribute values
12. 02. Flow of the Decision Rule Visualization
    - Input: a decision table
    - Analysis: calculate the Jaccard coefficients between attribute values and the conclusion
    - Step 2: we obtain the locations of the attribute values on the Z coordinate
    - Output: (figure omitted in the transcript)
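Step 2 can be illustrated with the car table from slide 4. The sketch below assumes that the Z value of an attribute value is its Jaccard coefficient with the set of samples having conclusion 1, which is how the slide text reads; the names `OCCURRENCES`, `CONCLUSION_1`, and `z_values` are illustrative.

```python
# Samples in which each attribute value occurs (from the car decision table).
OCCURRENCES = {
    "a1": {"car1", "car2", "car6"},
    "a2": {"car3", "car4", "car5"},
    "b1": {"car1", "car4"},
    "b2": {"car2", "car3", "car5", "car6"},
    "c1": {"car1", "car5", "car6"},
    "c2": {"car2", "car3", "car4"},
    "d1": {"car1", "car4", "car5"},
    "d2": {"car2", "car6"},
    "d3": {"car3"},
}
# Samples whose conclusion is 1 ("I'd like to buy").
CONCLUSION_1 = {"car1", "car4", "car6"}


def jaccard(x, y):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y| (defined as 0 for two empty sets)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


# Assumed Z coordinate: co-occurrence rate with conclusion 1.
z_values = {value: jaccard(samples, CONCLUSION_1) for value, samples in OCCURRENCES.items()}
for value, z in sorted(z_values.items(), key=lambda item: -item[1]):
    print(f"{value}: {z:.3f}")
```

With this table, b1 gets the highest value, in line with the rule If “b1” then “1” from slide 4.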
13. 02. Flow of the Decision Rule Visualization
    - Input: a decision table
    - Analysis: induce decision rules by the rough set approach and calculate their C.I values
    - Step 3: decision rules are represented as links
    - Output: for example, the decision rule a1b2 is drawn as a link between a1 and b2
14. 03. Visualization System
    - Decision table used in the example: 16 attribute values, 31 induced decision rules
    - c1 (0.500): strongly dependent on the conclusion
    - Decision rule c1d3: a candidate for a useful decision rule
15. 04. Evaluation Experiment
    - Two evaluation experiments check the efficiency and usefulness of the visualization method.
    - [1] Product evaluation experiment: to check the advantage of the visualization method
    - [2] Numerical experiment: to check the usefulness of the decision rules selected by examinees using the visualization system
16. 04. Product Evaluation Experiment
    - Procedure 1 (samples and attribute values): 24 digital cameras as samples, with 7 condition attributes, e.g. face shape, position of lens, etc.
    - Procedure 2: we ask three examinees about their motivation to buy these digital cameras
        - Conclusion 1: “I want to buy it”
        - Conclusion 2: “I will not buy it”
17. 04. Product Evaluation Experiment
    - Procedure 3: we compare how well decision rules can be selected with the following two methods
        - The proposed visualization method
        - Commercial software provided by HOLON [1]
    - [1] HOLON CREATE, Rough Sets Analysis Program, http://www.holon.com/program.html
18. 04. Product Evaluation Experiment
    - Evaluation of the commercial software, which outputs a list of decision rules with C.I values:

    | Decision rule | C.I value |
    | --- | --- |
    | e2f3 | 0.167 |
    | b2f2 | 0.167 |
    | a2d2 | 0.167 |
    | c1f1g2 | 0.167 |
    | b1c1f1 | 0.167 |
    | a2f2g1 | 0.167 |
    | a2b1e2 | 0.167 |
    | b2e1 | 0.083 |
    | d2f3 | 0.083 |
    | a1d3 | 0.083 |

    - Table: decision rules and C.I values induced by the commercial software
    - It is difficult to find the useful decision rules in this list
    - The selected decision rules differ among examinees
19. 04. Product Evaluation Experiment
    - Evaluation of the visualization system (1): the strength of the dependencies is easy to understand at a glance
    - Examples:
        - e2 (no dial, Z-value = 0.450)
        - c1 (shape of face is a straight line, Z-value = 0.429)
        - g2 (shape of edge strip is rounded, Z-value = 0.412)
20. 04. Product Evaluation Experiment
    - Evaluation of the visualization system (2): we can find weakly related condition attribute values
    - Examples:
        - f1, f2, and f3 are located at low positions
        - “f” (location of flash) is not very influential for this examinee’s preference
21. 04. Product Evaluation Experiment
    - Evaluation of the visualization system (3): the length of the links can express an imbalanced influence of the attribute values
    - Examples:
        - “e2f3”: long link → unreliable decision rule
        - “a2b1e2”: short link → reliable decision rule
    - Figure legend: decision rules composed of three attribute values; decision rules composed of two attribute values (labeled nodes: e2, b1, a2, f3)
22. 04. Numerical Experiment
    - Procedure 1: partition the “car” data set randomly into ten subsets (1, 2, 3, ..., 10)
    - The “car” data set was obtained from the UCI web site [1]
    - [1] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/
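Procedure 1 amounts to a random split into ten parts. A minimal sketch is given below; the slides do not say how the split was drawn, so the shuffle-then-slice approach, the fixed seed, and the stand-in record indices (100 integers rather than the actual “car” records) are all assumptions.

```python
import random


def partition(items, k=10, seed=0):
    """Shuffle `items` and deal them into `k` subsets of (nearly) equal size."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Subset i takes every k-th element starting at position i.
    return [shuffled[i::k] for i in range(k)]


records = range(100)  # placeholder indices, not the real data set
subsets = partition(records)
print([len(subset) for subset in subsets])
```

Passing a seed to a local `random.Random` instance keeps the split reproducible without touching the global random state.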
23. 04. Numerical Experiment
    - Procedure 2: ask each of the three examinees to select three decision rules for each subset of the “car” data set
    - Rules shown on the slide: a1c3, b1d2, a1c2, a1d2, a1c3, d2b1, a1c3, d2b1, b1d2
24. 04. Numerical Experiment
    - Procedure 3: compare the three selected decision rules (Rule Set 1) with non-selected decision rules having the same C.I values (Rule Set 2)
        - Rule Set 1: a1d2, a1c3, b1d2
        - Rule Set 2: c2d2, b3c3, …
    - Calculation of the average accuracy (the figure shows subsets 1, 2, ..., 9 under each rule set)
25. 04. Numerical Experiment
    - Results of average accuracy: by a paired t-test with significance level α = 0.05, we confirmed the advantage of Rule Set 1 over Rule Set 2
    - We confirmed the usefulness of the proposed method
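The paired t-test mentioned on this slide can be sketched with the standard library alone. The accuracy values below are made-up placeholders, not the results of the experiment (which the transcript does not preserve); `paired_t_statistic` is an illustrative name, and the critical value 3.182 is the two-sided 5% value for 3 degrees of freedom from a standard t-table.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples `first` and `second`."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    # Sample standard deviation of the differences (n - 1 in the denominator).
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / standard_error, n - 1


# Hypothetical average accuracies, paired by evaluation run (placeholders only).
rule_set_1 = [0.90, 0.80, 0.85, 0.95]
rule_set_2 = [0.80, 0.75, 0.80, 0.85]

t, df = paired_t_statistic(rule_set_1, rule_set_2)
T_CRITICAL_DF3 = 3.182  # two-sided, alpha = 0.05, 3 degrees of freedom
significant = abs(t) > T_CRITICAL_DF3
print(f"t = {t:.3f}, df = {df}, significant at 0.05: {significant}")
```

A paired test fits this design because both rule sets are evaluated on the same data, so only the per-pair differences matter.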
26. 05. Summary and Future Work
    - Summary
        - 1. We proposed a method of visualizing decision rules.
        - 2. We developed a visualization system based on the proposed method.
        - 3. We conducted two experiments and confirmed the effectiveness and usefulness of the visualization system.
    - Future work
        - 1. Conduct more experiments with many different decision tables.
        - 2. Improve the system to enhance the precision of the analysis method.
27. Thank you for listening!
    - Motoyuki Ohki, Graduate School of Engineering Science, Osaka University
    - E-mail: ohki@inulab.sys.es.osaka-u.ac.jp
28. Appendix
29. 00. Samples and Attribute
    - 24 digital cameras
    - 7 attribute values
30. 00. Conventional Research
    - Multi-valued decision diagrams [1]: this method uses a multi-valued decision diagram
    - Hierarchical visualization method [2]: this method uses a hierarchical graph structure
    - [1] Y. Tomoto, T. Ohira, T. Nakamura, M. Kanoh, and H. Itoh, “Applying Multi-valued Decision Diagram to Visualization of If-Then Rules,” Kansei Engineering International Journal, vol. 9, no. 2, 2010, pp. 259-267.
    - [2] A. Ito, T. Yoshikawa, T. Furuhashi, and S. Mitsumatsu, “Profiling by Association Analysis using Hierarchical Visualization Method,” Kansei Engineering International Journal, vol. 10, no. 2, 2011, pp. 205-212.
31. 00. Co-occurrence Rate
    - The reason for selecting the Jaccard coefficient, illustrated for attribute value X and attribute value Y
    - Example (1): |X| = 100, |Y| = 1, |X∩Y| = 1, |X∪Y| = 100
        - Jaccard = 1/100, Simpson = 1, Cosine = 1/10, Dice = 2/101
    - Example (2): |X| = 100, |Y| = 100, |X∩Y| = 50, |X∪Y| = 150
        - Jaccard = 1/3, Simpson = 1/2, Cosine = 1/2, Dice = 1/2
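The two examples on this slide can be reproduced from the set sizes alone. The sketch below uses the usual definitions of the four coefficients; the function name `coefficients` is an assumption, and the comparison in the closing note is a reading of the numbers rather than a statement from the slide.

```python
import math


def coefficients(size_x, size_y, size_intersection):
    """Similarity coefficients computed from |X|, |Y|, and |X ∩ Y|."""
    size_union = size_x + size_y - size_intersection
    return {
        "jaccard": size_intersection / size_union,
        "simpson": size_intersection / min(size_x, size_y),
        "cosine": size_intersection / math.sqrt(size_x * size_y),
        "dice": 2 * size_intersection / (size_x + size_y),
    }


example_1 = coefficients(100, 1, 1)      # very unbalanced set sizes
example_2 = coefficients(100, 100, 50)   # balanced set sizes, half overlapping
for name in ("jaccard", "simpson", "cosine", "dice"):
    print(f"{name}: {example_1[name]:.4f} vs {example_2[name]:.4f}")
```

In example (1) the Simpson coefficient reaches 1 even though Y occurs only once, whereas the Jaccard coefficient stays at 1/100, which suggests why a measure that penalizes unbalanced set sizes was preferred.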