# WEKA: Output Knowledge Representation


1. Output: Knowledge Representation
2. Topics Covered
   We will see how knowledge can be represented as:
   - Decision tables
   - Decision trees
   - Classification and association rules
   - Rules involving exceptions and relations
   - Trees for numeric prediction
   - Instance-based representation
   - Clustering
3. Decision Tables
   - The simplest way to represent the output is to use the same form in which the input was represented.
   - The selection of attributes is crucial: only attributes that contribute to the result should be part of the table.
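A decision table can be sketched as a simple lookup from the values of the selected attributes to a class. This is an illustrative Python sketch, not WEKA's own representation; the attribute names and rows are made up:

```python
# A decision table maps values of the chosen attributes to a class.
# Attributes ("outlook", "humidity") and rows are illustrative only.
decision_table = {
    ("sunny", "high"): "no",
    ("sunny", "normal"): "yes",
    ("overcast", "high"): "yes",
    ("rainy", "normal"): "yes",
}

def table_classify(outlook, humidity):
    # Only the attributes kept in the table take part in the decision.
    return decision_table.get((outlook, humidity), "unknown")

print(table_classify("sunny", "high"))     # -> no
print(table_classify("overcast", "high"))  # -> yes
```

Attributes that do not contribute to the result are simply left out of the key, which is why attribute selection matters so much for this representation.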
4. Decision Trees
   - The divide-and-conquer approach gives us results in the form of a decision tree.
5. Decision Trees (continued)
   - Each node in a decision tree tests a particular attribute.
   - A leaf node gives a classification that applies to all instances that reach it.
   - The number of children emerging from a node depends on the type of attribute being tested:
     - For a nominal attribute, the number of splits is generally the number of different values of the attribute; for example, outlook gets three branches because it has three possible values.
     - For a numeric attribute, we generally have a two-way split on whether the value is less than or greater than a threshold; for example, the attribute humidity in the previous example.
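The two kinds of split can be sketched as a tiny hand-built tree for the weather example. This is a Python sketch with an illustrative humidity threshold of 75, not a tree learned by WEKA:

```python
# A minimal hand-built decision tree for the weather example:
# the root tests the nominal attribute "outlook" (three-way split);
# the "sunny" branch tests the numeric attribute "humidity" (two-way split).
def tree_classify(outlook, humidity):
    if outlook == "overcast":
        return "yes"                    # leaf: applies to all instances reaching it
    elif outlook == "sunny":
        # numeric attribute: two-way split at an illustrative threshold
        return "yes" if humidity <= 75 else "no"
    else:                               # "rainy" branch, simplified to a leaf here
        return "yes"

print(tree_classify("sunny", 80))       # -> no
print(tree_classify("overcast", 90))    # -> yes
```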
6. Classification Rules
   - A popular alternative to decision trees.
   - The antecedent, or precondition, of a rule is a series of tests (like the ones at the nodes of a decision tree).
   - The consequent, or conclusion, gives the class or classes that apply to instances covered by that rule.
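The antecedent/consequent structure can be sketched directly: an antecedent is a list of attribute tests, and the consequent is the class assigned when all tests hold. Names and data below are illustrative:

```python
# A classification rule: antecedent = series of attribute tests,
# consequent = class that applies to instances covered by the rule.
rule = {
    "antecedent": [("outlook", "sunny"), ("humidity", "high")],
    "consequent": "no",
}

def rule_covers(rule, instance):
    # The rule covers an instance only if every test in the antecedent holds.
    return all(instance.get(attr) == value for attr, value in rule["antecedent"])

instance = {"outlook": "sunny", "humidity": "high", "windy": "false"}
if rule_covers(rule, instance):
    print(rule["consequent"])   # -> no
```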
7. Rules vs. Trees: the Replicated Subtree Problem
   - Sometimes the transformation of rules into a tree is impractical. Consider the following classification rules and the corresponding decision tree:
     - If a and b then x
     - If c and d then x
   - Expressing these two rules as a single tree forces the tests for one rule to be replicated in several subtrees.
8. Advantages of rules over trees
   - Rules are usually more compact than trees, as we observed in the replicated subtree problem.
   - New rules can be added to an existing rule set without disturbing the ones already there, whereas a tree may require complete reshaping.
   Advantages of trees over rules
   - Because of the redundancy present in a tree, ambiguity (two rules assigning different classes to the same instance) is avoided.
   - An instance might be encountered that the rules fail to classify; this is usually not the case with trees.
9. Disjunctive Normal Form
   - A rule in disjunctive normal form follows the closed-world assumption.
   - The closed-world assumption avoids ambiguities.
   - These rules are written as logical expressions combining:
     - disjunctive (OR) conditions
     - conjunctive (AND) conditions
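A rule set in disjunctive normal form is simply an OR of ANDs. Reusing the two rules from the replicated-subtree slide, the combined condition for class x can be sketched as:

```python
# Disjunctive normal form: a disjunction (OR) of conjunctions (AND).
# Combines "if a and b then x" and "if c and d then x" into one expression.
def predicts_x(a, b, c, d):
    return (a and b) or (c and d)

print(predicts_x(True, True, False, False))   # -> True
print(predicts_x(True, False, False, True))   # -> False
```

Under the closed-world assumption, any instance for which the expression is false is simply not class x, so no ambiguity arises.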
10. Association Rules
    - Association rules can predict any attribute, not just the class; they can also predict combinations of attributes.
    - To select association rules that apply to a large number of instances with high accuracy, we use two parameters:
      - Coverage (support): the number of instances for which the rule predicts correctly.
      - Accuracy (confidence): the number of instances it predicts correctly, as a proportion of all instances to which it applies.
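The two parameters can be computed directly from a data set. A minimal sketch, using an illustrative rule "if windy = false then play = yes" over made-up weather instances:

```python
# Coverage (support) and accuracy (confidence) of an association rule
# "if windy = false then play = yes", on an illustrative data set.
data = [
    {"windy": "false", "play": "yes"},
    {"windy": "false", "play": "yes"},
    {"windy": "false", "play": "no"},
    {"windy": "true",  "play": "no"},
]

applies = [d for d in data if d["windy"] == "false"]   # instances the rule applies to
correct = [d for d in applies if d["play"] == "yes"]   # ...where its prediction is right

coverage = len(correct)                  # support: number predicted correctly
accuracy = len(correct) / len(applies)   # confidence: proportion correct where applied

print(coverage)   # -> 2
```

Here the rule applies to three instances and is right for two of them, so coverage is 2 and accuracy is 2/3.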
11. Rules with Exceptions
    - For classification rules, exceptions can be expressed using the "except" keyword.
    - We can have exceptions to exceptions, and so on.
    - Exceptions allow rule sets to scale up well.
12. Rules with Relations
    - We generally use propositional rules, in which an attribute is compared with a constant.
    - Relational rules, by contrast, express relationships between attributes, comparing one attribute with another.
13. Standard Relations
    - Equality (=) and inequality (!=) for nominal attributes.
    - Comparison operators such as < and > for numeric attributes.
14. Trees for Numeric Prediction
    - Decision trees can also be used for numeric prediction.
    - The right-hand side of a rule, or a leaf of the tree, contains a numeric value: the average of all the training-set values to which the rule or leaf applies.
    - Prediction of numeric quantities is called regression, so trees for numeric prediction are called regression trees.
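The leaf value of a regression tree is just the mean of the training targets that reach that leaf. A minimal sketch with illustrative values:

```python
# In a regression tree, a leaf stores the average target value of the
# training instances that reach it. The values below are illustrative.
leaf_training_values = [64.0, 70.0, 73.0]

leaf_prediction = sum(leaf_training_values) / len(leaf_training_values)
print(leaf_prediction)   # -> 69.0
```

Every new instance routed to this leaf receives the same predicted value, 69.0.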
15. Instance-Based Learning
    - In instance-based learning we do not create rules; the stored instances are used directly.
    - All the real work is done when a new instance is classified; there is no pre-processing of the training set.
    - A new instance is compared with the existing ones using a distance metric, and the closest existing instance is used to assign a class to the new one.
16. Instance-Based Learning (continued)
    - Sometimes more than one nearest neighbour is used: the majority class of the closest k neighbours is assigned to the new instance. This technique is called the k-nearest-neighbour method.
    - The distance metric should suit the data set; the most popular is Euclidean distance.
    - For nominal attributes the distance metric has to be defined manually, for example: if two attribute values are equal the distance is 0, otherwise it is 1.
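The whole method fits in a few lines. A minimal k-nearest-neighbour sketch combining Euclidean distance for two numeric attributes with the 0/1 rule for one nominal attribute; the training data is illustrative, not from WEKA:

```python
import math
from collections import Counter

# Illustrative training set: (numeric1, numeric2, nominal) -> class
train = [
    ((1.0, 1.0, "sunny"), "yes"),
    ((1.2, 0.9, "sunny"), "yes"),
    ((5.0, 5.0, "rainy"), "no"),
]

def distance(a, b):
    # Euclidean distance over the numeric attributes...
    numeric = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    # ...plus the manual 0/1 metric for the nominal attribute.
    nominal = 0 if a[2] == b[2] else 1
    return math.sqrt(numeric + nominal)

def knn_classify(x, k=3):
    nearest = sorted(train, key=lambda tv: distance(x, tv[0]))[:k]
    # Majority class among the k closest instances.
    return Counter(cls for _, cls in nearest).most_common(1)[0][0]

print(knn_classify((1.1, 1.0, "sunny"), k=3))   # -> yes
```

With k = 1 this reduces to plain nearest-neighbour classification; larger k makes the prediction more robust to noisy instances.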
17. Clusters
    - When clusters rather than a classifier are learned, the output takes the form of a diagram showing how the instances fall into clusters.
    - The output can take four forms:
      - A clear demarcation of instances into disjoint clusters.
      - Overlapping clusters, where an instance can be part of more than one cluster (represented by a Venn diagram).
      - For each instance, the probability of its falling into each of the clusters.
      - A hierarchical, tree-like structure dividing clusters into sub-clusters, and so on.
18. Different output types (diagram of the four cluster-output forms)
19. Visit more self-help tutorials
    - Pick a tutorial of your choice and browse through it at your own pace.
    - The tutorials section is free and self-guiding, and does not involve any additional support.
    - Visit us at www.dataminingtools.net