WEKA: Output Knowledge Representation (Presentation Transcript)

  • Output: Knowledge Representation
  • Topics Covered
    We will see how knowledge can be represented using:
    Decision tables
    Decision trees
    Classification and association rules
    Complex rules involving exceptions and relations
    Trees for numeric prediction
    Instance-based representation
    Clustering
  • Decision Tables
    The simplest way to represent the output is to use the same format in which the input was represented
    Selection of attributes is crucial
    Only attributes that contribute to the result should be part of the table (see the illustrative table below)
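    As a sketch (the values echo the well-known weather example and are illustrative, not taken from the slides), a decision table over the two selected attributes outlook and humidity might look like:

      Outlook  | Humidity | Play
      ---------+----------+-----
      sunny    | high     | no
      sunny    | normal   | yes
      overcast | high     | yes
      overcast | normal   | yes
      rainy    | high     | no
      rainy    | normal   | yes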
  • Decision Trees
    The divide-and-conquer approach gives us the results in the form of a decision tree
  • Each node in a decision tree involves testing a particular attribute
    Leaf nodes give a classification that applies to all instances that reach the leaf
    The number of children emerging from a node depends on the type of attribute being tested at the node
    For a nominal attribute, the number of splits is generally the number of distinct values of that attribute
    For example, we see 3 splits for outlook because it has three possible values
    For a numeric attribute, we generally have a two-way split representing the sets of numbers less than or greater than a threshold on that attribute
    For example, the attribute humidity in the previous example (a minimal WEKA sketch follows this list)
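    A minimal Java sketch (not from the slides) of growing such a tree with WEKA's J48 learner; the file path weather.arff is a placeholder:

      import weka.classifiers.trees.J48;
      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      public class TreeDemo {
          public static void main(String[] args) throws Exception {
              Instances data = DataSource.read("weather.arff"); // placeholder path
              data.setClassIndex(data.numAttributes() - 1);     // last attribute is the class

              J48 tree = new J48();        // C4.5-style divide-and-conquer learner
              tree.buildClassifier(data);  // induce the tree from the training set
              System.out.println(tree);    // text rendering of the learned tree
          }
      }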
  • Classification Rules
    A popular alternative to decision trees
    The antecedent, or precondition, of a rule is a series of tests (like the tests at the nodes of a decision tree)
    The consequent, or conclusion, gives the class or classes that apply to instances covered by the rule (an illustrative rule follows this list)
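    For instance, on the weather data a classification rule could read (an illustration, not taken from the slides):

      if outlook = sunny and humidity = high then play = no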
  • Rules VS Tree
    The replicated subtree problem
    Sometimes the transformation of rules into a tree is impractical:
    Consider the following classification rules and the corresponding decision tree (sketched after this list)
    If a and b then x
    If c and d then x
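    A sketch (not from the slides) of the tree these two rules force; note that the subtree testing c and d must appear twice, once under a = no and once under b = no:

      a = yes:
        b = yes: x
        b = no:
          c = yes:
            d = yes: x
            d = no:  not x
          c = no:  not x
      a = no:
        c = yes:
          d = yes: x
          d = no:  not x
        c = no:  not x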
  • Advantages of rules over trees
    Rules are usually more compact than trees, as we observed in the case of the replicated subtree problem
    New rules can be added to an existing rule set without disturbing the ones already there, whereas a tree may require complete reshaping
    Advantages of trees over rules
    Because each instance follows exactly one path through a tree, the ambiguity of several rules (or no rule) applying to an instance is avoided
    An instance might be encountered that the rules fail to classify; this is usually not the case with trees
  • Disjunctive Normal Form
    A rule in disjunctive normal form follows the closed-world assumption
    The closed-world assumption avoids ambiguities
    These rules are written as logical expressions combining:
    Disjunctive (OR) conditions
    Conjunctive (AND) conditions (a worked example follows this list)
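    For example, the two rules for x given earlier combine into a single expression in disjunctive normal form:

      if (a and b) or (c and d) then x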
  • Association Rules
    Association rules can predict any attribute, not just the class
    They can also predict combinations of attributes
    To select association rules that apply to a large number of instances and have high accuracy, we use the following parameters (a WEKA sketch follows this list):
    Coverage (support): the number of instances for which the rule predicts correctly
    Accuracy (confidence): the number of instances the rule predicts correctly, as a proportion of all instances to which it applies
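    A minimal Java sketch (not from the slides) of mining association rules with WEKA's Apriori implementation; the path weather.nominal.arff is a placeholder:

      import weka.associations.Apriori;
      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      public class AprioriDemo {
          public static void main(String[] args) throws Exception {
              Instances data = DataSource.read("weather.nominal.arff"); // placeholder path
              Apriori apriori = new Apriori();
              apriori.setNumRules(10);         // report the 10 best rules
              apriori.setMinMetric(0.9);       // minimum confidence threshold
              apriori.buildAssociations(data);
              System.out.println(apriori);     // rules printed with support and confidence
          }
      }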
  • Rules with Exception
    Exceptions apply to classification rules
    They can be expressed using the 'except' keyword (an illustrative rule follows this list)
    We can have exceptions to exceptions, and so on
    Exceptions allow rule sets to scale up well
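    An illustrative rule with an exception on the weather data (a sketch, not taken from the slides):

      if outlook = sunny then play = no
        except if humidity = normal then play = yes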
  • Rules with Relations
    We generally use propositional rules, in which an attribute is compared with a constant
    Relational rules are those which express a relationship between attributes (examples of both follow this list)
  • Standard relations:
    Equality (=) and inequality (!=) for nominal attributes
    Comparison operators such as < and > for numeric attributes
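    Illustrative examples, in the spirit of the classic blocks example (not taken from the slides):

      if width >= 3.5 and height < 7.0 then lying   (propositional: attribute vs constant)
      if width > height then lying                  (relational: attribute vs attribute)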
  • Trees for Numerical Prediction
    For numeric prediction we can also use decision trees
    The right-hand side of a rule, or a leaf of the tree, contains a numeric value that is the average of all the training-set values to which the rule or leaf applies
    Prediction of numeric quantities is called regression
    Therefore trees for numeric prediction are called regression trees (a WEKA sketch follows this list)
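    A minimal Java sketch (not from the slides) of building a regression tree with WEKA's REPTree learner on a dataset with a numeric class; the path cpu.arff is a placeholder:

      import weka.classifiers.trees.REPTree;
      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      public class RegressionTreeDemo {
          public static void main(String[] args) throws Exception {
              Instances data = DataSource.read("cpu.arff");  // placeholder path, numeric class
              data.setClassIndex(data.numAttributes() - 1);

              REPTree tree = new REPTree();  // grows a regression tree for a numeric class
              tree.buildClassifier(data);
              System.out.println(tree);      // leaves hold averaged numeric values
          }
      }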
  • Instance based learning
    In instance-based learning we do not create rules; the stored instances are used directly
    All the real work is done when a new instance is classified; there is no pre-processing of the training set
    The new instance is compared with the existing ones using a distance metric
    Using the distance metric, the closest existing instance is used to assign a class to the new one
  • Sometimes more than one nearest neighbor is used: the majority class of the closest k neighbors is assigned to the new instance
    This technique is called the k-nearest-neighbor method
    The distance metric used should suit the data set; the most popular is Euclidean distance
    For nominal attributes the distance metric has to be defined manually, for example:
    If two attribute values are equal the distance is 0, otherwise 1 (a WEKA sketch follows this list)
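    A minimal Java sketch (not from the slides) of k-nearest-neighbor classification with WEKA's IBk learner; the path weather.arff is a placeholder:

      import weka.classifiers.lazy.IBk;
      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      public class KnnDemo {
          public static void main(String[] args) throws Exception {
              Instances data = DataSource.read("weather.arff"); // placeholder path
              data.setClassIndex(data.numAttributes() - 1);

              IBk knn = new IBk();        // k-nearest-neighbor classifier
              knn.setKNN(3);              // majority vote over the 3 closest instances
              knn.buildClassifier(data);  // lazy: essentially just stores the instances

              // Smoke test: classify the first training instance
              double label = knn.classifyInstance(data.instance(0));
              System.out.println(data.classAttribute().value((int) label));
          }
      }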
  • Clusters
    When clusters rather than a classifier are learned, the output takes the form of a diagram that shows how the instances fall into clusters
    The output can be of 4 types:
    A clear demarcation of instances into different clusters
    An instance can be part of more than one cluster, represented by a Venn diagram
    The probability of an instance falling into each of the clusters
    A hierarchical, tree-like structure dividing clusters into sub-clusters, and so on (a WEKA sketch follows this list)
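    A minimal Java sketch (not from the slides) of clustering with WEKA's SimpleKMeans; the path weather.arff is a placeholder:

      import weka.clusterers.SimpleKMeans;
      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      public class ClusterDemo {
          public static void main(String[] args) throws Exception {
              Instances data = DataSource.read("weather.arff"); // placeholder path
              // Clustering is unsupervised, so no class index is set

              SimpleKMeans kmeans = new SimpleKMeans();
              kmeans.setNumClusters(3);    // ask for three clusters
              kmeans.buildClusterer(data);
              System.out.println(kmeans);  // centroids and cluster sizes
          }
      }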
  • Different output types: (diagrams omitted)
  • Visit more self-help tutorials
    Pick a tutorial of your choice and browse through it at your own pace.
    The tutorials section is free and self-guided, and does not include any additional support.
    Visit us at www.dataminingtools.net