WEKA: Output Knowledge Representation
Topics Covered
We will see how knowledge can be represented as:
• Decision tables
• Decision trees
• Classification and association rules
• Complex rules involving exceptions and relations
• Trees for numeric prediction
• Instance-based representation
• Clustering
Decision Tables
• The simplest way to represent the output is to use the same form in which the input was represented.
• The selection of attributes is crucial: only attributes that contribute to the result should be part of the table.
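As a rough illustration (not part of the original slides), WEKA ships a DecisionTable classifier that searches for an attribute subset worth keeping in the table. The sketch below assumes a file named weather.arff with the class as the last attribute.

    import weka.classifiers.rules.DecisionTable;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class DecisionTableDemo {
        public static void main(String[] args) throws Exception {
            // Load the data; "weather.arff" is an assumed file name.
            Instances data = new DataSource("weather.arff").getDataSet();
            data.setClassIndex(data.numAttributes() - 1);   // class = last attribute

            DecisionTable table = new DecisionTable();      // searches for a useful attribute subset
            table.buildClassifier(data);
            System.out.println(table);                      // prints the selected attributes and the table
        }
    }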
Decision Trees
• The divide-and-conquer approach gives us results in the form of a decision tree.
• Nodes in a decision tree test a particular attribute; leaf nodes give a classification that applies to all instances that reach the leaf.
• The number of children emerging from a node depends on the type of attribute being tested at that node.
• For a nominal attribute, the number of splits is generally the number of different values of the attribute; for example, outlook gives three splits because it has three possible values.
• For a numeric attribute, there is generally a two-way split comparing the value against a threshold (less than or greater than); for example, the attribute humidity in the previous example.
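One way to obtain such a tree in WEKA is its J48 learner (an implementation of C4.5). This is a minimal sketch, assuming a weather.arff file with the class as the last attribute; the printed tree shows the nominal and numeric splits described above.

    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class DecisionTreeDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("weather.arff").getDataSet();  // assumed file name
            data.setClassIndex(data.numAttributes() - 1);

            J48 tree = new J48();        // C4.5-style divide-and-conquer tree learner
            tree.buildClassifier(data);
            System.out.println(tree);    // text rendering: nominal splits fan out per value,
                                         // numeric splits are two-way (<= / >) on a threshold
        }
    }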
Classification Rules
• A popular alternative to decision trees.
• The antecedent, or precondition, of a rule is a series of tests (like the tests at the nodes of a decision tree).
• The consequent, or conclusion, gives the class or classes that apply to instances covered by the rule.
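For instance, using the weather data mentioned above, a classification rule might read (an illustrative example, not taken from the slides):

    if outlook = sunny and humidity = high then play = no

Here the two tests form the antecedent, and "play = no" is the consequent.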
Rules vs. Trees: The Replicated Subtree Problem
• Sometimes the transformation of rules into a tree is impractical.
• Consider the following classification rules and the corresponding decision tree (reconstructed in the sketch below):
    if a and b then x
    if c and d then x
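A sketch of the tree these two rules force (reconstructed here, since the original diagram is not part of the text): a tree must commit to a single root attribute, say a, so the subtree testing c and d has to be copied into every branch where the first rule cannot fire.

    a = yes
        b = yes: x
        b = no:
            c = yes
                d = yes: x
                d = no:  (default class)
            c = no:  (default class)
    a = no
        c = yes
            d = yes: x
            d = no:  (default class)
        c = no:  (default class)

The subtree testing c and d appears twice; this is the replicated subtree problem.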
Advantages of rules over trees
• Rules are usually more compact than trees, as we observed in the case of the replicated subtree problem.
• New rules can be added to an existing rule set without disturbing the ones already there, whereas a tree may require complete reshaping.
Advantages of trees over rules
• Because of the redundancy present in a tree, ambiguities (for example, about the order in which rules should be applied) are avoided.
• An instance might be encountered that the rules fail to classify; this is usually not the case with trees.
Disjunctive Normal Form
• A rule set in disjunctive normal form follows the closed-world assumption.
• The closed-world assumption avoids ambiguities.
• Such rules are written as logical expressions, that is, a disjunction (OR) of conjunctions (AND) of conditions.
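An illustrative expression in this form (not from the original slides), again using the weather attributes; under the closed-world reading, the class is asserted exactly when the disjunction holds and assumed false otherwise:

    if (outlook = sunny and humidity = high) or (outlook = rainy and windy = true)
    then play = no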
Association Rules
• Association rules can predict any attribute, not just the class.
• They can also predict combinations of attributes.
• To select association rules that apply to a large number of instances and have high accuracy, two parameters are used:
• Coverage (support): the number of instances the rule predicts correctly.
• Accuracy (confidence): the number of instances the rule predicts correctly, expressed as a proportion of all instances to which it applies.
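For example, if a rule applies to 5 instances and predicts 4 of them correctly, its coverage (support) is 4 and its accuracy (confidence) is 4/5 = 0.8. In WEKA, the Apriori learner mines such rules; a minimal sketch, assuming an all-nominal data set in a file named weather.nominal.arff:

    import weka.associations.Apriori;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class AssociationRulesDemo {
        public static void main(String[] args) throws Exception {
            // Apriori needs nominal attributes; the file name is an assumption.
            Instances data = new DataSource("weather.nominal.arff").getDataSet();

            Apriori apriori = new Apriori();   // default minimum support/confidence thresholds
            apriori.buildAssociations(data);
            System.out.println(apriori);       // prints the best rules with their support and confidence
        }
    }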
Rules with Exceptions
• For classification rules, exceptions can be expressed using the 'except' keyword (see the example below).
• We can have exceptions to exceptions, and so on.
• Exceptions allow rule sets to scale up well.
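An illustrative rule of this form (the slide's original example is not included in the text, so this one is invented for the weather data):

    if outlook = sunny then play = no
        except if humidity = normal then play = yes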
Rules with Relations
• We generally use propositional rules, in which an attribute is compared with a constant (see the examples below).
• Relational rules are those that express a relationship between attributes.
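Illustrative examples of the two kinds of rule (the slide's originals are not in the text; these use a hypothetical data set of blocks, described by width and height, that are either standing or lying):

    Propositional:  if height <= 3.5 and width >= 7.0 then lying
    Relational:     if width > height then lying
                    if height > width then standing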
Standard relations include:
• Equality (=) and inequality (!=) for nominal attributes
• Comparison operators such as < and > for numeric attributes
Trees for Numerical Prediction
• Decision trees can also be used for numerical prediction.
• The right-hand side of a rule, or a leaf of the tree, contains a numeric value that is the average of all the training-set values to which the rule or leaf applies.
• Prediction of numerical quantities is called regression, so trees used for numerical prediction are called regression trees.
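A minimal sketch of learning such a tree in WEKA (assumptions: a numeric-class data set in a file named cpu.arff, and WEKA's M5P learner with its build-regression-tree switch, which restricts it to a plain regression tree holding an average value at each leaf):

    import weka.classifiers.trees.M5P;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RegressionTreeDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("cpu.arff").getDataSet();  // assumed numeric-class data set
            data.setClassIndex(data.numAttributes() - 1);

            M5P tree = new M5P();
            tree.setBuildRegressionTree(true);  // plain regression tree: each leaf holds an average value
            tree.buildClassifier(data);
            System.out.println(tree);
        }
    }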
Instance-Based Learning
• In instance-based learning we do not create rules; the stored instances are used directly.
• All the real work is done when a new instance is classified; there is no pre-processing of the training set.
• The new instance is compared with the existing ones using a distance metric, and the closest existing instance is used to assign a class to the new one.
• Sometimes more than one nearest neighbor is used: the majority class of the closest k neighbors is assigned to the new instance. This technique is called the k-nearest-neighbor method.
• The distance metric should suit the data set; the most popular is Euclidean distance.
• For nominal attributes the distance has to be defined manually, for example: if two attribute values are equal the distance is 0, otherwise it is 1 (see the sketch below).
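A minimal sketch of such a mixed distance metric in plain Java (the class and parameter names are hypothetical, and numeric attributes are assumed to be pre-normalised to comparable ranges). In WEKA itself, the IBk classifier implements the k-nearest-neighbor method.

    /** Distance between two instances with mixed numeric and nominal attributes. */
    public final class MixedDistance {

        /**
         * a and b are parallel attribute arrays; isNominal[i] says how to treat attribute i.
         * Numeric attributes contribute their squared difference (Euclidean part);
         * nominal attributes contribute 0 if the values are equal and 1 otherwise.
         */
        public static double distance(double[] a, double[] b, boolean[] isNominal) {
            double sum = 0.0;
            for (int i = 0; i < a.length; i++) {
                if (isNominal[i]) {
                    sum += (a[i] == b[i]) ? 0.0 : 1.0;  // simple 0/1 metric for nominal values
                } else {
                    double d = a[i] - b[i];             // Euclidean contribution for numeric values
                    sum += d * d;
                }
            }
            return Math.sqrt(sum);
        }
    }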
Clusters
• When clusters rather than a classifier are learned, the output takes the form of a diagram showing how the instances fall into clusters.
• The output can take four forms:
• A clear assignment of each instance to exactly one cluster
• Instances that may belong to more than one cluster, represented by a Venn diagram
• For each instance, the probability of it falling into every cluster
• A hierarchical, tree-like structure that divides clusters into sub-clusters, and so on
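A minimal sketch of the first kind of output (a hard assignment of instances to clusters) using WEKA's SimpleKMeans; the class attribute is removed because clustering is unsupervised, and the file name and number of clusters are assumptions. Probabilistic membership (the third form) would come from a density-based clusterer such as WEKA's EM instead.

    import weka.clusterers.SimpleKMeans;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class ClusteringDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("weather.arff").getDataSet();  // assumed file name
            data.deleteAttributeAt(data.numAttributes() - 1);              // drop the class: clustering is unsupervised

            SimpleKMeans kMeans = new SimpleKMeans();
            kMeans.setNumClusters(3);           // assumed number of clusters
            kMeans.buildClusterer(data);
            System.out.println(kMeans);         // cluster centroids and sizes

            // Hard assignment of the first instance to a cluster:
            int cluster = kMeans.clusterInstance(data.instance(0));
            System.out.println("Instance 0 -> cluster " + cluster);
        }
    }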
Visit more self-help tutorials
• Pick a tutorial of your choice and browse through it at your own pace.
• The tutorials section is free, self-guiding and will not involve any additional support.
• Visit us at www.dataminingtools.net
