4. Classification Methods
Uses a set of parameters (features) to characterize each object
Features should be relevant to the task at hand
Supervised classification
What classes???
Set of sample objects with known classes
Training set
Set of known objects
Used by classification program
Two phases for classification
1. Training phase
2. Application phase
5. Classification Methods
1. Training Phase:
Uses training set
Decides
How to weight the parameters
How to combine them to separate objects into the different classes
2. Application Phase:
Weights determined in phase 1 are applied to a set of objects
That do not have known classes
To determine their probable class
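The two phases can be sketched in a few lines. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the weighting step the slides mention, and the names `train`, `classify`, and the toy training set are hypothetical.

```python
from collections import defaultdict
from math import dist


def train(training_set):
    """Training phase: learn one centroid (mean feature vector) per known class."""
    grouped = defaultdict(list)
    for features, label in training_set:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Application phase: assign the class whose centroid is nearest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training set: (feature vector, known class).
training_set = [
    ((1.0, 1.0), "A"),
    ((1.0, 3.0), "A"),
    ((8.0, 8.0), "B"),
    ((10.0, 8.0), "B"),
]

centroids = train(training_set)              # phase 1: uses objects with known classes
predicted = classify(centroids, (2.0, 2.0))  # phase 2: object with unknown class
```

The centroids play the role of the "weights" learned in phase 1; any other learned parameters could take their place.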
6. Classification Methods
With few parameters, the process is easy
Example:
With many more parameters, the process is hard
Example:
Depending on structure, attributes are of different types:
Multi State Attribute
Example:
Binary State Attribute
Example:
Numerical Attributes
Example
7. Classification Methods
Binary State
Bold, underline
Multi State
Color, position, font type
Executing an operation changes an attribute value.
Example:
MOVE
FILL
INSERT
DELETE
CREATE
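A minimal sketch of the attribute types and of operations that change attribute values. The class `TextObject`, the enum members, and the numerical attribute `font_size` are assumptions chosen for illustration; only bold, color, position, FILL, and MOVE come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Color(Enum):
    RED = "red"
    BLUE = "blue"


class Position(Enum):
    LEFT = "left"
    CENTER = "center"
    RIGHT = "right"


@dataclass
class TextObject:
    bold: bool = False                  # binary-state attribute
    color: Color = Color.RED            # multi-state attribute
    position: Position = Position.LEFT  # multi-state attribute
    font_size: float = 12.0             # numerical attribute (assumed example)


def fill(obj, color):
    """FILL: changes the value of the color attribute."""
    obj.color = color


def move(obj, position):
    """MOVE: changes the value of the position attribute."""
    obj.position = position


obj = TextObject()
fill(obj, Color.BLUE)
move(obj, Position.CENTER)
```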
8. Classification Methods
Relation between Classes & Properties
1. Monothetic:
To be a member of the class,
an object must possess a set of properties
that are both necessary and sufficient
Example
2. Polythetic:
Each member has many of the properties, and each
property is shared by many members
No individual need have all the properties
example
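The difference between the two kinds of class can be sketched as two membership tests on property sets. The function names and the `min_shared` threshold are hypothetical; the slides do not say how many shared properties are enough for a polythetic class.

```python
def is_monothetic_member(properties, required):
    """Monothetic: the object must possess every defining property."""
    return required <= properties


def is_polythetic_member(properties, class_properties, min_shared):
    """Polythetic: sharing at least `min_shared` class properties is enough."""
    return len(properties & class_properties) >= min_shared


class_properties = {"a", "b", "c", "d"}
obj = {"a", "b", "c", "x"}  # lacks property "d"

mono = is_monothetic_member(obj, class_properties)
poly = is_polythetic_member(obj, class_properties, min_shared=3)
```

Here the object fails the monothetic test because one defining property is missing, but passes the polythetic test because it shares three of the four.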
9. Classification Methods
Relation between Object & Classes
1. Exclusive:
Object belongs to single class
Example
2. Overlapping:
Object may belong to several classes
Example
10. Classification Methods
Relationship between Classes & Classes:
1. Ordered:
Structure is imposed
Hierarchical structure
Example
2. Unordered:
No imposed structure
All classes are at the same level
example
11. Measures of Association
Some classification methods are based on a binary
relationship between objects
On the basis of this relationship a classification method
can construct a system of clusters
Relationship type:
1. similarity
2. dissimilarity
3. association
12. Measures of Association
Similarity:
A measure of similarity quantifies the likeness between objects
Assumption: objects can be grouped so that each object is more
like the other members of its group
than like any object outside the group
A cluster method then enables such a group structure to be
discovered
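One simple way a group structure can be recovered from pairwise similarities is to link every pair of objects whose similarity reaches a threshold and take the connected groups. This is a sketch only: the choice of Jaccard similarity, the threshold value, and the names `jaccard` and `threshold_clusters` are assumptions, and chaining through intermediate objects does not guarantee that every member is closer to its own group than to outsiders.

```python
def jaccard(x, y):
    """Similarity of two attribute sets: shared attributes over all attributes."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def threshold_clusters(objects, threshold):
    """Group objects linked, directly or through a chain, by similarity >= threshold."""
    names = list(objects)
    unvisited = set(names)
    clusters = []
    for start in names:
        if start not in unvisited:
            continue
        unvisited.remove(start)
        cluster, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            linked = {
                other for other in unvisited
                if jaccard(objects[current], objects[other]) >= threshold
            }
            unvisited -= linked
            cluster |= linked
            frontier.extend(linked)
        clusters.append(cluster)
    return clusters


# Hypothetical objects described by sets of attributes.
objects = {
    "d1": {"cat", "dog", "pet"},
    "d2": {"cat", "pet", "food"},
    "d3": {"stock", "market"},
    "d4": {"market", "stock", "price"},
}
clusters = threshold_clusters(objects, threshold=0.5)
```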
13. Measures of Association
Association:
Association means???
Dependency…
Occurrence…
The term is reserved for similarity between objects
characterized by discrete-state attributes.
14. Measures of Association
Used to measure the strength of a relationship
A measure of association increases as the number or
proportion of shared attribute states increases.
Five measures of association
1. Simple matching coefficient
2. Dice’s coefficient
3. Jaccard’s coefficient
4. Cosine coefficient
5. Overlap coefficient
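The five measures can be sketched for objects represented as sets of attributes (for example, index terms). Two assumptions: "Simple" is taken to be the unnormalised count of shared attributes, and both sets are non-empty so that no denominator is zero. The example sets are hypothetical.

```python
from math import sqrt


def simple_matching(x, y):
    """Number of shared attributes, with no normalisation."""
    return len(x & y)


def dice(x, y):
    """Twice the shared count over the sum of the set sizes."""
    return 2 * len(x & y) / (len(x) + len(y))


def jaccard(x, y):
    """Shared count over the size of the union."""
    return len(x & y) / len(x | y)


def cosine(x, y):
    """Shared count over the geometric mean of the set sizes."""
    return len(x & y) / sqrt(len(x) * len(y))


def overlap(x, y):
    """Shared count over the size of the smaller set."""
    return len(x & y) / min(len(x), len(y))


x = {"information", "retrieval", "index", "term"}
y = {"information", "retrieval", "ranking"}

scores = {f.__name__: f(x, y) for f in (simple_matching, dice, jaccard, cosine, overlap)}
```

All four normalised coefficients lie between 0 and 1 and grow as the proportion of shared attributes grows; they differ only in how they normalise for set size.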
16. Probabilistic Indexing
Probability of relevance
Experiments and observations
Sample space
May consist of relevant as well as non-relevant objects
Consider a document
Find the number of relevant documents with respect to it
That gives the probability quotient
Probability is measured from the terms present in the
document
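One way to read the idea above is as a relative frequency over a judged sample: among the sample documents that contain a term, what fraction is relevant? The sketch below assumes that reading; the function name `relevance_probability` and the sample data are hypothetical.

```python
def relevance_probability(sample, term):
    """Estimate P(relevant | term present) as a relative frequency over the sample."""
    containing = [relevant for terms, relevant in sample if term in terms]
    if not containing:
        return None  # term never observed, so no estimate is possible
    return sum(containing) / len(containing)


# Hypothetical sample space: (terms in document, judged relevant?).
sample = [
    ({"probability", "ranking"}, True),
    ({"probability", "index"}, True),
    ({"probability", "cooking"}, False),
    ({"cooking", "recipe"}, False),
]

p_probability = relevance_probability(sample, "probability")
p_cooking = relevance_probability(sample, "cooking")
```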
17. Probabilistic Indexing
Probabilistic indexing model
Contains a random variable
Denotes the number of relevant documents
If this variable is selected by the system
It gives a possible description of the relevant documents
Probabilistic information retrieval models are based on the
probability ranking principle,
which says that documents should be ranked according to
their probability of relevance with respect to the actual
request.
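Once each document has an estimated probability of relevance, the principle reduces to sorting by that estimate. In the sketch below the probabilities are made-up inputs and `rank_by_relevance` is a hypothetical helper; how the estimates are obtained is a separate modelling question.

```python
def rank_by_relevance(probabilities):
    """Return document ids sorted by decreasing estimated probability of relevance."""
    return sorted(probabilities, key=probabilities.get, reverse=True)


# Hypothetical estimates of P(relevant | document, request).
estimated = {"doc1": 0.20, "doc2": 0.75, "doc3": 0.50}
ranking = rank_by_relevance(estimated)
```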