4. Distance based
• Place items in class to which they are “closest”.
• Distance measure is used to find alikeness of different items.
• Simple approach: classes represented by
– Centroid: central value.
ALGORITHM: Simple distance-based algorithm
Input : c1 , ... , cm // Centers for each class
t // Input tuple to classify
Output : C // Class to which t is assigned
dist = ∞;
for i := 1 to m do
if dis(ci , t) < dist then
C = class represented by ci ;
dist = dis(ci , t);
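The algorithm above can be sketched in Python; a minimal sketch, assuming Euclidean distance for dis(ci, t) and a dictionary of class centers (both illustrative choices, not fixed by the slides):

```python
import math

def dis(ci, t):
    # dis(ci, t): Euclidean distance between center ci and tuple t
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ci, t)))

def classify(centers, t):
    # centers: {class label: centroid tuple}; t: tuple to classify
    dist = math.inf
    C = None
    for label, ci in centers.items():
        d = dis(ci, t)
        if d < dist:   # keep the class whose center is closest so far
            dist = d
            C = label
    return C

# Example: t = (1, 2) is closer to A's centroid (0, 0) than to B's (10, 10)
print(classify({"A": (0.0, 0.0), "B": (10.0, 10.0)}, (1.0, 2.0)))  # "A"
```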
6. K Nearest Neighbor (KNN):
• A common classification scheme based on distance
measures is K nearest neighbors (KNN).
• The training set includes each item together with its class label.
• When a classification is to be made for a new item, its distance
to each item in the training set must be determined.
• Only the K closest entries in the training set are considered
further
• New item placed in class that contains the most items from this
set of K closest items.
• O(q) for each tuple to be classified. (Here q is the size of the
training set.)
• The KNN technique is extremely sensitive to the value of K. A rule of
thumb is that K ≤ √q, the square root of the number of training items.
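The KNN steps above can be sketched in Python; a minimal sketch, assuming Euclidean distance and majority vote among the K closest items (the data shapes are illustrative assumptions):

```python
import math
from collections import Counter

def knn_classify(training, t, k):
    # training: list of (tuple, class label) pairs; t: new item; k: neighbors used
    # Compute the distance from t to every training item -> O(q) per tuple
    by_distance = sorted(training, key=lambda item: math.dist(item[0], t))
    # Place t in the class holding the most of the K closest training items
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Example: the 3 nearest neighbors of (0.5, 0.5) all carry label "A"
training = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
            ((5, 5), "B"), ((6, 5), "B")]
print(knn_classify(training, (0.5, 0.5), 3))  # "A"
```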
9. Classification Using Decision Trees
• Partitioning based: Divide search space into rectangular
regions.
• Tuple placed into class based on the region within which it falls.
• DT approaches differ in how the tree is built (DT induction).
• Internal nodes associated with attribute and arcs with values for
that attribute.
• Algorithms: ID3, C4.5, CART
10. Decision Tree
Given:
– D = {t1, …, tn} where ti=<ti1, …, tih>
– Database schema contains {A1, A2, …, Ah}
– Classes C={C1, …., Cm}
A Decision (or Classification) Tree is a tree associated
with D such that
– Each internal node is labeled with attribute, Ai
– Each arc is labeled with predicate which can be
applied to attribute at parent
– Each leaf node is labeled with a class, Cj
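The structure defined above can be sketched in Python; a minimal sketch in which internal nodes carry an attribute Ai, arcs carry predicates on that attribute, and leaves carry a class Cj. The attributes (income, age), thresholds, and class names are illustrative assumptions, not from the slides:

```python
# Internal node: (attribute Ai, [(arc predicate, subtree), ...]); leaf: class Cj
tree = ("income", [
    (lambda v: v < 40_000, "C_low"),             # arc predicate on parent's attribute
    (lambda v: v >= 40_000, ("age", [
        (lambda v: v < 30, "C_mid"),
        (lambda v: v >= 30, "C_high"),
    ])),
])

def classify_tree(node, t):
    # t: dict mapping attribute name Ai -> value for the tuple being classified
    if isinstance(node, str):        # leaf node labeled with a class Cj
        return node
    attr, arcs = node
    for predicate, child in arcs:    # follow the arc whose predicate holds
        if predicate(t[attr]):
            return classify_tree(child, t)

# income >= 40,000 leads to the age subtree; age < 30 leads to leaf "C_mid"
print(classify_tree(tree, {"income": 50_000, "age": 25}))  # "C_mid"
```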