2. Disadvantages of DBSCAN
Requires two user inputs (Eps and MinPts)
Cannot detect clusters of varying density
OPTICS:
Can detect clusters of varying density
Mainly requires one input (i.e., MinPts)
Eps can be considered as ‘infinite’
3. Idea
Creates an augmented ordering of the database that represents its density-based clustering structure
Helps us gain a high-level understanding of how the data is structured
4. Observation
For a constant MinPts value, density-based
clusters with higher density are completely
contained in density-connected sets with respect
to a lower density.
Extend the DBSCAN algorithm such that several
distance parameters are processed at the same
time
5. OPTICS
Works in principle like an extended DBSCAN for an infinite number of distance parameters eps’ that are smaller than a “generating distance” eps (i.e. 0 <= eps’ <= eps).
Stores the order in which the objects are processed, together with the information an extended DBSCAN algorithm would use to assign cluster memberships
This information consists of only two values for each object: the core-distance and a reachability-distance
6. Terminology
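Core-distance of an object p:
The core-distance of an object p is the smallest distance eps’ between p and an object in its eps-neighborhood N_eps(p) such that p would be a core object with respect to eps’. In other words, it is the distance from p to its MinPts-th nearest neighbor, provided that neighbor lies within eps. Otherwise, the core-distance is UNDEFINED.
Reachability-distance of an object p w.r.t. object o:
The reachability-distance of an object p with respect to another object o is the smallest distance such that p is directly density-reachable from o if o is a core object. It cannot be smaller than the core-distance of o, and it is UNDEFINED if o is not a core object.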
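The two definitions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names, the use of math.inf as a stand-in for UNDEFINED, and the convention that p counts as a member of its own eps-neighborhood are all assumptions made for this example.

```python
import math

UNDEFINED = math.inf  # assumption: infinity stands in for UNDEFINED


def core_distance(points, p, eps, min_pts):
    """Distance from points[p] to its min_pts-th nearest neighbor within eps.

    Assumption: points[p] counts as a member of its own eps-neighborhood.
    """
    within = sorted(
        d for d in (math.dist(points[p], q) for q in points) if d <= eps
    )
    if len(within) < min_pts:
        return UNDEFINED  # p is not a core object w.r.t. eps and min_pts
    return within[min_pts - 1]


def reachability_distance(points, p, o, eps, min_pts):
    """Reachability-distance of points[p] with respect to points[o]."""
    cd = core_distance(points, o, eps, min_pts)
    if cd == UNDEFINED:
        return UNDEFINED  # o is not a core object
    # Never smaller than the core-distance of o.
    return max(cd, math.dist(points[o], points[p]))


points = [(0, 0), (1, 0), (2, 0), (10, 0)]
print(core_distance(points, 0, eps=3, min_pts=3))
print(reachability_distance(points, 1, 0, eps=3, min_pts=3))
```

Note that the point at distance 1 from the first point still gets a reachability-distance equal to the first point's core-distance, because the reachability-distance is bounded below by the core-distance of o.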
7. Algorithm
FOR i FROM 1 TO SetOfObjects.size DO
  Object := SetOfObjects.get(i);
  IF NOT Object.Processed THEN
    1. neighbors := SetOfObjects.neighbors(Object, eps);
    2. Object.Processed := TRUE;
    3. Object.reachability_distance := UNDEFINED;
    4. Object.setCoreDistance(neighbors, eps, MinPts);
    5. OrderedFile.write(Object);
    6. IF Object.core_distance <> UNDEFINED THEN
         OrderSeeds.update(neighbors, Object);
         WHILE NOT OrderSeeds.empty() DO
           Object := OrderSeeds.next();
           Repeat steps 1, 2, 4 and 5 for Object; if its core_distance <> UNDEFINED, call OrderSeeds.update(neighbors, Object)
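The pseudocode above can be sketched as runnable Python. This is a simplified illustration under several assumptions: points are coordinate tuples, neighbors are found by brute force, OrderSeeds is modeled with heapq (outdated heap entries are skipped rather than updated in place), math.inf stands in for UNDEFINED, and an object counts as its own neighbor. The name optics_order is hypothetical.

```python
import heapq
import math

UNDEFINED = math.inf  # assumption: infinity stands in for UNDEFINED


def optics_order(points, eps, min_pts):
    """Return (order, reach, core) for a list of coordinate tuples."""
    n = len(points)
    processed = [False] * n
    reach = [UNDEFINED] * n
    core = [UNDEFINED] * n
    order = []  # plays the role of OrderedFile

    def neighbors(i):
        # Brute-force eps-neighborhood; includes i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    def set_core_distance(i, nbrs):
        if len(nbrs) >= min_pts:
            dists = sorted(math.dist(points[i], points[j]) for j in nbrs)
            core[i] = dists[min_pts - 1]

    def update(seeds, nbrs, center):
        # Lower the reachability-distance of unprocessed neighbors.
        for j in nbrs:
            if processed[j]:
                continue
            new_reach = max(core[center], math.dist(points[center], points[j]))
            if new_reach < reach[j]:
                reach[j] = new_reach
                heapq.heappush(seeds, (new_reach, j))

    def process(i):
        nbrs = neighbors(i)
        processed[i] = True
        set_core_distance(i, nbrs)
        order.append(i)
        return nbrs

    for start in range(n):
        if processed[start]:
            continue
        nbrs = process(start)  # reach[start] stays UNDEFINED
        if core[start] == UNDEFINED:
            continue
        seeds = []
        update(seeds, nbrs, start)
        while seeds:
            _, cur = heapq.heappop(seeds)
            if processed[cur]:
                continue  # outdated heap entry
            nbrs = process(cur)
            if core[cur] != UNDEFINED:
                update(seeds, nbrs, cur)
    return order, reach, core


pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0), (50, 0)]
order, reach, core = optics_order(pts, eps=5, min_pts=3)
print(order)
print([reach[i] for i in order])
```

Printing the reachability-distances in processing order gives the values that a reachability plot would show as bars.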
8. If the reachability-distance of the current object Object is larger than the clustering-distance eps’
Object is not density-reachable with respect to eps’ and MinPts from any of the objects located before it in the cluster-ordering.
We look at the core-distance of Object and start a new cluster if Object is a core object with respect to eps’ and MinPts; otherwise, Object is assigned to NOISE.
If the reachability-distance of the current object is smaller than or equal to eps’
We can simply assign the object to the current cluster, because it is then density-reachable from a preceding core object in the cluster-ordering.
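The two cases above translate into a single pass over the cluster-ordering. The sketch below assumes the ordering is given as a list of object indices plus per-object reachability- and core-distances, with math.inf standing in for UNDEFINED and -1 for NOISE; the function name and the sample values are made up for illustration.

```python
import math

UNDEFINED = math.inf  # assumption: infinity stands in for UNDEFINED
NOISE = -1            # assumption: -1 marks noise objects


def extract_clusters(order, reach, core, eps_prime):
    """Assign cluster ids by scanning the cluster-ordering once."""
    labels = {}
    current = NOISE
    next_id = 0
    for obj in order:
        if reach[obj] > eps_prime:  # also true when reach is UNDEFINED
            if core[obj] <= eps_prime:
                # Core object w.r.t. eps_prime: start a new cluster.
                current = next_id
                next_id += 1
                labels[obj] = current
            else:
                labels[obj] = NOISE
        else:
            # Density-reachable from a preceding core object.
            labels[obj] = current
    return labels


# Hand-made ordering of seven objects (illustrative values).
order = [0, 1, 2, 3, 4, 5, 6]
reach = [UNDEFINED, 2, 1, UNDEFINED, 2, 1, UNDEFINED]
core = [2, 1, 2, 2, 1, 2, UNDEFINED]
print(extract_clusters(order, reach, core, eps_prime=3))
```

Because the ordering is computed once, the same lists can be rescanned with a different eps_prime (up to the generating distance) without rerunning the ordering step.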
9. Reachability plot is rather insensitive to the input parameters
• The smaller the Eps value, the more objects have an UNDEFINED reachability-distance
• With lower MinPts values the reachability-plot looks more jagged; higher values smooth the curve