Machine Learning for Data Mining
Measure Properties for Clustering
Andres Mendez-Vazquez
July 27, 2015
1 / 37
Outline
1 Introduction
Introduction
2 Features
Types of Features
3 Similarity and Dissimilarity Measures
Basic Definition
4 Proximity Measures between Two Points
Real-Valued Vectors
Dissimilarity Measures
Similarity Measures
Discrete-Valued Vectors
Dissimilarity Measures
Similarity Measures
5 Other Examples
Between Sets
Introduction
Observation
Clusters are considered as groups containing data objects that are similar
to each other.
When they are not similar
We assume there is some way to measure that difference!!!
Therefore
It is possible to give a mathematical definition of two important types of clustering:
Partitional Clustering
Hierarchical Clustering
Given
A set of input patterns X = {x_1, x_2, ..., x_N}, where x_j = (x_{1j}, x_{2j}, ..., x_{dj})^T ∈ R^d, with each measure x_{ij} called a feature (attribute, dimension, or variable).
Hard Partitional Clustering
Definition
Hard partitional clustering attempts to seek a K-partition of X, C = {C_1, C_2, ..., C_K} with K ≤ N such that
1 C_i ≠ ∅ for i = 1, ..., K.
2 ⋃_{i=1}^{K} C_i = X.
3 C_i ∩ C_j = ∅ for all i, j = 1, ..., K and i ≠ j.
Clearly
The third property can be relaxed to obtain the mathematical definition of fuzzy and possibilistic clustering.
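The three conditions above can be checked mechanically. A minimal Python sketch (the function name and the toy data are illustrative, not from the lecture):

```python
def is_hard_partition(X, clusters):
    """Return True iff `clusters` (a list of sets) hard-partitions the set X."""
    if any(len(c) == 0 for c in clusters):       # 1: every C_i is non-empty
        return False
    if set().union(*clusters) != set(X):         # 2: the C_i cover X
        return False
    total = sum(len(c) for c in clusters)        # 3: pairwise disjoint iff the
    return total == len(set(X))                  #    sizes add up exactly to |X|

X = {1, 2, 3, 4, 5}
print(is_hard_partition(X, [{1, 2}, {3}, {4, 5}]))    # True: a valid 3-partition
print(is_hard_partition(X, [{1, 2}, {2, 3}, {4, 5}])) # False: 2 appears twice
```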
For example
Given the membership A_i(x_j) ∈ [0, 1]
We can have ∑_{i=1}^{K} A_i(x_j) = 1, ∀j and ∑_{j=1}^{N} A_i(x_j) < N, ∀i.
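A small numeric sketch of these relaxed constraints, with a made-up 2 × 4 membership matrix:

```python
# Made-up membership matrix: rows are clusters i = 1..K,
# columns are points j = 1..N, entries A[i][j] = A_i(x_j) in [0, 1].
A = [
    [1.0, 0.75, 0.2, 0.0],
    [0.0, 0.25, 0.8, 1.0],
]
K, N = len(A), len(A[0])

# Each column sums to 1: every point's memberships form a partition of unity...
col_sums = [sum(A[i][j] for i in range(K)) for j in range(N)]
# ...and each row sums to less than N: no cluster fully absorbs every point.
row_sums = [sum(A[i][j] for j in range(N)) for i in range(K)]

print(all(abs(s - 1.0) < 1e-9 for s in col_sums))  # True
print(all(s < N for s in row_sums))                # True
```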
Hierarchical Clustering
Definition
Hierarchical clustering attempts to construct a tree-like, nested partition of X, H = {H_1, ..., H_Q} with Q ≤ N, such that
C_i ∈ H_m and C_j ∈ H_l with m > l imply C_i ⊂ C_j or C_i ∩ C_j = ∅, for all i, j = 1, ..., K, i ≠ j and m, l = 1, ..., Q.
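A sketch of the nesting condition on a made-up two-level hierarchy (Python set operations: `<=` is subset, `&` is intersection):

```python
# Two made-up levels: H2 (finer, m = 2) and H1 (coarser, l = 1).
H2 = [{1}, {2}, {3, 4}]
H1 = [{1, 2}, {3, 4}]

# m > l: every cluster at the finer level is contained in, or disjoint from,
# every cluster at the coarser level.
nested = all(ci <= cj or not (ci & cj) for ci in H2 for cj in H1)
print(nested)  # True
```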
Features
Usually
A data object is described by a set of features or variables, usually
represented as a multidimensional vector.
Thus
We tend to collect all that information as a pattern matrix of
dimensionality N × d.
Then
A feature can be classified as continuous, discrete, or binary.
Thus
Continuous
A continuous feature takes values from an uncountably infinite range set.
Discrete
Discrete features take only a finite or, at most, countably infinite number of values.
Binary
Binary or dichotomous features are a special case of discrete features when
they have exactly two values.
Measurement Level
Meaning
Another property of features is the measurement level, which reflects the
relative significance of numbers.
We have four from lowest to highest
Nominal, ordinal, interval, and ratio.
Measurement Level
Nominal
Features at this level are represented with labels, states, or names.
The numbers are meaningless in any mathematical sense and no
mathematical calculation can be made.
For example, peristalsis ∈ {hypermotile, normal, hypomotile, absent}.
Ordinal
Features at this level are also names, but with a certain order implied.
However, the difference between the values is again meaningless.
For example, abdominal distension ∈ {slight, moderate, high}.
Measurement Level
Interval
Features at this level offer a meaningful interpretation of the
difference between two values.
However, there exists no true zero and the ratio between two values is
meaningless.
For example, the concept cold can be seen as an interval.
Ratio
Features at this level possess all the properties of the above levels.
There is an absolute zero.
Thus, we have the meaning of ratio!!!
Similarity Measures
Definition
A Similarity Measure s on X is a function
s : X × X → R (1)
Such that
1 s(x, y) ≤ s_0 for all x, y ∈ X.
2 s(x, x) = s_0 for all x ∈ X.
3 s(x, y) = s(y, x) for all x, y ∈ X.
4 s(x, y) = s_0 ⇐⇒ x = y.
5 s(x, y) s(y, z) ≤ s(x, z) [s(x, y) + s(y, z)] for all x, y, z ∈ X.
Dissimilarity Measures
Definition
A Dissimilarity Measure d on X is a function
d : X × X → R (2)
Such that
1 d(x, y) ≥ d_0 for all x, y ∈ X.
2 d(x, x) = d_0 for all x ∈ X.
3 d(x, y) = d(y, x) for all x, y ∈ X.
4 d(x, y) = d_0 ⇐⇒ x = y.
5 d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X.
Dissimilarity Measures
The Hamming distance for {0, 1}
The vectors here are collections of zeros and ones. Thus,
d_H(x, y) = ∑_{i=1}^{d} (x_i − y_i)^2 (3)
Euclidean distance
d(x, y) = (∑_{i=1}^{d} |x_i − y_i|^2)^{1/2} (4)
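A quick sketch checking that, for {0, 1} vectors, the Hamming distance of Eq. (3) coincides with the squared Euclidean distance of Eq. (4) (example vectors are made up):

```python
import math

def hamming(x, y):
    # For binary coordinates, (x_i - y_i)^2 is 1 exactly at mismatches.
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum(abs(xi - yi) ** 2 for xi, yi in zip(x, y)))

x = [0, 1, 1, 0, 1]
y = [1, 1, 0, 0, 1]
print(hamming(x, y))         # 2: two mismatching positions
print(euclidean(x, y) ** 2)  # equals the Hamming distance on {0,1} vectors
```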
Dissimilarity Measures
The weighted l_p metric DMs
d_p(x, y) = (∑_{i=1}^{d} w_i |x_i − y_i|^p)^{1/p} (5)
Where
x_i, y_i are the ith coordinates of x and y, i = 1, ..., d, and w_i ≥ 0 is the ith weight coefficient.
Important
A well-known member of this family of measures is the Euclidean distance, obtained with w_i = 1 and p = 2.
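Eq. (5) in a few lines of Python; with all w_i = 1 it reduces to the Euclidean distance for p = 2 and the Manhattan distance for p = 1 (the example vectors are made up):

```python
def weighted_lp(x, y, w, p):
    """Weighted l_p distance: (sum_i w_i |x_i - y_i|^p)^(1/p)."""
    return sum(wi * abs(xi - yi) ** p for xi, yi, wi in zip(x, y, w)) ** (1.0 / p)

x, y = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
w = [1.0, 1.0, 1.0]
print(weighted_lp(x, y, w, 2))  # 5.0: the plain Euclidean distance
print(weighted_lp(x, y, w, 1))  # 7.0: the Manhattan distance
```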
Dissimilarity Measures
The weighted l_2 metric DM can be further generalized as follows
d(x, y) = (x − y)^T B (x − y) (6)
Where B is a symmetric, positive definite matrix
A special case
This includes the Mahalanobis distance as a special case.
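A pure-Python sketch of Eq. (6) for a small, hand-picked symmetric positive definite B; with B = I it reduces to the squared Euclidean distance, and taking B as an inverse covariance matrix gives the Mahalanobis distance:

```python
def quad_form_distance(x, y, B):
    """Compute (x - y)^T B (x - y) for a square matrix B."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    Bd = [sum(B[i][j] * diff[j] for j in range(len(diff))) for i in range(len(diff))]
    return sum(di * bdi for di, bdi in zip(diff, Bd))

B = [[2.0, 0.5],
     [0.5, 1.0]]                                     # symmetric, positive definite
x, y = [1.0, 2.0], [3.0, 1.0]
print(quad_form_distance(x, y, B))                   # 7.0
print(quad_form_distance(x, y, [[1, 0], [0, 1]]))    # 5.0: squared Euclidean
```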
Dissimilarity Measures
Special l_p metric DMs that are also encountered in practice are
d_1(x, y) = ∑_{i=1}^{d} w_i |x_i − y_i| (Weighted Manhattan Norm)
d_∞(x, y) = max_{1≤i≤d} w_i |x_i − y_i| (Weighted l_∞ Norm)
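The two special cases, sketched directly (unit weights and made-up vectors):

```python
def weighted_manhattan(x, y, w):
    # d_1: sum of weighted coordinate gaps
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))

def weighted_linf(x, y, w):
    # d_inf: the single largest weighted coordinate gap
    return max(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))

x, y, w = [1.0, 5.0, 2.0], [4.0, 1.0, 2.0], [1.0, 1.0, 1.0]
print(weighted_manhattan(x, y, w))  # 7.0
print(weighted_linf(x, y, w))       # 4.0
```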
Similarity Measures
The inner product
s_inner(x, y) = x^T y = ∑_{i=1}^{d} x_i y_i (7)
Closely related to the inner product is the cosine similarity measure
s_cosine(x, y) = x^T y / (‖x‖ ‖y‖) (8)
Where ‖x‖ = (∑_{i=1}^{d} x_i^2)^{1/2} and ‖y‖ = (∑_{i=1}^{d} y_i^2)^{1/2}, the lengths of the vectors!!!
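Eqs. (7)-(8) in Python; the cosine measure depends only on the angle between the vectors, so rescaling either vector leaves it unchanged:

```python
import math

def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def cosine(x, y):
    return inner(x, y) / (math.sqrt(inner(x, x)) * math.sqrt(inner(y, y)))

x, y = [1.0, 0.0], [1.0, 1.0]
print(inner(x, y))                      # 1.0
print(cosine(x, y))                     # cos(45 degrees), about 0.7071
print(cosine([5.0, 0.0], [3.0, 3.0]))   # same angle, so the same similarity
```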
Similarity Measures
Pearson’s correlation coefficient
r_Pearson(x, y) = (x − x̄)^T (y − ȳ) / (‖x − x̄‖ ‖y − ȳ‖) (9)
Where
x̄ = (1/d) ∑_{i=1}^{d} x_i and ȳ = (1/d) ∑_{i=1}^{d} y_i (10)
Properties
It is clear that r_Pearson(x, y) takes values between −1 and +1.
The difference with s_inner is that r_Pearson does not depend directly on x and y, but on their difference vectors from the means x̄ and ȳ.
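A sketch of Eqs. (9)-(10): Pearson's coefficient is the cosine similarity of the mean-centered vectors, which is exactly why it depends only on the deviations from x̄ and ȳ:

```python
import math

def pearson(x, y):
    d = len(x)
    xm, ym = sum(x) / d, sum(y) / d
    xd = [xi - xm for xi in x]          # deviations from the mean of x
    yd = [yi - ym for yi in y]          # deviations from the mean of y
    num = sum(a * b for a, b in zip(xd, yd))
    den = math.sqrt(sum(a * a for a in xd)) * math.sqrt(sum(b * b for b in yd))
    return num / den

print(pearson([1, 2, 3], [2, 4, 6]))    # about 1.0: perfect linear relation
print(pearson([1, 2, 3], [3, 2, 1]))    # about -1.0: perfect anti-correlation
```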
Similarity Measures
The Tanimoto measure or distance
s_T(x, y) = x^T y / (‖x‖^2 + ‖y‖^2 − x^T y) (11)
Something Notable
s_T(x, y) = 1 / (1 + (x − y)^T (x − y) / (x^T y)) (12)
Meaning
The Tanimoto measure decreases as the squared Euclidean distance between x and y grows relative to their inner product.
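A sketch of Eq. (11) with a numeric check of the identity in Eq. (12) (example vectors are made up):

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def tanimoto(x, y):
    # Eq. (11): x.y / (||x||^2 + ||y||^2 - x.y)
    return dot(x, y) / (dot(x, x) + dot(y, y) - dot(x, y))

x, y = [1.0, 2.0], [2.0, 2.0]
diff = [a - b for a, b in zip(x, y)]
print(tanimoto(x, y))                             # Eq. (11)
print(1.0 / (1.0 + dot(diff, diff) / dot(x, y)))  # Eq. (12): the same value
```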
We consider now vectors whose coordinates belong to a finite set
Given F = {0, 1, 2, ..., k − 1}, where k is a positive integer
There are exactly k^d vectors x ∈ F^d.
Now consider x, y ∈ F^d
Let A(x, y) = [a_ij], i, j = 0, 1, ..., k − 1, be a k × k matrix.
Where
The element a_ij is the number of places where the first vector has the symbol i and the corresponding element of the second vector has the symbol j.
Using Indicator functions
We can think of each element a_ij as
a_ij = ∑_{l=1}^{d} I[(x_l, y_l) = (i, j)] (13)
Thanks to Ricardo Llamas and Gibran Felix!!! Class Summer 2015.
Contingency Matrix
Thus
This matrix is called a contingency table.
For example if d = 6 and k = 3
With x = [0, 1, 2, 1, 2, 1]^T and y = [1, 0, 2, 1, 0, 1]^T
Then
A(x, y) =
[ 0 1 0 ]
[ 1 2 0 ]
[ 1 0 1 ] (14)
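A short Python sketch that builds A(x, y) by the counting rule of Eq. (13) and reproduces the worked example with d = 6 and k = 3:

```python
def contingency(x, y, k):
    """Build the k x k matrix A(x, y): a_ij counts positions l with (x_l, y_l) = (i, j)."""
    A = [[0] * k for _ in range(k)]
    for xl, yl in zip(x, y):
        A[xl][yl] += 1
    return A

x = [0, 1, 2, 1, 2, 1]
y = [1, 0, 2, 1, 0, 1]
A = contingency(x, y, k=3)
for row in A:
    print(row)                      # reproduces the matrix of Eq. (14)
print(sum(sum(row) for row in A))   # 6 = d, as in Eq. (15)
```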
Thus, we have that
Easy to verify that
∑_{i=0}^{k−1} ∑_{j=0}^{k−1} a_ij = d (15)
Something Notable
Most of the proximity measures between two discrete-valued vectors may be expressed as combinations of elements of matrix A(x, y).
Dissimilarity Measures
The Hamming distance
It is defined as the number of places where two vectors differ:
d(x, y) = ∑_{i=0}^{k−1} ∑_{j=0, j≠i}^{k−1} a_ij (16)
That is, the summation of all the off-diagonal elements of A, which indicate the positions where x and y differ.
Special case k = 2, thus vectors are binary valued
The vectors x ∈ F^d are binary valued and the Hamming distance becomes
d_H(x, y) = ∑_{i=1}^{d} (x_i + y_i − 2 x_i y_i) = ∑_{i=1}^{d} (x_i − y_i)^2
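Eq. (16) sketched against the example vectors from the contingency-table slide: the off-diagonal mass of A(x, y) equals a direct count of mismatching positions.

```python
def contingency(x, y, k):
    A = [[0] * k for _ in range(k)]
    for xl, yl in zip(x, y):
        A[xl][yl] += 1
    return A

def hamming_from_A(A):
    # Sum of all off-diagonal entries: positions where the symbols differ.
    return sum(A[i][j] for i in range(len(A)) for j in range(len(A)) if i != j)

x = [0, 1, 2, 1, 2, 1]
y = [1, 0, 2, 1, 0, 1]
print(hamming_from_A(contingency(x, y, 3)))        # 3 off-diagonal positions
print(sum(1 for xi, yi in zip(x, y) if xi != yi))  # 3: the same count, directly
```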
Dissimilarity Measures
The l_1 distance
It is defined as in the case of the continuous-valued vectors,
d_1(x, y) = ∑_{i=1}^{d} w_i |x_i − y_i| (17)
Similarity Measures
The Tanimoto measure
Given two sets X and Y, we have that
s_T(X, Y) = n_{X∩Y} / (n_X + n_Y − n_{X∩Y}) (18)
where n_X = |X|, n_Y = |Y|, n_{X∩Y} = |X ∩ Y|.
Thus, if x, y are discrete-valued vectors
The measure takes into account all pairs of corresponding coordinates, except those whose corresponding coordinates (x_i, y_i) are both 0.
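Eq. (18) in Python; since n_X + n_Y − n_{X∩Y} = |X ∪ Y| by inclusion-exclusion, the set form of the Tanimoto measure coincides with the Jaccard similarity that appears later in these slides:

```python
def tanimoto_sets(X, Y):
    inter = len(X & Y)
    return inter / (len(X) + len(Y) - inter)

X, Y = {1, 2, 3, 4}, {3, 4, 5}
print(tanimoto_sets(X, Y))      # 0.4: 2 shared elements over 5 in the union
print(len(X & Y) / len(X | Y))  # 0.4: the Jaccard form gives the same value
```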
Thus
We now define
n_x = ∑_{i=1}^{k−1} ∑_{j=0}^{k−1} a_ij and n_y = ∑_{i=0}^{k−1} ∑_{j=1}^{k−1} a_ij
In other words
n_x and n_y denote the number of nonzero coordinates of x and y, respectively.
Thus, we have
s_T(x, y) = ∑_{i=1}^{k−1} a_ii / (n_x + n_y − ∑_{i=1}^{k−1} a_ii) (19)
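A sketch of Eq. (19), reading the numerator as the diagonal entries a_ii with i ≥ 1, i.e. the coordinates where x and y agree on a nonzero symbol (the example reuses the vectors from the contingency-table slide):

```python
def contingency(x, y, k):
    A = [[0] * k for _ in range(k)]
    for xl, yl in zip(x, y):
        A[xl][yl] += 1
    return A

def tanimoto_discrete(x, y, k):
    A = contingency(x, y, k)
    agree = sum(A[i][i] for i in range(1, k))   # nonzero symbols where x, y agree
    n_x = sum(1 for xi in x if xi != 0)         # nonzero coordinates of x
    n_y = sum(1 for yi in y if yi != 0)         # nonzero coordinates of y
    return agree / (n_x + n_y - agree)

x = [0, 1, 2, 1, 2, 1]
y = [1, 0, 2, 1, 0, 1]
print(tanimoto_discrete(x, y, 3))  # 0.5: 3 nonzero agreements over 5 + 4 - 3
```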
Between Sets
Jaccard Similarity
Given two sets X and Y, we have that
J(X, Y) = |X ∩ Y| / |X ∪ Y| (20)
Although there is more to explore
Please Refer to
1 S. Theodoridis and K. Koutroumbas. Pattern Recognition. Chapter 11.
2 R. Xu and D. Wunsch. 2009. Clustering. Wiley-IEEE Press.
M03 nb-02
 
Ridge regression, lasso and elastic net
Ridge regression, lasso and elastic netRidge regression, lasso and elastic net
Ridge regression, lasso and elastic net
 
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
 
Preparation Data Structures 10 trees
Preparation Data Structures 10 treesPreparation Data Structures 10 trees
Preparation Data Structures 10 trees
 
Cluster Analysis
Cluster AnalysisCluster Analysis
Cluster Analysis
 
15857 cse422 unsupervised-learning
15857 cse422 unsupervised-learning15857 cse422 unsupervised-learning
15857 cse422 unsupervised-learning
 
Scaling Multinomial Logistic Regression via Hybrid Parallelism
Scaling Multinomial Logistic Regression via Hybrid ParallelismScaling Multinomial Logistic Regression via Hybrid Parallelism
Scaling Multinomial Logistic Regression via Hybrid Parallelism
 
Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Statistical Pattern recognition(1)
Statistical Pattern recognition(1)
 
06 Machine Learning - Naive Bayes
06 Machine Learning - Naive Bayes06 Machine Learning - Naive Bayes
06 Machine Learning - Naive Bayes
 
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorKaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

Viewers also liked

What is cluster analysis
What is cluster analysisWhat is cluster analysis
What is cluster analysisPrabhat gangwar
 
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...Salah Amean
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataSalah Amean
 
similarity measure
similarity measure similarity measure
similarity measure ZHAO Sam
 
Cluster analysis for market segmentation
Cluster analysis for market segmentationCluster analysis for market segmentation
Cluster analysis for market segmentationVishal Tandel
 

Viewers also liked (7)

What is cluster analysis
What is cluster analysisWhat is cluster analysis
What is cluster analysis
 
Collaborative filtering
Collaborative filteringCollaborative filtering
Collaborative filtering
 
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, data
 
similarity measure
similarity measure similarity measure
similarity measure
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
Cluster analysis for market segmentation
Cluster analysis for market segmentationCluster analysis for market segmentation
Cluster analysis for market segmentation
 

Similar to 27 Machine Learning Unsupervised Measure Properties

20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variationsAndres Mendez-Vazquez
 
Similarity Measures (pptx)
Similarity Measures (pptx)Similarity Measures (pptx)
Similarity Measures (pptx)JackDi2
 
Preparation Data Structures 11 graphs
Preparation Data Structures 11 graphsPreparation Data Structures 11 graphs
Preparation Data Structures 11 graphsAndres Mendez-Vazquez
 
Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Zihui Li
 
Decision Trees and Bayes Classifiers
Decision Trees and Bayes ClassifiersDecision Trees and Bayes Classifiers
Decision Trees and Bayes ClassifiersAlexander Jung
 
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...Simplilearn
 
kmean_naivebayes.pptx
kmean_naivebayes.pptxkmean_naivebayes.pptx
kmean_naivebayes.pptxAryanhayaran
 
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...MostafaHazemMostafaa
 
Cheatsheet unsupervised-learning
Cheatsheet unsupervised-learningCheatsheet unsupervised-learning
Cheatsheet unsupervised-learningSteve Nouri
 
Output Units and Cost Function in FNN
Output Units and Cost Function in FNNOutput Units and Cost Function in FNN
Output Units and Cost Function in FNNLin JiaMing
 
Cantor Infinity theorems
Cantor Infinity theoremsCantor Infinity theorems
Cantor Infinity theoremsOren Ish-Am
 

Similar to 27 Machine Learning Unsupervised Measure Properties (20)

Lect4
Lect4Lect4
Lect4
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations
 
Machine Learning Basics
Machine Learning BasicsMachine Learning Basics
Machine Learning Basics
 
Similarity Measures (pptx)
Similarity Measures (pptx)Similarity Measures (pptx)
Similarity Measures (pptx)
 
Preparation Data Structures 11 graphs
Preparation Data Structures 11 graphsPreparation Data Structures 11 graphs
Preparation Data Structures 11 graphs
 
cluster analysis
cluster analysiscluster analysis
cluster analysis
 
Decision tree
Decision treeDecision tree
Decision tree
 
Decision tree
Decision treeDecision tree
Decision tree
 
Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)
 
Cs345 cl
Cs345 clCs345 cl
Cs345 cl
 
Decision Trees and Bayes Classifiers
Decision Trees and Bayes ClassifiersDecision Trees and Bayes Classifiers
Decision Trees and Bayes Classifiers
 
PR07.pdf
PR07.pdfPR07.pdf
PR07.pdf
 
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
 
kmean_naivebayes.pptx
kmean_naivebayes.pptxkmean_naivebayes.pptx
kmean_naivebayes.pptx
 
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...
2_9_asset-v1-ColumbiaX+CSMM.101x+2T2017+type@asset+block@AI_edx_ml_unsupervis...
 
Cheatsheet unsupervised-learning
Cheatsheet unsupervised-learningCheatsheet unsupervised-learning
Cheatsheet unsupervised-learning
 
Output Units and Cost Function in FNN
Output Units and Cost Function in FNNOutput Units and Cost Function in FNN
Output Units and Cost Function in FNN
 
[PPT]
[PPT][PPT]
[PPT]
 
Cantor Infinity theorems
Cantor Infinity theoremsCantor Infinity theorems
Cantor Infinity theorems
 
TunUp final presentation
TunUp final presentationTunUp final presentation
TunUp final presentation
 

More from Andres Mendez-Vazquez

01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectorsAndres Mendez-Vazquez
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issuesAndres Mendez-Vazquez
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiationAndres Mendez-Vazquez
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep LearningAndres Mendez-Vazquez
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learningAndres Mendez-Vazquez
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusAndres Mendez-Vazquez
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusAndres Mendez-Vazquez
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesAndres Mendez-Vazquez
 
Introduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusIntroduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusAndres Mendez-Vazquez
 
Introduction Machine Learning Syllabus
Introduction Machine Learning SyllabusIntroduction Machine Learning Syllabus
Introduction Machine Learning SyllabusAndres Mendez-Vazquez
 

More from Andres Mendez-Vazquez (20)

2.03 bayesian estimation
2.03 bayesian estimation2.03 bayesian estimation
2.03 bayesian estimation
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues
 
01.02 linear equations
01.02 linear equations01.02 linear equations
01.02 linear equations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
06 recurrent neural_networks
06 recurrent neural_networks06 recurrent neural_networks
06 recurrent neural_networks
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation
 
Zetta global
Zetta globalZetta global
Zetta global
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learning
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning Syllabus
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabus
 
Ideas 09 22_2018
Ideas 09 22_2018Ideas 09 22_2018
Ideas 09 22_2018
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data Sciences
 
Analysis of Algorithms Syllabus
Analysis of Algorithms  SyllabusAnalysis of Algorithms  Syllabus
Analysis of Algorithms Syllabus
 
17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension
 
A basic introduction to learning
A basic introduction to learningA basic introduction to learning
A basic introduction to learning
 
Introduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusIntroduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems Syllabus
 
Introduction Machine Learning Syllabus
Introduction Machine Learning SyllabusIntroduction Machine Learning Syllabus
Introduction Machine Learning Syllabus
 

Recently uploaded

Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxMustafa Ahmed
 
CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalSwarnaSLcse
 
Presentation on Slab, Beam, Column, and Foundation/Footing
Presentation on Slab,  Beam, Column, and Foundation/FootingPresentation on Slab,  Beam, Column, and Foundation/Footing
Presentation on Slab, Beam, Column, and Foundation/FootingEr. Suman Jyoti
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxKarpagam Institute of Teechnology
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentationsj9399037128
 
Databricks Generative AI Fundamentals .pdf
Databricks Generative AI Fundamentals  .pdfDatabricks Generative AI Fundamentals  .pdf
Databricks Generative AI Fundamentals .pdfVinayVadlagattu
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Studentskannan348865
 
Circuit Breakers for Engineering Students
Circuit Breakers for Engineering StudentsCircuit Breakers for Engineering Students
Circuit Breakers for Engineering Studentskannan348865
 
Databricks Generative AI FoundationCertified.pdf
Databricks Generative AI FoundationCertified.pdfDatabricks Generative AI FoundationCertified.pdf
Databricks Generative AI FoundationCertified.pdfVinayVadlagattu
 
SLIDESHARE PPT-DECISION MAKING METHODS.pptx
SLIDESHARE PPT-DECISION MAKING METHODS.pptxSLIDESHARE PPT-DECISION MAKING METHODS.pptx
SLIDESHARE PPT-DECISION MAKING METHODS.pptxCHAIRMAN M
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...archanaece3
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Toolssoginsider
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxkalpana413121
 
DBMS-Report on Student management system.pptx
DBMS-Report on Student management system.pptxDBMS-Report on Student management system.pptx
DBMS-Report on Student management system.pptxrajjais1221
 
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...mikehavy0
 
Passive Air Cooling System and Solar Water Heater.ppt
Passive Air Cooling System and Solar Water Heater.pptPassive Air Cooling System and Solar Water Heater.ppt
Passive Air Cooling System and Solar Water Heater.pptamrabdallah9
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailingAshishSingh1301
 
Artificial Intelligence in due diligence
Artificial Intelligence in due diligenceArtificial Intelligence in due diligence
Artificial Intelligence in due diligencemahaffeycheryld
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxMustafa Ahmed
 
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...Christo Ananth
 

Recently uploaded (20)

Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptx
 
CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference Modal
 
Presentation on Slab, Beam, Column, and Foundation/Footing
Presentation on Slab,  Beam, Column, and Foundation/FootingPresentation on Slab,  Beam, Column, and Foundation/Footing
Presentation on Slab, Beam, Column, and Foundation/Footing
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptx
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentation
 
Databricks Generative AI Fundamentals .pdf
Databricks Generative AI Fundamentals  .pdfDatabricks Generative AI Fundamentals  .pdf
Databricks Generative AI Fundamentals .pdf
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Students
 
Circuit Breakers for Engineering Students
Circuit Breakers for Engineering StudentsCircuit Breakers for Engineering Students
Circuit Breakers for Engineering Students
 
Databricks Generative AI FoundationCertified.pdf
Databricks Generative AI FoundationCertified.pdfDatabricks Generative AI FoundationCertified.pdf
Databricks Generative AI FoundationCertified.pdf
 
SLIDESHARE PPT-DECISION MAKING METHODS.pptx
SLIDESHARE PPT-DECISION MAKING METHODS.pptxSLIDESHARE PPT-DECISION MAKING METHODS.pptx
SLIDESHARE PPT-DECISION MAKING METHODS.pptx
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptx
 
DBMS-Report on Student management system.pptx
DBMS-Report on Student management system.pptxDBMS-Report on Student management system.pptx
DBMS-Report on Student management system.pptx
 
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...
☎️Looking for Abortion Pills? Contact +27791653574.. 💊💊Available in Gaborone ...
 
Passive Air Cooling System and Solar Water Heater.ppt
Passive Air Cooling System and Solar Water Heater.pptPassive Air Cooling System and Solar Water Heater.ppt
Passive Air Cooling System and Solar Water Heater.ppt
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailing
 
Artificial Intelligence in due diligence
Artificial Intelligence in due diligenceArtificial Intelligence in due diligence
Artificial Intelligence in due diligence
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptx
 
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...
Call for Papers - Journal of Electrical Systems (JES), E-ISSN: 1112-5209, ind...
 

27 Machine Learning Unsupervised Measure Properties

  • 9. Images/cinvestav- Hard Partitional Clustering Definition Hard partitional clustering attempts to seek a K-partition of X, C = {C1, C2, ..., CK } with K ≤ N such that 1 Ci ≠ ∅ for i = 1, ..., K. 2 ∪(i=1,...,K) Ci = X. 3 Ci ∩ Cj = ∅ for all i, j = 1, ..., K and i ≠ j. Clearly The third property can be relaxed to obtain the mathematical definition of fuzzy and possibilistic clustering. 6 / 37
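The three conditions of a hard K-partition (non-empty clusters, full coverage of X, pairwise disjointness) can be checked directly. A minimal sketch, with clusters represented as Python sets (function name and sample data are illustrative, not from the slides):

```python
def is_hard_partition(X, clusters):
    """Check the three conditions of a hard K-partition of X."""
    # 1. No cluster is empty.
    if any(len(C) == 0 for C in clusters):
        return False
    # 2. The clusters together cover X.
    if set().union(*clusters) != set(X):
        return False
    # 3. Clusters are pairwise disjoint.
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if clusters[i] & clusters[j]:
                return False
    return True

X = {1, 2, 3, 4, 5}
print(is_hard_partition(X, [{1, 2}, {3}, {4, 5}]))    # True
print(is_hard_partition(X, [{1, 2}, {2, 3}, {4, 5}])) # False: 2 appears twice
```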
  • 14. Images/cinvestav- For example Given the membership Ai (xj) ∈ [0, 1] We can have Σ(i=1 to K) Ai (xj) = 1, ∀j and Σ(j=1 to N) Ai (xj) < N, ∀i. 7 / 37
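These two fuzzy-membership constraints can be verified on a membership matrix whose entry A[i, j] = Ai(xj). A sketch with hypothetical membership values:

```python
import numpy as np

# Hypothetical fuzzy membership matrix: rows are clusters i, columns are
# points j, entry A[i, j] = A_i(x_j) in [0, 1].
A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.8, 0.9]])

# Each point's memberships sum to 1 across all K clusters.
assert np.allclose(A.sum(axis=0), 1.0)

# No cluster fully absorbs all N points: each row sums to less than N.
N = A.shape[1]
assert (A.sum(axis=1) < N).all()
```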
  • 15. Images/cinvestav- Hierarchical Clustering Definition Hierarchical clustering attempts to construct a tree-like, nested structure partition of X, H = {H1, ..., HQ} with Q ≤ N such that If Ci ∈ Hm, Cj ∈ Hl and m > l, then Ci ⊂ Cj or Ci ∩ Cj = ∅ for all i, j = 1, ..., K, i ≠ j and m, l = 1, ..., Q. 8 / 37
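The nesting condition says that every cluster of a finer level is either contained in, or disjoint from, every cluster of a coarser level. A minimal sketch (names and example levels are illustrative):

```python
def is_nested(fine, coarse):
    """Every cluster in `fine` is a subset of, or disjoint from,
    every cluster in `coarse` -- the hierarchical nesting condition."""
    return all(c_i <= c_j or not (c_i & c_j)
               for c_i in fine for c_j in coarse)

H_l = [{1, 2, 3}, {4, 5}]      # coarser level
H_m = [{1, 2}, {3}, {4, 5}]    # finer level (m > l)
print(is_nested(H_m, H_l))     # True
print(is_nested([{1, 4}], H_l))  # False: {1, 4} straddles two clusters
```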
  • 17. Images/cinvestav- Outline 1 Introduction Introduction 2 Features Types of Features 3 Similarity and Dissimilarity Measures Basic Definition 4 Proximity Measures between Two Points Real-Valued Vectors Dissimilarity Measures Similarity Measures Discrete-Valued Vectors Dissimilarity Measures Similarity Measures 5 Other Examples Between Sets 9 / 37
  • 18. Images/cinvestav- Features Usually A data object is described by a set of features or variables, usually represented as a multidimensional vector. Thus We tend to collect all that information as a pattern matrix of dimensionality N × d. Then A feature can be classified as continuous, discrete, or binary. 10 / 37
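The N × d pattern matrix can be sketched with NumPy; each row is one pattern xj and each column one feature (the values below are purely illustrative):

```python
import numpy as np

# N = 4 objects, d = 3 features: row j is the pattern x_j,
# entry [j, i] is the feature value x_ij.
X = np.array([[1.2, 0.5, 3.1],
              [0.9, 0.7, 2.8],
              [5.0, 4.2, 0.1],
              [4.8, 4.0, 0.3]])

N, d = X.shape
print(N, d)        # 4 3
print(X[0])        # the first pattern x_1
print(X[:, 0])     # the first feature across all N patterns
```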
  • 21. Images/cinvestav- Thus Continuous A continuous feature takes values from an uncountably infinite range set. Discrete Discrete features only have finite, or at most, a countably infinite number of values. Binary Binary or dichotomous features are a special case of discrete features when they have exactly two values. 11 / 37
  • 24. Images/cinvestav- Measurement Level Meaning Another property of features is the measurement level, which reflects the relative significance of numbers. We have four from lowest to highest Nominal, ordinal, interval, and ratio. 12 / 37
  • 26. Measurement Level Nominal Features at this level are represented with labels, states, or names. The numbers are meaningless in any mathematical sense and no mathematical calculation can be made. For example, peristalsis ∈ {hypermotile, normal, hypomotile, absent}. Ordinal Features at this level are also names, but with a certain order implied. However, the difference between the values is again meaningless. For example, abdominal distension ∈ {slight, moderate, high}. 13 / 37
  • 32. Measurement Level Interval Features at this level offer a meaningful interpretation of the difference between two values. However, there exists no true zero and the ratio between two values is meaningless. For example, the concept cold can be seen as an interval. Ratio Features at this level possess all the properties of the above levels. There is an absolute zero. Thus, we have the meaning of ratio!!! 14 / 37
  • 39. Similarity Measures Definition A Similarity Measure s on X is a function s : X × X → R (1) such that: 1. s(x, y) ≤ s_0 for all x, y ∈ X. 2. s(x, x) = s_0 for all x ∈ X. 3. s(x, y) = s(y, x) for all x, y ∈ X. 4. s(x, y) = s_0 ⇐⇒ x = y. 5. s(x, y) s(y, z) ≤ s(x, z) [s(x, y) + s(y, z)] for all x, y, z ∈ X. 16 / 37
  • 45. Dissimilarity Measures Definition A Dissimilarity Measure d on X is a function d : X × X → R (2) such that: 1. d(x, y) ≥ d_0 for all x, y ∈ X. 2. d(x, x) = d_0 for all x ∈ X. 3. d(x, y) = d(y, x) for all x, y ∈ X. 4. d(x, y) = d_0 ⇐⇒ x = y. 5. d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X. 17 / 37
  • 52. Dissimilarity Measures The Hamming distance for {0, 1} vectors The vectors here are collections of zeros and ones. Thus, d_H(x, y) = Σ_{i=1}^{d} (x_i − y_i)^2 (3) Euclidean distance d(x, y) = (Σ_{i=1}^{d} |x_i − y_i|^2)^{1/2} (4) 19 / 37
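As a quick sanity check, the following sketch (ours, not from the slides; the function names `hamming_binary` and `euclidean` are our own) verifies that for {0, 1} vectors the Hamming distance of Eq. (3) equals the square of the Euclidean distance of Eq. (4):

```python
def hamming_binary(x, y):
    """Number of positions where two 0/1 vectors differ (Eq. 3)."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def euclidean(x, y):
    """Plain (unweighted) Euclidean distance (Eq. 4)."""
    return sum(abs(xi - yi) ** 2 for xi, yi in zip(x, y)) ** 0.5

x = [1, 0, 1, 1, 0]
y = [0, 0, 1, 0, 1]
# Positions 0, 3, 4 differ, so the Hamming distance is 3.
assert hamming_binary(x, y) == 3
# For binary vectors, squaring the Euclidean distance recovers it.
assert abs(euclidean(x, y) ** 2 - hamming_binary(x, y)) < 1e-9
```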
  • 54. Dissimilarity Measures The weighted l_p metric DMs d_p(x, y) = (Σ_{i=1}^{d} w_i |x_i − y_i|^p)^{1/p} (5) Where x_i, y_i are the ith coordinates of x and y, i = 1, ..., d, and w_i ≥ 0 is the ith weight coefficient. Important A well-known member of this family (with w_i = 1 and p = 2) is the Euclidean distance. 20 / 37
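A minimal sketch of the weighted l_p metric of Eq. (5); the function name `weighted_lp` and the sample data are ours, not from the slides:

```python
def weighted_lp(x, y, w, p):
    """Weighted l_p metric of Eq. (5): (sum w_i |x_i - y_i|^p)^(1/p)."""
    return sum(wi * abs(xi - yi) ** p for wi, xi, yi in zip(w, x, y)) ** (1.0 / p)

x, y = [1.0, 2.0, 3.0], [2.0, 0.0, 3.0]
w = [1.0, 1.0, 1.0]
# With unit weights, p = 1 gives the Manhattan distance ...
assert abs(weighted_lp(x, y, w, 1) - 3.0) < 1e-12   # |1-2| + |2-0| + |3-3|
# ... and p = 2 gives the Euclidean distance sqrt(1 + 4 + 0).
assert abs(weighted_lp(x, y, w, 2) - 5 ** 0.5) < 1e-12
```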
  • 57. Dissimilarity Measures The weighted l_2 metric DM can be further generalized as follows d(x, y) = ((x − y)^T B (x − y))^{1/2} (6) Where B is a symmetric, positive definite matrix. A special case This includes the Mahalanobis distance as a special case, obtained when B is the inverse of the covariance matrix. 21 / 37
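The quadratic form of Eq. (6) can be sketched in plain Python as follows (the name `quadratic_distance` is ours; B is passed as a nested list). Choosing B as the identity matrix should recover the ordinary Euclidean distance:

```python
def quadratic_distance(x, y, B):
    """Generalized l_2 metric of Eq. (6): sqrt((x - y)^T B (x - y))."""
    d = [xi - yi for xi, yi in zip(x, y)]
    q = sum(d[i] * B[i][j] * d[j]
            for i in range(len(d)) for j in range(len(d)))
    return q ** 0.5

x, y = [1.0, 2.0], [3.0, 1.0]
I = [[1.0, 0.0], [0.0, 1.0]]
# With B = I, d = (-2, 1) and the quadratic form is 4 + 1 = 5.
assert abs(quadratic_distance(x, y, I) - 5 ** 0.5) < 1e-12
```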
  • 59. Dissimilarity Measures Special l_p metric DMs that are also encountered in practice are d_1(x, y) = Σ_{i=1}^{d} w_i |x_i − y_i| (Weighted Manhattan Norm) and d_∞(x, y) = max_{1≤i≤d} w_i |x_i − y_i| (Weighted l_∞ Norm). 22 / 37
  • 61. Similarity Measures The inner product s_inner(x, y) = x^T y = Σ_{i=1}^{d} x_i y_i (7) Closely related to the inner product is the cosine similarity measure s_cosine(x, y) = x^T y / (‖x‖ ‖y‖) (8) Where ‖x‖ = (Σ_{i=1}^{d} x_i^2)^{1/2} and ‖y‖ = (Σ_{i=1}^{d} y_i^2)^{1/2}, the lengths of the vectors!!! 23 / 37
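Eq. (8) in a few lines of Python (a sketch of ours; the name `cosine_similarity` is not from the slides). Orthogonal vectors give 0, and vectors pointing in the same direction give 1 regardless of their lengths:

```python
def cosine_similarity(x, y):
    """Cosine similarity of Eq. (8): inner product over the two norms."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    nx = sum(xi ** 2 for xi in x) ** 0.5
    ny = sum(yi ** 2 for yi in y) ** 0.5
    return dot / (nx * ny)

assert abs(cosine_similarity([1, 0], [0, 1]) - 0.0) < 1e-12   # orthogonal
assert abs(cosine_similarity([2, 2], [1, 1]) - 1.0) < 1e-12   # same direction
```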
  • 63. Similarity Measures Pearson’s correlation coefficient r_Pearson(x, y) = x_d^T y_d / (‖x_d‖ ‖y_d‖) (9) Where x_d = x − x̄1 and y_d = y − ȳ1 are the mean-centered difference vectors, with x̄ = (1/d) Σ_{i=1}^{d} x_i and ȳ = (1/d) Σ_{i=1}^{d} y_i (10) Properties It is clear that r_Pearson(x, y) takes values between −1 and +1. The difference from s_inner is that r_Pearson does not depend directly on x and y but on their corresponding difference vectors. 24 / 37
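Since r_Pearson is just the cosine similarity of the mean-centered vectors, it can be sketched directly from Eqs. (9)-(10) (the function name `pearson` is ours). A perfect increasing linear relation gives +1 and a perfect decreasing one gives -1:

```python
def pearson(x, y):
    """Pearson correlation (Eq. 9): cosine of the mean-centered vectors."""
    d = len(x)
    xm, ym = sum(x) / d, sum(y) / d
    xd = [xi - xm for xi in x]           # x_d = x - mean(x) * 1
    yd = [yi - ym for yi in y]           # y_d = y - mean(y) * 1
    dot = sum(a * b for a, b in zip(xd, yd))
    return dot / (sum(a * a for a in xd) ** 0.5 *
                  sum(b * b for b in yd) ** 0.5)

assert abs(pearson([1, 2, 3], [2, 4, 6]) - 1.0) < 1e-12   # y = 2x
assert abs(pearson([1, 2, 3], [3, 2, 1]) + 1.0) < 1e-12   # y = 4 - x
```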
  • 67. Similarity Measures The Tanimoto measure or distance s_T(x, y) = x^T y / (‖x‖^2 + ‖y‖^2 − x^T y) (11) Something Notable s_T(x, y) = 1 / (1 + (x − y)^T (x − y) / x^T y) (12) Meaning The Tanimoto measure is inversely related to the squared Euclidean distance, scaled by the inner product x^T y. 25 / 37
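The equivalence of Eqs. (11) and (12) follows from ‖x‖^2 + ‖y‖^2 − x^T y = (x − y)^T (x − y) + x^T y, and can be checked numerically with this sketch (function names are ours):

```python
def tanimoto(x, y):
    """Tanimoto measure, Eq. (11)."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    nx2 = sum(xi * xi for xi in x)
    ny2 = sum(yi * yi for yi in y)
    return dot / (nx2 + ny2 - dot)

def tanimoto_alt(x, y):
    """Equivalent form, Eq. (12), via the squared Euclidean distance."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    sq = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return 1.0 / (1.0 + sq / dot)

x, y = [1.0, 2.0, 3.0], [2.0, 2.0, 1.0]
assert abs(tanimoto(x, y) - tanimoto_alt(x, y)) < 1e-12
assert abs(tanimoto(x, x) - 1.0) < 1e-12   # identical vectors give s_T = 1
```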
  • 71. We consider now vectors whose coordinates belong to a finite set Given F = {0, 1, 2, ..., k − 1}, where k is a positive integer, there are exactly k^d vectors x ∈ F^d Now consider x, y ∈ F^d With A(x, y) = [a_ij], i, j = 0, 1, ..., k − 1, a k × k matrix Where The element a_ij is the number of places where the first vector has the symbol i and the corresponding element of the second vector has the symbol j. 27 / 37
  • 74. Using Indicator functions We can think of each element a_ij as a_ij = Σ_{l=1}^{d} I[(x_l, y_l) = (i, j)] (13) Thanks to Ricardo Llamas and Gibran Felix!!! Class Summer 2015. 28 / 37
  • 75. Contingency Matrix Thus This matrix is called a contingency table. For example, if d = 6 and k = 3 With x = [0, 1, 2, 1, 2, 1]^T and y = [1, 0, 2, 1, 0, 1]^T Then A(x, y) = [[0, 1, 0], [1, 2, 0], [1, 0, 1]] (14) 29 / 37
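Building the contingency table is a single pass over the coordinate pairs; this sketch (the name `contingency` is ours) reproduces the d = 6, k = 3 example of Eq. (14):

```python
def contingency(x, y, k):
    """k x k contingency table A(x, y): A[i][j] counts positions
    where x has symbol i and y has symbol j (Eq. 13)."""
    A = [[0] * k for _ in range(k)]
    for xi, yi in zip(x, y):
        A[xi][yi] += 1
    return A

x = [0, 1, 2, 1, 2, 1]
y = [1, 0, 2, 1, 0, 1]
A = contingency(x, y, 3)
assert A == [[0, 1, 0], [1, 2, 0], [1, 0, 1]]   # matches Eq. (14)
assert sum(map(sum, A)) == len(x)               # entries sum to d (Eq. 15)
```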
  • 78. Thus, we have that Easy to verify that Σ_{i=0}^{k−1} Σ_{j=0}^{k−1} a_ij = d (15) Something Notable Most of the proximity measures between two discrete-valued vectors may be expressed as combinations of elements of matrix A(x, y). 30 / 37
  • 80. Dissimilarity Measures The Hamming distance It is defined as the number of places where two vectors differ: d_H(x, y) = Σ_{i=0}^{k−1} Σ_{j=0, j≠i}^{k−1} a_ij (16) That is, the summation of all the off-diagonal elements of A, which indicate the positions where x and y differ. Special case k = 2, thus vectors are binary valued The vectors x ∈ F^d are binary valued and the Hamming distance becomes d_H(x, y) = Σ_{i=1}^{d} (x_i + y_i − 2 x_i y_i) = Σ_{i=1}^{d} (x_i − y_i)^2 31 / 37
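Eq. (16) can be checked against a direct position-by-position count: summing the off-diagonal entries of the contingency table gives exactly the Hamming distance. A sketch of ours (both function names are our own):

```python
def contingency(x, y, k):
    """k x k table: A[i][j] counts positions where x has i and y has j."""
    A = [[0] * k for _ in range(k)]
    for xi, yi in zip(x, y):
        A[xi][yi] += 1
    return A

def hamming_from_table(A):
    """Hamming distance as the sum of off-diagonal entries (Eq. 16)."""
    k = len(A)
    return sum(A[i][j] for i in range(k) for j in range(k) if i != j)

x = [0, 1, 2, 1, 2, 1]
y = [1, 0, 2, 1, 0, 1]
direct = sum(1 for xi, yi in zip(x, y) if xi != yi)
# Positions 0, 1, 4 differ, so both counts are 3.
assert hamming_from_table(contingency(x, y, 3)) == direct == 3
```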
  • 82. Dissimilarity Measures The l_1 distance It is defined as in the case of the continuous-valued vectors, d_1(x, y) = Σ_{i=1}^{d} w_i |x_i − y_i| (17) 32 / 37
  • 83. Similarity Measures The Tanimoto measure Given two sets X and Y, we have that s_T(X, Y) = n_{X∩Y} / (n_X + n_Y − n_{X∩Y}) (18) where n_X = |X|, n_Y = |Y|, n_{X∩Y} = |X ∩ Y|. Thus, if x, y are discrete-valued vectors The measure takes into account all pairs of corresponding coordinates, except those whose corresponding coordinates (x_i, y_i) are both 0. 33 / 37
  • 86. Thus We now define n_x = Σ_{i=1}^{k−1} Σ_{j=0}^{k−1} a_ij and n_y = Σ_{i=0}^{k−1} Σ_{j=1}^{k−1} a_ij In other words n_x and n_y denote the number of nonzero coordinates of x and y, respectively. Thus, we have s_T(x, y) = Σ_{i=1}^{k−1} a_ii / (n_x + n_y − Σ_{i=1}^{k−1} a_ii) (19) 34 / 37
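Eq. (19) can be sketched directly from the contingency table (function names are ours). For binary vectors it reduces to the set Tanimoto of Eq. (18) applied to the supports of x and y, which the assertion below checks:

```python
def contingency(x, y, k):
    """k x k table: A[i][j] counts positions where x has i and y has j."""
    A = [[0] * k for _ in range(k)]
    for xi, yi in zip(x, y):
        A[xi][yi] += 1
    return A

def tanimoto_discrete(x, y, k):
    """Tanimoto measure for discrete-valued vectors, Eq. (19)."""
    A = contingency(x, y, k)
    n_x = sum(A[i][j] for i in range(1, k) for j in range(k))  # nonzeros of x
    n_y = sum(A[i][j] for i in range(k) for j in range(1, k))  # nonzeros of y
    agree = sum(A[i][i] for i in range(1, k))  # matching nonzero coordinates
    return agree / (n_x + n_y - agree)

x = [1, 1, 0, 1, 0]
y = [1, 0, 0, 1, 1]
# Supports: X = {0, 1, 3}, Y = {0, 3, 4}; |X ∩ Y| = 2, so 2 / (3 + 3 - 2).
assert abs(tanimoto_discrete(x, y, 2) - 0.5) < 1e-12
```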
  • 90. Between Sets Jaccard Similarity Given two sets X and Y, we have that J(X, Y) = |X ∩ Y| / |X ∪ Y| (20) 36 / 37
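By inclusion-exclusion, |X| + |Y| − |X ∩ Y| = |X ∪ Y|, so the Jaccard similarity of Eq. (20) coincides with the set Tanimoto measure of Eq. (18). A sketch of ours checking this:

```python
def jaccard(X, Y):
    """Jaccard similarity, Eq. (20)."""
    return len(X & Y) / len(X | Y)

def tanimoto_sets(X, Y):
    """Set Tanimoto measure, Eq. (18)."""
    n_int = len(X & Y)
    return n_int / (len(X) + len(Y) - n_int)

X, Y = {1, 2, 3}, {2, 3, 4}
assert jaccard(X, Y) == 0.5          # |{2, 3}| / |{1, 2, 3, 4}|
assert jaccard(X, Y) == tanimoto_sets(X, Y)
```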
  • 92. Although there is more material to explore Please Refer to 1 Theodoridis’ Book, Chapter 11. 2 Rui Xu and Don Wunsch. 2009. Clustering. Wiley-IEEE Press. 37 / 37