Chapter 5. Fuzzy Clustering and Merging
5.1 Fuzzy Weights
Let {x^{(q)} : q = 1,...,Q} be a sample of feature vectors. During clustering we assign
the Q feature vectors to K clusters according to how close they are to the prototype for
each cluster. As was discussed previously, feature vectors that are farther away from the
prototypical vector for a cluster should not have as much weight as those that are
centrally and densely located. These more distant feature vectors are outliers, caused by
errors in one or more measurements or by a deviation in the process that formed the
object.
To give deviant feature vectors the same weight in the averaging of a cluster as the more
centrally and densely situated ones causes the prototype obtained by weighted averaging
not to be as representative of its cluster as it should be. The previous discussion suggested
that median vectors are more immune to outliers and are thus more likely to be typical
and representative of a cluster.
While medians are better than averages, there are other ways to obtain weights so that a
weighted average will be immune to outliers and yet give appropriate weight to the more
centrally and densely located vectors. These weights are usually called fuzzy weights.
One way to generate them is to use the reciprocal of distance. Consider the distance
D_{qk} between x^{(q)} and z^{(k)}. If the distance is relatively large, then x^{(q)} should
be weighted much less than if the distance is small. Thus

    w_{qk} = 1 / D_{qk}                                                    (5.1)

is a weighting that was used in the 1970s. It is relatively large when the distance is small,
so that close vectors count much more in the averaging. On the other hand, it is relatively
small when the distance is large.
There is one problem with using the reciprocal of the distance: D_{qk} may be zero when
x^{(q)} = z^{(k)}. We can use

    w_{qk} = 1 / (D_{qk} + 1)                                              (5.2)

instead, so that the maximum weight is 1 and the minimum approaches 0. In practice, if a
weight computed from Equation (5.1) ever exceeds 1, we may round it down to 1.
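
As a minimal sketch of Equation (5.2), assuming Euclidean distance and NumPy (the
function name reciprocal_weights is illustrative):

import numpy as np

def reciprocal_weights(X, z):
    # w_qk = 1/(D_qk + 1), Equation (5.2): adding 1 to the distance keeps the
    # weight finite when a vector coincides with the prototype and caps it at 1.
    D = np.linalg.norm(X - z, axis=1)   # distances D_qk for all q, shape (Q,)
    return 1.0 / (D + 1.0)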
Considering the k-means loop of assigning all feature vectors to clusters and then
averaging those in each cluster, we can improve the prototype obtained for each cluster
by using the weighted average

    z_n^{(k)} = Σ_{q: clust[q]=k} w_{qk} x_n^{(q)}                          (5.3)

for each nth component of the prototype (with the w_{qk} standardized to sum to unity
over the cluster, as in Equation (5.16) below). Upon doing this for each component,
n = 1,...,N, we obtain the prototype

    z^{(k)} = (z_1^{(k)},...,z_N^{(k)})                                     (5.4)

Upon doing this for each cluster k, we arrive at new prototypes that are better located
inside their clusters. When we next assign the feature vectors to clusters by nearest
distance, some of the assignments may be different.
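
A sketch of this update for one cluster, under the same assumptions as above (the
helper weighted_prototype is illustrative):

import numpy as np

def weighted_prototype(X, clust, k, z):
    # New prototype z^(k) as the weighted average (5.3)-(5.4) of the vectors
    # currently assigned to cluster k, with weights from Equation (5.2).
    members = X[clust == k]                    # vectors with clust[q] = k
    D = np.linalg.norm(members - z, axis=1)    # distances to current prototype
    w = 1.0 / (D + 1.0)                        # Equation (5.2)
    w = w / w.sum()                            # standardize to sum to unity
    return w @ members                         # Equation (5.3) for all n at once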
5.2 Fuzzy c-Means Weights
The fuzzy c-means algorithm was invented by Bezdek [1a, 1b] and Dunn [2] in 1973 (it
was Bezdek's dissertation at Cornell, but Dunn's fuzzy ISODATA appeared that year
also). Bezdek's idea was to minimize the total weighted mean-square error

    J(w_{qk}, z^{(k)}) = Σ_{q=1}^{Q} Σ_{k=1}^{K} (w_{qk})^p |x^{(q)} - z^{(k)}|^2       (5.5)

subject to the constraint that

    Σ_{k=1}^{K} w_{qk} = 1  for each q                                      (5.6)
Upon employing Lagrange multipliers and letting p > 1, he obtained

    H(w_{qk}, z^{(k)}) = Σ_{q=1}^{Q} Σ_{k=1}^{K} (w_{qk})^p |x^{(q)} - z^{(k)}|^2
                         + Σ_{q=1}^{Q} λ_q (Σ_{k=1}^{K} w_{qk} - 1)          (5.7)

Upon setting the partial derivatives with respect to w_{qk} equal to zero and solving, the
optimizing weights were found to be (where D_{qk} = |x^{(q)} - z^{(k)}|)

    w_{qk} = (1/(D_{qk})^2)^{1/(p-1)} / Σ_{r=1}^{K} (1/(D_{qr})^2)^{1/(p-1)}   (5.8)
For each cluster k, the weights are then standardized so that they sum to unity over the
cluster, so that the new prototype for that cluster can be computed as a weighted
average. The standardizing is done by dividing the weights for the kth cluster by the sum
of all weights for that cluster according to

    W_{qk} = w_{qk} / Σ_{r=1}^{Q} w_{rk}                                    (5.9)
The fuzzy c-means algorithm allows each feature vector to belong to every cluster with a
fuzzy value (between 0 and 1), which is interpreted as its fuzzy truth of belonging. Rather
than let a feature vector belong to a single cluster with a truth value of 0 (false, meaning
it does not belong) or 1 (true, it does belong), the fuzzy weights permit an in-between
truth value of belonging.

The fuzzy truths are the weights. Thus w_{qk} gives the fuzzy truth that feature vector q
belongs to cluster k, where 0 ≤ w_{qk} ≤ 1. Given a feature vector q, we can take the
maximum w_{q,kmax} of all w_{qk} over all clusters k = 1,...,K and say that the feature
vector belongs to cluster kmax (w_{q,kmax} > w_{qk} for all k ≠ kmax). But if another
truth value is very near (or equal) for the same q but a different k, then we may conclude
that feature vector q belongs to both clusters.
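
A sketch of this defuzzification rule (W is the Q-by-K weight matrix; the near-tie
tolerance 0.05 is an illustrative choice, not from the text):

import numpy as np

def crisp_assign(W, tol=0.05):
    # Assign each vector to the cluster of maximum fuzzy truth, and also report
    # any other clusters whose truth is within tol of that maximum (near-ties).
    kmax = W.argmax(axis=1)
    near = [np.flatnonzero(W[q] >= W[q, kmax[q]] - tol) for q in range(W.shape[0])]
    return kmax, near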
The fuzzy prototypes are updated by the following (fuzzy) weighted averaging:

    z^{(k)} = Σ_{q=1}^{Q} W_{qk} x^{(q)}                                    (5.10)
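
A vectorized sketch of one weight-and-prototype update, Equations (5.8)-(5.10),
assuming NumPy and a small eps to guard against zero distances:

import numpy as np

def fcm_step(X, Z, p=2.0, eps=1e-12):
    # Distances D_qk between each x^(q) (rows of X) and each prototype z^(k).
    D = np.linalg.norm(X[:, None, :] - Z[None, :, :], axis=2) + eps
    w = (1.0 / D**2) ** (1.0 / (p - 1.0))
    w = w / w.sum(axis=1, keepdims=True)     # Equation (5.8): each row sums to 1
    Wk = w / w.sum(axis=0, keepdims=True)    # Equation (5.9): each column sums to 1
    Z_new = Wk.T @ X                         # Equation (5.10): new prototypes
    return w, Z_new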
5.3 The Fuzzy c-means Algorithm
The complete fuzzy c-means algorithm is given below. It requires that a value for K be
given. Thus we need to compute clusterings for several values of K and use a validity
measure to see which is best. In the next chapter we will describe the Xie-Bene clustering
validity measure that can be used for fuzzy clusterings. In the algorithm given below the
function random() is a uniform(0,1) random number generator.
We first draw initial weights w_{qk} randomly and then standardize them so that for each
fixed q we have w_{q1} + ... + w_{qK} = 1 (all fuzzy truths for its belonging to the
clusters sum to 1). We next standardize the weights for each fixed cluster k by summing
over q = 1,...,Q; this is necessary to form the weighted average for the prototype of each
cluster k. Then we loop by i) computing new prototypes as weighted averages of all Q
feature vectors (rather than just the ones in the kth cluster); ii) computing new weights
(and standardizing); and iii) checking the stopping criterion.
-----------------randomly initialize the weights-----------------------
Step 1:
  input K; input p;          //p > 1 is the weight exponent
  input Imax; I = 0;
  for k = 1 to K do
    for q = 1 to Q do
      w[q,k] = random();     //uniform(0,1) random numbers
------------------standardize the initial weights----------------------
Step 2:
  for q = 1 to Q do          //for each feature vector
    Sum = 0.0;
    for k = 1 to K do        //sum over all K prototypes
      Sum = Sum + w[q,k];
    for k = 1 to K do
      w[q,k] = w[q,k]/Sum;   //standardize over K
--------------------standardize cluster weights------------------------
Step 3:
  for k = 1 to K do          //for each cluster k
    Sum = 0.0;
    for q = 1 to Q do        //sum over all Q
      Sum = Sum + w[q,k];
    for q = 1 to Q do
      W[q,k] = w[q,k]/Sum;   //standardize over Q (Equation (5.9))
------------------compute new prototype vectors------------------------
Step 4:
  for k = 1 to K do          //for each cluster k
    for n = 1 to N do
      Sum = 0.0;
      for q = 1 to Q do      //compute weighted sum over Q
        Sum = Sum + W[q,k]*x[n,q];
      z[n,k] = Sum;          //prototype component is the weighted sum (Equation (5.10))
------------------------compute new weights----------------------------
Step 5:
  for q = 1 to Q do          //for each feature vector q
    Sum = 0.0;
    for k = 1 to K do        //for each prototype k
      D[q,k] = 0.0;
      for n = 1 to N do      //D[q,k] holds the squared distance (D_qk)^2
        D[q,k] = D[q,k] + (x[n,q] - z[n,k])^2;
      if D[q,k] = 0 then D[q,k] = ε;       //guard against division by zero
      Sum = Sum + (1/D[q,k])^(1/(p-1));    //sum of reciprocal squared distances
    for k = 1 to K do
      w[q,k] = (1/D[q,k])^(1/(p-1))/Sum;   //standardized weights (Equation (5.8))
-----------------------check stopping criterion------------------------
Step 6:
  I = I + 1;                 //update iteration number
  if I > Imax then stop;
  else goto Step 3;
------------------------------------------------------------------------
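
The six steps translate into the following minimal sketch (assuming NumPy; max_iter
and the seed are illustrative). It also returns the objective J of Equation (5.5) so that
runs from different initializations can be compared:

import numpy as np

def fuzzy_c_means(X, K, p=2.0, max_iter=100, eps=1e-12, seed=0):
    Q, N = X.shape
    rng = np.random.default_rng(seed)
    w = rng.random((Q, K))                      # Step 1: random initial weights
    w /= w.sum(axis=1, keepdims=True)           # Step 2: each row sums to 1
    for _ in range(max_iter):                   # Steps 3-6
        Wk = w / w.sum(axis=0, keepdims=True)   # Step 3: standardize per cluster
        Z = Wk.T @ X                            # Step 4: new prototypes
        D2 = ((X[:, None, :] - Z[None, :, :])**2).sum(axis=2) + eps
        w = (1.0 / D2) ** (1.0 / (p - 1.0))     # Step 5: new weights from (5.8) ...
        w /= w.sum(axis=1, keepdims=True)       # ... standardized over K
    J = ((w**p) * D2).sum()                     # objective of Equation (5.5)
    return w, Z, J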
Bezdek and others have proved certain convergence theorems under various conditions.
But convergence to a global minimum is not assured, because of local minima and saddle
points of the function J(·). Thus we should run the fuzzy c-means algorithm multiple
times with different initial weight sets and then choose the results corresponding to the
lowest functional value of J in Equation (5.5).
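
With the sketch above, that multi-start procedure might look like:

import numpy as np

X = np.random.default_rng(1).random((200, 2))   # illustrative data
runs = [fuzzy_c_means(X, K=3, seed=s) for s in range(10)]
w, Z, J = min(runs, key=lambda r: r[2])         # keep the run with the lowest J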
5.4 A Variation of the Fuzzy c-Means Algorithm
Smolensky [4] reverses the sums in the constraint term of Equation (5.7) for the purpose
of forcing the weights to be large for representative feature vectors and forcing the
nonrepresentative ones to be low. He uses

    G(w_{qk}, z^{(k)}) = Σ_{q=1}^{Q} Σ_{k=1}^{K} (w_{qk})^p |x^{(q)} - z^{(k)}|^2
                         + Σ_{k=1}^{K} λ_k Σ_{q=1}^{Q} (1 - w_{qk})^p        (5.11)

This is not the same as Equation (5.7). Upon setting the partial derivatives to zero and
solving, the result is

    w_{qk} = 1 / {1 + ((D_{qk})^2/λ_k)^{1/(p-1)}}                           (5.12)
The value of p determines the weights of the final partition into K clusters. The values of
λ_k determine the distance at which the fuzzy weight becomes 0.5. Practical values
found to work well are

    λ_k = A Σ_{q=1}^{Q} (w_{qk})^p |x^{(q)} - z^{(k)}|^2 / Σ_{q=1}^{Q} (w_{qk})^p    (5.13)

A = 1 can be used with good results. The algorithm is the same as that of the fuzzy
c-means except that the weights are computed from Equation (5.12), with λ_k computed
from Equation (5.13) for use in Equation (5.12).
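
A sketch of this modified weight update, Equations (5.12)-(5.13) with A = 1 (assuming
NumPy; w and Z are the weights and prototypes from the previous iteration):

import numpy as np

def variation_weights(X, Z, w, p=2.0, A=1.0, eps=1e-12):
    # Squared distances (D_qk)^2 between each vector and each prototype.
    D2 = ((X[:, None, :] - Z[None, :, :])**2).sum(axis=2) + eps
    wp = w**p
    lam = A * (wp * D2).sum(axis=0) / wp.sum(axis=0)      # Equation (5.13): one λ_k per cluster
    return 1.0 / (1.0 + (D2 / lam) ** (1.0 / (p - 1.0)))  # Equation (5.12)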
5.5 A Simple Effective Fuzzy Clustering Algorithm
Fuzzy Set Membership Functions. Let {x^{(q)} : q = 1,...,Q} be a sample of feature
vectors in N-dimensional Euclidean space. The Gaussian function

    f(x; z^{(k)}) = exp[-|x - z^{(k)}|^2 / (2σ^2)]                          (5.14)
is called a fuzzy set membership function [3] for the linguistic variable
SIMILAR_TO_z^{(k)}. The closer x is to z^{(k)}, the higher is the fuzzy truth f(x; z^{(k)})
of SIMILAR_TO_z^{(k)}. If x = z^{(k)}, then the fuzzy truth that x is SIMILAR_TO_z^{(k)}
is f(x; z^{(k)}) = 1. The fuzzy truth values range from 0 (but never attain 0) to 1.
Figure 1 below shows the function on 2-dimensional feature vectors.
Figure 1. A Gaussian Fuzzy Set Membership Function
Thus fuzzy logic is not like binary logic, where a switch is off (0) or on (1) and nothing
in between. Fuzzy logic is like a water faucet that can be off (0), partially on (value f
where 0 < f < 1), or on all the way (1). It makes little difference whether it is on all the
way (1) or almost all the way (say, 0.987).
Gaussian functions are the most natural, but there are other types of fuzzy set
membership functions and an unlimited number of linguistic variables. In the case given
above, if x is very close to z^{(k)}, then the fuzzy truth of the given linguistic variable is
very close to 1, but as x moves farther away, the fuzzy truth decreases ever more rapidly.
Gaussian membership functions are also defined on vector spaces of any dimension N
without difficult contrivances.
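
A sketch of the membership function of Equation (5.14) for any dimension N (σ is a
spread parameter supplied by the user):

import numpy as np

def similar_to(x, z, sigma=1.0):
    # Fuzzy truth of SIMILAR_TO_z: 1 at x = z, decaying toward 0 with distance.
    return np.exp(-np.sum((x - z)**2) / (2.0 * sigma**2))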
To develop an algorithm similar to the improved k-means, we initially draw K
prototypes {z^{(k)} : k = 1,...,K} with a uniform random number generator. We assign
each feature vector x^{(q)} according to nearest distance and put clust[q] = kmin, where
kmin is the index of the nearest prototype. After all Q feature vectors have been
assigned, we compute the fuzzy weights. A Gaussian fuzzy set membership function is
centered on each prototype, and we compute

    w_{qk} = f(x^{(q)}; z^{(k)})                                            (5.15)

We next standardize the weights for each cluster k by the mapping

    w_{qk} = w_{qk} / Σ_{q: clust[q]=k} w_{qk}                              (5.16)

The new fuzzy prototypes are computed via

    z^{(k)} = Σ_{q: clust[q]=k} w_{qk} x^{(q)}                              (5.17)
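
A sketch of one pass of Equations (5.15)-(5.17) over all clusters, reusing the Gaussian
membership above (assuming one σ_k per cluster, e.g. from the cluster variances):

import numpy as np

def fuzzy_prototypes(X, clust, Z, sigma):
    # One update: Gaussian weights (5.15), standardized within each cluster
    # (5.16), then a weighted average as the new prototype (5.17).
    Z_new = Z.copy()
    for k in range(Z.shape[0]):
        members = X[clust == k]                   # vectors assigned to cluster k
        if members.size == 0:
            continue                              # keep an empty cluster's prototype
        d2 = ((members - Z[k])**2).sum(axis=1)
        w = np.exp(-d2 / (2.0 * sigma[k]**2))     # Equation (5.15)
        w /= w.sum()                              # Equation (5.16)
        Z_new[k] = w @ members                    # Equation (5.17)
    return Z_new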
The weighted fuzzy expected value (WFEV) z of a set of real values x_1,...,x_P is
defined recursively. Starting with the arithmetic mean z^{(0)}, we iterate via

    z^{(r+1)} = Σ_{p=1}^{P} α_p x_p                                         (5.18)

where

    α_p = exp[-(x_p - z^{(r)})^2/(2σ^2)] / Σ_{s=1}^{P} exp[-(x_s - z^{(r)})^2/(2σ^2)]   (5.19)
For vectors we compute the WFEV of each component. Figure 2 shows the WFEV of 5
vectors compared with the mean and median vectors.
Figure 2. A weighted fuzzy expected vector
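
A minimal sketch of the WFEV recursion, Equations (5.18)-(5.19), for a set of scalars
(σ and the iteration count are illustrative; for vectors it is applied componentwise):

import numpy as np

def wfev(x, sigma=1.0, iters=20):
    z = x.mean()                                # start from the arithmetic mean z^(0)
    for _ in range(iters):
        a = np.exp(-(x - z)**2 / (2.0 * sigma**2))
        a /= a.sum()                            # Equation (5.19)
        z = a @ x                               # Equation (5.18)
    return z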
The actual computation of the weighted fuzzy expected vector is done componentwise
for each n = 1,...,N. The higher-level simple fuzzy clustering algorithm follows (some
details are omitted, and some functions that were defined previously are used).
------------------initialize prototypes for large K--------------------
Step 1:
  input K; input Imax; I = 0;
  for k = 1 to K do
    for n = 1 to N do
      z[n,k] = random();
---------------run functions, improved k-means algorithm---------------
Step 2:
  delete();       //delete prototypes too close to another prototype
  kmeans();       //run simple k-means clustering as before
  eliminate(p);   //eliminate clusters with fewer than p members
  kmeans();       //reassign vectors to remaining clusters
  sigmas();       //compute variances σ[k]^2 of all clusters
  fuzIts = 0;     //initialize the fuzzy averaging counter
-------------compute fuzzy weights, fuzzy cluster averages-------------
Step 3:
  for k = 1 to K do              //for each cluster k, fuzzy average recursively
    for n = 1 to N do            //componentwise fuzzy averaging
      Sum[n,k] = 0.0;
      for q = 1 to Q do
        if clust[q] = k then     //if feature vector q is in cluster k
          w[q,k] = exp(-(x[n,q] - z[n,k])^2/(2σ[k]^2));  //fuzzy weight for component n
          Sum[n,k] = Sum[n,k] + w[q,k];    //denominator sum for standardizing
      z[n,k] = 0.0;              //zero out to sum up the new fuzzy average
      for q = 1 to Q do
        if clust[q] = k then
          w[q,k] = w[q,k]/Sum[n,k];          //standardize (Equation (5.16))
          z[n,k] = z[n,k] + w[q,k]*x[n,q];   //fuzzy averaging (Equation (5.17))
------------------check for finished fuzzy averaging-------------------
Step 4:
  fuzIts = fuzIts + 1;
  if fuzIts < 20 then goto Step 3;  //iterate prototype component fuzzy averages
  else
    assign();                    //assign feature vectors to clusters
    I = I + 1; fuzIts = 0;       //reset the counter for the next round
    if I < Imax then goto Step 3;
    else
      merge();                   //merge clusters that are close together
      stop;                      //all fuzzy clustering iterations done
------------------------------------------------------------------------
-----------------------------------------------------------------------
References
[1a] J. C. Bezdek, Fuzzy Mathematics in Pattern Classification, Ph.D. thesis, Center for
Applied Mathematics, Cornell University, 1973.
[1b] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms,
Plenum Press, New York, 1981.
[2] J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting
compact well-separated clusters," J. Cybernetics, vol. 3, no. 3, 32-57, 1973.
[3] C. Looney, A Fuzzy Clustering and Fuzzy Merging Algorithm, Technical Report, CS-
UNR-101-1999.
[4] Paul Smolensky and Michael C. Mozer, Mathematical Perspectives on Neural
Networks, Lawrence Erlbaum Publishers, New York, 1994.
