1. FUZZY-C-MEANS CLUSTERING
● Clustering is an unsupervised machine learning technique that divides the given
data into different clusters based on their distances (similarity) from each other
● In fuzzy c-means clustering, we compute the centroids of the data points and then
calculate the distance of each data point from those centroids, repeating until the
clusters formed become stable.
● This algorithm works by assigning membership to each data point corresponding to
each cluster center on the basis of distance between the cluster center and the data
point
● The closer a data point is to a cluster center, the higher its membership towards
that particular cluster center
2. ● The summation of the memberships of each data point
across all cluster centers should be equal to one. After each iteration
the memberships and cluster centers are updated
according to the formulas:

µij = 1 / Σ_{k=1}^{c} (dij / dik)^(2/(m-1))

vj = ( Σ_{i=1}^{n} (µij)^m · xi ) / ( Σ_{i=1}^{n} (µij)^m ),  for j = 1, 2, ..., c

where,
'n' is the number of data points.
'vj' represents the jth cluster center.
'm' is the fuzziness index, m ∈ (1, ∞].
'c' represents the number of cluster centers.
'µij' represents the membership of the ith data point to the jth cluster center.
'dij' represents the Euclidean distance between the ith data point and the jth cluster center.
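As a minimal sketch (not part of the slides), the membership update above can be written with NumPy; the data points and centers here are illustrative values, and the symbols (µij, dij, m) follow the definitions given:

```python
import numpy as np

# Hypothetical small data set: n points and c cluster centers (illustrative values)
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])   # n x d
V = np.array([[1.0, 1.5], [8.5, 8.5]])                            # c x d
m = 2.0  # fuzziness index

# dij: Euclidean distance between the ith data point and the jth cluster center
d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)         # n x c
d = np.fmax(d, 1e-12)  # avoid division by zero when a point sits on a center

# µij = 1 / sum_k (dij / dik)^(2/(m-1))
power = 2.0 / (m - 1.0)
U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** power, axis=2)

print(U.sum(axis=1))  # each row of memberships sums to 1
```

Note how each point gets a graded membership in every cluster rather than a hard label.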
3. ● Hence, the main objective of
the algorithm is to minimise:

J(U, V) = Σ_{i=1}^{n} Σ_{j=1}^{c} (µij)^m ||xi – vj||²

where,
'||xi – vj||' is the Euclidean distance between the ith
data point and the jth cluster center.
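The objective function is straightforward to evaluate; as a sketch (the data, centers, and membership matrix below are illustrative values, not from the slides):

```python
import numpy as np

# Hypothetical data, centers, and membership matrix (illustrative values)
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
V = np.array([[1.0, 1.5], [8.5, 8.5]])
U = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.05, 0.95]])
m = 2.0

# J = sum_i sum_j (µij)^m * ||xi - vj||^2
sq_dist = np.sum((X[:, None, :] - V[None, :, :]) ** 2, axis=2)   # n x c squared distances
J = np.sum((U ** m) * sq_dist)
print(J)
```

A lower J means points sit closer to the centers they most belong to, which is exactly what the iterative updates drive towards.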
4. ALGORITHM:
1) Let X = {x1, x2, x3, ..., xn} be the set of data points and V =
{v1, v2, v3, ..., vc} be the set of cluster centers
2) Randomly select ‘c’ cluster centers
3) Calculate the fuzzy membership 'µij' using:
µij = 1 / Σ_{k=1}^{c} (dij / dik)^(2/(m-1))
5. 4) Compute the fuzzy centers 'vj' using:
vj = ( Σ_{i=1}^{n} (µij)^m · xi ) / ( Σ_{i=1}^{n} (µij)^m ),  for j = 1, 2, ..., c
5) Repeat steps 3) and 4) until the minimum 'J' value is achieved
or ||U(k+1) - U(k)|| < β,
where,
‘k’ is the iteration step.
‘β’ is the termination criterion, β ∈ [0, 1].
‘U = (µij)n×c’ is the fuzzy membership matrix.
‘J’ is the objective function
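The five steps above can be sketched end-to-end in NumPy. This is a minimal illustrative implementation, not a reference one: the function name, the choice of initial centers drawn from the data, and the toy data set are all assumptions made for the example.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, beta=1e-5, max_iter=100, seed=0):
    """Alternate the µij and vj updates until ||U(k+1) - U(k)|| < β."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 2: randomly select 'c' initial centers (here: drawn from the data points)
    V = X[rng.choice(n, size=c, replace=False)]
    U = np.zeros((n, c))
    for _ in range(max_iter):
        # Step 3: fuzzy memberships µij from the current centers
        d = np.fmax(np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2), 1e-12)
        U_new = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        # Step 4: fuzzy centers vj = sum_i (µij)^m xi / sum_i (µij)^m
        W = U_new ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Step 5: stop once the membership matrix stabilises
        if np.linalg.norm(U_new - U) < beta:
            U = U_new
            break
        U = U_new
    return U, V

# Toy usage: two well-separated groups of points
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
U, V = fuzzy_c_means(X, c=2)
print(U.round(3))
```

Note the trade-off stated later in the slides: a smaller β gives a more converged U at the cost of more iterations.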
17. ADVANTAGES:
1) Gives the best results for overlapped data sets and is comparatively better
than the k-means algorithm.
2) Unlike k-means, where a data point must exclusively belong to one
cluster center, here each data point is assigned a membership to every cluster
center, as a result of which a data point may belong to more than one
cluster center.
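Advantage 2) is easy to see on a borderline point; as a small sketch (centers and point are illustrative values, using the membership formula from the earlier slides):

```python
import numpy as np

# Hypothetical centers and a point lying between them (illustrative values)
V = np.array([[0.0, 0.0], [4.0, 0.0]])
x = np.array([1.8, 0.0])          # closer to the first center, but not by much
m = 2.0

d = np.linalg.norm(x - V, axis=1)                                  # distances to each center
u = 1.0 / np.sum((d[:, None] / d[None, :]) ** (2.0 / (m - 1.0)), axis=1)
print(u)  # graded membership in BOTH clusters, summing to 1
```

k-means would assign this point wholly to the first cluster; fuzzy c-means records that it partly belongs to the second as well.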
18. DISADVANTAGES:
1) A priori specification of the number of clusters.
2) With a lower value of β we get a better
result, but at the expense of a larger number of
iterations.
3) Euclidean distance measures can unequally weight
underlying factors.