Stability
Similarity Index (Lange et al., 2004): indicates the percentage of pairs of observations that belong to the same cluster in both clustering C and clustering C’.
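A minimal sketch of the pair-counting idea, read literally from the definition on this slide. The function name `pair_similarity` and the choice of denominator (all pairs of observations) are assumptions made for illustration; they are not taken from Lange et al. (2004) or from any package.

```python
from itertools import combinations


def pair_similarity(labels_c, labels_c_prime):
    """Share of observation pairs that sit in the same cluster in both clusterings.

    Assumption: the denominator is the total number of observation pairs.
    """
    if len(labels_c) != len(labels_c_prime):
        raise ValueError("both clusterings must label the same observations")
    pairs = list(combinations(range(len(labels_c)), 2))
    if not pairs:
        raise ValueError("need at least two observations")
    together_in_both = sum(
        1
        for i, j in pairs
        if labels_c[i] == labels_c[j] and labels_c_prime[i] == labels_c_prime[j]
    )
    return together_in_both / len(pairs)


# Hypothetical labels: the two clusterings agree up to a renaming of clusters.
share = pair_similarity([0, 0, 1, 1], [1, 1, 0, 0])
```

Because only co-membership of pairs is compared, the measure does not depend on how the clusters are named in either solution.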
Cluster Integrity – Heterogeneity
Total separation of clusters: based on the distance between cluster centers.
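The slide only states that total separation is based on the distances between cluster centers. The sketch below assumes the form used in the SD validity index of Halkidi et al. (2000), namely (D_max / D_min) multiplied by the sum over clusters of the inverse of each center's summed distance to all other centers. Treat that formula, and the name `total_separation`, as assumptions.

```python
import math


def total_separation(centers):
    """Separation term computed from pairwise Euclidean distances between centers.

    Assumed form: (D_max / D_min) * sum_k 1 / sum_z ||v_k - v_z||.
    """
    k = len(centers)
    if k < 2:
        raise ValueError("need at least two cluster centers")
    dist = [[math.dist(a, b) for b in centers] for a in centers]
    pairwise = [dist[i][j] for i in range(k) for j in range(i + 1, k)]
    d_max, d_min = max(pairwise), min(pairwise)
    if d_min == 0:
        raise ValueError("two cluster centers coincide")
    # Each row sum is one center's total distance to all other centers.
    return (d_max / d_min) * sum(1.0 / sum(row) for row in dist)


# Hypothetical one-dimensional centers.
value = total_separation([[0.0], [1.0], [3.0]])
```

Under this assumed form, evenly spread and distant centers give a small value, while unevenly spaced or crowded centers give a larger one.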
Cluster Integrity – Homogeneity
Scatter (compactness): average ratio of the cluster variance to the variance of the dataset.
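A minimal sketch of the scatter ratio, assuming that "variance" means the Euclidean norm of the per-dimension population variance vector, as in the SD index of Halkidi et al. (2000). The helper names `variance_norm` and `scatter` are hypothetical.

```python
import math
from statistics import pvariance


def variance_norm(points):
    """Euclidean norm of the vector of per-dimension population variances."""
    return math.sqrt(sum(pvariance(dim) ** 2 for dim in zip(*points)))


def scatter(points, labels):
    """Average of (cluster variance norm / dataset variance norm) over clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = variance_norm(points)
    if total == 0:
        raise ValueError("dataset has zero variance")
    return sum(variance_norm(members) for members in clusters.values()) / (
        len(clusters) * total
    )


# Hypothetical data: two tight clusters that lie far apart.
points = [(0.0,), (2.0,), (10.0,), (12.0,)]
value = scatter(points, [0, 0, 1, 1])
```

Lower values indicate more compact clusters relative to the spread of the whole dataset.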
Accuracy
[Figure: segment labels under "Reality" compared with segment labels under "Prediction"]
Adjusted Rand Index (Hubert and Arabie, 1985): level of agreement between the predicted segment and the real segment, corrected for the level of agreement expected by chance.
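A minimal sketch of the Adjusted Rand Index computed from the contingency table of real and predicted segments, following the standard Hubert and Arabie (1985) formula. The function name, the example labels, and the handling of the degenerate case are assumptions for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(real, predicted):
    """Pair-counting agreement between two partitions, corrected for chance."""
    if len(real) != len(predicted):
        raise ValueError("both label vectors must have the same length")
    n = len(real)
    if n < 2:
        raise ValueError("need at least two observations")
    cells = Counter(zip(real, predicted))  # contingency table n_ij
    rows = Counter(real)                   # row totals a_i
    cols = Counter(predicted)              # column totals b_j

    index = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case (e.g. a single segment in both partitions);
        # treated here as perfect agreement, which is an assumption.
        return 1.0
    return (index - expected) / (max_index - expected)


# Hypothetical labels: same partition with the segment names swapped.
ari = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
```

A value of 1 means the predicted segments reproduce the real ones exactly (up to renaming), values near 0 mean agreement no better than chance, and negative values mean agreement below chance.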
Size
Uniformity deviation: average deviation of each segment's relative size from the uniform segment size (1/number of segments).
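A minimal sketch of the uniformity deviation, assuming that "deviation" means the absolute difference between a segment's share of observations and 1/K, and that K is the number of segments that actually appear in the labels. Both choices, and the function name, are assumptions.

```python
from collections import Counter


def uniformity_deviation(labels):
    """Mean absolute gap between each segment's share and the uniform share 1/K."""
    if not labels:
        raise ValueError("need at least one observation")
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    uniform_share = 1.0 / k
    return sum(abs(c / n - uniform_share) for c in counts.values()) / k


# Hypothetical labels: one segment holds 75% of the observations.
deviation = uniformity_deviation([0, 0, 0, 1])
```

Perfectly balanced segments give 0; the value grows as segment sizes become more unequal.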
Anita Prinzie, Nicole Huyghe
firstname.lastname@example.org
www.solutions2.be
do we cause rising questions
References
• Fred, A. and Jain, A.K. (2005), Combining Multiple Clusterings Using Evidence Accumulation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 835-850.
• Lange, T., Roth, V., Braun, M.L. and Buhmann, J.M. (2004), Stability-Based Validation of Clustering Solutions, Neural Computation, 16, 1299-1323.
• Halkidi, M., Vazirgiannis, M. and Batistakis, Y. (2000), Quality Scheme Assessment in the Clustering Process, Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, 265-276.
• Hubert, L. and Arabie, P. (1985), Comparing Partitions, Journal of Classification, 193-218.
• Nieweglowski, L. (2007), CLV package, R software.
• Martin, A., Quinn, K.M. and Park, J.H. (2003-2012), Markov Chain Monte Carlo Package (MCMCpack), R software.