Here, the K-means clustering method is used to compress images. The input is the number of colors required in the output image, which is the same as the number of clusters. The output is the compressed image containing that number of colors.
3. Clustering
• Grouping a collection of points into clusters
• Patterns are extracted from the data without using any labelled target variable (unsupervised learning)
• The points in each cluster are close to one another and far from the points in other clusters
4. K-Means Clustering
• Unsupervised learning algorithm.
• Groups data points that are similar to each other.
• Forms mutually dissimilar groups, each containing similar data points.
• Partitions the data into K distinct clusters. K is defined by the user.
• Each data point belongs to exactly one of the K predefined clusters.
5. Cost Function
•The goal is to minimize within-cluster dissimilarity.
• The cost function (J) is:
$$J = \sum_{i=1}^{N} \sum_{k=1}^{K} r_{ik} \, \lVert x^{(i)} - \mu_k \rVert^{2}$$
where
$x^{(i)}$ are the data points,
$\mu_k$ is the center of cluster k,
$r_{ik} = 1$ if $x^{(i)}$ belongs to cluster k and 0 if it does not,
k = 1, …, K, where K is the number of clusters provided,
N is the total number of data points.
• J is the sum of squared distances between each data point $x^{(i)}$ and the center $\mu_k$ of the cluster it is assigned to.
• Cost function J is minimized for optimal clustering.
• After each iteration, $\mu_k$ is obtained by the formula
$$\mu_k = \frac{\sum_{i=1}^{N} r_{ik} \, x^{(i)}}{\sum_{i=1}^{N} r_{ik}}$$
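The cost function and the centroid update can be checked by hand on a tiny data set. The following is a minimal sketch in plain Python, not the code used for the results in this deck. The function names and the four sample points are illustrative assumptions, and `labels[i] = k` stands in for $r_{ik} = 1$.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cost(points, centers, labels):
    """J: sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(x, centers[k]) for x, k in zip(points, labels))


def update_centers(points, labels, centers):
    """mu_k: mean of the points assigned to cluster k.

    Assumption: an empty cluster keeps its previous center.
    """
    new_centers = []
    for k, old in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


# Illustrative data: two points per cluster.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
centers = [(0.0, 0.0), (10.0, 0.0)]

cost_before = cost(points, centers, labels)   # 0 + 4 + 0 + 4 = 8.0
centers = update_centers(points, labels, centers)
cost_after = cost(points, centers, labels)    # 1 + 1 + 1 + 1 = 4.0
print(cost_before, centers, cost_after)
```

Moving each center to the mean of its cluster lowers J from 8.0 to 4.0 here, which is the effect the update formula is meant to have.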
6. K-Means Algorithm
Step 1 - Randomly choose K data points as the initial centroids of the K clusters
Step 2 - Repeat until the cluster centers stop changing or the maximum number of iterations is reached
◦ Assign each data point to the nearest centroid
◦ Replace each cluster center with the mean of the elements in its cluster
end
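The two steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation behind the timings in this deck: the names `kmeans`, `assign`, `max_iter` and `seed` are hypothetical, and an empty cluster simply keeps its previous center.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(x, centers[j]))
        for x in points
    ]


def kmeans(points, k, max_iter=10, seed=0):
    rng = random.Random(seed)
    # Step 1: pick K data points at random as the initial centroids.
    centers = rng.sample(points, k)
    # Step 2: repeat until the centers stop changing or max_iter is reached.
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                # Mean of the cluster, computed per coordinate.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its old center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Illustrative data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(sorted(centers), labels)
```

On this sample the first three points end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random initialization.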
9. Image Compression
• An image is made up of small dots of intensity called pixels.
• Each pixel contains three values: the intensities of the red, green and blue colors for that pixel
• Compression reduces the space an image takes when it is stored or transmitted
• The number of colors in the image is reduced to the most frequent colors appearing in it
• In effect, the pixel values are grouped into clusters of the frequently occurring colors present in the image
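Once the cluster centers are known, the compressed image is obtained by replacing every pixel with its nearest center. The sketch below shows only that replacement step. The 2 x 3 sample image and the two-color `palette` are made-up values standing in for centroids that K-means would produce, and the function names are illustrative.

```python
def nearest_color(pixel, palette):
    """Palette color with the smallest squared RGB distance to the pixel."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


def recolor(image, palette):
    """Replace every pixel of a row-major image with its nearest palette color."""
    return [[nearest_color(px, palette) for px in row] for row in image]


# Hypothetical 2 x 3 image of (R, G, B) pixels: three reddish, three bluish.
image = [
    [(255, 0, 0), (250, 10, 5), (0, 0, 255)],
    [(245, 20, 10), (5, 5, 250), (10, 0, 240)],
]
# Hypothetical centroids, standing in for the output of K-means with K = 2.
palette = [(250, 10, 5), (5, 2, 248)]

compressed = recolor(image, palette)
distinct_before = len({px for row in image for px in row})
distinct_after = len({px for row in compressed for px in row})
print(distinct_before, distinct_after)  # 6 2
```

The image keeps its dimensions but now uses only K colors, which is what makes it cheaper to store.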
10. Original and Compressed Image-Parrot
$t_n$ = time taken for the K-means algorithm to run for n iterations
Fig1a. Original Image
Fig1b. $t_{10}$ = 1 min 42 sec
11. Original and Compressed Image-Parrot
Fig1c. $t_{10}$ = 6 min 32 sec
Fig1d. $t_{10}$ = 12 min 42 sec
Fig1e. $t_{10}$ = 50 min 40 sec
13. Original and Compressed Image-Scenery
Fig2c. $t_{10}$ = 8 min 15 sec
Fig2d. $t_{10}$ = 15 min 30 sec
Fig2e. $t_{10}$ = 63 min 30 sec
14. Uses of Image Compression
• Less data is needed to store the compressed image than the original image, reducing the cost of storage and transmission
• K-means can be used to compress visual content in social messaging apps for faster transmission and lower storage use
• Used for archival purposes and for medical imaging and technical drawings
• Widely used in remote sensing via satellite and in television broadcasting, for capturing and transmitting images
15. Results
Actual size of the parrot image: 1,87,236 bytes

Number of clusters (K)      Reduced size of the
specified for compression   parrot image
100                         52,032 bytes
20                          54,888 bytes
15                          54,888 bytes
12                          54,351 bytes
2                           43,616 bytes

Table1. Results of K-means clustering applied to parrot.jpg
16. Results
Actual size of the scenery image: 5,50,287 bytes

Number of clusters (K)      Reduced size of the
specified for compression   scenery image
50                          1,01,404 bytes
10                          1,01,813 bytes
5                           95,616 bytes
2                           83,729 bytes

Table2. Results of K-means clustering applied to scenery.jpg
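The size reduction in Table1 and Table2 can be expressed as a percentage of the original file size. The short calculation below uses the byte counts from the two tables, written without the Indian digit grouping (1,87,236 = 187236). The helper name `reduction_percent` is illustrative.

```python
def reduction_percent(original_bytes, compressed_bytes):
    """Percentage of the original file size saved by compression."""
    return 100.0 * (1 - compressed_bytes / original_bytes)


# Byte counts copied from Table1 (parrot.jpg) and Table2 (scenery.jpg).
parrot_original = 187236
parrot_sizes = {100: 52032, 20: 54888, 15: 54888, 12: 54351, 2: 43616}
scenery_original = 550287
scenery_sizes = {50: 101404, 10: 101813, 5: 95616, 2: 83729}

for name, original, sizes in (
    ("parrot", parrot_original, parrot_sizes),
    ("scenery", scenery_original, scenery_sizes),
):
    for k, size in sizes.items():
        print(f"{name}: K={k:>3}  saved {reduction_percent(original, size):.1f}%")
```

For the parrot image the saving ranges from roughly 71% to 77%, and for the scenery image from roughly 81% to 85%.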
17. Drawbacks of K-means
• Becomes slow as the size of the data (image) increases.
• The time taken by the algorithm increases as the number of clusters (K) increases.
• The result may be a suboptimal local minimum of the cost function.
• Works well only when the clusters are separated by linear or nearly linear boundaries.
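The first two drawbacks follow from the assignment step, which compares every pixel with every cluster center in every iteration. The sketch below counts those comparisons. The 512 x 512 image size is a hypothetical example, not the size of the images used in this deck, and the function name is illustrative.

```python
def distance_evaluations(width, height, k, iterations):
    """Pixel-to-center distance computations in the assignment steps."""
    return width * height * k * iterations


# Hypothetical 512 x 512 image, 10 iterations, several values of K.
for k in (2, 12, 100):
    print(k, distance_evaluations(512, 512, k, 10))
```

The count grows in proportion to both the number of pixels and K, so going from K = 2 to K = 100 multiplies the work of this step by 50.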
18. References
• Xing Wan (2019), “Application of K-means Algorithm in Image Compression”, IOP Conference Series: Materials Science and Engineering, 563, 052042.
• B. Reddaiah (2020), “A Study on Image Compression and its Applications”, International Journal of Computer Applications, Volume 177, No. 38, February 2020.
• Hartigan, J. A., Wong, M. A. (1979), “Algorithm AS 136: A K-Means Clustering Algorithm”, Journal of the Royal Statistical Society, 28(1), 100-10
• https://www.simplilearn.com/tutorials/machine-learning-tutorial/k-means-clustering-algorithm
• Van der Geer, J., Hanraads, J.A.J., Lupton, R.A. (2010), “The art of writing a scientific article”, J. Sci. Commun., 163: 51–59.