Conference Paper · November 2009
DOI: 10.1007/978-3-642-05036-7_61 · Source: DBLP
H. Badioze Zaman et al. (Eds.): IVIC 2009, LNCS 5857, pp. 645–652, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Automatic Image Annotation Using Color K-Means
Clustering
Nursuriati Jamil and Siti ’Aisyah Sa’adan
Faculty of Computer & Mathematical Sciences, Universiti Teknologi MARA
40450 Shah Alam, Selangor, Malaysia
liza@tmsk.uitm.edu.my, aisyah.sadan@gmail.com
Abstract. Automatic image annotation is the process of modeling how a human assigns
words to images based on visual observation. It is essential because manual annotation
is time-consuming, especially for large databases, and because there is no standard
captioning procedure, since captions depend on human perception. This paper discusses
the implementation of automatic image annotation using the K-means clustering
algorithm to annotate colors with appropriate words drawn from a set of predefined
colors. Experiments are conducted to identify the number of centroids, the distance
measure and the initialization mode that give the best clustering results. A prototype
automatic image annotation system is developed and then tested using thirty-five
beach scenery photographs. Results show that annotating images using the evenly-spaced
initialization mode and 100 centroids with the City Block distance function achieves
a precision rate of 75%.
Keywords: Automatic image annotation, K-means clustering, RGB model, initialization
mode, cluster number.
1 Introduction
Automatic image annotation is the process by which a computer system automatically
assigns words, in the form of captions, to a digital image [14]. It is commonly used in
image retrieval systems to organize and locate images of interest in a database.
Annotation-based image retrieval is perceived as better than content-based image
retrieval (CBIR) because it allows users to compose queries freely in natural
language [4]. Furthermore, a CBIR system matches images based only on low-level visual
similarities, so it is limited by the missing semantic information [8].
Clustering algorithms are commonly used to classify the low-level features of images
prior to annotation. Clustering is defined in [1] as the process of organizing objects
into groups whose members are similar in some way. A cluster is therefore a collection
of objects that are similar to one another and dissimilar to objects belonging to
other clusters. Popular clustering algorithms include K-Means, Expectation
Maximization (EM) and Discrete Distribution (D2) clustering [7]. K-Means clustering
relies on hard assignment of data to a given set of partitions, also known as cluster
centers or the K centroids [13]. At every step of the algorithm, each data value is
assigned to the nearest centroid according to a similarity parameter calculated with a
distance measure. The centroids are then recalculated based on these hard assignments.
With each successive pass, a data value can change the centroid to which it belongs,
thus altering the values of the centroids. K-Means clustering has been used
extensively to classify low-level features in image retrieval systems
[3][14][11][7][9]. The EM algorithm employed in [13][14], on the other hand, relies on
soft assignment of data to a given set of centroids: every data value is associated
with every centroid through a system of weights based on how strongly the data value
should be associated with that centroid. According to [13], K-Means clustering
generally works better than the EM algorithm for image segmentation using color as the
feature parameter, and it is fairly simple to implement.
In this paper, the K-Means clustering algorithm is applied experimentally to annotate
beach scenery photographs automatically using their RGB color features. The purpose of
the study is to investigate the number of centroids, the distance measure and the
initialization mode that give the best clustering performance.
2 K-Means Clustering
K-Means is a very popular algorithm and is regarded as one of the best for implementing
the clustering process [12]. Its time complexity is dominated by the product of the
number of patterns, the number of centroids and the number of iterations. For an
image, K-Means clustering may be implemented as follows:
i) Place K points into the space represented by the pixels that are being clustered.
These points represent the initial cluster centroids, also known as initialization
points.
ii) Assign each pixel to the cluster that has the closest centroid (obtained by
measuring distance).
iii) When all pixels have been assigned, recalculate the positions of the K centroids.
iv) Repeat steps ii) and iii) until the centroids no longer move.
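The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the prototype's implementation: the function names, the `max_iter` guard and the random choice of initial centroids are assumptions made for this sketch (the evenly-spaced and max-data initialization modes evaluated in this paper are not reproduced here).

```python
import random


def city_block(p, q):
    """City Block (L1) distance between two RGB triples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kmeans(pixels, k, distance=city_block, max_iter=100, seed=0):
    """Cluster a list of RGB pixels into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step i: place K initial centroids (assumption: K pixels drawn at random).
    centroids = [tuple(float(c) for c in p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Step ii: assign each pixel to the cluster with the closest centroid.
        labels = [min(range(k), key=lambda j: distance(p, centroids[j]))
                  for p in pixels]
        # Step iii: recalculate each centroid as the mean of its assigned pixels.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step iv: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Four hypothetical pixels: two bluish and two sand-colored.
pixels = [(88, 122, 170), (90, 120, 168), (187, 174, 147), (190, 170, 150)]
centroids, labels = kmeans(pixels, k=2)
print(centroids, labels)
```

The `max_iter` bound is a safeguard so that the loop terminates even if the centroids keep shifting slightly.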
Factors that may affect the performance of the K-Means algorithm are the initialization
mode, the distance measure and the number of centroids used during the clustering
process. The initialization mode is important for obtaining an accurate RGB
representation of the centroids at the starting point of the clustering. Each pixel in
the image is then assigned to its proper cluster based on its similarity, using a
distance measure. The choice of measure influences the shape of the clusters, as some
elements may be closer to or further away from one another depending on how the
distance is calculated [10]. Thus, the distance measure is also vital to ensure that
every pixel is assigned to the appropriate centroid. Common distance functions used in
clustering are the Euclidean, City Block, Minkowski and Canberra distances. The number
of centroids, K, chosen for the clustering process must also be taken into
consideration. According to [13], the number of centroids used in the segmentation has
a very large effect on the output: the more centroids used in the color setup, the
more colors can appear in the output.
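For reference, the four distance functions named above can be written for RGB triples as follows. This is a sketch using the textbook definitions; the function names, the default Minkowski order of 3 and the handling of zero denominators in the Canberra distance are choices made for this example, not details taken from the prototype.

```python
def euclidean(p, q):
    """Square root of the summed squared component differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def city_block(p, q):
    """Sum of absolute component differences (also called Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def minkowski(p, q, order=3):
    """Generalization: order=1 gives City Block, order=2 gives Euclidean."""
    return sum(abs(a - b) ** order for a, b in zip(p, q)) ** (1.0 / order)


def canberra(p, q):
    """Sum of |a - b| / (|a| + |b|); terms with a zero denominator are skipped."""
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(p, q) if abs(a) + abs(b) != 0)


# Two RGB triples used purely as sample input.
sky, sea = (88, 122, 170), (58, 97, 123)
print(city_block(sky, sea))  # 30 + 25 + 47 = 102
print(euclidean(sky, sea), minkowski(sky, sea), canberra(sky, sea))
```

Because the Canberra distance normalizes each component by its magnitude, differences between dark colors weigh more heavily than the same absolute differences between bright colors.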
3 Materials and Methods
As mentioned previously, this paper discusses the implementation of an automatic
annotation prototype that annotates beach photographs using eight predefined words:
sky, sea, beach, cloud, tree, hill, grass and rock. Fig. 1 shows a diagram of the
annotation process.
Fig. 1. Automatic annotation processes
3.1 Data Collection
Ten natural beach scenery photographs are downloaded from [5][6] to be used as training
images. They are chosen from a total of 3,360 photographs, because the images in these
collections have already been classified into their proper categories for benchmarking
purposes. For testing purposes, thirty-five photographs are collected randomly using
the Yahoo! and Google search engines. The criteria for the test images are that they
are beach scenery photographs and that they contain at least one of the eight beach
elements: sky, sea, beach, cloud, tree, hill, grass and rock.
3.2 Manual Image Annotation
All thirty-five test images are manually annotated by three people using visual
inspection. They are given a selected list of words taken from the Oxford Fajar
dictionary [2] that describe beach scenery, and they annotate the test images using
only the given words. The results of these manual annotations are then used as the
benchmark for the proposed prototype.
[Fig. 1 diagram; box labels: Beach image; Color extraction; Predefined colors; Color
clustering; Identify init mode, cluster no., distance measure; K-Means clustering;
Automatic annotation; Captions: SKY, CLOUD, SEA, ROCK, TREE, GRASS, HILL, BEACH;
Manual annotation; Relevance list; Benchmarking; Training]
3.3 Color Feature Extraction
Color features of the eight beach elements mentioned earlier are extracted from the
training images using the RGB model. These predefined color features are later used
during the testing phase to annotate the test images. Table 1 shows the average RGB
color values for all the beach elements.
Table 1. Predefined colors of the beach elements
Beach element    Average RGB values (R, G, B)
Sky              88, 122, 170
Sea              58, 97, 123
Beach            187, 174, 147
Grass            59, 69, 30
Hill             43, 88, 75
Tree             72, 79, 36
Rock             76, 67, 69
Cloud            190, 189, 199
3.4 Color Clustering
Two experiments are conducted to determine the initialization mode, distance function
and number of clusters that give the highest performance of the K-Means algorithm. The
first experiment aims to discover the best combination of initialization mode and
distance measure. The initialization modes tested are evenly-spaced mode and max-data
mode, and the distance measures tested are Euclidean, City Block and Canberra. The
objective of the second experiment is to identify the appropriate number of centroids
(K) to be used in the automatic annotation prototype. These centroids contain the RGB
values that are later compared with the predefined colors of the beach elements. The
numbers of centroids tested are K = 8, 30, 50 and 100.
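One plausible way to compare centroids with the predefined colors of Table 1 is sketched below. The paper does not spell out the matching rule, so the nearest-color lookup and the `max_distance` threshold are assumptions made for this illustration, as are the names `annotate` and `PREDEFINED_COLORS`.

```python
# Average RGB values taken from Table 1.
PREDEFINED_COLORS = {
    "SKY": (88, 122, 170), "SEA": (58, 97, 123), "BEACH": (187, 174, 147),
    "GRASS": (59, 69, 30), "HILL": (43, 88, 75), "TREE": (72, 79, 36),
    "ROCK": (76, 67, 69), "CLOUD": (190, 189, 199),
}


def city_block(p, q):
    """City Block (L1) distance between two RGB triples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def annotate(centroids, max_distance=30):
    """Map each centroid to its nearest predefined color and collect the words.

    max_distance is a hypothetical cut-off: a centroid whose nearest
    predefined color is further away than this contributes no word.
    """
    words = set()
    for centroid in centroids:
        word, color = min(PREDEFINED_COLORS.items(),
                          key=lambda item: city_block(centroid, item[1]))
        if city_block(centroid, color) <= max_distance:
            words.add(word)
    return sorted(words)


# Three hypothetical centroids: near SKY, near BEACH, and a near-black outlier.
print(annotate([(90, 120, 168), (185, 176, 150), (10, 10, 10)]))
# prints ['BEACH', 'SKY']
```

With more centroids, more distinct image colors are represented, so more of the eight words can in principle be matched.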
To evaluate the performance of the clustering algorithm, Recall and Precision measures
are computed [14], where numCorrect is the number of correctly retrieved words in the
output caption, numRetrieved is the total number of words in the output caption, and
numExist is the actual number of words for the image according to the manual
annotation.

\[ \mathrm{Recall} = \frac{\mathit{numCorrect}}{\mathit{numExist}} \tag{1} \]

\[ \mathrm{Precision} = \frac{\mathit{numCorrect}}{\mathit{numRetrieved}} \tag{2} \]
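As a worked illustration, the two measures can be computed for a single image as shown below. The sketch uses the standard information-retrieval definitions (precision = numCorrect / numRetrieved, recall = numCorrect / numExist); the function name and the example captions are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one image.

    retrieved: words in the automatically generated caption
    relevant:  words in the manual (benchmark) annotation
    """
    retrieved, relevant = set(retrieved), set(relevant)
    num_correct = len(retrieved & relevant)
    # Guard against empty captions to avoid division by zero.
    precision = num_correct / len(retrieved) if retrieved else 0.0
    recall = num_correct / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical image: two words retrieved, both correct, five words expected.
auto_caption = ["SKY", "SEA"]
manual_caption = ["SKY", "SEA", "BEACH", "CLOUD", "TREE"]
print(precision_recall(auto_caption, manual_caption))  # (1.0, 0.4)
```

A caption with few but correct words therefore scores high precision and low recall, which is the pattern seen in many rows of the prototype's results.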
3.5 Development of Automatic Annotation Prototype
Based on the experimental results, a prototype automatic annotation system was
developed in the Java programming language. The software development tools used are
BlueJ version 2.1.2 with Java Development Kit version jdk1.6.0_05, Java Runtime
Environment version jre1.6.0_07, Java Advanced Imaging Development Kit version
jai-1_1_3-lib-windows-i586-jdk and Java Advanced Imaging Runtime Environment version
jai-1_1_3-lib-windows-i586-jre. The prototype is then tested and evaluated using
thirty-five photographs of beach scenery.
4 Results and Discussions
Table 2 shows the results of the first experiment, which determines the combination of
initialization mode and distance measure that gives the best clustering performance.
Overall, the evenly-spaced initialization mode performed better than the max-data
mode. The highest average precision rate of 88% is achieved using the evenly-spaced
initialization mode with the City Block distance measure (a rate matched by max-data
mode with the Canberra measure, which has a lower recall of 37%). Even though the
recall rate of this combination (40%) is slightly lower than that of the Canberra
measure in evenly-spaced mode (46%), we regard precision as the better indicator of
clustering performance.
Table 2. Performance of different combinations of initialization modes and distance measures
Initialization   Precision                           Recall
Mode             Euclidean  City Block  Canberra     Euclidean  City Block  Canberra
Evenly-spaced    0.80       0.88        0.83         0.40       0.40        0.46
Max-data         0.82       0.87        0.88         0.34       0.32        0.37
The results of comparing the number of centroids are recorded in Table 3. The table
shows that K = 8 and K = 100 have an equal precision rate of 88% in annotating the
images. We therefore use recall to separate them: with 100 centroids the recall rate
is 40%, compared with 32% for 8 centroids, so we conclude that the highest performance
is achieved with the largest number of centroids, 100. It is also interesting to note
that with 30 and 50 centroids the precision rates are in fact lower than with only 8
centroids.
After the initialization mode, distance measure and number of centroids were
determined, the prototype was developed and tested with the 35 test images. Fig. 2
illustrates the output for one of the annotated images. Recall and precision rates for
the tested images are given in Table 4, showing an average precision rate of 75% and
an average recall rate of 50%.
Table 3. Performance of different number of centroids
Average      Number of Centroids (K)
             8      30     50     100
Precision    0.88   0.87   0.87   0.88
Recall       0.32   0.34   0.38   0.40
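The selection rule applied to Table 3 (highest precision first, recall as the tie-breaker) can be expressed compactly. The snippet below is only an illustration of that rule using the values in the table; the variable names are hypothetical.

```python
# (precision, recall) for each number of centroids, copied from Table 3.
results = {
    8: (0.88, 0.32),
    30: (0.87, 0.34),
    50: (0.87, 0.38),
    100: (0.88, 0.40),
}

# Tuples compare element by element, so precision is compared first
# and recall only decides between equal precision values.
best_k = max(results, key=lambda k: results[k])
print(best_k)  # 100
```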
Table 4. Recall and Precision Rates of the Prototype
Image Precision Recall
beach001.jpg 0.5 0.25
beach002.jpg 0.67 0.4
beach003.jpg 0.4 0.5
beach004.jpg 1 0.33
beach005.jpg 1 0.4
beach006.jpg 0.5 0.2
beach007.jpg 1 0.4
beach008.jpg 1 0.4
beach009.jpg 1 0.5
beach010.jpg 0.57 1
beach011.jpg 1 0.2
beach012.jpg 1 0.4
beach013.jpg 0.5 1
beach014.jpg 0.8 0.8
beach015.jpg 1 0.4
beach016.jpg 1 0.2
beach017.jpg 1 0.4
beach018.jpg 0.5 0.2
beach019.jpg 0.67 0.5
beach020.jpg 0.5 0.25
beach021.jpg 1 0.67
beach022.jpg 1 0.33
beach023.jpg 0.67 1
beach024.jpg 0.57 1
beach025.jpg 1 0.4
beach026.jpg 0.5 0.25
beach027.jpg 1 0.25
beach028.jpg 1 0.4
beach029.jpg 0.5 0.4
beach030.jpg 0.5 1
beach031.jpg 0.67 1
beach032.jpg 0.2 0.33
beach033.jpg 0.33 0.35
beach034.jpg 0.63 1
beach035.jpg 1 0.25
Average 0.75 0.50
Table 5 shows the percentage of each beach element that was correctly retrieved. SKY
and CLOUD have the highest retrieval rates, at 77% and 70% respectively. This is
because SKY and CLOUD have similar colors; in other words, when SKY is present, CLOUD
is also likely to be annotated. ROCK, however, was never correctly retrieved (0%). The
main reason is that ROCK rarely occurs in the test images (only four manual
annotations).
Fig. 2. An image automatically annotated with 5 words related to beach scenery
Table 5. Percentage of beach elements correctly retrieved
Beach Element   Manual Annotation   Automatic Annotation   Correctly Retrieved (%)
SKY             35                  27                     77.14
SEA             28                  11                     39.29
BEACH           33                  13                     39.39
GRASS           6                   2                      33.33
HILL            8                   1                      12.50
TREE            21                  7                      33.33
ROCK            4                   0                      0.00
CLOUD           20                  14                     70.00
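The last column of Table 5 appears to be the automatic count divided by the manual count, expressed as a percentage. The short check below reproduces the column under that reading; the dictionary layout and variable names are illustrative only.

```python
# (manual annotation count, automatic annotation count) from Table 5.
counts = {
    "SKY": (35, 27), "SEA": (28, 11), "BEACH": (33, 13), "GRASS": (6, 2),
    "HILL": (8, 1), "TREE": (21, 7), "ROCK": (4, 0), "CLOUD": (20, 14),
}

# Percentage correctly retrieved = automatic / manual * 100.
rates = {word: round(100 * auto / manual, 2)
         for word, (manual, auto) in counts.items()}

for word, rate in rates.items():
    print(f"{word:6s} {rate:6.2f}")
```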
5 Conclusion
The experimental results show that the prototype is best implemented using
evenly-spaced values for the initialization mode and the City Block distance as the
distance measure in K-Means clustering. Even though the training data set is very
small, owing to the lack of free image databases, the prototype achieved a precision
rate of 75%. This suggests that the K-Means algorithm is robust enough to be used for
clustering low-level features of an image for annotation purposes. Our study is an
initial work on automatic image annotation, and there are several constraints and
limitations that should be overcome with further research.
Future work to improve the accuracy of the system can take many directions. For
example, the prototype needs to be tested with another color model that is more
closely aligned with human vision, such as the HSV color model. More training images
should be acquired to increase the accuracy of the feature extraction. Finally, more
features should be extracted from the image so that the annotation process can convey
more meaning.
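As a starting point for the HSV direction mentioned above, the predefined colors of Table 1 could be converted with Python's standard `colorsys` module. This is only a sketch of the conversion step, not part of the reported experiments; the helper name `to_hsv_degrees` and the rounding are choices made for this example.

```python
import colorsys

# Average RGB values taken from Table 1.
PREDEFINED_RGB = {
    "SKY": (88, 122, 170), "SEA": (58, 97, 123), "BEACH": (187, 174, 147),
    "GRASS": (59, 69, 30), "HILL": (43, 88, 75), "TREE": (72, 79, 36),
    "ROCK": (76, 67, 69), "CLOUD": (190, 189, 199),
}


def to_hsv_degrees(rgb):
    """Convert an 8-bit RGB triple to (hue in degrees, saturation, value)."""
    r, g, b = (c / 255.0 for c in rgb)  # colorsys expects values in [0, 1]
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return round(h * 360, 1), round(s, 3), round(v, 3)


for word, rgb in PREDEFINED_RGB.items():
    print(word, to_hsv_degrees(rgb))
```

Hue is an angle, so any distance measure applied to HSV values would need to treat 0 and 360 degrees as adjacent.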
References
1. A Tutorial on Clustering Algorithm,
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.html
2. Hawkins, J.M.: Kamus Dwibahasa Oxford Fajar: Melayu Inggeris, 4th edn. Fajar Bakti,
Selangor (2004)
3. Çavuş, Ö., Aksoy, S.: Semantic Scene Classification for Image Annotation and
Retrieval. In: da Vitoria Lobo, N., et al. (eds.) IAPR 2008. LNCS, vol. 5342,
pp. 402–410. Springer, Heidelberg (2008)
4. Inoue, M.: On the Need for Annotation-Based Image Retrieval. In: Workshop of
Information Retrieval in Context, pp. 44–46 (2004)
5. James Wang Research Group, http://wang.ist.psu.edu/~jwang/test1.zip
6. Jia Li Research Group, http://www.stat.psu.edu/~jiali/li_photograph.tar
7. Li, J., Wang, J.Z.: Real-Time Computerized Annotation of Pictures. In: ACM
Multimedia Conference, pp. 911–920 (2006)
8. Pan, J.Y., Yang, H.J., Duygulu, P., Faloutsos, C.: Automatic Image Captioning. In:
IEEE International Conference on Multimedia and Expo, pp. 1987–1990 (2004)
9. Sayar, A., Yarman-Vural, F.T.: Image Annotation by Semi-Supervised Constrained by
SIFT Orientation Information. In: 23rd International Symposium on Computer and
Information Sciences, pp. 1–4 (2008)
10. Similarity Measurements,
http://people.revoledu.com/kardi/tutorial/Similarity/index.html
11. Srikanth, M., Varner, J., Bowden, M., Moldovan, D.: Exploiting Ontologies for
Automatic Image Annotation. In: 28th International ACM SIGIR Conference on Research
and Development in Information Retrieval, pp. 552–558 (2005)
12. Vrahatis, M.N., Boutsinas, B., Alevizos, P., Pavlides, G.: The New k-Windows
Algorithm for Improving the K-Means Clustering Algorithm. J. Complexity 18(1),
375–391 (2002)
13. Vutsinas, C.: Image Segmentation: K-Means and EM Algorithms,
http://www.ces.clemson.edu/~stb/ece847/fall2007/projects/kmeans_em.doc
14. Wang, L., Liu, L., Khan, L.: Automatic Image Annotation and Retrieval using
Subspace Clustering Algorithm. In: 2nd ACM International Workshop on Multimedia
Databases, pp. 100–108 (2004)
15. Li, W., Sun, M.: Automatic Image Annotation Based on WordNet and Hierarchical
Ensembles. In: Gelbukh, A. (ed.) CICLing 2006. LNCS, vol. 3878, pp. 417–428. Springer,
Heidelberg (2006)

Alper Gobel In Media Res Media ComponentInMediaRes1
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 

Recently uploaded (20)

Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 

Automatic Image Annotation Using Color K-Means Clustering

Conference Paper · November 2009 · DOI: 10.1007/978-3-642-05036-7_61 · Source: DBLP
Publication page: https://www.researchgate.net/publication/221365086
Content uploaded by Nursuriati Jamil on 01 December 2016.
H. Badioze Zaman et al. (Eds.): IVIC 2009, LNCS 5857, pp. 645–652, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Nursuriati Jamil and Siti ’Aisyah Sa’adan
Faculty of Computer & Mathematical Sciences, Universiti Teknologi MARA
40450 Shah Alam, Selangor, Malaysia
liza@tmsk.uitm.edu.my, aisyah.sadan@gmail.com

Abstract. Automatic image annotation is the process of modeling how a human assigns words to images based on visual observation. It is essential because manual annotation is time-consuming, especially for large databases, and because there is no standard captioning procedure, since captions depend on human perception. This paper discusses an implementation of automatic image annotation that uses the K-Means clustering algorithm to annotate colors with appropriate words drawn from a set of predefined colors. Experiments are conducted to identify the number of centroids, the distance measure and the initialization mode that give the best clustering results. A prototype of an automatic image annotation system is developed and then tested using thirty-five beach scenery photographs. Results show that annotating images using the evenly-spaced initialization mode and 100 centroids, measured with the City Block distance function, achieves a commendable 75% precision rate.

Keywords: Automatic image annotation, K-means clustering, RGB model, initialization mode, cluster number.

1 Introduction

Automatic image annotation is the process by which a computer system automatically assigns words, in the form of captions, to a digital image [14]. It is commonly used in image retrieval systems to organize and locate images of interest in a database. Annotation-based image retrieval is perceived as better than content-based image retrieval (CBIR) because it allows users to compose queries freely in their natural language [4].
Furthermore, a CBIR system matches images based on low-level visual similarities and therefore has limitations due to missing semantic information [8].

Clustering algorithms are commonly used to classify the low-level features of images prior to annotation. Clustering is defined in [1] as the process of organizing objects into groups whose members are similar in some way. A cluster is therefore a collection of objects that are similar to one another and dissimilar to objects belonging to other clusters. Popular clustering algorithms include K-Means, Expectation Maximization (EM) and Discrete Distribution (D2) clustering [7]. K-Means clustering relies on hard assignment of data to a given set of partitions, also known as cluster centers or the K centroids [13]. At every step of the algorithm, each data value is assigned to the nearest centroid based on a similarity parameter calculated with a distance measure. The centroids are then recalculated from these hard assignments. With each successive pass, a data value can switch to a different centroid, which alters the values of the centroids at every pass. K-Means clustering has been used extensively to help classify low-level features in image retrieval systems [3][14][11][7][9]. The EM algorithm employed in [13][14], on the other hand, relies on soft assignment of data to a given set of centroids. Every data value is associated with every centroid through a system of weights based on how strongly the data value should be associated with that particular centroid. In general, K-Means clustering works better than the EM algorithm and is fairly simple to implement for image segmentation using color as the feature parameter [13].

In this paper, an implementation of the K-Means clustering algorithm is used to automatically annotate beach scenery photographs from their RGB color features. The purpose of the study is to investigate the suitable number of centroids, distance measure and initialization mode in an attempt to achieve the best clustering performance.

2 K-Means Clustering

K-Means is a very popular algorithm and one of the best for implementing the clustering process [12]. Its time complexity is dominated by the product of the number of patterns, the number of centroids and the number of iterations. For an image, K-Means clustering may be implemented as follows:

i) Place K points into the space represented by the pixels that are being clustered. These points represent the initial cluster centroids, also known as initialization points.
ii) Assign each pixel to the cluster that has the closest centroid (obtained by measuring distance).
iii) When all pixels have been assigned, recalculate the positions of the K centroids.
iv) Repeat steps ii and iii until the centroids no longer move.

Factors that may affect the performance of the K-Means algorithm are the initialization mode, the distance measure and the number of centroids used during the clustering process. The initialization mode is important in order to have an accurate RGB representation of the centroids at the starting point of the clustering. Each pixel in the image is then assigned to its proper cluster, based on its similarity, by using a distance measure. The choice of measure influences the shape of the clusters, as elements may be closer to or further from one another depending on the distance calculated [10]. The distance measure is therefore also vital to ensure that every pixel is assigned to its centroid precisely. Common distance functions used in clustering are the Euclidean, City Block, Minkowski and Canberra distances.

The number of centroids, K, chosen for the clustering process must also be taken into consideration. According to [13], the number of centroids used in the segmentation has a very large effect on the output. The more centroids used in the color setup, the more possible colors are available to show up in the output.
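Steps i to iv and three of the distance functions named above can be sketched in Java, the language used for the prototype. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, pixels are held as `double[3]` RGB triples in a non-empty array, and "evenly spaced" initialization is read here as picking seed pixels at evenly spaced array indices, which is only one possible interpretation of that mode.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative K-Means over RGB pixels; not the authors' implementation. */
final class KMeansRgbSketch {

    /** City Block (Manhattan) distance: sum of absolute channel differences. */
    static double cityBlock(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Euclidean distance: square root of the summed squared differences. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Canberra distance; channels where both values are zero contribute nothing. */
    static double canberra(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double denominator = Math.abs(a[i]) + Math.abs(b[i]);
            if (denominator > 0.0) {
                sum += Math.abs(a[i] - b[i]) / denominator;
            }
        }
        return sum;
    }

    /** Step ii: index of the centroid closest to the given pixel. */
    static int nearest(double[] pixel, double[][] centroids,
                       ToDoubleBiFunction<double[], double[]> distance) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (distance.applyAsDouble(pixel, centroids[c])
                    < distance.applyAsDouble(pixel, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Steps i to iv; stops early once no centroid moves. */
    static double[][] cluster(double[][] pixels, int k, int maxIterations,
                              ToDoubleBiFunction<double[], double[]> distance) {
        // Step i (assumption): seed centroids from evenly spaced pixel indices.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int index = (int) ((long) c * (pixels.length - 1) / Math.max(1, k - 1));
            centroids[c] = pixels[index].clone();
        }
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Step ii: accumulate each pixel into its nearest centroid's cluster.
            double[][] sums = new double[k][3];
            int[] counts = new int[k];
            for (double[] pixel : pixels) {
                int c = nearest(pixel, centroids, distance);
                counts[c]++;
                for (int d = 0; d < 3; d++) {
                    sums[c][d] += pixel[d];
                }
            }
            // Step iii: move every non-empty centroid to the mean of its pixels.
            boolean moved = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous position
                }
                for (int d = 0; d < 3; d++) {
                    double mean = sums[c][d] / counts[c];
                    moved |= mean != centroids[c][d];
                    centroids[c][d] = mean;
                }
            }
            // Step iv: finish when the centroids no longer move.
            if (!moved) {
                break;
            }
        }
        return centroids;
    }
}
```

A call such as `KMeansRgbSketch.cluster(pixels, 100, 50, KMeansRgbSketch::cityBlock)` would cluster with the City Block measure; passing `KMeansRgbSketch::euclidean` or `KMeansRgbSketch::canberra` swaps the distance function without touching the loop, which makes the kind of comparison described above straightforward to set up.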
3 Materials and Methods

As mentioned previously, this paper discusses the implementation of an automatic annotation prototype that annotates beach photographs using eight predefined words: sky, sea, beach, cloud, tree, hill, grass and rock. Fig. 1 shows the annotation process.

Fig. 1. Automatic annotation processes. [Diagram; block labels: Beach image; Color extraction; Predefined colors; Color clustering (identify init mode, cluster no., distance measure); K-Means clustering; Automatic annotation (captions: SKY, CLOUD, SEA, ROCK, TREE, GRASS, HILL, BEACH); Manual annotation; Relevance list; Benchmarking; Training]

3.1 Data Collection

Ten natural beach scenery photographs are downloaded from [5][6], as these images have already been classified into their proper categories for benchmarking purposes. They are chosen from a total of 3,360 photographs and are used as training images. For testing, thirty-five photographs are collected randomly from the Yahoo! and Google search engines. Each test image must be a beach scenery photograph and must contain at least one of the eight beach elements: sky, sea, beach, cloud, tree, hill, grass and rock.

3.2 Manual Image Annotation

All thirty-five test images are manually annotated through visual inspection by three people. They are given a selected list of words that describe beach scenery, taken from the Oxford Fajar dictionary [2], and they annotate the test images using those words. The results of these manual annotations are then used as the benchmark for the proposed prototype.
3.3 Color Feature Extraction

Color features of the eight beach elements mentioned earlier are extracted from the training images using the RGB model. These predefined color features are later used during the testing phase to annotate the test images. Table 1 shows the average RGB color values for all the beach elements.

Table 1. Predefined colors of the beach elements

Beach element    Average RGB values
Sky              88, 122, 170
Sea              58, 97, 123
Beach            187, 174, 147
Grass            59, 69, 30
Hill             43, 88, 75
Tree             72, 79, 36
Rock             76, 67, 69
Cloud            190, 189, 199

3.4 Color Clustering

Two experiments are conducted to determine the initialization mode, distance function and number of clusters that give the highest performance of the K-Means algorithm. The first experiment looks for the best combination of initialization mode and distance measure. The initialization modes tested are the evenly-spaced mode and the max-data mode, and the distance measures involved are Euclidean, City Block and Canberra. The objective of the second experiment is to identify the appropriate number of centroids (K) to be used in the automatic annotation prototype. These centroids contain the RGB values that are later compared with the predefined colors of the beach elements. The numbers of centroids tested are K = 8, 30, 50 and 100.

To evaluate the performance of the clustering algorithm, Recall and Precision measures are computed [14], where numCorrect is the number of correctly retrieved words in the output caption, numRetrieved is the total number of retrieved words in the caption and numExist is the actual number of words for the caption.

\[ \mathrm{Recall} = \frac{\mathit{numCorrect}}{\mathit{numRetrieved}} \tag{1} \]

\[ \mathrm{Precision} = \frac{\mathit{numCorrect}}{\mathit{numExist}} \tag{2} \]

3.5 Development of Automatic Annotation Prototype

Based on the experiment results, a prototype of an automatic annotation system was developed in the Java programming language. The software development tools used are BlueJ version 2.1.2 with Java Development Kit version jdk1.6.0_05, Java Runtime Environment version jre1.6.0_07, Java Advanced Imaging Development Kit version jai-1_1_3-lib-windows-i586-jdk and Java Advanced Imaging Runtime Environment version jai-1_1_3-lib-windows-i586-jre. The prototype is then tested and evaluated using thirty-five photographs of beach scenery.

4 Results and Discussions

Table 2 shows the results of the first experiment, which determines the combination of initialization mode and distance measure that achieves the best clustering performance. Overall, the evenly-spaced initialization mode performed better than the max-data mode. The highest average precision rate of 88% is achieved with the evenly-spaced initialization mode and the City Block distance measure. Even though the recall rate of this combination (0.40) is slightly lower than that of the Canberra measure (0.46), we regard the precision rate as the better indicator of clustering performance.

Table 2. Performance of different combinations of initialization modes and distance measures

                       Precision                            Recall
Initialization mode    Euclidean   City Block   Canberra    Euclidean   City Block   Canberra
Evenly-spaced          0.80        0.88         0.83        0.40        0.40         0.46
Max-data               0.82        0.87         0.88        0.34        0.32         0.37

The results of comparing the number of centroids are recorded in Table 3. The table shows that K = 8 and K = 100 have an equal precision rate of 88% in annotating the images. Because the recall rate is higher at K = 100 (40%) than at K = 8 (32%), we conclude that the highest performance is achieved when the largest number of centroids, 100, is used. It is also interesting to note that with 30 and 50 centroids the precision rates are in fact lower than with only 8 centroids.

Table 3. Performance of different number of centroids

Number of centroids (K)    8       30      50      100
Average precision          0.88    0.87    0.87    0.88
Average recall             0.32    0.34    0.38    0.40

After the initialization mode, distance measure and number of centroids were determined, the prototype was developed and tested with the 35 test images. Fig. 2 illustrates the output for one of the annotated images. The recall and precision rates of the tested images are listed in Table 4, which shows an average precision rate of 75% and an average recall rate of 50%.
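The comparison between cluster centroids and the predefined colors of Table 1 can be illustrated with a short Java sketch. The paper does not state the exact matching rule used by the prototype, so the rule below is an assumption made purely for illustration: each centroid is matched to its nearest predefined color by City Block distance, and the word is emitted only if that distance does not exceed a hypothetical `maxDistance` threshold. The class, method and parameter names are likewise hypothetical; only the RGB values come from Table 1.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative centroid-to-word matching; the matching rule is an assumption. */
final class ColorAnnotatorSketch {

    /** Average RGB values of the eight beach elements, copied from Table 1. */
    static final Map<String, int[]> PREDEFINED = new LinkedHashMap<>();
    static {
        PREDEFINED.put("SKY", new int[] {88, 122, 170});
        PREDEFINED.put("SEA", new int[] {58, 97, 123});
        PREDEFINED.put("BEACH", new int[] {187, 174, 147});
        PREDEFINED.put("GRASS", new int[] {59, 69, 30});
        PREDEFINED.put("HILL", new int[] {43, 88, 75});
        PREDEFINED.put("TREE", new int[] {72, 79, 36});
        PREDEFINED.put("ROCK", new int[] {76, 67, 69});
        PREDEFINED.put("CLOUD", new int[] {190, 189, 199});
    }

    /** City Block distance between two RGB triples. */
    static int cityBlock(int[] a, int[] b) {
        return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]) + Math.abs(a[2] - b[2]);
    }

    /**
     * Returns the caption words for a set of centroids.
     * Assumed rule: a centroid contributes the word of its nearest predefined
     * color, but only when it lies within maxDistance of that color.
     */
    static Set<String> annotate(int[][] centroids, int maxDistance) {
        Set<String> words = new LinkedHashSet<>();
        for (int[] centroid : centroids) {
            String bestWord = null;
            int bestDistance = Integer.MAX_VALUE;
            for (Map.Entry<String, int[]> entry : PREDEFINED.entrySet()) {
                int distance = cityBlock(centroid, entry.getValue());
                if (distance < bestDistance) {
                    bestDistance = distance;
                    bestWord = entry.getKey();
                }
            }
            if (bestWord != null && bestDistance <= maxDistance) {
                words.add(bestWord);
            }
        }
        return words;
    }
}
```

With centroids (90, 120, 168), (185, 175, 150) and (0, 255, 0) and a threshold of 30, this rule yields the words SKY and BEACH; the pure green centroid is far from every predefined color and adds nothing. The threshold controls how many words appear in a caption and would have to be tuned against the manual annotations.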
Table 4. Recall and Precision Rates of the Prototype

Image           Precision   Recall
beach001.jpg    0.5         0.25
beach002.jpg    0.67        0.4
beach003.jpg    0.4         0.5
beach004.jpg    1           0.33
beach005.jpg    1           0.4
beach006.jpg    0.5         0.2
beach007.jpg    1           0.4
beach008.jpg    1           0.4
beach009.jpg    1           0.5
beach010.jpg    0.57        1
beach011.jpg    1           0.2
beach012.jpg    1           0.4
beach013.jpg    0.5         1
beach014.jpg    0.8         0.8
beach015.jpg    1           0.4
beach016.jpg    1           0.2
beach017.jpg    1           0.4
beach018.jpg    0.5         0.2
beach019.jpg    0.67        0.5
beach020.jpg    0.5         0.25
beach021.jpg    1           0.67
beach022.jpg    1           0.33
beach023.jpg    0.67        1
beach024.jpg    0.57        1
beach025.jpg    1           0.4
beach026.jpg    0.5         0.25
beach027.jpg    1           0.25
beach028.jpg    1           0.4
beach029.jpg    0.5         0.4
beach030.jpg    0.5         1
beach031.jpg    0.67        1
beach032.jpg    0.2         0.33
beach033.jpg    0.33        0.35
beach034.jpg    0.63        1
beach035.jpg    1           0.25
Average         0.75        0.50

Table 5 shows the percentage of each beach element that is correctly retrieved. SKY and CLOUD have the highest retrieval rates, at 77% and 70% respectively. This is because SKY has a similar color to CLOUD; in other words, when SKY is present, CLOUD is also likely to be annotated. ROCK, however, has a correctly retrieved rate of 0%. The main reason is that ROCK occurs rarely in the tested images.
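The ratios written as Eqs. (1) and (2) and the per-element percentages can be reproduced with a few lines of Java. This is a minimal sketch under assumptions: captions are represented as sets of words, the class and method names are hypothetical, and the "correctly retrieved" percentage is taken to be the automatic count divided by the manual count, which agrees with the Table 5 figures (for SKY, 27 of 35 gives 77.14).

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative evaluation helpers; names are hypothetical. */
final class AnnotationScoresSketch {

    /** Number of automatically retrieved words that also occur in the manual caption. */
    static int countCorrect(Set<String> retrieved, Set<String> manual) {
        Set<String> common = new HashSet<>(retrieved);
        common.retainAll(manual);
        return common.size();
    }

    /** Ratio written as Eq. (1): numCorrect / numRetrieved (0 when nothing was retrieved). */
    static double ratioEq1(int numCorrect, int numRetrieved) {
        return numRetrieved == 0 ? 0.0 : (double) numCorrect / numRetrieved;
    }

    /** Ratio written as Eq. (2): numCorrect / numExist (0 when no word exists). */
    static double ratioEq2(int numCorrect, int numExist) {
        return numExist == 0 ? 0.0 : (double) numCorrect / numExist;
    }

    /** Assumed Table 5 percentage: automatic count over manual count, times 100. */
    static double percentCorrectlyRetrieved(int automatic, int manual) {
        return manual == 0 ? 0.0 : 100.0 * automatic / manual;
    }
}
```

For example, a caption that retrieves three words of which two appear among five manually assigned words gives 2/3 for the first ratio and 2/5 = 0.4 for the second, and `percentCorrectlyRetrieved(14, 20)` gives the 70.00 reported for CLOUD.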
Fig. 2. An image automatically annotated with 5 words related to beach scenery

Table 5. Percentage of beach elements correctly retrieved

Beach element   Manual annotation   Automatic annotation   Correctly retrieved (%)
SKY             35                  27                     77.14
SEA             28                  11                     39.29
BEACH           33                  13                     39.39
GRASS           6                   2                      33.33
HILL            8                   1                      12.50
TREE            21                  7                      33.33
ROCK            4                   0                      0.00
CLOUD           20                  14                     70.00

5 Conclusion

The experimental results show that the prototype is best implemented using the evenly-spaced initialization mode with the City Block distance measure in K-Means clustering. Even though the training set is very small, owing to the lack of free image databases, the prototype achieved a commendable precision rate of 75%. This shows that the K-Means algorithm is robust enough to be used for clustering the low-level features of an image for annotation purposes.

Our study is an initial work on automatic image annotation, and several constraints and limitations should be overcome with further research. Future work to improve the accuracy of the system can take many directions. For example, the prototype needs to be tested with another color model that is more closely aligned with human vision, such as the HSV color model. More training images should be acquired to increase the accuracy of the feature extraction. Finally, more features should be extracted from the image to give more meaning to the annotation.
References

1. A Tutorial on Clustering Algorithm, http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.html
2. Hawkins, J.M.: Kamus Dwibahasa Oxford Fajar: Melayu Inggeris, 4th edn. Fajar Bakti, Selangor (2004)
3. Çavuş, Ö., Aksoy, S.: Semantic Scene Classification for Image Annotation and Retrieval. In: da Vitoria Lobo, N., et al. (eds.) IAPR 2008. LNCS, vol. 5342, pp. 402–410. Springer, Heidelberg (2008)
4. Inoue, M.: On the Need for Annotation-Based Image Retrieval. In: Workshop of Information Retrieval in Context, pp. 44–46 (2004)
5. James Wang Research Group, http://wang.ist.psu.edu/~jwang/test1.zip
6. Jia Li Research Group, http://www.stat.psu.edu/~jiali/li_photograph.tar
7. Li, J., Wang, J.Z.: Real-Time Computerized Annotation of Pictures. In: ACM Multimedia Conference, pp. 911–920 (2006)
8. Pan, J.Y., Yang, H.J., Duygulu, P., Faloutsos, C.: Automatic Image Captioning. In: IEEE International Conference on Multimedia and Expo., pp. 1987–1990 (2004)
9. Sayar, A., Yarman-Vural, F.T.: Image Annotation by Semi-Supervised Constrained by SIFT Orientation Information. In: 23rd International Symposium on Computer and Information Sciences, pp. 1–4 (2008)
10. Similarity Measurements, http://people.revoledu.com/kardi/tutorial/Similarity/index.html
11. Srikanth, M., Varner, J., Bowden, M., Moldovan, D.: Exploiting Ontologies for Automatic Image Annotation. In: 28th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 552–558 (2005)
12. Vrahatis, M.N., Boutsinas, B., Alevizos, P., Pavlides, G.: The New k-Windows Algorithm for Improving the K-Means Clustering Algorithm. J. Complexity 18(1), 375–391 (2002)
13. Vutsinas, C.: Image Segmentation: K-Means and EM Algorithms, http://www.ces.clemson.edu/~stb/ece847/fall2007/projects/kmeans_em.doc
14. Wang, L., Liu, L., Khan, L.: Automatic Image Annotation and Retrieval using Subspace Clustering Algorithm. In: 2nd ACM International Workshop on Multimedia Databases, pp. 100–108 (2004)
15. Li, W., Sun, M.: Automatic Image Annotation Based on WordNet and Hierarchical Ensembles. In: Gelbukh, A. (ed.) CICLing 2006. LNCS, vol. 3878, pp. 417–428. Springer, Heidelberg (2006)