Seminar Report
Presented to:
Dr. Shanbehzadeh
Presented by:
Farzaneh Rezaei
November 2015
What is the goal of computer vision ?
Perceive the story
behind the picture
See the world!!
But what exactly does it
mean to see?
Source: Wall-e Movie: Pixar, Walt Disney Pictures
Outline
Introduction to Image Annotation
• What?
• Why?
Story Behind AIA
• Components of AIA
• Progress of AIA
• Issues & Conclusions
Going deeper!
• Feature Extraction
• Learning Methods
• Deep Learning
• Conclusions
Useful Information
• Recent Articles
• Toolbox
• Databases
• Authors
Conclusions
• References
What is Automatic Image Annotation?
Automatic image annotation is the task of automatically assigning words to an image that describe the content of the image.
Munirathnam Srikanth, et al. Exploiting ontologies for automatic image annotation
Source: Personalizing Automated Image Annotation Using Cross-Entropy: https://ivi.fnwi.uva.nl/isis/publications/bibtexbrowser.php?key=LiICM2011&bib=all.bib
What is Automatic Image Annotation? (Cont.)
Source: MS COCO Captioning Challenge: http://mscoco.org/dataset/#captions-challenge2015
3,000 Photos Are Uploaded
Every Second to Facebook
Why Image Annotation
is important?
Recently, we have witnessed exponential growth in user-generated videos and images, driven by the boom of social networks such as Facebook and Flickr.
Source: http://petapixel.com/2012/02/01/3000-photos-are-uploaded-every-second-to-facebook/
Why Image Annotation is important?(Cont.)
Source: Barriuso, A., & Torralba, A. (2012). Notes on image annotation
• Applications, e.g. photo organizer apps
• Image classification systems
[Figure: number of articles per year with “Automatic Image Annotation” in the title, 2000 to 2015; y-axis from 0 to 70 articles. Reported by: Google Scholar]
How do you annotate these images?
What are the components of an Automatic Image Annotation system?
How to classify images?
• Feature Extraction
• Classification Methods
Pattern Recognition!!
An Example of classical approaches in AIA
Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013
Issues of classical approaches
Theoretical Limitations of Shallow Architectures*
Functions that can be compactly represented by a depth k architecture might require an exponential number of computational elements to be represented by a depth k − 1 architecture.
*Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning
Issues of classical approaches (Cont.)
Theoretical Limitations of Shallow Architectures
• Shallow? Deep?
• Functions?
• Compact?
• Depth?
• Computational Elements?
Issues of classical approaches (Cont.)
logic circuit
Picture Source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning
Issues of classical approaches (Cont.)
Theoretical Limitations of Shallow Architectures
• Linear regression and logistic regression have depth 1, i.e., a single level.
• Ordinary multi-layer neural networks, with the most common choice of one hidden layer, have depth 2.
• Decision trees can also be seen as having two levels.
• Boosting (Freund & Schapire, 1996) usually adds one level to its base learners: that level computes a vote or linear combination of the outputs of the base learners.
• A two-layer circuit of logic gates can represent any boolean function (Mendelson, 1997).
• With depth-two logic circuits, most boolean functions require a number of logic gates that is exponential in the input size (Wegener, 1987).
• There are functions computable with a polynomial-size logic-gate circuit of depth k that require exponential size when restricted to depth k − 1 (Hastad, 1986). The proof of this theorem relies on earlier results (Yao, 1985) showing that d-bit parity circuits of depth 2 have exponential size.
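The depth-2 parity result can be made concrete with a small sketch. The helper names `parity_dnf_terms` and `eval_dnf` are hypothetical and introduced only for illustration. The code builds the canonical OR-of-ANDs (depth-2) circuit for d-bit parity, with one AND term per input assignment that has an odd number of ones, and checks that it computes parity. The number of AND terms is 2^(d−1), so it doubles with every extra input bit; for parity these terms cannot be merged, because flipping any single bit changes the output.

```python
from itertools import product

def parity_dnf_terms(d):
    """AND terms of the canonical OR-of-ANDs (depth-2) form of d-bit parity.

    Each term is a full bit assignment on which parity equals 1.
    """
    return [bits for bits in product((0, 1), repeat=d) if sum(bits) % 2 == 1]

def eval_dnf(terms, x):
    """Evaluate the depth-2 circuit: an OR over terms, each term an AND of literals."""
    return int(any(all(xi == ti for xi, ti in zip(x, term)) for term in terms))

for d in range(1, 7):
    terms = parity_dnf_terms(d)
    # The depth-2 circuit agrees with parity on every possible input.
    assert all(eval_dnf(terms, x) == sum(x) % 2 for x in product((0, 1), repeat=d))
    print(d, len(terms))  # input size and number of AND gates
```

A deeper circuit can instead chain 2-input XOR gates and needs only d − 1 of them, which is the kind of gap between depth k and depth k − 1 that the theorem describes.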
Issues of classical approaches (Cont.)
• One might wonder whether these computational complexity results for boolean circuits are relevant to machine learning.
• See Orponen (1994) for an early survey of theoretical results in computational complexity relevant to learning algorithms.
• Interestingly, many of the results for boolean circuits can be generalized to architectures whose computational elements are linear threshold units (also known as artificial neurons (McCulloch & Pitts, 1943)), which compute:
$f(x) = \mathbf{1}_{w^{\top} x + b \ge 0}$ (1)
with parameters w and b.
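A linear threshold unit outputs 1 when the weighted sum of its inputs plus the bias is non-negative, and 0 otherwise. The sketch below is a minimal illustration of that definition. The function names and the specific weights are assumptions chosen for the example, not taken from the slides. A single unit (depth 1) computes AND or OR of two bits, while XOR, which no single threshold unit can compute, is obtained by stacking a second unit on top of the first two (depth 2).

```python
def threshold_unit(w, b, x):
    """Linear threshold unit: 1 if the dot product w.x plus b is >= 0, else 0."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b >= 0)

# Depth 1: one unit each (weights picked by hand for illustration).
def AND(x):
    return threshold_unit((1, 1), -2, x)

def OR(x):
    return threshold_unit((1, 1), -1, x)

# Depth 2: XOR fires when OR is on and AND is off.
def XOR(x):
    hidden = (OR(x), AND(x))
    return threshold_unit((1, -1), -1, hidden)

for x in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(x, AND(x), OR(x), XOR(x))
```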
Issues of classical approaches (Cont.)
1 Theoretical Limitations of Shallow Architectures
2 Theoretical Advantages of Deep Architectures
Which one ?? !
How to assign a word to an image?
What are the components of an Automatic Image Annotation system?
• Feature Extraction
• Classification Methods
Pattern Recognition!!
Components of AIA
Classical or Shallow Structure
Issues
http://graffiti-artist.net/corporate-offices/ny-facebook-office-graffiti/
Going Deeper!
Feature Extraction & Representation
• Color
• Texture
• Shape
• Segmentation
Learning Methods
• ANN
• SVM
• Bayes
• Metadata
Feature Extraction
Color
• Color Histogram
• Color Moments
• Color Coherence Vector
• Color Correlogram
• Scalable Color Descriptor
• Color Structure Descriptor
• Dominant Color Descriptor
Texture
Spatial
• Statistical
• Structural
• Model-based
Spectral
• FT, DCT, Wavelet, ...
Color: Comparisons
Color method | Pros | Cons
Histogram | Simple to compute, intuitive | High dimension, no spatial info, sensitive to noise
CM (Color Moments) | Compact, robust | Not enough to describe all colors, no spatial info
CCV (Color Coherence Vector) | Spatial info | High dimension, high computation cost
Correlogram | Spatial info | Very high computation cost, sensitive to noise, rotation and scale
DCD (Dominant Color Descriptor) | Compact, robust, perceptual meaning | Need post-processing for spatial info
CSD (Color Structure Descriptor) | Spatial info | Sensitive to noise, rotation and scale
SCD (Scalable Color Descriptor) | Compact on need, scalability | No spatial info, less accurate if compact
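The first two rows of the comparison can be illustrated with a short sketch. It assumes an image given as a flat list of 8-bit RGB tuples; the function names, the bin count, and the toy pixel data are all hypothetical. The histogram counts pixels per quantized color bin, and the color moments keep only the mean, standard deviation, and skewness of each channel. Neither descriptor looks at pixel positions, which is why both are listed as carrying no spatial info.

```python
import math

def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram with bins_per_channel ** 3 bins."""
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    n = len(pixels)
    return [count / n for count in hist]

def color_moments(pixels):
    """Mean, standard deviation, and skewness for each of the R, G, B channels."""
    n = len(pixels)
    moments = []
    for channel in zip(*pixels):
        mean = sum(channel) / n
        var = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        # Signed cube root of the third central moment.
        skew = math.copysign(abs(third) ** (1 / 3), third)
        moments.extend([mean, math.sqrt(var), skew])
    return moments

# Toy "image": two reddish and two bluish pixels (made-up data).
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 5, 250)]
hist = color_histogram(pixels)
moments = color_moments(pixels)
print(len(hist), len(moments))  # descriptor lengths: histogram vs. moments
```

With 4 bins per channel the histogram already has 64 entries, while the moments descriptor has 9, which matches the "high dimension" versus "compact" entries in the table.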
Spatial Texture: Comparisons
Texture method | Pros | Cons
Texton | Intuitive | Sensitive to noise, rotation and scale, difficult to define textons
GLCM based method | Intuitive, compact, robust | High computation cost, not enough to describe all textures
Tamura | Perceptually meaningful | Too few features
SAR | Compact, robust, rotation invariant | High computation cost, difficult to define pattern size
FD | Compact, perceptually meaningful | High computation cost, sensitive to scale
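As a rough illustration of the GLCM-based row, the sketch below computes a gray-level co-occurrence matrix for one pixel offset and derives two common statistics from it, contrast and energy. The function names, the offset, and the tiny test images are assumptions made for this example; the image is a list of rows of integer gray levels.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Large when neighbouring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def energy(p):
    """Large when a few gray-level pairs dominate (uniform texture)."""
    return sum(v * v for row in p for v in row)

stripes = [[0, 3, 0, 3] for _ in range(4)]  # vertical stripes, 4 gray levels
flat = [[1, 1, 1, 1] for _ in range(4)]     # constant image

for name, img in (("stripes", stripes), ("flat", flat)):
    p = glcm(img, levels=4)
    print(name, contrast(p), energy(p))
```

Only a handful of numbers summarize the texture, which is the "compact" property in the table; the nested loops over all pixels and offsets are where the computation cost comes from.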
Spectral Texture: Comparisons (Cont.)
Texture method | Pros | Cons
FT/DCT | Fast computation | Sensitive to scale and rotation
Wavelet | Fast computation, multi-resolution | Sensitive to rotation, limited orientations
Gabor | Multi-scale, multi-orientation, robust | Need rotation normalisation, loss of spectral information due to incomplete cover of the spectrum plane
Curvelet | Multi-resolution, multi-orientation, robust | Need rotation normalisation
Shape
Chart Source: [Zhang and Lu 2004]
Shape (Cont.)
Chart Source: [M. Yang, K. Kpalma, J. Ronsin 2008]
Shape (Cont.)
• Contour based: calculate shape features only from the boundary of the shape
• Region based: extract features from the entire region
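The difference between the two families can be sketched on a binary mask (a list of rows of 0/1 values). The function names and the toy mask below are hypothetical. The region-based function uses every foreground pixel, here to compute area and centroid; the contour-based function keeps only foreground pixels that touch the background, which is the set a contour descriptor would work from.

```python
def region_features(mask):
    """Region based: area and centroid computed from all foreground pixels."""
    pts = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pts)
    cy = sum(y for y, _ in pts) / area
    cx = sum(x for _, x in pts) / area
    return area, (cy, cx)

def contour_pixels(mask):
    """Contour based: foreground pixels with a background or out-of-image 4-neighbour."""
    rows, cols = len(mask), len(mask[0])

    def is_background(y, x):
        inside = 0 <= y < rows and 0 <= x < cols
        return (not inside) or (not mask[y][x])

    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(y, x)
            for y in range(rows) for x in range(cols)
            if mask[y][x] and any(is_background(y + dy, x + dx) for dy, dx in neighbours)]

# Toy shape: a 3x3 square centred in a 5x5 image.
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]

area, centroid = region_features(mask)
boundary = contour_pixels(mask)
print(area, centroid, len(boundary))
```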
Shape (Cont.)
• Contour-based techniques are more sensitive to noise than region-based techniques.
• Therefore, color image retrieval usually employs region-based shape features.
Learning Methods
• SVM
• ANN
• Tree
• Parametric
• Non-Parametric
Learning Methods: Comparisons
Annotation method | Pros | Cons
SVM | Small sample, optimal class boundary, non-linear classification | Single labelling, one class per time, expensive trial and run, sensitive to noisy data, prone to over-fitting
ANN | Multiclass outputs, non-linear classification, robust to noisy data, suitable for complex problem | Single labelling, sub-optimal, expensive training, complex and black box classification
DT | Intuitive, semantic rules, multiclass outputs, fast, allow missing values, handle both categorical and numerical values | Single labelling, sub-optimal, need pruning, can be unstable
Non-parametric | Multi-labelling, model free, fast | Large number of parameters, large sample, sensitive to noisy data
Parametric | Multi-labelling, small sample, good approximation of unknown distribution | Predefined distribution, expensive training, approximated boundary
Metadata | Use of both textual and visual features | Difficult to relate visual features with textual features, difficult textual feature extraction
Deep Learning
• Deep belief networks
• Deep Boltzmann machines
• Deep Convolutional neural networks
• Deep Recurrent neural networks
• Hierarchical temporal memory
Source: https://en.wikipedia.org/wiki/List_of_machine_learning_concepts
Deep Learning (Cont.)
Source: Ranzato, 4 October 2013, Slides
Deep Learning (Cont.)
• A potential problem with deep learning?*
• Optimization task
• See:
  • Bengio’s articles
  • Deep learning videos on YouTube
  • Ranzato, 4 October 2013: https://www.youtube.com/watch?v=clgMTk5V2Sk
*: Ranzato, 4 October 2013, Slides
Useful Information: Recent Articles
2009, Shallow
Source: Venkatesh N. Murthy, S. Maji, R. Manmatha, Automatic Image Annotation using Deep Learning Representations, 2015
Useful Information: Recent Articles (Cont.)
Which one?!
1 Theoretical Limitations of Shallow Architectures
2 Theoretical Advantages of Deep Architectures
Source: B. Klein, G. Lev, G. Sadeh, and L. Wolf, Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation, 2015
Useful Information: Toolbox
MatConvNet
• MatConvNet is a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs. Several example CNNs are included to classify and encode images.
Caffe
• Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.
Useful Information: Databases
• Corel5k: an important benchmark for keyword-based image retrieval and image annotation. 5000 images manually annotated with 1 to 5 keywords. The vocabulary contains 260 words.
• ESP Game: this data set is obtained from an online game where two players, who cannot communicate outside the game, gain points by agreeing on words describing the image.
• IAPR TC12: this set of 20,000 images accompanied by descriptions in several languages was initially published for cross-lingual retrieval.
Useful Information: Databases
• Other Databases:
• Flicker8,10,30
Table Source: M. Guillaumin, T. Mensink, J. Verbeek and C. Schmid, TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation
Useful Information: Authors
Cordelia Schmid
•Research director INRIA
•Computer vision, object
recognition, video
recognition, learning
Li Fei-Fei
•Professor, Stanford University
•Artificial Intelligence, Machine
Learning, Computer
Vision, Neuroscience
Yoshua Bengio
•Professor, U. Montreal, Computer Sc.
•Machine learning, deep
learning, artificial intelligence
Reported by: Google Scholar
Useful Information: Authors (Cont.)
Richard Socher
•MetaMind
•deep learning, machine learning, natural language
processing, computer vision
Recursive Deep Learning for Natural Language
Processing and Computer Vision,
PhD Thesis, Computer Science Department, Stanford
University
2014 Arthur L. Samuel Best Computer Science PhD
Thesis Award
Reported by: Google Scholar
Components
of AIA
Classical or
Shallow
Structure
Issues
Conclusions!!!
1. High-dimensional feature analysis.
2. How to build an effective annotation model?
3. Annotation and ranking are currently done online simultaneously in the multiple labelling annotation approaches. This is not efficient for image retrieval.
4. Lack of standard vocabulary and taxonomy.
5. There is no commonly accepted image database.
6. Insufficient depth of architectures, and locality of estimators [Bengio, 2009].
Picture Source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning
Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
B. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdfB. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdf
 
Constructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective CommunicationConstructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective Communication
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
Solutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptxSolutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptx
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 

Automatic Image Annotation (AIA)

  • 1. Seminar Report. Presented to: Dr. Shanbehzadeh. Presented by: Farzaneh Rezaei. November 2015
  • 2. What is the goal of computer vision? Perceive the story behind the picture. See the world! But what exactly does it mean to see? Source: WALL-E movie (Pixar, Walt Disney Pictures)
  • 3. Outline: Introduction to Image Annotation (What? Why?) • Story Behind AIA (Components of AIA, Progress of AIA, Issues & Conclusions) • Going Deeper (Feature Extraction, Learning Methods, Deep Learning, Conclusions) • Useful Information (Recent Articles, Toolbox, Databases, Authors) • Conclusions (References)
  • 5. What is Automatic Image Annotation? "Automatic image annotation is the task of automatically assigning words to an image that describe the content of the image." (Munirathnam Srikanth et al., Exploiting ontologies for automatic image annotation) Source: Personalizing Automated Image Annotation Using Cross-Entropy: https://ivi.fnwi.uva.nl/isis/publications/bibtexbrowser.php?key=LiICM2011&bib=all.bib
  • 6. What is Automatic Image Annotation? (Cont.) Source: MS COCO Captioning Challenge: http://mscoco.org/dataset/#captions-challenge2015
  • 7. Why is image annotation important? Recently, we have witnessed exponential growth in user-generated videos and images, driven by the boom in social networks such as Facebook and Flickr. 3,000 photos are uploaded every second to Facebook. Source: http://petapixel.com/2012/02/01/3000-photos-are-uploaded-every-second-to-facebook/
  • 8. Why is image annotation important? (Cont.) • Applications, e.g. photo organizer apps • Image classification systems. Source: Barriuso, A., & Torralba, A. (2012). Notes on image annotation
  • 9. Number of articles per year with "Automatic Image Annotation" in the title, 2000 to 2015 (bar chart; x-axis: year, y-axis: article count from 0 to 70). Reported by: Google Scholar
  • 11. How do you annotate these images?
  • 12. What are the components of an automatic image annotation system?
  • 13. How do we classify images? What are the components of an automatic image annotation system?
  • 15. What are the components of an automatic image annotation system? • Feature Extraction • Classification Methods
  • 16. What are the components of an automatic image annotation system? • Feature Extraction • Classification Methods • Pattern Recognition!
  • 18. An example of classical approaches in AIA. Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013
  • 19. Issues of classical approaches: Theoretical Limitations of Shallow Architectures*. Functions that can be compactly represented by a depth-k architecture might require an exponential number of computational elements to be represented by a depth-(k − 1) architecture. *Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning
  • 20. Issues of classical approaches (Cont.): Theoretical Limitations of Shallow Architectures • Shallow? Deep? • Functions? • Compact? • Depth? • Computational elements? Example: logic circuit
  • 21. Issues of classical approaches (Cont.) Picture source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning
  • 22. Issues of classical approaches (Cont.): Theoretical Limitations of Shallow Architectures • Linear regression and logistic regression have depth 1, i.e., a single level. • Ordinary multi-layer neural networks, with the most common choice of one hidden layer, have depth two. • Decision trees can also be seen as having two levels. • Boosting (Freund & Schapire, 1996) usually adds one level to its base learners: that level computes a vote or linear combination of the outputs of the base learners.
  • 23. Issues of classical approaches (Cont.): Theoretical Limitations of Shallow Architectures • Shallow? Deep? • Functions • Compact • Depth • Computational elements
  • 25. Issues of classical approaches (Cont.) • A two-layer circuit of logic gates can represent any boolean function (Mendelson, 1997). • With depth-two logic circuits, most boolean functions require a number of logic gates that is exponential in the input size (Wegener, 1987). • There are functions computable with a polynomial-size logic gate circuit of depth k that require exponential size when restricted to depth k − 1 (Håstad, 1986). The proof of this theorem relies on earlier results (Yao, 1985) showing that d-bit parity circuits of depth 2 have exponential size.
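
The depth-2 parity result can be made concrete with a small sketch. It assumes the depth-2 circuit is written in disjunctive normal form (an OR of AND terms) and enumerates the terms needed for d-bit parity. Any two odd-weight inputs differ in at least two bits, so no two terms can be merged and the count is 2^(d − 1). The function names are illustrative and do not come from the slides.

```python
from itertools import product

def parity(bits):
    """Return 1 if an odd number of bits are set, else 0."""
    return sum(bits) % 2

def parity_dnf_terms(d):
    """List the AND terms of the canonical depth-2 DNF for d-bit parity.

    Each term is stored as the single input assignment it accepts.
    """
    return [bits for bits in product((0, 1), repeat=d) if parity(bits) == 1]

def eval_dnf(terms, bits):
    """Evaluate the depth-2 circuit: an OR over exact-match AND terms."""
    return int(any(term == tuple(bits) for term in terms))

if __name__ == "__main__":
    for d in range(1, 7):
        # The number of AND terms doubles with every extra input bit.
        print(d, len(parity_dnf_terms(d)))
```

A deeper circuit can compute the same function by chaining d − 1 two-input XOR gates, which is the kind of gap between depth and size that the slide refers to.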
  • 26. Issues of classical approaches (Cont.) • One might wonder whether these computational complexity results for boolean circuits are relevant to machine learning. • See Orponen (1994) for an early survey of theoretical results in computational complexity relevant to learning algorithms. • Interestingly, many of the results for boolean circuits can be generalized to architectures whose computational elements are linear threshold units (also known as artificial neurons (McCulloch & Pitts, 1943)), which compute $f(x) = \mathbf{1}_{w^{\top} x + b \ge 0}$ (1), with parameters w and b.
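
As a minimal sketch of equation (1), the linear threshold unit can be written in a few lines of Python. The weights chosen for the AND, OR and NAND units are hand-picked assumptions for illustration, not values from the slides. XOR, which is 2-bit parity, cannot be computed by a single unit, so the sketch stacks two levels.

```python
def threshold_unit(w, b, x):
    """Linear threshold unit: 1 if w.x + b >= 0, else 0."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation >= 0 else 0

def and_gate(x):
    # Fires only when both inputs are 1: x1 + x2 - 2 >= 0.
    return threshold_unit([1, 1], -2, x)

def or_gate(x):
    # Fires when at least one input is 1: x1 + x2 - 1 >= 0.
    return threshold_unit([1, 1], -1, x)

def nand_gate(x):
    # Fires unless both inputs are 1: 1 - x1 - x2 >= 0.
    return threshold_unit([-1, -1], 1, x)

def xor_depth2(x):
    """XOR built from two levels: OR and NAND units feeding an AND unit."""
    return and_gate([or_gate(x), nand_gate(x)])

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_depth2([a, b]))
```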
  • 27. Issues of classical approaches (Cont.) Which one? 1. Theoretical Limitations of Shallow Architectures 2. Theoretical Advantages of Deep Architectures
  • 30. How do we assign a word to an image? What are the components of an automatic image annotation system? • Feature Extraction • Classification Methods • Pattern Recognition! Recap: Components of AIA • Classical or Shallow Structure • Issues
  • 33. Going Deeper! Feature Extraction & Representation: • Color • Texture • Shape • Segmentation. Learning Methods: • ANN • SVM • Bayes • Metadata
  • 37. Color: Comparisons
      Color method | Pros | Cons
      Histogram | Simple to compute, intuitive | High dimension, no spatial info, sensitive to noise
      CM | Compact, robust | Not enough to describe all colors, no spatial info
      CCV | Spatial info | High dimension, high computation cost
      Correlogram | Spatial info | Very high computation cost, sensitive to noise, rotation and scale
  • 38. Color: Comparisons (Cont.)
      Color method | Pros | Cons
      DCD | Compact, robust, perceptual meaning | Need post-processing for spatial info
      CSD | Spatial info | Sensitive to noise, rotation and scale
      SCD | Compact on need, scalability | No spatial info, less accurate if compact
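
The trade-off between the first two rows of the color table (histogram versus color moments, CM) can be illustrated with a small pure-Python sketch. It assumes an image given as a flat list of (r, g, b) tuples with values in 0..255, uses 4 bins per channel, and computes only the first two moments. These choices and the function names are assumptions made for the example.

```python
from math import sqrt

def color_histogram(pixels, bins_per_channel=4):
    """Quantised RGB histogram, normalised to sum to 1."""
    pixels = list(pixels)
    step = 256 // bins_per_channel

    def quantise(value):
        # Clamp so that 255 always falls in the last bin.
        return min(value // step, bins_per_channel - 1)

    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = (quantise(r) * bins_per_channel + quantise(g)) * bins_per_channel + quantise(b)
        hist[idx] += 1
    return [h / len(pixels) for h in hist]

def color_moments(pixels):
    """Mean and standard deviation per channel (first two color moments)."""
    pixels = list(pixels)
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        std = sqrt(sum((v - mean) ** 2 for v in values) / n)
        features.extend([mean, std])
    return features

if __name__ == "__main__":
    image = [(255, 0, 0)] * 3 + [(0, 0, 255)]
    print(len(color_histogram(image)), len(color_moments(image)))
```

The histogram here has 64 dimensions against 6 for the moments, which matches the "high dimension" versus "compact" entries in the table. Both descriptors ignore pixel order, so neither carries spatial information.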
  • 39. Spatial Texture: Comparisons
      Texture method | Pros | Cons
      Texton | Intuitive | Sensitive to noise, rotation and scale, difficult to define textons
      GLCM-based method | Intuitive, compact, robust | High computation cost, not enough to describe all textures
      Tamura | Perceptually meaningful | Too few features
      SAR | Compact, robust, rotation invariant | High computation cost, difficult to define pattern size
      FD | Compact, perceptually meaningful | High computation cost, sensitive to scale
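
To make the GLCM row concrete, the following sketch builds a gray-level co-occurrence matrix for one pixel offset and derives a single contrast feature from it. It assumes the image is a list of rows of integer gray levels in 0..levels − 1 and that at least one pixel pair exists for the chosen offset. The function names are illustrative.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence counts of gray levels (i, j) at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Contrast feature: expected squared gray-level difference of a pixel pair."""
    n = len(p)
    return sum(((i - j) ** 2) * p[i][j] for i in range(n) for j in range(n))

if __name__ == "__main__":
    flat = [[1, 1], [1, 1]]
    checker = [[0, 1], [1, 0]]
    print(contrast(glcm(flat, 2)), contrast(glcm(checker, 2)))
```

A full descriptor would repeat this over several offsets and add further statistics, which is where the computation cost noted in the table comes from.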
  • 40. Spectral Texture: Comparisons (Cont.)
      Texture method | Pros | Cons
      FT/DCT | Fast computation | Sensitive to scale and rotation
      Wavelet | Fast computation, multi-resolution | Sensitive to rotation, limited orientations
      Gabor | Multi-scale, multi-orientation, robust | Need rotation normalisation, loss of spectral information due to incomplete cover of the spectrum plane
      Curvelet | Multi-resolution, multi-orientation, robust | Need rotation normalisation
  • 41. Shape. Chart source: [Zhang and Lu 2004]
  • 42. Shape (Cont.) Chart source: [M. Yang, K. Kpalma, J. Ronsin 2008]
  • 43. Shape (Cont.) • Contour based: calculate shape features only from the boundary of the shape. • Region based: extract features from the entire region.
  • 44. Shape (Cont.) • Because contour-based techniques use only a portion of the region, they are more sensitive to noise than region-based techniques. • Therefore, color image retrieval usually employs region-based shape features.
  • 45. Learning Methods • SVM • ANN • Tree • Parametric • Non-Parametric
  • 46. Learning Methods: Comparisons
      Annotation method | Pros | Cons
      SVM | Small sample, optimal class boundary, non-linear classification | Single labelling, one class per time, expensive trial and run, sensitive to noisy data, prone to over-fitting
      ANN | Multiclass outputs, non-linear classification, robust to noisy data, suitable for complex problems | Single labelling, sub-optimal, expensive training, complex and black-box classification
      DT | Intuitive, semantic rules, multiclass outputs, fast, allows missing values, handles both categorical and numerical values | Single labelling, sub-optimal, needs pruning, can be unstable
  • 47. Learning Methods: Comparisons (Cont.)
      Annotation method | Pros | Cons
      Non-parametric | Multi-labelling, model free, fast | Large number of parameters, large sample, sensitive to noisy data
      Parametric | Multi-labelling, small sample, good approximation of unknown distribution | Predefined distribution, expensive training, approximated boundary
      Metadata | Use of both textual and visual features | Difficult to relate visual features with textual features, difficult textual feature extraction
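
The non-parametric row (multi-labelling, model free) can be illustrated with a nearest-neighbour tag-transfer sketch. It is a simplified stand-in rather than any specific published method: each training image is assumed to be a feature vector with a list of tags, distance is plain Euclidean, and the query receives the tags that occur most often among its k nearest neighbours. All names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist

def knn_annotate(query, train_features, train_tags, k=3, n_tags=2):
    """Return the n_tags most frequent tags among the k nearest training images.

    query: feature vector of the image to annotate.
    train_features: list of feature vectors.
    train_tags: list of tag lists, aligned with train_features.
    """
    order = sorted(
        range(len(train_features)),
        key=lambda i: dist(query, train_features[i]),
    )
    votes = Counter()
    for i in order[:k]:
        votes.update(train_tags[i])
    return [tag for tag, _ in votes.most_common(n_tags)]

if __name__ == "__main__":
    features = [[0, 0], [0, 1], [5, 5], [6, 5]]
    tags = [["sky", "sea"], ["sky", "beach"], ["car", "road"], ["car", "street"]]
    print(knn_annotate([0.1, 0.2], features, tags, k=2, n_tags=1))
```

No model is fitted, which is why the approach is fast to set up, but every prediction scans the whole training set and a few mislabelled neighbours can change the vote, in line with the "large sample" and "sensitive to noisy data" entries.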
  • 48. Deep Learning • Deep belief networks • Deep Boltzmann machines • Deep convolutional neural networks • Deep recurrent neural networks • Hierarchical temporal memory. Source: https://en.wikipedia.org/wiki/List_of_machine_learning_concepts
  • 49. Deep Learning (Cont.) Source: Ranzato, 4 October 2013, slides
  • 50. Deep Learning (Cont.) • A potential problem with deep learning*? • Optimization task • See: Bengio's articles • Popular videos about deep learning on YouTube • Ranzato, 4 October 2013: https://www.youtube.com/watch?v=clgMTk5V2Sk. *Ranzato, 4 October 2013, slides
  • 52. Useful Information: Recent Articles. Figure label: 2009, Shallow. Source: Venkatesh N. Murthy, S. Maji, R. Manmatha, Automatic Image Annotation using Deep Learning Representations, 2015
  • 53. Which one? 1. Theoretical Limitations of Shallow Architectures 2. Theoretical Advantages of Deep Architectures
  • 54. Useful Information: Recent Articles (Cont.) Source: B. Klein, G. Lev, G. Sadeh, and L. Wolf, Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation, 2015
  • 55. Useful Information: Toolbox • MatConvNet: a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs. Several example CNNs are included to classify and encode images. • Caffe: a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.
  • 56. Useful Information: Databases • Corel5k: an important benchmark for keyword-based image retrieval and image annotation; 5000 images manually annotated with 1 to 5 keywords. The vocabulary contains 260 words. • ESP Game: this data set is obtained from an online game where two players, who cannot communicate outside the game, gain points by agreeing on words describing the image. • IAPR TC12: this set of 20,000 images accompanied by descriptions in several languages was initially published for cross-lingual retrieval.
  • 57. Useful Information: Databases (Cont.) • Other databases: Flickr8, 10, 30. Table source: M. Guillaumin, T. Mensink, J. Verbeek and C. Schmid, TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation
  • 58. Useful Information: Authors • Cordelia Schmid: Research director, INRIA; computer vision, object recognition, video recognition, learning. • Li Fei-Fei: Professor, Stanford University; artificial intelligence, machine learning, computer vision, neuroscience. • Yoshua Bengio: Professor, U. Montreal, Computer Sc.; machine learning, deep learning, artificial intelligence. Reported by: Google Scholar
  • 59. Useful Information: Authors (Cont.) • Richard Socher: MetaMind; deep learning, machine learning, natural language processing, computer vision. Recursive Deep Learning for Natural Language Processing and Computer Vision, PhD Thesis, Computer Science Department, Stanford University, 2014; Arthur L. Samuel Best Computer Science PhD Thesis Award. Reported by: Google Scholar
  • 61. How do we assign a word to an image? What are the components of an automatic image annotation system? • Feature Extraction • Classification Methods • Pattern Recognition! Recap: Components of AIA • Classical or Shallow Structure • Issues • Conclusions!
  • 62. Conclusions (Cont.) 1. High-dimensional feature analysis. 2. How to build an effective annotation model? 3. Annotation and ranking are currently done online and simultaneously in the multiple-labelling annotation approaches, which is not efficient for image retrieval. 4. Lack of a standard vocabulary and taxonomy. 5. There is no commonly accepted image database. 6. Insufficient depth of architectures, and locality of estimators [Bengio, 2009]. Picture source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning. Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013

Editor's Notes

  1. So the major part of an AIA system follows pattern recognition (PR) structures, which is why studying the structures used in PR helps us. More importantly, the main cause of AIA's problems can be sought in these same structures.
  2. How deep should we follow this tree? For example, for an image of a flower? Or an image of a busy intersection? How many levels should we go?
  3. An example to clarify "function" and "computational element" is a logic circuit: the output is the simplified form of the circuit, and each gate represents one computational element. An AI example can … next page.
  4. But what does being deep mean? Do we say, for example, that anything beyond depth 10 counts as deep? What do you think? Now back to that sentence: we do not have a specific number in mind, because it depends on the problem. Our point is that if the target function can be represented compactly with depth k … And Zisserman's paper says that the greater the depth, the better the result, but we have to see how much better it gets and whether it is worth it.
  5. Also raise the point that "shallow structure" is a better term than "classic", if you want. Do not give the answer; say that one should read Bengio's papers more thoroughly to understand the reason for this term. For now, ask the audience to think about it, and I will give my own opinion in the final conclusion, based on what I have read.
  6. Because contour based techniques use only a portion of the region, they are more sensitive to noise than region based techniques.
  7. Here we describe the general structure of deep models. To compare deep with classic approaches, and deep models with each other, the next slide shows the results of one of the 2015 papers.
  8. "And locality of estimators": another problem that deep learning has solved. Also explain why we focused on this problem and not the others (make a slide): because all AIA papers refer to the semantic gap. Then return to the question of whether classic approaches have been set aside entirely.