SlideShare a Scribd company logo
1 of 41
Bag of Words
Slide from Andrew Zisserman 
Sivic & Zisserman, ICCV 2003
AAnnaallooggyy ttoo ddooccuummeennttss 
Of all the sensory impressions proceeding to the 
brain, the visual experiences are the dominant 
ones. Our perception of the world around us is 
based essentially on the messages that reach the 
brain from our eyes. sensory, For a long brain, 
time it was 
thought that the visual, retinal image perception, 
was transmitted 
point by point to visual centers in the brain; the 
cerebral cortex retinal, was a movie cerebral screen, cortex, 
so to speak, 
upon which the eye, image cell, in the eye optical 
was projected. 
Through the discoveries of Hubel and Wiesel we 
now know that behind nerve, the origin image 
of the visual 
perception in the brain there is a considerably 
more complicated Hubel, course of events. Wiesel 
By following 
the visual impulses along their path to the various 
cell layers of the optical cortex, Hubel and Wiesel 
have been able to demonstrate that the message 
about the image falling on the retina undergoes a 
step-wise analysis in a system of nerve cells 
stored in columns. In this system each cell has its 
specific function and is responsible for a specific 
detail in the pattern of the retinal image. 
China is forecasting a trade surplus of $90bn 
(£51bn) to $100bn this year, a threefold increase 
on 2004's $32bn. The Commerce Ministry said 
the surplus would be created by a predicted 30% 
jump in exports to $750bn, compared with a 18% 
rise in imports to China, $660bn. The trade, 
figures are likely to 
further annoy surplus, the US, which commerce, 
has long argued that 
China's exports exports, are unfairly helped by a 
deliberately undervalued imports, yuan. Beijing US, 
agrees the 
surplus is too yuan, high, but bank, says the domestic, 
yuan is only one 
factor. Bank of foreign, China governor increase, 
Zhou Xiaochuan 
said the country also needed to do more to boost 
domestic demand so trade, more goods value 
stayed within the 
country. China increased the value of the yuan 
against the dollar by 2.1% in July and permitted it 
to trade within a narrow band, but the US wants 
the yuan to be allowed to trade freely. However, 
Beijing has made it clear that it will take its time 
and tread carefully before allowing the yuan to 
rise further in value. 
ICCV 2005 short course, L. Fei-Fei
Visual words: main idea 
 Extract some local features from a number of images … 
e.g., SIFT descriptor 
space: each point is 128- 
dimensional 
Slide credit: D. Nister
Visual words: main idea 
Slide credit: D. Nister
Visual words: main idea 
Slide credit: D. Nister
Visual words: main idea 
Slide credit: D. Nister
Each point is a local 
descriptor, e.g. SIFT 
vector. 
Slide credit: D. Nister
Slide credit: D. Nister
Visual words 
 Example: each 
group of patches 
belongs to the 
same visual word 
Figure from Sivic & Zisserman, ICCV 2003
Visual words 
11 
• More recently used for describing scenes and 
objects for the sake of indexing or classification. 
Sivic & Zisserman 2003; 
Csurka, Bray, Dance, & Fan 
2004; many others. 
Source credit: K. Grauman, B. Leibe
OObbjjeecctt BBaagg ooff ‘‘wwoorrddss’’ 
ICCV 2005 short course, L. Fei-Fei
Bags of visual words 
 Summarize entire image 
based on its distribution 
(histogram) of word 
occurrences. 
 Analogous to bag of words 
representation commonly used 
for documents. 
14 
Image credit: Fei-Fei Li
Similarly, bags of textons for texture 
representation 
histogram 
Universal texton dictionary 
Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, 
Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 
Source: Lana Lazebnik
Comparing bags of words 
 Rank frames by normalized scalar product between their 
(possibly weighted) occurrence counts---nearest 
neighbor search for similar images. 
[1 8 1 4] [5 1 1 0] 
 q 
j d
Inverted file index for images 
comprised of visual words 
Image credit: A. Zisserman 
Word 
number 
List of image 
numbers 
When will this give us a significant gain in efficiency?
Indexing local features: inverted file index 
 For text documents, 
an efficient way to find 
all pages on which a 
word occurs is to use 
an index… 
 We want to find all 
images in which a 
feature occurs. 
 We need to index 
each feature by the 
image it appears and 
also we keep the # of 
occurrence. 
Source credit : K. Grauman, B. Leibe
tf-idf weighting 
 Term frequency – inverse document frequency 
 Describe frame by frequency of each word within it, 
downweight words that appear often in the database 
 (Standard weighting for text retrieval) 
Total number of 
words in database 
Number of 
occurrences of 
word i in whole 
database 
Number of 
occurrences of 
word i in 
document d 
Number of 
words in 
document d
Bags of words for content-based image 
What if query of interest is a portion of a frame? 
Slide from Andrew Zisserman 
Sivic & Zisserman, ICCV 2003 
retrieval
Video Google System 
1. Collect all words within 
query region 
2. Inverted file index to find 
relevant frames 
3. Compare word counts 
4. Spatial verification 
Sivic & Zisserman, ICCV 2003 
 Demo online at : 
http://www.robots.ox.ac.uk/~vgg/rese 
arch/vgoogle/index.html 
21 
Query 
region 
Retrieved frames
22 
 Collecting words within a query region 
Query region: 
pull out only the SIFT 
descriptors whose 
positions are within the 
polygon
Bag of words representation: 
spatial info 
 A bag of words is an orderless representation: 
throwing out spatial relationships between features 
 Middle ground: 
 Visual “phrases” : frequently co-occurring words 
 Semi-local features : describe configuration, 
neighborhood 
 Let position be part of each feature 
 Count bags of words only within sub-grids of an 
image 
 After matching, verify spatial consistency (e.g., look 
at neighbors – are they the same too?)
Visual vocabulary formation 
Issues: 
 Sampling strategy: where to extract features? 
 Clustering / quantization algorithm 
 Unsupervised vs. supervised 
 What corpus provides features (universal 
vocabulary?) 
 Vocabulary size, number of words 
27 
K. Grauman, B. Leibe
Sampling strategies 
28 
Sparse, at Dense, uniformly 
interest points 
Image credits: F-F. Li, E. Nowak, J. Sivic 
Randomly 
Multiple interest 
operators 
• To find specific, textured objects, sparse 
sampling from interest points often more 
reliable. 
• Multiple complementary interest operators 
offer more image coverage. 
• For object categorization, dense sampling 
offers better coverage. 
[See Nowak, Jurie & Triggs, ECCV 2006]
[Nister & Stewenius, CVPR’06] 
29 
Example: Recognition with Vocabulary Tree 
 Tree construction: 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
30 
Vocabulary Tree 
 Training: Filling the tree 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
31 
Vocabulary Tree 
 Training: Filling the tree 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
32 
Vocabulary Tree 
 Training: Filling the tree 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
33 
Vocabulary Tree 
 Training: Filling the tree 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
34 
Vocabulary Tree 
 Training: Filling the tree 
Slide credit: David Nister
[Nister & Stewenius, CVPR’06] 
35 
Vocabulary Tree 
 Recognition 
Slide credit: David Nister 
RANSAC 
verification
36 
Vocabulary Tree: Performance 
 Evaluated on large databases 
 Indexing with up to 1M images 
 Online recognition for database 
of 50,000 CD covers 
 Retrieval in ~1s 
 Find experimentally that large 
vocabularies can be beneficial for 
recognition 
[Nister & Stewenius, CVPR’06]
Larger vocabularies 
can be 
advantageous… 
But what happens if it 
is too large?
Bags of features for object recognition 
face, flowers, building 
 Works pretty well for image-level classification 
Source credit : K. Grauman, B. Leibe Source: Lana Lazebnik
Bags of features for object recognition 
Caltech6 dataset 
bag of features bag of features Parts-and-shape model 
Source: Lana Lazebnik
Bags of words: pros and cons 
+ flexible to geometry / deformations / viewpoint 
+ compact summary of image content 
+ provides vector representation for sets 
+ has yielded good recognition results in practice 
- basic model ignores geometry – must verify 
afterwards, or encode via features 
- background and foreground mixed when bag covers 
whole image 
- interest points or sampling: no guarantee to capture 
object-level parts 
- optimal vocabulary formation remains unclear 
40 
Source credit : K. Grauman, B. Leibe
Summary 
 Local invariant features: distinctive matches possible in 
spite of significant view change, useful not only to 
provide matches for multi-view geometry, but also to find 
objects and scenes. 
 To find correspondences among detected features, 
measure distance between descriptors, and look for 
most similar patches. 
 Bag of words representation: quantize feature space to 
make discrete set of visual words 
 Summarize image by distribution of words 
 Index individual words 
 Inverted index: pre-compute index to enable faster 
search at query time 
Source credit : K. Grauman, B. Leibe

More Related Content

Similar to Bagwords

Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Symeon Papadopoulos
 
Vocabulary length experiments for binary image classification using bov approach
Vocabulary length experiments for binary image classification using bov approachVocabulary length experiments for binary image classification using bov approach
Vocabulary length experiments for binary image classification using bov approachsipij
 
Cvpr2007 tutorial bag_of_words
Cvpr2007 tutorial bag_of_wordsCvpr2007 tutorial bag_of_words
Cvpr2007 tutorial bag_of_wordsBo Li
 
Lec18 bag of_features
Lec18 bag of_featuresLec18 bag of_features
Lec18 bag of_featuresBo Li
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsRoelof Pieters
 
12 cie552 object_recognition
12 cie552 object_recognition12 cie552 object_recognition
12 cie552 object_recognitionElsayed Hemayed
 
SOC2002 Lecture 13
SOC2002 Lecture 13SOC2002 Lecture 13
SOC2002 Lecture 13Bonnie Green
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsRoelof Pieters
 
pang_paper.pdf
pang_paper.pdfpang_paper.pdf
pang_paper.pdfAhmed26851
 
bag-of-words models
bag-of-words models bag-of-words models
bag-of-words models Xiaotao Zou
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachCSCJournals
 
David Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AIDavid Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AIBayes Nets meetup London
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for RetrievalBhaskar Mitra
 
Introduction of object oriented analysis & design by sarmad baloch
Introduction of object oriented analysis & design by sarmad balochIntroduction of object oriented analysis & design by sarmad baloch
Introduction of object oriented analysis & design by sarmad balochSarmad Baloch
 
M01_OO_Intro.ppt
M01_OO_Intro.pptM01_OO_Intro.ppt
M01_OO_Intro.pptRAJESH S
 

Similar to Bagwords (20)

Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
 
Vocabulary length experiments for binary image classification using bov approach
Vocabulary length experiments for binary image classification using bov approachVocabulary length experiments for binary image classification using bov approach
Vocabulary length experiments for binary image classification using bov approach
 
Cvpr2007 tutorial bag_of_words
Cvpr2007 tutorial bag_of_wordsCvpr2007 tutorial bag_of_words
Cvpr2007 tutorial bag_of_words
 
Video google-text (2)
Video google-text (2)Video google-text (2)
Video google-text (2)
 
Lec18 bag of_features
Lec18 bag of_featuresLec18 bag of_features
Lec18 bag of_features
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
12 cie552 object_recognition
12 cie552 object_recognition12 cie552 object_recognition
12 cie552 object_recognition
 
SOC2002 Lecture 13
SOC2002 Lecture 13SOC2002 Lecture 13
SOC2002 Lecture 13
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
Convolutional Features for Instance Search
Convolutional Features for Instance SearchConvolutional Features for Instance Search
Convolutional Features for Instance Search
 
Video+Language: From Classification to Description
Video+Language: From Classification to DescriptionVideo+Language: From Classification to Description
Video+Language: From Classification to Description
 
pang_paper.pdf
pang_paper.pdfpang_paper.pdf
pang_paper.pdf
 
bag-of-words models
bag-of-words models bag-of-words models
bag-of-words models
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
David Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AIDavid Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AI
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for Retrieval
 
Introduction of object oriented analysis & design by sarmad baloch
Introduction of object oriented analysis & design by sarmad balochIntroduction of object oriented analysis & design by sarmad baloch
Introduction of object oriented analysis & design by sarmad baloch
 
M01_OO_Intro.ppt
M01_OO_Intro.pptM01_OO_Intro.ppt
M01_OO_Intro.ppt
 
Video + Language 2019
Video + Language 2019Video + Language 2019
Video + Language 2019
 
Video + Language
Video + LanguageVideo + Language
Video + Language
 

More from mustafa sarac

Uluslararasilasma son
Uluslararasilasma sonUluslararasilasma son
Uluslararasilasma sonmustafa sarac
 
Real time machine learning proposers day v3
Real time machine learning proposers day v3Real time machine learning proposers day v3
Real time machine learning proposers day v3mustafa sarac
 
Latka december digital
Latka december digitalLatka december digital
Latka december digitalmustafa sarac
 
Axial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualAxial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualmustafa sarac
 
Array programming with Numpy
Array programming with NumpyArray programming with Numpy
Array programming with Numpymustafa sarac
 
Math for programmers
Math for programmersMath for programmers
Math for programmersmustafa sarac
 
TEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizTEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizmustafa sarac
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?mustafa sarac
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mimustafa sarac
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?mustafa sarac
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Marketsmustafa sarac
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimimustafa sarac
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0mustafa sarac
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tshmustafa sarac
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008mustafa sarac
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guidemustafa sarac
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020mustafa sarac
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dicemustafa sarac
 

More from mustafa sarac (20)

Uluslararasilasma son
Uluslararasilasma sonUluslararasilasma son
Uluslararasilasma son
 
Real time machine learning proposers day v3
Real time machine learning proposers day v3Real time machine learning proposers day v3
Real time machine learning proposers day v3
 
Latka december digital
Latka december digitalLatka december digital
Latka december digital
 
Axial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualAxial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manual
 
Array programming with Numpy
Array programming with NumpyArray programming with Numpy
Array programming with Numpy
 
Math for programmers
Math for programmersMath for programmers
Math for programmers
 
The book of Why
The book of WhyThe book of Why
The book of Why
 
BM sgk meslek kodu
BM sgk meslek koduBM sgk meslek kodu
BM sgk meslek kodu
 
TEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizTEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimiz
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mi
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Markets
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimi
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tsh
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guide
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dice
 

Recently uploaded

chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 

Recently uploaded (20)

🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 

Bagwords

  • 2. Slide from Andrew Zisserman Sivic & Zisserman, ICCV 2003
  • 3. AAnnaallooggyy ttoo ddooccuummeennttss Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. sensory, For a long brain, time it was thought that the visual, retinal image perception, was transmitted point by point to visual centers in the brain; the cerebral cortex retinal, was a movie cerebral screen, cortex, so to speak, upon which the eye, image cell, in the eye optical was projected. Through the discoveries of Hubel and Wiesel we now know that behind nerve, the origin image of the visual perception in the brain there is a considerably more complicated Hubel, course of events. Wiesel By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image. China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to China, $660bn. The trade, figures are likely to further annoy surplus, the US, which commerce, has long argued that China's exports exports, are unfairly helped by a deliberately undervalued imports, yuan. Beijing US, agrees the surplus is too yuan, high, but bank, says the domestic, yuan is only one factor. Bank of foreign, China governor increase, Zhou Xiaochuan said the country also needed to do more to boost domestic demand so trade, more goods value stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value. ICCV 2005 short course, L. Fei-Fei
  • 4. Visual words: main idea  Extract some local features from a number of images … e.g., SIFT descriptor space: each point is 128- dimensional Slide credit: D. Nister
  • 5. Visual words: main idea Slide credit: D. Nister
  • 6. Visual words: main idea Slide credit: D. Nister
  • 7. Visual words: main idea Slide credit: D. Nister
  • 8. Each point is a local descriptor, e.g. SIFT vector. Slide credit: D. Nister
  • 10. Visual words  Example: each group of patches belongs to the same visual word Figure from Sivic & Zisserman, ICCV 2003
  • 11. Visual words 11 • More recently used for describing scenes and objects for the sake of indexing or classification. Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others. Source credit: K. Grauman, B. Leibe
  • 12. OObbjjeecctt BBaagg ooff ‘‘wwoorrddss’’ ICCV 2005 short course, L. Fei-Fei
  • 13.
  • 14. Bags of visual words  Summarize entire image based on its distribution (histogram) of word occurrences.  Analogous to bag of words representation commonly used for documents. 14 Image credit: Fei-Fei Li
  • 15. Similarly, bags of textons for texture representation histogram Universal texton dictionary Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, Source: Lana Lazebnik
  • 16. Comparing bags of words  Rank frames by normalized scalar product between their (possibly weighted) occurrence counts---nearest neighbor search for similar images. [1 8 1 4] [5 1 1 0]  q j d
  • 17. Inverted file index for images comprised of visual words Image credit: A. Zisserman Word number List of image numbers When will this give us a significant gain in efficiency?
  • 18. Indexing local features: inverted file index  For text documents, an efficient way to find all pages on which a word occurs is to use an index…  We want to find all images in which a feature occurs.  We need to index each feature by the image it appears and also we keep the # of occurrence. Source credit : K. Grauman, B. Leibe
  • 19. tf-idf weighting  Term frequency – inverse document frequency  Describe frame by frequency of each word within it, downweight words that appear often in the database  (Standard weighting for text retrieval) Total number of words in database Number of occurrences of word i in whole database Number of occurrences of word i in document d Number of words in document d
  • 20. Bags of words for content-based image What if query of interest is a portion of a frame? Slide from Andrew Zisserman Sivic & Zisserman, ICCV 2003 retrieval
  • 21. Video Google System 1. Collect all words within query region 2. Inverted file index to find relevant frames 3. Compare word counts 4. Spatial verification Sivic & Zisserman, ICCV 2003  Demo online at : http://www.robots.ox.ac.uk/~vgg/rese arch/vgoogle/index.html 21 Query region Retrieved frames
  • 22. 22  Collecting words within a query region Query region: pull out only the SIFT descriptors whose positions are within the polygon
  • 23.
  • 24.
  • 25.
  • 26. Bag of words representation: spatial info  A bag of words is an orderless representation: throwing out spatial relationships between features  Middle ground:  Visual “phrases” : frequently co-occurring words  Semi-local features : describe configuration, neighborhood  Let position be part of each feature  Count bags of words only within sub-grids of an image  After matching, verify spatial consistency (e.g., look at neighbors – are they the same too?)
  • 27. Visual vocabulary formation Issues:  Sampling strategy: where to extract features?  Clustering / quantization algorithm  Unsupervised vs. supervised  What corpus provides features (universal vocabulary?)  Vocabulary size, number of words 27 K. Grauman, B. Leibe
  • 28. Sampling strategies 28 Sparse, at Dense, uniformly interest points Image credits: F-F. Li, E. Nowak, J. Sivic Randomly Multiple interest operators • To find specific, textured objects, sparse sampling from interest points often more reliable. • Multiple complementary interest operators offer more image coverage. • For object categorization, dense sampling offers better coverage. [See Nowak, Jurie & Triggs, ECCV 2006]
  • 29. [Nister & Stewenius, CVPR’06] 29 Example: Recognition with Vocabulary Tree  Tree construction: Slide credit: David Nister
  • 30. [Nister & Stewenius, CVPR’06] 30 Vocabulary Tree  Training: Filling the tree Slide credit: David Nister
  • 31. [Nister & Stewenius, CVPR’06] 31 Vocabulary Tree  Training: Filling the tree Slide credit: David Nister
  • 32. [Nister & Stewenius, CVPR’06] 32 Vocabulary Tree  Training: Filling the tree Slide credit: David Nister
  • 33. [Nister & Stewenius, CVPR’06] 33 Vocabulary Tree  Training: Filling the tree Slide credit: David Nister
  • 34. [Nister & Stewenius, CVPR’06] 34 Vocabulary Tree  Training: Filling the tree Slide credit: David Nister
  • 35. [Nister & Stewenius, CVPR’06] 35 Vocabulary Tree  Recognition Slide credit: David Nister RANSAC verification
  • 36. 36 Vocabulary Tree: Performance  Evaluated on large databases  Indexing with up to 1M images  Online recognition for database of 50,000 CD covers  Retrieval in ~1s  Find experimentally that large vocabularies can be beneficial for recognition [Nister & Stewenius, CVPR’06]
  • 37. Larger vocabularies can be advantageous… But what happens if it is too large?
  • 38. Bags of features for object recognition face, flowers, building  Works pretty well for image-level classification Source credit : K. Grauman, B. Leibe Source: Lana Lazebnik
  • 39. Bags of features for object recognition Caltech6 dataset bag of features bag of features Parts-and-shape model Source: Lana Lazebnik
  • 40. Bags of words: pros and cons + flexible to geometry / deformations / viewpoint + compact summary of image content + provides vector representation for sets + has yielded good recognition results in practice - basic model ignores geometry – must verify afterwards, or encode via features - background and foreground mixed when bag covers whole image - interest points or sampling: no guarantee to capture object-level parts - optimal vocabulary formation remains unclear 40 Source credit : K. Grauman, B. Leibe
  • 41. Summary  Local invariant features: distinctive matches possible in spite of significant view change, useful not only to provide matches for multi-view geometry, but also to find objects and scenes.  To find correspondences among detected features, measure distance between descriptors, and look for most similar patches.  Bag of words representation: quantize feature space to make discrete set of visual words  Summarize image by distribution of words  Index individual words  Inverted index: pre-compute index to enable faster search at query time Source credit : K. Grauman, B. Leibe