Journal Club
Visualizing Object Detection Features
Vondrick, C. et al.
In Proc. International Conference on Computer Vision 2013

2015/03/26 shouno@uec.ac.jp
Overview
•  A typical object detection pipeline combines
   –  feature extraction (HOG, SIFT, etc.)
   –  a classifier (SVM, log. regression, etc.)
•  This paper visualizes what the feature space itself "sees"
   by inverting the features back into natural images
•  Target feature: HOG
Running example: "Car" detection with HOG + SVM
HOG (Histogram of Oriented Gradients)
•  Dalal & Triggs (2005)
   –  Gradient-orientation histogram features, originally used with a
      linear SVM for pedestrian detection
•  For each cell, accumulate a histogram v of gradient orientations,
   weighted by gradient magnitude
•  1 block = 3 cells × 3 cells; each block vector is L2-normalized,
   and the cell/block histograms are concatenated into the descriptor
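The per-cell histogram plus block normalization above can be sketched in a few lines of NumPy (a toy illustration only, not the Dalal-Triggs reference implementation; `hog_toy` and its parameter defaults are names chosen here for the sketch):

```python
import numpy as np

def hog_toy(img, cell=8, bins=9, block=3):
    """Minimal HOG sketch: per-cell orientation histograms,
    then L2 normalization over overlapping block x block cell groups."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)      # unsigned orientation in [0, pi)
    H, W = img.shape
    ch, cw = H // cell, W // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            idx = np.minimum((a / np.pi * bins).astype(int), bins - 1)
            for b in range(bins):
                hist[i, j, b] = m[idx == b].sum()
    feats = []
    for i in range(ch - block + 1):
        for j in range(cw - block + 1):
            v = hist[i:i+block, j:j+block].ravel()
            feats.append(v / (np.linalg.norm(v) + 1e-6))  # L2 block norm
    return np.concatenate(feats)
```

On a 32×32 input with these defaults this yields a 2×2 grid of 3×3-cell blocks, i.e. 4 × 81 = 324 features.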
Example: "Car" templates and their HOG features visualized via the HOG glyph
Method: Paired Dictionary

Write the HOG feature x as a sparse combination over a dictionary
U = [u_1, ..., u_M]:

    x = α_1 u_1 + α_2 u_2 + ... + α_M u_M

and the image y over a paired dictionary V = [v_1, ..., v_M] that
shares the same coefficients:

    y = α_1 v_1 + α_2 v_2 + ... + α_M v_M

The inversion is then

    ŷ = f(x) = V α̂,  where  α̂ = arg min_α ||x − U α||²₂  s.t.  ||α||₁ ≤ λ
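The inversion ŷ = V α̂ can be sketched with a simple proximal-gradient (ISTA) solver for the penalized form of the sparse coding problem (a minimal stand-in for the paper's actual solver; `sparse_code` and `invert_hog` are illustrative names, and the penalty weight `lam` plays the role of the constraint bound λ):

```python
import numpy as np

def sparse_code(x, U, lam=0.1, steps=200):
    """Estimate alpha = argmin 0.5*||x - U alpha||_2^2 + lam*||alpha||_1
    via ISTA (proximal gradient descent)."""
    L = np.linalg.norm(U, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(U.shape[1])
    for _ in range(steps):
        g = U.T @ (U @ a - x)              # gradient of the quadratic term
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return a

def invert_hog(x, U, V, lam=0.1):
    """Paired-dictionary inversion: code x over U, decode with V."""
    return V @ sparse_code(x, U, lam)
```

With paired dictionaries in hand, inverting a HOG vector is just one sparse-coding solve followed by a matrix product.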
The dictionaries U and V are learned jointly by paired dictionary learning.
Comparison with the HOG glyph:
Figure 3: We visualize some high scoring detections from the deformable parts model. Can you guess which are false alarms? Take a minute to study this figure.
Figure 4: In this paper, we present algorithms to visualize HOG features.
(a) Human Vision (b) HOG Vision
Figure 13: HOG inversion reveals the world that object detectors see. The left shows a man standing in a dark room.
Invert HOG demo code: http://web.mit.edu/vondrick/ihog/
Effect of HOG template (cell) size
40 × 40   20 × 20   10 × 10   5 × 5
Figure 12: Our inversion algorithms are sensitive to the HOG template size. We show how performance degrades as the template becomes smaller.
Original  ELDA  Ridge  Direct  PairDict
Figure 8: We show results for all four of our inversion algorithms on held out image patches on similar dimensions common for object detection.
The four inversion algorithms compared:
•  ELDA: train an exemplar LDA detector on the HOG feature and average its top K retrieved patches
•  Ridge: ridge regression from the HOG feature to the image
•  Direct: directly learn a single basis U of Gabor-like filters mapping HOG to images
•  PairedDict: the paired dictionary learning method above
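The Ridge baseline admits a closed form. A minimal sketch, assuming HOG features X and images Y are stored column-wise (the function name and regularizer value are assumptions of this sketch):

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Ridge-regression baseline: learn a linear map W with y ~ W x,
    W = Y X^T (X X^T + lam I)^{-1}.
    X: (D, N) HOG features, Y: (d, N) image patches, column per sample."""
    D = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(D))
```

A new HOG vector x is then inverted simply as `W @ x`.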
Category ELDA Ridge Direct PairDict
bicycle 0.452 0.577 0.513 0.561
bottle 0.697 0.683 0.660 0.671
car 0.668 0.677 0.652 0.639
cat 0.749 0.712 0.687 0.705
chair 0.660 0.621 0.604 0.617
table 0.656 0.617 0.582 0.614
motorbike 0.573 0.617 0.549 0.592
person 0.696 0.667 0.646 0.646
Mean 0.671 0.656 0.620 0.637
Table 1: We evaluate the performance of our inversion algorithm by comparing the inverse to the ground truth image using the mean normalized cross correlation. Higher is better; a score of 1 is perfect. See supplemental for full table.
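The normalized cross correlation used as the score above can be sketched as follows (one plausible definition; the paper's exact normalization and patch averaging may differ):

```python
import numpy as np

def mean_ncc(a, b):
    """Normalized cross-correlation between two arrays:
    standardize both, then take the mean elementwise product.
    Identical inputs score 1.0."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))
```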
(´・ω・`)
Evaluated on 8 categories of PASCAL VOC 2011
•  Evaluation by Mechanical Turks (or Expert)
Machine pipeline: Image → HOG inversion → Classifier
Human pipeline:   Image → HOG inversion → Human
Figure 6: Inverting HOG using paired dictionary learning.
We first project the HOG vector on to a HOG basis. By
jointly learning a coupled basis of HOG features and natural
images, we then transfer the coefficients to the image basis
to recover the natural image.
Figure 7: Some pairs of dictionaries for U and V. The left of every pair is the gray scale dictionary element and the right is the positive components elements in the HOG dictionary. Notice the correlation between dictionaries.
Suppose we write x and y in terms of bases U ∈ R^{D×K} and V ∈ R^{d×K} respectively, but with shared coefficients α ∈ R^K:

    x = U α  and  y = V α        (5)
The key observation is that inversion can be obtained by first projecting x onto the HOG basis U, then transferring the coefficients α to the image basis V.
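Learning U and V with shared coefficients can be sketched by running dictionary learning on the concatenated vectors z = [x; y] and splitting the learned dictionary (a rough alternating-least-squares stand-in for the paper's sparse dictionary learner; the function name and the ridge-style shrinkage in the coding step are assumptions of this sketch):

```python
import numpy as np

def learn_paired_dicts(X, Y, K=8, iters=20, lam=0.1, seed=0):
    """Sketch of paired dictionary learning: fit one dictionary W to the
    stacked data z = [x; y] so that U and V share coefficients alpha,
    then split W into its HOG part U and image part V."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])                       # (D + d, N)
    W = rng.standard_normal((Z.shape[0], K))
    W /= np.linalg.norm(W, axis=0)
    for _ in range(iters):
        # coding step: ridge shrinkage as a cheap stand-in for the L1 solve
        A = np.linalg.solve(W.T @ W + lam * np.eye(K), W.T @ Z)
        # dictionary update step (least squares), then renormalize atoms
        W = Z @ A.T @ np.linalg.inv(A @ A.T + 1e-8 * np.eye(K))
        W /= np.linalg.norm(W, axis=0) + 1e-12
    D = X.shape[0]
    return W[:D], W[D:]                         # U, V
```

The split works because each atom of W is a vertically stacked pair (u_k; v_k) tied to the same coefficient.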
It's 'Bird'
Let's try Mechanical Turk
Figure 15: We visualize a few deformable parts models trained with our visualization. First row: car, person, bottle, bicycle, motorbike, potted plant. In the right most visualizations, we also included the HOG glyph.
HOGgles: Visualizing Object Detection Features
Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba
Massachusetts Institute of Technology
Figure 1: An image from PASCAL and a high scoring car
detection from DPM [8]. Why did the detector fail?
Figure 2: We show the crop for the false car detection from
Figure 1. On the right, we show our visualization of the
HOG features for the same patch. Our visualization reveals
that this false alarm actually looks like a car in HOG space.
Figure 2 shows the output from our visualization on the features for the false car detection. This visualization reveals that, while there are clearly no cars in the original image, there is a car hiding in the HOG descriptor. HOG features see a slightly different visual world than what we see, and by visualizing this space, we can gain a more intuitive understanding of our object detectors.
Figure 3 inverts more top detections on PASCAL for a few categories. Can you guess which are false alarms? Take a minute to study the figure since the next sentence might ruin the surprise. Although every visualization looks like a true positive, many of these detections are actually false alarms.
MTurk workers classified HOG inversions of PASCAL VOC 2011 images.
Category ELDA Ridge Direct PairDict Glyph Expert
bicycle 0.327 0.127 0.362 0.307 0.405 0.438
bottle 0.269 0.282 0.283 0.446 0.312 0.222
car 0.397 0.457 0.617 0.585 0.359 0.389
cat 0.219 0.178 0.381 0.199 0.139 0.286
chair 0.099 0.239 0.223 0.386 0.119 0.167
table 0.152 0.064 0.162 0.237 0.071 0.125
motorbike 0.221 0.232 0.396 0.224 0.298 0.350
person 0.458 0.546 0.502 0.676 0.301 0.375
Mean 0.282 0.258 0.355 0.383 0.191 0.233
Table 2: We evaluate visualization performance across twenty PASCAL VOC categories by asking MTurk workers to classify our inversions. Numbers are percent classified correctly; higher is better. Chance is 0.05. Glyph refers to the standard black-and-white HOG diagram popularized in earlier work.
Figure 14 (precision-recall curves, AP values read off the plots):
  Chair:  HOG+Human AP = 0.63, RGB+Human AP = 0.96, HOG+DPM AP = 0.51
  Cat:    HOG+Human AP = 0.78, HOG+DPM AP = 0.58
  Car:    HOG+Human AP = 0.83, HOG+DPM AP = 0.87
  Person: HOG+Human AP = 0.69, HOG+DPM AP = 0.79
Figure 14: By instructing multiple human subjects to classify HOG inversions, human performance on the HOG feature space can be compared against DPM detectors.
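The AP values above summarize a precision-recall curve. A minimal PASCAL-style sketch (interpolation details vary between VOC toolkits; this version simply averages precision at each positive):

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision: rank by score, then average the precision
    measured at each positive item."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)                          # true positives so far
    precision = tp / np.arange(1, len(labels) + 1)  # precision at each rank
    npos = max(labels.sum(), 1)
    return float(np.sum(precision * labels) / npos)
```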
Summary
•  Inverting HOG features reveals what object detectors actually "see"
•  The inversions are far more informative than the standard HOG glyph
•  Related work on visualizing learned features:
   –  deconvnet [Zeiler & Fergus 13]
To invert this, the deconvnet uses transposed versions of
the same filters, but applied to the rectified maps, not
the output of the layer beneath. In practice this means
flipping each filter vertically and horizontally.
Projecting down from higher layers uses the switch
settings generated by the max pooling in the convnet
on the way up. As these switch settings are peculiar
to a given input image, the reconstruction obtained
from a single activation thus resembles a small piece
of the original input image, with structures weighted
according to their contribution toward the feature
activation. Since the model is trained discriminatively,
they implicitly show which parts of the input image
are discriminative. Note that these projections are not
samples from the model, since there is no generative
process involved.
[Figure 1 schematic]
Convnet (bottom-up):  Layer Below Pooled Maps → Convolutional Filtering {F} → Feature Maps → Rectified Linear Function → Rectified Feature Maps → Max Pooling (recording max locations as "Switches") → Pooled Maps
Deconvnet (top-down): Layer Above Reconstruction → Max Unpooling (using the Switches) → Unpooled Maps → Rectified Linear Function → Rectified Unpooled Maps → Convolutional Filtering {F^T} → Reconstruction
Figure 1. Top: A deconvnet layer (left) attached to a convnet layer (right). The deconvnet will reconstruct an approximate version of the convnet features from the layer beneath. Bottom: An illustration of the unpooling operation in the deconvnet, using switches which record the location of the local max in each pooling region (colored zones) during pooling in the convnet.
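The switch-recording pooling and unpooling described in the caption can be sketched as follows (a toy single-channel version; real deconvnet code operates on batched multi-channel feature maps):

```python
import numpy as np

def max_pool_with_switches(x, k=2):
    """k x k max pooling that records the argmax location ('switch')
    inside each pooling region."""
    H, W = x.shape
    pooled = np.zeros((H // k, W // k))
    switches = np.zeros((H // k, W // k), dtype=int)
    for i in range(H // k):
        for j in range(W // k):
            patch = x[i*k:(i+1)*k, j*k:(j+1)*k]
            switches[i, j] = int(np.argmax(patch))
            pooled[i, j] = patch.ravel()[switches[i, j]]
    return pooled, switches

def max_unpool(pooled, switches, k=2):
    """Unpooling: place each pooled value back at its recorded max
    location; every other position stays zero."""
    H, W = pooled.shape
    out = np.zeros((H * k, W * k))
    for i in range(H):
        for j in range(W):
            di, dj = divmod(switches[i, j], k)
            out[i*k + di, j*k + dj] = pooled[i, j]
    return out
```

Pooling an unpooled map with the same switches returns the original pooled map, which is what makes the reconstruction approximately invertible.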
Other important differences relating to layers 1 and 2 were made following inspection of the visualizations in Fig. 6, as described in Section 4.1.
The model was trained on the ImageNet 2012 training set (1.3 million images, spread over 1000 different classes). Each RGB image was preprocessed by resizing the smallest dimension to 256, cropping the center 256x256 region, subtracting the per-pixel mean (across all images) and then using 10 different sub-crops of size 224x224 (corners + center with(out) horizontal flips). Stochastic gradient descent with a mini-batch size of 128 was used to update the parameters, starting with a learning rate of 10^-2, in conjunction with a momentum term of 0.9. We anneal the learning rate throughout training manually when the validation error plateaus. Dropout (Hinton et al., 2012) is used in the fully connected layers (6 and 7) with a rate of 0.5. All weights are initialized to 10^-2 and biases are set to 0.
Visualization of the first layer filters during training reveals that a few of them dominate, as shown in Fig. 6(a). To combat this, we renormalize each filter in the convolutional layers whose RMS value exceeds a fixed radius of 10^-1 to this fixed radius. This is crucial, especially in the first layer of the model, where the input images are roughly in the [-128,128] range. As in (Krizhevsky et al., 2012), we produce multiple different crops and flips of each training example to boost training set size. We stopped training after 70 epochs, which took around 12 days on a single GTX580 GPU, using an implementation based on (Krizhevsky et al., 2012).
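The filter renormalization step described above (scale any filter whose RMS value exceeds the fixed radius 10^-1 back down to that radius) can be sketched as (a sketch with filters flattened into rows; the layout is an assumption):

```python
import numpy as np

def renorm_filters(W, radius=0.1):
    """Rescale any filter whose RMS value exceeds `radius` so its RMS
    equals `radius`; filters already within the radius are untouched.
    W: (num_filters, filter_size) weight matrix, one filter per row."""
    rms = np.sqrt(np.mean(W ** 2, axis=1, keepdims=True))
    scale = np.where(rms > radius, radius / rms, 1.0)
    return W * scale
```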
4. Convnet Visualization
Using the model described in Section 3, we now use the deconvnet to visualize the feature activations on the ImageNet validation set.
Feature Visualization: Fig. 2 shows feature visualizations from our model once training is complete. However, instead of showing the single strongest activation for a given feature map, we show the top 9 activations. Projecting each separately down to pixel space reveals the different structures that excite a given feature map.

More Related Content

Viewers also liked

20140705.西野研セミナー
20140705.西野研セミナー20140705.西野研セミナー
20140705.西野研セミナー
Hayaru SHOUNO
 
20150803.山口大学集中講義
20150803.山口大学集中講義20150803.山口大学集中講義
20150803.山口大学集中講義
Hayaru SHOUNO
 
20160825 IEICE SIP研究会 講演
20160825 IEICE SIP研究会 講演20160825 IEICE SIP研究会 講演
20160825 IEICE SIP研究会 講演
Hayaru SHOUNO
 
20160329.dnn講演
20160329.dnn講演20160329.dnn講演
20160329.dnn講演
Hayaru SHOUNO
 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal clubHayaru SHOUNO
 
20140530.journal club
20140530.journal club20140530.journal club
20140530.journal club
Hayaru SHOUNO
 
ベイズ Chow-Liu アルゴリズム
ベイズ Chow-Liu アルゴリズムベイズ Chow-Liu アルゴリズム
ベイズ Chow-Liu アルゴリズム
Joe Suzuki
 
20141204.journal club
20141204.journal club20141204.journal club
20141204.journal club
Hayaru SHOUNO
 
20130925.deeplearning
20130925.deeplearning20130925.deeplearning
20130925.deeplearningHayaru SHOUNO
 
量子アニーリングを用いたクラスタ分析
量子アニーリングを用いたクラスタ分析量子アニーリングを用いたクラスタ分析
量子アニーリングを用いたクラスタ分析
Shu Tanaka
 
ものまね鳥を愛でる 結合子論理と計算
ものまね鳥を愛でる 結合子論理と計算ものまね鳥を愛でる 結合子論理と計算
ものまね鳥を愛でる 結合子論理と計算
Hiromi Ishii
 

Viewers also liked (12)

20140705.西野研セミナー
20140705.西野研セミナー20140705.西野研セミナー
20140705.西野研セミナー
 
20150803.山口大学集中講義
20150803.山口大学集中講義20150803.山口大学集中講義
20150803.山口大学集中講義
 
20160825 IEICE SIP研究会 講演
20160825 IEICE SIP研究会 講演20160825 IEICE SIP研究会 講演
20160825 IEICE SIP研究会 講演
 
20160329.dnn講演
20160329.dnn講演20160329.dnn講演
20160329.dnn講演
 
20130722
2013072220130722
20130722
 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal club
 
20140530.journal club
20140530.journal club20140530.journal club
20140530.journal club
 
ベイズ Chow-Liu アルゴリズム
ベイズ Chow-Liu アルゴリズムベイズ Chow-Liu アルゴリズム
ベイズ Chow-Liu アルゴリズム
 
20141204.journal club
20141204.journal club20141204.journal club
20141204.journal club
 
20130925.deeplearning
20130925.deeplearning20130925.deeplearning
20130925.deeplearning
 
量子アニーリングを用いたクラスタ分析
量子アニーリングを用いたクラスタ分析量子アニーリングを用いたクラスタ分析
量子アニーリングを用いたクラスタ分析
 
ものまね鳥を愛でる 結合子論理と計算
ものまね鳥を愛でる 結合子論理と計算ものまね鳥を愛でる 結合子論理と計算
ものまね鳥を愛でる 結合子論理と計算
 

Similar to 20150326.journal club

Class[6][5th aug] [three js-loaders]
Class[6][5th aug] [three js-loaders]Class[6][5th aug] [three js-loaders]
Class[6][5th aug] [three js-loaders]
Saajid Akram
 
CS 354 Object Viewing and Representation
CS 354 Object Viewing and RepresentationCS 354 Object Viewing and Representation
CS 354 Object Viewing and Representation
Mark Kilgard
 
Capture and Rendering
Capture and RenderingCapture and Rendering
Capture and Rendering
Matt Hirsch - MIT Media Lab
 
iOS Visual F/X Using GLSL
iOS Visual F/X Using GLSLiOS Visual F/X Using GLSL
iOS Visual F/X Using GLSL
Douglass Turner
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
Yoonho Lee
 
lec10.pdf
lec10.pdflec10.pdf
lec10.pdf
ssuserc02bd5
 
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
ICS
 
CariGANs : Unpaired Photo-to-Caricature Translation
CariGANs : Unpaired Photo-to-Caricature TranslationCariGANs : Unpaired Photo-to-Caricature Translation
CariGANs : Unpaired Photo-to-Caricature Translation
Razorthink
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGAN
Shyam Krishna Khadka
 
Getting more out of Matplotlib with GR
Getting more out of Matplotlib with GRGetting more out of Matplotlib with GR
Getting more out of Matplotlib with GR
Josef Heinen
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
Devansh16
 
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
Verold
 
Curvefitting
CurvefittingCurvefitting
Curvefitting
Philberto Saroni
 
Greedy subtourcrossover.arob98
Greedy subtourcrossover.arob98Greedy subtourcrossover.arob98
Greedy subtourcrossover.arob98
Kaal Nath
 
Lab Practices and Works Documentation / Report on Computer Graphics
Lab Practices and Works Documentation / Report on Computer GraphicsLab Practices and Works Documentation / Report on Computer Graphics
Lab Practices and Works Documentation / Report on Computer Graphics
Rup Chowdhury
 
CS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and CullingCS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and Culling
Mark Kilgard
 
One-Sample Face Recognition Using HMM Model of Fiducial Areas
One-Sample Face Recognition Using HMM Model of Fiducial AreasOne-Sample Face Recognition Using HMM Model of Fiducial Areas
One-Sample Face Recognition Using HMM Model of Fiducial Areas
CSCJournals
 

Similar to 20150326.journal club (20)

Class[6][5th aug] [three js-loaders]
Class[6][5th aug] [three js-loaders]Class[6][5th aug] [three js-loaders]
Class[6][5th aug] [three js-loaders]
 
CS 354 Object Viewing and Representation
CS 354 Object Viewing and RepresentationCS 354 Object Viewing and Representation
CS 354 Object Viewing and Representation
 
Fpga human detection
Fpga human detectionFpga human detection
Fpga human detection
 
Bai 1
Bai 1Bai 1
Bai 1
 
Capture and Rendering
Capture and RenderingCapture and Rendering
Capture and Rendering
 
iOS Visual F/X Using GLSL
iOS Visual F/X Using GLSLiOS Visual F/X Using GLSL
iOS Visual F/X Using GLSL
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
 
lec10.pdf
lec10.pdflec10.pdf
lec10.pdf
 
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
OpenGL Fixed Function to Shaders - Porting a fixed function application to “m...
 
CariGANs : Unpaired Photo-to-Caricature Translation
CariGANs : Unpaired Photo-to-Caricature TranslationCariGANs : Unpaired Photo-to-Caricature Translation
CariGANs : Unpaired Photo-to-Caricature Translation
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGAN
 
Getting more out of Matplotlib with GR
Getting more out of Matplotlib with GRGetting more out of Matplotlib with GR
Getting more out of Matplotlib with GR
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
 
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
From Hello World to the Interactive Web with Three.js: Workshop at FutureJS 2014
 
Curvefitting
CurvefittingCurvefitting
Curvefitting
 
Greedy subtourcrossover.arob98
Greedy subtourcrossover.arob98Greedy subtourcrossover.arob98
Greedy subtourcrossover.arob98
 
Lab Practices and Works Documentation / Report on Computer Graphics
Lab Practices and Works Documentation / Report on Computer GraphicsLab Practices and Works Documentation / Report on Computer Graphics
Lab Practices and Works Documentation / Report on Computer Graphics
 
CS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and CullingCS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and Culling
 
One-Sample Face Recognition Using HMM Model of Fiducial Areas
One-Sample Face Recognition Using HMM Model of Fiducial AreasOne-Sample Face Recognition Using HMM Model of Fiducial Areas
One-Sample Face Recognition Using HMM Model of Fiducial Areas
 
Final Project
Final ProjectFinal Project
Final Project
 

Recently uploaded

road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
Intella Parts
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
SyedAbiiAzazi1
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
ssuser7dcef0
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
The Role of Electrical and Electronics Engineers in IOT Technology.pdf
The Role of Electrical and Electronics Engineers in IOT Technology.pdfThe Role of Electrical and Electronics Engineers in IOT Technology.pdf
The Role of Electrical and Electronics Engineers in IOT Technology.pdf
Nettur Technical Training Foundation
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 

Recently uploaded (20)

road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
The Role of Electrical and Electronics Engineers in IOT Technology.pdf
The Role of Electrical and Electronics Engineers in IOT Technology.pdfThe Role of Electrical and Electronics Engineers in IOT Technology.pdf
The Role of Electrical and Electronics Engineers in IOT Technology.pdf
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 

20150326.journal club

  • 2. Overview •  ( –  (HOG,(SIFT,(etc)( –  ((SVM,(Log.(regression,(etc)( ( •  “ ”( ( ( •  HOG
  • 8. :(Paired(Dict.LearningMethod: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ 2 Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  = α1 + α2 + …+ αM u1 u2 uM x Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k ˆy = f(x) = V ˆ↵ where ˆ↵ = arg min ↵ ||x U↵||2 2 s.t. ||↵||1  Method: Paired Dictionary + = = +...+ +...++ ↵1 ↵2 ↵k ↵1 ↵2 ↵k = f(x) = V ˆ↵ where ˆ↵ = arg min ||x U↵||2 2 s.t. ||↵||1  α1 + α2 + …+ αM = v1 v2 vM y
  • 10.
  • 11. HOG glyph
  Figure 3: We visualize some high scoring detections from the deformable parts model. Can you guess which are false alarms? Take a minute to study this figure.
  Figure 4: In this paper, we present algorithms to visualize […]
  • 12. HOG glyph vs. dictionary inversion
  (a) Human Vision  (b) HOG Vision
  Figure 13: HOG inversion reveals the world that object detectors see. The left shows a man standing in a dark room.
  • 14. HOG template (cell) size: 40×40, 20×20, 10×10, 5×5
  Figure 12: Our inversion algorithms are sensitive to the HOG template size. We show how performance degrades as the template becomes smaller.
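For reference, the per-cell orientation histograms that HOG templates are built from can be sketched as follows (simplified: a single cell, 9 unsigned orientation bins, hard binning, and no block normalization, unlike the full Dalal & Triggs pipeline):

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Gradient-orientation histogram for one cell (unsigned, magnitude-weighted)."""
    gy, gx = np.gradient(cell.astype(float))       # image gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation in [0, 180)
    bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())     # magnitude-weighted voting
    return hist

# A vertical step edge: the gradient is purely horizontal (orientation 0),
# so all votes land in the first bin.
cell = np.tile(np.concatenate([np.zeros(4), np.ones(4)]), (8, 1))
h = cell_orientation_histogram(cell)
```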
  • 15. Original / ELDA / Ridge / Direct / PairDict
  Figure 8: We show results for all four of our inversion algorithms on held out image patches on similar dimensions common for object detection.
  • 16. The four inversion algorithms
  • ELDA: run an exemplar-LDA detector on the HOG feature and average the top-K retrieved patches
  • Ridge: ridge regression mapping HOG features X to image patches Y
  • Direct: directly optimize the reconstruction over an image basis U (Gabor-like elements)
  • PairedDict: the proposed paired dictionary learning
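If the Ridge baseline above is read as a closed-form ridge regression from HOG features X to image patches Y (my reading of the slide; the paper's exact Gaussian formulation may differ), a toy numpy sketch looks like this:

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """W minimizing ||X W - Y||^2 + lam*||W||^2, mapping HOG rows to image rows."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Toy training data: 500 patches, 20-D "HOG" features, 50-D "image" patches,
# with an exact linear relation so the fit is easy to check.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 20))
W_true = rng.standard_normal((20, 50))
Y = X @ W_true
W = fit_ridge(X, Y, lam=1e-6)
y_hat = X[0] @ W              # "invert" the first HOG feature back to an image
```

The closed form makes Ridge fast but tends to produce blurry inversions, since it returns a single conditional mean rather than a sparse combination of atoms.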
  • 17. Inversion accuracy (PASCAL VOC 2011, 8 categories)

  Category    ELDA   Ridge  Direct  PairDict
  bicycle     0.452  0.577  0.513   0.561
  bottle      0.697  0.683  0.660   0.671
  car         0.668  0.677  0.652   0.639
  cat         0.749  0.712  0.687   0.705
  chair       0.660  0.621  0.604   0.617
  table       0.656  0.617  0.582   0.614
  motorbike   0.573  0.617  0.549   0.592
  person      0.696  0.667  0.646   0.646
  Mean        0.671  0.656  0.620   0.637

  Table 1: We evaluate the performance of our inversion algorithm by comparing the inverse to the ground truth image using the mean normalized cross correlation. Higher is better; a score of 1 is perfect. See supplemental for full table. (´・ω・`)
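The Table 1 metric, mean normalized cross correlation, can be sketched as follows (a straightforward reading of the metric, not the authors' evaluation code):

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-mean, unit-norm correlation between two images.

    Returns 1.0 when the images are identical up to brightness/contrast
    (an affine intensity change), and -1.0 when one is the negative of the other.
    """
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

img = np.arange(16.0).reshape(4, 4)
same = normalized_cross_correlation(img, 2 * img + 3)   # affine-equivalent image
```

Averaging this score over all test patches of a category gives one cell of Table 1.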
  • 18. Evaluation: Mechanical Turk (or Expert)
  Pipeline: Image → HOG inversion → Classifier / Human → "It's 'Bird'"
  Figure 6: Inverting HOG using paired dictionary learning. We first project the HOG vector on to a HOG basis. By jointly learning a coupled basis of HOG features and natural images, we then transfer the coefficients to the image basis to recover the natural image.
  Figure 7: Some pairs of dictionaries for U and V. The left of every pair is the gray scale dictionary element and the right is the positive components elements in the HOG dictionary. Notice the correlation between dictionaries.
  Suppose we write x and y in terms of bases U ∈ R^(D×K) and V ∈ R^(d×K) respectively, but with shared coefficients α ∈ R^K:  x = Uα and y = Vα  (5)
  Figure 8: We show results for all four of our inversion algorithms on held out image patches on similar dimensions common for object detection.
  • 19. HOGgles: let's try Mechanical Turk
  Figure 3: We visualize some high scoring detections from the deformable parts model. Can you guess which are false alarms? Take a minute to study this figure.
  Figure 15: We visualize a few deformable parts models trained with […]. First row: car, person, bottle, bicycle, motorbike, potted plant. In the right most visualizations, we also included the HOG glyph.
  • 20. HOGgles: Visualizing Object Detection Features (Vondrick et al., Massachusetts Institute of Technology)
  Figure 1: An image from PASCAL and a high scoring car detection from DPM [8]. Why did the detector fail?
  Figure 2: We show the crop for the false car detection from Figure 1. On the right, we show our visualization of the HOG features for the same patch. Our visualization reveals that this false alarm actually looks like a car in HOG space.
  Figure 2 shows the output from our visualization on the features for the false car detection. This visualization reveals that, while there are clearly no cars in the original image, there is a car hiding in the HOG descriptor. HOG features see a slightly different visual world than what we see, and by visualizing this space, we can gain a more intuitive understanding of our object detectors. Figure 3 inverts more top detections on PASCAL for a few categories. Can you guess which are false alarms? Take a minute to study the figure since the next sentence might ruin the surprise. Although every visualization looks […]
  • 21. MTurk classification results (PASCAL VOC 2011)

  Category    ELDA   Ridge  Direct  PairDict  Glyph  Expert
  bicycle     0.327  0.127  0.362   0.307     0.405  0.438
  bottle      0.269  0.282  0.283   0.446     0.312  0.222
  car         0.397  0.457  0.617   0.585     0.359  0.389
  cat         0.219  0.178  0.381   0.199     0.139  0.286
  chair       0.099  0.239  0.223   0.386     0.119  0.167
  table       0.152  0.064  0.162   0.237     0.071  0.125
  motorbike   0.221  0.232  0.396   0.224     0.298  0.350
  person      0.458  0.546  0.502   0.676     0.301  0.375
  Mean        0.282  0.258  0.355   0.383     0.191  0.233

  Table 2: We evaluate visualization performance across twenty PASCAL VOC categories by asking MTurk workers to classify our inversions. Numbers are percent classified correctly; higher is better. Chance is 0.05. Glyph refers to the standard black-and-white HOG diagram popularized […]
  • 22. Human vs. DPM detection accuracy (precision-recall curves)
    Chair:  HOG+Human AP = 0.63,  RGB+Human AP = 0.96,  HOG+DPM AP = 0.51
    Cat:    HOG+Human AP = 0.78,  HOG+DPM AP = 0.58
    Car:    HOG+Human AP = 0.83,  HOG+DPM AP = 0.87
    Person: HOG+Human AP = 0.69,  HOG+DPM AP = 0.79
  Figure 14: By instructing multiple human subjects to classify […]
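The AP numbers above summarize each precision-recall curve; a generic numpy sketch of computing AP from ranked detection scores (the unweighted mean of precision at each positive, not PASCAL's interpolated variant):

```python
import numpy as np

def average_precision(scores, labels):
    """AP = mean of the precision values at the rank of each positive detection."""
    order = np.argsort(-np.asarray(scores, dtype=float))  # sort by descending score
    labels = np.asarray(labels)[order]
    hits = np.cumsum(labels)                              # true positives so far
    ranks = np.flatnonzero(labels == 1) + 1               # 1-based ranks of positives
    return float((hits[labels == 1] / ranks).mean())

# Two positives: one ranked first (precision 1), one ranked third (precision 2/3).
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```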
  • 24. Related work: [Zeiler & Fergus 13]
  To invert this, the deconvnet uses transposed versions of the same filters, but applied to the rectified maps, not the output of the layer beneath. In practice this means flipping each filter vertically and horizontally. Projecting down from higher layers uses the switch settings generated by the max pooling in the convnet on the way up. As these switch settings are peculiar to a given input image, the reconstruction obtained from a single activation thus resembles a small piece of the original input image, with structures weighted according to their contribution toward the feature activation. Since the model is trained discriminatively, they implicitly show which parts of the input image are discriminative. Note that these projections are not samples from the model, since there is no generative process involved.
  Figure 1: Top: A deconvnet layer (left) attached to a convnet layer (right). The deconvnet will reconstruct an approximate version of the convnet features from the layer beneath. Bottom: An illustration of the unpooling operation in the deconvnet, using switches which record the location of the local max in each pooling region (colored zones) during pooling in the convnet.
  Other important differences relating to layers 1 and 2 were made following inspection of the visualizations in Fig. 6, as described in Section 4.1. The model was trained on the ImageNet 2012 training set (1.3 million images, spread over 1000 different classes).
  Each RGB image was preprocessed by resizing the smallest dimension to 256, cropping the center 256x256 region, subtracting the per-pixel mean (across all images) and then using 10 different sub-crops of size 224x224 (corners + center with(out) horizontal flips). Stochastic gradient descent with a mini-batch size of 128 was used to update the parameters, starting with a learning rate of 10^-2, in conjunction with a momentum term of 0.9. We anneal the learning rate throughout training manually when the validation error plateaus. Dropout (Hinton et al., 2012) is used in the fully connected layers (6 and 7) with a rate of 0.5. All weights are initialized to 10^-2 and biases are set to 0.
  Visualization of the first layer filters during training reveals that a few of them dominate, as shown in Fig. 6(a). To combat this, we renormalize each filter in the convolutional layers whose RMS value exceeds a fixed radius of 10^-1 to this fixed radius. This is crucial, especially in the first layer of the model, where the input images are roughly in the [-128, 128] range. As in (Krizhevsky et al., 2012), we produce multiple different crops and flips of each training example to boost training set size. We stopped training after 70 epochs, which took around 12 days on a single GTX580 GPU, using an implementation based on (Krizhevsky et al., 2012).
  4. Convnet Visualization
  Using the model described in Section 3, we now use the deconvnet to visualize the feature activations on the ImageNet validation set.
  Feature Visualization: Fig. 2 shows feature visualizations from our model once training is complete. However, instead of showing the single strongest activation for a given feature map, we show the top 9 activations. Projecting each separately down to pixel […]
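The switch-based unpooling in Figure 1 can be sketched in plain numpy: pooling records the argmax location ("switch") of each region, and unpooling places each value back at its recorded location with zeros elsewhere. This is a single-channel 2x2 toy illustration of the mechanism, not Zeiler & Fergus's implementation:

```python
import numpy as np

def max_pool_with_switches(x, k=2):
    """Return the pooled map and the flat index ('switch') of each regional max."""
    h, w = x.shape
    pooled = np.zeros((h // k, w // k))
    switches = np.zeros((h // k, w // k), dtype=int)
    for i in range(h // k):
        for j in range(w // k):
            patch = x[i * k:(i + 1) * k, j * k:(j + 1) * k]
            idx = int(patch.argmax())        # flat index of the max within the region
            switches[i, j] = idx
            pooled[i, j] = patch.ravel()[idx]
    return pooled, switches

def unpool(pooled, switches, k=2):
    """Place each pooled value at its recorded max location; zeros elsewhere."""
    h, w = pooled.shape
    out = np.zeros((h * k, w * k))
    for i in range(h):
        for j in range(w):
            r, c = divmod(int(switches[i, j]), k)
            out[i * k + r, j * k + c] = pooled[i, j]
    return out

x = np.array([[1., 5., 0., 2.],
              [3., 2., 4., 1.],
              [0., 1., 2., 0.],
              [6., 2., 1., 3.]])
p, s = max_pool_with_switches(x)
r = unpool(p, s)
```

Because the switches are specific to one input image, the reconstruction is sparse and image-specific, which is exactly why deconvnet projections resemble pieces of the original input.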