# Team meeting 100325


1. Progress Report 2010.03.25
2. Outline: A Discriminatively Trained, Multiscale, Deformable Part Model; Object Detection with Discriminatively Trained Part Based Models; work to do.
3. A Discriminatively Trained, Multiscale, Deformable Part Model, CVPR'08
4. Part-Based Model: each component has a root filter F0 that models the object at coarse resolution and n part models (Fi, vi, di) whose part filters model finer resolution; each part also carries a deformation cost relative to its anchor on the root. (A sketch of this structure in code follows.)
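
A minimal sketch of how one such component might be represented in code; the class and field names here are illustrative assumptions, not taken from the authors' released implementation:

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class PartModel:
    """One part model (Fi, vi, di): a finer-resolution HOG filter,
    its anchor position relative to the root, and the coefficients
    of a quadratic deformation cost."""
    filt: np.ndarray      # Fi: h x w x d HOG filter weights
    anchor: tuple         # vi: (x, y) offset relative to the root
    deform: np.ndarray    # di: 4 coefficients for (dx, dy, dx^2, dy^2)

@dataclass
class Component:
    """One mixture component: a coarse root filter F0 plus n parts
    placed at twice the root's spatial resolution."""
    root: np.ndarray      # F0: coarse-resolution root filter
    parts: list           # list[PartModel]
    bias: float = 0.0     # b: makes mixture-component scores comparable
```
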
5. Feature Pyramid: an object hypothesis is z = (p0, ..., pn), where p0 is the location of the root and p1, ..., pn are the locations of the parts. The score is the sum of filter scores minus deformation costs. An image pyramid induces a HOG feature pyramid, and the multiscale model captures features at two resolutions. (A pyramid-building sketch follows.)
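
A sketch of building a HOG feature pyramid with scikit-image; the 10-levels-per-octave interval and 8-pixel cells are assumptions chosen to match common DPM defaults, and `hog`/`rescale` are the standard scikit-image functions:

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import rescale

def feature_pyramid(gray_image, interval=10, cell=8, max_levels=40):
    """Compute HOG features at scales 2^(-l/interval).
    The root filter is scored one octave (interval levels) above the
    parts, so the parts see features at twice the root's resolution."""
    levels = []
    for l in range(max_levels):
        scale = 2.0 ** (-l / interval)
        img = rescale(gray_image, scale)
        if min(img.shape) < 4 * cell:   # too small for another level
            break
        feats = hog(img, orientations=9,
                    pixels_per_cell=(cell, cell),
                    cells_per_block=(1, 1),
                    feature_vector=False)
        levels.append((scale, np.squeeze(feats)))  # (rows, cols, 9)
    return levels
```
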
6. Score Function: the score of a hypothesis is a "data term" (filter scores) minus a "spatial prior" (deformation costs), plus a bias: score(p0, ..., pn) = Σ_{i=0..n} Fi · φ(H, pi) − Σ_{i=1..n} di · φd(dxi, dyi) + b, where φd(dx, dy) = (dx, dy, dx², dy²) are the deformation features. Note that if di = (0, 0, 1, 1), the deformation cost for the i-th part is the squared distance between its actual position and its anchor position relative to the root; in general the deformation cost is an arbitrary separable quadratic function of the displacements. The bias term is introduced to make the scores of multiple models comparable when they are combined into a mixture model. The score of a hypothesis z can also be expressed as a dot product β · ψ(H, z) between a vector of model parameters β = (F0, ..., Fn, d1, ..., dn, b) and a vector ψ(H, z) = (φ(H, p0), ..., φ(H, pn), −φd(dx1, dy1), ..., −φd(dxn, dyn), 1), i.e. the concatenation of the HOG features and the displacement features. This illustrates a connection between these models and linear classifiers, which is used for learning. (A scoring sketch follows.)
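
A numpy sketch of scoring one fixed hypothesis z = (p0, ..., pn), reusing the `Component` sketch above; `responses[i]` is assumed to hold the precomputed cross-correlation of filter Fi with its pyramid level:

```python
import numpy as np

def deformation_features(dx, dy):
    # phi_d(dx, dy) = (dx, dy, dx^2, dy^2)
    return np.array([dx, dy, dx * dx, dy * dy])

def hypothesis_score(component, responses, z):
    """score(p0, ..., pn) = sum_i Fi . phi(H, pi)
    - sum_{i>=1} di . phi_d(dxi, dyi) + b."""
    p0, part_positions = z[0], z[1:]
    score = responses[0][p0] + component.bias
    for i, (part, pi) in enumerate(zip(component.parts, part_positions), 1):
        # anchor of part i: root position doubled (parts live one
        # octave down, at twice the resolution) plus the offset vi
        ax = 2 * p0[0] + part.anchor[0]
        ay = 2 * p0[1] + part.anchor[1]
        dx, dy = pi[0] - ax, pi[1] - ay
        score += responses[i][pi] - part.deform @ deformation_features(dx, dy)
    return score
```
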
7. Matching: find the best hypothesis. Define an overall score for each root location based on the best placement of the parts, score(p0) = max_{p1, ..., pn} score(p0, ..., pn). High-scoring root locations define detections (a "sliding window" approach). Efficient computation: dynamic programming plus generalized distance transforms (max-convolution); a one-dimensional sketch follows.
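
The generalized distance transform is what makes the max over part placements cheap. The brute-force one-dimensional version below shows exactly what is computed; the Felzenszwalb-Huttenlocher lower-envelope algorithm gets the same result in O(n), and applying the 1-D transform along rows and then columns handles the 2-D separable quadratic cost:

```python
import numpy as np

def distance_transform_1d(scores, d_lin, d_quad):
    """D(p) = max_q [scores[q] - d_lin * |p - q| - d_quad * (p - q)^2].
    O(n^2) for clarity only; also returns the argmax so the best part
    placement can be backtracked after finding a high-scoring root."""
    n = len(scores)
    best = np.full(n, -np.inf)
    arg = np.zeros(n, dtype=int)
    for p in range(n):
        for q in range(n):
            v = scores[q] - d_lin * abs(p - q) - d_quad * (p - q) ** 2
            if v > best[p]:
                best[p], arg[p] = v, q
    return best, arg
```
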
8. Semi-convexity and the Latent SVM (MI-SVM): classifiers that score an example x using fβ(x) = max_{z ∈ Z(x)} β · Φ(x, z), where β are the model parameters and z ranges over latent values. Given training data D = ((x1, y1), ..., (xn, yn)) with yi ∈ {−1, 1}, we would like to find β such that yi fβ(xi) > 0, by minimizing LD(β) = (1/2)‖β‖² + C Σ_{i=1..n} max(0, 1 − yi fβ(xi)). Semi-convexity: a maximum of convex functions is convex, so fβ(x) is convex in β; hence max(0, 1 − yi fβ(xi)) is convex for negative examples, and LD(β) becomes convex once the latent values for the positive examples are fixed. (A training-loop sketch follows.)
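
A self-contained sketch of the coordinate-descent training loop this observation enables. Each example is represented here as an array of candidate feature vectors Φ(x, z), one row per latent value; the subgradient step and the constants are assumptions for illustration, and real training also data-mines hard negatives between rounds:

```python
import numpy as np

def train_latent_svm(positives, negatives, dim, C=0.002,
                     rounds=5, epochs=100, lr=0.01):
    """positives/negatives: lists of (m, dim) arrays, one row per
    latent value z, so f_beta(x) = max over rows of (row . beta)."""
    beta = np.zeros(dim)
    for _ in range(rounds):
        # Step 1: fix beta; pick the best latent value for each positive.
        pos = [x[np.argmax(x @ beta)] for x in positives]
        # Step 2: the objective is now convex; take subgradient steps.
        # Negatives keep the max over z inside the hinge loss.
        for _ in range(epochs):
            grad = beta.copy()                    # from (1/2)||beta||^2
            for phi in pos:                       # y = +1 hinge terms
                if beta @ phi < 1:
                    grad -= C * phi
            for x in negatives:                   # y = -1 hinge terms
                phi = x[np.argmax(x @ beta)]
                if beta @ phi > -1:
                    grad += C * phi
            beta -= lr * grad
    return beta
```
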
9. Object Detection with Discriminatively Trained Part Based Models, PAMI'09
10. Modifications: the optimization function; lower-dimensional but more informative features; bounding box prediction; contextual information.
11. HOG with PCA: PCA of HOG features, where each eigenvector is displayed as a 4-by-9 matrix so that each row corresponds to one normalization factor and each column to one orientation bin. The linear subspace spanned by the top 11 eigenvectors captures essentially all of the information in a feature vector, and the top eigenvectors are either constant along each column or constant along each row of the matrix representation. (A projection sketch follows.)
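
A numpy sketch of the analysis the slide refers to: stack HOG cell descriptors, take the top 11 eigenvectors of their covariance, and project. (The paper ultimately replaces PCA with an analytically constructed lower-dimensional feature, so this is only the analysis step, not the final feature.)

```python
import numpy as np

def pca_project(cells, k=11):
    """cells: (m, 36) array of HOG cell descriptors (4 normalization
    factors x 9 orientation bins). Returns the (m, k) projection onto
    the top-k principal components."""
    mean = cells.mean(axis=0)
    centered = cells - mean
    cov = np.cov(centered, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)              # ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:k]]     # (36, k) top eigenvectors
    return centered @ top
```
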
12. Contextual Information (post-processing): learning a regression model to predict the bounding box coordinates from the detection window, and a simple procedure to rescore each window using the scores of all categories' detection windows. Let (D1, ..., Dk) be the sets of detections obtained with k different models (for different object categories) in an image I. Each detection (B, s) ∈ Di is defined by a bounding box B = (x1, y1, x2, y2) and a score s. The context of I is the k-dimensional vector c(I) = (σ(s1), ..., σ(sk)), where si is the score of the highest-scoring detection in Di and σ(x) = 1/(1 + exp(−2x)) is a logistic function for renormalizing the scores. To rescore a detection (B, s) in I, build a 25-dimensional feature vector with the original score of the detection, the top-left and bottom-right bounding box coordinates, and the image context: g = (σ(s), x1, y1, x2, y2, c(I)), where x1, y1, x2, y2 ∈ [0, 1] are normalized by the width and height of the image. A category-specific classifier then scores this vector to obtain a new score. (A feature-assembly sketch follows.)
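
A sketch of assembling the rescoring vector g; with the k = 20 PASCAL categories this yields the 25 dimensions mentioned (1 score + 4 coordinates + 20 context values). The helper names are assumptions:

```python
import numpy as np

def sigma(s):
    """Logistic renormalization of raw detector scores."""
    return 1.0 / (1.0 + np.exp(-2.0 * np.asarray(s, dtype=float)))

def context_feature(box, score, top_scores, img_w, img_h):
    """g = (sigma(s), x1, y1, x2, y2, c(I)).
    box: (x1, y1, x2, y2) in pixels; top_scores[i] is the score of the
    highest-scoring detection of category i anywhere in the image."""
    x1, y1, x2, y2 = box
    coords = np.array([x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h])
    return np.concatenate(([float(sigma(score))], coords, sigma(top_scores)))
```
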
13. PASCAL VOC 2008: precision/recall results on the person category.
14. Average precision on the person category:

    | Category | 09 Base | 09 BB | 09 Cont | 08 |
    |----------|---------|-------|---------|-----|
    | person   | 0.407   | 0.423 | 0.431   | 0.42 |
15. Work to Do: modifications to the cell-model work; integrating other methods into the cell-model work; another direction.