SlideShare a Scribd company logo
1 of 62
Download to read offline
Structured Regression for
            Efficient Object Detection

                        Christoph Lampert
                   www.christoph-lampert.org

            Max Planck Institute for Biological Cybernetics, Tübingen


                          December 3rd, 2009




• [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008]
• [Matthew B. Blaschko, C.L. ECCV 2008]
• [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]
Category-Level Object Localization
Category-Level Object Localization




             What objects are present? person, car
Category-Level Object Localization




                    Where are the objects?
Object Localization ⇒ Scene Interpretation




    A man inside of a car       A man outside of a car
    ⇒ He’s driving.             ⇒ He’s passing by.
Algorithmic Approach: Sliding Window




       f (y1 ) = 0.2         f (y2 ) = 0.8        f (y3 ) = 1.5

   Use a (pre-trained) classifier function f :
     • Place candidate window on the image.
     • Iterate:
            Evaluate f and store result.
            Shift candidate window by k pixels.
     • Return position where f was largest.
Algorithmic approach: Sliding Window




       f (y1 ) = 0.2       f (y2 ) = 0.8      f (y3 ) = 1.5

   Drawbacks:
     • single scale, single aspect ratio
       → repeat with different window sizes/shapes
     • search on grid
       → speed–accuracy tradeoff
     • computationally expensive
New view: Generalized Sliding Window




   Assumptions:
     • Objects are rectangular image regions of arbitrary size.
     • The score of f is largest at the correct object position.

   Mathematical Formulation:

                       yopt = argmax f (y)
                                  y∈Y


   with Y = {all rectangular regions in image}
New view: Generalized Sliding Window




   Mathematical Formulation:

                        yopt = argmax f (y)
                                  y∈Y


   with Y = {all rectangular regions in image}

   • How to choose/construct/learn the function f ?
   • How to do the optimization efficiently and robustly?
     (exhaustive search is too slow, O(w2 h2 ) elements).
New view: Generalized Sliding Window




   Mathematical Formulation:

                        yopt = argmax f (y)
                                  y∈Y


   with Y = {all rectangular regions in image}

   • How to choose/construct/learn the function f ?
   • How to do the optimization efficiently and robustly?
     (exhaustive search is too slow, O(w2 h2 ) elements).
New view: Generalized Sliding Window


   Use the problem’s geometric structure:
New view: Generalized Sliding Window


   Use the problem’s geometric structure:
                                             • Calculate scores for
                                               sets of boxes jointly.

                                             • If no element can
                                               contain the maximum,
                                               discard the box set.

                                             • Otherwise, split the
                                               box set and iterate.

                                            → Branch-and-bound
                                              optimization
   • finds global maximum yopt
New view: Generalized Sliding Window


   Use the problem’s geometric structure:
                                             • Calculate scores for
                                               sets of boxes jointly.

                                             • If no element can
                                               contain the maximum,
                                               discard the box set.

                                             • Otherwise, split the
                                               box set and iterate.

                                            → Branch-and-bound
                                              optimization
   • finds global maximum yopt
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 .
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4




   Splitting:
     • Identify largest interval.
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4




   Splitting:
     • Identify largest interval. Split at center: R → R1 ∪R2 .
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4




   Splitting:
     • Identify largest interval. Split at center: R → R1 ∪R2 .
     • New box sets: [L, T, R1 , B]
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4




   Splitting:
     • Identify largest interval. Split at center: R → R1 ∪R2 .
     • New box sets: [L, T, R1 , B] and [L, T, R2 , B].
Calculating Scores for Box Sets

   Example: Linear Support-Vector-Machine f (y) :=     pi ∈y   wi .




                                    +


         f upper (Y) =        min(0, wi ) +    max(0, wi )
                         pi ∈y∩           pi ∈y∪

   Can be computed in O(1) using integral images.
Calculating Scores for Box Sets
                                                  J               y
   Histogram Intersection Similarity: f (y) :=    j=1   min(hj , hj ).




                                  J               ∪
                                                  y
                 f upper (Y) =    j=1
                                        min(hj , hj )

   As fast as for a single box: O(J) with integral histograms.
Evaluation: Speed (on PASCAL VOC 2006)




   Sliding Window Runtime:
      • always: O(w2 h2 )
   Branch-and-Bound (ESS) Runtime:
      • worst-case: O(w2 h2 )
      • empirical: not more than O(wh)
Extensions:


   Action classification: (y, t)opt = argmax(y,t)∈Y×T fx (y, t)




   • J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.
Extensions:

   Localized image retrieval: (x, y)opt = argmaxy∈Y, x∈D fx (y)




   • C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009
Extensions:


   Hybrid – Branch-and-Bound with Implicit Shape Model




   • A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009
Generalized Sliding Window




                       yopt = argmax f (y)
                                y∈Y


   with Y = {all rectangular regions in image}

   • How to choose/construct/learn f ?
   • How to do the optimization efficiently and robustly?
Traditional Approach: Binary Classifier

    Training images:
         +            +
      • x1 , . . . , xn show the object
         −            −
      • x1 , . . . , xm show something else

    Train a classifier, e.g.
      • support vector machine,
      • boosted cascade,
      • artificial neural network,. . .

    Decision function f : {images} → R
      • f > 0 means “image shows the object.”
      • f < 0 means “image does not show
        the object.”
Traditional Approach: Binary Classifier


   Drawbacks:


      • Train distribution
        = test distribution

      • No control over partial
        detections.

      • No guarantee to even find
        training examples again.
Object Localization as Structured Output Regression

   Ideal setup:
     • function
                    g : {all images} → {all boxes}
       to predict object boxes from images
     • train and test in the same way, end-to-end



                                 

      gcar                          =
Object Localization as Structured Output Regression

   Ideal setup:
     • function
                        g : {all images} → {all boxes}
       to predict object boxes from images
     • train and test in the same way, end-to-end

   Regression problem:
     • training examples (x1 , y1 ), . . . , (xn , yn ) ∈ X × Y
             xi are images, yi are bounding boxes
     • Learn a mapping
                                      g : X →Y
        that generalizes from the given examples:
             g(xi ) ≈ yi , for i = 1, . . . , n,
Structured Support Vector Machine

     SVM-like framework by Tsochantaridis et al.:
       • Positive definite kernel k : (X × Y) × (X × Y)→R.
         ϕ : X × Y → H : (implicit) feature map induced by k.
         • ∆ : Y × Y → R:                        loss function
         • Solve the convex optimization problem
                                                                               n
                                                        1         2
                                          minw,ξ          w           +C           ξi
                                                        2                    i=1

             subject to margin constraints for i = 1, . . . , n :

             ∀y ∈ Y  {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi ,
         • unique solution: w ∗ ∈ H

• I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and Interdependent
  Output Variables, Journal of Machine Learning Research (JMLR), 2005.
Structured Support Vector Machine



     • w ∗ defines compatiblity function

                         F (x, y) = w ∗ , ϕ(x, y)

     • best prediction for x is the most compatible y:

                        g(x) := argmax F (x, y).
                                    y∈Y

     • evaluating g : X → Y is like generalized Sliding Window:
            for fixed x, evaluate quality function for every box y ∈ Y.
            for example, use previous branch-and-bound procedure!
Joint Image/Box-Kernel: Example


   Joint kernel: how to compare one (image,box)-pair (x, y) with
   another (image,box)-pair (x , y )?

   kjoint          ,            =k           ,          is large.



   kjoint          ,            =k           ,          is small.



   kjoint          ,            = kimage         ,

                                            could also be large.
Loss Function: Example

   Loss function: how to compare two boxes y and y ?




          ∆(y, y ) := 1 − area overlap between y and y
                          area(y ∩ y )
                    =1−
                          area(y ∪ y )
Structured Support Vector Machine

                                                                      n
                                                     1       2
     • S-SVM Optimization:                  minw,ξ   2
                                                         w       +C         ξi
                                                                      i=1
        subject to for i = 1, . . . , n :

   ∀y ∈ Y  {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi ,
Structured Support Vector Machine

                                                                      n
                                                     1       2
     • S-SVM Optimization:                  minw,ξ   2
                                                         w       +C         ξi
                                                                      i=1
        subject to for i = 1, . . . , n :

   ∀y ∈ Y  {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi ,

     • Solve via constraint generation:
     • Iterate:
             Solve minimization with working set of contraints
             Identify argmaxy∈Y ∆(y, yi ) + w, ϕ(xi , y)
             Add violated constraints to working set and iterate
     • Polynomial time convergence to any precision ε

     • Similar to bootstrap training, but with a margin.
Evaluation: PASCAL VOC 2006




           Example detections for VOC 2006 bicycle, bus and cat.




          Precision–recall curves for VOC 2006 bicycle, bus and cat.

    • Structured regression improves detection accuracy.
    • New best scores (at that time) in 6 of 10 classes.
Why does it work?




        Learned weights from binary (center) and structured training (right).



     • Both methods assign positive weights to object region.
     • Structured training also assigns negative weights to
       features surrounding the bounding box position.
     • Posterior distribution over box coordinates becomes more
       peaked.
More Recent Results (PASCAL VOC 2009)




                aeroplane
More Recent Results (PASCAL VOC 2009)




                 bicycle
More Recent Results (PASCAL VOC 2009)




                   bird
More Recent Results (PASCAL VOC 2009)




                   boat
More Recent Results (PASCAL VOC 2009)




                   bottle
More Recent Results (PASCAL VOC 2009)




                    bus
More Recent Results (PASCAL VOC 2009)




                     car
More Recent Results (PASCAL VOC 2009)




                     cat
More Recent Results (PASCAL VOC 2009)




                     chair
More Recent Results (PASCAL VOC 2009)




                     cow
More Recent Results (PASCAL VOC 2009)




                   diningtable
More Recent Results (PASCAL VOC 2009)




                      dog
More Recent Results (PASCAL VOC 2009)




                      horse
More Recent Results (PASCAL VOC 2009)




                    motorbike
More Recent Results (PASCAL VOC 2009)




                      person
More Recent Results (PASCAL VOC 2009)




                    pottedplant
More Recent Results (PASCAL VOC 2009)




                       sheep
More Recent Results (PASCAL VOC 2009)




                        sofa
More Recent Results (PASCAL VOC 2009)




                        train
More Recent Results (PASCAL VOC 2009)




                      tvmonitor
Extensions:

      Image segmentation with connectedness constraint:




               CRF segmentation                             connected CRF segmentation

 • S. Nowozin, C.L.: Global Connectivity Potentials for Random Field Models, CVPR 2009.
Summary


  Object Localization is a step towards image interpretation.

  Conceptual approach instead of algorithmic:
    • Branch-and-bound evaluation:
           don’t slide a window, but solve an argmax problem,
           ⇒ higher efficiency

    • Structured regression training:
           solve the prediction problem, not a classification proxy.
           ⇒ higher localization accuracy

    • Modular and kernelized:
           easily adapted to other problems/representations, e.g.
           image segmentations
Structured regression for efficient object detection

More Related Content

What's hot

07 cie552 image_mosaicing
07 cie552 image_mosaicing07 cie552 image_mosaicing
07 cie552 image_mosaicingElsayed Hemayed
 
Ann chapter-3-single layerperceptron20021031
Ann chapter-3-single layerperceptron20021031Ann chapter-3-single layerperceptron20021031
Ann chapter-3-single layerperceptron20021031frdos
 
Lesson 27: Evaluating Definite Integrals
Lesson 27: Evaluating Definite IntegralsLesson 27: Evaluating Definite Integrals
Lesson 27: Evaluating Definite IntegralsMatthew Leingang
 
Journey to structure from motion
Journey to structure from motionJourney to structure from motion
Journey to structure from motionJa-Keoung Koo
 
ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2zukun
 
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slides
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slidesCVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slides
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slideszukun
 
NIPS2010: optimization algorithms in machine learning
NIPS2010: optimization algorithms in machine learningNIPS2010: optimization algorithms in machine learning
NIPS2010: optimization algorithms in machine learningzukun
 
Signal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationSignal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationGabriel Peyré
 
Calculus Cheat Sheet All
Calculus Cheat Sheet AllCalculus Cheat Sheet All
Calculus Cheat Sheet AllMoe Han
 
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb) IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb) Charles Deledalle
 
Common derivatives integrals_reduced
Common derivatives integrals_reducedCommon derivatives integrals_reduced
Common derivatives integrals_reducedKyro Fitkry
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsGabriel Peyré
 
Category Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesCategory Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesAshwin Rao
 
05210401 P R O B A B I L I T Y T H E O R Y A N D S T O C H A S T I C P R...
05210401  P R O B A B I L I T Y  T H E O R Y  A N D  S T O C H A S T I C  P R...05210401  P R O B A B I L I T Y  T H E O R Y  A N D  S T O C H A S T I C  P R...
05210401 P R O B A B I L I T Y T H E O R Y A N D S T O C H A S T I C P R...guestd436758
 
Computer Science and Information Science 3rd semester (2012-December) Questio...
Computer Science and Information Science 3rd semester (2012-December) Questio...Computer Science and Information Science 3rd semester (2012-December) Questio...
Computer Science and Information Science 3rd semester (2012-December) Questio...B G S Institute of Technolgy
 

What's hot (19)

07 cie552 image_mosaicing
07 cie552 image_mosaicing07 cie552 image_mosaicing
07 cie552 image_mosaicing
 
Curve fitting
Curve fittingCurve fitting
Curve fitting
 
Ann chapter-3-single layerperceptron20021031
Ann chapter-3-single layerperceptron20021031Ann chapter-3-single layerperceptron20021031
Ann chapter-3-single layerperceptron20021031
 
03 finding roots
03 finding roots03 finding roots
03 finding roots
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Lesson 27: Evaluating Definite Integrals
Lesson 27: Evaluating Definite IntegralsLesson 27: Evaluating Definite Integrals
Lesson 27: Evaluating Definite Integrals
 
Journey to structure from motion
Journey to structure from motionJourney to structure from motion
Journey to structure from motion
 
ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2
 
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slides
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slidesCVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slides
CVPR2010: Advanced ITinCVPR in a Nutshell: part 4: additional slides
 
NIPS2010: optimization algorithms in machine learning
NIPS2010: optimization algorithms in machine learningNIPS2010: optimization algorithms in machine learning
NIPS2010: optimization algorithms in machine learning
 
Signal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationSignal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems Regularization
 
Calculus Cheat Sheet All
Calculus Cheat Sheet AllCalculus Cheat Sheet All
Calculus Cheat Sheet All
 
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof..."Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
 
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb) IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
 
Common derivatives integrals_reduced
Common derivatives integrals_reducedCommon derivatives integrals_reduced
Common derivatives integrals_reduced
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse Problems
 
Category Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesCategory Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) pictures
 
05210401 P R O B A B I L I T Y T H E O R Y A N D S T O C H A S T I C P R...
05210401  P R O B A B I L I T Y  T H E O R Y  A N D  S T O C H A S T I C  P R...05210401  P R O B A B I L I T Y  T H E O R Y  A N D  S T O C H A S T I C  P R...
05210401 P R O B A B I L I T Y T H E O R Y A N D S T O C H A S T I C P R...
 
Computer Science and Information Science 3rd semester (2012-December) Questio...
Computer Science and Information Science 3rd semester (2012-December) Questio...Computer Science and Information Science 3rd semester (2012-December) Questio...
Computer Science and Information Science 3rd semester (2012-December) Questio...
 

Similar to Structured regression for efficient object detection

super vector machines algorithms using deep
super vector machines algorithms using deepsuper vector machines algorithms using deep
super vector machines algorithms using deepKNaveenKumarECE
 
Support Vector Machines is the the the the the the the the the
Support Vector Machines is the the the the the the the the theSupport Vector Machines is the the the the the the the the the
Support Vector Machines is the the the the the the the the thesanjaibalajeessn
 
Dual SVM Problem.pdf
Dual SVM Problem.pdfDual SVM Problem.pdf
Dual SVM Problem.pdfssuser8547f2
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
Support vector machine
Support vector machineSupport vector machine
Support vector machinePrasenjit Dey
 
Paper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelinePaper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelineChenYiHuang5
 
Finite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy LrFinite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy Lrguesta32562
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data ScienceAlbert Bifet
 
image_enhancement_spatial
 image_enhancement_spatial image_enhancement_spatial
image_enhancement_spatialhoneyjecrc
 
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solution
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solutionSIGNATE 国立国会図書館の画像データレイアウト認識 1st place solution
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solutionKoji Asami
 
support vector machine algorithm in machine learning
support vector machine algorithm in machine learningsupport vector machine algorithm in machine learning
support vector machine algorithm in machine learningSamGuy7
 
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Universitat Politècnica de Catalunya
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineSoma Boubou
 

Similar to Structured regression for efficient object detection (20)

super vector machines algorithms using deep
super vector machines algorithms using deepsuper vector machines algorithms using deep
super vector machines algorithms using deep
 
Support Vector Machines is the the the the the the the the the
Support Vector Machines is the the the the the the the the theSupport Vector Machines is the the the the the the the the the
Support Vector Machines is the the the the the the the the the
 
Dual SVM Problem.pdf
Dual SVM Problem.pdfDual SVM Problem.pdf
Dual SVM Problem.pdf
 
Integration
IntegrationIntegration
Integration
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
point processing
point processingpoint processing
point processing
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Paper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelinePaper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipeline
 
Svm V SVC
Svm V SVCSvm V SVC
Svm V SVC
 
Finite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy LrFinite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy Lr
 
Knapsack problem using fixed tuple
Knapsack problem using fixed tupleKnapsack problem using fixed tuple
Knapsack problem using fixed tuple
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
image_enhancement_spatial
 image_enhancement_spatial image_enhancement_spatial
image_enhancement_spatial
 
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solution
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solutionSIGNATE 国立国会図書館の画像データレイアウト認識 1st place solution
SIGNATE 国立国会図書館の画像データレイアウト認識 1st place solution
 
Support Vector Machine.ppt
Support Vector Machine.pptSupport Vector Machine.ppt
Support Vector Machine.ppt
 
svm.ppt
svm.pptsvm.ppt
svm.ppt
 
support vector machine algorithm in machine learning
support vector machine algorithm in machine learningsupport vector machine algorithm in machine learning
support vector machine algorithm in machine learning
 
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
 

More from zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

More from zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Recently uploaded

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Structured regression for efficient object detection

  • 1. Structured Regression for Efficient Object Detection Christoph Lampert www.christoph-lampert.org Max Planck Institute for Biological Cybernetics, Tübingen December 3rd, 2009 • [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008] • [Matthew B. Blaschko, C.L. ECCV 2008] • [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]
  • 3. Category-Level Object Localization What objects are present? person, car
  • 4. Category-Level Object Localization Where are the objects?
  • 5. Object Localization ⇒ Scene Interpretation A man inside of a car A man outside of a car ⇒ He’s driving. ⇒ He’s passing by.
  • 6. Algorithmic Approach: Sliding Window f (y1 ) = 0.2 f (y2 ) = 0.8 f (y3 ) = 1.5 Use a (pre-trained) classifier function f : • Place candidate window on the image. • Iterate: Evaluate f and store result. Shift candidate window by k pixels. • Return position where f was largest.
  • 7. Algorithmic approach: Sliding Window f (y1 ) = 0.2 f (y2 ) = 0.8 f (y3 ) = 1.5 Drawbacks: • single scale, single aspect ratio → repeat with different window sizes/shapes • search on grid → speed–accuracy tradeoff • computationally expensive
  • 8. New view: Generalized Sliding Window Assumptions: • Objects are rectangular image regions of arbitrary size. • The score of f is largest at the correct object position. Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image}
  • 9. New view: Generalized Sliding Window Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn the function f ? • How to do the optimization efficiently and robustly? (exhaustive search is too slow, O(w2 h2 ) elements).
  • 10. New view: Generalized Sliding Window Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn the function f ? • How to do the optimization efficiently and robustly? (exhaustive search is too slow, O(w2 h2 ) elements).
  • 11. New view: Generalized Sliding Window Use the problem’s geometric structure:
  • 12. New view: Generalized Sliding Window Use the problem’s geometric structure: • Calculate scores for sets of boxes jointly. • If no element can contain the maximum, discard the box set. • Otherwise, split the box set and iterate. → Branch-and-bound optimization • finds global maximum yopt
  • 13. New view: Generalized Sliding Window Use the problem’s geometric structure: • Calculate scores for sets of boxes jointly. • If no element can contain the maximum, discard the box set. • Otherwise, split the box set and iterate. → Branch-and-bound optimization • finds global maximum yopt
  • 14. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 .
  • 15. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4
  • 16. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval.
  • 17. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 .
  • 18. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 . • New box sets: [L, T, R1 , B]
  • 19. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 . • New box sets: [L, T, R1 , B] and [L, T, R2 , B].
  • 20. Calculating Scores for Box Sets Example: Linear Support-Vector-Machine f (y) := pi ∈y wi . + f upper (Y) = min(0, wi ) + max(0, wi ) pi ∈y∩ pi ∈y∪ Can be computed in O(1) using integral images.
  • 21. Calculating Scores for Box Sets J y Histogram Intersection Similarity: f (y) := j=1 min(hj , hj ). J ∪ y f upper (Y) = j=1 min(hj , hj ) As fast as for a single box: O(J) with integral histograms.
  • 22. Evaluation: Speed (on PASCAL VOC 2006) Sliding Window Runtime: • always: O(w2 h2 ) Branch-and-Bound (ESS) Runtime: • worst-case: O(w2 h2 ) • empirical: not more than O(wh)
  • 23. Extensions: Action classification: (y, t)opt = argmax(y,t)∈Y×T fx (y, t) • J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.
  • 24. Extensions: Localized image retrieval: (x, y)opt = argmaxy∈Y, x∈D fx (y) • C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009
  • 25. Extensions: Hybrid – Branch-and-Bound with Implicit Shape Model • A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009
  • 26.
  • 27. Generalized Sliding Window yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn f ? • How to do the optimization efficiently and robustly?
  • 28. Traditional Approach: Binary Classifier Training images: + + • x1 , . . . , xn show the object − − • x1 , . . . , xm show something else Train a classifier, e.g. • support vector machine, • boosted cascade, • artificial neural network,. . . Decision function f : {images} → R • f > 0 means “image shows the object.” • f < 0 means “image does not show the object.”
  • 29. Traditional Approach: Binary Classifier Drawbacks: • Train distribution = test distribution • No control over partial detections. • No guarantee to even find training examples again.
  • 30. Object Localization as Structured Output Regression Ideal setup: • function g : {all images} → {all boxes} to predict object boxes from images • train and test in the same way, end-to-end   gcar   =
  • 31. Object Localization as Structured Output Regression Ideal setup: • function g : {all images} → {all boxes} to predict object boxes from images • train and test in the same way, end-to-end Regression problem: • training examples (x1 , y1 ), . . . , (xn , yn ) ∈ X × Y xi are images, yi are bounding boxes • Learn a mapping g : X →Y that generalizes from the given examples: g(xi ) ≈ yi , for i = 1, . . . , n,
  • 32. Structured Support Vector Machine SVM-like framework by Tsochantaridis et al.: • Positive definite kernel k : (X × Y) × (X × Y)→R. ϕ : X × Y → H : (implicit) feature map induced by k. • ∆ : Y × Y → R: loss function • Solve the convex optimization problem n 1 2 minw,ξ w +C ξi 2 i=1 subject to margin constraints for i = 1, . . . , n : ∀y ∈ Y {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi , • unique solution: w ∗ ∈ H • I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and Interdependent Output Variables, Journal of Machine Learning Research (JMLR), 2005.
  • 33. Structured Support Vector Machine • w ∗ defines compatiblity function F (x, y) = w ∗ , ϕ(x, y) • best prediction for x is the most compatible y: g(x) := argmax F (x, y). y∈Y • evaluating g : X → Y is like generalized Sliding Window: for fixed x, evaluate quality function for every box y ∈ Y. for example, use previous branch-and-bound procedure!
  • 34. Joint Image/Box-Kernel: Example Joint kernel: how to compare one (image,box)-pair (x, y) with another (image,box)-pair (x , y )? kjoint , =k , is large. kjoint , =k , is small. kjoint , = kimage , could also be large.
  • 35. Loss Function: Example Loss function: how to compare two boxes y and y ? ∆(y, y ) := 1 − area overlap between y and y area(y ∩ y ) =1− area(y ∪ y )
  • 36. Structured Support Vector Machine n 1 2 • S-SVM Optimization: minw,ξ 2 w +C ξi i=1 subject to for i = 1, . . . , n : ∀y ∈ Y {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi ,
  • 37. Structured Support Vector Machine n 1 2 • S-SVM Optimization: minw,ξ 2 w +C ξi i=1 subject to for i = 1, . . . , n : ∀y ∈ Y {yi } : ∆(y, yi ) + w, ϕ(xi , y) − w, ϕ(xi , yi ) ≤ ξi , • Solve via constraint generation: • Iterate: Solve minimization with working set of contraints Identify argmaxy∈Y ∆(y, yi ) + w, ϕ(xi , y) Add violated constraints to working set and iterate • Polynomial time convergence to any precision ε • Similar to bootstrap training, but with a margin.
  • 38. Evaluation: PASCAL VOC 2006 Example detections for VOC 2006 bicycle, bus and cat. Precision–recall curves for VOC 2006 bicycle, bus and cat. • Structured regression improves detection accuracy. • New best scores (at that time) in 6 of 10 classes.
  • 39. Why does it work? Learned weights from binary (center) and structured training (right). • Both methods assign positive weights to object region. • Structured training also assigns negative weights to features surrounding the bounding box position. • Posterior distribution over box coordinates becomes more peaked.
  • 40. More Recent Results (PASCAL VOC 2009) aeroplane
  • 41. More Recent Results (PASCAL VOC 2009) bicycle
  • 42. More Recent Results (PASCAL VOC 2009) bird
  • 43. More Recent Results (PASCAL VOC 2009) boat
  • 44. More Recent Results (PASCAL VOC 2009) bottle
  • 45. More Recent Results (PASCAL VOC 2009) bus
  • 46. More Recent Results (PASCAL VOC 2009) car
  • 47. More Recent Results (PASCAL VOC 2009) cat
  • 48. More Recent Results (PASCAL VOC 2009) chair
  • 49. More Recent Results (PASCAL VOC 2009) cow
  • 50. More Recent Results (PASCAL VOC 2009) diningtable
  • 51. More Recent Results (PASCAL VOC 2009) dog
  • 52. More Recent Results (PASCAL VOC 2009) horse
  • 53. More Recent Results (PASCAL VOC 2009) motorbike
  • 54. More Recent Results (PASCAL VOC 2009) person
  • 55. More Recent Results (PASCAL VOC 2009) pottedplant
  • 56. More Recent Results (PASCAL VOC 2009) sheep
  • 57. More Recent Results (PASCAL VOC 2009) sofa
  • 58. More Recent Results (PASCAL VOC 2009) train
  • 59. More Recent Results (PASCAL VOC 2009) tvmonitor
  • 60. Extensions: Image segmentation with connectedness constraint: CRF segmentation connected CRF segmentation • S. Nowozin, C.L.: Global Connectivity Potentials for Random Field Models, CVPR 2009.
  • 61. Summary Object Localization is a step towards image interpretation. Conceptual approach instead of algorithmic: • Branch-and-bound evaluation: don’t slide a window, but solve an argmax problem, ⇒ higher efficiency • Structured regression training: solve the prediction problem, not a classification proxy. ⇒ higher localization accuracy • Modular and kernelized: easily adapted to other problems/representations, e.g. image segmentations