Robert Collins
CSE486




                       Lecture 10:
                 Pyramids and Scale Space
                                   Recall
         • Cascaded Gaussians
                 – Repeated convolution by a smaller Gaussian
                   to simulate effects of a larger one.
         • G*(G*f) = (G*G)*f [associativity]
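As a numerical check of the cascade property, the sketch below convolves two sampled Gaussians and measures the spread of the result. It assumes truncated, normalized kernels; the helper names (`gaussian_kernel`, `convolve`, `kernel_variance`) and the sigma values 1 and 2 are illustrative choices, not taken from the lecture.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled Gaussian on [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve(a, b):
    """Full 1-D discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def kernel_variance(kernel):
    """Variance of a normalized kernel, treating its weights as probabilities."""
    mean = sum(i * w for i, w in enumerate(kernel))
    return sum((i - mean) ** 2 * w for i, w in enumerate(kernel))

g1 = gaussian_kernel(1.0, 6)    # sigma = 1 (assumed example value)
g2 = gaussian_kernel(2.0, 12)   # sigma = 2 (assumed example value)
cascade = convolve(g1, g2)      # single kernel equivalent to applying g1 then g2

# Variances add under convolution, so sigma should be close to sqrt(1 + 4).
sigma_cascade = math.sqrt(kernel_variance(cascade))
print(sigma_cascade, math.sqrt(5.0))
```

Because variances add, smoothing with sigma = 1 and then sigma = 2 behaves like a single Gaussian with sigma = sqrt(5), which is why repeated small blurs can stand in for one large blur.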
                  Example: Cascaded Convolutions

                 [1 1] * [1 1] --> [1 2 1]
                 [1 1] * [1 2 1] --> [1 3 3 1]
                 [1 1] * [1 3 3 1] --> [1 4 6 4 1]
                 ...and so on.

                 The results are the rows of Pascal’s triangle:

                                      1 1
                                     1 2 1
                                    1 3 3 1
                                   1 4 6 4 1
                                 1 5 10 10 5 1
                                  and so on…
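The cascade above can be reproduced in a few lines. This is a minimal sketch using a hand-written `convolve` helper (an illustrative name, not part of the lecture) so that it needs no external libraries.

```python
def convolve(a, b):
    """Full 1-D discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Start from [1 1] and keep convolving with [1 1].
row = [1, 1]
rows = [row]
for _ in range(4):
    row = convolve([1, 1], row)
    rows.append(row)

for r in rows:
    print(r)
```

Each pass adds one element to the row, and the printed rows match the triangle shown on the slide.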
                 Aside: Binomial Approximation

             1 1
            1 2 1
           1 3 3 1
          1 4 6 4 1
        1 5 10 10 5 1
         and so on…
         The rows of Pascal’s triangle are the binomial coefficients.
                   Aside: Binomial Approximation

       Look at odd-length rows of Pascal’s triangle:

                      1 1
                     1 2 1        [1 2 1]/4 approximates a
                                  Gaussian with sigma = 1/sqrt(2)
                    1 3 3 1
                   1 4 6 4 1      [1 4 6 4 1]/16 approximates a
                                  Gaussian with sigma = 1
                 1 5 10 10 5 1
                  and so on…

      An easy way to generate integer-coefficient
      Gaussian approximations.
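The sigma values quoted above can be checked by treating each normalized row as a probability distribution and computing its standard deviation. The sketch below does that; `binomial_kernel` and `kernel_sigma` are illustrative helper names, and row n of the triangle is assumed to mean the coefficients C(n, 0) ... C(n, n).

```python
import math

def binomial_kernel(n):
    """Row n of Pascal's triangle: C(n, 0), ..., C(n, n)."""
    return [math.comb(n, k) for k in range(n + 1)]

def kernel_sigma(weights):
    """Standard deviation of a kernel after normalizing it to sum to 1."""
    total = sum(weights)
    mean = sum(i * w for i, w in enumerate(weights)) / total
    var = sum((i - mean) ** 2 * w for i, w in enumerate(weights)) / total
    return math.sqrt(var)

for n in (2, 4, 6):
    k = binomial_kernel(n)
    # The variance of row n is n/4, so sigma = sqrt(n)/2.
    print(k, sum(k), kernel_sigma(k))
```

Row n sums to 2^n, which gives the divisors 4 and 16, and has variance n/4, which gives sigma = 1/sqrt(2) for [1 2 1] and sigma = 1 for [1 4 6 4 1].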
                 From Homework 2
                 More about Cascaded Convolutions
                      (for the mathematically inclined)
     Fun facts:
     The distribution of the sum of two random variables
     X + Y is the convolution of their two distributions

     Given N i.i.d. random variables, X1 … XN, the distribution
     of their sum approaches a Gaussian distribution (aka the
     central limit theorem)

     Therefore:
     The repeated convolution of a (nonnegative)
     filter with itself takes on a Gaussian shape.
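A small experiment illustrates the claim. The sketch below convolves a 3-tap box filter with itself and compares the result against a Gaussian of the same mean and variance. The box filter, the repetition counts, and the helper names are all assumptions chosen for illustration.

```python
import math

def convolve(a, b):
    """Full 1-D discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def self_convolve(kernel, copies):
    """Convolve `copies` copies of kernel together (copies >= 1)."""
    out = kernel
    for _ in range(copies - 1):
        out = convolve(out, kernel)
    return out

def gaussian_fit_error(weights):
    """Largest gap between the weights and a Gaussian with matching mean and variance."""
    mean = sum(i * w for i, w in enumerate(weights))
    var = sum((i - mean) ** 2 * w for i, w in enumerate(weights))
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    return max(abs(w - norm * math.exp(-((i - mean) ** 2) / (2.0 * var)))
               for i, w in enumerate(weights))

box = [1.0 / 3.0] * 3                      # nonnegative, sums to 1, clearly not Gaussian
err_2 = gaussian_fit_error(self_convolve(box, 2))
err_32 = gaussian_fit_error(self_convolve(box, 32))
print(err_2, err_32)
```

The mismatch shrinks as more copies are convolved together, which is the central limit theorem at work on filter coefficients.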
                     Gaussian Smoothing at
                        Different Scales

                 [Figures: original image beside its Gaussian-smoothed
                  version, for sigma = 1, sigma = 3, and sigma = 10]
                         Idea for Today:
                 Form a Multi-Resolution Representation

                 [Figure: original image followed by smoothed versions
                  at sigma = 1, sigma = 3, and sigma = 10]
                        Pyramid Representations

                 Because a large amount of smoothing removes the
                 high-frequency detail in the image, we do
                 not need to keep all the pixels around!

                 Strategy: progressively reduce the number of
                 pixels as we smooth more and more.

                 Leads to a “pyramid” representation if we
                 subsample at each level.
                   Gaussian Pyramid

     • Synthesis: Smooth image with a Gaussian and
       downsample. Repeat.
     • Gaussian is used because it is self-reproducing
       (enables incremental smoothing).
     • Top levels come “for free”. Processing cost
       typically dominated by two lowest levels
       (highest resolution).
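The smooth-then-downsample loop can be sketched as follows. This is a minimal pure-Python version, assuming a separable [1 4 6 4 1]/16 blur with replicated borders and images stored as lists of rows; the function names and the 16x16 test image are illustrative.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]   # binomial stand-in for a Gaussian

def blur_rows(image):
    """Convolve every row with KERNEL, replicating the edge pixels."""
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
            for x in range(n)
        ])
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def blur(image):
    """Separable 2-D blur: rows first, then columns."""
    return transpose(blur_rows(transpose(blur_rows(image))))

def downsample(image):
    """Keep every second pixel in each direction."""
    return [row[::2] for row in image[::2]]

def gaussian_pyramid(image, levels):
    """Level 0 is the input; each later level is blur + downsample of the previous one."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(blur(pyramid[-1])))
    return pyramid

image = [[float((x + y) % 2) for x in range(16)] for y in range(16)]  # checkerboard
pyramid = gaussian_pyramid(image, 4)
sizes = [(len(level), len(level[0])) for level in pyramid]
pixel_counts = [h * w for h, w in sizes]
print(sizes, sum(pixel_counts))
```

Each level holds a quarter of the pixels of the one below it, so the whole pyramid costs less than 4/3 of the base image. That is the sense in which the top levels come nearly for free.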
                 Gaussian Pyramid

                 [Figure: stack of images forming a pyramid, low
                  resolution at the top, high resolution at the bottom]
                 Emphasis: Smaller Images
                  have Lower Resolution
                   Generating a Gaussian Pyramid

        Basic Functions:
                 Blur (convolve with Gaussian
                 to smooth image)

                 Downsample (halve image
                 width and height)

                 Upsample (double image
                 width and height)
                               Downsample




                 By the way: Subsampling is a bad idea unless
                 you have previously blurred/smoothed the
                 image! (because it leads to aliasing)
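A one-dimensional toy case shows the aliasing problem. The sketch below assumes a signal that alternates between 0 and 1 (the highest frequency a sampled signal can hold) and a [1 2 1]/4 blur with replicated borders; both are illustrative choices.

```python
def blur_121(signal):
    """Smooth with [1 2 1]/4, replicating the end samples."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out

signal = [float(i % 2) for i in range(16)]   # 0, 1, 0, 1, ... average brightness 0.5

naive = signal[::2]                 # subsample only: keeps just the 0 samples
smoothed = blur_121(signal)[::2]    # blur first, then subsample

print(naive)
print(smoothed)
```

Plain subsampling turns the pattern into a uniformly dark signal, which misrepresents the input. Blurring first gives 0.5 away from the border, which is the correct average brightness.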
                 To Elaborate: Thumbnails

                 [Figures: original image (262x195) reduced to 131x97,
                  65x48, and 32x24; at each size, downsampled (left)
                  vs. smoothed then downsampled (right)]
                                 Upsample




                 How to fill in the empty values?
                 Interpolation:
                 • initially set empty pixels to zero
                 • convolve upsampled image with Gaussian filter!
                     e.g. 5x5 kernel with sigma = 1.
                 • Must also multiply by 4. Explain why.
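The interpolation steps above can be sketched as follows. One way to see the factor of 4: after zero insertion only one pixel in four is nonzero, so a blur whose weights sum to 1 returns a quarter of the original brightness. The sketch assumes a separable [1 4 6 4 1]/16 kernel as a stand-in for the 5x5 Gaussian with sigma = 1; the function names are illustrative.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]   # assumed approximation of sigma = 1

def blur_rows(image):
    """Convolve every row with KERNEL, replicating the edge pixels."""
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
            for x in range(n)
        ])
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def blur(image):
    """Separable 2-D blur: rows first, then columns."""
    return transpose(blur_rows(transpose(blur_rows(image))))

def zero_insert(image):
    """Double the size, placing the original pixels at even coordinates."""
    h, w = len(image), len(image[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            out[2 * y][2 * x] = image[y][x]
    return out

def upsample(image):
    """Zero-insert, blur, then multiply by 4 to restore brightness."""
    blurred = blur(zero_insert(image))
    return [[4.0 * v for v in row] for row in blurred]

small = [[5.0] * 4 for _ in range(4)]        # constant test image
large = upsample(small)
interior = [row[2:-2] for row in large[2:-2]]  # skip pixels affected by the border
print(interior)
```

For the constant test image, the interior of the upsampled result equals the original value of 5; without the multiplication it would be 1.25.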
                         Specific Example
   From Crowley et al., “Fast Computation of Characteristic Scale
   using a Half-Octave Pyramid.” Proc. International Workshop
   on Cognitive Vision (CogVis), Zurich, Switzerland, 2002.

   General idea: cascaded filtering using [1 4 6 4 1] kernel to generate
   a pyramid with two images per octave (power of 2 change in
   resolution). When we reach a full octave, downsample the image.


                 blur --> blur --> downsample --> blur --> blur
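One way to reason about the cumulative blur in such a cascade is to add variances, counting each blur in units of the original image's pixels. The sketch below does this under simplifying assumptions that are not taken from the paper: the input starts unblurred, [1 4 6 4 1]/16 is treated as a Gaussian with sigma = 1, and each downsample exactly doubles the pixel spacing. The paper's actual schedule and resampling may differ, so the printed numbers are not expected to match its table.

```python
import math

def effective_sigmas(schedule, kernel_variance=1.0):
    """Track cumulative blur, in original-image pixels, through a schedule.

    schedule is a list of "blur" and "downsample" steps. Each blur adds
    kernel_variance in units of the current pixel size squared; each
    downsample doubles the pixel size. Variances of cascaded blurs add.
    """
    pixel_size = 1.0
    variance = 0.0
    sigmas = []
    for step in schedule:
        if step == "blur":
            variance += kernel_variance * pixel_size ** 2
            sigmas.append(math.sqrt(variance))
        elif step == "downsample":
            pixel_size *= 2.0
        else:
            raise ValueError("unknown step: " + step)
    return sigmas

# Hypothetical schedule following the blur, blur, downsample pattern.
schedule = ["blur", "blur", "downsample", "blur", "blur"]
sigmas = effective_sigmas(schedule)
print(sigmas)
```

The key point is that a sigma = 1 blur applied after one downsample acts like a sigma = 2 blur on the original grid, so later levels add blur much faster than earlier ones.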
                 Effective Sigma at Each Level

                 [Table of effective sigma at each pyramid level,
                  from Crowley et al.]

                 Can you explain how these values arise?
                      Concept: Scale Space
   Basic idea: different scales are appropriate for describing different
   objects in the image, and we may not know the correct scale/size
   ahead of time.
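A toy one-dimensional example shows why scale matters. The sketch below smooths a signal containing one narrow feature and one wide feature at several sigmas and reads off the response at each feature's center. The signal layout, the sigma values, and the helper names are assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(math.ceil(3.0 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve signal with a Gaussian, treating samples outside it as zero."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(acc)
    return out

signal = [0.0] * 64
signal[12] = 1.0                 # narrow feature: a single bright sample
for i in range(33, 48):
    signal[i] = 1.0              # wide feature: 15 bright samples centered at 40

# A small scale space: the same signal at increasing amounts of smoothing.
scale_space = {sigma: smooth(signal, sigma) for sigma in (0.5, 1.0, 2.0, 4.0)}
for sigma, smoothed in scale_space.items():
    print(sigma, round(smoothed[12], 3), round(smoothed[40], 3))
```

At small sigma both features are clearly visible, while at sigma = 4 the narrow feature has nearly vanished and the wide one remains. No single sigma describes both features well, which motivates keeping the whole stack.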
                 Example: Detecting “Blobs” at
                      Different Scales.




                           But first, we have to talk
                           about detecting blobs
                           at one scale...