Class-Specific, Top-Down Segmentation. Eran Borenstein and Shimon Ullman. Presenter: Rafi Zachut. Instructor: Lior Wolf.
Goal: to create a figure-ground map using class-specific criteria. Motivation: human vision.
Bottom-up Segmentation: relies on image-based criteria, such as grey-level or texture uniformity and the smoothness and continuity of bounding contours.
Difficulty of class-specific segmentation: large variability of shapes within a given class. Solution: use a fragment-based representation (extracted from training examples) and cover a novel object with fragments (like a puzzle) to define the figure-ground map.
Example
Training – Searching the fragments: 1. Generate a large number of candidate fragments from the class images (C). 2. For every fragment F_i: find the maximum correlation S_i for each image in C and in NC (non-class images), and set the detection threshold θ_i such that p(S_i > θ_i | NC) ≤ α. 3. Select the K fragments with the best p(S_i > θ_i | C).
Fragment information: grey-level template (left); figure-ground label (right); reliability value p(S_i > θ_i | C).
Segmentation by Optimal Cover: the fragments of a cover define a figure-ground map, so an "optimal cover" is needed. The quality of a cover is measured by individual match, consistency, and fragment reliability.
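The threshold and selection steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the maximum correlations S_i have already been computed for every image, estimates the probabilities as empirical fractions, and uses hypothetical names (`detection_threshold`, `select_fragments`) and made-up scores.

```python
import math

def detection_threshold(nonclass_scores, alpha):
    """Lowest observed score theta with fraction(S > theta | NC) <= alpha."""
    ranked = sorted(nonclass_scores, reverse=True)
    allowed = math.floor(alpha * len(ranked))  # tolerated false alarms
    if allowed >= len(ranked):
        return float("-inf")  # every non-class image may fire
    return ranked[allowed]

def detection_rate(class_scores, theta):
    """Empirical estimate of p(S > theta | C)."""
    return sum(s > theta for s in class_scores) / len(class_scores)

def select_fragments(class_scores, nonclass_scores, alpha, k):
    """Return the k fragments with the best detection rate as (name, theta, rate)."""
    scored = []
    for name in class_scores:
        theta = detection_threshold(nonclass_scores[name], alpha)
        scored.append((name, theta, detection_rate(class_scores[name], theta)))
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]

# Made-up maximum correlations for three hypothetical fragments.
CLASS = {"a": [0.9, 0.85, 0.95, 0.4],
         "b": [0.5, 0.6, 0.35, 0.3],
         "c": [0.9, 0.9, 0.9, 0.9]}
NONCLASS = {"a": [0.1, 0.2, 0.3, 0.8, 0.5],
            "b": [0.1, 0.2, 0.58, 0.4, 0.3],
            "c": [0.95, 0.2, 0.3, 0.1, 0.5]}

selected = select_fragments(CLASS, NONCLASS, alpha=0.2, k=2)
```

The detection rate stored with each selected fragment plays the role of the reliability value p(S_i > θ_i | C).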
Individual Match: similarity between a fragment and the region it covers. Combines region normalized correlation with edge detection :
Individual Match – Cont’d
Consistency: cover quality from a global view. Consistency between two overlapping fragments F_i and F_j is defined as :
Fragment Reliability: reliable fragments guide the covering (similar to a puzzle).
A Cover Score: a cover is an assignment of fragments to positions : Its score is defined by :
Finding the Optimal Score: an exhaustive search for the optimal cover is infeasible. Instead, a greedy iterative algorithm is used that converges to a local maximum, typically within 2-3 iterations. Complexity: linear in the image size and in the number of fragments.
The Cover Algorithm. Pre-processing: for each image position, find the best fragment (position score = individual match × reliability), then select the sub-window with the maximum score of M positions. Iteration: choose the best M unused fragments, add the subset that maximizes the score, and remove all fragments that reduce the score.
Scaling: no analytical support. The pre-processing stage is applied at 5 different scales of the target image, and the algorithm continues with the best window at the best scale.
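The iteration step can be sketched as a small greedy loop. The cover score itself is not given in the text here, so the sketch takes it as a caller-supplied function; `toy_score`, the fragment names, and all numbers are illustrative assumptions, and the candidates are assumed to be pre-sorted by position score.

```python
from itertools import combinations

def greedy_cover(candidates, score, m=3, max_iters=10):
    """Greedy cover search; candidates are sorted best-first, score maps a set to a number."""
    cover = frozenset()
    for _ in range(max_iters):
        previous = cover
        # Choose the best M fragments that are not in the cover yet.
        unused = [c for c in candidates if c not in cover][:m]
        # Add the subset of them that maximizes the score (possibly none).
        best = cover
        for size in range(1, len(unused) + 1):
            for subset in combinations(unused, size):
                trial = cover | frozenset(subset)
                if score(trial) > score(best):
                    best = trial
        cover = best
        # Remove every fragment whose removal raises the score.
        for frag in sorted(cover):
            if score(cover - {frag}) > score(cover):
                cover = cover - {frag}
        if cover == previous:  # no change: a local maximum was reached
            break
    return cover

# Toy score (assumption): individual scores plus pairwise consistency terms.
INDIVIDUAL = {"head": 0.9, "back": 0.8, "legs": 0.7, "noise": 0.1}
PAIR = {frozenset({"head", "back"}): 0.2, frozenset({"back", "noise"}): -1.0}

def toy_score(cover):
    total = sum(INDIVIDUAL[f] for f in cover)
    total += sum(PAIR.get(frozenset(p), 0.0) for p in combinations(sorted(cover), 2))
    return total

CANDIDATES = ["head", "back", "legs", "noise"]  # sorted by individual score
result = greedy_cover(CANDIDATES, toy_score, m=2)
```

With these toy numbers the loop stops after three passes, once the inconsistent "noise" fragment is rejected and the cover no longer changes.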
Results
Advantages: automatic training; a meaningful approximation of the figure-ground map of the image; can be extended to support multiple classes. Disadvantages: covering highly variable parts is difficult; inaccurate delineation of object boundaries.
Combining Top-down and Bottom-up Segmentation. Eran Borenstein, Eitan Sharon, Shimon Ullman.
Bottom-up vs. Top-down. Top-down: inaccurate boundaries; figure-ground approximation; uses class information. Bottom-up: accurate boundaries; multiple segments; relies on image criteria.
Bottom-up in a nutshell: applies successive, recursive image coarsening. Homogeneous segments at one level are used to form larger segments at the next level, i.e. the image is segmented into fewer and fewer segments. Complexity: linear in the number of image pixels.
Bottom-up segmentation tree. Nodes: segments at different coarsening levels. Arcs: connect a segment to its sub-segments at a finer level.
Bottom-up Saliency Measurement: ranks segments by their distinctiveness. Roughly: internal homogeneity + dissimilarity with the surroundings = saliency. (Figure: examples of low-saliency and high-saliency segments.)
Top-down & Bottom-up - Goal: to provide an accurate figure-ground map. (Figure: top-down result vs. combined top-down and bottom-up result.)
The Approach: work is constrained to using the bottom-up segments; the labeling (figure/ground) of the tree leaves defines the final map. Use the top-down approximation to overcome bottom-up problems: group together segments belonging to the object despite image-based dissimilarity, and break apart homogeneous segments that contain both figure and ground regions.
Top-down & Bottom-up conflict: top-down wants the final map to be as close as possible to its own map; bottom-up wants to keep salient segments complete. When the top-down map crosses a salient segment, a decision must be made: break the salient segment, or deviate from the top-down map.
The Top-down cost: if the final map (defined by the labeling of the leaves) is C and the top-down map is T, then the top-down cost is ‖C − T‖^2. Bottom-up cost: each segment S_i labeled differently from its parent pays b_i = |S_i| (1 − h_i), where h_i is its saliency (between 0 and 1). Low-saliency segments pay more (they break salient segments).
The Total cost: simply a weighted sum of both costs, i.e. ‖C − T‖^2 + λ ∑_i b_i. The top-down requirement is constrained by the bottom-up tree. The variables are the segment labels.
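As a worked illustration of the total cost, the sketch below evaluates it for one labeling of a tiny tree. The data layout is an assumption made for the example: each node is a dict with hypothetical keys `parent`, `size`, `saliency` and `topdown`, labels are +1 (figure) or -1 (ground), and the top-down map is taken to be constant inside each leaf, so a leaf contributes size × (label − topdown)^2 to ‖C − T‖^2.

```python
def total_cost(tree, labels, lam=1.0):
    """Top-down cost plus lam times the bottom-up cost for one labeling."""
    top_down = 0.0
    bottom_up = 0.0
    for name, node in tree.items():
        # Only leaves carry a top-down value; they define the final map C.
        if node["topdown"] is not None:
            top_down += node["size"] * (labels[name] - node["topdown"]) ** 2
        # A segment labeled differently from its parent pays |S| * (1 - h).
        parent = node["parent"]
        if parent is not None and labels[name] != labels[parent]:
            bottom_up += node["size"] * (1.0 - node["saliency"])
    return top_down + lam * bottom_up

# Toy tree (made-up numbers): a root with two leaves.
TREE = {
    "root": {"parent": None, "size": 10, "saliency": 0.0, "topdown": None},
    "A": {"parent": "root", "size": 6, "saliency": 0.9, "topdown": 1.0},
    "B": {"parent": "root", "size": 4, "saliency": 0.5, "topdown": -0.5},
}

split = total_cost(TREE, {"root": 1, "A": 1, "B": -1}, lam=0.5)
merged = total_cost(TREE, {"root": 1, "A": 1, "B": 1}, lam=0.5)
```

Here labeling B as ground follows the top-down map and pays a small bottom-up penalty, while forcing B to figure avoids the penalty but moves the map far from T.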
Minimizing the cost function: if N is the number of segments in the tree, there are 2^N possible labelings. Luckily, the cost function can be minimized by the sum-product algorithm in O(N). See attached example.
Confidence map: after solving, each segment S holds two values: m_S(+1), the minimum cost when S is figure, and m_S(−1), the minimum cost when S is ground. The confidence in the labeling of segment S is |m_S(+1) − m_S(−1)| / |S|. If the minimum cost is not sensitive to the labeling of S, the confidence is low. A confidence map can be constructed from the confidence of the leaves.
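The O(N) minimization can be illustrated with a min-sum message pass over the tree (the minimization form of this kind of message passing). This is a sketch under assumptions rather than the authors' code: nodes are dicts with hypothetical keys `parent`, `size`, `saliency` and `topdown`, the top-down map is taken to be constant inside each leaf, and all numbers are made up.

```python
LABELS = (+1, -1)  # +1 = figure, -1 = ground

def min_cost_labeling(tree, root, lam=1.0):
    """Return (minimum total cost, labeling) with one bottom-up and one top-down pass."""
    children = {name: [] for name in tree}
    for name, node in tree.items():
        if node["parent"] is not None:
            children[node["parent"]].append(name)

    # Pre-order traversal: every parent appears before its children.
    order, stack = [], [root]
    while stack:
        name = stack.pop()
        order.append(name)
        stack.extend(children[name])

    subtree = {}  # subtree[s][x]: best cost below s when s has label x
    message = {}  # message[s][xp]: best cost of s's subtree when its parent has label xp
    choice = {}   # choice[s][xp]: label of s achieving message[s][xp]
    for name in reversed(order):  # children are processed before parents
        node = tree[name]
        subtree[name] = {}
        for x in LABELS:
            cost = sum(message[child][x] for child in children[name])
            if not children[name]:  # leaf: distance from the top-down map
                cost += node["size"] * (x - node["topdown"]) ** 2
            subtree[name][x] = cost
        if node["parent"] is None:
            continue
        switch = lam * node["size"] * (1.0 - node["saliency"])
        message[name], choice[name] = {}, {}
        for xp in LABELS:
            options = {x: subtree[name][x] + (switch if x != xp else 0.0) for x in LABELS}
            best = min(options, key=options.get)
            choice[name][xp] = best
            message[name][xp] = options[best]

    # Top-down pass: fix the root, then read off each child's best label.
    labels = {root: min(LABELS, key=lambda x: subtree[root][x])}
    for name in order[1:]:
        labels[name] = choice[name][labels[tree[name]["parent"]]]
    return subtree[root][labels[root]], labels

TREE = {
    "R": {"parent": None, "size": 10, "saliency": 0.0, "topdown": None},
    "A": {"parent": "R", "size": 6, "saliency": 0.9, "topdown": 1.0},
    "B": {"parent": "R", "size": 4, "saliency": 0.5, "topdown": None},
    "B1": {"parent": "B", "size": 3, "saliency": 0.2, "topdown": -1.0},
    "B2": {"parent": "B", "size": 1, "saliency": 0.1, "topdown": 0.5},
}
cost, labels = min_cost_labeling(TREE, "R")
```

Each node is visited a constant number of times, which is where the linear complexity comes from; in the toy tree the salient leaf A splits off as figure while the rest of the image is labeled ground, except the small leaf B2.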
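The confidence formula is simple enough to state directly in code. The sketch assumes the two minimum costs per leaf are already available from the minimization; the function names and the numbers are illustrative only.

```python
def labeling_confidence(m_plus, m_minus, size):
    """|m_S(+1) - m_S(-1)| / |S| for one segment."""
    return abs(m_plus - m_minus) / size

def confidence_map(leaves):
    """leaves maps a leaf name to (m_plus, m_minus, size)."""
    return {name: labeling_confidence(m_plus, m_minus, size)
            for name, (m_plus, m_minus, size) in leaves.items()}

# Made-up minimum costs for three hypothetical leaves.
conf = confidence_map({
    "A": (2.0, 26.0, 6),   # flipping A is expensive: high confidence
    "B2": (2.0, 3.0, 1),   # flipping B2 costs little: lower confidence
    "X": (5.0, 5.0, 4),    # cost insensitive to the label: zero confidence
})
```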
Results
Conclusion: a general approach that uses top-down information to group together segments belonging to the object despite image-based dissimilarity, and to break apart homogeneous segments that contain both figure and ground regions. Efficient algorithm: linear in the number of tree nodes. Reliable confidence map with no extra computation.
Thank You

Rafi Zachut's slides on class specific segmentation
