Salient Object Detection by Composition

Jie Feng¹, Yichen Wei², Litian Tao³, Chao Zhang¹, Jian Sun²
¹ Key Laboratory of Machine Perception, Peking University
² Microsoft Research Asia
³ Microsoft Search Technology Center Asia
A key vision problem: object detection




• Fundamental for image understanding
• Extremely challenging
  – Huge number of object classes
  – Huge variations in object appearances
What are salient objects?

• Visually distinctive and semantically meaningful
• Inherently ambiguous and subjective




    Yes!             Yes? probably            No!
Why detect salient objects?

• Relatively easy: large and distinct


• Semantically important
1. Image summarization, cropping…
2. Object level matching, retrieval…
3. A generic object detector for later recognition
  –   avoid running thousands of different detectors
  –   a scalable system for image understanding
Traditional approach: saliency map




• Measures per-pixel importance
• Loses information and is insufficient for finding objects
Sliding window object detection

                               •   Face, human…
                               •   Car, bus…
                               •   Horse, dog…
                               •   Table, couch…
                               •   …




• Slide different size windows over all positions
• Evaluate a quality function, e.g., a car classifier
• Output the windows that are locally optimal
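The scan described above can be sketched in a few lines. This is a minimal illustration, not the authors' detector: the names `sliding_windows`, `score_windows`, and `toy_quality` are hypothetical, and the quality function is a stand-in (it only rewards windows centered near pixel (3, 3)) where a real system would plug in a classifier or a saliency measure. Selecting local optima is omitted here; the sketch only ranks all windows by score.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int,
                    shapes: Sequence[Tuple[int, int]],
                    stride: int) -> Iterator[Window]:
    """Yield every window of each (width, height) shape that fits in the image."""
    for w, h in shapes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


def score_windows(img_w: int, img_h: int,
                  shapes: Sequence[Tuple[int, int]],
                  stride: int,
                  quality: Callable[[Window], float]) -> List[Tuple[float, Window]]:
    """Evaluate the quality function on every window, best score first."""
    scored = [(quality(win), win)
              for win in sliding_windows(img_w, img_h, shapes, stride)]
    scored.sort(key=lambda sw: sw[0], reverse=True)
    return scored


def toy_quality(win: Window) -> float:
    """Hypothetical quality function: higher when the window center is near (3, 3)."""
    x, y, w, h = win
    cx, cy = x + w / 2, y + h / 2
    return -((cx - 3) ** 2 + (cy - 3) ** 2)


if __name__ == "__main__":
    ranked = score_windows(4, 4, [(2, 2)], 1, toy_quality)
    print(ranked[0])
```

Keeping the quality function as a parameter mirrors the slide's point: the scanning loop is the same whether the function is a car classifier or a generic saliency score.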
Salient object detection by composition

• A ‘composition’ based window saliency measure
   – intuitive and generalizes to different objects


• A sliding window based generic object detector
   – fast and practical: 1-2 seconds per image
   – a few dozen to a few hundred output windows


• Effective pre-processing for later recognition tasks
It is hard to represent a salient window




• Given image I and window W
• saliency(W) = cost of composing W using (I-W)
Benefits of ‘composition’ definition

•
Part based representation


Inside parts of W: $\{S_i^1, \ldots, S_i^3\}$
Outside parts in I-W: $\{S_o^1, \ldots, S_o^{10}\}$




• Each part S has an (inside/outside) area A(S)
• Each part pair (p, q) has a composition cost c(p, q)
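A small sketch of how the inside and outside areas A(S) could be tallied for one window. It assumes the segmentation is available as a 2D grid of integer segment labels, which the slides do not specify; the function name `part_areas` and the toy label map are hypothetical. A segment that straddles the window boundary gets both an inside and an outside area.

```python
from collections import Counter
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def part_areas(labels: List[List[int]], window: Window) -> Tuple[Counter, Counter]:
    """Count, per segment label, the pixels inside and outside the window."""
    x0, y0, w, h = window
    inside: Counter = Counter()
    outside: Counter = Counter()
    for y, row in enumerate(labels):
        for x, seg in enumerate(row):
            if x0 <= x < x0 + w and y0 <= y < y0 + h:
                inside[seg] += 1
            else:
                outside[seg] += 1
    return inside, outside


if __name__ == "__main__":
    # Toy 4x4 label map with four segments (0..3).
    toy_labels = [
        [0, 0, 1, 1],
        [0, 2, 2, 1],
        [0, 2, 2, 1],
        [3, 3, 3, 3],
    ]
    print(part_areas(toy_labels, (0, 0, 2, 2)))
```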
Generate parts by over-segmentation

Typically 100-200 segments in a natural image




       P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. IJCV, 2004
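For readers unfamiliar with the cited method, here is a simplified sketch of its merge rule on a grayscale grid: pixels are graph nodes, 4-neighbor edges are weighted by intensity difference, and two components merge when the connecting edge is no heavier than either component's internal variation plus k divided by its size. This is a teaching sketch under stated assumptions, not the reference implementation: it skips the Gaussian smoothing and the minimum-size cleanup, and the function name `segment` and the parameter value are hypothetical.

```python
from typing import List


def segment(gray: List[List[float]], k: float = 1.0) -> List[List[int]]:
    """Graph-based over-segmentation of a grayscale grid; returns a label map."""
    h, w = len(gray), len(gray[0])
    parent = list(range(h * w))
    size = [1] * (h * w)
    internal = [0.0] * (h * w)  # largest edge weight merged into each component

    def find(a: int) -> int:
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Build 4-neighbor edges weighted by absolute intensity difference.
    edges = []
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                edges.append((abs(gray[y][x] - gray[y][x + 1]), i, i + 1))
            if y + 1 < h:
                edges.append((abs(gray[y][x] - gray[y + 1][x]), i, i + w))
    edges.sort()

    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        threshold = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if weight <= threshold:
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = weight  # edges are sorted, so this is the new maximum

    return [[find(y * w + x) for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    image = [[0, 0, 10, 10] for _ in range(4)]
    for row in segment(image):
        print(row)
```

A larger k favors larger segments; the slides' figure of 100-200 segments per image would come from tuning that parameter on real images.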
An illustrative ‘composition’ example


W = {A, B, C, D, E}

saliency(W) = cost(A,a) + cost(B,b) + cost(C,c) + cost(D,d) + cost(E,e)

[Figure: inside parts A, B, ... matched to outside parts a, b, ...]
Computational principles

1. Appearance proximity
2. Spatial proximity
3. Non-reusability
4. Non-scale-bias


• Intuitive perceptions about saliency
1. Appearance proximity

[Figure: part p with outside parts q1 and q2; c(p, q1) = 0.6, c(p, q2) = 0.2]




• Salient parts have distinct appearances
• q1 and q2 are equally distant from p; q2 is more similar, so its cost is lower
2. Spatial proximity


[Figure: part p with outside parts q1 and q2; c(p, q2) = 0.2, c(p, q1) = 0.3]


• Salient parts are far from similar parts
• q1 and q2 are equally similar to p; q2 is closer, so its cost is lower
3. Non-reusability




• An outside part can be used only once
• Robust to background clutter
4. Non-scale-bias



[Figure: two example windows, labeled 0.3 and 0.6]




• Normalize by window area to avoid a bias toward large windows
• A tight bounding box scores higher than a loose one
Define composition cost c(p, q)

•
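The formula on this slide did not survive extraction, so the following is only an illustrative stand-in that respects the first two principles (appearance proximity and spatial proximity); it should not be read as the authors' definition. Everything here is an assumption: the histogram-based appearance distance, the centroid-based spatial distance, the Gaussian weighting, the `sigma` value, and the function names.

```python
import math
from typing import Sequence, Tuple


def appearance_distance(hist_p: Sequence[float], hist_q: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(hist_p, hist_q))


def spatial_distance(center_p: Tuple[float, float],
                     center_q: Tuple[float, float],
                     image_diagonal: float) -> float:
    """Centroid distance as a fraction of the image diagonal."""
    return math.dist(center_p, center_q) / image_diagonal


def composition_cost(hist_p, hist_q, center_p, center_q,
                     image_diagonal: float, sigma: float = 0.5) -> float:
    """Hypothetical c(p, q): grows with appearance distance and with spatial distance."""
    d_a = appearance_distance(hist_p, hist_q)
    d_s = spatial_distance(center_p, center_q, image_diagonal)
    # Weight w is 0 for co-located parts and approaches 1 for distant ones.
    w = 1.0 - math.exp(-((d_s / sigma) ** 2))
    # Blend the appearance distance with the maximum cost (1.0) as parts move apart.
    return (1.0 - w) * d_a + w * 1.0


if __name__ == "__main__":
    p_hist, q_hist = [1.0, 0.0], [0.8, 0.2]
    print(composition_cost(p_hist, q_hist, (0, 0), (10, 0), image_diagonal=100.0))
```

With this form, two outside parts at the same distance are ranked by similarity, and two equally similar parts are ranked by distance, which matches the qualitative behavior the preceding slides describe.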
Part based composition

• Find outside parts with the same total area as the inside parts and the smallest composition cost
• Need to decide which outside part composes which inside part, and with how much area
• Formulated as an Earth Mover’s Distance (EMD)
   – optimal solution has polynomial (cubic) complexity
• A greedy optimization
   – pre-computation + incremental sliding window update
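The bullets above describe a transportation problem. One way to write it down is sketched here; the symbols $A_i(p)$ (inside area of part $p$), $A_o(q)$ (outside area of part $q$), $f_{pq}$ (area of $q$ used to compose $p$), and $|W|$ (window area) are notation assumed for this sketch, not taken from the slides.

```latex
\begin{aligned}
\text{saliency}(W) \;=\; \frac{1}{|W|}\,\min_{f}\; & \sum_{p}\sum_{q} f_{pq}\, c(p, q) \\
\text{subject to}\quad & \sum_{q} f_{pq} = A_i(p) \quad \text{for every inside part } p, \\
& \sum_{p} f_{pq} \le A_o(q) \quad \text{for every outside part } q, \\
& f_{pq} \ge 0 .
\end{aligned}
```

The equality constraint says every inside pixel must be composed, the inequality encodes non-reusability, and the division by $|W| = \sum_p A_i(p)$ is the normalization that removes the bias toward large windows. The problem is feasible whenever the total outside area is at least $|W|$.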
Greedy composition algorithm
•
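The pseudocode for this slide was lost in extraction. The sketch below shows one plausible greedy scheme consistent with the stated principles, not the authors' algorithm: each inside part takes area from its cheapest outside parts first, used area is never available again, and the total is divided by the window area. The function name, the processing order of inside parts (dictionary order), and the `max_cost` charged for area that cannot be composed are all assumptions.

```python
from typing import Callable, Dict, Hashable


def greedy_saliency(inside_area: Dict[Hashable, float],
                    outside_area: Dict[Hashable, float],
                    cost: Callable[[Hashable, Hashable], float],
                    max_cost: float = 1.0) -> float:
    """Greedy approximation of the area-normalized composition cost of a window."""
    remaining = dict(outside_area)  # outside area still unused (non-reusability)
    window_area = sum(inside_area.values())
    if window_area == 0:
        return 0.0
    total = 0.0
    for p, need in inside_area.items():
        # Try the cheapest outside parts first.
        for q in sorted(remaining, key=lambda q: cost(p, q)):
            if need <= 0:
                break
            used = min(need, remaining[q])
            total += used * cost(p, q)
            remaining[q] -= used
            need -= used
        # Any area left uncomposed is charged the maximum cost.
        total += need * max_cost
    return total / window_area  # non-scale-bias normalization


if __name__ == "__main__":
    table = {("A", "a"): 0.2, ("A", "b"): 0.6}
    print(greedy_saliency({"A": 4}, {"a": 2, "b": 4}, lambda p, q: table[(p, q)]))
```

In the example, part A needs 4 units of area, takes 2 from the cheap part a and the remaining 2 from b, giving (2 × 0.2 + 2 × 0.6) / 4.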
Algorithm pseudo code
Pre-computation and initialization

•
More implementation details

• 6 window sizes: 2% to 50% of image area
• 7 aspect ratios: 1:2 to 2:1
• 100-200 segments
• 1-2 seconds for a 300 by 300 image


• Find locally optimal windows by non-maximum suppression
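The window grid and the suppression step can be sketched as follows. The slides give only the ranges (6 sizes from 2% to 50% of the image area, 7 aspect ratios from 1:2 to 2:1); the geometric spacing between those endpoints, the IoU-based overlap test, its 0.5 threshold, and the function names are assumptions made for this sketch.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def window_shapes(img_w: int, img_h: int, n_sizes: int = 6, n_ratios: int = 7,
                  min_frac: float = 0.02, max_frac: float = 0.5,
                  min_ratio: float = 0.5, max_ratio: float = 2.0) -> List[Tuple[int, int]]:
    """Enumerate (width, height) pairs, geometrically spaced in area and aspect ratio."""
    shapes = []
    for i in range(n_sizes):
        frac = min_frac * (max_frac / min_frac) ** (i / (n_sizes - 1))
        area = frac * img_w * img_h
        for j in range(n_ratios):
            ratio = min_ratio * (max_ratio / min_ratio) ** (j / (n_ratios - 1))
            w = round(math.sqrt(area * ratio))  # ratio = width / height
            h = round(math.sqrt(area / ratio))
            if w <= img_w and h <= img_h:
                shapes.append((w, h))
    return shapes


def iou(a: Window, b: Window) -> float:
    """Intersection over union of two axis-aligned windows."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def non_max_suppression(scored: List[Tuple[float, Window]],
                        iou_threshold: float = 0.5) -> List[Tuple[float, Window]]:
    """Keep a window only if it does not overlap a better-scoring kept window."""
    kept: List[Tuple[float, Window]] = []
    for score, win in sorted(scored, key=lambda sw: sw[0], reverse=True):
        if all(iou(win, k) <= iou_threshold for _, k in kept):
            kept.append((score, win))
    return kept


if __name__ == "__main__":
    print(len(window_shapes(300, 300)))
    print(non_max_suppression([(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 10, 10)),
                               (0.5, (20, 20, 10, 10))]))
```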
Evaluation on PASCAL VOC 07

• Designed for object detection
   – 20 object classes
   – Large object and background variation
   – Challenging for traditional saliency methods
• Not entirely suitable for salient object detection
   – Not all labeled objects are salient: small, occluded, repetitive
   – Not all salient objects are labeled: only 20 classes
• Still the best database available
Yellow: correct, Red: wrong, Blue: ground truth




     top 5 salient windows
Outperforms the state-of-the-art




•   Objectness: B. Alexe, T. Deselaers, and V. Ferrari. What is an object? In CVPR, 2010.
•   Objectness uses mainly local cues: it finds windows that are locally salient but not globally salient
Yellow: correct, Red: wrong, Blue: ground truth

[Figures: side-by-side results, ours vs. objectness]
Failure cases: too complex
Failure cases: lack of semantics

• Partial background with object: man with background
• Not annotated objects: painting, pillows
• Similar objects together: two chairs
Failure cases: lack of semantics

• Partial object or object parts: wheels and seat
#windows vs. detection rate

  #top windows    5      10     20     30     50
  recall          0.25   0.33   0.44   0.50   0.57



• Find many objects within a few windows
• A practical pre-processing tool
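A recall-at-k figure of this kind can be computed as sketched below. The slides do not state the matching rule, so the sketch assumes the usual PASCAL-style criterion (a ground-truth box counts as found when some returned window overlaps it with intersection over union above 0.5); the function names and the toy boxes are hypothetical and unrelated to the reported numbers.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def recall_at_k(ranked_windows: List[Box], ground_truth: List[Box],
                k: int, min_iou: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by at least one of the top-k windows."""
    if not ground_truth:
        return 0.0
    top = ranked_windows[:k]
    hits = sum(1 for gt in ground_truth
               if any(iou(win, gt) > min_iou for win in top))
    return hits / len(ground_truth)


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (50, 50, 10, 10)]
    ranked = [(1, 1, 10, 10), (100, 100, 5, 5), (50, 50, 10, 10)]
    for k in (1, 2, 3):
        print(k, recall_at_k(ranked, truth, k))
```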
Evaluation on MSRA database

• Less challenging: only a single large object
   – T. Liu, J. Sun, N. Zheng, X. Tang, and H. Shum. Learning to detect a salient object. In CVPR, 2007



• Use the most salient window of our approach in evaluation
   – pixel level precision/recall is comparable with previous methods


• Our approach is principled for multi-object detection
   – benefits less from the database’s simplicity than previous methods
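Pixel-level precision and recall for a single output window can be sketched as below, assuming the ground truth is also an axis-aligned rectangle (an assumption about the evaluation setup; the slides do not spell it out). Precision is the share of window pixels that fall on the object, recall the share of object pixels covered by the window. The function name is hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def window_precision_recall(window: Box, truth: Box) -> Tuple[float, float]:
    """Pixel-level precision and recall of a window against a ground-truth rectangle."""
    iw = min(window[0] + window[2], truth[0] + truth[2]) - max(window[0], truth[0])
    ih = min(window[1] + window[3], truth[1] + truth[3]) - max(window[1], truth[1])
    inter = max(iw, 0) * max(ih, 0)
    window_area = window[2] * window[3]
    truth_area = truth[2] * truth[3]
    precision = inter / window_area if window_area else 0.0
    recall = inter / truth_area if truth_area else 0.0
    return precision, recall


if __name__ == "__main__":
    print(window_precision_recall((0, 0, 10, 10), (5, 0, 10, 10)))
```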
Summary

•

Iccv11 salientobjectdetection
