Elastic Edge Boxes for Object
Proposal on RGB-D Images
Jing Liu, Tongwei Ren, Jia Bei
Nanjing University
January 5, 2016
MMM 2016, Paper ID: 86
Multimedia AnalyzinG
and UnderStanding
MAGUS
Outline
• Motivation
• Elastic Edge Boxes Method
• Experiments
• Conclusion
Object Proposal
• Aims to detect bounding boxes that possibly contain class-independent objects in an image
• Applications: object detection, image segmentation, image retrieval
[Figure: example proposals labelled "hit" and "miss"]
Object Proposal is Challenging
• High recall
• High efficiency
• High accuracy
• Low intersection over union (IoU) is not enough
[Figure: example boxes at IoU = 0.5 and IoU = 0.8]
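To make the IoU thresholds on this slide concrete, here is a minimal sketch of the standard intersection-over-union computation for two axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box convention and the example coordinates are assumptions chosen for illustration, not values from the slides.

```python
from typing import Tuple

# Assumed convention: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 100, 100)
loose = (0, 0, 100, 50)   # covers only half of the object
tight = (0, 0, 100, 80)   # covers most of the object
print(iou(ground_truth, loose), iou(ground_truth, tight))
```

A proposal covering only half of the object already passes the IoU = 0.5 test, which illustrates why a low threshold says little about localisation quality.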
Current Methods
Window scoring
• Generate a pool of boxes and score the boxes
• Efficient but not accurate enough
Grouping
• Over-segment images and merge the segments
• Accurate but not efficient enough
[Uijlings et al., IJCV13] [Cheng et al., CVPR14]
How can these two strategies be combined to obtain good performance in both efficiency and accuracy?
Overview
Elastic edge box for RGB-D object proposal
• Input: RGB channels and depth channel
• Step 1: Initial boxes generation
• Step 2: Elastic range search
• Step 3: Bounding box adjustment
• Output: result
Initial Boxes Generation
• Perform sliding window to sample boxes
• Calculate score by contours wholly enclosed in a box
• Utilize the edge boxes method [Dollár et al., ECCV14]
[Figure: edge detection result; initial boxes]
Elastic Range Search
• Super-pixels straddling the box form the elastic range (yellow)
• Super-pixels wholly included in the box represent the object (cyan)
• Super-pixels adjacent to the elastic range, in similar quantity to the object part, represent the background (blue)
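The partition described above can be sketched on a super-pixel label map. This is a minimal illustration under assumptions: `labels` is a hypothetical 2-D grid of super-pixel ids (as a segmentation step would produce), adjacency is taken as 4-connectivity, and the step that balances the amount of background against the object part is left out.

```python
from typing import List, Set, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max-exclusive


def partition_superpixels(labels: List[List[int]], box: Box):
    """Split super-pixel ids into object, elastic-range and background sets."""
    x0, y0, x1, y1 = box
    inside: Set[int] = set()
    outside: Set[int] = set()
    for y, row in enumerate(labels):
        for x, sp in enumerate(row):
            (inside if x0 <= x < x1 and y0 <= y < y1 else outside).add(sp)
    elastic = inside & outside   # straddle the box border
    obj = inside - outside       # wholly included in the box
    # Background candidates: super-pixels fully outside the box that touch the elastic range.
    background: Set[int] = set()
    h, w = len(labels), len(labels[0])
    for y in range(h):
        for x in range(w):
            if labels[y][x] not in elastic:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] not in inside:
                    background.add(labels[ny][nx])
    return obj, elastic, background


labels = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [3, 3, 4, 4, 5, 5],
    [3, 3, 4, 4, 5, 5],
]
obj, elastic, background = partition_superpixels(labels, (0, 0, 3, 2))
```

In the toy grid, super-pixel 1 is cut by the right edge of the box, so it becomes the elastic range, while its outside neighbours serve as background samples.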
Bounding Box Adjustment
• Compute color distance, spatial distance and depth distance as the similarity measurement
• Decision: only super-pixels more similar to the object than to the background in both RGB and depth channels will be assigned to the object
[Figure: adjusted bounding box (red box)]
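The decision rule can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: each super-pixel is summarised by a hypothetical `(mean RGB, mean depth)` feature, object and background are represented by unweighted means of their super-pixels, colour distance is Euclidean, depth distance is an absolute difference, and the spatial distance term is omitted because the slide does not say how it is combined.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Feature = Tuple[Sequence[float], float]  # (mean RGB colour, mean depth)


def euclid(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mean_feature(features: Dict[int, Feature], ids: Set[int]) -> Feature:
    n = len(ids)
    rgb = [sum(features[i][0][c] for i in ids) / n for c in range(3)]
    depth = sum(features[i][1] for i in ids) / n
    return rgb, depth


def assign_elastic(features: Dict[int, Feature], obj_ids: Set[int],
                   elastic_ids: Set[int], bg_ids: Set[int]) -> Set[int]:
    """Keep an elastic super-pixel only if it is closer to the object in RGB and in depth."""
    obj_rgb, obj_d = mean_feature(features, obj_ids)
    bg_rgb, bg_d = mean_feature(features, bg_ids)
    kept = set(obj_ids)
    for sp in elastic_ids:
        rgb, d = features[sp]
        closer_rgb = euclid(rgb, obj_rgb) < euclid(rgb, bg_rgb)
        closer_depth = abs(d - obj_d) < abs(d - bg_d)
        if closer_rgb and closer_depth:
            kept.add(sp)
    return kept


def bounding_box(labels: List[List[int]], ids: Set[int]) -> Tuple[int, int, int, int]:
    """Tightest max-exclusive box around all pixels whose super-pixel id is in ids."""
    pts = [(x, y) for y, row in enumerate(labels) for x, sp in enumerate(row) if sp in ids]
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


labels = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [3, 3, 4, 4, 5, 5],
    [3, 3, 4, 4, 5, 5],
]
features = {0: ((1.0, 0.0, 0.0), 1.0), 1: ((0.9, 0.1, 0.0), 1.2),
            2: ((0.0, 0.0, 1.0), 5.0), 4: ((0.0, 0.1, 1.0), 5.5)}
kept = assign_elastic(features, {0}, {1}, {2, 4})
adjusted = bounding_box(labels, kept)
```

Requiring agreement in both channels is the conservative choice: a super-pixel that matches the object's colour but sits at the background's depth is rejected, so the box grows only where both cues agree.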
Dataset
• NJU1500: 1,500 stereo images for object proposal
• Extended from the stereo objectness dataset [Xu et al., ICME15]
• Improvement
  • More balanced
    • 300 images in each group (2, 3, 4, 5, 5+ objects, respectively)
  • Higher average object number
    • PASCAL VOC 2012: 2.38
    • Stereo objectness: 2.98
    • NJU1500: 4.22
Result
• Suitable for various images under high IoU
• Challenging situations: obscure (sword), small (dustbin), occluded (Papa Smurf)
[Figure: example proposals annotated with per-object IoU values]
Comparison
• Compare with eight state-of-the-art methods
  • Including AIDC, BING, EB, OBJ, GOP, MCG, SS and MEB
• Under IoU = 0.5 and IoU = 0.8, respectively
• Comparable to other methods when IoU = 0.5
• Better than other methods when IoU = 0.8
• Running time: ours 5.78s per image; MCG 60.12s per image
[Figure: comparison plots at IoU = 0.5 and IoU = 0.8]
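Comparisons "under IoU = 0.5" and "under IoU = 0.8" are commonly reported as recall: the fraction of ground-truth objects matched by at least one proposal at that threshold. The sketch below assumes that metric; the slides do not spell out the exact protocol, and the boxes are made-up illustration data.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth: List[Box], proposals: List[Box], threshold: float) -> float:
    """Fraction of ground-truth boxes covered by some proposal with IoU >= threshold."""
    if not ground_truth:
        return 0.0
    hits = sum(1 for gt in ground_truth
               if any(iou(gt, p) >= threshold for p in proposals))
    return hits / len(ground_truth)


ground_truth = [(0, 0, 100, 100), (200, 200, 300, 300)]
proposals = [(0, 0, 100, 90), (200, 200, 300, 260)]  # IoU 0.9 and 0.6
print(recall(ground_truth, proposals, 0.5), recall(ground_truth, proposals, 0.8))
```

The same proposals score full recall at the loose threshold and lose half of it at the strict one, which is why the two settings can rank methods differently.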
Conclusion
• Contribution
  • First attempt to integrate window scoring and grouping strategies for RGB-D object proposal
  • Provide an RGB-D image dataset, NJU1500, for object proposal
• Future work
  • Object proposal for video analysis
  • Usage of object proposal in multimedia applications
Thank You
Email: ljing12@software.nju.edu.cn