▶ robocup.tugraz.at
* LORENZ Peter and STEINBAUER Gerald
August 2018
Institute of Software Technology
Graz University of Technology, Austria
The RoboCup Rescue Dataset
Content
● Introduction - RoboCup Rescue League
● Data Set
○ Properties (Database Size, Image Size, Ground Truth)
○ Definition of the Confusion Matrix
● Standard Algorithms
○ Haar Cascade
○ CNN-Overfeat
○ Histogram of Oriented Gradients (HOG)
● Own Method: Bag of Visual Words
● Results
RoboCup Rescue League
● Competition between international teams
● Several disciplines; our focus: autonomy
Figure: German Open in Magdeburg, 2015.
Dataset Properties
● Image size: 640×480 pixels
● Images for binary classification
○ 955 positive images: a victim is visible.
○ 746 negative images: no victim is visible.
● XML files: each contains a polygon that labels the victim’s face → ground truth (only for positive images).
Confusion Matrix (1/4)
● True Positive (TP): The ground truth and the detector's prediction overlap by more than a given threshold.
Figure legend: rectangles mark the ground truth and the prediction.
Confusion Matrix (2/4)
● True Negative (TN): There is no victim in the image and no victim is predicted.
Confusion Matrix (3/4)
● False Positive (FP): There is no victim in the image, but one is predicted.
Confusion Matrix (4/4)
● False Negative (FN), in either of two cases:
○ There is a victim, but no prediction.
○ The prediction does not intersect with the ground truth.
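The four cases above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not stated on the slides: boxes are axis-aligned rectangles (x_min, y_min, x_max, y_max), the overlap is measured as intersection over union (IoU), and a prediction that overlaps below the threshold is counted as FN. The names `iou` and `confusion_label` are hypothetical.

```python
from typing import Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confusion_label(truth: Optional[Box], pred: Optional[Box],
                    threshold: float = 0.5) -> str:
    """Map one image (ground truth, prediction) to TP, TN, FP or FN.

    None means "no victim annotated" or "nothing predicted".
    """
    if truth is None:
        return "TN" if pred is None else "FP"
    if pred is None:
        return "FN"
    # Assumption: too little overlap counts as a miss (FN).
    return "TP" if iou(truth, pred) >= threshold else "FN"
```

Calling `confusion_label(truth, pred, threshold=0.3)` and again with `threshold=0.7` on the same pair of boxes shows how the threshold alone can flip a detection between TP and FN.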
Haar Cascade [Viola and Jones]
● Runs continuously because it is computationally cheap.
● Standard face detection algorithm.
● Faces can only be recognized in portrait view. Solution:
Haar Cascade (Recap)
● Haar Features:
○ Similar to a convolution.
○ Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
● Integral Image
● AdaBoost
● Cascade
Figure panels: Edge Features, Line Features
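The interplay of integral image and Haar feature can be sketched as follows. This is a minimal illustration, not the detector used in the talk: the image is a plain list of rows, and the two-rectangle layout (left half white, right half black) as well as the function names are assumptions chosen for the example.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle at (x, y) with size w x h: four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def edge_feature(ii, x, y, w, h):
    """Two-rectangle edge feature (assumed layout: white left, black right).

    Returns sum(black) - sum(white), as described in the recap above.
    """
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return black - white
```

Because every rectangle sum costs four table lookups regardless of its size, a feature can be evaluated at any position and scale in constant time, which is what makes the cascade cheap enough to run continuously.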
Haar Cascade
● No big difference between the 30% and the 70% threshold.
Figures: Haar cascade at 30% threshold; Haar cascade at 70% threshold; overlapping area of the ground-truth and prediction rectangles.
CNN-Overfeat [Sermanet, Eigen, Zhang, Mathieu, Fergus and LeCun]
● Takes about 2 seconds per prediction.
● Only run when triggered by a previous action → heat detected.
● Not trained on this dataset; it was trained on a private dataset with data taken from public sources.
● The last layer is trained for classification.
CNN-Overfeat
● The results tend to be positive!
Figure: CNN-Overfeat.
● Trained on whole images instead of only faces, so it learns the whole image:
○ Whole body
○ Clothes
○ Environment
● Trained on a different dataset
Histogram of oriented Gradients [Dalal and Triggs]
● HOG feature extraction (Recap)
○ Compute centered horizontal and vertical gradients.
○ Compute gradient orientation and magnitudes.
○ Example: Divide the image into 16x16 blocks with 50% overlap. Each block consists of 2x2 cells.
○ Quantize the gradient orientation into 9 bins.
■ Each vote is weighted by the gradient magnitude.
■ Interpolate votes bilinearly between neighboring bin centers.
○ Concatenate histograms
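The gradient and voting steps can be sketched for a single cell as below. This is a simplified illustration under stated assumptions: unsigned orientations in the range 0 to 180 degrees, centered differences on interior pixels only, and interpolation between the two neighboring orientation bins only (the spatial interpolation and block normalization of the full method are left out). The function name `cell_histogram` is hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one cell (2D list of gray values).

    Each interior pixel votes with its gradient magnitude; the vote is
    split linearly between the two nearest bin centers.
    """
    h, w = len(cell), len(cell[0])
    bin_width = 180.0 / bins
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # centered horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # centered vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Position measured in bins, relative to the bin centers.
            pos = angle / bin_width - 0.5
            lower = math.floor(pos)
            frac = pos - lower
            hist[lower % bins] += magnitude * (1.0 - frac)
            hist[(lower + 1) % bins] += magnitude * frac
    return hist
```

With 9 bins the centers sit at 10, 30, ..., 170 degrees, so a gradient at exactly 0 degrees splits its vote evenly between the last and the first bin. Concatenating the histograms of all cells in all blocks would then give the final descriptor.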
Histogram of oriented Gradients
Goal of the HOG detection: a real human face [Haghighat] is taken, and on the right side signs of the nose, eye sockets and lips can be seen.
Parameters: cell size = 2x2, block size = 1x1
● TP is low!
Figure: HOG.
Bag of Visual Words (BoVW)
● Given:
○ Positive training images containing the object class “victim” → we cropped them so that only the faces remain.
○ Negative training images that do not contain it.
● Classify:
○ Whether a test image contains the object class or not.
Several Stages for BoVW
1. Extract SIFT features from each image
● Invariant to:
○ Image transformation
○ Lighting variations
○ Occlusions
2. Set up a visual vocabulary from the training data
● Mini-Batch-K-Means: the centroids are the new words.
● A histogram (its length corresponds to the number of words) is created for each image.
● A main histogram is set up by combining all sub-histograms.
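How an image turns into a histogram of visual words can be sketched as follows. The sketch assumes that the vocabulary (the cluster centroids) has already been computed, for example by mini-batch k-means, and that descriptors are plain sequences of numbers; the L1 normalization and the function names are assumptions for illustration, not taken from the slides.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the centroid closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bovw_histogram(descriptors, vocabulary):
    """Histogram with one bin per visual word for a single image.

    Assumption: the histogram is L1-normalized so that images with
    different numbers of descriptors stay comparable.
    """
    hist = [0.0] * len(vocabulary)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, vocabulary)] += 1.0
    total = sum(hist)
    return [value / total for value in hist] if total else hist
```

In a real pipeline the descriptors would be 128-dimensional SIFT vectors and the vocabulary would hold many more words; the two-dimensional toy vectors one might try here only serve to check the mechanics.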
Several Stages for BoVW
3. Pyramid Matching (Grauman and Darrell)
● Makes the data easier to discriminate.
● We do not know how far away the victim is from our camera.
● Merges the information of 3 levels.
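The idea of merging several levels can be sketched with a weighted histogram intersection in the spirit of the pyramid match kernel. This is a hedged illustration only: how the three levels were built in the talk is not stated, so the sketch simply assumes that each coarser level merges pairs of adjacent bins, and that matches first found at level i are weighted by 1/2^i. The function names are hypothetical.

```python
def build_pyramid(hist, levels=3):
    """Histograms from fine to coarse; each level merges adjacent bin pairs.

    Assumption: len(hist) is a multiple of 2 ** (levels - 1).
    """
    pyramid = [list(hist)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        pyramid.append([prev[i] + prev[i + 1] for i in range(0, len(prev) - 1, 2)])
    return pyramid


def intersection(h1, h2):
    """Histogram intersection: number of matches at one level."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def pyramid_match(pyr_x, pyr_y):
    """Weighted sum of the matches newly found at each level.

    Level 0 is the finest; new matches at level i get weight 1 / 2**i.
    """
    score, previous = 0.0, 0.0
    for level, (hx, hy) in enumerate(zip(pyr_x, pyr_y)):
        matches = intersection(hx, hy)
        score += (matches - previous) / (2 ** level)
        previous = matches
    return score
```

Matches found at a fine level count fully, while matches that only appear after bins are merged count less, so two histograms that agree closely score higher than two that agree only coarsely.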
Several Stages for BoVW
4. Recognize Victims via SVM
● x_i and x_j are the i-th and j-th histograms.
● Similarity:
● SVM is trained:
○ Kernel: G_ij
○ 2 classes: {victim, no_victim}
○ 4 classes: {dark_light, hole, normal_light, no_victim}
2 Categories
We split the dataset into 2 categories:
1. with_puppet: There is at least one victim in the scene.
2. without_puppet: There is no victim in the scene.
● Accuracy is about 78.57%.
● The algorithm distinguishes well between the 2 categories.
● Sometimes the algorithm finds faces in the wood pattern.
More Categories
We split the dataset into 4 categories:
1. dark_light: Faces are not well illuminated.
2. hole: The victim is situated in a hole in the wall.
3. normal_light: The victim’s face is well lit.
4. without_puppet: A scene of the arena where no victim can be seen.
● Accuracy drops to 56.90%.
● For a perfect prediction, the diagonal of the confusion matrix would be dark blue.
● Confuses dark_light and hole.
● Confuses dark_light and normal_light.
● The algorithm distinguishes well between puppet and without_puppet.
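The accuracy figures relate to the confusion matrix through its diagonal. A minimal sketch, assuming rows are actual classes and columns are predicted classes; the function name is hypothetical and any counts passed to it in an example are made up, not the results reported in this talk.

```python
def accuracy(matrix):
    """Share of correct predictions: diagonal sum divided by the total.

    Assumption: matrix[i][j] counts images of actual class i that were
    predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

For a purely illustrative 2x2 matrix such as `[[8, 2], [1, 9]]`, 17 of 20 images lie on the diagonal. The same function works unchanged for the 4-category case, where off-diagonal mass between dark_light, hole and normal_light would lower the result.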
Thank you!
Q&A
Q: When will you provide a download link for the database?
A: I will provide a download link within the next few weeks. https://osf.io/dwsnm/
References
1. K. Lassnig and S. Loigge, “RoboCup Rescue 2016 Team Description Paper Tedusar,” in Proc. of the Intern. RoboCup Symposium, 2016. [Online]. Available: robocup2016.org/Tedusar.pdf
2. P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, “OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks,” CoRR, vol. abs/1312.6229, 2014. [Online]. Available: http://arxiv.org/abs/1312.6229
3. D. Sculley, “Web-scale K-means Clustering,” in Proc. Intern. Conf. on World Wide Web. New York, NY, USA: ACM, 2010, pp. 1177–1178. [Online]. Available: http://doi.acm.org/10.1145/1772690.1772862
4. K. Grauman and T. Darrell, “The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features,” in Intern. Conf. on Computer Vision, 2005. [Online]. Available: http://www.cs.utexas.edu/users/ai-lab/?kgrauman:iccv2005
5. N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” in Intern. Conf. on Computer Vision and Pattern Recognition, vol. 1, 2005, pp. 886–893.
6. X. Peng, L. Wang, X. Wang, and Y. Qiao, “Bag of Visual Words and Fusion Methods for Action Recognition: Comprehensive Study and Good Practice,” Computer Vision and Image Understanding, vol. 150, pp. 109–125, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1077314216300091
7. Q. Zhu, Y. Zhong, B. Zhao, G. S. Xia, and L. Zhang, “Bag-of-Visual-Words Scene Classifier With Local and Global Features for High Spatial Resolution Remote Sensing Imagery,” IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 6, pp. 747–751, 2016.
8. M. Haghighat, “Biometrics for Cybersecurity and Unconstrained Environments,” 2016. [Online]. Available: scholarlyrepository.miami.edu/oa_dissertations/1675
9. B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, “LabelMe: A Database and Web-Based Tool for Image Annotation,” Intern. Journal Computer Vision, vol. 77, no. 1-3, pp. 157–173, 2008. [Online]. Available: http://dx.doi.org/10.1007/s11263-007-0090-8
10. P. Viola and M. Jones, “Robust Real-Time Face Detection,” Intern. Journal Computer Vision, vol. 57, no. 2, pp. 137–154, 2004. [Online]. Available: https://doi.org/10.1023/B:VISI.0000013087.49260.fb
Future Work
● Replace Haar Cascade → Aggregate Channel Features (ACF) https://github.com/pdollar/toolbox
○ Faster.
○ Might be more accurate.
○ Detecting faces in profile is possible. http://www.cbsr.ia.ac.cn/users/zlei/papers/Yang-IJCB-14.pdf
● Replace CNN-Overfeat → DCNN by Li et al. http://users.eecs.northwestern.edu/~xsh835/assets/cvpr2015_cascnn.pdf
○ 14 FPS on a single CPU.
○ Robust to:
■ Occlusion.
■ Light.
