Performance Evaluation

Empirical Evaluation of Object Detection and Tracking Algorithms. A work done as part of the VACE project under the technical guidance of NIST.

Slide notes
  • What is the most accurate task definition? How do we produce consistent and reliable reference annotations that enable meaningful evaluation? Though there were more questions than answers, the first step towards a large-scale empirical evaluation of object detection and tracking was finally taken.
  • i-Lids logo on one of the slides
  • Give the source distribution agency (i-LIDS, LDC)
  • VACE metrics are normalized so that misses, false alarms, and track splits/merges are penalized; before the metrics are computed, a one-to-one mapping between the reference and system-output objects is established through a bipartite graph matching algorithm (see the sketch after this list)
  • Mean performance scores using the VACE metrics
  • Besides generating a set of numbers, we performed several analyses of the results that both aid developers in debugging and provide deeper insight into the evaluation results
  • Mention that results are likely to improve; realize our ultimate goal of providing long-lasting resources to the computer vision community for many years to come
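
The bipartite matching step described in the note above can be illustrated with a minimal sketch. This is not the official USF-DATE / F4DE scorer: it assumes axis-aligned boxes given as (x1, y1, x2, y2) and uses SciPy's Hungarian solver to build the one-to-one mapping between reference and system-output boxes on spatial overlap, from which misses and false alarms fall out.

```python
# Minimal sketch (illustrative, not the VACE scoring tool): one-to-one matching
# of reference (ground-truth) and system-output boxes for a single frame.
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_frame(ref_boxes, sys_boxes, min_overlap=0.0):
    """Return matched (ref_idx, sys_idx, overlap) pairs plus miss/false-alarm counts."""
    if not ref_boxes or not sys_boxes:
        return [], len(ref_boxes), len(sys_boxes)
    overlaps = np.array([[iou(r, s) for s in sys_boxes] for r in ref_boxes])
    # Hungarian algorithm on the negated overlaps gives the maximum-overlap matching.
    rows, cols = linear_sum_assignment(-overlaps)
    pairs = [(r, c, overlaps[r, c]) for r, c in zip(rows, cols)
             if overlaps[r, c] > min_overlap]
    misses = len(ref_boxes) - len(pairs)        # unmatched reference objects
    false_alarms = len(sys_boxes) - len(pairs)  # unmatched system objects
    return pairs, misses, false_alarms
```

Aggregating the matched overlaps and the miss/false-alarm counts over frames is, in spirit, what the detection and tracking metrics on slide 8 summarize.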
  • Transcript

    • 1. VACE-I (2000 to 2002)
      • A moderate size dataset of JPEG images to support Text, Face, and Person detection evaluation
      • Pilot study on Text tracking evaluation using 10 MPEG-1 videos
      • A suite of metrics to capture different performance attributes of a system
      • Reference annotations were created using ViPER (developed by UMD)
      • PSU ran the evaluations
      • Six participants (SRI, HNC, NGC, USC, UMD, and CMU)
      • Evaluation challenges that came up:
        • Task definition
        • Performance metrics
        • Annotation guidelines
    • 2. VACE-II Evaluation Process (process diagram): sponsor needs and researchers' input converge on common ideas; progress is made on protocols, metrics, annotation guidelines, scoring tool specs, and I/O formatting; training data is released, a dry-run is held, and the evaluation follows, with evaluation support provided to the participants by USF/VM/NIST.
    • 3. VACE-II Evaluation Framework (diagram), organized around three phases (Design, Deploy, Disseminate): task definitions, protocols/metrics, annotation guidelines, scoring tool specs, schedule, data identification, a dry-run on a micro-corpus, annotation and quality control of ground truth data (training and testing), the scoring tool (USF-DATE), the formal evaluation with participants, and dissemination of results/analyses through workshop presentations and technical reports.
    • 4. Participants (by cycle, object, task, participating sites, and data)
      Cycle I
      • Text: Detection by CMU, COLIB, SRI; Tracking by SRI (BN)
      • Face: Detection by PPATT, UIUC, UMD-Y, TAUCF; Tracking by PPATT, UIUC, UMD-Y (BN/MR)
      • Hands: Detection and Tracking by VT (MR)
      • Person: Detection and Tracking by USC, UMD-Y (MR); Detection and Tracking by USC (UAV)
      • Vehicle: Detection and Tracking by USC, DCU (UAV)
      • Eng Text: Recognition by BBN/SRI (BN)
      Cycle II
      • Face: Detection and Tracking by PPATT, QMUL (Multi-Site MR)
      • Person: Detection and Tracking by AIT (Multi-Site MR); Detection and Tracking by USC, QMUL, UMD-L (Surveillance)
      • Moving Vehicle: Tracking by USC, UCF, QMUL, UMD-L (Surveillance)
    • 5. Data Identification: data sources by task and domain (Meeting Room, Broadcast News, UAV, Surveillance)
      • Text Detection & Tracking: Broadcast News (CNN, ABC)
      • Text Recognition: Broadcast News (CNN, ABC)
      • Face Detection & Tracking: Meeting Room (NIST, CMU, VT, TNO, EDI); Broadcast News (CNN, ABC)
      • Hand Detection & Tracking: Meeting Room (NIST); Broadcast News (CNN, ABC)
      • Person Detection & Tracking: Meeting Room (NIST); Broadcast News (CNN, ABC); UAV (VIVID 2); Surveillance (I-LIDS)
      • Vehicle Detection & Tracking: Broadcast News (CNN, ABC); UAV (VIVID 2); Surveillance (I-LIDS)
    • 6. Task Definitions (per object type; a sketch of a possible annotation record follows this slide)
      Text
      • Annotate an oriented bounding rectangle around text objects
      • Line-level annotation for the Detection task and word-level for the Recognition task
      • Rules based on similarity of font, proximity, and readability levels
      Face
      • At least one eye, the nose, and part of the mouth must be visible
      • Oriented rectangular bounding box covering the eyes, nose, and mouth
      • Both frontal and profile views are handled, even in the presence of occlusion
      Hand
      • Point-based location indicating the center of the palm or hand region
      • Oriented bounding box to handle large hands
      Person
      • At least part of the head and shoulders must be visible
      • Elliptical region around the head and a rectangular region covering the upper body
      • In UAV data, the annotation is the center point of the person, derived from an oriented bounding box
      Vehicle
      • Emphasis on moving vehicles
      • A substantial part of the vehicle body must be visible
      • Oriented bounding rectangle around the vehicle in UAV data
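
As flagged on the slide above, the annotation geometry differs by object type (oriented rectangles, points, ellipses). Below is a minimal sketch of how such a record might be represented, assuming a simple (center, size, angle) parameterization; the field names are illustrative and not the actual ViPER/VACE annotation schema.

```python
# Hypothetical annotation record; names are illustrative, not the ViPER schema.
from dataclasses import dataclass
import math


@dataclass
class OrientedBox:
    cx: float      # center x (pixels)
    cy: float      # center y (pixels)
    width: float
    height: float
    angle: float   # rotation in radians, counter-clockwise

    def corners(self):
        """Four corner points of the oriented rectangle, for overlap computation."""
        c, s = math.cos(self.angle), math.sin(self.angle)
        hw, hh = self.width / 2.0, self.height / 2.0
        return [(self.cx + c * dx - s * dy, self.cy + s * dx + c * dy)
                for dx, dy in ((-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh))]


@dataclass
class ObjectAnnotation:
    object_id: int
    object_type: str   # e.g. "TEXT", "FACE", "HAND", "PERSON", "VEHICLE"
    frame: int         # frame index the annotation applies to
    box: OrientedBox
```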
    • 7. Ground truth data
      • Number of Clips annotated: More than 1100 clips
      • Size of Corpus: 450,000 I-frames (4.5 Million Frames)
      • Time taken to annotate each clip: 50× to 180× real time
      (Example annotation screenshots: Person in Surveillance; Face in Multi-Site Meetings)
    • 8. Metrics (see the formula sketch after this slide)
      • VACE metrics
        • Sequence Frame Detection Accuracy (SFDA): Summative metric based on the spatial overlap of detected objects
        • Average Tracking Accuracy (ATA): Summative metric based on the spatio-temporal overlap of tracked objects
      • CLEAR metrics
        • Multiple Object Detection Accuracy (MODA): Accuracy metric counting the number of misses and false alarms
        • Multiple Object Detection Precision (MODP): Measures the spatial accuracy of detection
        • Multiple Object Tracking Accuracy (MOTA): Accuracy metric counting the number of misses, false alarms, and track splits/merges
        • Multiple Object Tracking Precision (MOTP): Measures the spatial accuracy of tracking
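
For reference, a sketch of the commonly published forms of these metrics, following the VACE/CLEAR evaluation literature; the exact cost weights and normalizations used in a given evaluation cycle may differ. Here N_G and N_D are the numbers of ground-truth and detected objects in a frame, G_i and D_i are the mapped object pairs, m_t and fp_t are misses and false positives, s_t counts track splits/merges (ID errors), and c_m, c_f, c_s are cost weights.

```latex
% Per-frame detection accuracy (VACE) and its sequence-level summary (SFDA)
FDA(t) = \frac{\sum_{i=1}^{N_{\mathrm{mapped}}^{(t)}}
              \frac{|G_i^{(t)} \cap D_i^{(t)}|}{|G_i^{(t)} \cup D_i^{(t)}|}}
             {\tfrac{1}{2}\bigl(N_G^{(t)} + N_D^{(t)}\bigr)}
\qquad
SFDA = \frac{\sum_t FDA(t)}{\sum_t \mathbf{1}\bigl[N_G^{(t)} > 0 \ \text{or}\ N_D^{(t)} > 0\bigr]}

% CLEAR accuracy metrics: penalize misses, false alarms, and (for tracking) ID errors
MODA = 1 - \frac{\sum_t \bigl(c_m\, m_t + c_f\, fp_t\bigr)}{\sum_t N_G^{(t)}}
\qquad
MOTA = 1 - \frac{\sum_t \bigl(c_m\, m_t + c_f\, fp_t + c_s\, s_t\bigr)}{\sum_t N_G^{(t)}}
```

MODP and MOTP are the corresponding precision terms: the mean spatial overlap of the mapped pairs per frame (MODP) and over the whole sequence (MOTP).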
    • 9. Cycle-I Results (Summer 2005)
    • 10. Cycle-II Results (Spring 2006)
    • 11. Results: Text Recognition in BNews (Spring 2006); Participant: SRI/BBN
    • 12. Analyses (a small sub-scoring sketch follows this slide)
      • Visualization of output
      • Multiple runs based on Ambiguity (Face task in Meeting Room data)
      • Sub-scoring based on data source (Face and Person tasks in Multi-Site Meeting Room data)
      • Sub-scoring based on viewpoints and illumination conditions (Person and Vehicle tasks in Surveillance data)
      • Size-based runs (Person and Vehicle tasks in Surveillance data)
      • Sub-scoring based on source (Text recognition in Broadcast News data)
      • Statistical Analysis
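
The sub-scoring analyses above amount to partitioning per-clip scores by an experimental condition and summarizing each partition. A minimal, purely illustrative sketch follows; the grouping keys and the use of a simple mean are assumptions, and the actual analyses also included statistical testing.

```python
# Illustrative only: group per-clip scores by a condition label
# (e.g. data source, viewpoint, illumination, or object-size bucket)
# and report the mean score per condition.
from collections import defaultdict


def sub_scores(per_clip_scores):
    """per_clip_scores: iterable of (condition_label, score) pairs."""
    buckets = defaultdict(list)
    for condition, score in per_clip_scores:
        buckets[condition].append(score)
    return {cond: sum(vals) / len(vals) for cond, vals in buckets.items()}


# Example: mean SFDA by broadcast source.
print(sub_scores([("CNN", 0.62), ("ABC", 0.55), ("CNN", 0.70)]))
```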
    • 13. Face Detection: MRoom, SFDA score distribution with 2 levels of ambiguity (chart comparing Algo1, Algo2, and Algo3 with their corresponding ambiguity runs Algo1-Amb, Algo2-Amb, Algo3-Amb)
    • 14. Recent Developments (May 2008 – )
      • Integrated the VACE/CLEAR detection and tracking metrics into the NIST codebase (F4DE – Framework for Detection and Tracking Evaluation)
      • Created distribution sets of source videos used in evaluation for release
        • BNews and Meeting videos by LDC
        • Surveillance videos by UK Home Office
      • Identified consistency issues with ground truth data
        • Currently in the process of systematically fixing each of the errors
      • In the near future:
        • Publish the data as a standard reference for detection and tracking evaluation
        • Re-run all of the tasks with the updated ground truth to provide a meaningful baseline