
Analysis of visual similarity in news videos with robust and memory efficient image retrieval


  1. Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
     David Chen, Peter Vajda, Sam Tsai, Maryam Daneshi, Matt Yu, Huizhong Chen, Andre Araujo, Bernd Girod
     Image, Video, and Multimedia Systems Group, Stanford University
  5. Plays a 30-second clip around the query phrase match
     • Would benefit from accurate segmentation of stories
     • Would benefit from reliable generation of summary clips
  6. Applications of Anchor Detection
     1. Provide strong cues for story segmentation
     2. Extract news story summaries/previews
     3. Identify anchors for general person recognition
     Transcript excerpt: "TURNING TO TECH, SHARES OF RESEARCH IN MOTION REBOUNDED FROM A ONE MONTH LOW. THE COMPANY'S NEXT GENERATION BLACKBERRY-10 PRODUCT LINE IS EXPECTED TO BE UNVEILED IN JUST A FEW WEEKS. YOU MAY REMEMBER SHARES SOLD OFF LAST WEEK AFTER THE COMPANY ISSUED A CAUTIOUS OUTLOOK FOR ITS FOURTH QUARTER RESULTS. BUT TODAY SHARES BOUNCED BACK: UP 11.5% TO A UNDER $12."
     Anchor: Brian Williams. Anchor: Susie Gharib.
     Don’t confuse anchors with other people in the videos.
  7. Applications of Preview Matching
     1. Provide strong cues for story segmentation
     2. Extract news story summaries/previews
     3. Indicate the most important stories in a broadcast
     Transcript excerpt: "JUST A MESS. IN WASHINGTON, LAWMAKERS LEAVE TOWN FOR THE HOLIDAYS. THE CLOCK TICKS DOWN TO THE SO-CALLED FISCAL CLIFF. LATE TODAY, THE PRESIDENT HASTILY APPEARS TO ASK IF SOME OF THIS BUSINESS CAN BE FINISHED SOON."
  8. Outline
     • Related work in news video analysis
     • Long-range visual similarity
     • Anchor detection algorithm
     • Preview matching algorithm
     • Experimental results
  9. Related Work in News Video Analysis
     • Model-based anchor detection [Zhang et al., 1998] [Hanjalic et al., 1998] [Liu et al., 2000]
     • Model-free anchor detection [Gao et al., 2002] [De Santo et al., 2006] [D’Anna et al., 2007] [Ma et al., 2008] [Broilo et al., 2011]
     • Spatio-temporal slices for reporter detection [Liu et al., 2007] [Zheng et al., 2010]
     • Classification of news video shots [Bertini et al., 2001] [Xiao et al., 2010] [Lee et al., 2011]
  10. Long-Range Visual Similarity
     [Figure: plot with Frame Number on both axes (1 to 3001) and a value scale from 0 to 0.5]
  11. Long-Range Visual Similarity
     [Figure: plot with Frame Number on both axes (1 to 3501) and a value scale from 0 to 0.5]
     What causes these long-range visual similarities?
  12. Long-Range Visual Similarity
     NBC Nightly News on Dec. 21, 2012
  13. Long-Range Visual Similarity
     • Anchor: Brian Williams
     • Analyst: David Gregory
     • Reporter: Kelly O’Donnell
     • Reporter: Andrea Mitchell
  14. Long-Range Visual Similarity
  15. Anchor Detection Pipeline
     Stages: Exclude Frames Without Faces → Extract Image Signatures → Compare Image Signatures → Form Initial Anchor Candidates → Prune Away False Candidates → Include Temporally Nearby Candidates
     Diagram labels: Keyframes, Detections, Similarity Matrix
     • Form Initial Anchor Candidates: count the number of long-range local peaks in the current row of the similarity matrix and pick initial candidates from high-count rows
     • Prune Away False Candidates: compare initial candidates to one another and prune out candidates which are not very similar to the other initial candidates
     • Include Temporally Nearby Candidates: from the pruned set of candidates, expand to include temporally nearby candidates which are also very similar in appearance
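The candidate-selection and pruning steps of the pipeline above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the thresholds `min_gap`, `min_sim`, `min_peaks`, and `min_mean_sim`, and the toy similarity matrix are all hypothetical, since the slides give no parameter values. The final expansion to temporally nearby frames is omitted.

```python
def count_long_range_peaks(row, i, min_gap, min_sim):
    """Count local peaks in one similarity-matrix row that lie at least
    min_gap frames away from frame i and reach at least min_sim."""
    count = 0
    for j, value in enumerate(row):
        if abs(j - i) < min_gap or value < min_sim:
            continue
        left = row[j - 1] if j > 0 else float("-inf")
        right = row[j + 1] if j + 1 < len(row) else float("-inf")
        # ">=" on the left and ">" on the right counts a plateau once.
        if value >= left and value > right:
            count += 1
    return count


def initial_candidates(sim, min_gap=3, min_sim=0.3, min_peaks=2):
    """Pick frames whose row contains many long-range peaks."""
    return [i for i, row in enumerate(sim)
            if count_long_range_peaks(row, i, min_gap, min_sim) >= min_peaks]


def prune(sim, candidates, min_mean_sim=0.25):
    """Keep candidates that are, on average, similar to the other candidates."""
    kept = []
    for i in candidates:
        others = [sim[i][j] for j in candidates if j != i]
        if others and sum(others) / len(others) >= min_mean_sim:
            kept.append(i)
    return kept


# Toy data (assumed): frames 0, 4 and 8 show the same anchor shot.
anchors = {0, 4, 8}
n = 9
sim = [[1.0 if i == j else 0.4 if (i in anchors and j in anchors) else 0.05
        for j in range(n)] for i in range(n)]

candidates = initial_candidates(sim)
print(candidates)
print(prune(sim, [0, 1, 4, 8]))  # frame 1 is a deliberately injected false candidate
```

In this toy case only the rows of the three recurring frames contain two long-range peaks, and the injected frame is pruned because it resembles none of the other candidates.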
  16. Intra-Episode vs. Inter-Episode
     • Intra-episode: compare frames within a single episode of a news program
     • Inter-episode: compare frames between different episodes of a news program
  17. Preview Matching Pipeline
     Stages: Detect and Recognize Text → Adaptively Crop to Preview Region → Extract Image Signature → Compare Image Signatures → Verify Geometry in Shortlist
     Diagram labels: "JUST A MESS", "COMING UP", Database of Image Signatures, Frame Matches
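The last two stages of the preview matching pipeline (compare signatures, then verify a shortlist) can be sketched as follows. This is only an assumed outline: signatures are modeled as small integers compared by Hamming distance, the frame ids are invented, and `verify` is a hypothetical placeholder for the geometric verification step, which the slides do not detail.

```python
def hamming(a, b):
    """Number of differing bits between two integer-coded binary signatures."""
    return bin(a ^ b).count("1")


def match_preview(query_sig, database, shortlist_size, verify):
    """Rank database frames by signature distance, keep a shortlist,
    then keep only the frames accepted by the verification callback."""
    ranked = sorted(database, key=lambda fid: hamming(query_sig, database[fid]))
    shortlist = ranked[:shortlist_size]
    return [fid for fid in shortlist if verify(fid)]


# Toy database (assumed): frame id -> 8-bit signature.
database = {101: 0b10110010, 102: 0b10110011, 250: 0b01001101, 251: 0b10100010}
query = 0b10110010

# Stand-in for geometric verification: accept only frames 101 and 102.
matches = match_preview(query, database, shortlist_size=3,
                        verify=lambda fid: fid in {101, 102})
print(matches)
```

The shortlist keeps the expensive verification step to a handful of frames instead of the whole database.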
  18. REVV: Residual Enhanced Visual Vector
     Stages: Query Image → Extract Local Features → Vector Quantize to Visual Words (using a Visual Codebook) → Perform Mean Aggregation of Residuals → Regularize with Power Law → Reduce Dimensions by LDA → Binarize Components from Sign → Compute Weighted Correlations (against Database Signatures) → Ranked List
     Values shown with the ranked list: 1.74, 1.75, 1.79, 1.80, 1.83, 1.84, …
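A simplified sketch of the stages named on the REVV slide, in plain Python. It is not the published REVV implementation: the LDA dimension reduction is skipped, the correlation is unweighted, and the exponent `alpha`, the two-word codebook, and the toy features are assumptions chosen only to make the example run.

```python
import math


def nearest_word(feature, codebook):
    """Index of the codebook centroid closest to the feature (squared L2)."""
    return min(range(len(codebook)),
               key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])))


def revv_like_signature(features, codebook, alpha=0.5):
    """Mean-aggregate residuals per visual word, apply a power law,
    and binarize each component by its sign (LDA step omitted)."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for feat in features:
        k = nearest_word(feat, codebook)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += feat[d] - codebook[k][d]
    bits = []
    for k in range(len(codebook)):
        for d in range(dim):
            mean = sums[k][d] / counts[k] if counts[k] else 0.0
            # Without the LDA projection the power law cannot change a sign;
            # it is kept here only to mirror the order of the stages.
            powered = math.copysign(abs(mean) ** alpha, mean)
            bits.append(1 if powered >= 0 else -1)
    return bits


def correlation(sig_a, sig_b):
    """Unweighted normalized correlation of two +1/-1 signatures."""
    return sum(a * b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def rank(query_sig, database):
    """Database keys ordered from most to least correlated with the query."""
    return sorted(database, key=lambda name: correlation(query_sig, database[name]),
                  reverse=True)


codebook = [[0.0, 0.0], [10.0, 10.0]]  # assumed two-word codebook
query = revv_like_signature([[1.0, -1.0], [11.0, 9.0]], codebook)
database = {
    "near_duplicate": revv_like_signature([[0.5, -2.0], [12.0, 9.5]], codebook),
    "different": revv_like_signature([[-1.0, 1.0], [9.0, 11.0]], codebook),
}
print(rank(query, database))
```

Storing one sign bit per component is what makes a signature of this kind compact; the correlation between two such signatures can equally be computed from their Hamming distance.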
  19. Experimental Setup
     • Anchor detection
       – Training on 12 episodes of NBC Nightly News (1 anchor/episode), ABC World News (1 anchor/episode), Nightly Business Report (2 anchors/episode)
       – Testing on 21 episodes of same three programs
       – Measure precision / recall / F-score
     • Preview matching
       – Testing on 10 episodes of NBC Nightly News and ABC World News
       – Measure precision / recall / F-score
     • Comparison of two memory-efficient signatures
       – GIST: 66 MB/episode [Oliva et al., 2001] [Douze et al., 2009]
       – REVV: 10 MB/episode [Chen et al., 2013]
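The slides do not define the F-score. Assuming the standard balanced F-score (the harmonic mean of precision and recall), a short check shows that this definition reproduces the rounded values reported on the results slides. The function name is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported on the results slides.
reported = {
    "Anchor, GIST intra": (0.84, 0.53),
    "Anchor, REVV intra": (0.90, 0.87),
    "Anchor, REVV intra + inter": (0.91, 0.90),
    "Preview type A, REVV": (1.00, 0.90),
}
for name, (p, r) in reported.items():
    print(f"{name}: {f_score(p, r):.2f}")
```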
  20. Anchor Detection Results
                          Recall   Precision   F-Score
     GIST Intra           0.53     0.84        0.65
     REVV Intra           0.87     0.90        0.88
  21. Anchor Detection Results
                          Recall   Precision   F-Score
     REVV Intra           0.87     0.90        0.88
     REVV Intra + Inter   0.90     0.91        0.90
  22. Preview Matching Results
     Type A: Preview occurs at beginning of broadcast
                          Recall   Precision   F-Score
     GIST                 0.48     1.00        0.65
     REVV                 0.90     1.00        0.95
  23. Preview Matching Results
     Type B: Preview occurs prior to a commercial
                          Recall   Precision   F-Score
     GIST                 0.62     1.00        0.77
     REVV                 0.93     1.00        0.96
  24. Conclusions
     • Long-range visual similarity in news videos provides a general and effective method for anchor detection and preview matching
     • A robust image signature is required to handle challenging appearance variations throughout a newscast
     • The image signature should be memory-efficient to enable parallelized processing of large video archives
  25. Thank You
     dmchen@stanford.edu
