Video Copy Detection Using Visual and Semantic Fingerprinting

Video copy detection using visual and semantic fingerprinting.

Presentation based on the following journal papers:

[1] Hyun-seok Min, Jaeyoung Choi, Wesley De Neve, Yong Man Ro. Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement. IEEE Transactions on Circuits and Systems for Video Technology. Vol. 22(8). August 2012. pp. 1174-1187. DOI=http://dx.doi.org/10.1109/TCSVT.2012.2197080

[2] Hyun-seok Min, Jaeyoung Choi, Wesley De Neve, Yong Man Ro. Bimodal Fusion of Low-level Visual Features and High-level Semantic Features for Near-duplicate Video Copy Detection. EURASIP Signal Processing – Image Communication. Vol. 26(10). November 2011. pp. 612-627. DOI=http://dx.doi.org/10.1016/j.image.2011.04.001

  1. ELIS – Multimedia Lab
     Video Copy Detection using Visual and Semantic Fingerprinting
     Presentations aOG MIT, November 17, 2011
     Wesley De Neve
     Affiliations:
     • Multimedia Lab, Dept. of Electronics & Information Systems, Faculty of Engineering & Architecture, Ghent University – IBBT, Ghent, Belgium
     • Image and Video Systems Lab, Dept. of Electrical Engineering, College of Information Science & Technology, KAIST, Daejeon, South Korea
  2. Context Research Effort
     • Worked in South Korea during the past four years
       – at ICU and KAIST in Daejeon
       – main focus on advising graduate students
         • keeping track of the state of the art
         • identifying and solving research questions
         • helping communicate research results
       – main research topics
         • data-driven image annotation and tag relevance learning
         • face recognition using online social network context
         • video surveillance and privacy protection
         • video copy detection
  3. Outline
     • Introduction
     • Video copy detection
       – using visual features
       – using semantic features
     • Experimental results
     • Conclusions
  4. Outline
     • Introduction
     • Video copy detection
       – using visual features
       – using semantic features
     • Experimental results
     • Conclusions
  5. Introduction (1/3)
     • Increasing consumption of online video content
       – thanks to easy-to-use multimedia devices and online services
       – thanks to cheap storage and bandwidth
       – thanks to an increasing number of people going online
     • Increasing availability of online video content
       – digitization of professional video archives
       – user-generated video content
  6. Introduction (2/3)
     • Some statistics
       – professional video content
         • BBC Motion Gallery (as of January 2009)
           o contains over 2.5 million hours of video content
           o dating back 60 years
       – user-generated video content
         • YouTube (as of May 2011)
           o 48 hours of new video content are uploaded each minute
  7. Introduction (3/3)
     • Problem: digital video overload
       – our ability to automatically manage video clips does not keep up with our ability to create and store video clips
       – makes it, e.g., more and more difficult to find video clips of interest
     • Part of the solution: techniques for video copy detection
       – help in managing vast libraries of video clips
         • reduction of visual redundancy in video search results
         • detection of copyright infringement
         • metadata propagation along visual links
         • media usage monitoring
         • search by video query
  8. Duplicates versus Near-Duplicates
     • Duplicate video clips
       – exact video copies
       – can be easily detected using hashing
     • Near-duplicate video clips (NDVCs)
       – transformed video clips
       – detection is challenging
     [Figure: an original video clip and three transformations: black & white, cropping, flipping]
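
The claim that exact duplicates are easy to detect with hashing can be illustrated with a minimal sketch. It assumes a cryptographic hash (SHA-256) over the raw bytes of each clip; the slides do not say which hash is meant, and the clip names and byte strings below are made up for illustration.

```python
import hashlib

def clip_digest(data):
    """Return the SHA-256 hex digest of a clip's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def find_exact_duplicates(clips):
    """Group clip names by digest; keep only groups with more than one clip."""
    groups = {}
    for name, data in clips.items():
        groups.setdefault(clip_digest(data), []).append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}

# Hypothetical clips, represented by a few bytes each.
clips = {
    "a.mp4": b"\x00\x01\x02\x03",
    "b.mp4": b"\x00\x01\x02\x03",  # byte-identical copy of a.mp4
    "c.mp4": b"\x00\x01\x02\x04",  # one byte differs, e.g. after recompression
}
duplicates = find_exact_duplicates(clips)
```

A single changed byte yields a different digest, which is why this approach finds exact copies only and cannot find near-duplicates such as cropped or recompressed clips.
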
  9. Applications: Reduction of Visual Redundancy (1/2)
     [Figure: examples labeled “visual redundancy”]
  10. Applications: Detection of Copyright Infringement (2/2)
      • Missed by YouTube’s ContentID
      • Transformations used
        o scaling
        o recompression
  11. Outline
      • Introduction
      • Video copy detection
        – using visual features
        – using semantic features
      • Experimental results
      • Conclusions
  12. System for Video Copy Detection: Conceptual Design
      • Realized by means of video signatures
      [Diagram: a query video clip is matched (video matching) against a collection of reference video clips; a copy is found when the query ≈ an original video clip]
  13. Video Signatures
      • Aim at uniquely characterizing a video clip
      • Commonly consist of visual features
        – e.g., color, texture, shape, and motion
      • Are low-dimensional representations
        – in order to facilitate more efficient matching
      [Figure: dimensionality reduction from 921600-D (1280x720) to 128-D (128 bins)]
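
As a rough illustration of reducing a 1280x720 frame (921600 values) to a 128-bin signature, the sketch below builds a normalized luminance histogram. The choice of a luminance histogram and the synthetic frame are assumptions made for this example; the slides only state the two dimensionalities.

```python
def luminance_histogram(pixels, bins=128, max_value=255):
    """Map a flat sequence of 8-bit luminance values to a normalized histogram."""
    counts = [0] * bins
    for p in pixels:
        # Integer arithmetic maps 0..255 onto bin indices 0..bins-1.
        counts[p * bins // (max_value + 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

WIDTH, HEIGHT = 1280, 720
# Synthetic frame (assumption): a diagonal gradient instead of real video data.
frame = [(x + y) % 256 for y in range(HEIGHT) for x in range(WIDTH)]
signature = luminance_histogram(frame)
```

The histogram discards spatial layout, which makes matching cheap but is also one reason why a single visual feature is not robust against every transformation.
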
  14. Room for Improvement
      • Observations
        – no single type of visual feature has thus far emerged that is robust against all possible transformations
        – transformations tend to preserve semantic features
      [Figure: semantic features / textual descriptions of a frame: helmet, face, wall, clothes]
      • Research question
        – how about (additionally) making use of semantic features?
  15. Outline
      • Introduction
      • Video copy detection
        – using visual features
        – using semantic features
      • Experimental results
      • Conclusions
  16. Extraction of Semantic Features (1/2)
      • Question
        – how to extract semantic features?
      [Figure: a frame and its semantic features: helmet, face, wall, clothes]
      • Our answer
        – by means of binary concept classifiers
  17. Extraction of Semantic Features (2/2)
      • Example of a binary classifier for ‘apple’
      [Figure: an ‘apple’ classifier labels one input image ‘apple’ and another ‘not apple’]
      • Concept classifiers
        – pieces of logic that know what, e.g., an “apple” image looks like
        – more formally: pieces of logic that know the statistical distribution of the visual features of, e.g., representative “apple” images
  18. Challenges of Concept Classification (1/2)
      • Limited effectiveness
        – false negatives (due to intra-concept variability)
          [Figure: the ‘apple’ classifier outputs ‘not apple’]
        – false positives (due to inter-concept variability)
          [Figure: the ‘apple’ classifier outputs ‘apple’]
  19. Challenges of Concept Classification (2/2)
      • Limited semantic coverage
        – only a limited number of concept classifiers can be supported
          • due to the high cost of training
          • experts need to collect training images for each concept classifier
      [Figure: training images for ‘apple’, ‘orange’, and ‘strawberry’]
  20. Classifier-Based Semantic Feature Extraction for Video Copy Detection
      • The challenges of concept classification affect a semantic approach to video copy detection
      • How to deal with this?
        – limited effectiveness of concept classifiers
          • use of semantic features that can be easily and reliably detected
            o e.g., ‘people’
        – limited semantic coverage of concept classifiers
          • use of semantic features that are general in nature
            o e.g., ‘people’ versus ‘Barack Obama’
          • use of the temporal variation of the semantic features
            o extraction of semantic features at the level of video shots
  21. Outline
      • Introduction
      • Video copy detection
        – using visual features
        – using semantic features
      • Experimental results
      • Conclusions
  22. Reference and Query Video Clips
      • 311 reference video clips with a total duration of 170 h
        – 101 video clips from MUSCLE-VCD-2007
          • total duration: 80 h
        – 210 video clips from TRECVID 2008
          • total duration: 90 h
      • 500 query video clips
        – the result of five transformations applied to 100 video clips randomly selected from the 311 reference video clips
  23. Transformations Applied
      [Figure: an original frame and the five transformations: blur, pattern insertion, caption insertion, change in brightness, crop]
  24. Semantic Features
      • Use of Support Vector Machines (SVM)
        – binary classifiers with state-of-the-art effectiveness
      • 32 semantic concepts used
        – mean average precision (MAP): 0.51
        – ‘gravel’, ‘park’, ‘pavement’, ‘road’, ‘rock’, ‘sand’, ‘sidewalk’, ‘face’, ‘people’, ‘indoor’, ‘field’, ‘peak’, ‘wood’, ‘night’, ‘street’, ‘flowers’, ‘leaves’, ‘trees’, ‘cloudy’, ‘sunny’, ‘sunset’, ‘brick’, ‘arch’, ‘buildings’, ‘wall’, ‘windows’, ‘beach’, ‘high-wave’, ‘low-wave’, ‘still water’, ‘mirrored water’, and ‘snow’
  25. Video Matching
      [Figure: the query video clip is compared with a window of the reference video clip, yielding distance d1]
  26. Video Matching
      [Figure: the query video clip is compared with a window of the reference video clip, yielding distance d2]
  27. Video Matching
      [Figure: the query video clip is compared with a window of the reference video clip, yielding distance d3]
  28. Video Matching
      [Figure: the query video clip is compared with a window of the reference video clip, yielding distance d4]
  29. Video Matching
      [Figure: the query video clip is compared with a window of the reference video clip, yielding distance d5]
      • di: linearly weighted combination of the Manhattan distance between the
        – visual features of the query video clip and the part of the reference video clip in the sliding window
        – semantic features of the query video clip and the part of the reference video clip in the sliding window
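
The sliding-window matching described on this slide could be sketched as follows. This is a minimal sketch under stated assumptions: each shot is represented by a pair (visual vector, semantic vector), the two weights sum to 1, and all function names and toy values are hypothetical. The slides do not specify the actual weights or feature layout.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def window_distance(query, window, w_visual):
    """Weighted sum of the visual and semantic L1 distances over aligned shots."""
    d_visual = sum(manhattan(q[0], r[0]) for q, r in zip(query, window))
    d_semantic = sum(manhattan(q[1], r[1]) for q, r in zip(query, window))
    # Assumption: the semantic weight is the complement of the visual weight.
    return w_visual * d_visual + (1.0 - w_visual) * d_semantic

def sliding_window_distances(query, reference, w_visual=0.5):
    """Return one distance per window position of the reference clip."""
    n = len(query)
    return [
        window_distance(query, reference[i:i + n], w_visual)
        for i in range(len(reference) - n + 1)
    ]

# Toy data: six reference shots, each a (visual, semantic) pair.
reference = [([i, 2 * i], [i % 2, (i // 2) % 2]) for i in range(6)]
query = reference[2:4]  # a two-shot excerpt taken from the reference
distances = sliding_window_distances(query, reference)
best_index = min(range(len(distances)), key=distances.__getitem__)
```

With a two-shot query and six reference shots there are five window positions, matching the d1 to d5 shown in the figures; a copy would be reported where the distance falls below a threshold, which is left out here.
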
  30. Normalized Detection Cost Ratio (NDCR)
      • Definition
          NDCR = P_miss + β · R_FA
        where
          P_miss = N_FN / N_queries                (missed detection probability)
          R_FA   = N_FP / (T_query · T_refdata)    (false alarm rate, per hour)
      • We set β to a value of 2 (“balanced profile”)
        – see “CBCD Evaluation Plan TRECVID 2010 v3”
        – assigns a higher cost to raising false alarms
      • A value of zero indicates perfect detection performance
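
The NDCR definition on this slide translates directly into a small function. The sketch below follows the formula as given; the parameter names are mine, and the durations in the example calls (10 h of query data, 170 h of reference data) are illustrative values, of which only the 170 h figure appears in the slides.

```python
def ndcr(n_false_negatives, n_queries, n_false_positives,
         t_query_hours, t_refdata_hours, beta=2.0):
    """Normalized Detection Cost Ratio: P_miss + beta * R_FA."""
    p_miss = n_false_negatives / n_queries
    r_fa = n_false_positives / (t_query_hours * t_refdata_hours)
    return p_miss + beta * r_fa

# Perfect detection: no misses and no false alarms.
perfect = ndcr(0, 500, 0, 10.0, 170.0)
# 50 of 500 queries missed, no false alarms.
some_misses = ndcr(50, 500, 0, 10.0, 170.0)
```

Lower is better: the first call gives 0.0, and the second gives the miss probability alone, 0.1, because no false alarms were raised.
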
  31. Semantic Concept Models Used
      • AP (Average Precision)
        – precision: #true positives / (#true positives + #false positives)
        – averaged over 100 query video clips
      • MAP (Mean AP) of the 32 semantic concept models: 0.52
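
The ratio on this slide, and the averaging over concept models, can be written out as a short worked example. The counts and per-concept values below are invented for illustration and are not the results reported in the slides.

```python
def precision(true_positives, false_positives):
    """Fraction of positive detections that are correct."""
    detected = true_positives + false_positives
    return true_positives / detected if detected else 0.0

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Hypothetical counts for two concept models.
p_people = precision(3, 1)   # 3 correct detections, 1 false alarm
p_beach = precision(1, 1)    # 1 correct detection, 1 false alarm
mean_precision = mean([p_people, p_beach])
```
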
  32. Effectiveness of Bimodal Fusion
      • Bimodal fusion of visual and semantic features outperforms the separate use of either type of features for all transformations
  33. Comparison of Effectiveness of Video Copy Detection
      • In general, bimodal fusion of visual and semantic features outperforms the other techniques for video copy detection
  34. Robustness Against Variation in Semantic Coverage
      • The more concepts used, the more effective NDVC detection
  35. Influence of Effectiveness of Semantic Concept Detection
      [Figure annotations: 0.925, 0.711]
      • The effectiveness of NDVC detection starts to stabilize once the MAP of the concept detectors is higher than 0.3
  36. Time Complexity of Creating Video Signatures
      • Measurements expressed in seconds
        – include the time to perform
          o shot segmentation
          o keyframe selection
          o feature extraction
  37. Time Complexity of Matching
      • Measurements expressed in seconds
        – include the time to
          o compute the temporal entropy for the proposed method
          o perform matching using a sliding window approach
  38. Storage Complexity
      • Measurements expressed in Mbytes
        – storing the 32 semantic features requires 4 bytes per shot
        – storing the MPEG-7 visual features requires about 0.4 kbytes per shot
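
The figure of 4 bytes per shot for 32 semantic features is consistent with storing one bit per concept (32 bits = 4 bytes). The sketch below shows such a packing; the one-bit-per-concept layout, the little-endian byte order, and the function names are assumptions, since the slides do not describe the storage format.

```python
import struct

def pack_concepts(flags):
    """Pack 32 booleans (one per concept) into 4 bytes, bit i = concept i."""
    if len(flags) != 32:
        raise ValueError("expected exactly 32 concept flags")
    word = 0
    for i, present in enumerate(flags):
        if present:
            word |= 1 << i
    return struct.pack("<I", word)  # unsigned 32-bit, little-endian

def unpack_concepts(blob):
    """Inverse of pack_concepts: 4 bytes back to a list of 32 booleans."""
    (word,) = struct.unpack("<I", blob)
    return [bool((word >> i) & 1) for i in range(32)]

# Hypothetical shot in which concepts 7 and 8 were detected.
shot_flags = [i in (7, 8) for i in range(32)]
packed = pack_concepts(shot_flags)
restored = unpack_concepts(packed)
```

At this size the semantic signature is roughly a hundredth of the 0.4 kbytes quoted for the MPEG-7 visual features of a shot.
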
  39. Outline
      • Introduction
      • Video copy detection
        – using visual features
        – using semantic features
      • Experimental results
      • Conclusions
  40. Conclusions (1/3)
      • Discussed the novel idea of using semantic features for the purpose of video copy detection
        – given the observation that no single type of visual feature exists that is robust against all possible transformations
        – given the observation that transformations tend to preserve semantic information
        – (given the observation that the semantic features extracted can be reused for annotation purposes)
      • Experimental results
        – fusion of visual and semantic features outperforms
          • the separate use of either type of features
          • temporal ordinal measurement, PCA-SIFT, and BoVW
  41. Conclusions (2/3)
      • Current and future extensions
        – use of the temporal variation of concept confidence values
          • studied by the National University of Singapore
        – classifier-free semantic feature extraction
          • takes advantage of collective knowledge available on Flickr
            o unrestricted semantic concept vocabulary (higher coverage)
          • accepted for publication in IEEE Trans. on CSVT
        – improved semantic distance measurement
        – indexing of semantic features
  42. Conclusions (3/3)
      • Publications of interest
        – “Bimodal fusion of low-level visual features and high-level semantic features for near-duplicate video clip detection”
          o published in Signal Processing – Image Communication
        – “Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement”
          o published in IEEE Trans. on Circuits and Systems for Video Technology
