AT&T Research at TRECVID 2009 Content-based Copy Detection
TRECVID 2009: TREC Video Retrieval Evaluation, specials for 2009
  Tasks:
  - Surveillance event detection
  - High-level feature extraction
  - Search (interactive, manually-assisted, and/or fully automatic)
  - Content-based copy detection
Video data
  - Sound and Vision (The Netherlands Institute for Sound and Vision): news magazine, science news, news reports, documentaries, educational programming, and archival video
  - BBC rushes: unedited material
  - All materials are in MPEG-1
Datasets
  Development:
  - tv7.sv.devel (32.9 GB) (reference)
  - tv7.sv.test (31.4 GB) (reference)
  - tv8.sv.test (64.3 GB) (reference)
  - tv7.bbc.devel (12.2 GB) (non-reference)
  - tv7.bbc.test (10.9 GB) (non-reference)
  - tv8.bbc.test (10.8 GB) (non-reference)
  Test:
  - tv7.sv.devel (32.9 GB) (reference)
  - tv7.sv.test (31.4 GB) (reference)
  - tv8.sv.test (64.3 GB) (reference)
  - tv9.sv.test (114.8 GB) (reference)
  - tv7.bbc.devel (12.2 GB) (non-reference)
  - tv7.bbc.test (10.9 GB) (non-reference)
  - tv8.bbc.test (10.8 GB) (non-reference)
  - tv9.bbc.test (19.0 GB) (non-reference)
Content-based copy detection
  - Copyright control
  - Business intelligence
  - Advertisement tracking
  - Law enforcement investigations
Video transformation
  - Picture in picture (the original video is inserted in front of a background video)
  - Insertions of pattern
  - Strong re-encoding
  - Change of gamma
  - Decrease in quality: blur, change of gamma, frame dropping, contrast, compression, ratio, white noise
  - Post production: crop, shift, contrast, caption (text insertion), flip (vertical mirroring), insertion of pattern, picture in picture (the original video is in the background)
  - Change to randomly choose 1 transformation from each of the 3 main categories
AT&T Research at TRECVID 2009: Content-based Copy Detection
  - Applications: discovering copyright infringement of multimedia content, monitoring commercial air time, querying video by example
  - Approaches: digital video watermarking, content-based copy detection (CBCD)
Overview
Content based sampling
  - Shot boundary detection (SBD)
  - Adopts a “divide and conquer” strategy
  - Six independent detectors: cut, fade in, fade out, fast dissolve (less than 5 frames), dissolve, and motion
  - Each detector is a finite state machine (FSM)
  - FSMs depend on two types of visual features:
    - Intra-frame (only one frame)
    - Inter-frame (current frame + previous frame)
Transformation detection and normalization for query keyframe
  - Letterbox detection
  - Picture-in-picture detection
  - Query keyframe normalization
Picture-in-picture detection
Canny edge detection operator: http://en.wikipedia.org/wiki/Canny_edge_detector
Query keyframe normalization: equalize and blur the query keyframe to overcome the effects of the change-of-gamma and white-noise transformations.
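The normalization step can be illustrated with a minimal pure-Python sketch. The slides do not say which equalization or blur was used, so plain histogram equalization and a 3x3 box blur are assumptions here, and the function names `equalize` and `box_blur` are hypothetical.

```python
def equalize(img, levels=256):
    """Histogram-equalize a grayscale image (list of rows of ints in [0, levels))."""
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1
    # Cumulative distribution of pixel intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in img]  # constant image: nothing to spread
    # Map each intensity so the occupied range is stretched over [0, levels - 1].
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in img]


def box_blur(img):
    """3x3 box blur; border pixels average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc, n = 0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx]
                        n += 1
            row.append(acc // n)
        out.append(row)
    return out
```

Equalization counters a gamma change by spreading intensities over the full range, and the blur averages out pixel-level noise before features are extracted.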
This gives 10 types of query keyframe: original, letterbox removed, PiP scaled, equalized, and blurred, plus flipped versions of these five types.
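The count of 10 follows from five base types, each with a flipped counterpart. A small sketch makes the enumeration explicit; the label strings and the function name are hypothetical, chosen only for illustration.

```python
# Five base query keyframe types named on the slide (labels are illustrative).
BASE_TYPES = ["original", "letterbox_removed", "pip_scaled", "equalized", "blurred"]


def query_keyframe_types():
    """Return the base types followed by a flipped version of each."""
    return list(BASE_TYPES) + [f"{name}_flipped" for name in BASE_TYPES]
```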
Reference keyframe transformation
  - Only 2 transformations:
    - Half-resolution rescaling, for comparison with the detected PiP region in the query keyframes
    - Strong re-encoding, for dealing with strongly re-encoded query keyframes
  - This gives 3 types of reference keyframe (the original plus the two transformed versions)
Scale-invariant feature transform (SIFT) extraction
  - SIFT is the main feature for locating video copies
  - Locate the keypoints that have local maximum Difference of Gaussian values both in scale and in space (each keypoint is specified by location, scale and orientation)
  - Compute a descriptor for each keypoint: the gradient orientation histogram, a 128-dimensional feature vector
Locality sensitive hashing (LSH)
  - The basic idea: hash the input items so that similar items are mapped to the same buckets with high probability
  - a – random vector following a Gaussian distribution with zero mean and unit variance
  - w – preset bucket size
  - b – in range [0, w]
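The hash formula itself did not survive in this transcript. The symbols a, w and b match the standard p-stable LSH form h(v) = floor((a · v + b) / w), so the sketch below assumes that form. Drawing b uniformly from [0, w] and the function name `make_lsh` are also assumptions.

```python
import math
import random


def make_lsh(dim, w, rng):
    """Build one hash function h(v) = floor((a . v + b) / w) (assumed form)."""
    # a: Gaussian random vector, zero mean and unit variance per component.
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # b: offset in [0, w]; a uniform draw is assumed here.
    b = rng.uniform(0.0, w)

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / w)

    return h
```

A usage example for a 128-dimensional SIFT descriptor: `h = make_lsh(128, 4.0, random.Random(0))` followed by `bucket = h(descriptor)`. Descriptors that are close in Euclidean distance project to nearby values of a · v and therefore tend to share a bucket.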
Indexing and search by LSH
  - Sort the LSH values independently
  - Save them with the SIFT identifications in separate index files
  - SIFT identification (string): reference video ID, keyframe ID, SIFT ID
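Sorting the LSH values allows all reference features in a bucket to be found by binary search. The sketch below is a minimal in-memory version of that idea, not the on-disk index file format, which the slides do not describe. The class name `LshIndex` and the colon-separated identifier strings are hypothetical.

```python
import bisect


class LshIndex:
    """Sorted (lsh_value, sift_id) pairs with binary-search lookup."""

    def __init__(self, entries):
        ordered = sorted(entries)
        self.keys = [value for value, _ in ordered]
        self.ids = [sift_id for _, sift_id in ordered]

    def lookup(self, value):
        """Return the SIFT identifications whose LSH value equals `value`."""
        lo = bisect.bisect_left(self.keys, value)
        hi = bisect.bisect_right(self.keys, value)
        return self.ids[lo:hi]
```

For example, `LshIndex([(5, "v1:k1:s1"), (3, "v1:k1:s2")]).lookup(5)` returns the identifiers stored under LSH value 5. With one such index per hash function, each can be sorted and searched on its own.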
Keyframe level query refinement
  Two issues:
  - The original SIFT matching by Euclidean distance is not reliable
  - Two SIFT features that are far apart may be mapped to the same LSH value
Keyframe level query refinement: Random Sample Consensus (RANSAC)
  1. Randomly select a small set of matching keypoint pairs (an affine model needs at least three)
  2. Determine the affine model
  3. Transform all keypoints in the reference keyframe into the query keyframe
  4. Count the number of keypoints in the reference keyframe whose transformed coordinates are close to the coordinates of their matching keypoints in the query keyframe. These keypoints are called inliers
  5. Repeat steps 1 to 4 for a certain number of times, and output the maximum number of inliers
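The RANSAC loop can be sketched in pure Python as follows. The sample size of three pairs, the iteration count, the pixel tolerance `tol` and all function names are assumptions for illustration; the slides give none of these values.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if degenerate."""
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    sol = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        sol.append(det3(mc) / d)
    return sol


def fit_affine(pairs):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three (ref, query) pairs."""
    m = [[ref[0], ref[1], 1.0] for ref, _ in pairs]
    row_x = solve3(m, [qry[0] for _, qry in pairs])
    row_y = solve3(m, [qry[1] for _, qry in pairs])
    if row_x is None or row_y is None:
        return None  # the three reference points are collinear
    return row_x, row_y


def apply_affine(model, pt):
    (a, b, c), (d, e, f) = model
    return a * pt[0] + b * pt[1] + c, d * pt[0] + e * pt[1] + f


def ransac_inliers(matches, iterations=100, tol=2.0, seed=0):
    """Return the maximum inlier count over random three-pair affine models."""
    if len(matches) < 3:
        return 0
    rng = random.Random(seed)
    best = 0
    for _ in range(iterations):
        model = fit_affine(rng.sample(matches, 3))   # steps 1 and 2
        if model is None:
            continue
        count = 0
        for ref, qry in matches:                     # steps 3 and 4
            tx, ty = apply_affine(model, ref)
            if (tx - qry[0]) ** 2 + (ty - qry[1]) ** 2 <= tol * tol:
                count += 1
        best = max(best, count)                      # step 5
    return best
```

The returned inlier count can serve as a relevance score for the keyframe pair: matches that come from a hash collision between distant features rarely agree with a single affine model, so they do not contribute.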
Keyframe level result merge
  - If one reference keyframe appears more than once in the 12 lists, its new relevance score is set to the maximum score
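The merge rule amounts to taking a per-keyframe maximum across the result lists. A minimal sketch, assuming each list holds `(reference_keyframe_id, score)` tuples; the function name and the tuple layout are hypothetical.

```python
def merge_keyframe_results(result_lists):
    """Merge result lists, keeping the maximum score per reference keyframe."""
    merged = {}
    for results in result_lists:
        for ref_id, score in results:
            if ref_id not in merged or score > merged[ref_id]:
                merged[ref_id] = score
    # Highest relevance first.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```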
Video level result fusion: get the pair (i, j) with the best summed relevance
Video relevance score normalization
  - Normalize the relevance scores into the range [0, 1]
  - x – original relevance score
  - y – normalized relevance score
CBCD result generation
  - Query video ID
  - Reference video ID
  - Information of copied reference video segment
  - Starting frame of copied segment in the query video
  - Decision score
CBCD Evaluation Results
  - Dataset: 1407 short query videos, 838 reference videos, 208 non-reference videos
  - Extracted for the entire reference video set: 268,000 keyframes, 57,000,000 SIFT features
  - Extracted for the entire query video set: 18,000 keyframes, 2,600,000 SIFT features
CBCD Evaluation Criteria
  - Parameters for NoFA profile
  - Parameters for Balanced profile

About
  - http://trec.nist.gov/
  - http://www.itl.nist.gov/iaui/894.02/projects/trecvid/
  - http://www-nlpir.nist.gov/projects/tvpubs/tv9.papers/att.pdf
Want more information?
  - Kirill Lazarev
  - Skype: kirill_lazarev
  - Mail: k.s.lazarev@gmail.com
  - Twitter: http://twitter.com/kslazarev