Video Fingerprinting and Applications: A Review




This presentation reviews the development of video fingerprinting technology over the past decade and its applications in content identification.







  • Hi,

    Can anyone tell me how much it might cost to embed video fingerprinting technology in my software?

    I don't need any rocket science from market leaders, just something workable.

    I've found a couple of small companies like this
    and http://duplicatevideosearch.com/video-fingerprinting-sdk/, but the lowest price starts at $5000, which is too much for me.

    If anyone knows of open-source or free video fingerprinting libraries, I would really appreciate it if you could share them with me.

    - Albert

    Video Fingerprinting and Applications: A Review, Presentation Transcript

    • Video Fingerprinting and Applications: A Review. Jian Lu, Vobile, Inc. Media Forensics & Security Conference, EI’09, San Jose, CA
    • From Research to Applications [timeline figure: 1999 to 2008]
    • What Is Video Fingerprinting? •  A video fingerprint is a unique identifier extracted from video content –  Video fingerprints are often just a string of bits, representing some “signatures” of the video content, and usually not of fixed length. –  Video fingerprinting refers to the process of extracting fingerprints from the video content. –  Compared to watermarking, fingerprinting does not add to or alter video content. –  Also known as “robust hashing”, “perceptual hashing”, or “content-based copy detection (CBCD)” in the research literature.
    • Human vs. Video Fingerprint •  Human fingerprint –  Uniquely identifies a human –  Physical form –  Pictorial •  Video fingerprint –  Uniquely identifies a video –  Digital form –  Time-based binary
    • Identification by Fingerprint [figure: human identification vs. video identification]
    • Video Fingerprinting Algorithms
    • Desired Properties •  Robust –  Largely invariant for the same content under various types of processing, conversion, and manipulation. •  Discriminating –  Distinctly different for different content. •  Compact –  Low data rate •  Low complexity –  Fast fingerprint generation and matching
    • Types of Video Signatures (by granularity) •  Spatial signatures –  Whole frame –  Blocks or other types of subdivision –  Points of interest •  Temporal signatures –  Group of frames –  Down-sampled frames –  Key frames –  Every frame •  Color signatures –  Bins of histograms •  Transform-domain signatures –  3D transforms on GOP –  Frame transforms
    • Variants of Spatial Signatures •  Block-based –  Quantized mean block intensity –  Luminance block patterns ✪ •  ordinal ranking of average block intensity –  Differential luminance block patterns ✪ •  Centroid of gradient orientations •  Dominant edge orientation •  Points-of-interest –  Corner features (Harris points) –  Scale-space features
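As an illustration of the "ordinal ranking of average block intensity" variant listed above, here is a minimal Python sketch. The grid size, the function names (`block_means`, `ordinal_signature`), and the tie-breaking rule are assumptions made for the example, not details taken from the slides. It shows why such a signature tends to survive a global brightness or contrast change: the ranks depend only on the order of the block means, not on their values.

```python
from typing import List, Sequence


def block_means(frame: Sequence[Sequence[float]], rows: int, cols: int) -> List[float]:
    """Average luminance of each block in a rows x cols grid (row-major order)."""
    height, width = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            total = sum(frame[y][x] for y in range(y0, y1) for x in range(x0, x1))
            means.append(total / ((y1 - y0) * (x1 - x0)))
    return means


def ordinal_signature(frame: Sequence[Sequence[float]], rows: int = 2, cols: int = 2) -> List[int]:
    """Rank of each block's mean luminance (0 = darkest); ties broken by block index."""
    means = block_means(frame, rows, cols)
    order = sorted(range(len(means)), key=lambda i: means[i])
    ranks = [0] * len(means)
    for rank, block_index in enumerate(order):
        ranks[block_index] = rank
    return ranks


# Toy 4x4 luminance frame (assumed values) and a brightened, higher-contrast copy.
frame = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [90, 90, 50, 50],
    [90, 90, 50, 50],
]
brighter = [[min(255, 1.2 * p + 5) for p in row] for row in frame]

signature = ordinal_signature(frame)
signature_brighter = ordinal_signature(brighter)
```

Both frames yield the same rank vector here, whereas a rotation or crop that moves content across block boundaries would generally change it.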
    • An Example of Spatial Signature
    • Variants of Temporal Signatures •  Temporal luminance patterns –  Ordinal ranking of average frame or block intensity in a group of frames •  Temporal differential luminance patterns ✪ –  Sum of absolute pixel or block difference – quantized and thresholded –  Block motion vectors – histogram of quantized directions •  Shot duration sequence
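The "temporal differential luminance pattern" can be sketched as follows. This is a simplified reading of the slide: the absolute pixel difference between consecutive frames is averaged and thresholded to one bit per frame pair. Normalizing by the pixel count, the single threshold, and the name `temporal_diff_bits` are assumptions for illustration.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]


def temporal_diff_bits(frames: Sequence[Frame], threshold: float) -> List[int]:
    """One bit per consecutive frame pair: 1 if the mean absolute pixel
    difference exceeds the threshold, else 0."""
    bits = []
    for prev, cur in zip(frames, frames[1:]):
        sad = sum(
            abs(a - b)
            for row_prev, row_cur in zip(prev, cur)
            for a, b in zip(row_prev, row_cur)
        )
        pixel_count = len(prev) * len(prev[0])
        bits.append(1 if sad / pixel_count > threshold else 0)
    return bits


def flat_frame(value: float, size: int = 2) -> List[List[float]]:
    """Helper for the toy example: a size x size frame of constant luminance."""
    return [[value] * size for _ in range(size)]


# Four toy frames: a small change, then a large one (e.g. a shot cut), then none.
frames = [flat_frame(10), flat_frame(12), flat_frame(100), flat_frame(100)]
bits = temporal_diff_bits(frames, threshold=5)
```

A sequence of N frames gives N - 1 bits, so the signature is naturally time-based and very compact.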
    • Color Signatures •  Histogram-based –  Level-quantized histogram, e.g., (32, 16, 16) for Y, U, V, followed by magnitude quantization on each bin ✪ –  Level-quantized histogram, followed by ordinal ranking of histogram bins by magnitude
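A rough sketch of the first histogram variant follows, under these assumptions: Y, U, and V get separate histograms with 32, 16, and 16 bins, and each bin count is then quantized to one of four magnitude levels relative to the total pixel count. The slide does not say whether the histograms are joint or per-channel, nor how many magnitude levels are used, so both choices and all function names are illustrative only.

```python
from typing import List, Sequence, Tuple


def channel_histogram(values: Sequence[int], bins: int, max_value: int = 256) -> List[int]:
    """Level-quantized histogram: map each 0..max_value-1 sample to one of `bins` bins."""
    hist = [0] * bins
    for v in values:
        hist[v * bins // max_value] += 1
    return hist


def quantize_magnitudes(hist: Sequence[int], levels: int = 4) -> List[int]:
    """Quantize each bin count to 0..levels-1 by its share of all samples."""
    total = sum(hist)
    if total == 0:
        return [0] * len(hist)
    return [min(levels - 1, count * levels // total) for count in hist]


def color_signature(
    pixels: Sequence[Tuple[int, int, int]],
    bins: Tuple[int, int, int] = (32, 16, 16),
    levels: int = 4,
) -> List[int]:
    """Concatenate the magnitude-quantized Y, U, and V histograms."""
    signature: List[int] = []
    for channel, channel_bins in enumerate(bins):
        hist = channel_histogram([p[channel] for p in pixels], channel_bins)
        signature.extend(quantize_magnitudes(hist, levels))
    return signature


# Toy frame of four (Y, U, V) pixels: three dark, one bright, neutral chroma.
pixels = [(0, 128, 128)] * 3 + [(255, 128, 128)]
sig = color_signature(pixels)
```

With these settings the signature has 64 small integers per frame, i.e. 128 bits at two bits per bin.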
    • Transform-Domain Signatures •  Affine transformation resilient –  Polar Fourier transform –  Radon transform ✪ –  Singular Value Decomposition •  Energy compaction –  3D DCT –  3D Wavelet transform
    • Which One to Use? •  Spatial signatures, particularly block-based, are the overall category winner, and most widely used. •  Temporal and color signatures are less robust, but can be used along with spatial signatures to enhance discriminability. •  Transform-domain signatures are computationally expensive and not widely used in practice. •  The weakness of block-based spatial signatures is their lack of resilience against excessive geometric distortion, e.g., rotation and cropping.
    • Challenges of Geometric Distortions [figure panels: original; rotation by 10 degrees; rotation + cropping]
    • Fingerprinting Performance •  Video fingerprint using block-based spatial signatures –  Data size: a few hundred bits per frame, or <10 Kbps –  Speed: 1/10 of playback time (10x real time) or faster for standard-definition video.
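The two figures on this slide are consistent with each other, as a quick calculation suggests. The 300 bits per frame and 30 frames per second used below are assumed example values, not numbers from the presentation.

```python
def fingerprint_bitrate_kbps(bits_per_frame: float, frames_per_second: float) -> float:
    """Fingerprint data rate in kilobits per second."""
    return bits_per_frame * frames_per_second / 1000.0


def megabytes_per_hour(rate_kbps: float) -> float:
    """Storage needed for one hour of fingerprints at the given data rate."""
    return rate_kbps * 1000.0 * 3600.0 / 8.0 / 1_000_000.0


def processing_minutes(video_minutes: float, speedup: float = 10.0) -> float:
    """Time to fingerprint a video at `speedup` times real time."""
    return video_minutes / speedup


rate_kbps = fingerprint_bitrate_kbps(300, 30)   # 300 bits/frame at 30 fps (assumed)
storage_mb = megabytes_per_hour(rate_kbps)      # storage for one hour of video
minutes = processing_minutes(60)                # a one-hour video at 10x real time
```

Under these assumptions the rate comes to 9 Kbps, about 4 MB of fingerprint data per hour of video, and a one-hour video is fingerprinted in about six minutes.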
    • Fingerprint Matching and Search
    • Similarity Measures •  Distance-based ✪ –  L1 (Manhattan) or L2 (Euclidean) distance •  For non-binary signatures •  Weights can be assigned when multiple signatures are used –  Hamming Distance •  For binary signatures •  Probability-based –  Probabilistic models for common distortion vectors
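The distance-based measures above can be written down directly. The sketch below packs binary signatures into Python integers and represents non-binary signatures as lists of numbers; the weighted combination and its weights are illustrative assumptions.

```python
import math
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary signatures packed into ints."""
    return bin(a ^ b).count("1")


def l1_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Manhattan distance between two non-binary signatures of equal length."""
    return sum(abs(a - b) for a, b in zip(x, y))


def l2_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two non-binary signatures of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def weighted_distance(distances: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-signature distances (e.g. spatial and color) with fixed weights."""
    return sum(w * d for w, d in zip(weights, distances))


spatial = l1_distance([1, 2, 3], [2, 4, 6])
color = l2_distance([0, 0], [3, 4])
combined = weighted_distance([spatial, color], [0.5, 0.5])
bit_errors = hamming_distance(0b1011, 0b0010)
```

In practice the individual distances would usually be normalized before weighting so that no single signature type dominates.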
    • Complexity of Fingerprint Search •  Exhaustive search has linear complexity, or O(K*N) –  N is the size of the reference fingerprint DB, in minutes or hours. –  K is the length of the query video. –  N can be further decomposed into M*L •  M is the number of reference video fingerprints in the DB •  L is the average length of the video fingerprints in the DB •  The curse is on N or M, the DB size.
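To make the O(K*N) cost concrete, here is a minimal sketch of exhaustive search: the query of K per-frame fingerprints is slid across every offset of every reference, and the Hamming distances are summed. The dictionary layout, the toy 4-bit fingerprints, and the comparison counter are assumptions for the example.

```python
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two per-frame fingerprints."""
    return bin(a ^ b).count("1")


def exhaustive_search(
    query: List[int], database: Dict[str, List[int]]
) -> Tuple[Tuple[Optional[str], int, float], int]:
    """Try every offset of every reference.
    Returns ((name, offset, distance) of the best match, frame comparisons made)."""
    best: Tuple[Optional[str], int, float] = (None, -1, float("inf"))
    comparisons = 0
    k = len(query)
    for name, reference in database.items():
        for offset in range(len(reference) - k + 1):
            distance = 0
            for q, r in zip(query, reference[offset:offset + k]):
                distance += hamming(q, r)
                comparisons += 1
            if distance < best[2]:
                best = (name, offset, distance)
    return best, comparisons


# M = 2 references with N = 9 frames in total; the query has K = 2 frames.
database = {
    "clip_a": [0b0000, 0b1111, 0b1010, 0b0101, 0b0011],
    "clip_b": [0b1100, 0b0110, 0b1001, 0b0000],
}
query = [0b1010, 0b0111]  # frames 2-3 of clip_a with one bit flipped
best_match, comparisons = exhaustive_search(query, database)
```

The comparison count grows with the product of the query length and the total reference length, which is why the database size dominates the cost.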
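    • Strategies for Fast Search (fingerprint search vs. motion vector search) •  Reduce search space –  Fingerprint search: LSH ✪, sequential alignment –  Motion vector search: greedy search, hierarchical search •  Early exit –  Fingerprint search: Hamming distance > T –  Motion vector search: SAD > T •  Approximation in distance calculation –  Fingerprint search: frame down-sampling –  Motion vector search: block down-sampling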
    • Locality Sensitive Hashing (LSH) •  Consider the ε-NNS (approximate nearest neighbor search) problem, –  For a query point q, find an approximate nearest point p such that d(q,p) < (1+ε) d(q,P), where d(q,P) is the distance from q to its nearest neighbor in the point set P –  LSH guarantees p can be found, with high probability, in O(N^(1/(1+ε))) •  Geometric reasoning: –  Close points in space are likely to be close after hashing (e.g., a projection onto a lower dimensional space) –  By using multiple hash functions, the probability of close points falling close is increased
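One common LSH family for binary fingerprints is bit sampling, where each hash table keys on a random subset of bit positions. The sketch below uses that family as an illustration; the slides do not say which LSH scheme is meant, and the class name, parameters, and toy 16-bit fingerprints are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Optional, Tuple


class BitSamplingLSH:
    """Hash tables keyed on random bit subsets; nearby fingerprints tend to collide."""

    def __init__(self, num_bits: int, bits_per_key: int, num_tables: int, seed: int = 0):
        rng = random.Random(seed)
        # Each table samples its own subset of bit positions (one hash function).
        self.positions: List[List[int]] = [
            rng.sample(range(num_bits), bits_per_key) for _ in range(num_tables)
        ]
        self.tables: List[Dict[Tuple[int, ...], List[Hashable]]] = [
            defaultdict(list) for _ in range(num_tables)
        ]
        self.items: Dict[Hashable, int] = {}

    @staticmethod
    def _key(fingerprint: int, positions: List[int]) -> Tuple[int, ...]:
        return tuple((fingerprint >> p) & 1 for p in positions)

    def add(self, item_id: Hashable, fingerprint: int) -> None:
        self.items[item_id] = fingerprint
        for positions, table in zip(self.positions, self.tables):
            table[self._key(fingerprint, positions)].append(item_id)

    def query(self, fingerprint: int) -> Optional[Hashable]:
        """Return the closest colliding item by Hamming distance, or None if no collision."""
        candidates = set()
        for positions, table in zip(self.positions, self.tables):
            candidates.update(table.get(self._key(fingerprint, positions), []))
        if not candidates:
            return None
        return min(candidates, key=lambda i: bin(self.items[i] ^ fingerprint).count("1"))


index = BitSamplingLSH(num_bits=16, bits_per_key=4, num_tables=8)
index.add("a", 0b1010101010101010)
index.add("b", 0b0000111100001111)
index.add("c", 0b1111111100000000)

exact = index.query(0b1010101010101010)  # identical fingerprints always collide
near = index.query(0b1010101010101000)   # one bit off: found only with high probability
```

Only the items that collide with the query in at least one table are compared in full, which is where the reduction in search space comes from; using more tables raises the chance that a true near neighbor is among them.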
    • Other Approximation Techniques •  Multi-resolution coarse-to-fine search –  Fine-level search can be terminated (early exit) if coarse-level search is far off. –  Rank candidates by coarse-level search scores and take only the top N candidates for fine-level search. •  Adaptive hashing – “learning to hash” –  Hashing is non-deterministic; the system is trained to adapt to the identification task and data. –  A substantial reduction in search space.
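The coarse-to-fine idea can be sketched as below. For simplicity the example assumes the query and references are already aligned and of equal length, uses frame down-sampling as the coarse level, and treats `step`, `top_n`, and `coarse_limit` as hypothetical tuning parameters.

```python
from typing import Dict, List, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two per-frame fingerprints."""
    return bin(a ^ b).count("1")


def sequence_distance(x: List[int], y: List[int]) -> int:
    """Sum of per-frame Hamming distances over two aligned sequences."""
    return sum(hamming(a, b) for a, b in zip(x, y))


def coarse_to_fine(
    query: List[int],
    references: Dict[str, List[int]],
    step: int = 2,
    top_n: int = 2,
    coarse_limit: Optional[int] = None,
) -> Optional[str]:
    """Score every reference on down-sampled frames, then run the full
    comparison only on the top_n survivors."""
    coarse_query = query[::step]
    scored = []
    for name, reference in references.items():
        coarse = sequence_distance(coarse_query, reference[::step])
        if coarse_limit is not None and coarse > coarse_limit:
            continue  # early exit: too far off at the coarse level
        scored.append((coarse, name))
    scored.sort()
    shortlist = [name for _, name in scored[:top_n]]
    if not shortlist:
        return None
    return min(shortlist, key=lambda name: sequence_distance(query, references[name]))


references = {
    "a": [0b0000, 0b1111, 0b0000, 0b1111],
    "b": [0b0001, 0b0000, 0b0000, 0b0000],
    "c": [0b1111, 0b1111, 0b1111, 0b1111],
}
query = [0b0000, 0b1110, 0b0001, 0b1111]
match = coarse_to_fine(query, references, step=2, top_n=2, coarse_limit=4)
```

Here reference "c" is rejected at the coarse level and never reaches the full comparison, so the fine-level work is done on two candidates instead of three.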
    • Applications
    • UGC & P2P – copyright concerns? •  UGC (user-generated content) traffic in 07/2007 (Source: comScore, November 30, 2007) –  70 million people viewed 2.5 billion videos on YouTube.com (39.4% of total UGC audience) –  38 million people viewed 360 million videos on MySpace.com (22.6% of total UGC audience) •  P2P traffic in 2007 (Source: iPoque, November 28, 2007) –  On average 50-60% of total Internet traffic: 49% in the Middle East; 83% in Eastern Europe. –  BitTorrent 66.7%, eDonkey 28.6% of total P2P traffic
    • Video Content Registration •  A reference video fingerprint database is pre-populated. •  Two types of information are stored with video fingerprint data in the reference database –  Metadata, e.g., title, owner, release date, etc. –  Business rules, e.g., allow, filter, or advertise, possibly based on certain conditions •  MovieLabs’ Content Recognition Rules (CRR) is an industry standard interface for expressing and exchanging rules.
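A reference database entry of the kind described above might be modeled as follows. This is a purely hypothetical data structure for illustration: the field names, the single "minimum matched duration" condition, and the fallback to "allow" are assumptions, and it does not represent the MovieLabs CRR format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReferenceEntry:
    """One registered video: fingerprint data plus metadata and a business rule."""
    fingerprint: List[int]           # per-frame fingerprints (hypothetical layout)
    metadata: Dict[str, str]         # e.g. title, owner, release date
    action: str                      # "allow", "filter", or "advertise"
    min_match_seconds: float = 0.0   # hypothetical condition on the rule


def decide(entry: ReferenceEntry, matched_seconds: float) -> str:
    """Apply the entry's rule if its condition holds; otherwise allow the upload."""
    if matched_seconds >= entry.min_match_seconds:
        return entry.action
    return "allow"


# Placeholder entry with made-up values.
entry = ReferenceEntry(
    fingerprint=[0b1010, 0b0101],
    metadata={"title": "Example Title", "owner": "Example Studio", "release_date": "2008-01-01"},
    action="filter",
    min_match_seconds=30.0,
)

short_clip = decide(entry, matched_seconds=10.0)
long_clip = decide(entry, matched_seconds=120.0)
```

Keeping the rule next to the fingerprint means that a successful match immediately yields both the identity of the content and the action to take.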
    • Video Content Filtering
    • Video Content Tracking
    • Example: Video Content Tracking
    • Tracking Olympic Video Distribution
    • Other Applications •  Broadcast monitoring –  Audit TV program and commercial airings •  Contextual ads (monetization) –  Pair ads with identified content, as Google AdSense does •  Video asset management –  Content-based IDs identify linkage between edits and sources •  Content-based video search –  Query by video clip
    • Summary •  Research in video fingerprinting began a decade ago; it has since developed into a technology and been adopted by the industry. •  Different types of signatures are used to form a video fingerprint, including spatial, temporal, color, and transform-domain signatures. •  Spatial signatures are the overall winner judged by multiple criteria, and are widely adopted as primary signatures; temporal and color signatures can be used as secondary signatures to enhance discriminability. •  Brute-force, exhaustive fingerprint search is an O(K*N) problem. •  Fast approximate algorithms make fingerprint search tractable and scalable for practical applications. •  Current applications focus on copyright enforcement; other applications being developed and tested include contextual advertising, asset management, and content-based video search.