Comparison of Semantic Similarity Measures for NDVC Detection Using Semantic Features

Hyun-seok Min, Jae Young Choi, Wesley De Neve, and Yong Man Ro
Image and Video Systems Lab
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea

I. INTRODUCTION

- Observations
  - an increasing number of near-duplicate video clips (NDVCs) can be found on websites for video sharing
  - content transformations tend to preserve semantic information
- Novel idea
  - NDVC detection by means of semantic features and adaptive semantic distance measurement
- Objective
  - to answer the question: 'which semantic similarity measure is most effective in the context of NDVC detection using semantic features?'

II. SEMANTIC NDVC DETECTION

Fig. 1. NDVC detection by means of semantic video signatures.
[Flowchart: Input: query video clip -> Video shot segmentation (Shot 1 ... Shot i ... Shot N) -> Semantic concept detection (image folksonomy; tag relevance learning using neighbor voting) -> Creation of a semantic video signature -> Matching of semantic video signatures (reference video database) -> Output: NDVC identification]

Semantic signature of shot i:

  $A_i = \{ (t_{i,j}, w_{i,j}) : j = 1, \ldots, |A_i| \}$

  $w_{i,j}$ is a weight value for tag $t_{i,j}$

Distance between two shots:

  $D_{shot}(S^q, S^r) = \mathrm{SQFD}(A^q, A^r) = \sqrt{ (w^q \mid -w^r) \, G \, (w^q \mid -w^r)^T }$

  SQFD: Signature Quadratic Form Distance
  $w$: vector of weight values for the tags t under consideration
  $G$: matrix of ground distances (computed using tag statistics)

III. SEMANTIC SIMILARITY MEASURES

1. Similarity measurement using the WordNet knowledge base

1.1. Leacock–Chodorow: relies on the length of the shortest path between two concepts

  $\mathrm{sim}_{LC}(t_i, t_j) = \log \frac{\mathrm{len}(t_i, t_j)}{2E}$

  $\mathrm{len}(t_i, t_j)$: the shortest path between two concepts $(t_i, t_j)$
  $E$: the overall depth of the taxonomy used

1.2. Resnik: measures the information content of the most specific common ancestor of two concepts

  $\mathrm{sim}_{R}(t_i, t_j) = \log p(\mathrm{lso}(t_i, t_j))$

  $\mathrm{lso}(t_i, t_j)$: the lowest super-ordinate of $t_i$ and $t_j$
  $p(t)$: the probability of encountering an instance of a concept t in a certain corpus

1.3. Jiang–Conrath: based on the conditional probability of encountering an instance of a child concept in a certain corpus

  $\mathrm{sim}_{JC}(t_i, t_j) = \frac{1}{\log p(t_i) + \log p(t_j) - \log p(\mathrm{lso}(t_i, t_j))}$

1.4. Lin: follows from his theory of similarity between arbitrary objects

  $\mathrm{sim}_{L}(t_i, t_j) = \frac{2 \times \log p(\mathrm{lso}(t_i, t_j))}{\log p(t_i) + \log p(t_j)}$

2. Similarity measurement using Flickr tag occurrence and co-occurrence statistics

  $\mathrm{sim}_{TC}(t_i, t_j) = \frac{|I_{t_i \cap t_j}|}{|I_{t_i}|}$

  $I_{t_i \cap t_j}$: the set of images annotated with both $t_i$ and $t_j$
  $I_{t_i}$: the set of images annotated with tag $t_i$

IV. EXPERIMENTS

1. Experimental setup

- Use of TRECVID 2009 for creating NDVCs and reference video clips
- Use of MIRFLICKR-25000 as a source of collective knowledge
- Use of Toolbox and the Natural Language Toolkit (NLTK) for WordNet-based semantic similarity measurement

2. Experimental results

- Semantic NDVC detection is, in general, most effective when similarity measurement makes use of tag statistics derived from Flickr
  - similarity measurement using Flickr-based tag statistics is able to exploit an unrestricted concept vocabulary, whereas the WordNet-based similarity measures are only able to make use of semantic concepts that are part of the English-language version of WordNet

Fig. 2. Influence of semantic similarity measurement on the effectiveness of semantic NDVC detection. The lower the NDCR, the more effective NDVC detection.
[Plot: NDCR (axis from 0 to 0.8) per transformation (blur, crop, pattern insertion, change in brightness, mirroring, resize, shift, average) for Tag statistics, Leacock–Chodorow, Jiang–Conrath, Lin, and Resnik]

V. CONCLUSIONS

- We presented a novel technique for NDVC detection
  - takes advantage of the collective knowledge in an image folksonomy, thus allowing for the use of an unrestricted concept vocabulary
- We quantified the influence of several semantic similarity measures on the effectiveness of NDVC detection using semantic features
  - semantic NDVC detection is most effective when semantic similarity measurement takes advantage of tag occurrence and co-occurrence statistics derived from Flickr (an unstructured source of knowledge), outperforming semantic similarity measurement that takes advantage of WordNet (a knowledge base with a hierarchical structure)

The International Conference on Multimedia Information Technology and Applications (MITA), July 2012, Beijing (China)
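
The tag co-occurrence similarity and the SQFD-based shot distance described on the poster can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy image collection, and the signature weights are hypothetical. The sketch also assumes that each entry of G is filled directly with the co-occurrence similarity of the two tags involved, and that a negative quadratic form (possible because that similarity is not symmetric) is clamped to zero; the poster does not specify either detail.

```python
import math


def tag_cooccurrence_similarity(images, ti, tj):
    """Share of the images tagged with ti that are also tagged with tj."""
    with_ti = [tags for tags in images if ti in tags]
    if not with_ti:
        # Assumption: a tag that never occurs is similar to nothing.
        return 0.0
    with_both = [tags for tags in with_ti if tj in tags]
    return len(with_both) / len(with_ti)


def sqfd(sig_q, sig_r, images):
    """Signature Quadratic Form Distance between two semantic signatures.

    Each signature is a list of (tag, weight) pairs. The weights are
    concatenated as (w_q | -w_r) and combined through the matrix G.
    """
    tags = [t for t, _ in sig_q] + [t for t, _ in sig_r]
    weights = [w for _, w in sig_q] + [-w for _, w in sig_r]

    total = 0.0
    for ta, wa in zip(tags, weights):
        for tb, wb in zip(tags, weights):
            # Assumption: G[a][b] is the tag co-occurrence similarity.
            total += wa * wb * tag_cooccurrence_similarity(images, ta, tb)

    # Assumption: clamp small or negative values before the square root.
    return math.sqrt(max(total, 0.0))


# Hypothetical image folksonomy: one set of tags per image.
IMAGES = [
    {"beach", "sea", "sun"},
    {"beach", "sea"},
    {"beach", "sand"},
    {"city", "car"},
]

if __name__ == "__main__":
    query = [("beach", 0.5), ("sea", 0.5)]
    reference = [("beach", 0.5), ("sand", 0.5)]
    print(tag_cooccurrence_similarity(IMAGES, "beach", "sea"))
    print(sqfd(query, reference, IMAGES))
```

With this choice of G, two identical signatures yield a distance of (numerically) zero, and signatures whose tags never co-occur yield the largest distances, which matches the intended use of the distance for matching semantic video signatures.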