Towards a Better Understanding of Model-Free Semantic Concept Detection for Annotation and Near-Duplicate Video Clip Detection
Hyun-seok Min, Jae Young Choi, Wesley De Neve, and Yong Man Ro
Image and Video Systems Lab
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea
e-mail: ymro@ee.kaist.ac.kr website: http://ivylab.kaist.ac.kr
I. INTRODUCTION
- Observations
  - content transformations tend to preserve semantic information
  - prior research has shown that model-free semantic concept detection can be used for identifying near-duplicate video clips (NDVCs)
    - model: a mapping of a distribution of visual features onto concepts
    - model-free detection needs no training and allows using an unrestricted vocabulary
- Research challenge
  - to better understand the usefulness of model-free semantic concept detection for both video annotation and NDVC detection
II. VIDEO ANNOTATION AND NDVC DETECTION
- Input: query video clip
- Video shot segmentation (Shot 1, ..., Shot i, ..., Shot N) and key frame extraction
- Semantic concept detection per shot: tag relevance learning using neighbor voting against a noisy image folksonomy, yielding the detected semantic concepts of each shot
- Creation of a semantic feature signature per shot
- Matching of semantic feature signatures against a reference video database
- Output: NDVC identification

Fig. 1. Annotation and NDVC detection using model-free semantic concept detection.
Fig. 3. NDCR as a function of the tag relevance threshold.
- Metric for measuring the relevance of a tag t w.r.t. a shot S_i:

  R(t) = c/k - L_t/|F|

  where c is the frequency of t in the set of k neighbors of the key frame of S_i, L_t is the number of images labeled with t in the folksonomy F, and |F| is the number of images in F.
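The neighbor-voting metric above can be sketched in a few lines of Python; the function name and the toy folksonomy below are illustrative assumptions, not the authors' implementation:

```python
def tag_relevance(tag, neighbor_tags, folksonomy_tags, k):
    """Neighbor-voting tag relevance: R(t) = c/k - L_t/|F|.

    neighbor_tags: tag lists of the k visual neighbors of a key frame.
    folksonomy_tags: tag lists of all images in the folksonomy F.
    """
    # c: how many of the k neighbors carry the tag
    c = sum(tag in tags for tags in neighbor_tags)
    # L_t: how many images in F carry the tag (its prior frequency)
    l_t = sum(tag in tags for tags in folksonomy_tags)
    # Subtracting the prior discounts tags that are frequent everywhere
    return c / k - l_t / len(folksonomy_tags)

# Hypothetical toy folksonomy of four images
F = [["sky", "night"], ["sky", "sea"], ["car"], ["sky", "star"]]
neighbors = [["sky", "night"], ["sky", "star"]]  # k = 2 nearest neighbors
print(tag_relevance("sky", neighbors, F, k=2))   # 2/2 - 3/4 = 0.25
```

Subtracting L_t/|F| is what makes the vote meaningful: a tag common to the whole folksonomy scores low even if all k neighbors carry it.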
- Layout of the semantic feature signature A_i of a shot S_i:

  A_i = [(t_{i,j}, w_{i,j})], j = 1, ..., |A_i|

  where w_{i,j} is a weight value for tag t_{i,j}.
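A signature of this layout can be built by keeping only tags whose relevance score clears a threshold, as in the thresholds (1.01-1.15) varied in the experiments. A minimal sketch, where the function name, the use of the relevance score itself as the weight w, and the toy scores are all assumptions:

```python
def build_signature(tag_relevances, threshold):
    """Build a semantic feature signature A_i = [(t, w)] for one shot.

    tag_relevances: dict mapping each candidate tag to its relevance
    score. Tags below the threshold are discarded; here the retained
    score doubles as the tag weight w (an assumption for illustration).
    """
    return [(t, r) for t, r in sorted(tag_relevances.items()) if r >= threshold]

# Hypothetical relevance scores for one key frame
scores = {"sky": 1.20, "night": 1.12, "jupiter": 1.02}
print(build_signature(scores, threshold=1.10))  # keeps night and sky
print(build_signature(scores, threshold=1.01))  # keeps all three tags
```

Raising the threshold shrinks the signature, which is exactly the trade-off measured in Section III.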
- Adaptive semantic distance measurement between shots S^q and S^r:

  D_shot(S^q, S^r) = SQFD(A^q, A^r) = sqrt( (w^q | -w^r) G (w^q | -w^r)^T )

  where SQFD denotes the Signature Quadratic Form Distance, w^q and w^r hold the weight values for the tags t under consideration (see R(t)), and G is the matrix of ground distances (computed using tag statistics).
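The SQFD above can be sketched directly from its definition; the 0/1 tag similarity below is a hypothetical stand-in for the matrix G that the poster derives from tag statistics:

```python
import math

def sqfd(sig_q, sig_r, similarity):
    """Signature Quadratic Form Distance between two tag signatures.

    sig_q, sig_r: lists of (tag, weight) pairs.
    similarity(t1, t2): ground similarity between two tags (illustrative
    stand-in for the statistics-based matrix G on the poster).
    """
    tags = [t for t, _ in sig_q] + [t for t, _ in sig_r]
    # Concatenated weight vector (w_q | -w_r)
    w = [wt for _, wt in sig_q] + [-wt for _, wt in sig_r]
    total = 0.0
    for i, ti in enumerate(tags):
        for j, tj in enumerate(tags):
            total += w[i] * similarity(ti, tj) * w[j]
    return math.sqrt(max(total, 0.0))

# Toy similarity: 1 for identical tags, 0 otherwise (hypothetical)
sim = lambda a, b: 1.0 if a == b else 0.0
q = [("sky", 0.6), ("night", 0.4)]
r = [("sky", 0.6), ("star", 0.4)]
print(sqfd(q, r, sim))  # ~0.566; reduces to Euclidean distance for 0/1 similarity
```

Because the two signatures may carry different tag vocabularies, the quadratic form is evaluated over the union of both tag lists, which is what makes the distance "adaptive".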
Fig. 4. Example keyframes and the concepts detected at tag relevance thresholds of 1.01, 1.10, and 1.15, for blur, crop, picture-in-picture, and brightness-change transformations. Correct semantic concepts have been underlined. In addition, semantic concepts that have been detected for both the reference and near-duplicate video clips have been marked in bold.
III. EXPERIMENTAL RESULTS
1. Effectiveness of video annotation (Fig. 2)
- The higher the tag relevance threshold
  - the lower the average number of detected concepts per shot
  - the higher the precision of video annotation
2. Effectiveness of NDVC detection (Fig. 3)
- The higher the tag relevance threshold
  - the lower the average number of detected concepts per shot
  - the less effective the NDVC detection

Fig. 2. Precision of annotation as a function of the tag relevance threshold.

IV. CONCLUSIONS
- The problem of detecting semantic concepts for the goal of identifying NDVCs is more relaxed than the problem of detecting semantic concepts for the purpose of video annotation
  - incorrectly detected semantic concepts negatively affect the effectiveness of automatic video annotation
  - incorrectly detected semantic concepts do not negatively affect the effectiveness of NDVC detection, as long as the same incorrect semantic concepts are detected for both the original and the near-duplicate video clips
- Practical implication
  - the use of a high tag relevance threshold may result in a high precision of annotation, but in a low NDVC detection effectiveness, and vice versa
  - this trade-off is important for a video management system that simultaneously aims at annotating newly uploaded video clips and at detecting NDVCs
IEEE International Conference on Image Processing (ICIP), September 2011, Brussels (Belgium)