Me12tt tub

Feature Selection Methods for Bag-
of-(visual)-Words Approaches
Schmiedeke, Kelm and Sikora
Communication Systems Group
Technische Universität Berlin

4 October, 2012

Motivation 2

sports

Schmiedeke: “Feature Selection Methods for BoW Approaches”

Lessons from last year 3

Features derived from metadata (esp. tags)
outperform visual and ASR ones
• Metadata: Naive Bayes (non-translated)
• Visual feat.: SVM (avg. pooled histograms)
• ASR transcripts: kNN (JSD)
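The kNN (JSD) setup for ASR transcripts can be illustrated with a minimal sketch. It assumes JSD means the Jensen-Shannon divergence between normalized term histograms and that the label is chosen by majority vote. The function names, the value of k and the toy vectors are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two non-negative, non-zero vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()          # normalize to probability distributions
    q = q / q.sum()
    m = 0.5 * (p + q)        # mixture distribution

    def kl(a, b):
        mask = a > 0         # terms with a == 0 contribute 0 to the sum
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors with the smallest JSD to the query."""
    dists = np.array([jsd(query, t) for t in train_vectors])
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy term-count vectors (hypothetical data)
train = [[5, 1, 0], [4, 2, 0], [0, 1, 5], [0, 2, 4]]
labels = ["sports", "sports", "politics", "politics"]
prediction = knn_predict([6, 1, 0], train, labels, k=3)
```

With base-2 logarithms the divergence is bounded by 0 and 1, which makes distances comparable across transcripts of different lengths.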

Uploaders mainly contribute to a single category


This year’s question 4

Does feature selection improve the results achieved
with the BoW model?


Feature Selection/ Transformation 5

Mutual information:

Term Frequency:

PCA (Eigenvalue decomposition):
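The formulas for these scores are not preserved in the slide text. As a hedged illustration, the sketch below uses the common textbook definition of mutual information between binary term presence and class membership, and a simple class-wise relative term frequency. Both definitions, the function names and the toy matrix are assumptions and may differ from what the authors used.

```python
import numpy as np

def mutual_information(X, y, cls):
    """MI (in bits) between 'term occurs in document' and 'document is in cls'.

    X: (n_docs, n_terms) term-count matrix, y: (n_docs,) class labels.
    Returns one score per term.
    """
    present = np.asarray(X) > 0
    in_cls = np.asarray(y) == cls
    n = present.shape[0]
    scores = np.zeros(present.shape[1])
    for t_val in (True, False):              # term present / absent
        for c_val in (True, False):          # document in class / not in class
            term_mask = present == t_val
            cls_mask = (in_cls == c_val)[:, None]
            n_tc = (term_mask & cls_mask).sum(axis=0).astype(float)
            n_t = term_mask.sum(axis=0).astype(float)
            n_c = float(cls_mask.sum())
            with np.errstate(divide="ignore", invalid="ignore"):
                part = (n_tc / n) * np.log2(n * n_tc / (n_t * n_c))
            scores += np.where(n_tc > 0, part, 0.0)   # empty cells contribute 0
    return scores

def term_frequency(X, y, cls):
    """Relative frequency of each term within the documents of one class."""
    X = np.asarray(X, dtype=float)
    counts = X[np.asarray(y) == cls].sum(axis=0)
    total = counts.sum()
    return counts / total if total > 0 else counts

# Toy data (hypothetical): term 0 marks class "a", term 1 marks class "b",
# term 2 occurs everywhere and therefore carries no class information.
X = [[1, 0, 1], [1, 0, 1], [0, 1, 1], [0, 1, 1]]
y = ["a", "a", "b", "b"]
mi_a = mutual_information(X, y, "a")
tf_a = term_frequency(X, y, "a")
```

Note the difference in behaviour: the term that occurs in every document gets an MI of zero but a high term frequency, so the two scores rank such terms very differently.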


Feature Selection 6

Concepts for term selection:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama     (0.0495)
jesu      (0.0797)        obama      (0.1113)       health   (0.0378)
god       (0.0796)        polit      (0.0982)       report   (0.0357)
unleaven  (0.0782)        grittv     (0.0881)       harta    (0.0227)
eeli      (0.0782)        flander    (0.0861)       exceric  (0.0211)
davideel  (0.0781)        laura      (0.0855)       yoga     (0.0203)
ministri  (0.0780)        economi    (0.0747)       study    (0.0192)

…                         …                         …

daytripp  (0.0)           sonnet     (0.0)          ilsr     (0.0)
adagio    (0.0)           screenplai (0.0)          resystem (0.0)
acustica  (0.0)           acustica   (0.0)          acustica (0.0)


Feature Selection 7

Top-k-Union:

bibl (0.0897) lunch (0.1200) jama (0.0495)
god (0.0796) polit (0.0982) report (0.0357)
ministri(0.0780) economi(0.0747) study (0.0192)

… … …



Feature Selection 8

Top-k:

bibl (0.0897) lunch (0.1200) jama (0.0495)
god (0.0796) polit (0.0982) report (0.0357)

… … …
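The slides name the Top-k-Union and Top-k strategies without defining them in text. The sketch below shows one plausible reading and should be taken as an assumption: Top-k-Union keeps the union of each category's k best terms, while Top-k keeps the k terms with the best score over all categories. The function names are hypothetical; the scores are the top three entries per category from the term lists above.

```python
def top_k_union(scores, k):
    """Union over categories of each category's k highest-scoring terms."""
    selected = set()
    for term_scores in scores.values():
        ranked = sorted(term_scores, key=term_scores.get, reverse=True)
        selected.update(ranked[:k])
    return selected

def top_k(scores, k):
    """The k terms with the highest score in any category (one global ranking)."""
    best = {}
    for term_scores in scores.values():
        for term, score in term_scores.items():
            best[term] = max(score, best.get(term, float("-inf")))
    return set(sorted(best, key=best.get, reverse=True)[:k])

# Per-category term scores (values taken from the slides)
scores = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "god": 0.0796},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "polit": 0.0982},
    "health": {"jama": 0.0495, "health": 0.0378, "report": 0.0357},
}

union_terms = top_k_union(scores, k=1)   # one term per category
global_terms = top_k(scores, k=2)        # dominated by the politics scores
```

Under this reading, Top-k-Union guarantees that every category contributes terms, whereas a single global ranking can be dominated by categories with generally higher scores.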



Feature Selection 9

Union>Th:

bibl (0.0897) lunch (0.1200) jama (0.0495)
god (0.0796) polit (0.0982) report (0.0357)

… … …

0.0002 0.0002 0.0001


Feature Selection 10

Intersection>Th:

bibl (0.0897) lunch (0.1200) jama (0.0495)
god (0.0796) polit (0.0982) report (0.0357)
… … …
web appl gossip
python googl interview
xbox teen iphon
big music san
expo tv texa
… … …
0.0002 0.0002 0.0001
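A minimal sketch of the two threshold strategies, under the assumption that Union>Th keeps a term if its score exceeds the category threshold in at least one category and Intersection>Th keeps it only if this holds in every category. The function names are hypothetical. The thresholds and the scores 0.0897, 0.0357 and 0.0 come from the slides; the remaining scores are made up for illustration.

```python
def union_above(scores, thresholds):
    """Terms whose score exceeds the threshold in at least one category."""
    return {
        term
        for category, term_scores in scores.items()
        for term, score in term_scores.items()
        if score > thresholds[category]
    }

def intersection_above(scores, thresholds):
    """Terms whose score exceeds the threshold in every category."""
    per_category = [
        {term for term, score in term_scores.items() if score > thresholds[category]}
        for category, term_scores in scores.items()
    ]
    return set.intersection(*per_category) if per_category else set()

thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}
scores = {
    # bibl/religion, report/health and all acustica scores are from the slides;
    # the other values are illustrative assumptions.
    "religion": {"bibl": 0.0897, "report": 0.0010, "acustica": 0.0},
    "politics": {"bibl": 0.0001, "report": 0.0030, "acustica": 0.0},
    "health": {"bibl": 0.0, "report": 0.0357, "acustica": 0.0},
}

kept_union = union_above(scores, thresholds)
kept_intersection = intersection_above(scores, thresholds)
```

The union variant retains category-specific terms such as bibl, while the intersection variant is much stricter and only keeps terms that are informative for all categories.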


Official runs 11

Bag of clustered SURF features transformed
using PCA
• Result does not benefit from transformation

        official run    without FS/FT
mAP     0.2301          0.2309
CA      41.63 %         41.71 %
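For reference, PCA via eigenvalue decomposition of the covariance matrix can be sketched as follows. This is a generic NumPy illustration, not the authors' implementation; the function names, the number of retained components and the toy data are assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean, the leading eigenvectors and their eigenvalues."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)       # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]

def pca_transform(X, mean, components):
    """Project centered data onto the retained eigenvectors."""
    return (np.asarray(X, dtype=float) - mean) @ components

# Toy "histograms" (hypothetical): all points lie on one line, so a single
# component captures the full variance.
X = [[0, 0], [1, 2], [2, 4], [3, 6]]
mean, components, eigvals = pca_fit(X, n_components=1)
Z = pca_transform(X, mean, components)
```

In the BoW setting, X would hold one pooled codeword histogram per video, and the projection would be fitted on the development set only.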


Official runs 12

Bag of filtered ASR transcripts terms (Union>Th)
• Result does benefit from selection

        official run    without FS/FT
mAP     0.1035          0.0522
CA      32.53 %         26.54 %


Official runs 13

Bag of clustered SURF features filtered using MI
and the Intersection>Th strategy
• Result does slightly benefit from selection

        official run    without FS/FT
mAP     0.2259          0.2221
CA      40.80 %         40.78 %


Official runs 14

Bag of filtered terms derived from tags, title and
descriptions (Union>Th)
• Result does benefit from selection

        official run    without FS/FT
mAP     0.5225          0.4146
CA      58.18 %         55.70 %


Official runs 15

Bag of clustered SURF features transformed
using PCA and decision fusion using uploader
• Result does benefit from transformation

        official run    without FS/FT
mAP     0.3304          0.2988
CA      52.14 %         49.19 %


Conclusion & Future Work 16

FS showed potential for improving the results

The choice between MI and TF is not critical; both
methods achieve roughly the same results
• Metadata (mAP): MI_12004 (0.5277) vs. TF_14976 (0.5275)

Investigation of different scaling schemes (NB)

Use of class-independent selection score (MI)


Backup 17


Backup 18


Extracting visual features 19

SURF features are extracted from each key frame
• At keypoints and at a regular grid

Vocabulary is built using hierarchical clustering
on SURF features of development set
• 4096/8196 codewords

The term vector for a single video is obtained by bin-wise
pooling of the term vectors of its key frames
• Average (avg) pooling
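The pipeline above can be sketched end to end on toy data. The sketch assumes a flat nearest-codeword assignment with Euclidean distance instead of the hierarchical vocabulary, and uses 2-D toy descriptors rather than real SURF descriptors. All names and values are illustrative assumptions.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Index of the nearest codeword (Euclidean distance) for each descriptor."""
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def frame_histogram(descriptors, codebook):
    """Normalized codeword histogram (term vector) of one key frame."""
    words = quantize(np.asarray(descriptors, dtype=float), codebook)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum() if hist.sum() > 0 else hist

def video_vector(frames, codebook):
    """Bin-wise average pooling of the key-frame histograms of one video."""
    return np.mean([frame_histogram(f, codebook) for f in frames], axis=0)

# Toy vocabulary with two codewords and a video with two key frames
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
frames = [
    [[0.0, 1.0], [1.0, 0.0]],      # both descriptors map to codeword 0
    [[9.0, 9.0], [0.0, 0.0]],      # one descriptor per codeword
]
pooled = video_vector(frames, codebook)
```

Normalizing each key-frame histogram before pooling keeps frames with many keypoints from dominating the video-level vector; whether the authors normalized per frame is not stated in the slides.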


MediaEval 2012: Tagging Task 20

Question: What is each video’s blip.tv category?
Blip.tv database (cc): ~ 3300 h
• 5288 training videos
• 9550 test videos
The official evaluation measure is mean
average precision (mAP)
Workshop will be held 4-5 October 2012 in Pisa,
Italy
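As a reminder of how mAP is computed, the sketch below ranks the test videos per category by classifier score, averages the precision at each relevant rank, and then averages over categories. It is a generic illustration and may differ in details (for example tie handling) from the official evaluation tool. The function names and toy scores are assumptions.

```python
import numpy as np

def average_precision(scores, relevant):
    """AP of one ranked list: mean precision at the ranks of the relevant items."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    rel = np.asarray(relevant, dtype=bool)[order]
    if rel.sum() == 0:
        return 0.0
    hits = np.cumsum(rel)                   # relevant items seen up to each rank
    ranks = np.arange(1, len(rel) + 1)
    return float((hits[rel] / ranks[rel]).mean())

def mean_average_precision(score_matrix, labels, classes):
    """Mean of the per-category APs; score_matrix is (n_videos, n_classes)."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    labels = np.asarray(labels)
    aps = [
        average_precision(score_matrix[:, j], labels == c)
        for j, c in enumerate(classes)
    ]
    return float(np.mean(aps))

# Toy example (hypothetical scores): relevant items at ranks 1 and 3
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])   # (1/1 + 2/3) / 2
m_ap = mean_average_precision(
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]], ["a", "b", "b"], ["a", "b"]
)
```

Unlike classification accuracy (CA), mAP depends on the full ranking per category, so two runs with similar CA can still differ noticeably in mAP.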

