TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visual)-Words Approaches

Published in: Technology

  1. Feature Selection Methods for Bag-of-(visual)-Words Approaches. Schmiedeke, Kelm and Sikora, Communication Systems Group, Technische Universität Berlin, 4 October 2012
  2. Motivation: sports (slide footer: Schmiedeke, “Feature Selection Methods for BoW Approaches”)
  3. Lessons from last year
     Features derived from metadata (especially tags) outperform visual and ASR features:
       • Metadata: Naive Bayes (non-translated)
       • Visual features: SVM (average-pooled histograms)
       • ASR transcripts: kNN (JSD)
     Uploaders mainly contribute to a single category.
  4. This year's question: Does feature selection improve the results achieved with the BoW model?
  5. Feature Selection / Transformation
       • Mutual information
       • Term frequency
       • PCA (eigenvalue decomposition)
  6. Feature Selection: concepts for term selection

     Top terms for religion    Top terms for politics    Top terms for health
     bibl       (0.0897)       lunch      (0.1200)       jama       (0.0495)
     jesu       (0.0797)       obama      (0.1113)       health     (0.0378)
     god        (0.0796)       polit      (0.0982)       report     (0.0357)
     unleaven   (0.0782)       grittv     (0.0881)       harta      (0.0227)
     eeli       (0.0782)       flander    (0.0861)       exceric    (0.0211)
     davideel   (0.0781)       laura      (0.0855)       yoga       (0.0203)
     ministri   (0.0780)       economi    (0.0747)       study      (0.0192)
     …                         …                         …
     daytripp   (0.0)          sonnet     (0.0)          ilsr       (0.0)
     adagio     (0.0)          screenplai (0.0)          resystem   (0.0)
     acustica   (0.0)          acustica   (0.0)          acustica   (0.0)

  7. Feature Selection: Top-k-Union (shown on the same ranked term lists as slide 6)
  8. Feature Selection: Top-k (shown on the same ranked term lists as slide 6)
  9. Feature Selection: Union>Th (shown on the same ranked term lists as slide 6; values shown beneath the religion, politics and health columns: 0.0002, 0.0002, 0.0001)
  10. Feature Selection: Intersection>Th

      Top terms for religion    Top terms for politics    Top terms for health
      bibl       (0.0897)       lunch      (0.1200)       jama       (0.0495)
      jesu       (0.0797)       obama      (0.1113)       health     (0.0378)
      god        (0.0796)       polit      (0.0982)       report     (0.0357)
      …                         …                         …
      web                       appl                      gossip
      python                    googl                     interview
      xbox                      teen                      iphon
      big                       music                     san
      expo                      tv                        texa
      …                         …                         …
      daytripp   (0.0)          sonnet     (0.0)          ilsr       (0.0)
      adagio     (0.0)          screenplai (0.0)          resystem   (0.0)
      acustica   (0.0)          acustica   (0.0)          acustica   (0.0)

      Values shown beneath the columns: 0.0002, 0.0002, 0.0001

  11. Official runs: bag of clustered SURF features transformed using PCA
      • Result does not benefit from the transformation

            official run   without FS/FT
      mAP   0.2301         0.2309
      CA    41.63 %        41.71 %

  12. Official runs: bag of filtered ASR transcript terms (Union>Th)
      • Result does benefit from the selection

            official run   without FS/FT
      mAP   0.1035         0.0522
      CA    32.53 %        26.54 %

  13. Official runs: bag of clustered SURF features filtered using MI and the Intersection>Th strategy
      • Result benefits slightly from the selection

            official run   without FS/FT
      mAP   0.2259         0.2221
      CA    40.80 %        40.78 %

  14. Official runs: bag of filtered terms derived from tags, titles and descriptions (Union>Th)
      • Result does benefit from the selection

            official run   without FS/FT
      mAP   0.5225         0.4146
      CA    58.18 %        55.70 %

  15. Official runs: bag of clustered SURF features transformed using PCA, with decision fusion using the uploader
      • Result does benefit from the transformation

            official run   without FS/FT
      mAP   0.3304         0.2988
      CA    52.14 %        49.19 %

  16. Conclusion & Future Work
      FS showed potential for improving the results.
      The choice between MI and TF is not critical; both methods achieve roughly the same results:
        • Metadata (mAP): MI12004 (0.5277) vs. TF14976 (0.5275)
      Investigation of different scaling schemes (NB)
      Use of a class-independent selection score (MI)
  17. Backup
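
Slide 5 names mutual information (MI) and term frequency (TF) as selection scores, but the formulas are not preserved in this transcript. The following is a minimal sketch, assuming the common text-categorisation definition of MI between "term occurs in the document" and "document belongs to the class", and plain per-class occurrence counts for TF. The function names and the toy corpus are hypothetical, and the authors' exact formulation may differ.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term, cls):
    """MI in bits between 'term occurs in doc' and 'doc belongs to cls'.

    Assumed definition: sum over the four cells of the 2x2 contingency
    table of p(t, c) * log2(p(t, c) / (p(t) * p(c))).
    """
    n = len(docs)
    counts = Counter((term in doc, label == cls) for doc, label in zip(docs, labels))
    mi = 0.0
    for t in (True, False):
        for c in (True, False):
            n_tc = counts[(t, c)]
            if n_tc == 0:
                continue  # an empty cell contributes nothing (0 * log 0 := 0)
            n_t = counts[(t, True)] + counts[(t, False)]   # docs with this term state
            n_c = counts[(True, c)] + counts[(False, c)]   # docs with this class state
            mi += (n_tc / n) * math.log2(n_tc * n / (n_t * n_c))
    return mi


def term_frequency(docs, labels, term, cls):
    """Total number of occurrences of term in documents labelled cls."""
    return sum(doc.count(term) for doc, label in zip(docs, labels) if label == cls)


# Toy corpus of stemmed token lists (hypothetical data, not from the task).
docs = [["god", "bibl"], ["god", "jesu"], ["obama", "polit"], ["polit", "economi"]]
labels = ["religion", "religion", "politics", "politics"]

mi_god = mutual_information(docs, labels, "god", "religion")
mi_bibl = mutual_information(docs, labels, "bibl", "religion")
mi_yoga = mutual_information(docs, labels, "yoga", "religion")
```

Under this definition a term that never occurs, such as "yoga" in the toy corpus, scores exactly zero, which is consistent with the (0.0) entries at the bottom of the ranked lists on slide 6.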
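
Slides 7 to 10 name four selection strategies (Top-k-Union, Top-k, Union>Th, Intersection>Th) without defining them in the transcript. The sketch below shows one plausible reading of each, and every definition in it is an assumption: Top-k-Union takes the union of each class's k best terms, Top-k takes the k terms with the highest score in any class, Union>Th keeps terms above the class threshold in at least one class, and Intersection>Th keeps terms above the threshold in every class. The function names are hypothetical, and the scores for "web" are invented for illustration; the other scores and the thresholds are taken from slides 6 and 9.

```python
def top_k_union(scores, k):
    """Union over classes of each class's k highest-scoring terms."""
    selected = set()
    for term_scores in scores.values():
        ranked = sorted(term_scores, key=term_scores.get, reverse=True)
        selected.update(ranked[:k])
    return selected


def top_k(scores, k):
    """The k terms with the highest score in any class (assumed reading)."""
    best = {}
    for term_scores in scores.values():
        for term, score in term_scores.items():
            best[term] = max(score, best.get(term, float("-inf")))
    return set(sorted(best, key=best.get, reverse=True)[:k])


def union_above(scores, thresholds):
    """Terms whose score exceeds the class threshold in at least one class."""
    return {
        term
        for cls, term_scores in scores.items()
        for term, score in term_scores.items()
        if score > thresholds[cls]
    }


def intersection_above(scores, thresholds):
    """Terms whose score exceeds the class threshold in every class."""
    per_class = [
        {term for term, score in term_scores.items() if score > thresholds[cls]}
        for cls, term_scores in scores.items()
    ]
    return set.intersection(*per_class) if per_class else set()


scores = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "god": 0.0796, "web": 0.0010, "acustica": 0.0},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "polit": 0.0982, "web": 0.0008, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002}
```

With these toy inputs the union strategies favour class-specific terms, while the intersection keeps only terms that are informative for all classes, here the generic term "web".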
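
Slides 5 and 11 refer to PCA via eigenvalue decomposition applied to bags of clustered SURF features. As a rough illustration, the sketch below centres a matrix of visual-word histograms, decomposes its covariance matrix with NumPy, and projects onto the leading eigenvectors. The histogram data are random placeholders, and the function names and the number of retained components are assumptions, not values from the slides.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return (mean, components) from an eigendecomposition of the covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)        # features are columns
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]              # columns sorted by decreasing variance


def pca_transform(X, mean, components):
    """Project centred data onto the retained eigenvectors."""
    return (X - mean) @ components


rng = np.random.default_rng(0)
# Hypothetical visual-word histograms: 50 videos x 8 visual words.
X = rng.random((50, 8))
mean, components = pca_fit(X, n_components=3)
Z = pca_transform(X, mean, components)
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; the explicit re-sorting is needed since it returns eigenvalues in ascending order.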
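
Slide 3 lists "kNN (JSD)" as last year's approach for ASR transcripts. Assuming JSD stands for the Jensen-Shannon divergence between normalised term histograms, and assuming a simple majority vote among the k nearest training documents, the idea can be sketched as follows. The helper names and the toy histograms are hypothetical.

```python
import math
from collections import Counter


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b is the mixture, so b > 0 where a > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def normalise(hist):
    """Turn raw term counts into a distribution (assumes a non-empty histogram)."""
    total = sum(hist)
    return [h / total for h in hist]


def knn_predict(query, train, labels, k=3):
    """Majority vote among the k training histograms closest to query under JSD."""
    dists = sorted((jsd(query, x), y) for x, y in zip(train, labels))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy term-count histograms over a three-term vocabulary (hypothetical data).
train = [normalise(h) for h in ([3, 1, 0], [2, 2, 0], [0, 1, 3], [0, 2, 2])]
labels = ["religion", "religion", "politics", "politics"]
query = normalise([4, 1, 0])
prediction = knn_predict(query, train, labels, k=3)
```

With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (disjoint support), which makes distances comparable across documents of different length once the histograms are normalised.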
