1. Genre Tagging Task: Prediction using Bag-of-(visual)-Words Approaches
Schmiedeke, Kelm and Sikora
Communication Systems Group
Technische Universität Berlin
Tuesday, 9 October 2012
2. Motivation
Schmiedeke: “Prediction using Bag-of-(visual)-Words Approaches”
3. Experimental Setup
[Pipeline diagram]
Textual branch: ASR transcripts and metadata (title, description, comments, tags) are translated into English and mapped to a bag-of-words representation, then classified with a Support Vector Machine (multiclass, one-vs.-one; linear or RBF kernel) or with Naive Bayes without a-priori knowledge to predict genre labels.
Visual branch: grey SURF and colour SURF features are extracted from key frames, combined by temporal pooling, and classified by nearest-neighbour classification based on the Jensen-Shannon divergence.
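The visual branch classifies pooled SURF histograms with a nearest-neighbour rule under the Jensen-Shannon divergence. A minimal pure-Python sketch of that rule (the toy histograms and genre labels are illustrative, not from the paper):

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence (natural log); 0 * log 0 := 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two normalized histograms."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def nearest_neighbour(query, labelled):
    """1-NN genre prediction under JS divergence, as in the visual branch."""
    return min(labelled, key=lambda item: js_divergence(query, item[0]))[1]

# Toy labelled term vectors (codeword histograms) for two genres
train = [([0.7, 0.2, 0.1], "sports"),
         ([0.1, 0.1, 0.8], "cooking")]
print(nearest_neighbour([0.6, 0.3, 0.1], train))  # prints "sports"
```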
4. Extracting textual features
Vocabulary is built on video documents
• Stemming
• Stop word removal
• (Translation into English)
Term vectors are generated for each video document
• (calculation of tf-idf)
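The textual feature extraction can be sketched with a small stdlib-only tf-idf (the documents and stop-word list are toy examples; the system's stemming step is mentioned only as a comment):

```python
import math
from collections import Counter

STOP_WORDS = {"and", "the", "a", "of"}  # toy stop-word list

def tokenize(doc):
    """Lowercase, split, and drop stop words (stemming omitted here)."""
    return [w for w in doc.lower().split() if w not in STOP_WORDS]

def tfidf_vectors(docs):
    """Build a vocabulary over all documents and return one tf-idf
    mapping (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # document frequency of each term
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

docs = ["soccer match highlights and goals",
        "cooking pasta recipe tutorial",
        "soccer world cup goals"]
vecs = tfidf_vectors(docs)
# terms occurring in fewer documents get a higher idf weight
```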
5. Extracting visual features
SURF features are extracted from each key frame
• At keypoints and on a regular grid
Vocabulary is built using k-means on SURF features of the development set
• 2048 codewords
Term vector for a single video is obtained by bin-wise pooling of each key frame's term vector
• max, avg, median
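The bin-wise pooling step can be sketched as follows (toy 4-bin histograms stand in for the real 2048-codeword vectors):

```python
import statistics

def pool_histograms(frame_hists, method="avg"):
    """Bin-wise pooling of per-key-frame codeword histograms into a single
    video-level term vector (max / avg / median, as on the slide)."""
    pool = {"max": max,
            "avg": lambda col: sum(col) / len(col),
            "median": statistics.median}[method]
    # transpose: one column per codeword bin, one row per key frame
    return [pool(col) for col in zip(*frame_hists)]

# Toy histograms for 3 key frames over a 4-codeword vocabulary
hists = [[1, 0, 2, 0],
         [0, 1, 2, 1],
         [2, 1, 0, 0]]

print(pool_histograms(hists, "max"))  # prints [2, 1, 2, 1]
print(pool_histograms(hists, "avg"))
```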
6. Official runs
7. Additional runs (textual)
To answer some research questions (for this database):
• Is translation into English useful?
  (linear SVM, C=1, non-translated)
• What is the effect of classification methods?
  (non-translated metadata)
• Resources?
  (linear SVM, C=1, translated)
8. Additional runs (visual)
• Which pooling method works best?
  (linear SVM, C=1)
• Grey SURF vs. colour SURF
  (linear SVM, C=1, pooled by averaging)
• Local vs. global features
9. Backup
10. Backup
11. Additional runs (fusion)
• Direct fusion?
  (linear SVM, C=1)
12. MediaEval 2011: Genre Tagging
Question: What is a video's blip.tv category?
Blip.tv database (cc): ~350 h
• 247 training videos
• 1727 test videos
Official evaluation measure is Mean Average Precision (MAP)
Workshop will be held 1-2 September 2011 in Pisa, Italy (satellite of Interspeech)
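The MAP measure used for the official evaluation can be computed as sketched below (the video ids and ground-truth sets are toy examples):

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a relevant set."""
    hits, score = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / i  # precision at each relevant rank
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over (ranked list, relevant set) pairs, one per query/genre."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example: two genres, ranked video ids vs. ground-truth sets
runs = [(["v1", "v3", "v2"], {"v1", "v2"}),
        (["v4", "v5"], {"v5"})]
print(mean_average_precision(runs))  # (5/6 + 1/2) / 2
```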