TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization with one-vs-all classifiers & MediaEval 2012 Tagging Task: Prediction based on One Best List and Confusion Networks
This document summarizes two projects presented by Delft University of Technology (TUD) at the MediaEval 2012 Tagging Task. The first used one-vs-all classifiers and feature fusion to perform multi-modality video categorization. The second compared models for predicting tags from automatic speech recognition output: support vector machines, dynamic Bayesian networks, and conditional random fields. The dynamic Bayesian network model achieved the best overall performance.
Similar to TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization with one-vs-all classifiers & MediaEval 2012 Tagging Task: Prediction based on One Best List and Confusion Networks
Towards Using Semantic Features for Near-Duplicate Video Detection (Wesley De Neve)
Paper presented at the ICME 2010 Workshop on Visual Content Identification and Search in Singapore.
This presentation covers:
- Definition of APM
- Comparison of APM approaches & vendors (scenario, agent and network-based)
- Challenges of Cloud & Virtualization for APM vendors
- Performance Vision's Virtual Appliance offering
For more information, please visit: http://www.securactive.net
This episode discusses general software diagnostics, its definition, artefacts, its past and present, pattern language, software diagnostics certifications and maturity levels.
MediaEval 2017 - Satellite Task: Visual and textual analysis of social media ... (multimediaeval)
Presenter: Konstantinos Avgerinakis, Centre for Research & Technology Hellas - Information Technologies Institute, Greece
Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_31.pdf
Video: https://youtu.be/IRUxoWsCP2c
Authors: Konstantinos Avgerinakis, Anastasia Moumtzidou, Stelios Andreadis, Emmanouil Michail, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris
Abstract: This paper presents the algorithms that the CERTH team deployed in order to tackle disaster recognition tasks, specifically Disaster Image Retrieval from Social Media (DIRSM) and Flood Detection in Satellite Images (FDSI). Visual and textual analysis, as well as late fusion of their similarity scores, were deployed on social media images, while color analysis in the RGB and near-infrared channels of satellite images was performed in order to discriminate flooded from non-flooded images. A Deep Convolutional Neural Network (DCNN), DBpedia Spotlight, and combMAX were used to tackle DIRSM, while Mahalanobis distance-based classification and morphological post-processing were applied to deal with FDSI.
A study of the characteristics of Behaviour Driven Development (Carlos Solís)
We present a set of main BDD characteristics identified through an analysis of relevant literature and current BDD toolkits.
http://ulir.ul.ie/bitstream/handle/10344/1256/Solis_2011_behaviour.pdf
Do Workflow-Based Systems Satisfy the Demands of the Agile Enterprise of the ... (Ilia Bider)
Presentation at ACM 2012 workshop http://acm2012.blogs.dsv.su.se attached to BPM 2012 conference in Tallinn http://bpm2012.ut.ee/
Abstract. Workflow-based systems dominate the theory and practice of Business Process Management (BPM), leaving little space to other directions, including Adaptive Case Management. While there are reasons for such dominance in today's enterprise environment, it is time the BPM community studied this dominance in the light of the requirements of the enterprises of the future. This paper analyzes whether workflow-based systems will be able to satisfy business needs in the future, based on the assumption that the essential property of the enterprise of the future is agility. The paper identifies properties
that a business process should possess in order to be suitable for employing a workflow-based system to support it. Then, it analyzes whether these properties are compatible with the needs of the enterprise of the future and shows why workflow-based systems may become obsolete in the future.
Presentation given by Guido Falkenberg, VP of Products at Software AG, at the Brasília User Group event, 2012.
1. TUD MediaEval 2012 Tagging Task
Reporter: Martha A. Larson
Multimedia Information Retrieval Lab
Delft University of Technology
05-10-2012
2. Outline
• TUD-MM: Multi-modality video categorization with one-vs-all classifiers
• Peng Xu, Yangyang Shi, Martha A. Larson
• MediaEval 2012 Tagging Task: Prediction based on One Best List and Confusion Networks
• Yangyang Shi, Martha A. Larson, Catholijn M. Jonker
TUD MediaEval 2012 Tagging Task
Visual similarity measures for semantic video retrieval
2
4. Introduction
• Features from different modalities
• Visual feature
• Visual Words based representation & Global video representation
• Text features
• ASR, Metadata
• Term-frequency, LDA
• Classification and Fusion
• One-vs-all linear SVMs
• Reciprocal Rank Fusion
• Post-processing procedure to assign one category label for each video
5. Visual representations
• Visual words based video representation
• SIFT features are extracted from each key-frame
• Visual vocabulary is built by hierarchical k-means clustering
• Video descriptor: the normalized term frequency of visual words over the entire video
• Global video representation
• Edit features
• Content features
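The visual-words pipeline above (SIFT descriptors clustered into a vocabulary, then a normalized term-frequency histogram per video) can be sketched as follows. The deck uses hierarchical k-means; a flat NumPy k-means stands in here for brevity, and the descriptor data, dimensions, and vocabulary size are all illustrative, not the task's actual settings.

```python
import numpy as np

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means (the deck uses hierarchical k-means;
    a flat variant is sketched here for simplicity)."""
    rng = np.random.default_rng(seed)
    centroids = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest centroid
        dists = np.linalg.norm(descriptors[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = descriptors[labels == j].mean(axis=0)
    return centroids

def bovw_histogram(video_descriptors, centroids):
    """Normalized term frequency of visual words over a whole video."""
    dists = np.linalg.norm(video_descriptors[:, None] - centroids[None, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()

# Synthetic stand-ins for SIFT descriptors pooled from key-frames
# (real SIFT descriptors are 128-dimensional).
rng = np.random.default_rng(1)
train_desc = rng.normal(size=(200, 8))
vocab = kmeans(train_desc, k=16)
hist = bovw_histogram(rng.normal(size=(50, 8)), vocab)
```

The histogram is L1-normalized so videos with different numbers of key-frames remain comparable.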
6. Classification and Fusion
• One-vs-all linear SVM
• C is determined by 5-fold cross-validation
• Reciprocal Rank Fusion (RRF)*
• K=60 is chosen to balance the importance of the lower-ranked items
• The weights w(r) are determined by the cross-validation errors from each modality
• Post-processing procedure
* G. V. Cormack, C. L. A. Clarke, and S. Buettcher. Reciprocal rank fusion outperforms Condorcet and individual rank learning methods. SIGIR '09, pages 758-759.
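Weighted Reciprocal Rank Fusion as used above can be sketched in a few lines. The item ids and uniform weights below are illustrative only; the deck derives the weights from per-modality cross-validation error.

```python
def rrf_fuse(rankings, weights=None, k=60):
    """Weighted Reciprocal Rank Fusion (Cormack et al., SIGIR '09).

    rankings: list of ranked lists of item ids, best first.
    weights:  one weight per ranking (uniform if omitted).
    k=60 damps the influence of items ranked far down each list.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for w, ranking in zip(weights, rankings):
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: fuse a visual-modality ranking with a text-modality ranking.
visual = ["v1", "v2", "v3"]
text   = ["v2", "v3", "v1"]
fused  = rrf_fuse([visual, text])
```

Because every list contributes 1/(k + rank), an item ranked moderately well by all modalities can outrank one ranked first by a single modality.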
7. Result analysis
• MAP of different runs
Run:   Run_1   Run_2   Run_3   Run_4   Run_5   *Run_6  *Run_7
MAP:   0.0061  0.3127  0.2279  0.3675  0.2157  0.0577  0.0047
• Run_1 to Run_5 are official runs
• Run_6 is the visual-only run without post-processing
• Run_7 is the visual-only run with global feature
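MAP here is the mean over genres of average precision. A small self-contained sketch of the metric (the item ids and relevance sets are made up):

```python
def average_precision(ranked, relevant):
    """AP for one genre: mean of precision@i taken at each relevant hit."""
    hits, precision_sum = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per genre."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hits at rank 1 (P=1.0) and rank 3 (P=2/3): AP = (1.0 + 2/3) / 2
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
```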
8. Performance of visual features
Chart: MAP of the random baseline, visual-words (VW), and global visual features (y-axis 0 to 0.025).
9. MediaEval 2012 Tagging Task: Prediction based on One Best List and Confusion Networks
Yangyang Shi, Martha A. Larson, Catholijn M. Jonker
05-10-2012
10. Models for One-best List and Confusion Networks
Diagram: three models applied to the ASR output: Dynamic Bayesian Networks, Support Vector Machines, and Conditional Random Fields.
11. One-best List SVM
Pipeline: cut-off-3 vocabulary, TF-IDF features, linear-kernel multi-class SVM (C=0.5).
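The feature-extraction half of this pipeline can be sketched as below. The exact TF-IDF variant is not specified in the deck, so a common tf * log(N/df) weighting is assumed, and the toy documents are illustrative; the resulting vectors would then feed a linear multi-class SVM with C = 0.5.

```python
import math
from collections import Counter

def build_vocab(docs, cutoff=3):
    """Keep only words occurring at least `cutoff` times in the
    collection (the deck's cut-off-3 vocabulary)."""
    counts = Counter(w for doc in docs for w in doc)
    return sorted(w for w, c in counts.items() if c >= cutoff)

def tfidf_vectors(docs, vocab):
    """TF-IDF with a plain tf * log(N/df) weighting (an assumption;
    the deck does not state which variant was used)."""
    n = len(docs)
    df = {w: sum(w in doc for doc in docs) for w in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[w] * math.log(n / df[w]) if df[w] else 0.0
                        for w in vocab])
    return vectors

# Toy ASR one-best lists; only "news" occurs 3+ times in the collection.
docs = [["news", "report", "news"], ["music", "show"],
        ["news", "show", "report"], ["news", "music"]]
vocab = build_vocab(docs)
vecs = tfidf_vectors(docs, vocab)
```

The frequency cutoff shrinks the vocabulary, which matters for noisy ASR transcripts full of one-off misrecognitions.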
12. One-best List DBN
Diagram: a DBN unrolled over three time slices, with one E, T, and W node per slice (E1-E3, T1-T3, W1-W3).
13. One-best List DBN
14. Results on Only ASR Run
Model               MAP
Run2-one-best SVM   0.23
Run2-one-best DBN   0.25
Run2-one-best CRF   0.10
Run2-CN-CRF         0.09
15. Average Precision on Each Genre
Chart: average precision per genre for DBN vs. SVM (y-axis 0 to 0.8).
16. Discussion and Future work
• Discussion
• Visual only methods can be improved in several ways
• Feature selection or dimensionality reduction methods can be applied.
• Genre-level video representation
• CRF failure
• A document is treated as an item rather than one word.
• The feature set is too large for training to converge.
• DBN outperforms SVM: the sequence-order information probably helps prediction
• Potentials
• Generate clear and useful labels
17. Thank you!