This paper provides an overview of the Social Event Detection (SED) system developed at LIMSI for the 2014 campaign. Our approach is based on a hierarchical agglomerative clustering that uses textual metadata, user-based knowledge and geographical information. These different sources of knowledge, either used separately or in cascade, reach good results for the full clustering subtask, with a normalized mutual information equal to 0.95 and F1 scores greater than 0.82 for our best run.
1. LIMSI @ MediaEval SED 2014
Camille Guinaudeau, Antoine Laurent, Hervé Bredin
2. Introduction
Social Event Detection task
— Mining social events in large collections of online multimedia
The LIMSI team only participates in the "full clustering" task.
Our system
— relies on a hierarchical clustering approach,
— is based only on the metadata associated with the images.
3. Development and test sets
Development dataset divided into 3 smaller datasets:
Dev A, Dev B and Dev C
→ lower computation time
Each subset has the same number of clusters and the same distribution of images per cluster
The number of images in each subset is similar to the number of images contained in the test set (110,541)
5. User-based clustering
The time reference of a randomly chosen picture is compared with the dates of all the other pictures in the cluster
→ pick the closest one
6. User-based clustering
If the time distance is less than α hours before or after the time reference,
then the two pictures belong to the same cluster
Time reference = the mean of the two time references
7. User-based clustering
If the time distance is greater than α hours before or after the time reference,
then the two pictures define two separate clusters
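The per-user procedure described on slides 5–7 can be sketched as follows. This is one possible reading, assuming pictures are processed in time order with timestamps in hours; the slides instead start from a randomly chosen picture and pick its closest neighbour, and the function name is hypothetical:

```python
from statistics import mean

def user_time_clustering(timestamps, alpha=20.0):
    """Greedy temporal clustering of one user's pictures (timestamps in hours).

    A picture joins the current cluster when its time distance to the
    cluster's time reference is at most `alpha` hours; the reference is
    then updated to the mean of the two references, as on the slides.
    Otherwise the picture starts a new cluster.
    """
    clusters = []    # list of lists of timestamps
    references = []  # one time reference per cluster
    for t in sorted(timestamps):
        if references and abs(t - references[-1]) <= alpha:
            clusters[-1].append(t)
            references[-1] = mean([references[-1], t])
        else:
            clusters.append([t])
            references.append(t)
    return clusters
```

With `alpha = 20`, pictures taken a few hours apart end up together while pictures taken days apart are split, which matches the homogeneity scores on the next slide being highest for small α.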
8. User-based clustering
Homogeneity scores for different values of α:

α      Dev A    Dev B    Dev C
1h     0.9874   0.9872   0.9874
10h    0.9813   0.9796   0.9798
20h    0.9785   0.9766   0.9770
24h    0.9777   0.9755   0.9757
30h    0.9763   0.9743   0.9749
100h   0.9678   0.9673   0.9665

Homogeneity equals one when each cluster contains only members of a single class
9. Hierarchical clustering approach
Starts with the set of clusters defined by the user-based clustering
Based on a single-linkage clustering method
Distance matrix maintained at each iteration
d[u, v] = distance between clusters u and v
Final clustering is obtained by forming flat clusters from the hierarchical clustering
A threshold θ is used so that observations in each cluster have no intergroup distance greater than θ
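As a minimal sketch, this step corresponds to SciPy's single-linkage `linkage` followed by `fcluster` with the `distance` criterion; the distance matrix below is illustrative, not taken from the task data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Symmetric matrix of distances d[u, v] between the user-based clusters
# (toy values; the real matrices come from textual or GPS metadata).
D = np.array([[0.0, 0.1, 0.9, 0.8],
              [0.1, 0.0, 0.85, 0.95],
              [0.9, 0.85, 0.0, 0.2],
              [0.8, 0.95, 0.2, 0.0]])

# Single-linkage dendrogram over the condensed distance matrix.
Z = linkage(squareform(D), method="single")

# Flat clusters: no intergroup cophenetic distance greater than θ = 0.5.
labels = fcluster(Z, t=0.5, criterion="distance")
```

Here clusters 0 and 1 merge at distance 0.1 and clusters 2 and 3 at 0.2, but the two groups only meet at 0.8 > θ, so two flat clusters remain.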
10. Distance matrices
Textual metadata distance matrix
Each cluster is represented by a vector composed of lemmas
weighted with a BM25 score
A cosine distance is computed between two vectors
→ distance between the two corresponding clusters
Vector creation
• Words are extracted from the textual metadata (title,
description and tags)
• Words are lemmatized and only nouns, adjectives and non-modal
verbs are kept
• Each lemma is associated with a score computed using the
BM25 weighting function
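The vector creation and cosine distance can be sketched as below. The `k1` and `b` values and the exact IDF variant are assumptions (the slides only name "BM25"); documents are assumed to be already lemmatized and filtered:

```python
import math
from collections import Counter

def bm25_vectors(docs, k1=1.2, b=0.75):
    """BM25-weighted term vectors for a list of lemmatized documents.

    k1 and b are common BM25 defaults (assumed, not given on the slides).
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = {}
        for t, f in tf.items():
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            vec[t] = idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        vectors.append(vec)
    return vectors

def cosine_distance(u, v):
    """1 - cosine similarity between two sparse term-weight vectors."""
    num = sum(u[t] * v[t] for t in set(u) & set(v))
    den = math.sqrt(sum(x * x for x in u.values())) * \
          math.sqrt(sum(x * x for x in v.values()))
    return 1.0 - (num / den if den else 0.0)

# Toy lemmatized metadata for three clusters.
docs = [["concert", "paris", "music"],
        ["concert", "music", "live"],
        ["football", "match"]]
vecs = bm25_vectors(docs)
```

Clusters sharing vocabulary ("concert", "music") end up closer than clusters with disjoint metadata.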
11. Distance matrices
Geographic distance matrix
For each pair of user-based clusters u and v that contain at
least one picture with GPS information:
A geographic distance is computed between u and v
→ the minimum distance between any picture from
cluster u and any picture from cluster v
→ if the associated dates of the two clusters are more
than 48h apart, the geographic distance is artificially
increased
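A sketch of this cluster-to-cluster geographic distance, assuming great-circle (haversine) distance between GPS points and an arbitrary large penalty value (the slides specify neither the distance formula nor the amount of the increase):

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))  # Earth radius ~6371 km

def cluster_geo_distance(coords_u, coords_v, hours_apart, penalty=1e6):
    """Minimum pairwise distance between two clusters' geotagged pictures.

    If the clusters' dates are more than 48h apart, the distance is
    artificially increased (the penalty value is an assumption).
    """
    d = min(haversine_km(p, q) for p in coords_u for q in coords_v)
    if hours_apart > 48:
        d += penalty
    return d
```

Two clusters shot in the same city days apart thus stay far apart in the geographic matrix, which keeps temporally distinct events from merging.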
12. Submitted runs
All runs are based on the preliminary user-based clustering
→ α = 20, 24 or 30 hours
Hierarchical clustering obtained using:
— textual metadata only
— geographical information only
— both sources of knowledge
→ the combination is done in cascade (hierarchical clustering
based on text is applied to the result of the geographical
clustering)
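The cascade can be sketched with two successive flat clusterings, assuming SciPy's single-linkage machinery and toy distance matrices (the real ones come from the geographic and textual metadata above):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def flat_clusters(D, theta):
    """Single-linkage flat clustering of a square distance matrix."""
    Z = linkage(squareform(D), method="single")
    return fcluster(Z, t=theta, criterion="distance")

# Stage 1: geographic clustering of four user-based clusters (toy distances).
D_geo = np.array([[0.0, 0.1, 0.9, 0.9],
                  [0.1, 0.0, 0.9, 0.9],
                  [0.9, 0.9, 0.0, 0.9],
                  [0.9, 0.9, 0.9, 0.0]])
geo_labels = flat_clusters(D_geo, theta=0.5)  # merges the first two clusters

# Stage 2: text clustering applied to the stage-1 geo clusters.
n = geo_labels.max()
D_text = np.full((n, n), 0.9)
np.fill_diagonal(D_text, 0.0)
# Suppose the geo clusters holding pictures 2 and 3 share vocabulary:
a, b = geo_labels[2] - 1, geo_labels[3] - 1
D_text[a, b] = D_text[b, a] = 0.1
text_labels = flat_clusters(D_text, theta=0.5)

# Compose: each original cluster inherits the final label of its geo cluster.
final = text_labels[geo_labels - 1]
```

The text stage can thus merge events that geography kept apart (e.g. no GPS overlap) while never splitting a geographic merge, which is the point of cascading rather than averaging the two matrices.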
13. Results

         Dev A   Dev B   Dev C   Test
α        20h     20h     20h     20h     24h     30h     24h     24h
Text     ✔       ✔       ✔       ✔       ✔       ✔       ✔
Geo      ✔       ✔       ✔       ✔       ✔       ✔               ✔
F1       0.7895  0.7869  0.7912  0.8214  0.8140  0.8115  0.7563  0.7387
NMI      0.9479  0.9472  0.9483  0.9554  0.9532  0.9526  0.9423  0.9359
Div F1   0.6880  0.7258  0.7224  0.8207  0.8132  0.8107  0.7557  0.7380
17. Conclusions and future work
Our system, based only on metadata, works well,
with an F1 score of 0.82
Results obtained on every dataset are homogeneous
We could improve the method by:
— using the pictures themselves in addition to the associated metadata
— using web queries (searching pictures on Google
Images…)