Sophia Antipolis, France, October 29-31, 2018
NewsREEL Multimedia at MediaEval 2018:
Dataset Analysis and Baselines
Andreas Lommatzsch, Benjamin Kille
➢ NewsREEL Multimedia runs as a pilot task in 2018
➢ Questions
• What are appropriate approaches?
• What level of precision can we expect?
• What are the specific challenges when
implementing a recommender?
• How can the challenge be improved?
Motivation - Objectives
Analyze Different Baseline Strategies – Report Initial Results
NewsREEL
Multimedia
• Baseline-Algorithms
• Evaluation Results
• Discussions
• Conclusion and Outlook
Outline
Structure of the Presentation
Baseline-Algorithms
➢ Random Recommender
• Assign a random number of impressions
➢ Use a k-Nearest Neighbor-based Approach
• based on Text Terms
• based on Image Labels
➢ Compute Features indicating that the
item is relevant
• based on Text Terms
• based on Image Labels
Algorithms
3 Baseline Approaches
➢ Approach
• Assign a random number of impressions for each item
➢ Evaluation-Scores
• Precision@10: 0.0
• Precision@10%: 0.1
• AveragePrecision@10%: 0.1
➢ Remarks
• Scores significantly lower than those of the
random recommender indicate that the
suggestions should be sorted in
reverse order
Baseline Approaches
Random Recommender
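A minimal sketch of the random baseline, written in Python for brevity. The function names, the impression range (`max_impressions`), and the seeding are assumptions made for illustration and are not taken from the slides. It also shows the reversal mentioned in the remarks: a ranking that scores clearly below random can simply be flipped.

```python
import random


def random_recommender(item_ids, max_impressions=1000, seed=None):
    """Assign each item a random predicted impression count and rank by it.

    max_impressions is a hypothetical upper bound, not a dataset value.
    """
    rng = random.Random(seed)
    predictions = {item: rng.randint(0, max_impressions) for item in item_ids}
    # Rank items from highest to lowest predicted number of impressions.
    ranking = sorted(predictions, key=predictions.get, reverse=True)
    return predictions, ranking


def reverse_ranking(ranking):
    """Flip a ranking, e.g. when a model scores clearly below random."""
    return list(reversed(ranking))
```

Passing a fixed `seed` keeps the baseline reproducible across runs.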
➢ Approach
• Use a k-Nearest Neighbor approach
• Define an appropriate metric for
computing the similarity (neighbors)
• Similarity
o Metrics: cosine similarity / token overlap
o Features: text tokens, image labels
• Predict the number of impressions based on the average of the 10 nearest
neighbors
➢ Challenges
• Which terms should be considered?
• Similarity defined based on
image labels (weights)
Baseline Approaches
K-Nearest Neighbor Recommender
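The k-Nearest Neighbor baseline can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: "token overlap" is interpreted here as Jaccard overlap, features are plain token lists, and the names `knn_predict` and `training_items` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-token count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def token_overlap(tokens_a, tokens_b):
    """Jaccard overlap of the two token sets (one reading of 'token overlap')."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(query_tokens, training_items, k=10, similarity=cosine_similarity):
    """Predict impressions as the mean over the k most similar training items.

    training_items: list of (tokens, impressions) pairs.
    """
    ranked = sorted(training_items,
                    key=lambda item: similarity(query_tokens, item[0]),
                    reverse=True)
    neighbors = ranked[:k]
    if not neighbors:
        return 0.0
    return sum(impressions for _, impressions in neighbors) / len(neighbors)
```

The same function works for image labels by passing label lists instead of text tokens.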
➢ Approach
• Compute the impact of relevant features
• Features:
o Text tokens
o Image labels
➢ Challenges:
• Weighting model for the features
• Sparsity of the features
• Combination of the scores
computed for each feature
Baseline Approaches
Combine the impact of different features
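One possible reading of the feature-impact approach is sketched below. The slides do not specify the weighting model, so two assumptions are made here: the impact of a feature is the mean number of impressions of the training items containing it, and the text and image scores are combined linearly with a hypothetical `text_weight`.

```python
from collections import defaultdict


def feature_impact(training_items):
    """Mean impressions per feature.

    training_items: list of (features, impressions) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for features, impressions in training_items:
        for feature in set(features):  # count each feature once per item
            totals[feature] += impressions
            counts[feature] += 1
    return {feature: totals[feature] / counts[feature] for feature in totals}


def score_item(text_tokens, image_labels, text_impact, label_impact,
               text_weight=0.5):
    """Combine the text-based and label-based scores linearly."""
    def mean_impact(features, impact):
        # Unseen features are skipped, which is one way to handle sparsity.
        known = [impact[f] for f in features if f in impact]
        return sum(known) / len(known) if known else 0.0

    return (text_weight * mean_impact(text_tokens, text_impact)
            + (1 - text_weight) * mean_impact(image_labels, label_impact))
```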
Implementation
• Straightforward implementation of the prediction approaches
• Retraceable results
• Efficient to compute on a standard computer
Remarks
• Images without Labels (~1%)
• News portals with a very low number of impressions
Baseline Approaches
Discussion
Evaluation Results
Results
• In general, the similarity-based approaches perform better than the random
baseline
• Big differences between the domains: best results for domain 13554
• Best results are reached using text-features
• The image labeling configuration has an influence on the results
Evaluation Results
Results
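The scores named in this deck (Precision@10, Precision@10%, AveragePrecision@10%) can be computed along the following lines. The exact task definition is not given in the slides, so this sketch assumes that the relevant items are the top n items by actual impressions and that the predicted top n is compared against them. All function names are hypothetical.

```python
def top_n(scores, n):
    """Return the n item ids with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def precision_at_n(predicted, actual, n):
    """Share of the predicted top n that is also in the actual top n."""
    if n <= 0:
        return 0.0
    relevant = set(top_n(actual, n))
    hits = sum(1 for item in top_n(predicted, n) if item in relevant)
    return hits / n


def precision_at_fraction(predicted, actual, fraction=0.10):
    """Precision at the top `fraction` of all items (at least one item)."""
    n = max(1, int(len(actual) * fraction))
    return precision_at_n(predicted, actual, n)


def average_precision_at_n(predicted, actual, n):
    """Average of the precision values at each rank where a hit occurs."""
    relevant = set(top_n(actual, n))
    hits, total = 0, 0.0
    for rank, item in enumerate(top_n(predicted, n), start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0
```

Under this definition, a random ranking reaches a Precision@10% of about 0.1 in expectation, which is consistent with the value reported for the random recommender.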
Objective
Analyze Different Baseline Strategies – Report Observations
➢ Text vs. image-label-based features
• Text-based features seem to contain more information
• Common terms (stop words) must be excluded
• Observations
• Top terms in domain 13555: middle-class, unique, bug
• Top image labels in domain 13554: snake, roof, folding chair
➢ Remarks
• Domain-specific weighting models should be applied
• The correlation between text and images
should be investigated in detail
(example illustration / representative photo)
Discussion
Analyze Different Baseline Strategies – Findings
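Excluding stop words before ranking terms could look like the sketch below. The stop-word list is a tiny illustrative placeholder, not the list used for the reported results, and ranking by document frequency is an assumption; the slides do not say how the top terms were determined.

```python
from collections import Counter

# Illustrative placeholder; a real list would be language- and domain-specific.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "in", "to", "is"})


def tokenize(text, stop_words=STOP_WORDS):
    """Lowercase, strip surrounding punctuation, and drop stop words."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    return [t for t in tokens if t and t not in stop_words]


def top_terms(texts, n=3):
    """Return the n terms that occur in the most texts."""
    counts = Counter()
    for text in texts:
        counts.update(set(tokenize(text)))  # count each term once per text
    return [term for term, _ in counts.most_common(n)]
```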
• Frequently used images
• Big differences between the news portals (domains)
• Reasons
• Items have a longer lifecycle
• Items stay relevant for a longer period of time
• Conclusions
• A fuzzy fingerprinting method
should be added as a baseline
detecting duplicates
• Exclude frequently used images?
Discussion
Analyze Different Baseline Strategies – Findings
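The slides do not say which fingerprinting method is meant. As a stand-in, the sketch below uses an average hash, a simple perceptual hash, to flag near-duplicate images. It assumes each image has already been downscaled to a small grayscale grid (for example 8x8); `max_distance` is a hypothetical threshold.

```python
def average_hash(pixels):
    """Fingerprint a grayscale grid: 1 where a pixel is above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Number of positions at which two fingerprints differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two images as duplicates if their fingerprints nearly match."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

Because the hash depends only on each pixel's relation to the image mean, a uniformly brightened copy of an image yields the same fingerprint.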
Daily Police Report
Sport
➢ The labeling configuration makes a difference
➢ Image label categories do not match our setting
➢ Popular labels:
• suit, shirt
• animals
➢ Frequently occurring semantically incorrect labels
• Cables => classified as snakes
• Stage => blue flashing lights (emergencies)
• Barometer => clock, tachometer, loupe
➢ Very long, specific labels:
• “dragonfly, darning needle, devil's darning needle,
sewing needle, snake feeder, snake doctor, mosquito”
• “German shepherd, German shepherd dog,
German police dog, alsatian”
Discussion
Analyze Different Baseline Strategies – Findings
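One way to handle the very long, synonym-laden labels is to reduce each label to its first comma-separated name before using it as a feature. This is an assumption made for illustration, not a step described in the slides, and both function names are hypothetical.

```python
def primary_label(label):
    """Keep only the first comma-separated name of a label, lowercased."""
    return label.split(",")[0].strip().lower()


def normalize_labels(labels):
    """Reduce labels to their primary names, dropping repeats in order."""
    seen = []
    for label in labels:
        name = primary_label(label)
        if name and name not in seen:
            seen.append(name)
    return seen
```

This shortens the feature names but does not correct semantically wrong labels such as cables classified as snakes.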
• Temporal Changes in the Dataset
• Domain 13554 (motor-talk) is much easier than the other domains
• Observations
• Items have a longer lifecycle in domain 13554
• Items stay relevant for a longer period of time
• Items part of both training and item set
• Conclusions
• A fuzzy fingerprinting method
should be added as a baseline
• Adapt the bin size?
Discussion
Analyze Different Baseline Strategies – Findings
➢ The NewsREEL Multimedia challenge is well-defined
➢ Potential for optimization
• Larger dataset
• Improve the provided labels (labeling precision)
• Additional features (e.g., low-level visual features)
• Consider temporal aspects
• Special handling for images illustrating frequent categories / image fingerprinting
➢ Algorithms
• Improved weighting models for the features,
machine learning and data mining
• More sophisticated models
(SVM, low-rank approximation,
neural networks, random forest)
• Combining different features
• Use of low-level features
Baseline Approaches
Evaluation Results
• Andreas Lommatzsch
DAI-Lab, TU Berlin
andreas@dai-lab.de
• Benjamin Kille
DAI-Lab, TU Berlin
benjamin.kille@dai-labor.de
• Additional Information
• http://www.newsreelchallenge.org/
• The code is available on request
(Java/Maven)
Contact
Further Information

Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 

MediaEval 2018: Baseline Algorithms for Predicting the Interest in News

  • 1. Sophia Antipolis, France, October 29-31, 2018 NewsREEL Multimedia at MediaEval 2018: Dataset Analysis and Baselines Andreas Lommatzsch, Benjamin Kille
  • 2.  NewsREEL MultiMedia runs as a pilot task in 2018  Questions • What are appropriate approaches? • What level of precision can we expect? • What are the specific challenges when implementing a recommender? • How can the challenge be improved? Motivation - Objectives Analyze Different Baseline Strategies – Report Initial Results NewsREEL Multimedia
  • 3. • Baseline-Algorithms • Evaluation Results • Discussions • Conclusion and Outlook Outline Structure of the Presentation
  • 4. Baseline-Algorithms  Random Recommender • Assign a random number of impressions  Use a k-Nearest Neighbor-based Approach • based on Text Terms • based on Image Labels  Compute Features indicating that the item is relevant • based on Text Terms • based on Image Labels Algorithms 3 Baseline Approaches
  • 5.  Approach • Assign a random number of impressions for each item  Evaluation-Scores • Precision@10: 0.0 • Precision@10%: 0.1 • AveragePrecision@10%: 0.1  Remarks • Scores significantly lower than those of the random recommender indicate that the suggestions should be sorted in reverse order Baseline Approaches Random Recommender
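The slide reports Precision@10, Precision@10% and AveragePrecision@10% without defining them. The following is a minimal Python sketch of one plausible reading, not the authors' Java implementation: an item counts as relevant if it is among the top k items by actual impressions, and the recommender is judged on how many of its predicted top k fall into that set. The function names, the `max_impressions` bound and the normalization of average precision by k are all assumptions made for illustration.

```python
import random


def top_k(scores, k):
    """Return the ids of the k items with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def k_for_fraction(scores, fraction):
    """Translate a cut-off such as 10% into an absolute k (at least 1)."""
    return max(1, int(len(scores) * fraction))


def precision_at_k(predicted, actual, k):
    """Fraction of the predicted top-k items that are also in the actual top-k."""
    relevant = set(top_k(actual, k))
    hits = sum(1 for item in top_k(predicted, k) if item in relevant)
    return hits / k


def average_precision_at_k(predicted, actual, k):
    """Average of the precision values at each rank where a hit occurs.

    Assumption: normalized by k, the size of the relevant set.
    """
    relevant = set(top_k(actual, k))
    hits = 0
    total = 0.0
    for rank, item in enumerate(top_k(predicted, k), start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / k


def random_recommender(item_ids, max_impressions=1000, seed=0):
    """Random baseline: assign a random number of impressions to each item."""
    rng = random.Random(seed)
    return {item: rng.randint(0, max_impressions) for item in item_ids}


# Toy data (hypothetical): actual impressions per item id.
actual = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
predicted = {"a": 1, "b": 5, "c": 4, "d": 3, "e": 2}
inverted = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
```

Under this reading, a predictor whose ranking is the exact inverse of the true ranking scores 0 on a small cut-off, which is the situation the remark on the slide describes: reversing its output would turn it into a useful predictor.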
  • 6.  Approach • Use a k-Nearest Neighbor approach • Define an appropriate metric for computing the similarity (neighbors) • Similarity o metrics: cosine similarity / token overlap o Features: text tokens, image labels • Predict the number of impressions based on average of the 10 nearest neighbors  Challenges • Which terms should be considered • Similarity defined based on image labels (weights) Baseline Approaches K-Nearest Neighbor Recommender Ø Avg
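The k-nearest-neighbor baseline described on this slide can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: items are represented as bags of tokens (text tokens or image labels), similarity is plain cosine similarity on raw token counts without any term weighting, and the prediction is the unweighted mean of the neighbors' impressions. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-token Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_impressions(query_tokens, training_items, k=10):
    """Predict impressions as the mean over the k most similar training items.

    training_items is a list of (tokens, impressions) pairs.
    """
    query = Counter(query_tokens)
    scored = [
        (cosine_similarity(query, Counter(tokens)), impressions)
        for tokens, impressions in training_items
    ]
    # Most similar items first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = scored[:k]
    if not neighbors:
        return 0.0
    return sum(impressions for _, impressions in neighbors) / len(neighbors)


# Toy training set (hypothetical tokens and impression counts).
training = [
    (["car", "engine"], 100),
    (["car", "tire"], 80),
    (["police", "report"], 10),
]
prediction = predict_impressions(["car", "engine"], training, k=2)
```

The slide uses the 10 nearest neighbors; `k=2` is used here only because the toy training set has three items. The challenges listed on the slide (which terms to keep, how to weight image labels) correspond to what goes into the token bags before `cosine_similarity` is called.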
  • 7.  Approach • Compute the impact of relevant features • Features: o Text tokens o Image labels  Challenges: • Weighting model for the features • Sparsity of the features • Combination of the scores computed for each feature Baseline Approaches Combine the impact of different features
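The slide does not say how the impact of a feature is computed or how the per-feature scores are combined. One simple possibility, sketched below in Python purely as an assumption, is to take the mean number of impressions of all training items that contain a feature as its impact, to score an item by the mean impact of its known features, and to combine the text-based and image-based scores with a fixed weight. All names, the `text_weight` parameter and the toy data are hypothetical.

```python
from collections import defaultdict


def feature_impact(training_items):
    """Map each feature to the mean impressions of the items containing it.

    training_items is an iterable of (features, impressions) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for features, impressions in training_items:
        for feature in set(features):
            totals[feature] += impressions
            counts[feature] += 1
    return {feature: totals[feature] / counts[feature] for feature in totals}


def score_item(features, impact, default=0.0):
    """Mean impact of the item's known features; default if none is known."""
    known = [impact[f] for f in set(features) if f in impact]
    if not known:
        return default  # sparsity: no feature of this item was seen in training
    return sum(known) / len(known)


def combined_score(text_tokens, image_labels, text_impact, image_impact,
                   text_weight=0.5, default=0.0):
    """Weighted combination of the text-based and image-based scores."""
    text_score = score_item(text_tokens, text_impact, default)
    image_score = score_item(image_labels, image_impact, default)
    return text_weight * text_score + (1 - text_weight) * image_score


# Toy data (hypothetical tokens, labels and impression counts).
text_impact = feature_impact([(["car", "engine"], 100), (["car", "police"], 20)])
image_impact = feature_impact([(["snake"], 40)])
score = combined_score(["car", "engine"], ["snake"], text_impact, image_impact)
```

The three challenges named on the slide map onto this sketch directly: the weighting model is the choice made in `feature_impact`, sparsity is what the `default` value handles, and the combination of scores is the `text_weight` parameter.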
  • 8. Implementation • Straightforward implementation of the prediction approaches • Retraceable results • Efficient to compute on a standard computer Remarks • Images without Labels (~1%) • News portals with a very low number of impressions Baseline Approaches Discussion
  • 9. • Baseline-Algorithms • Evaluation Results • Discussions • Conclusion and Outlook Outline Structure of the Presentation
  • 11. • In general, the similarity-based approaches perform better than the random baseline • Big differences between the domains: best results for domain 13554 • Best results are reached using text features • The image labeling configuration has an influence on the results Evaluation Results Results
  • 12. • Baseline-Algorithms • Evaluation Results • Discussions • Conclusion and Outlook Objective Analyze Different Baseline Strategies – Report Observations
  • 13.  Text vs Image labels based features • Text-based features seem to contain more information • Common terms (stop words) must be excluded • Observations • Top terms in domain 13555: middle-class, unique, bug • Top image labels in domain 13554: snake, roof, folding chair  Remarks • Domain-specific weighting models should be applied • The correlation between text and images should be investigated in detail (example illustration / representative photo) Discussion Analyze Different Baseline Strategies – Findings
  • 14. • Frequently used images • Big differences between the news portals (domains) • Reasons • Items have a longer lifecycle • Items stay relevant for a longer period of time • Conclusions • A fuzzy fingerprinting method for detecting duplicates should be added as a baseline • Exclude frequently used images? Discussion Analyze Different Baseline Strategies – Findings Daily Police Report Sport
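The slide proposes a fuzzy fingerprinting baseline for detecting duplicates but does not specify a method. As one possible illustration, the Python sketch below uses an average hash, a simple perceptual hashing technique that is swapped in here as an assumption and is not taken from the slides. It assumes each image has already been reduced to a small grayscale grid of equal size (the resizing step is omitted), and the `max_distance` threshold is a hypothetical tuning parameter.

```python
def average_hash(pixels):
    """Fingerprint of a small grayscale grid (rows of values in 0..255).

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(a, b):
    """Number of positions at which two equally long fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=1):
    """Treat two images as duplicates if their fingerprints differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Toy 2x2 grids (hypothetical): b is a slightly altered copy of a, c is different.
image_a = [[10, 200], [220, 30]]
image_b = [[12, 198], [215, 35]]
image_c = [[200, 10], [30, 220]]
```

With such a fingerprint, a stock image that is reused across many items (for example for a recurring daily report) could be grouped and either handled separately or excluded, which is the question the slide leaves open.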
  • 15.  The labeling configuration makes a difference  Image Label Categories do not match our setting  Popular labels: • suit, shirt • animals  Frequently occurring semantically incorrect labels • Cables => classified as snakes • Stage => blue flashing lights (emergencies) • Barometer => clock, tachometer, loupe  Very long, specific labels: • “dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito” • “German shepherd, German shepherd dog, German police dog, alsatian” Discussion Analyze Different Baseline Strategies – Findings
  • 16. • Temporal Changes in the Dataset • Domain 13554 (motor-talk) is much easier than the other domains • Observations • Items have a longer lifecycle in domain 13554 • Items stay relevant for a longer period of time • Items are part of both training and item set • Conclusions • A fuzzy fingerprinting method should be added as a baseline • Adapt the bin size? Discussion Analyze Different Baseline Strategies – Findings
  • 17. • Baseline-Algorithms • Evaluation Results • Discussions • Conclusion and Outlook Outline Structure of the Presentation
  • 18.  The NewsREEL Multimedia challenge is well-defined  Potential for optimization • Larger dataset • Improve the provided labels (labeling precision) • Additional features (e.g., low-level visual features) • Consider temporal aspects • Special handling for images illustrating frequent categories / image fingerprinting  Algorithms • Improved weighting models for the features, machine learning and data mining • More sophisticated models (SVM, low-rank approximation, neural networks, random forest) • Combining different features • Use of low-level features Baseline Approaches Evaluation Results
  • 19. • Andreas Lommatzsch DAI-Lab, TU Berlin andreas@dai-lab.de • Benjamin Kille DAI-Lab, TU Berlin benjamin.kille@dai-labor.de • Additional Information • http://www.newsreelchallenge.org/ • The code is available on request (Java/Maven) Contact Further Information