Geotagging Social Media Content with a Refined Language Modelling Approach

Georgios Kordopatis-Zilos, Symeon Papadopoulos, and Yiannis Kompatsiaris
Centre for Research and Technology Hellas (CERTH) – Information Technologies Institute (ITI)
PAISI 2015, May 19, 2015, Ho Chi Minh City, Vietnam
Where is it?
#2
Depicted landmark
Eiffel Tower
Location
Paris, Tennessee
Keyword “Tennessee” is very important
to correctly place the photo.
Source (Wikipedia):
http://en.wikipedia.org/wiki/Eiffel_Tower_(Paris,_Tennessee)
The Problem
• A lot of multimedia content is associated with
geographic information
• Being able to collect and analyze large amounts of
geotagged content could be very useful for several
applications, e.g., situational awareness in incidents such
as natural disasters, verification, geographic trends, etc.
• Yet, only a very small percentage of Web media content
carries explicit location information (i.e. GPS coordinates);
for instance, only ~1% of tweets are geotagged
• To this end, methods that can infer the geographic
location of Web multimedia content are of interest.
#3
A Refined Language Model for Geotagging
• Extend and improve the widely used Language
Model for the problem of location estimation from
text metadata
• The proposed improvements include:
– Feature selection based on a cross-validation approach
– Feature weighting based on spatial entropy
– Multiple resolution grids
• Extensive evaluation on a public benchmark shows
highly competitive performance and reveals new
insights and challenges
#4
Related Work: Gazetteer-based
Methods that use large dictionaries and Volunteered
Geographic Information (VGI), e.g., Geonames, Yahoo!
GeoPlanet, OpenStreetMap, etc.
• Semantics-based IR approach for integrating
gazetteers and VGI (Kessler et al., 2009)
• Similarity matching mediating multiple gazetteers in
a meta-gazetteer service (Smart et al., 2010)
• Comma groups extracted with heuristic methods
from lists of toponyms (Lieberman et al., 2010)
#5
Related Work: Language Models
Language Models: use large corpora of geotagged text to
build location-specific language models, i.e. to capture which
keywords are most frequent for a given location
• Base approach (Serdyukov et al., 2009)
• Disjoint dynamically sized cells (Hauff et al., 2012)
• User frequency instead of term frequency (O’Hare &
Murdock, 2012)
• Clustering, use of χ² for feature selection and
similarity search (Van Laere et al., 2011)
#6
Related Work: Multimodal Approaches
Multimodal approaches do not only use text, but also
leverage the visual content and other social metadata
of geotagged multimedia.
• Combination of text metadata and visual content at
two levels of granularity, city- (100km) and
landmark-level (100m) (Crandall et al., 2009)
• Build user models leveraging user’s upload history,
SN data and hometown (Trevisiol et al., 2013)
• Hierarchical approach using both text-based and
visual similarity (Kelm et al., 2011)
#7
Related Work: MediaEval Placing Task
• Yearly benchmarking task where different approaches
compete
– Each participant can submit up to 5 runs with different
instances/configurations of their method
• Dataset for Placing Task 2014
– Flickr CC-licensed images & videos, subset of YFCC 100M
– Training: 5M, Testing: 510K (multiple subsets of increasing size
are used for reporting)
• Evaluation
– Estimated location of test image/video is compared against the
known one, and it is checked whether it belongs to a circle of
radius of 10m, 100m, 1km, 10km, 100km and 1000km
– Then, the percentage of images/videos that were correctly
placed within each radius is reported, e.g., P@1km
• Competing approaches: both gazetteer- and LM-based
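The P@r measure described above can be sketched as follows. This is a minimal illustration, not the benchmark's official scorer: the haversine formula, the Earth radius constant and the function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoots.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def precision_at_radius(estimated, truth, radius_km):
    """Fraction of items whose estimate lies within radius_km of the known location."""
    hits = sum(
        1 for est, true in zip(estimated, truth)
        if haversine_km(est[0], est[1], true[0], true[1]) <= radius_km
    )
    return hits / len(truth)
```

Calling `precision_at_radius(estimates, ground_truth, 1.0)` would give P@1km; the other radii (0.01, 0.1, 10, 100, 1000 km) follow by changing the last argument.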
#8
Overview of Approach
#9
Geographic Language Model (1/2)
• Training data: Corpus Dtr of images and videos
• Test data: Corpus Dts
• For each item (either in training or test data), we
have: user id, title, tags, description
• Title and tags of training images used for building the
model. For testing, description is used only if the
item has neither title nor tags associated with it.
• Pre-processing: punctuation and symbol removal,
lowercasing, numeric tags removed, composite
phrases (e.g. “new+york” → “new”, “york”) are split
into their components
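The pre-processing steps listed above could look roughly like this. The sketch assumes that composite phrases are joined by "+" or whitespace and that punctuation inside a token is simply dropped; the function name `preprocess_tags` is hypothetical and the released code may behave differently.

```python
import re


def preprocess_tags(raw_tags):
    """Lowercase, split composite phrases, strip symbols, drop numeric tags."""
    tokens = []
    for raw in raw_tags:
        # Split composite phrases such as "new+york" on "+" and whitespace.
        for part in re.split(r"[+\s]+", raw.lower()):
            # Remove punctuation and symbols, keeping letters and digits.
            cleaned = re.sub(r"[\W_]+", "", part)
            # Skip empty strings and purely numeric tags.
            if cleaned and not cleaned.isdigit():
                tokens.append(cleaned)
    return tokens
```

For example, `preprocess_tags(["New+York", "2014"])` yields the two tokens `new` and `york` and discards the numeric tag.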
#10
Geographic Language Model (2/2)
• Generate a rectangular grid C of areas (cells) of size 0.01° x
0.01° (~1km x 1km near the equator)
• For each cell and each tag in the training corpus,
compute the tag-cell probability:
$$p(t \mid c) = \frac{N_u}{N_t}$$
– Nu: number of users in Dtr that used tag t inside the borders of c
– Nt: total count of users that used tag t in any cell
• For a new text T with N tags, compute the Most Likely
Cell (MLC) cj based on the following:
$$c_j = \arg\max_{i} \sum_{k=1}^{N} p(t_k \mid c_i)$$
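A minimal sketch of the tag-cell probabilities and the Most Likely Cell computation is given below. It assumes that Nt is obtained by summing the per-cell user counts of a tag (so that the probabilities of a tag sum to 1 over cells) and that cells are indexed by flooring the coordinates; the names `cell_of`, `build_model` and `most_likely_cell` are hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.01  # degrees, as in the grid described above


def cell_of(lat, lon, size=CELL_SIZE):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / size), math.floor(lon / size))


def build_model(items):
    """items: iterable of (user_id, lat, lon, tags). Returns {tag: {cell: p(t|c)}}."""
    users = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, tags in items:
        cell = cell_of(lat, lon)
        for tag in set(tags):
            users[tag][cell].add(user)  # count each user once per tag and cell
    model = {}
    for tag, cells in users.items():
        # Assumption: N_t is the sum of the per-cell user counts of the tag.
        n_t = sum(len(u) for u in cells.values())
        model[tag] = {cell: len(u) / n_t for cell, u in cells.items()}
    return model


def most_likely_cell(model, tags):
    """Return the cell with the highest summed tag-cell probability, or None."""
    scores = defaultdict(float)
    for tag in tags:
        for cell, p in model.get(tag, {}).items():
            scores[cell] += p
    return max(scores, key=scores.get) if scores else None
```

Tags that never occur in the training data contribute nothing to any cell, so an item with only unseen tags gets no estimate (`None`) in this sketch.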
#11
Geographic Language Model: Example
#12
[Figure: example tag probabilities: new: 0.15, york: 0.27, manhattan: 0.45, liberty: 0.33, …, nyc: 0.52]
Feature Selection
• Retaining all tags of the training set might introduce
noise and overfitting and makes the resulting model very
memory-demanding → select only those tags that are really
discriminative
• Use a variant of Cross-Validation on the training set:
– split the training set into p folds (p = 10 in our tests)
– use p-1 folds to build the LM and the remaining fold to compute its
accuracy (P@r, where r is the radius we want to optimize)
– compute tag geographicity: $tgeo(t) = \frac{N_r}{N_t}$, where Nr is the number of
correctly classified items where tag t appears and Nt is the total
number of items where tag t appears
– select only tags that exceed threshold θtgeo and that have been used
by a minimum number of users θu
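The selection step can be sketched as below, assuming the cross-validation folds have already produced, for every held-out item, its tags and a flag saying whether it was placed within the radius r. The function names and the threshold values in the usage note are hypothetical.

```python
from collections import Counter


def tag_geographicity(evaluated_items):
    """evaluated_items: iterable of (tags, correct) pairs from the held-out folds.

    Returns {tag: N_r / N_t}, where N_r counts correctly placed items
    containing the tag and N_t counts all items containing the tag.
    """
    n_r, n_t = Counter(), Counter()
    for tags, correct in evaluated_items:
        for tag in set(tags):
            n_t[tag] += 1
            if correct:
                n_r[tag] += 1
    return {tag: n_r[tag] / n_t[tag] for tag in n_t}


def select_tags(tgeo, users_per_tag, theta_tgeo, theta_u):
    """Keep tags whose geographicity exceeds theta_tgeo and that
    were used by at least theta_u distinct users."""
    return {
        tag for tag, score in tgeo.items()
        if score > theta_tgeo and users_per_tag.get(tag, 0) >= theta_u
    }
```

With illustrative thresholds such as `theta_tgeo=0.4` and `theta_u=2`, a generic tag like "travel" that rarely leads to a correct placement would be dropped, while a tag used by a single user would be dropped regardless of its score.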
#13
Feature Weighting using Spatial Entropy
• We want to penalize tags that appear in many different places and to give more
importance to tags that appear only for specific places.
• To this end, we define a measure of the stochasticity of the tag’s appearance, i.e.
its spatial entropy:
$$e(t) = -\sum_{i=1}^{M} p(t \mid c_i) \log p(t \mid c_i)$$
where M is the total number of cells.
• Once the entropy values are computed, we apply Gaussian normalization to
suppress the weight of tags that take extreme values:
$$N(e(t), \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\left(\frac{e(t)-\mu}{2\sigma}\right)^{2}}$$
where N is the Gaussian function and μ, σ are the mean value and variance
of the distribution and are estimated on Dtr.
• Consequently, the MLC is computed based on:
$$c_j = \arg\max_{i} \sum_{k=1}^{N} p(t_k \mid c_i) \cdot N(e(t_k), \mu, \sigma)$$
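The entropy-based weighting can be illustrated with the sketch below. It follows the formulas as they appear on the slide (natural logarithm assumed, exponent taken as written), takes μ and σ as given inputs, and reuses a `{tag: {cell: p(t|c)}}` dictionary as a hypothetical model layout.

```python
import math


def spatial_entropy(cell_probs):
    """e(t) = -sum_i p(t|c_i) log p(t|c_i), skipping cells with zero probability."""
    return -sum(p * math.log(p) for p in cell_probs.values() if p > 0)


def gaussian_weight(e_t, mu, sigma):
    """N(e(t), mu, sigma) in the form shown on the slide."""
    return math.exp(-((e_t - mu) / (2 * sigma)) ** 2) / (sigma * math.sqrt(2 * math.pi))


def weighted_most_likely_cell(model, tags, mu, sigma):
    """Most Likely Cell when each tag-cell probability is scaled by the tag's weight."""
    scores = {}
    for tag in tags:
        probs = model.get(tag)
        if not probs:
            continue  # tag unseen in training data
        weight = gaussian_weight(spatial_entropy(probs), mu, sigma)
        for cell, p in probs.items():
            scores[cell] = scores.get(cell, 0.0) + p * weight
    return max(scores, key=scores.get) if scores else None
```

A tag concentrated in a single cell has entropy 0, while a tag spread evenly over many cells has high entropy; both ends receive a lower weight than tags whose entropy is close to μ.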
#14
Entropy Histogram & Gaussian Weighting
#15
Similarity Search
• Having assigned an item to the Most Likely Cell (MLC), we refine the
location estimate by searching for the most similar item in the cell.
• The k most similar items are retrieved (from Dtr) using Jaccard similarity
on the respective sets of tags:
$$J(x, y) = \frac{|T_x \cap T_y|}{|T_x \cup T_y|}$$
• The final estimation is the centre of gravity of the k most similar images:
$$loc(x) = \frac{1}{k} \sum_{i=1}^{k} J(x, y_i)^{\alpha} \, loc(y_i)$$
where α determines how strongly the result is influenced by the
most similar items.
• If fewer than k images are retrieved, then only those are used in the above
calculation. If no similar images are retrieved, then the centre of MLC is
provided as output.
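The refinement step can be sketched as follows. Two points are assumptions of this example rather than statements about the authors' implementation: the weighted centre divides by the sum of the weights (instead of by k, as in the formula above) so that the result stays inside the span of the retrieved locations, and latitude/longitude are averaged directly, which is only reasonable within a small cell. The defaults k=4 and α=1 are the contest settings reported later in the slides.

```python
def jaccard(tags_x, tags_y):
    """Jaccard similarity between two tag collections."""
    a, b = set(tags_x), set(tags_y)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def refine_location(query_tags, cell_items, cell_centre, k=4, alpha=1.0):
    """cell_items: list of (tags, (lat, lon)) training items inside the MLC."""
    scored = [(jaccard(query_tags, tags), loc) for tags, loc in cell_items]
    # Keep only items that share at least one tag with the query.
    scored = [(s, loc) for s, loc in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]  # fewer than k items are used as they are
    if not top:
        return cell_centre  # fall back to the centre of the MLC
    weights = [s ** alpha for s, _ in top]
    total = sum(weights)
    lat = sum(w * loc[0] for w, (_, loc) in zip(weights, top)) / total
    lon = sum(w * loc[1] for w, (_, loc) in zip(weights, top)) / total
    return (lat, lon)
```

Larger values of `alpha` shrink the weights of weakly matching items faster, so the estimate moves closer to the location of the best match.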
#16
Multiple Resolution Grids
• To increase the granularity of the prediction and at
the same time its reliability, we devised the following
dual grid scheme:
– we build two LMs: one of size 0.01° x 0.01° (coarse
granularity) and one of size 0.001° x 0.001° (fine
granularity)
– conduct location estimation with both
– if the fine-granularity estimate falls within the cell of
the coarse-granularity estimate, then we
select the fine-granularity estimate
– otherwise, we select the coarse (since we consider it by
default more reliable)
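The dual-grid decision rule can be written compactly, assuming both language models have already produced an estimate as a (lat, lon) pair. The helper names are hypothetical, and cell membership is checked by flooring coordinates on the coarse grid.

```python
import math


def cell_of(lat, lon, size):
    """Integer index of the grid cell of the given size containing the point."""
    return (math.floor(lat / size), math.floor(lon / size))


def combine_estimates(coarse_estimate, fine_estimate, coarse_size=0.01):
    """Prefer the fine estimate only when it agrees with the coarse cell."""
    if fine_estimate is not None and (
        cell_of(*fine_estimate, coarse_size) == cell_of(*coarse_estimate, coarse_size)
    ):
        return fine_estimate
    # Otherwise fall back to the coarse estimate, considered more reliable.
    return coarse_estimate
```

In this sketch a missing fine estimate (`None`) is treated the same way as a disagreement, which is an added convenience rather than something stated on the slide.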
#17
Evaluation
• Benchmark dataset: MediaEval 2014
• Training set: 5M, Test set: 510K
• All experiments conducted on the full test set (510K)
• Two stages of evaluation:
– participation in contest (with a limited version of the
proposed approach)
– post-contest performance exploration
#18
Evaluation: MediaEval 2014 Contest (1/4)
• Out of the five runs, three were based on variations
of the presented approach
– run1: LM + feature weighting with spatial entropy +
similarity search + multiple resolution grid
– run4: LM only
– run5: LM + similarity search
(similarity search parameters: α=1, k=4)
• Performance was measured with P@r, where r =
10m, 100m, 1km, 10km, 100km and 1000km
#19
Evaluation: MediaEval 2014 Contest (2/4)
#20
• Proposed improvements (run1) outperform base
approach (run4) and base approach + similarity
search (run5)
• The improvement is more pronounced in the small
ranges (10m, 100m, 1km)
Evaluation: MediaEval 2014 Contest (3/4)
#21
Evaluation: MediaEval 2014 Contest (4/4)
#22
Number of image tags
Post-Contest Evaluation
Explore the role of different factors:
• Big training set (YFCC100M): ~48M geotagged items
• Feature Selection (FS)
• Feature Weighting with Spatial Entropy (SE)
• Multiple Resolution Grid (MG)
• Similarity Search (SS)
Two settings:
• FAIR: All users from the training set are completely
removed from the test set
• OVERFIT: Users are not removed from the test set even
when some of their media items are included in the
training set.
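The difference between the two settings comes down to one filtering step, sketched below. It assumes every item is a tuple whose first field is the user id; the function name is hypothetical.

```python
def fair_test_set(train_items, test_items):
    """FAIR setting: drop every test item whose user also appears in the training set."""
    train_users = {item[0] for item in train_items}
    return [item for item in test_items if item[0] not in train_users]
```

In the OVERFIT setting this filter is simply skipped, so test items from users seen during training remain in the evaluation.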
#23
Post-Contest Evaluation
#24
• Clear improvement with the addition of MG and SS
• The proposed improvements together with the use
of the bigger dataset make the approach perform
better than all other methods in MediaEval 2014
Geographic Error Analysis
• More data leads to
lower error across
the globe.
• Several small US
cities suffer from low
accuracy due to
having names of
large European
cities.
#25
ALL + YFCC100M
run 4
Big Data vs. Complex Algorithms
#26
Using 10x more training data led to performance
equivalent to that of a more complex algorithm
(LM + extensions) trained on less data!
Placeability of Media Items
• Sum of cell-tag probabilities is a good indicator of
how confident we are in the decision of the classifier.
#27
Complicated Cases
#28
Statue of Liberty
Security Incident Heatmaps
#29
earthquake
riot
Conclusion
• Key contributions
– Improved geotagging approach, extending the widely used
language model in three ways: feature selection, weighting,
multiple resolution grids
– Thorough analysis of geotagging accuracy offering new insights
and highlighting new challenges
• Future Work
– Exploit visual features to improve accuracy (currently, visual-only
approaches perform very poorly)
– Integrate gazetteer and structured location data sources (e.g.
Foursquare venues, OpenStreetMap, etc.)
– Evaluate in more challenging settings and datasets (e.g. Twitter,
Instagram)
#30
References (1/2)
• C. Kessler, K. Janowicz, and M. Bishr. An Agenda for the Next Generation
Gazetteer: Geographic Information Contribution and Retrieval. In Proceedings of
the 17th ACM SIGSPATIAL International Conference on Advances in Geographic
Information Systems, pages 91-100. ACM, 2009
• P. Smart, C. Jones, and F. Twaroch. Multi-source Toponym Data Integration and
Mediation for a Meta-gazetteer Service. In Proceedings of the 6th International
Conference on Geographic Information Science, GIScience '10, pages 234-248.
Springer-Verlag, Berlin, Heidelberg, 2010
• M.D. Lieberman, H. Samet, and J. Sankaranarayanan. Geotagging: using Proximity,
Sibling, and Prominence Clues to Understand Comma Groups. In Proceedings of
the 6th Workshop on Geographic Information Retrieval, 2010
• P. Serdyukov, V. Murdock, and R. Van Zwol. Placing Flickr Photos on a Map. In
SIGIR '09, pages 484-491, New York, NY, USA, 2009. ACM
• C. Hauff and G. Houben. Geo-location Estimation of Flickr Images: Social Web
based Enrichment. ECIR 2012, pages 85-96. Springer LNCS 7224, April 1-5 2012
• N. O'Hare and V. Murdock. Modeling Locations with Social Media. Information
Retrieval, pages 1-33, 2012
• O. Van Laere, S. Schockaert, and B. Dhoedt. Finding Locations of Flickr Resources
using Language Models and Similarity Search. ICMR '11, New York, USA, 2011. ACM
#31
References (2/2)
• D.J. Crandall, L. Backstrom, D. Huttenlocher, and J.
Kleinberg. Mapping the World's Photos. In Proceedings
of the 18th International Conference on World Wide Web,
WWW '09, pages 761-770, New York, NY, USA, 2009. ACM
• M. Trevisiol, H. Jegou, J. Delhumeau, and S. Gravier.
Retrieving Geo-Location of Videos with a Divide and
Conquer Hierarchical Multimodal Approach. ICMR '13,
Dallas, United States, April 2013. ACM
• P. Kelm, S. Schmiedeke, and T. Sikora. A Hierarchical,
Multi-modal Approach for Placing Videos on the Map
using Millions of Flickr Photographs. In Proceedings of
the 2011 ACM Workshop on Social and Behavioural
Networked Media Access, SBNMA '11, pages 15-20, New
York, NY, USA, 2011. ACM
#32
Thank you!
• Resources:
Slides: http://www.slideshare.net/sympapadopoulos/reveal-geotagging
Code: https://github.com/socialsensor/multimedia-geotagging
Benchmark:
http://www.multimediaeval.org/mediaeval2014/placing2014/
• Get in touch:
@sympapadopoulos / papadop@iti.gr
George Kordopatis / georgekordopatis@iti.gr
#33
Data-centric AI and the convergence of data and model engineering: opportunit...
Paolo Missier29 views
Understanding GenAI/LLM and What is Google Offering - Felix Goh by NUS-ISS
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
NUS-ISS39 views
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma17 views
The details of description: Techniques, tips, and tangents on alternative tex... by BookNet Canada
The details of description: Techniques, tips, and tangents on alternative tex...The details of description: Techniques, tips, and tangents on alternative tex...
The details of description: Techniques, tips, and tangents on alternative tex...
BookNet Canada110 views
AI: mind, matter, meaning, metaphors, being, becoming, life values by Twain Liu 刘秋艳
AI: mind, matter, meaning, metaphors, being, becoming, life valuesAI: mind, matter, meaning, metaphors, being, becoming, life values
AI: mind, matter, meaning, metaphors, being, becoming, life values
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor... by Vadym Kazulkin
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...
Vadym Kazulkin70 views
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman25 views
AMAZON PRODUCT RESEARCH.pdf by JerikkLaureta
AMAZON PRODUCT RESEARCH.pdfAMAZON PRODUCT RESEARCH.pdf
AMAZON PRODUCT RESEARCH.pdf
JerikkLaureta14 views
Architecting CX Measurement Frameworks and Ensuring CX Metrics are fit for Pu... by NUS-ISS
Architecting CX Measurement Frameworks and Ensuring CX Metrics are fit for Pu...Architecting CX Measurement Frameworks and Ensuring CX Metrics are fit for Pu...
Architecting CX Measurement Frameworks and Ensuring CX Metrics are fit for Pu...
NUS-ISS32 views
Empathic Computing: Delivering the Potential of the Metaverse by Mark Billinghurst
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the Metaverse
Mark Billinghurst449 views
Business Analyst Series 2023 - Week 3 Session 5 by DianaGray10
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5
DianaGray10165 views
.conf Go 2023 - Data analysis as a routine by Splunk
.conf Go 2023 - Data analysis as a routine.conf Go 2023 - Data analysis as a routine
.conf Go 2023 - Data analysis as a routine
Splunk90 views
TouchLog: Finger Micro Gesture Recognition Using Photo-Reflective Sensors by sugiuralab
TouchLog: Finger Micro Gesture Recognition  Using Photo-Reflective SensorsTouchLog: Finger Micro Gesture Recognition  Using Photo-Reflective Sensors
TouchLog: Finger Micro Gesture Recognition Using Photo-Reflective Sensors
sugiuralab11 views

Geotagging Social Media Content with a Refined Language Modelling Approach

  • 1. Geotagging Social Media Content with a Refined Language Modelling Approach Georgios Kordopatis-Zilos, Symeon Papadopoulos, and Yiannis Kompatsiaris Centre for Research and Technology Hellas (CERTH) – Information Technologies Institute (ITI) PAISI 2015, May 19, 2015, Ho Chi Minh City, Vietnam
  • 2. Where is it?
      Depicted landmark: Eiffel Tower. Location: Paris, Tennessee.
      The keyword “Tennessee” is very important for placing the photo correctly.
      Source (Wikipedia): http://en.wikipedia.org/wiki/Eiffel_Tower_(Paris,_Tennessee)
  • 3. The Problem
      • A lot of multimedia content is associated with geographic information.
      • Collecting and analyzing large amounts of geotagged content could be very useful for several applications, e.g., situational awareness in incidents such as natural disasters, verification, and geographic trends.
      • Yet only a very small percentage of Web media content carries explicit location information (i.e. GPS coordinates); for instance, ~1% of tweets are geotagged.
      • Methods that can infer the geographic location of Web multimedia content are therefore of interest.
  • 4. A Refined Language Model for Geotagging
      • Extend and improve the widely used Language Model for the problem of location estimation from text metadata
      • The proposed improvements include:
          – Feature selection based on a cross-validation approach
          – Feature weighting based on spatial entropy
          – Multiple resolution grids
      • Extensive evaluation on a public benchmark shows highly competitive performance and reveals new insights and challenges
  • 5. Related Work: Gazetteer-based
      Methods that use large dictionaries and Volunteered Geographic Information (VGI), e.g., Geonames, Yahoo! GeoPlanet, OpenStreetMap, etc.
      • Semantics-based IR approach for integrating gazetteers and VGI (Kessler et al., 2009)
      • Similarity matching mediating multiple gazetteers in a meta-gazetteer service (Smart et al., 2010)
      • Comma groups extracted with heuristic methods from lists of toponyms (Lieberman et al., 2010)
  • 6. Related Work: Language Models
      Language Models use large corpora of geotagged text to create location-specific language models, i.e. what are the most frequent keywords for a given location.
      • Base approach (Serdyukov et al., 2009)
      • Disjoint dynamically sized cells (Hauff et al., 2012)
      • User frequency instead of term frequency (O’Hare & Murdock, 2012)
      • Clustering, use of χ² for feature selection, and similarity search (Van Laere et al., 2011)
  • 7. Related Work: Multimodal Approaches
      Multimodal approaches do not only use text, but also leverage the visual content and other social metadata of geotagged multimedia.
      • Combination of text metadata and visual content at two levels of granularity, city-level (100km) and landmark-level (100m) (Crandall et al., 2009)
      • Build user models leveraging the user’s upload history, SN data and hometown (Trevisiol et al., 2013)
      • Hierarchical approach using both text-based and visual similarity (Kelm et al., 2011)
  • 8. Related Work: MediaEval Placing Task
      • Yearly benchmarking task where different approaches compete
          – Each participant can submit up to 5 runs with different instances/configurations of their method
      • Dataset for Placing Task 2014
          – Flickr CC-licensed images & videos, subset of YFCC 100M
          – Training: 5M, Testing: 510K (multiple subsets of increasing size are used for reporting)
      • Evaluation
          – The estimated location of a test image/video is compared against the known one, checking whether it falls within a circle of radius 10m, 100m, 1km, 10km, 100km and 1000km around it
          – The percentage of images/videos correctly placed within each radius is then reported, e.g., P@1km
      • Competing approaches: both gazetteer- and LM-based
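The P@r measure described on slide 8 can be sketched as follows. This is a minimal illustration, not the benchmark's official scorer: the use of the haversine great-circle distance, the Earth radius constant, and the function names `haversine_km` and `precision_at` are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def precision_at(estimates, truths, radii_km=(0.01, 0.1, 1, 10, 100, 1000)):
    """Fraction of items whose estimate lies within each radius of the truth."""
    errors = [haversine_km(e, t) for e, t in zip(estimates, truths)]
    return {r: sum(err <= r for err in errors) / len(errors) for r in radii_km}
```

Here the radii are given in kilometres, so 0.01 corresponds to the 10m range and 1000 to the 1000km range.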
  • 10. Geographic Language Model (1/2)
      • Training data: corpus $D_{tr}$ of images and videos
      • Test data: corpus $D_{ts}$
      • For each item (either in training or test data), we have: user id, title, tags, description
      • Title and tags of training images are used for building the model. For testing, the description is used only if the item has neither title nor tags associated with it.
      • Pre-processing: punctuation and symbol removal, lowercasing, numeric tags removed, composite phrases are split into their components (e.g. “new+york” → “new”, “york”)
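The pre-processing steps listed on slide 10 can be sketched as below. The slide does not specify the exact rules, so the separator set (`+` and whitespace), the order of the steps, and the function name `preprocess_tags` are assumptions for illustration.

```python
import re


def preprocess_tags(raw_tags):
    """Normalise raw tags: lowercase, split phrases, strip symbols, drop numbers."""
    tokens = []
    for tag in raw_tags:
        # Split composite phrases such as "new+york" on '+' and whitespace.
        for part in re.split(r"[+\s]+", tag.lower()):
            # Remove punctuation and symbols, keeping letters and digits only.
            part = re.sub(r"[^\w]|_", "", part)
            # Skip empty strings and purely numeric tags.
            if part and not part.isdigit():
                tokens.append(part)
    return tokens
```

For instance, `preprocess_tags(["New+York", "2014"])` yields the two tokens `new` and `york`, with the numeric tag dropped.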
  • 11. Geographic Language Model (2/2)
      • Generate a rectangular grid $C$ of areas (cells) of size 0.01° × 0.01° (~1km × 1km near the equator)
      • For each cell and each tag in the training corpus, compute the tag-cell probability: $p(t|c) = \frac{N_u}{N_t}$
          – $N_u$: number of users in $D_{tr}$ that used tag $t$ inside the borders of $c$
          – $N_t$: total count of users that used tag $t$ in any cell
      • For a new text $T$ with $N$ tags, compute the Most Likely Cell (MLC) $c_j$ as: $c_j = \arg\max_i \sum_{k=1}^{N} p(t_k|c_i)$
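The tag-cell probability and the MLC rule from slide 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: the names `cell_of`, `train_language_model` and `most_likely_cell` are hypothetical, cells are indexed by flooring coordinates, and $N_t$ is taken as the sum of per-cell user counts so that the probabilities of a tag sum to 1.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, size=0.01):
    """Map coordinates (degrees) to the integer index of a size x size cell."""
    return (math.floor(lat / size), math.floor(lon / size))


def train_language_model(items, size=0.01):
    """items: iterable of (user_id, lat, lon, tags). Returns {tag: {cell: p(t|c)}}."""
    users = defaultdict(lambda: defaultdict(set))
    for user_id, lat, lon, tags in items:
        cell = cell_of(lat, lon, size)
        for tag in set(tags):
            users[tag][cell].add(user_id)  # count each user once per tag and cell
    model = {}
    for tag, cells in users.items():
        # Assumption: N_t is the sum over cells of the per-cell user counts.
        n_t = sum(len(u) for u in cells.values())
        model[tag] = {cell: len(u) / n_t for cell, u in cells.items()}
    return model


def most_likely_cell(model, tags):
    """Return the cell maximising the sum of p(t_k|c_i), or None if no tag is known."""
    scores = defaultdict(float)
    for tag in tags:
        for cell, p in model.get(tag, {}).items():
            scores[cell] += p
    return max(scores, key=scores.get) if scores else None
```

With two users tagging "eiffel" in Paris, France and one user tagging "eiffel" and "tennessee" in Paris, Tennessee, the tag "eiffel" alone points to the French cell, while adding "tennessee" moves the MLC to the Tennessee cell, which mirrors the example on slide 2.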
  • 12. Geographic Language Model: Example
      [Figure: example tag probabilities] new: 0.15, york: 0.27, manhattan: 0.45, liberty: 0.33, …, nyc: 0.52
  • 13. Feature Selection
      • Retaining all possible tags of the training set might result in noise and overfitting, and makes the resulting model very memory-demanding → select only those tags that are really discriminative
      • Use a variant of Cross-Validation on the training set:
          – split the training set into p folds (p = 10 in our tests)
          – use p−1 folds for creating the LM and the remaining one for computing its accuracy (P@r, where r is the radius we want to optimize for)
          – compute tag geographicity: $tgeo(t) = \frac{N_r}{N_t}$, where $N_r$ is the number of correctly classified items in which tag $t$ appears and $N_t$ is the total number of items in which tag $t$ appears
          – select only tags that exceed the threshold $\theta_{tgeo}$ and that have been used by a minimum number of users $\theta_u$
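The tag geographicity score and the two-threshold selection from slide 13 can be sketched as below. The sketch assumes the cross-validation step has already produced, for each held-out item, its tags and a flag saying whether it was placed within radius r. The function names and the threshold values used in the usage note are hypothetical.

```python
from collections import Counter


def tag_geographicity(validated_items):
    """validated_items: iterable of (tags, correct). Returns {tag: N_r / N_t}."""
    n_t = Counter()  # items in which the tag appears
    n_r = Counter()  # correctly placed items in which the tag appears
    for tags, correct in validated_items:
        for tag in set(tags):
            n_t[tag] += 1
            if correct:
                n_r[tag] += 1
    return {tag: n_r[tag] / n_t[tag] for tag in n_t}


def select_tags(tgeo, users_per_tag, theta_tgeo, theta_u):
    """Keep tags whose score exceeds theta_tgeo and that at least theta_u users used."""
    return {tag for tag, score in tgeo.items()
            if score > theta_tgeo and users_per_tag.get(tag, 0) >= theta_u}
```

A call such as `select_tags(tgeo, users_per_tag, theta_tgeo=0.5, theta_u=2)` would then drop generic tags like "photo" that rarely lead to a correct placement, as well as tags used by a single user.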
  • 14. Feature Weighting using Spatial Entropy
      • We want to penalize tags that appear in many different places and to give more importance to tags that appear only in specific places.
      • To this end, we define a measure of the stochasticity of the tag’s appearance, i.e. its spatial entropy: $e(t) = -\sum_{i=1}^{M} p(t|c_i) \log p(t|c_i)$, where $M$ is the total number of cells.
      • Once the entropy values are computed, we apply Gaussian normalization to suppress the weight of tags that take extreme values: $N(e(t), \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\left(\frac{e(t)-\mu}{2\sigma}\right)^2}$, where $N$ is the Gaussian function and $\mu$, $\sigma$ are the mean value and variance of the distribution, estimated on $D_{tr}$.
      • Consequently, the MLC is computed as: $c_j = \arg\max_i \sum_{k=1}^{N} p(t_k|c_i) \cdot N(e(t_k), \mu, \sigma)$
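The spatial entropy and the Gaussian weight from slide 14 can be sketched as follows. The exponent is written exactly as it appears on the slide, ((e − μ) / 2σ)², which differs from the textbook Gaussian density; whether that is intended is not stated, so treat this as a literal transcription. The function names are hypothetical, and the natural logarithm is an assumption.

```python
import math


def spatial_entropy(cell_probs):
    """Entropy of a tag's distribution over cells; cell_probs are the p(t|c_i)."""
    # Cells with zero probability contribute nothing and are skipped.
    return -sum(p * math.log(p) for p in cell_probs if p > 0)


def gaussian_weight(entropy, mu, sigma):
    """Weight of a tag given its entropy, following the formula on the slide."""
    exponent = ((entropy - mu) / (2 * sigma)) ** 2
    return math.exp(-exponent) / (sigma * math.sqrt(2 * math.pi))
```

A tag that occurs in a single cell has entropy 0, while a tag spread evenly over many cells has high entropy; both extremes receive a lower weight than tags whose entropy lies near the mean μ.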
  • 15. Entropy Histogram & Gaussian Weighting
      [Figure: histogram of tag entropy values with the Gaussian weighting curve]
  • 16. Similarity Search
      • Having assigned an item to the Most Likely Cell (MLC), we refine the location estimate by searching for the most similar items in the cell.
      • The k most similar items are retrieved (from $D_{tr}$) using Jaccard similarity on the respective sets of tags: $J(x, y) = \frac{|T_x \cap T_y|}{|T_x \cup T_y|}$
      • The final estimate is the centre of gravity of the k most similar images: $loc(x) = \frac{1}{k} \sum_{i=1}^{k} J(x, y_i)^{\alpha} \, loc(y_i)$, where $\alpha$ determines how strongly the result is influenced by the most similar items.
      • If fewer than k images are retrieved, then only those are used in the above calculation. If no similar images are retrieved, then the centre of the MLC is provided as output.
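The similarity search refinement from slide 16 can be sketched as below. Two points are assumptions rather than the authors' method: the weighted mean is normalised by the sum of the weights instead of the 1/k factor shown on the slide (so that the result stays inside the span of the neighbours), and latitude and longitude are averaged directly, which is only reasonable because all candidates lie in the same small cell. The function names are hypothetical; the defaults k=4 and α=1 are the values reported for run5 on slide 19.

```python
def jaccard(tags_x, tags_y):
    """Jaccard similarity between two collections of tags."""
    x, y = set(tags_x), set(tags_y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def refine_location(query_tags, cell_items, cell_centre, k=4, alpha=1.0):
    """cell_items: list of (tags, (lat, lon)) training items inside the MLC."""
    scored = [(jaccard(query_tags, tags), loc) for tags, loc in cell_items]
    scored = [(sim, loc) for sim, loc in scored if sim > 0]
    if not scored:
        return cell_centre  # no similar item: fall back to the cell centre
    # Keep the k most similar items (fewer if fewer were retrieved).
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    weights = [sim ** alpha for sim, _ in top]
    total = sum(weights)
    lat = sum(w * loc[0] for w, (_, loc) in zip(weights, top)) / total
    lon = sum(w * loc[1] for w, (_, loc) in zip(weights, top)) / total
    return (lat, lon)
```

Larger values of `alpha` shrink the weights of weakly similar items faster, so the estimate moves closer to the single best match.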
  • 17. Multiple Resolution Grids
      • To increase the granularity of the prediction and at the same time its reliability, we devised the following dual grid scheme:
          – we build two LMs: one with cells of size 0.01° × 0.01° (coarse granularity) and one with cells of size 0.001° × 0.001° (fine granularity)
          – conduct location estimation based on both
          – if the fine-granularity estimate falls within the cell of the coarse-granularity estimate, then we select the fine-granularity estimate
          – otherwise, we select the coarse one (since we consider it by default more reliable)
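The dual grid decision rule from slide 17 can be sketched as follows. The sketch assumes each language model has already produced a (lat, lon) estimate, and that cells are indexed by flooring coordinates; `cell_of` and `combine_estimates` are hypothetical names.

```python
import math


def cell_of(lat, lon, size):
    """Map coordinates (degrees) to the integer index of a size x size cell."""
    return (math.floor(lat / size), math.floor(lon / size))


def combine_estimates(coarse_loc, fine_loc, coarse_size=0.01):
    """Prefer the fine estimate only when it lies inside the coarse estimate's cell."""
    if fine_loc is None:
        return coarse_loc
    if cell_of(*fine_loc, coarse_size) == cell_of(*coarse_loc, coarse_size):
        return fine_loc
    # The two models disagree: trust the coarse grid, which is assumed more reliable.
    return coarse_loc
```

The agreement check acts as a cheap consistency test: the finer grid only overrides the coarse one when both models point to the same neighbourhood.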
  • 18. Evaluation
      • Benchmark dataset: MediaEval 2014
      • Training set: 5M, Test set: 510K
      • All experiments conducted on the full test set (510K)
      • Two stages of evaluation:
          – participation in contest (with a limited version of the proposed approach)
          – post-contest performance exploration
  • 19. Evaluation: MediaEval 2014 Contest (1/4)
      • Out of the five runs, three were based on variations of the presented approach
          – run1: LM + feature weighting with spatial entropy + similarity search + multiple resolution grid
          – run4: LM only
          – run5: LM + similarity search (similarity search parameters: α=1, k=4)
      • Performance was measured with P@r, where r = 10m, 100m, 1km, 10km, 100km and 1000km
  • 20. Evaluation: MediaEval 2014 Contest (2/4)
      • Proposed improvements (run1) outperform the base approach (run4) and the base approach + similarity search (run5)
      • The improvement is more pronounced in the small ranges (10m, 100m, 1km)
  • 21. Evaluation: MediaEval 2014 Contest (3/4)
      [Figure: comparison charts in which the proposed approach is marked “Proposed”]
  • 22. Evaluation: MediaEval 2014 Contest (4/4)
      [Figure: chart with axis “Number of image tags”]
  • 23. Post-Contest Evaluation
      Explore the role of different factors:
      • Big training set (YFCC100M): ~48M geotagged items
      • Feature Selection (FS)
      • Feature Weighting with Spatial Entropy (SE)
      • Multiple Resolution Grid (MG)
      • Similarity Search (SS)
      Two settings:
      • FAIR: All users from the training set are completely removed from the test set
      • OVERFIT: Users are not removed from the test set even when some of their media items are included in the training set.
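The FAIR setting from slide 23 amounts to a user-disjoint split, which can be sketched as below. The item representation (dictionaries with a `user_id` key) and the function name `fair_test_set` are assumptions for illustration; the OVERFIT setting corresponds to using the test set unchanged.

```python
def fair_test_set(train_items, test_items):
    """Drop every test item whose user also appears in the training set."""
    train_users = {item["user_id"] for item in train_items}
    return [item for item in test_items if item["user_id"] not in train_users]
```

Filtering by user matters because items uploaded by the same user often share tags and locations, so leaving them in would inflate the measured accuracy.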
  • 24. Post-Contest Evaluation
      • Clear improvement with the addition of MG and SS
      • The proposed improvements together with the use of the bigger dataset make the approach perform better than all other methods in MediaEval 2014
  • 25. Geographic Error Analysis
      • More data leads to lower error across the globe.
      • Several small US cities suffer from low accuracy due to having names of large European cities.
      [Figure: error maps, panels “ALL + YFCC100M” and “run 4”]
  • 26. Big Data vs. Complex Algorithms
      Using 10x more data for training led to performance equivalent to that of a more complex algorithm (LM + extensions) trained with less data!
  • 27. Placeability of Media Items
      • The sum of cell-tag probabilities is a good indicator of how confident we are in the decision of the classifier.
  • 30. Conclusion
      • Key contributions
          – Improved geotagging approach, extending the widely used language model in three ways: feature selection, weighting, multiple resolution grids
          – Thorough analysis of geotagging accuracy offering new insights and highlighting new challenges
      • Future Work
          – Exploit visual features to improve (currently visual-only approaches perform very poorly)
          – Integrate gazetteer and structured location data sources (e.g. Foursquare venues, OpenStreetMap, etc.)
          – Evaluate in more challenging settings and datasets (e.g. Twitter, Instagram)
  • 31. References (1/2)
      • C. Kessler, K. Janowicz, and M. Bishr. An Agenda for the Next Generation Gazetteer: Geographic Information Contribution and Retrieval. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 91–100. ACM, 2009
      • P. Smart, C. Jones, and F. Twaroch. Multi-source Toponym Data Integration and Mediation for a Meta-gazetteer Service. In Proceedings of the 6th International Conference on Geographic Information Science, GIScience '10, pages 234–248. Springer-Verlag, Berlin, Heidelberg, 2010
      • M.D. Lieberman, H. Samet, and J. Sankaranarayanan. Geotagging: using Proximity, Sibling, and Prominence Clues to Understand Comma Groups. In Proceedings of the 6th Workshop on Geographic Information Retrieval, 2010
      • P. Serdyukov, V. Murdock, and R. Van Zwol. Placing Flickr Photos on a Map. In SIGIR '09, pages 484–491, New York, NY, USA, 2009. ACM
      • C. Hauff and G. Houben. Geo-location Estimation of Flickr Images: Social Web based Enrichment. ECIR 2012, pages 85–96. Springer LNCS 7224, April 1-5, 2012
      • N. O'Hare and V. Murdock. Modeling Locations with Social Media. Information Retrieval, pages 1–33, 2012
      • O. Van Laere, S. Schockaert, and B. Dhoedt. Finding Locations of Flickr Resources using Language Models and Similarity Search. ICMR '11, New York, USA, 2011. ACM
  • 32. References (2/2)
      • D.J. Crandall, L. Backstrom, D. Huttenlocher, and J. Kleinberg. Mapping the World's Photos. In Proceedings of the 18th International Conference on World Wide Web, WWW '09, pages 761–770, New York, NY, USA, 2009. ACM
      • M. Trevisiol, H. Jegou, J. Delhumeau, and S. Gravier. Retrieving Geo-Location of Videos with a Divide and Conquer Hierarchical Multimodal Approach. ICMR '13, Dallas, United States, April 2013. ACM
      • P. Kelm, S. Schmiedeke, and T. Sikora. A Hierarchical, Multi-modal Approach for Placing Videos on the Map using Millions of Flickr Photographs. In Proceedings of the 2011 ACM Workshop on Social and Behavioural Networked Media Access, SBNMA '11, pages 15–20, New York, NY, USA, 2011. ACM
  • 33. Thank you!
      • Resources:
          – Slides: http://www.slideshare.net/sympapadopoulos/reveal-geotagging
          – Code: https://github.com/socialsensor/multimedia-geotagging
          – Benchmark: http://www.multimediaeval.org/mediaeval2014/placing2014/
      • Get in touch:
          – @sympapadopoulos / papadop@iti.gr
          – George Kordopatis / georgekordopatis@iti.gr

Editor's Notes

  1. http://irevolution.net/2014/04/03/using-aidr-to-collect-and-analyze-tweets-from-chile-earthquake/