EXTRACTING AND COMPARING PLACES USING
GEOSOCIAL MEDIA
Frank O. Ostermann et al.
ISSDQ 2015
• Introduction: Places and geosocial media
• Methods: Discovering relationships between
semantic similarity and geographic location
• Results: London case study
• Outlook: Current and future work
29.09.2015, F.O. Ostermann, ISSDQ 2015
RESEARCH OBJECTIVES
INTRODUCTION
Contribute to our understanding of the semantics of places
by combining
methods from diverse disciplines
and
datasets from several sources,
in particular geosocial media.
WHY GEOSOCIAL MEDIA? WHY SEMANTICS OF PLACES?
INTRODUCTION
• rich and multi-faceted view on the perception and
semantics of geographic places
• improve interoperability between existing geospatial
datasets and geographic information retrieval for
future streams of geographic information
GEOSOCIAL MEDIA
INTRODUCTION
Types of geosocial media, by how explicit the participation and the geography are:
• Explicit participation, explicit geography: Volunteered Geographic Information (VGI), e.g. OpenStreetMap
• Explicit participation, implicit geography: Volunteered Geographic Content (VGC), e.g. Wikipedia articles on non-geographic topics containing place names, Foursquare
• Implicit participation, explicit geography: Contributed / Ambient Geographic Information (CGI/AGI), e.g. public Tweets referring to the properties of an identifiable place
• Implicit participation, implicit geography: User-Generated Geographic Content (UGGC), e.g. public Flickr images containing a place name or being georeferenced
Adopted from [1]
CONCEPTUALIZING PLACES
INTRODUCTION
Agnew (1987):
• Specific location (where?)
• Locale (properties)
• Sense of place (attachment)
Winter and Freksa (2012):
• Place through contrast
• Spatio-temporal location
• Semantics
DISCOVERING PLACES IN/FROM GEOSOCIAL MEDIA
INTRODUCTION
• Theory-guided research and local case study:
• How do people see and understand the places they frequent?
• What is different across media sources?
• More than one (volunteered) data source
• Identification of places and their semantics
• Comparison of places between data sources
• Comparison of places with geographic features and authoritative
data sources
CASE STUDY RESEARCH QUESTIONS
INTRODUCTION
• Can we identify distinct places in one data source
(Flickr) based on purely spatial clustering from another
data source (Twitter)?
• How does semantic similarity between places vary
over space and scale? How does Tobler’s first law of
geography hold with regard to scale and space?
• Study Area: Greater London
OVERALL WORKFLOW
METHODS
Mine Twitter
for potential
places
Mine Flickr
for matching
images
Build term
vectors for
image
clusters
Calculate
cosine
similarities
Analyse
correlation
and spatial
variations
GEOSOCIAL DATA SET #1: TWITTER
METHODS
Twitter: abundant but noisy and not very rich in content
What: All geo-referenced Tweets
Where: Greater London Area
When: Nov 5, 2012 - October 3, 2013 (334 days)
Who: No tourists, please. Only users with Tweets spanning 30+ days
15,246,565 Tweets from 40,246 users.
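The resident filter could be sketched as follows. This is only an illustration: it assumes each Tweet is reduced to a (user_id, timestamp) pair and reads "spanning 30+ days" as the gap between a user's first and last Tweet. The function name and data layout are hypothetical, not taken from the study.

```python
from datetime import datetime, timedelta


def users_spanning(tweets, min_span_days=30):
    """Return ids of users whose first and last Tweet are at least min_span_days apart.

    tweets: iterable of (user_id, timestamp) pairs (hypothetical layout).
    """
    first, last = {}, {}
    for user_id, ts in tweets:
        if user_id not in first or ts < first[user_id]:
            first[user_id] = ts
        if user_id not in last or ts > last[user_id]:
            last[user_id] = ts
    span = timedelta(days=min_span_days)
    return {u for u in first if last[u] - first[u] >= span}


# Toy data: one long-term user and one short-term visitor.
tweets = [
    ("resident", datetime(2012, 11, 5)),
    ("resident", datetime(2013, 2, 1)),
    ("tourist", datetime(2013, 6, 1)),
    ("tourist", datetime(2013, 6, 4)),
]
kept = users_spanning(tweets)
```

Only Tweets from the users in `kept` would then be passed on to the clustering step.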
CLUSTERING TWITTER
METHODS
• User-defined radius
Rmax=300m
• User-defined
minimum cluster size
• Adaptive densities
and shapes
GEOSOCIAL DATA SET
METHODS
• Resulted in 55,000 potential places
• Too high for an exploratory analysis
• Filter out clusters with < 100
distinct users
• 3501 clusters for further analysis
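A minimal sketch of the distinct-user filter, assuming the clustering step yields (cluster_id, user_id) pairs, one per clustered Tweet. The function name and input layout are assumptions made for illustration.

```python
from collections import defaultdict


def clusters_with_min_users(assignments, min_users=100):
    """Return ids of clusters that contain Tweets from at least min_users distinct users.

    assignments: iterable of (cluster_id, user_id) pairs (hypothetical layout).
    """
    users = defaultdict(set)
    for cluster_id, user_id in assignments:
        users[cluster_id].add(user_id)  # a set counts each user once per cluster
    return {c for c, u in users.items() if len(u) >= min_users}


# Toy data with a lowered threshold so the effect is visible.
assignments = [(1, "a"), (1, "b"), (1, "a"), (2, "c"), (2, "c")]
kept_clusters = clusters_with_min_users(assignments, min_users=2)
```

Counting distinct users rather than Tweets keeps a single very active account from turning its surroundings into a "place".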
FROM TWEETS TO PLACES
METHODS
Flickr: less abundant, but more stable, focused on geo, and richer
What: All geo-referenced Flickr images
Where: Greater London Area bounding box
When: until November 2014
More than five million images.
MEASURING SEMANTIC SIMILARITY
METHODS
Cosine similarity
• Cosine of the angle between two vectors
• If vectors have same orientation, angle is 0°, cosine similarity is 1
• Orthogonal vectors have cosine similarity of 0
• Serves as an approximation of semantic similarity between places
• Has been used successfully in Geographic Information Retrieval
Hypothesis
Based on Tobler’s first law of geography, we expect a negative
correlation between distance and cosine similarity
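The measure itself is short enough to write out. The sketch below computes the cosine of the angle between two equal-length term vectors; returning 0 for an all-zero vector is a convention chosen here, not something the slides specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty term vector is similar to nothing
    return dot / (norm_a * norm_b)


same = cosine_similarity([1, 0, 1], [1, 0, 1])        # same orientation
orthogonal = cosine_similarity([1, 0, 0], [0, 1, 0])  # no shared terms
partial = cosine_similarity([1, 1, 0], [1, 0, 0])     # one of two terms shared
```

For binary term vectors the value depends only on how many terms two places share relative to how many each place has.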
MEASURING SENSE OF PLACE
METHODS
1. Buffer Tweet clusters
2. Point-in-polygon intersection Flickr images and Tweet clusters using
PostGIS (inner join with multiple entries).
3. Build term vectors for remaining Flickr images through lexical
matching, using set of activities, elements, or qualities (Purves et al.
2011) in titles, descriptions, tags
4. Aggregate to Twitter clusters
5. Normalize to binary (present, not present)
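Steps 3 to 5 can be sketched in plain Python, assuming the spatial join of step 2 has already attached a cluster id to each image and that titles, descriptions and tags have been concatenated into one text per image. The vocabulary below is a made-up stand-in for the activities, elements and qualities term set, and the tokenizer is a deliberately simple assumption.

```python
import re

# Hypothetical stand-in for the activities / elements / qualities vocabulary.
VOCABULARY = ["walking", "shopping", "park", "river", "busy", "quiet"]


def image_terms(text, vocabulary=VOCABULARY):
    """Step 3: lexical matching of vocabulary terms against an image's text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return tokens & set(vocabulary)


def cluster_term_vectors(images, vocabulary=VOCABULARY):
    """Steps 4 and 5: aggregate matches per cluster and encode them as 0/1 vectors.

    images: iterable of (cluster_id, text) pairs (hypothetical layout).
    """
    present = {}
    for cluster_id, text in images:
        present.setdefault(cluster_id, set()).update(image_terms(text, vocabulary))
    return {
        cluster_id: [1 if term in terms else 0 for term in vocabulary]
        for cluster_id, terms in present.items()
    }


images = [
    (1, "Walking in the park"),
    (1, "Quiet river"),
    (2, "Busy shopping street"),
]
vectors = cluster_term_vectors(images)
```

Because the vectors are binary, a term mentioned in a thousand images weighs the same as a term mentioned once.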
SEMANTIC SIMILARITY BETWEEN NEIGHBORS
RESULTS
High similarity between neighboring clusters suggests they
might not be distinct places in
the sense of Winter and
Freksa (2012)
Exploratory analysis
Cosine similarity for combined term vectors with nearest neighbor
(asymmetrical)
COSINE SIMILARITY NEAREST NEIGHBORS
RESULTS
SIMILARITY AND DISTANCE NEAREST NEIGHBORS
METHODS
Normality not given (Shapiro-Wilk)
Non-parametric Kendall’s Tau correlation tests
• Weak to moderate negative correlation
• Kendall's rank correlation: z = -25.8158, p < 0.001, tau = -0.2921797
• Negative correlation consistent with Tobler’s first law of
geography
• Shows that near things are in general more related than distant
things.
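For readers unfamiliar with the statistic, the following is a small pure-Python sketch of Kendall's tau-b, the tie-corrected variant. Whether the reported value is tau-a or tau-b is not stated on the slide, so the variant is an assumption; the sketch also omits the z statistic and p-value, and its quadratic loop is meant for illustration rather than for thousands of clusters.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to neither term
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt(
        (concordant + discordant + ties_x_only)
        * (concordant + discordant + ties_y_only)
    )
    return (concordant - discordant) / denom if denom else float("nan")


# Toy data: similarity falls steadily as distance grows.
distances = [100, 200, 300, 400]
similarities = [0.9, 0.7, 0.5, 0.2]
tau = kendall_tau_b(distances, similarities)
```

A value of -1 means every pair of observations is ordered oppositely in distance and similarity; the reported estimate of about -0.29 indicates a much weaker tendency in the same direction.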
CORRELATION OVER DISTANCE FOR ALL PAIRS
RESULTS
• Calculated cosine similarity
for all cluster pairs
• Calculated Kendall’s Tau
for all clusters
• Heterogeneous spatial
variation (non-stationarity)
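The all-pairs step needs a distance for every pair of clusters. The slides do not say how distance was measured, so the sketch below assumes great-circle (haversine) distance between cluster centroids given as (latitude, longitude) in degrees; the centroid coordinates and names are illustrative only.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pairwise_distances(centroids):
    """Return {(id_a, id_b): distance_m} for every unordered pair of clusters.

    centroids: {cluster_id: (lat, lon)} (hypothetical layout).
    """
    return {
        (a, b): haversine_m(*centroids[a], *centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }


# Approximate coordinates, for illustration only.
centroids = {
    "trafalgar": (51.5080, -0.1281),
    "tower_bridge": (51.5055, -0.0754),
    "greenwich": (51.4769, -0.0005),
}
distances = pairwise_distances(centroids)
```

With 3501 clusters this yields 3501 × 3500 / 2 = 6,126,750 unordered pairs, each of which also gets a cosine similarity.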
CORRELATION DISTANCE AND SIMILARITY
RESULTS
OUTLIERS OR PLACES?
RESULTS
• Correlation between distance and cosine similarity is stronger in the
city centre
• Shorter distances to all others, and correlation breaks down at some
distance
• Downtown London has higher average similarity scores than
periphery
• Shortest distance band clearly shows clusters of high average
similarity scores, suggesting areas that are internally more
semantically similar than others
BACK TO SQUARE 1
RESULTS
• We can identify several coarse places when comparing the average
cosine similarity for low distance bands
• Negative correlation between distance and cosine similarity is
strongest for smaller distances, and flattens out over longer
distances. Consistent with Li et al. (2014), showing that Tobler’s first
law of geography is only consistently true within a specific distance
• Results support our assumption that distinct locales are discoverable
through geographic semantics in user-generated geographic content.
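Averaging similarity per distance band can be sketched as below. The band width of 1000 m is a placeholder, since the slides do not state which bands were used, and the input is assumed to be a list of (distance in metres, cosine similarity) pairs.

```python
from collections import defaultdict


def mean_similarity_by_band(pairs, band_width_m=1000.0):
    """Return {band_index: mean similarity}; band k covers [k*width, (k+1)*width).

    pairs: iterable of (distance_m, similarity) tuples (hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance_m, similarity in pairs:
        band = int(distance_m // band_width_m)
        sums[band] += similarity
        counts[band] += 1
    return {band: sums[band] / counts[band] for band in sorted(sums)}


# Toy data in which similarity decays with distance.
pairs = [(200, 0.8), (900, 0.6), (1500, 0.5), (2500, 0.2)]
bands = mean_similarity_by_band(pairs)
```

Plotting the band means against distance would show where the decline flattens out.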
FUTURE WORK
OUTLOOK
• Measure correlation between similarity independently in the 3
dimensions of activities, elements and qualities.
• Measure the impact of the temporal dimension by investigating time
slices of Twitter and Flickr data.
• Merge neighbouring places with cosine similarity greater than some
given threshold value in an iterative clustering process
• Ground the resulting places through POIs from OSM and in-depth
qualitative analysis
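One pass of the proposed merging could look like the sketch below, which groups neighbouring clusters whose similarity exceeds a threshold using a union-find structure. This is a guess at one possible design for work the slides only announce: the threshold of 0.8, the neighbour list and the function name are all assumptions.

```python
def merge_similar_neighbours(neighbour_pairs, similarity, threshold=0.8):
    """Group clusters linked by neighbour pairs whose similarity exceeds threshold.

    neighbour_pairs: iterable of (a, b) cluster ids that are spatial neighbours.
    similarity: {(a, b): cosine similarity} for those pairs.
    Returns a sorted list of sorted groups.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in neighbour_pairs:
        root_a, root_b = find(a), find(b)  # registers both ids even if not merged
        if similarity[(a, b)] > threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(group) for group in groups.values())


neighbour_pairs = [("A", "B"), ("B", "C"), ("C", "D")]
similarity = {("A", "B"): 0.9, ("B", "C"): 0.85, ("C", "D"): 0.3}
places = merge_similar_neighbours(neighbour_pairs, similarity)
```

An iterative version would rebuild the term vector of each merged group, recompute similarities to its neighbours, and repeat until no pair exceeds the threshold.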
23.06.2015F.O.Ostermann - ifgi GI-Forum 30
HYBRID GEO-INFORMATION PROCESSING
OUTLOOK
Time-consuming and resource-intensive
• Manual annotation and ground truthing
• Parameterization of spatio-temporal clustering
Other challenges:
• Dependency on data quality
• Overfitting
• Diversity of contexts and tasks
• Near real-time
Crowdsourced Supervision
Thank you!
f.o.ostermann@utwente.nl
@f_ostermann
nl.linkedin.com/in/foost
[1] Craglia, M., Ostermann, F., & Spinsanti, L. (2012). Digital Earth from vision to
practice: making sense of citizen-generated content. International Journal of Digital
Earth, 5(5), 398–416.
