PROCESSING AND UNDERSTANDING GEO-SOCIAL
MEDIA CONTENT
EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS
Frank O. Ostermann
IfGI GI-Forum, 23.06.2015
• Introduction: Geo-social media APIs as sensors
• Where we are: Current state of the art and practical examples from disaster response
• Outlook: Future research directions
23.06.2015F.O.Ostermann - ifgi GI-Forum 2
ONCE UPON A TIME…
INTRODUCTION
… there were Desktop-GIS and Shapefiles,
digitized or scanned from paper maps,
or from raw surveying or satellite data.
THREE DISRUPTIVE INNOVATIONS
INTRODUCTION
• Mobile Web 2.0
• Cloud Computing
• Internet of Things (in particular, sensors)
… AND CORRESPONDING BUZZWORDS (& BUZZ-VIS)
INTRODUCTION
BEYOND THE BUZZ
EXAMPLES
THE BIG PICTURE: NEXT GENERATION DIGITAL EARTH
INTRODUCTION
• Real-time data input stream
• Citizens as sensors
• Multi-layered, interoperable data sets
• Linked and open data
• GEOSS, Eye on Earth, INSPIRE, …
LOW-COST IN-SITU AND MOBILE SENSORS
INTRODUCTION
Publiclaboratory.com
Mikrokopter.de
Libelium
Waspmote
CITIZENS AS SENSORS
INTRODUCTION
+ = !
Why not treat information from the citizens
as another type of sensor data?
WHO IS THE CROWD?
INTRODUCTION
WHAT DOES THE CROWD WANT?
INTRODUCTION
NEW SOURCES OF GEO-INFORMATION
INTRODUCTION
Typology by participation and geography (adopted from [1]):
• Explicit participation, explicit geography: Volunteered Geographic Information (VGI), e.g. OpenStreetMap
• Explicit participation, implicit geography: Volunteered Geographic Content (VGC), e.g. Wikipedia articles on non-geographic topics containing place names, Foursquare
• Implicit participation, explicit geography: Contributed / Ambient Geographic Information (CGI/AGI), e.g. public Tweets referring to the properties of an identifiable place
• Implicit participation, implicit geography: User-Generated Geographic Content (UGGC), e.g. public Flickr images containing a place name or being georeferenced
TWITTER
INTRODUCTION
• Micro-blogging platform with a 140-character limit
• Asymmetric relationship: following vs. being followed
• Inflated user numbers:
• 100 million daily active vs.
• 300 million monthly active vs.
• 1 billion registered (high number of bots; >40% never tweeted)
• Two APIs: Streaming API & Search API
• Rich metadata returned
• <5% with coordinates, but much more with toponyms
• Huge ecosystem of third-party apps and services
• Boost to data-driven research, but what about reproducibility?
FLICKR
INTRODUCTION
• 92 million users
• 1 million photos shared every day
• Pioneer, then declined, then bounced back
• API offers detailed search functionality
• ~20% geocoded, many more with toponyms
• Potentially rich source of data:
• Title
• Tags
• Description
• But: Bulk uploads (and tagging)
GEO-SOCIAL MEDIA SENSORS – SO WHAT’S DIFFERENT?
INTRODUCTION
• Often In-situ
• Rich, pre-processed information but varying level of quality
• Uneven spatio-temporal distribution (stream)
• Redundancy of content and channels (sharing)
• Heterogeneous structure
• Unknown source/lineage
• Unclear / changing licensing, property rights, liability (e.g. OpenStreetMap)
• Unknown/Immeasurable precision, error, completeness
• Uncertainty about the uncertainty!
• How to calibrate? (Should we?)
QUALITY OF GEO-SOCIAL MEDIA INFORMATION
INTRODUCTION
Adopted from [2, 3]
Source
Credibility
Relevance
Content
Location
Context
Natural Language Processing
Social Network Analysis
Geographic Contextualization
AUTOMATIC IMAGE GEO-TAG CREATION
INTRODUCTION
GEO-SOCIAL MEDIA AND CRISIS MANAGEMENT
WHERE WE ARE
Social media offers… | Crisis management needs…
rich up-to-date information | up-to-date information
new paths of communication | redundant paths of communication
noise, uncertain lineage and accuracy | high-quality and reliable information
Crowd-sourced data curation faces limits of
• Sustainability
• Scalability
HUMANITARIAN OPENSTREETMAP TEAM
INTRODUCTION
• Many activations, last one after Nepal earthquake
• Three main communication channels:
• Tasking manager
• E-Mail list
• IRC channel
USHAHIDI – BEYOND CRISIS MAPPING
INTRODUCTION
TWITCIDENT - CROWDSENSE
WHERE WE ARE
CRISISTRACKER (AIDR)
WHERE WE ARE
AIDR
WHERE WE ARE
http://irevolution.net/2013/10/01/aidr-artificial-intelligence-for-disaster-response/
GEOGRAPHIC CONTEXT ANALYSIS OF VOLUNTEERED
INFORMATION (GEOCONAVI)
WHERE WE ARE
1. Deploy a system for using user-generated content (UGC) in crisis decision support on forest fires
2. Assess the added value of using UGC for forest fire response
FOREST FIRE CHARACTERISTICS
WHERE WE ARE
• Dynamics require near real-time processing
• Fewer signals, since fires often occur in sparsely populated areas
• Predictability and recurrence facilitate sensor and model calibration
GEOCONAVI FIGHTING FOREST FIRES
WHERE WE ARE
Sources: Twitter Streaming API, Flickr Search API
1.1 Retrieval: scheduled Java code accessing APIs
1.2 Storage: scheduled Java code writing to DBMS
2.1 Topicality: scheduled PL/SQL job
2.2 Geo-coding: a) scheduled PL/SQL job, b) scheduled Java code
2.3 Geographic context: scheduled PL/SQL job
2.4 Quality assessment: scheduled PL/SQL job
3.1 Spatio-temporal clustering: scheduled Python script calling SatScan job
3.2 Quality re-assessment: scheduled PL/SQL job
Central store: Oracle DBMS
External inputs: EFFIS hotspot data, European Media Monitor geo-coding API
Dissemination: SMS, WFS, WMS, RSS, SES
DATA COLLECTION AND STORAGE
WHERE WE ARE
• Flickr API
• Twitter Streaming API
• Keyword-based:
• Domain expertise
• Task-oriented
• Scheduled scripts
• Writing to Oracle DBMS
EXAMPLE GEO-SOCIAL MEDIA
WHERE WE ARE
“Back at hotel. Fire skirted
round village. Little evidence of
significant damage. Helicopters
still overhead damping scrub.
Beer unaffected”
(Canada BCGovFireInfo): “Important
notice from the Reg Dist of Bulkley-
Nechako regarding evacuations due
to wildfires in the area
http://ow.ly/2sBxH”
“Are you a fireman?
Cause you’re always there to extinguish
the fire inside my heart.”
SCORING GEO-SOCIAL MEDIA
WHERE WE ARE
• Sum of weighted scores: $QS(O_j) = \sum_{i=1}^{N} w_i s_{ji}$
• with $w_i$ being the weight for criterion $i$, and $s_{ji}$ being the score of geo-social media object $O_j$ on criterion $i$
• Topicality: keyword-based
• Proximity: distance to the nearest concurrently reported hotspot
• Land cover: Forest, no-Forest, Built-up
• Population Density: Risk factor
• Information clusters: Similar messages or lone signal?
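The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration only: the criterion names, weights and scores are assumptions made up for the example, not the values used in GeoCONAVI.

```python
def quality_score(scores, weights):
    """Return QS(O_j) = sum_i w_i * s_ji for one geo-social media object.

    scores  -- mapping criterion name -> score s_ji of this object
    weights -- mapping criterion name -> weight w_i
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for criteria: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical weights and scores, chosen only for illustration.
weights = {"topicality": 2.0, "proximity": 1.0, "land_cover": 0.5,
           "population_density": 0.5, "cluster": 1.0}
tweet_scores = {"topicality": 1.0, "proximity": 0.5, "land_cover": 1.0,
                "population_density": 0.0, "cluster": 1.0}

print(quality_score(tweet_scores, weights))
```

Keeping weights and scores in separate mappings makes it easy to re-weight criteria for a different task without touching the per-item scores.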
TOPICALITY MACHINE LEARNING CLASSIFICATION
WHERE WE ARE
1. Manually annotated (Yes/No) random sample
2. Counted keyword occurrences
3. Used Weka 10-fold stratified cross validation with
a) Decision trees
b) Naive Bayes
c) Association Rules
4. J48 Decision Tree works best
                     Classified as YES   Classified as NO
On Forest Fire       1196                370
Not on Forest Fire   403                 3712
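To make the table easier to read, the usual summary metrics can be derived from its four counts. The sketch below assumes that rows are the manual annotation and columns the classifier output, with "On Forest Fire" as the positive class; the slides do not state this explicitly.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts from the table above (assumed layout: rows = annotation, columns = prediction).
metrics = binary_metrics(tp=1196, fn=370, fp=403, tn=3712)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```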
GEOCODING GEO-SOCIAL MEDIA
WHERE WE ARE
Several Geocoders used:
• GISCO/LAU2 brute-force string matching
• European Media Monitor algorithms
• Yahoo! Placemaker (2010)
                              TWITTER                      FLICKR
                              August 2010   August 2011    August 2010   August 2011
Retrieved items               2,904,065     7,996,228      7,991         17,850
Percentage with toponym       35%           27%            53%           50%
Percentage with coordinates   1.1%          0.92%          20%           21%
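String matching against a gazetteer, the simplest of the geocoding approaches listed above, might look roughly like the following sketch. The two-entry gazetteer and its approximate coordinates are hypothetical stand-ins; a real run would load a full place-name list such as LAU2 names.

```python
import re


def find_toponyms(text, gazetteer):
    """Return gazetteer entries whose name occurs as a whole word in text."""
    lowered = text.lower()
    hits = []
    for name, coords in gazetteer.items():
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if re.search(pattern, lowered):
            hits.append((name, coords))
    return hits


# Hypothetical gazetteer: name -> approximate (lat, lon).
gazetteer = {"Marseille": (43.30, 5.37), "Nice": (43.70, 7.27)}

print(find_toponyms("Feu de forêt près de Marseille", gazetteer))
# The second call shows a false positive: "Nice" also matches the English adjective.
print(find_toponyms("Nice weather today", gazetteer))
```

The false positive in the second call is one reason why plain matching benefits from additional context filtering.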
SPATIO-TEMPORAL CLUSTERING
WHERE WE ARE
• SatScan external software
• Scheduled Python script
1. Reads new geo-social media from database
2. Converts it to SatScan input format
3. Calls SatScan from the command line with appropriate parameters
4. Waits for SatScan to complete analysis
5. Reads SatScan output
6. Stores relevant information in database
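The steps above could be wrapped in a script along the following lines. This is a sketch under stated assumptions, not the project's script: the input column layout, the file names `items.cas` and `results.txt`, and the command passed in by the caller are all hypothetical, and database access (steps 1 and 6) is left to the caller.

```python
import csv
import subprocess
from pathlib import Path


def export_items(items, case_file):
    """Step 2: write items as space-separated lines (id, count, date).

    The column layout is an assumption; the SatScan user guide defines
    the input format a given analysis actually needs.
    """
    with open(case_file, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=" ")
        for item in items:
            writer.writerow([item["id"], 1, item["date"]])


def run_clustering(items, workdir, command, result_name="results.txt",
                   runner=subprocess.run):
    """Steps 2-5 for items already read from the database (step 1)."""
    workdir = Path(workdir)
    export_items(items, workdir / "items.cas")
    # Steps 3 and 4: the call blocks until the external program exits;
    # check=True raises CalledProcessError on a non-zero exit code.
    runner(list(command), cwd=workdir, check=True)
    # Step 5: return the raw output so the caller can parse and store it (step 6).
    return (workdir / result_name).read_text()
```

The `runner` argument exists only so that the external call can be replaced by a stub during testing; in production it stays at `subprocess.run`.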
SPATIO-TEMPORAL CLUSTERING PARAMETERS
WHERE WE ARE
• Type of clustering algorithm
• Spatial location of clusters: based on a grid or on the item locations
• Type of spatial overlap of clusters
• Maximum spatial cluster size
• Maximum temporal cluster size
• Used in 2011: Discrete Poisson adjusting for population, no grid, no overlap, max radius 50 km, max temporal extent 10% of study period (9 days)
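The 2011 settings can be captured in a small configuration object, which keeps each run documented and repeatable. The field names below are illustrative assumptions and do not correspond to SatScan parameter names; the 90-day study period is inferred from "10% of study period (9 days)" and is not stated on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteringParameters:
    model: str = "discrete Poisson"       # adjusting for population
    use_grid: bool = False                # "no grid"
    allow_overlap: bool = False           # "no overlap"
    max_radius_km: float = 50.0
    max_temporal_fraction: float = 0.10   # share of the study period

    def max_temporal_days(self, study_period_days):
        """Longest allowed cluster duration in whole days."""
        return round(self.max_temporal_fraction * study_period_days)


params_2011 = ClusteringParameters()
# Assumed study period of 90 days (inferred, see above).
print(params_2011.max_temporal_days(90))
```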
VISUALIZATION AND SHARING
WHERE WE ARE
FOREST FIRES IN FRANCE 2011
WHERE WE ARE
FOREST FIRES IN FRANCE BY GEOCONAVI
WHERE WE ARE
FRENCH FOREST FIRE SOCIAL MEDIA
WHERE WE ARE
(1) Containing French keywords: 659,676 Tweets and 39,016 Flickr images
(2) Machine-learned relevance filter: 25,684 items left
(3) Geocoded and context enriched: 5,770 items left
(4) Clustered in space and time: 129 clusters with 2,682 items
(5) Second relevance filter: 11 clusters left with 469 items
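How strongly each processing stage reduces the data becomes clearer when the counts are expressed as a share of the initial retrieval. The short sketch below uses only the item counts given for stages (1) to (5); the stage labels are shortened for display.

```python
# Item counts per stage, taken from the French 2011 case study above.
stages = [
    ("keyword retrieval", 659_676 + 39_016),
    ("relevance filter", 25_684),
    ("geocoded and enriched", 5_770),
    ("in clusters", 2_682),
    ("after second filter", 469),
]


def retention(stages):
    """Yield (name, count, share of the initial count) for each stage."""
    initial = stages[0][1]
    for name, count in stages:
        yield name, count, count / initial


for name, count, share in retention(stages):
    print(f"{name:<24}{count:>8}  {share:8.3%}")
```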
GEOCONAVI RESULTS
WHERE WE ARE
• Simple keyword queries suffice
• Additional Geo-coding indispensable
• Topicality and context filtering plus spatio-temporal clustering crucial
• Able to detect fires from Tweets and Flickr images by spatio-temporal
clustering
• Relevance, credibility and overall quality vary greatly, thus more rules
and human assessment needed
SEMANTICS OF PLACES ACROSS GEO-SOCIAL MEDIA
WHERE WE ARE
• Theory-guided research and local case study:
• How do people see and understand the places they frequent?
• What is different across media sources?
• More than one (volunteered) data source
• Identification of places and their semantics
• Comparison of places between data sources
• Comparison of places with geographic features and authoritative data sources
SEMANTICS OF PLACES - IMPLEMENTATION
WHERE WE ARE
• Shatford-Panofsky and Agnew
• Greater London Area
• From Twitter to Flickr
• Data Mining (Spatio-temporal clustering) -> Semantic Analysis (Cosine Similarity, …)
• Geo-demographic data
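Cosine similarity, named above as one of the semantic analysis measures, compares two term-frequency vectors by the angle between them. A minimal sketch follows; the term lists for the two sources are invented for illustration and are not data from the London case study.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term counts for one place as described in two sources.
twitter_terms = Counter("match stadium goal crowd match".split())
flickr_terms = Counter("stadium architecture crowd night".split())
print(round(cosine_similarity(twitter_terms, flickr_terms), 3))
```

A value near 1 means the two sources describe the place with similar vocabulary; a value near 0 means they share almost no terms.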
COSINE SIMILARITY NEAREST NEIGHBORS
WHERE WE ARE
CORRELATION DISTANCE & SIMILARITY
WHERE WE ARE
UNSOLVED PROBLEMS FROM FRENCH CASE STUDY
WHERE WE ARE
Relevant datasets for contextualization
• Choice
• Integration
Settings for data mining and machine learning
• Method
• Parameters
Geospatial Semantic Web
Multi-Sensory Integration
Crowdsourced Supervision
INTEGRATING GEO-SOCIAL MEDIA
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING
OUTLOOK
Time-consuming and resource-intensive
• Manual annotation and experiments for topicality filtering
• Parameterization of spatio-temporal clustering
Other challenges:
• Dependency on data quality
• Overfitting
• Diversity of contexts and tasks
• Near real-time
Crowdsourced Supervision
HYBRID GEO-INFORMATION PROCESSING
OUTLOOK
Developing hybrid quality assurance mechanisms for near real-time geo-information streams
• Link the characteristics of geographic information with machine
learning class labelling and regression
• Provide a multi-modal interface to let human oracles simultaneously
label instances
• Translate the learner models into nomothetic principles on
geographic semantics
MACHINE LEARNING FOR GEO-SOCIAL MEDIA
OUTLOOK
Every data instance needs multi-class labelling:
• Content type
• Geographic footprints of locations and/or events
• Distinct event membership
• Credibility based on a combination of the other class labels
Learners have to deal with characteristics of geographic information:
• Spatial autocorrelation
• Vague boundaries and class memberships
• Uncontrolled variance
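One way to picture the multi-class labelling requirement is a record that carries all labels for a single item. The sketch below is purely illustrative: the field names, the bounding-box representation of footprints and especially the credibility rule are assumptions, since the slides only say that credibility combines the other labels.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LabelledItem:
    """One geo-social media item with the class labels listed above."""
    item_id: str
    content_type: Optional[str] = None   # e.g. "eyewitness report"
    # Footprints as bounding boxes (min_lon, min_lat, max_lon, max_lat).
    footprints: list = field(default_factory=list)
    event_id: Optional[str] = None       # distinct event membership

    def credibility(self):
        """Placeholder rule: share of the three other labels that are set."""
        present = [self.content_type is not None,
                   bool(self.footprints),
                   self.event_id is not None]
        return sum(present) / len(present)


item = LabelledItem("tweet_1", content_type="eyewitness report",
                    footprints=[(5.2, 43.2, 5.5, 43.4)])
print(item.credibility())
```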
MACHINE LEARNING FOR GEO-SOCIAL MEDIA
OUTLOOK
• Multiple human oracles annotate instances for all model classes
• Responses will modify the
• Learners
• Parameters used for the geographic analysis steps to compute
footprints and clusters.
• Resulting models indirectly encode the semantic similarity of
geographic places and concepts
• Reference to (linked) data repositories such as DBpedia and
GeoNames when possible.
ACTIVE LEARNING
OUTLOOK
• Active learners profit from domain expertise
• Passive learners suited for domain novices
• Learner chooses instances to be labelled and presents them to the
human annotator
• Maximize the impact of human annotation
• Learner remains flexible towards new instances
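How a learner might choose the instances to present can be illustrated with uncertainty sampling, one common query strategy; the slides do not say which strategy is intended. The item ids and probabilities below are hypothetical classifier outputs.

```python
def most_uncertain(probabilities, k=1):
    """Return the ids of the k items whose predicted probability of the
    positive class is closest to 0.5 (uncertainty sampling)."""
    ranked = sorted(probabilities,
                    key=lambda item_id: abs(probabilities[item_id] - 0.5))
    return ranked[:k]


# Hypothetical classifier outputs: P(item is about a forest fire).
predicted = {"tweet_a": 0.97, "tweet_b": 0.52, "tweet_c": 0.10, "photo_d": 0.41}
print(most_uncertain(predicted, k=2))  # ['tweet_b', 'photo_d']
```

Items the classifier is already sure about are skipped, so each human annotation is spent where it is most likely to change the model.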
EXAMPLE QUERIES
OUTLOOK
Toponym disambiguation:
• “Does this [item] talk about [location A] or [location B], or none, or
both?”
Spatial footprint calculation for vague geographies:
• “Is this spatial footprint for [item] correct? If not, is it too large, too
small, or wrong shape, or wrong place?”
Spatio-temporal clustering:
• “Does this [item] belong to a cluster named [event] in [location]? If
not, what’s wrong: Event, Location, or both?”
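Queries like these can be generated from templates whose bracketed slots are filled per item. The following sketch covers two of the three query types; the template keys, slot names and example values are assumptions for illustration.

```python
QUESTION_TEMPLATES = {
    "toponym": "Does this {item} talk about {location_a} or {location_b}, "
               "or none, or both?",
    "cluster": "Does this {item} belong to a cluster named {event} in "
               "{location}? If not, what's wrong: Event, Location, or both?",
}


def build_question(kind, **slots):
    """Fill the slots of the chosen template; raises KeyError if one is missing."""
    return QUESTION_TEMPLATES[kind].format(**slots)


print(build_question("toponym", item="tweet",
                     location_a="Paris, France", location_b="Paris, Texas"))
```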
HYBRID GEO-INFORMATION PROCESSING WORKFLOW
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING MODEL
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING METHODS
OUTLOOK
Key Techniques
• Decision Trees
• Naive Bayes
• Support Vector Machines
Key Technologies
• Apache Spark / Storm (Analytical geoprocessing tasks)
• Pybossa (Crowdsourced supervision)
• Cloud Computing
CHALLENGES AND OPPORTUNITIES OF GEO-SOCIAL
MEDIA
EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS
Thank you!
f.o.ostermann@utwente.nl
@f_ostermann
nl.linkedin.com/in/foost

fundamental of entomology all in one topics of entomology
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 

Processing and understanding geo-social media content

  • 1. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS Frank O. Ostermann IfGI GI-Forum, 23.06.2015
  • 2. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS • Introduction: Geo-social media APIs as sensors • Where we are: Current state-of-the-art and practical examples from disaster response • Outlook: Future research directions
  • 3. ONCE UPON A TIME… (Introduction) … there were Desktop-GIS and Shapefiles, digitized or scanned from paper maps, or from raw surveying or satellite data.
  • 4. THREE DISRUPTIVE INNOVATIONS (Introduction) • Mobile Web 2.0 • Cloud Computing • Internet of Things (in particular, sensors)
  • 5. … AND CORRESPONDING BUZZWORDS (& BUZZ-VIS) (Introduction)
  • 6. BEYOND THE BUZZ (Examples)
  • 7. THE BIG PICTURE: NEXT GENERATION DIGITAL EARTH (Introduction) • Real-time data input stream • Citizens as sensors • Multi-layered, interoperable data sets • Linked and open data • GEOSS, Eye on Earth, INSPIRE, …
  • 8. LOW-COST IN-SITU AND MOBILE SENSORS (Introduction) Publiclaboratory.com, Mikrokopter.de, Libelium Waspmote
  • 9. CITIZENS AS SENSORS (Introduction) Why not treat information from the citizens as another type of sensor data?
  • 10. WHO IS THE CROWD? (Introduction)
  • 11. WHAT DOES THE CROWD WANT? (Introduction)
  • 12. NEW SOURCES OF GEO-INFORMATION (Introduction). Typology by participation and geography, adapted from [1]: • Explicit participation, explicit geography: Volunteered Geographic Information (VGI), e.g. OpenStreetMap • Explicit participation, implicit geography: Volunteered Geographic Content (VGC), e.g. Wikipedia articles on non-geographic topics containing place names, Foursquare • Implicit participation, explicit geography: Contributed / Ambient Geographic Information (CGI/AGI), e.g. public Tweets referring to the properties of an identifiable place • Implicit participation, implicit geography: User-Generated Geographic Content (UGGC), e.g. public Flickr images containing a place name or being georeferenced
  • 13. TWITTER (Introduction) • Micro-blogging platform with a 140-character limit • Asymmetric relationships: following vs. being followed • Inflated user numbers: 100 million daily active vs. 300 million monthly active vs. 1 billion registered (number of bots high, >40% never tweeted) • Two APIs: Streaming API & Search API • Rich metadata returned • <5% with coordinates, but many more with toponyms • Huge ecosystem of third-party apps and services • Boost to data-driven research, but what about reproducibility?
  • 14. FLICKR (Introduction) • 92 million users • 1 million photos shared every day • Pioneer, then declined, then bounced back • API offers detailed search functionality • ~20% geocoded, many more with toponyms • Potentially rich source of data: title, tags, description • But: bulk uploads (and bulk tagging)
  • 15. GEO-SOCIAL MEDIA SENSORS: SO WHAT'S DIFFERENT? (Introduction) • Often in-situ • Rich, pre-processed information but varying level of quality • Uneven spatio-temporal distribution (stream) • Redundancy of content and channels (sharing) • Heterogeneous structure • Unknown source/lineage • Unclear / changing licensing, property rights, liability (e.g. OpenStreetMap) • Unknown/immeasurable precision, error, completeness • Uncertainty about the uncertainty! • How to calibrate? (Should we?)
  • 16. QUALITY OF GEO-SOCIAL MEDIA INFORMATION (Introduction). Diagram labels, adapted from [2, 3]: Source, Credibility, Relevance, Content, Location, Context; Natural Language Processing, Social Network Analysis, Geographic Contextualization
  • 17. AUTOMATIC IMAGE GEO-TAG CREATION (Introduction)
  • 18. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS. Outline repeated from slide 2.
  • 19. GEO-SOCIAL MEDIA AND CRISIS MANAGEMENT (Where we are). Social media offers… vs. crisis management needs…: • rich up-to-date information vs. up-to-date information • new paths of communication vs. redundant paths of communication • noise, uncertain lineage and accuracy vs. high-quality and reliable information. Crowd-sourced data curation faces limits of • Sustainability • Scalability
  • 20. HUMANITARIAN OPENSTREETMAP TEAM (Introduction) • Many activations, the last one after the Nepal earthquake • Three main communication channels: Tasking manager, e-mail list, IRC channel
  • 21. USHAHIDI: BEYOND CRISIS MAPPING (Introduction)
  • 22. TWITCIDENT - CROWDSENSE (Where we are)
  • 23. CRISISTRACKER (AIDR) (Where we are)
  • 24. AIDR (Where we are) http://irevolution.net/2013/10/01/aidr-artificial-intelligence-for-disaster-response/
  • 25. GEOGRAPHIC CONTEXT ANALYSIS OF VOLUNTEERED INFORMATION (GEOCONAVI) (Where we are) 1. Deploy a system for using UGC in crisis decision support on forest fires 2. Assess the added value of using UGC for forest fire response.
  • 26. FOREST FIRE CHARACTERISTICS (Where we are) • Dynamics require near real-time processing • Fewer signals, since fires often occur in sparsely populated areas • Predictability and recurrence facilitate sensor and model calibration
  • 27. GEOCONAVI FIGHTING FOREST FIRES (Where we are). Workflow diagram: • Sources: Twitter Streaming API, Flickr Search API • 1.1 Retrieval: scheduled Java code accessing APIs • 1.2 Storage: scheduled Java code writing to DBMS (Oracle DBMS) • 2.1 Topicality: scheduled PL/SQL job • 2.2 Geo-coding: a) scheduled PL/SQL job, b) scheduled Java code • 2.3 Geographic context: scheduled PL/SQL job • 2.4 Quality assessment: scheduled PL/SQL job • 3.1 Spatio-temporal clustering: scheduled Python script calling SatScan job • 3.2 Quality re-assessment: scheduled PL/SQL job • Dissemination: SMS, WFS, WMS, RSS, SES • Further inputs: EFFIS Hotspot Data, European Media Monitor Geo-coding API
  • 28. DATA COLLECTION AND STORAGE (Where we are) • Flickr API • Twitter Streaming API • Keyword-based: domain expertise, task-oriented • Scheduled scripts • Writing to Oracle DBMS
  • 29. EXAMPLE GEO-SOCIAL MEDIA (Where we are)
  • 30. EXAMPLE GEO-SOCIAL MEDIA (Where we are) • “Back at hotel. Fire skirted round village. Little evidence of significant damage. Helicopters still overhead damping scrub. Beer unaffected” • (Canada BCGovFireInfo): “Important notice from the Reg Dist of Bulkley-Nechako regarding evacuations due to wildfires in the area http://ow.ly/2sBxH” • “Are you a fireman? Cause you’re always there to extinguish the fire inside my heart.”
  • 31. GEOCONAVI FIGHTING FOREST FIRES (Where we are). Workflow diagram repeated from slide 27.
  • 32. SCORING GEO-SOCIAL MEDIA (Where we are) • Sum of weighted scores: $QS(O_j) = \sum_{i=1}^{N} w_i s_{ji}$, with $w_i$ being the weight for criterion $i$ and $s_{ji}$ being the score of geo-social media object $j$ on criterion $i$ • Topicality: keyword-based • Proximity: next concurrent reported hotspot • Land cover: Forest, no-Forest, Built-up • Population Density: Risk factor • Information clusters: Similar messages or lone signal?
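The weighted sum on slide 32 can be sketched in a few lines of Python. This is a minimal illustration, not the GEOCONAVI implementation (which the slides describe as scheduled PL/SQL jobs); the criterion names, weights and scores below are hypothetical values chosen only to show the arithmetic.

```python
from typing import Mapping


def quality_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum QS(O_j) = sum_i w_i * s_ji over all criteria in `weights`."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing criterion scores: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical weights w_i per criterion (not taken from the slides).
WEIGHTS = {
    "topicality": 2.0,
    "proximity": 1.0,
    "land_cover": 1.0,
    "population_density": 0.5,
    "cluster": 1.5,
}

# Hypothetical scores s_ji for one geo-social media object j.
item_scores = {
    "topicality": 1.0,
    "proximity": 0.5,
    "land_cover": 1.0,
    "population_density": 0.0,
    "cluster": 1.0,
}

qs = quality_score(item_scores, WEIGHTS)
```

Keeping weights and scores in separate mappings makes it easy to re-weight criteria without re-scoring the items.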
  • 33. TOPICALITY MACHINE LEARNING CLASSIFICATION (Where we are) 1. Manually annotated (Yes/No) a random sample 2. Counted keyword occurrences 3. Used Weka with 10-fold stratified cross-validation and a) decision trees, b) Naive Bayes, c) association rules 4. The J48 decision tree worked best. Confusion matrix: items on forest fire: 1196 classified as YES, 370 classified as NO; items not on forest fire: 403 classified as YES, 3712 classified as NO
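The usual summary metrics can be derived from the confusion matrix on slide 33. The sketch below assumes that the rows are the manually annotated classes and the columns the classifier output; the metrics themselves are not reported on the slide and are computed here only as an illustration.

```python
# Counts from slide 33, with "on forest fire" treated as the positive class.
TP = 1196  # on forest fire, classified as YES
FN = 370   # on forest fire, classified as NO
FP = 403   # not on forest fire, classified as YES
TN = 3712  # not on forest fire, classified as NO


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Return accuracy, precision, recall and F1 for a binary confusion matrix."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = classification_metrics(TP, FN, FP, TN)
```

Under that reading of the table, accuracy comes out at roughly 0.86, with precision and recall for the forest-fire class both around 0.75.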
  • 34. GEOCODING GEO-SOCIAL MEDIA (Where we are). Several geocoders used: • GISCO/LAU2 brute string matching • European Media Monitor algorithms • Yahoo! Placemaker (2010). Retrieved items: Twitter 2,904,065 (August 2010) and 7,996,228 (August 2011); Flickr 7,991 (August 2010) and 17,850 (August 2011). Percentage with toponym: Twitter 35% (2010) and 27% (2011); Flickr 53% (2010) and 50% (2011). Percentage with coordinates: Twitter 1.1% (2010) and 0.92% (2011); Flickr 20% (2010) and 21% (2011)
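Brute string matching against a gazetteer, the first geocoder listed on slide 34, can be sketched as follows. The two-entry gazetteer and its approximate coordinates are hypothetical stand-ins; the project used GISCO/LAU2 place names, whose loading is not shown here.

```python
import re

# Hypothetical mini-gazetteer: lower-case place name -> approximate (lat, lon).
GAZETTEER = {
    "marseille": (43.2965, 5.3698),
    "aix-en-provence": (43.5297, 5.4474),
}


def match_toponyms(text, gazetteer=GAZETTEER):
    """Return (name, coordinates) for every gazetteer name found as a whole word."""
    lowered = text.lower()
    hits = []
    for name, coords in gazetteer.items():
        # Look-arounds stop a name from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.append((name, coords))
    return hits


hits = match_toponyms("Huge fire near Marseille tonight")
```

Plain matching like this cannot resolve ambiguous place names, which is one reason the slide lists several geocoders side by side.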
  • 35. GEOCONAVI FIGHTING FOREST FIRES (Where we are). Workflow diagram repeated from slide 27.
  • 36. SPATIO-TEMPORAL CLUSTERING (Where we are) • SatScan external software • Scheduled Python script: 1. Reads new geo-social media from database 2. Converts it to SatScan input format 3. Calls SatScan from the command line with appropriate parameters 4. Waits for SatScan to complete analysis 5. Reads SatScan output 6. Stores relevant information in database
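Steps 2 to 4 of the script on slide 36 might look roughly like the sketch below. It is not the original script: the three-column input layout, the file name, the executable name `satscan_batch` and the parameter file `job.prm` are all assumptions made for illustration, and the database steps (1, 5 and 6) are left out.

```python
import subprocess
import tempfile
import types
from pathlib import Path


def write_case_file(items, path):
    """Step 2: write (location_id, case_count, date) rows as space-separated lines.

    The three-column layout is an assumption for this sketch; check the
    clustering tool's own input specification before relying on it.
    """
    lines = [f"{loc} {count} {date}" for loc, count, date in items]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def run_clustering(command, runner=subprocess.run):
    """Steps 3 and 4: call the external program and block until it exits."""
    result = runner(command, check=True)
    return result.returncode


# Demo with a stand-in runner so the sketch runs without the external software.
def fake_runner(command, check):
    return types.SimpleNamespace(returncode=0)


with tempfile.TemporaryDirectory() as tmp:
    case_path = Path(tmp) / "items.cas"
    n_written = write_case_file([("tweet_1", 1, "2011/08/01")], case_path)
    written_text = case_path.read_text(encoding="utf-8")
    # "satscan_batch" and "job.prm" are hypothetical names, not verified.
    exit_code = run_clustering(["satscan_batch", "job.prm"], runner=fake_runner)
```

With the default `runner`, `subprocess.run(..., check=True)` blocks until the external process exits and raises if it returns a non-zero code, which covers the "waits for completion" step.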
  • 37. SPATIO-TEMPORAL CLUSTERING PARAMETERS (Where we are) • Type of clustering algorithm • Spatial location of clusters based on grid/locations or not • Type of spatial overlap of clusters • Maximum spatial cluster size • Maximum temporal cluster size • Used in 2011: Discrete Poisson adjusting for population, no grid, no overlap, max radius 50 km, max temporal extent 10% of study period (9 days)
  • 38. GEOCONAVI FIGHTING FOREST FIRES (Where we are). Workflow diagram repeated from slide 27.
  • 39. VISUALIZATION AND SHARING (Where we are)
  • 40. FOREST FIRES IN FRANCE 2011 (Where we are)
  • 41. FOREST FIRES IN FRANCE BY GEOCONAVI (Where we are)
  • 42. FRENCH FOREST FIRE SOCIAL MEDIA (Where we are) (1) Containing French keywords: 659,676 Tweets and 39,016 Flickr images (2) Machine-learned relevance filter: 25,684 items left (3) Geocoded and context enriched: 5,770 items left (4) Clustered in space and time: 129 clusters with 2,682 items (5) Second relevance filter: 11 clusters left with 469 items
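How sharply each step on slide 42 narrows the data becomes clearer when the counts are expressed as shares of the initial retrieval. The short calculation below uses only the item counts from the slide; the stage labels are paraphrased.

```python
# Item counts per processing stage, taken from slide 42.
STAGES = [
    ("keyword match (Tweets + Flickr images)", 659_676 + 39_016),
    ("machine-learned relevance filter", 25_684),
    ("geocoded and context enriched", 5_770),
    ("clustered in space and time", 2_682),
    ("second relevance filter", 469),
]


def retention(stages):
    """Return (name, count, share of the first stage) for every stage."""
    first = stages[0][1]
    return [(name, count, count / first) for name, count in stages]


funnel = retention(STAGES)
for name, count, share in funnel:
    print(f"{name}: {count} items ({share:.3%} of retrieved)")
```

The final 469 items are well under one in a thousand of the items retrieved by keyword.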
  • 43. GEOCONAVI RESULTS (Where we are) • Simple keyword queries suffice • Additional geo-coding indispensable • Topicality and context filtering plus spatio-temporal clustering crucial • Able to detect fires from Tweets and Flickr images by spatio-temporal clustering • Relevance, credibility and overall quality vary greatly, thus more rules and human assessment needed
  • 44. SEMANTICS OF PLACES ACROSS GEO-SOCIAL MEDIA (Where we are) • Theory-guided research and local case study: How do people see and understand the places they frequent? What is different across media sources? • More than one (volunteered) data source • Identification of places and their semantics • Comparison of places between data sources • Comparison of places with geographic features and authoritative data sources
  • 45. SEMANTICS OF PLACES - IMPLEMENTATION (Where we are) • Shatford-Panofsky and Agnew • Greater London Area • From Twitter to Flickr • Data Mining (Spatio-temporal clustering) -> Semantic Analysis (Cosine Similarity, …) • Geo-demographic data
  • 46. COSINE SIMILARITY NEAREST NEIGHBORS (Where we are)
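The nearest-neighbor comparison named on slide 46 can be illustrated with term-count vectors. This is a generic sketch of cosine similarity, not the analysis behind the slide; the place names and term counts are invented for the example.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(target: Counter, candidates: dict, k: int = 1) -> list:
    """Return the k candidate names most similar to `target`."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine_similarity(target, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical term counts for one Twitter place cluster and two Flickr clusters.
twitter_place = Counter({"thames": 2, "bridge": 1})
flickr_places = {
    "tower bridge": Counter({"bridge": 3, "thames": 1}),
    "camden": Counter({"market": 2}),
}
best = nearest_neighbors(twitter_place, flickr_places, k=1)
```

Because cosine similarity normalizes by vector length, a place with many posts and a place with few can still be compared on their vocabulary alone.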
  • 47. CORRELATION DISTANCE & SIMILARITY (Where we are)
  • 48. CORRELATION DISTANCE & SIMILARITY (Where we are)
  • 49. CORRELATION DISTANCE & SIMILARITY (Where we are)
  • 50. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS. Outline repeated from slide 2.
  • 51. UNSOLVED PROBLEMS FROM FRENCH CASE STUDY (Where we are). Relevant datasets for contextualization: • Choice • Integration. Settings for data mining and machine learning: • Method • Parameters. Further diagram labels: Geospatial Semantic Web, Multi-Sensory Integration, Crowdsourced Supervision
  • 52. INTEGRATING GEO-SOCIAL MEDIA (Outlook)
  • 53. INTEGRATING GEO-SOCIAL MEDIA (Outlook)
  • 54. HYBRID GEO-INFORMATION PROCESSING (Outlook). Time-consuming and resource-intensive: • Manual annotation and experiments for topicality filtering • Parameterization of spatio-temporal clustering. Other challenges: • Dependency on data quality • Overfitting • Diversity of contexts and tasks • Near real-time. Further diagram label: Crowdsourced Supervision
  • 55. GEOCONAVI FIGHTING FOREST FIRES (Outlook). Workflow diagram repeated from slide 27.
  • 56. HYBRID GEO-INFORMATION PROCESSING (Outlook). Developing hybrid quality assurance mechanisms for near real-time geo-information streams: • Link the characteristics of geographic information with machine learning class labelling and regression • Provide a multi-modal interface to let human oracles simultaneously label instances • Translate the learner models into nomothetic principles on geographic semantics
  • 57. MACHINE LEARNING FOR GEO-SOCIAL MEDIA (Outlook). Every data instance needs multi-class labelling: • Content type • Geographic footprints of locations and/or events • Distinct event membership • Credibility based on a combination of the other class labels. Learners have to deal with characteristics of geographic information: • Spatial autocorrelation • Vague boundaries and class memberships • Uncontrolled variance
  • 58. MACHINE LEARNING FOR GEO-SOCIAL MEDIA (Outlook) • Multiple human oracles annotate instances for all model classes • Responses will modify the learners and the parameters used for the geographic analysis steps to compute footprints and clusters • Resulting models indirectly encode the semantic similarity of geographic places and concepts • Reference to (linked) data repositories such as DBpedia and GeoNames when possible
  • 59. ACTIVE LEARNING (Outlook) • Active learners profit from domain expertise • Passive learners suited for domain novices • Learner chooses instances to be labelled and presents them to the human annotator • Maximize the impact of human annotation • Learner remains flexible towards new instances
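One common way for a learner to choose which instances to present to the annotator is uncertainty sampling: ask about the items whose predicted class probability is closest to 0.5. The slides do not say which query strategy is intended, so the sketch below is just one possible instance of the idea on slide 59, with hypothetical item IDs and probabilities.

```python
def select_queries(probabilities: dict, k: int = 2) -> list:
    """Return the k item IDs whose predicted probability is closest to 0.5.

    `probabilities` maps an item ID to the learner's estimate that the item
    is relevant; values near 0.5 are the ones the learner is least sure about.
    """
    ranked = sorted(probabilities, key=lambda item: abs(probabilities[item] - 0.5))
    return ranked[:k]


# Hypothetical predictions for four unlabelled items.
predicted = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.40}
to_label = select_queries(predicted, k=2)
```

Sending only the least certain items to the human annotator is what "maximize the impact of human annotation" amounts to under this strategy.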
  • 60. EXAMPLE QUERIES (Outlook). Toponym disambiguation: • “Does this [item] talk about [location A] or [location B], or none, or both?” Spatial footprint calculation for vague geographies: • “Is this spatial footprint for [item] correct? If not, is it too large, too small, or wrong shape, or wrong place?” Spatio-temporal clustering: • “Does this [item] belong to a cluster named [event] in [location]? If not, what’s wrong: Event, Location, or both?”
  • 61. HYBRID GEO-INFORMATION PROCESSING WORKFLOW (Outlook)
  • 62. HYBRID GEO-INFORMATION PROCESSING MODEL (Outlook)
  • 63. HYBRID GEO-INFORMATION PROCESSING METHODS (Outlook). Key Techniques: • Decision Trees • Naive Bayes • Support Vector Machines. Key Technologies: • Apache Spark / Storm (Analytical geoprocessing tasks) • Pybossa (Crowdsourced supervision) • Cloud Computing
  • 64. CHALLENGES AND OPPORTUNITIES OF GEO-SOCIAL MEDIA: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS. Thank you! f.o.ostermann@utwente.nl @f_ostermann nl.linkedin.com/in/foost