PROCESSING AND UNDERSTANDING GEO-SOCIAL
MEDIA CONTENT
EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS
Frank O. Ostermann
IfGI GI-Forum, 23.06.2015
• Introduction: Geo-social media APIs as sensors
• Where we are: Current state-of-the-art and practical examples from disaster response
• Outlook: Future research directions
23.06.2015F.O.Ostermann - ifgi GI-Forum 2
ONCE UPON A TIME…
INTRODUCTION
… there were Desktop-GIS and Shapefiles,
digitized or scanned from paper maps,
or from raw surveying or satellite data.
THREE DISRUPTIVE INNOVATIONS
INTRODUCTION
• Mobile Web 2.0
• Cloud Computing
• Internet of Things (in particular, sensors)
… AND CORRESPONDING BUZZWORDS (& BUZZ-VIS)
INTRODUCTION
BEYOND THE BUZZ
EXAMPLES
THE BIG PICTURE: NEXT GENERATION DIGITAL EARTH
INTRODUCTION
• Real-time data input stream
• Citizens as sensors
• Multi-layered, interoperable data sets
• Linked and open data
• GEOSS, Eye on Earth, INSPIRE, …
LOW-COST IN-SITU AND MOBILE SENSORS
INTRODUCTION
Publiclaboratory.com
Mikrokopter.de
Libelium Waspmote
CITIZENS AS SENSORS
INTRODUCTION
Why not treat information from the citizens
as another type of sensor data?
WHO IS THE CROWD?
INTRODUCTION
WHAT DOES THE CROWD WANT?
INTRODUCTION
NEW SOURCES OF GEO-INFORMATION
INTRODUCTION
Participation explicit, geography explicit: Volunteered Geographic Information (VGI), e.g. OpenStreetMap
Participation explicit, geography implicit: Volunteered Geographic Content (VGC), e.g. Wikipedia articles on non-geographic topics containing place names, Foursquare
Participation implicit, geography explicit: Contributed / Ambient Geographic Information (CGI/AGI), e.g. public Tweets referring to the properties of an identifiable place
Participation implicit, geography implicit: User-Generated Geographic Content (UGGC), e.g. public Flickr images containing a place name or being georeferenced
Adopted from [1]
TWITTER
INTRODUCTION
• 140-character micro-blogging platform
• Asymmetric: following vs. being followed
• Inflated user numbers:
• 100 million daily active vs.
• 300 million monthly active vs.
• 1 billion registered (number of bots high, >40% never tweeted)
• Two APIs: Streaming API & Search API
• Rich metadata returned
• <5% with coordinates, but many more with toponyms
• Huge ecosystem of third-party apps and services
• Boost to data-driven research, but what about reproducibility?
FLICKR
INTRODUCTION
• 92 million users
• 1 million photos shared every day
• Pioneer, then declined, then bounced back
• API offers detailed search functionality
• ~20% geocoded, many more with toponyms
• Potentially rich source of data:
• Title
• Tags
• Description
• But: Bulk uploads (and tagging)
GEO-SOCIAL MEDIA SENSORS – SO WHAT'S DIFFERENT?
INTRODUCTION
• Often In-situ
• Rich, pre-processed information but varying level of quality
• Uneven spatio-temporal distribution (stream)
• Redundancy of content and channels (sharing)
• Heterogeneous structure
• Unknown source/lineage
• Unclear / changing licensing, property rights, liability (e.g. OpenStreetMap)
• Unknown/Immeasurable precision, error, completeness
• Uncertainty about the uncertainty!
• How to calibrate? (Should we?)
QUALITY OF GEO-SOCIAL MEDIA INFORMATION
INTRODUCTION
Adopted from [2, 3]
Source
Credibility
Relevance
Content
Location
Context
Natural Language Processing
Social Network Analysis
Geographic Contextualization
AUTOMATIC IMAGE GEO-TAG CREATION
INTRODUCTION
GEO-SOCIAL MEDIA AND CRISIS MANAGEMENT
WHERE WE ARE
Social media offers…                  | Crisis management needs…
rich up-to-date information           | up-to-date information
new paths of communication            | redundant paths of communication
noise, uncertain lineage and accuracy | high-quality and reliable information
Crowd-sourced data curation faces limits of
• Sustainability
• Scalability
HUMANITARIAN OPENSTREETMAP TEAM
INTRODUCTION
• Many activations, last one after Nepal earthquake
• Three main communication channels:
• Tasking manager
• E-Mail list
• IRC channel
USHAHIDI – BEYOND CRISIS MAPPING
INTRODUCTION
TWITCIDENT - CROWDSENSE
WHERE WE ARE
CRISISTRACKER (AIDR)
WHERE WE ARE
AIDR
WHERE WE ARE
http://irevolution.net/2013/10/01/aidr-artificial-intelligence-for-disaster-response/
GEOGRAPHIC CONTEXT ANALYSIS OF VOLUNTEERED
INFORMATION (GEOCONAVI)
WHERE WE ARE
1. Deploy a system for using user-generated content (UGC) in crisis decision support on forest fires
2. Assess the added value of using UGC for forest fire response.
FOREST FIRE CHARACTERISTICS
WHERE WE ARE
• Dynamics require near real-time processing
• Fewer signals, since fires often occur in sparsely populated areas
• Predictability and recurrence facilitate sensor and model calibration
GEOCONAVI FIGHTING FOREST FIRES
WHERE WE ARE
Input: Twitter Streaming API, Flickr Search API
1.1 Retrieval: Scheduled Java code accessing APIs
1.2 Storage: Scheduled Java code writing to DBMS (Oracle DBMS)
2.1 Topicality: Scheduled PLSQL job
2.2 Geo-Coding: a) Scheduled PLSQL job, b) Scheduled Java code
2.3 Geographic context: Scheduled PLSQL job
2.4 Quality Assessment: Scheduled PLSQL job
3.1 Spatio-temporal clustering: Scheduled Python script calling SatScan job
3.2 Quality Re-Assessment: Scheduled PLSQL job
External data and services: EFFIS Hotspot Data, European Media Monitor Geo-coding API
Dissemination: SMS, WFS, WMS, RSS, SES
DATA COLLECTION AND STORAGE
WHERE WE ARE
• Flickr API
• Twitter Streaming API
• Keyword-based:
• Domain expertise
• Task-oriented
• Scheduled scripts
• Writing to Oracle DBMS
EXAMPLE GEO-SOCIAL MEDIA
WHERE WE ARE
“Back at hotel. Fire skirted round village. Little evidence of significant damage. Helicopters still overhead damping scrub. Beer unaffected”
(Canada BCGovFireInfo): “Important notice from the Reg Dist of Bulkley-Nechako regarding evacuations due to wildfires in the area http://ow.ly/2sBxH”
“Are you a fireman? Cause you’re always there to extinguish the fire inside my heart.”
SCORING GEO-SOCIAL MEDIA
WHERE WE ARE
• Sum of weighted scores: $QS(O_j) = \sum_{i=1}^{N} w_i s_{ji}$
• with $w_i$ being the weight for criterion $i$, and $s_{ji}$ being the score of geo-social media object $j$ on criterion $i$
• Topicality: keyword-based
• Proximity: nearest concurrently reported hotspot
• Land cover: forest, non-forest, built-up
• Population density: risk factor
• Information clusters: similar messages or lone signal?
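The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the criterion names follow the bullets, but the weights and scores are hypothetical, since the slides do not state the values that were actually used.

```python
from typing import Mapping


def quality_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Compute QS(O_j) = sum_i w_i * s_ji for one geo-social media object."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing criterion scores: {sorted(missing)}")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Hypothetical weights w_i, one per criterion (not taken from the slides).
WEIGHTS = {
    "topicality": 2.0,
    "proximity": 1.5,
    "land_cover": 1.0,
    "population_density": 0.5,
    "information_cluster": 1.0,
}

# Hypothetical scores s_ji for a single item j.
item_scores = {
    "topicality": 1.0,
    "proximity": 0.5,
    "land_cover": 1.0,
    "population_density": 0.0,
    "information_cluster": 1.0,
}

score = quality_score(item_scores, WEIGHTS)
```

Keeping the weights in a separate mapping makes it easy to re-weight criteria per task without touching the scoring code.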
TOPICALITY MACHINE LEARNING CLASSIFICATION
WHERE WE ARE
1. Manually annotated (Yes/No) a random sample
2. Counted keyword occurrences
3. Used Weka with 10-fold stratified cross-validation to compare
a) Decision trees
b) Naive Bayes
c) Association rules
4. The J48 decision tree worked best
                   | Classified as YES | Classified as NO
On Forest Fire     | 1196              | 370
Not on Forest Fire | 403               | 3712
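To make the confusion matrix easier to interpret, the usual summary metrics can be derived from its four cells. The sketch below assumes that rows are the manual annotation and columns the classifier output, so that 1196 are true positives and 403 false positives; the slide does not report these derived figures itself.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Derive accuracy, precision, recall and F1 from a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Cell values copied from the confusion matrix on the slide.
metrics = binary_metrics(tp=1196, fn=370, fp=403, tn=3712)
```

Under this reading, accuracy is roughly 0.86, while precision (about 0.75) and recall (about 0.76) for the forest fire class are noticeably lower, which reflects the class imbalance in the sample.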
GEOCODING GEO-SOCIAL MEDIA
WHERE WE ARE
Several Geocoders used:
• GISCO/LAU2 brute string matching
• European Media Monitor algorithms
• Yahoo! Placemaker (2010)
                            | Twitter, August 2010 | Twitter, August 2011 | Flickr, August 2010 | Flickr, August 2011
Retrieved items             | 2,904,065            | 7,996,228            | 7,991               | 17,850
Percentage with toponym     | 35%                  | 27%                  | 53%                 | 50%
Percentage with coordinates | 1.1%                 | 0.92%                | 20%                 | 21%
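A simple form of string matching against a gazetteer, in the spirit of the first geocoder listed above, might look like the following sketch. The gazetteer entries and their approximate coordinates are illustrative placeholders, not the GISCO/LAU2 data, and the matching rule (case-insensitive whole-word lookup) is an assumption.

```python
import re
from typing import Dict, List, Tuple

Gazetteer = Dict[str, Tuple[float, float]]  # place name -> (latitude, longitude)


def match_toponyms(text: str, gazetteer: Gazetteer) -> List[Tuple[str, float, float]]:
    """Return every gazetteer entry whose name occurs as a whole word in the text."""
    lowered = text.lower()
    hits = []
    for name, (lat, lon) in gazetteer.items():
        # Word-boundary guards prevent matches inside longer words.
        pattern = r"(?<!\w)" + re.escape(name.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.append((name, lat, lon))
    return hits


# Illustrative gazetteer with approximate coordinates (hypothetical sample).
GAZETTEER: Gazetteer = {"Marseille": (43.30, 5.37), "Nice": (43.70, 7.27)}

hits = match_toponyms("Huge fire near Marseille tonight", GAZETTEER)
false_hit = match_toponyms("What a nice day", GAZETTEER)
```

The second call matches the city of Nice in a sentence that has nothing to do with it, which illustrates why plain string matching needs additional context filtering or disambiguation.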
SPATIO-TEMPORAL CLUSTERING
WHERE WE ARE
• SatScan external software
• Scheduled Python script
1. Reads new geo-social media from database
2. Converts it to SatScan input format
3. Calls SatScan from the command line with appropriate parameters
4. Waits for SatScan to complete analysis
5. Reads SatScan output
6. Stores relevant information in database
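Steps 2 to 4 of the script can be sketched as below. This is a minimal sketch under stated assumptions: the whitespace-separated file layouts (one case line and one coordinate line per item) and the date format are assumed, the database steps are left out, and the command that starts the clustering software is passed in by the caller because its exact invocation is not given on the slide. A harmless stand-in command is used so the sketch runs without SatScan installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence, Tuple

# (item id, latitude, longitude, date as yyyy/mm/dd); this layout is an assumption.
Item = Tuple[str, float, float, str]


def write_inputs(items: Sequence[Item], case_path: Path, geo_path: Path) -> None:
    """Step 2: write one case line and one coordinate line per item."""
    case_lines = [f"{item_id} 1 {date}" for item_id, _lat, _lon, date in items]
    geo_lines = [f"{item_id} {lat} {lon}" for item_id, lat, lon, _date in items]
    case_path.write_text("\n".join(case_lines) + "\n", encoding="utf-8")
    geo_path.write_text("\n".join(geo_lines) + "\n", encoding="utf-8")


def run_external_tool(command: Sequence[str]) -> int:
    """Steps 3 and 4: start the external program and block until it exits."""
    completed = subprocess.run(list(command), check=True)
    return completed.returncode


with tempfile.TemporaryDirectory() as workdir:
    case_file = Path(workdir) / "items.cas"
    geo_file = Path(workdir) / "items.geo"
    write_inputs([("t1", 43.3, 5.37, "2011/08/14")], case_file, geo_file)
    case_text = case_file.read_text(encoding="utf-8")
    geo_text = geo_file.read_text(encoding="utf-8")
    # Stand-in for the real clustering command, which would be configured here.
    exit_code = run_external_tool([sys.executable, "-c", "pass"])
```

Using `check=True` makes a failed run raise an exception, so a scheduled job does not silently store results from an incomplete analysis.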
SPATIO-TEMPORAL CLUSTERING PARAMETERS
WHERE WE ARE
• Type of clustering algorithm
• Spatial location of clusters based on grid/locations or not
• Type of spatial overlap of clusters
• Maximum spatial cluster size
• Maximum temporal cluster size
• Used in 2011: Discrete Poisson adjusting for population, no grid, no overlap, max radius 50 km, max temporal extent 10% of study period (9 days)
VISUALIZATION AND SHARING
WHERE WE ARE
FOREST FIRES IN FRANCE 2011
WHERE WE ARE
FOREST FIRES IN FRANCE BY GEOCONAVI
WHERE WE ARE
FRENCH FOREST FIRE SOCIAL MEDIA
WHERE WE ARE
(1) Containing French keywords: 659,676 Tweets and 39,016 Flickr images
(2) Machine-learned relevance filter: 25,684 items left
(3) Geocoded and context enriched: 5,770 items left
(4) Clustered in space and time: 129 clusters with 2,682 items
(5) Second relevance filter: 11 clusters left with 469 items
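The size of each reduction step becomes clearer when the item counts are expressed as retention rates. The sketch below uses only the counts from the slide; adding Tweets and Flickr images into one starting total, and treating item counts as comparable across stages, are assumptions.

```python
from typing import List, Tuple

# Item counts per processing stage, taken from the slide.
STAGES: List[Tuple[str, int]] = [
    ("keyword match", 659_676 + 39_016),
    ("relevance filter", 25_684),
    ("geocoded and context enriched", 5_770),
    ("clustered in space and time", 2_682),
    ("second relevance filter", 469),
]


def retention(stages: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Share of items kept at each stage, relative to the preceding stage."""
    return [
        (name, count / previous)
        for (_, previous), (name, count) in zip(stages, stages[1:])
    ]


rates = retention(STAGES)
overall = STAGES[-1][1] / STAGES[0][1]
```

With these numbers, fewer than 0.1% of the keyword-matching items end up in the final clusters, and the first relevance filter accounts for by far the largest reduction.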
GEOCONAVI RESULTS
WHERE WE ARE
• Simple keyword queries suffice
• Additional Geo-coding indispensable
• Topicality and context filtering plus spatio-temporal clustering crucial
• Able to detect fires from Tweets and Flickr images by spatio-temporal
clustering
• Relevance, credibility and overall quality vary greatly, thus more rules
and human assessment needed
SEMANTICS OF PLACES ACROSS GEO-SOCIAL MEDIA
WHERE WE ARE
• Theory-guided research and local case study:
• How do people see and understand the places they frequent?
• What is different across media sources?
• More than one (volunteered) data source
• Identification of places and their semantics
• Comparison of places between data sources
• Comparison of places with geographic features and authoritative data sources
SEMANTICS OF PLACES - IMPLEMENTATION
WHERE WE ARE
• Shatford-Panofsky and Agnew
• Greater London Area
• From Twitter to Flickr
• Data Mining (Spatio-temporal clustering) -> Semantic Analysis (Cosine Similarity, …)
• Geo-demographic data
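Cosine similarity, named above as one of the semantic analysis measures, compares two places by the angle between their term-frequency vectors. The following sketch assumes a simple bag-of-words representation; the term lists for the two sources are made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical terms describing the same place in two sources.
twitter_terms = Counter("market food coffee market".split())
flickr_terms = Counter("market bridge food".split())

similarity = cosine_similarity(twitter_terms, flickr_terms)
```

Because the measure normalizes by vector length, a place with many Tweets and few Flickr images can still be compared fairly across the two sources.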
COSINE SIMILARITY NEAREST NEIGHBORS
WHERE WE ARE
CORRELATION DISTANCE & SIMILARITY
WHERE WE ARE
UNSOLVED PROBLEMS FROM FRENCH CASE STUDY
WHERE WE ARE
Relevant datasets for contextualization
• Choice
• Integration
Settings for data mining and machine learning
• Method
• Parameters
Geospatial Semantic Web
Multi-Sensory Integration
Crowdsourced Supervision
INTEGRATING GEO-SOCIAL MEDIA
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING
OUTLOOK
Time-consuming and resource-intensive
• Manual annotation and experiments for topicality filtering
• Parameterization of spatio-temporal clustering
Other challenges:
• Dependency on data quality
• Overfitting
• Diversity of contexts and tasks
• Near real-time
Crowdsourced Supervision
HYBRID GEO-INFORMATION PROCESSING
OUTLOOK
Developing hybrid quality assurance mechanisms for near real-time geo-information streams
• Link the characteristics of geographic information with machine
learning class labelling and regression
• Provide a multi-modal interface to let human oracles simultaneously
label instances
• Translate the learner models into nomothetic principles on
geographic semantics
MACHINE LEARNING FOR GEO-SOCIAL MEDIA
OUTLOOK
Every data instance needs multi-class labelling:
• Content type
• Geographic footprints of locations and/or events
• Distinct event membership
• Credibility based on a combination of the other class labels
Learners have to deal with characteristics of geographic information:
• Spatial autocorrelation
• Vague boundaries and class memberships
• Uncontrolled variance
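One way to picture the multi-class labelling is a record that holds all four labels for a single data instance. This is a hypothetical data structure: the field names and types are assumptions for illustration, and credibility is left unset until the other labels exist, since it is described as a combination of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class InstanceLabels:
    """Labels attached to one geo-social media item (hypothetical layout)."""

    content_type: Optional[str] = None
    # Geographic footprint as polygon vertices (longitude, latitude).
    footprint: List[Tuple[float, float]] = field(default_factory=list)
    event_id: Optional[str] = None
    # Derived from the other labels, so it stays None until they are present.
    credibility: Optional[float] = None

    def missing(self) -> List[str]:
        """Names of the primary labels that still need annotation."""
        gaps = []
        if self.content_type is None:
            gaps.append("content_type")
        if not self.footprint:
            gaps.append("footprint")
        if self.event_id is None:
            gaps.append("event_id")
        return gaps


labels = InstanceLabels(content_type="eyewitness report")
todo = labels.missing()
```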
• Multiple human oracles annotate instances for all model classes
• Responses will modify the
• Learners
• Parameters used for the geographic analysis steps to compute
footprints and clusters.
• Resulting models indirectly encode the semantic similarity of
geographic places and concepts
• Reference to (linked) data repositories such as DBpedia and
GeoNames when possible.
ACTIVE LEARNING
OUTLOOK
• Active learners profit from domain expertise
• Passive learners suited for domain novices
• Learner chooses instances to be labelled and presents them to the
human annotator
• Maximize the impact of human annotation
• Learner remains flexible towards new instances
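The idea that the learner chooses which instances to present can be illustrated with uncertainty sampling, one common active learning strategy. It is used here as an example only; the slides do not say which query strategy is intended, and the probabilities below are made-up stand-ins for a real classifier's output.

```python
from typing import Callable, List, Sequence


def select_for_annotation(
    pool: Sequence[str],
    predict_proba: Callable[[str], float],
    batch_size: int = 1,
) -> List[str]:
    """Pick the items whose predicted relevance probability is closest to 0.5."""
    ranked = sorted(pool, key=lambda item: abs(predict_proba(item) - 0.5))
    return ranked[:batch_size]


# Hypothetical relevance probabilities from a classifier trained so far.
TOY_PROBABILITIES = {
    "Forest fire near the village, helicopters overhead": 0.95,
    "You set my heart on fire": 0.10,
    "Fire season update": 0.55,
}

queue = select_for_annotation(
    list(TOY_PROBABILITIES), TOY_PROBABILITIES.__getitem__, batch_size=2
)
```

Items the model is already confident about are never shown to the annotator, which is how this strategy tries to maximize the impact of each human label.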
EXAMPLE QUERIES
OUTLOOK
Toponym disambiguation:
• “Does this [item] talk about [location A] or [location B], or neither, or both?”
Spatial footprint calculation for vague geographies:
• “Is this spatial footprint for [item] correct? If not, is it too large, too
small, or wrong shape, or wrong place?”
Spatio-temporal clustering:
• “Does this [item] belong to a cluster named [event] in [location]? If
not, what’s wrong: Event, Location, or both?”
HYBRID GEO-INFORMATION PROCESSING WORKFLOW
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING MODEL
OUTLOOK
HYBRID GEO-INFORMATION PROCESSING METHODS
OUTLOOK
Key Techniques
• Decision Trees
• Naive Bayes
• Support Vector Machines
Key Technologies
• Apache Spark / Storm (Analytical geoprocessing tasks)
• Pybossa (Crowdsourced supervision)
• Cloud Computing
CHALLENGES AND OPPORTUNITIES OF GEO-SOCIAL
MEDIA
EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS
Thank you!
f.o.ostermann@utwente.nl
@f_ostermann
nl.linkedin.com/in/foost

Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
S.1 chemistry scheme term 2 for ordinary level
S.1 chemistry scheme term 2 for ordinary levelS.1 chemistry scheme term 2 for ordinary level
S.1 chemistry scheme term 2 for ordinary level
 

Processing and understanding geo-social media content

  • 1. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS Frank O. Ostermann IfGI GI-Forum, 23.06.2015
  • 2. PROCESSING AND UNDERSTANDING GEO-SOCIAL MEDIA CONTENT: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS (23.06.2015, F.O. Ostermann, ifgi GI-Forum)
      • Introduction: Geo-social media APIs as sensors
      • Where we are: Current state-of-the-art and practical examples from disaster response
      • Outlook: Future research directions
  • 3. INTRODUCTION: ONCE UPON A TIME… there were Desktop-GIS and Shapefiles, digitized or scanned from paper maps, or from raw surveying or satellite data.
  • 4. INTRODUCTION: THREE DISRUPTIVE INNOVATIONS
      • Mobile Web 2.0
      • Cloud Computing
      • Internet of Things (in particular, sensors)
  • 5. INTRODUCTION: … AND CORRESPONDING BUZZWORDS (& BUZZ-VIS)
  • 6. EXAMPLES: BEYOND THE BUZZ
  • 7. INTRODUCTION: THE BIG PICTURE: NEXT GENERATION DIGITAL EARTH
      • Real-time data input stream
      • Citizens as sensors
      • Multi-layered, interoperable data sets
      • Linked and open data
      • GEOSS, Eye on Earth, INSPIRE, …
  • 8. INTRODUCTION: LOW-COST IN-SITU AND MOBILE SENSORS
      • Examples: Publiclaboratory.com, Mikrokopter.de, Libelium Waspmote
  • 9. INTRODUCTION: CITIZENS AS SENSORS
      • Why not treat information from the citizens as another type of sensor data?
  • 10. INTRODUCTION: WHO IS THE CROWD?
  • 11. INTRODUCTION: WHAT DOES THE CROWD WANT?
  • 12. INTRODUCTION: NEW SOURCES OF GEO-INFORMATION (adapted from [1])
      • Explicit participation, explicit geography: Volunteered Geographic Information (VGI), e.g. OpenStreetMap
      • Explicit participation, implicit geography: Volunteered Geographic Content (VGC), e.g. Wikipedia articles on non-geographic topics containing place names, Foursquare
      • Implicit participation, explicit geography: Contributed / Ambient Geographic Information (CGI/AGI), e.g. public Tweets referring to the properties of an identifiable place
      • Implicit participation, implicit geography: User-Generated Geographic Content (UGGC), e.g. public Flickr images containing a place name or being georeferenced
  • 13. INTRODUCTION: TWITTER
      • 140-character micro-blogging platform
      • Asymmetric following – being followed
      • Inflated user numbers: 100 million daily active vs. 300 million monthly active vs. 1 billion registered (number of bots high, >40% never tweeted)
      • Two APIs: Streaming API & Search API
      • Rich metadata returned
      • <5% with coordinates, but many more with toponyms
      • Huge ecosystem of third-party apps and services
      • Boost to data-driven research, but what about reproducibility?
  • 14. INTRODUCTION: FLICKR
      • 92 million users
      • 1 million photos shared every day
      • Pioneer, then declined, then bounced back
      • API offers detailed search functionality
      • ~20% geocoded, many more with toponyms
      • Potentially rich source of data: title, tags, description
      • But: bulk uploads (and bulk tagging)
  • 15. INTRODUCTION: GEO-SOCIAL MEDIA SENSORS – SO WHAT'S DIFFERENT?
      • Often in-situ
      • Rich, pre-processed information, but of varying quality
      • Uneven spatio-temporal distribution (stream)
      • Redundancy of content and channels (sharing)
      • Heterogeneous structure
      • Unknown source/lineage
      • Unclear / changing licensing, property rights, liability (e.g. OpenStreetMap)
      • Unknown/immeasurable precision, error, completeness
      • Uncertainty about the uncertainty!
      • How to calibrate? (Should we?)
  • 16. INTRODUCTION: QUALITY OF GEO-SOCIAL MEDIA INFORMATION (adapted from [2, 3])
      • Diagram labels: Source, Credibility, Relevance, Content, Location, Context
      • Methods named: Natural Language Processing, Social Network Analysis, Geographic Contextualization
  • 17. INTRODUCTION: AUTOMATIC IMAGE GEO-TAG CREATION
  • 18. OUTLINE
      • Introduction: Geo-social media APIs as sensors
      • Where we are: Current state-of-the-art and practical examples from disaster response
      • Outlook: Future research directions
  • 19. WHERE WE ARE: GEO-SOCIAL MEDIA AND CRISIS MANAGEMENT
      • Social media offers rich up-to-date information; crisis management needs up-to-date information
      • Social media offers new paths of communication; crisis management needs redundant paths of communication
      • Social media offers noise, uncertain lineage and accuracy; crisis management needs high-quality and reliable information
      • Crowd-sourced data curation faces limits of sustainability and scalability
  • 20. INTRODUCTION: HUMANITARIAN OPENSTREETMAP TEAM
      • Many activations, last one after Nepal earthquake
      • Three main communication channels: tasking manager, e-mail list, IRC channel
  • 21. INTRODUCTION: USHAHIDI – BEYOND CRISIS MAPPING
  • 22. WHERE WE ARE: TWITCIDENT - CROWDSENSE
  • 23. WHERE WE ARE: CRISISTRACKER (AIDR)
  • 24. WHERE WE ARE: AIDR
      • http://irevolution.net/2013/10/01/aidr-artificial-intelligence-for-disaster-response/
  • 25. WHERE WE ARE: GEOGRAPHIC CONTEXT ANALYSIS OF VOLUNTEERED INFORMATION (GEOCONAVI)
      1. Deploy a system for using UGC in crisis decision support on forest fires
      2. Assess the added value of using UGC for forest fire response
  • 26. WHERE WE ARE: FOREST FIRE CHARACTERISTICS
      • Dynamics require near real-time processing
      • Fewer signals, since fires often occur in sparsely populated areas
      • Predictability and recurrence facilitate sensor and model calibration
  • 27. WHERE WE ARE: GEOCONAVI FIGHTING FOREST FIRES (workflow diagram)
      • 1.1 Retrieval: scheduled Java code accessing APIs
      • 1.2 Storage: scheduled Java code writing to DBMS
      • 2.1 Topicality: scheduled PL/SQL job
      • 2.2 Geo-coding: a) scheduled PL/SQL job, b) scheduled Java code
      • 2.3 Geographic context: scheduled PL/SQL job
      • 2.4 Quality assessment: scheduled PL/SQL job
      • 3.1 Spatio-temporal clustering: scheduled Python script calling SaTScan job
      • 3.2 Quality re-assessment: scheduled PL/SQL job
      • Other diagram labels: Twitter Streaming API; Flickr Search API; Oracle DBMS; EFFIS Hotspot Data; European Media Monitor Geo-coding API; Dissemination (SMS, WFS, WMS, RSS, SES)
  • 28. WHERE WE ARE: DATA COLLECTION AND STORAGE
      • Flickr API
      • Twitter Streaming API
      • Keyword-based: domain expertise, task-oriented
      • Scheduled scripts
      • Writing to Oracle DBMS
  • 29. WHERE WE ARE: EXAMPLE GEO-SOCIAL MEDIA
  • 30. WHERE WE ARE: EXAMPLE GEO-SOCIAL MEDIA
      • “Back at hotel. Fire skirted round village. Little evidence of significant damage. Helicopters still overhead damping scrub. Beer unaffected”
      • (Canada BCGovFireInfo): “Important notice from the Reg Dist of Bulkley-Nechako regarding evacuations due to wildfires in the area http://ow.ly/2sBxH”
      • “Are you a fireman? Cause you’re always there to extinguish the fire inside my heart.”
  • 31. WHERE WE ARE: GEOCONAVI FIGHTING FOREST FIRES (workflow diagram, identical to slide 27)
  • 32. WHERE WE ARE: SCORING GEO-SOCIAL MEDIA
      • Sum of weighted scores: $QS(O_j) = \sum_{i=1}^{N} w_i s_{ji}$
      • with $w_i$ being the weight for criterion $i$, and $s_{ji}$ being the score of geo-social media object $j$ on criterion $i$
      • Topicality: keyword-based
      • Proximity: next concurrent reported hotspot
      • Land cover: Forest, no-Forest, Built-up
      • Population density: risk factor
      • Information clusters: similar messages or lone signal?
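The weighted sum on slide 32 can be sketched in a few lines of Python. This is a minimal illustration only: the criterion names follow the slide, but the weights and per-criterion scores below are hypothetical values chosen for the example, not the ones used in GEOCONAVI.

```python
from typing import Mapping


def quality_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum QS(O_j) = sum_i w_i * s_ji for one geo-social media object."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for criteria: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical weights per criterion (assumption, for illustration only).
WEIGHTS = {
    "topicality": 2.0,
    "proximity": 1.5,
    "land_cover": 1.0,
    "population_density": 0.5,
    "cluster": 1.0,
}

# Hypothetical scores of a single Tweet on each criterion.
tweet_scores = {
    "topicality": 1.0,
    "proximity": 0.5,
    "land_cover": 1.0,
    "population_density": 0.0,
    "cluster": 1.0,
}

print(quality_score(tweet_scores, WEIGHTS))
```

Keeping weights and scores in separate mappings makes it easy to re-weight criteria without re-scoring the items.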
  • 33. WHERE WE ARE: TOPICALITY MACHINE LEARNING CLASSIFICATION
      1. Manually annotated (Yes/No) random sample
      2. Counted keyword occurrences
      3. Used Weka 10-fold stratified cross validation with a) Decision trees, b) Naive Bayes, c) Association Rules
      4. J48 Decision Tree works best
      Confusion matrix:
                              Classified as YES   Classified as NO
        On Forest Fire              1196                 370
        Not on Forest Fire           403                3712
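The confusion matrix on slide 33 is enough to derive the usual summary metrics. The sketch below takes "On Forest Fire" as the positive class (an assumption about how the slide should be read) and computes accuracy, precision, recall and F1 from the four reported counts.

```python
# Counts from slide 33, with "On Forest Fire" treated as the positive class.
tp, fn = 1196, 370    # on forest fire: classified YES / classified NO
fp, tn = 403, 3712    # not on forest fire: classified YES / classified NO

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # share of YES classifications that are correct
recall = tp / (tp + fn)      # share of forest-fire items that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"n={total} accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```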
  • 34. WHERE WE ARE: GEOCODING GEO-SOCIAL MEDIA
      Several geocoders used:
      • GISCO/LAU2 brute string matching
      • European Media Monitor algorithms
      • Yahoo! Placemaker (2010)
                                        TWITTER                      FLICKR
                                    August 2010  August 2011    August 2010  August 2011
        Retrieved items               2,904,065    7,996,228          7,991       17,850
        Percentage with toponym             35%          27%            53%          50%
        Percentage with coordinates        1.1%        0.92%            20%          21%
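To make "brute string matching" concrete, here is a minimal sketch of gazetteer lookup by whole-word matching. The three-entry gazetteer, its approximate coordinates, and the function name `find_toponyms` are assumptions for illustration; the actual GISCO/LAU2 matching used in the project is not described in the slides.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-gazetteer: lower-cased place name -> approximate (lat, lon).
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "marseille": (43.2965, 5.3698),
    "aix-en-provence": (43.5297, 5.4474),
    "nice": (43.7102, 7.2620),
}


def find_toponyms(text: str) -> List[Tuple[str, Tuple[float, float]]]:
    """Return every gazetteer name that occurs as a whole word in the text."""
    lowered = text.lower()
    hits = []
    for name, coords in GAZETTEER.items():
        # Lookarounds reject matches embedded in longer words (e.g. "venice").
        if re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)", lowered):
            hits.append((name, coords))
    return hits


print(find_toponyms("Feu de forêt près de Marseille, fumée visible depuis Aix-en-Provence"))
# Brute matching cannot tell a place from an ordinary word:
print(find_toponyms("Have a nice day"))
```

The second call returns the city of Nice for an unrelated English sentence, which illustrates why plain string matching needs a later disambiguation or context step.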
  • 35. WHERE WE ARE: GEOCONAVI FIGHTING FOREST FIRES (workflow diagram, identical to slide 27)
  • 36. WHERE WE ARE: SPATIO-TEMPORAL CLUSTERING
      • SaTScan external software
      • Scheduled Python script:
        1. Reads new geo-social media from database
        2. Converts it to SaTScan input format
        3. Calls SaTScan from the command line with appropriate parameters
        4. Waits for SaTScan to complete analysis
        5. Reads SaTScan output
        6. Stores relevant information in database
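The script steps on slide 36 could look roughly like the sketch below. Everything specific here is an assumption: the item tuple, the file names, the whitespace-separated case and coordinate layouts (which should be checked against the SaTScan user guide), and the way the external program is invoked. The database steps are left out, and the demo substitutes a trivial Python command for the real SaTScan call.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Iterable, List, Sequence, Tuple

# (item id, latitude, longitude, date): hypothetical record layout.
Item = Tuple[str, float, float, str]


def write_inputs(items: Iterable[Item], workdir: Path) -> Tuple[Path, Path]:
    """Step 2: write a case file and a coordinates file (assumed layouts)."""
    case_file = workdir / "items.cas"
    geo_file = workdir / "items.geo"
    with case_file.open("w") as cas, geo_file.open("w") as geo:
        for item_id, lat, lon, date in items:
            cas.write(f"{item_id} 1 {date}\n")    # id, case count, date
            geo.write(f"{item_id} {lat} {lon}\n")  # id, latitude, longitude
    return case_file, geo_file


def run_clustering(command: Sequence[str]) -> None:
    """Steps 3-4: call the external program and block until it has finished."""
    subprocess.run(list(command), check=True)


def read_output(path: Path) -> List[str]:
    """Step 5: return the non-empty lines of a result file for later parsing."""
    return [line for line in path.read_text().splitlines() if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    cas, geo = write_inputs([("tweet1", 43.3, 5.4, "2011/08/01")], Path(tmp))
    print(cas.read_text(), end="")
    # Stand-in for the real call, which would pass the SaTScan executable
    # and a parameter file instead of this placeholder command.
    run_clustering([sys.executable, "-c", "print('clustering finished')"])
```

Using `check=True` makes a failed external run raise an exception, so a scheduled job does not silently store stale results.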
  • 37. WHERE WE ARE: SPATIO-TEMPORAL CLUSTERING PARAMETERS
      • Type of clustering algorithm
      • Spatial location of clusters based on grid/locations or not
      • Type of spatial overlap of clusters
      • Maximum spatial cluster size
      • Maximum temporal cluster size
      • Used in 2011: Discrete Poisson adjusting for population, no grid, no overlap, max radius 50 km, max temporal extent 10% of study period (9 days)
  • 38. WHERE WE ARE: GEOCONAVI FIGHTING FOREST FIRES (workflow diagram, identical to slide 27)
  • 39. WHERE WE ARE: VISUALIZATION AND SHARING
  • 40. WHERE WE ARE: FOREST FIRES IN FRANCE 2011
  • 41. WHERE WE ARE: FOREST FIRES IN FRANCE BY GEOCONAVI
  • 42. WHERE WE ARE: FRENCH FOREST FIRE SOCIAL MEDIA
      (1) Containing French keywords: 659,676 Tweets and 39,016 Flickr images
      (2) Machine-learned relevance filter: 25,684 items left
      (3) Geocoded and context enriched: 5,770 items left
      (4) Clustered in space and time: 129 clusters with 2,682 items
      (5) Second relevance filter: 11 clusters left with 469 items
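The filtering funnel on slide 42 is easier to judge as retention rates. The sketch below uses only the item counts from the slide and computes, for each stage, the share kept relative to the previous stage and to the initial keyword match; the stage labels are shortened paraphrases.

```python
from typing import List, Tuple

# Item counts per processing stage, taken from slide 42.
STAGES: List[Tuple[str, int]] = [
    ("keyword match (Tweets + Flickr)", 659_676 + 39_016),
    ("machine-learned relevance filter", 25_684),
    ("geocoded and context enriched", 5_770),
    ("clustered in space and time", 2_682),
    ("second relevance filter", 469),
]


def retention(stages: List[Tuple[str, int]]) -> List[Tuple[str, int, float, float]]:
    """Return (stage, count, share of previous stage, share of first stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for name, count in stages:
        rows.append((name, count, count / previous, count / first))
        previous = count
    return rows


for name, count, step_share, overall_share in retention(STAGES):
    print(f"{name:34s} {count:>8,d}  step {step_share:7.2%}  overall {overall_share:8.3%}")
```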
  • 43. WHERE WE ARE: GEOCONAVI RESULTS
      • Simple keyword queries suffice
      • Additional geo-coding indispensable
      • Topicality and context filtering plus spatio-temporal clustering crucial
      • Able to detect fires from Tweets and Flickr images by spatio-temporal clustering
      • Relevance, credibility and overall quality vary greatly, thus more rules and human assessment needed
  • 44. WHERE WE ARE: SEMANTICS OF PLACES ACROSS GEO-SOCIAL MEDIA
      • Theory-guided research and local case study:
        • How do people see and understand the places they frequent?
        • What is different across media sources?
      • More than one (volunteered) data source
      • Identification of places and their semantics
      • Comparison of places between data sources
      • Comparison of places with geographic features and authoritative data sources
  • 45. WHERE WE ARE: SEMANTICS OF PLACES - IMPLEMENTATION
      • Shatford-Panofsky and Agnew
      • Greater London Area
      • From Twitter to Flickr
      • Data Mining (Spatio-temporal clustering) -> Semantic Analysis (Cosine Similarity, …)
      • Geo-demographic data
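Cosine similarity, named on slide 45 as one of the semantic analysis measures, can be sketched on simple term-frequency vectors. The two term lists below are invented examples, and treating each place as a bag of words is an assumption about the representation, not a description of the study's implementation.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical terms attached to the same place in two media sources.
flickr_tags = Counter("thames bridge london bridge tower".split())
tweet_terms = Counter("tower bridge traffic london".split())

print(round(cosine_similarity(flickr_tags, tweet_terms), 3))
```

Because the measure normalizes by vector length, a place with many Flickr tags and one with only a few Tweets can still be compared on equal terms.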
  • 46. WHERE WE ARE: COSINE SIMILARITY NEAREST NEIGHBORS
  • 47. WHERE WE ARE: CORRELATION DISTANCE & SIMILARITY
  • 48. WHERE WE ARE: CORRELATION DISTANCE & SIMILARITY
  • 49. WHERE WE ARE: CORRELATION DISTANCE & SIMILARITY
  • 50. OUTLINE
      • Introduction: Geo-social media APIs as sensors
      • Where we are: Current state-of-the-art and practical examples from disaster response
      • Outlook: Future research directions
  • 51. WHERE WE ARE: UNSOLVED PROBLEMS FROM FRENCH CASE STUDY
      • Relevant datasets for contextualization: choice, integration
      • Settings for data mining and machine learning: method, parameters
      • Further labels on the slide: Geospatial Semantic Web; Multi-Sensory Integration; Crowdsourced Supervision
  • 52. OUTLOOK: INTEGRATING GEO-SOCIAL MEDIA
  • 53. OUTLOOK: INTEGRATING GEO-SOCIAL MEDIA
  • 54. OUTLOOK: HYBRID GEO-INFORMATION PROCESSING
      • Time-consuming and resource-intensive:
        • Manual annotation and experiments for topicality filtering
        • Parameterization of spatio-temporal clustering
      • Other challenges:
        • Dependency on data quality
        • Overfitting
        • Diversity of contexts and tasks
        • Near real-time
      • Further label on the slide: Crowdsourced Supervision
  • 55. OUTLOOK: GEOCONAVI FIGHTING FOREST FIRES (workflow diagram, identical to slide 27)
  • 56. OUTLOOK: HYBRID GEO-INFORMATION PROCESSING
      Developing hybrid quality assurance mechanisms for near real-time geo-information streams:
      • Link the characteristics of geographic information with machine learning class labelling and regression
      • Provide a multi-modal interface to let human oracles simultaneously label instances
      • Translate the learner models into nomothetic principles on geographic semantics
  • 57. OUTLOOK: MACHINE LEARNING FOR GEO-SOCIAL MEDIA
      Every data instance needs multi-class labelling:
      • Content type
      • Geographic footprints of locations and/or events
      • Distinct event membership
      • Credibility based on a combination of the other class labels
      Learners have to deal with characteristics of geographic information:
      • Spatial autocorrelation
      • Vague boundaries and class memberships
      • Uncontrolled variance
  • 58. OUTLOOK: MACHINE LEARNING FOR GEO-SOCIAL MEDIA
      • Multiple human oracles annotate instances for all model classes
      • Responses will modify the
        • Learners
        • Parameters used for the geographic analysis steps to compute footprints and clusters
      • Resulting models indirectly encode the semantic similarity of geographic places and concepts
      • Reference to (linked) data repositories such as DBpedia and GeoNames when possible
  • 59. OUTLOOK: ACTIVE LEARNING
      • Active learners profit from domain expertise
      • Passive learners suited for domain novices
      • Learner chooses instances to be labelled and presents them to the human annotator
      • Maximize the impact of human annotation
      • Learner remains flexible towards new instances
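One common way for a learner to choose which instances to present to the annotator is uncertainty sampling: ask about the items whose predicted class probability is closest to 0.5. The slides do not say which query strategy is intended, so the sketch below is only one possible reading, with hypothetical item ids and probabilities.

```python
from typing import Dict, List


def select_queries(probabilities: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k items whose predicted P(relevant) is closest to 0.5."""
    ranked = sorted(probabilities, key=lambda item_id: abs(probabilities[item_id] - 0.5))
    return ranked[:k]


# Hypothetical classifier output: item id -> predicted probability of relevance.
predicted = {"tweet_a": 0.97, "tweet_b": 0.52, "photo_c": 0.08, "tweet_d": 0.40}

# The two least certain items are the ones worth a human annotator's time.
print(select_queries(predicted, 2))
```

Items the classifier is already confident about (here `tweet_a` and `photo_c`) are skipped, which is how this strategy aims to maximize the impact of each human label.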
  • 60. OUTLOOK: EXAMPLE QUERIES
      • Toponym disambiguation: “Does this [item] talk about [location A] or [location B], or none, or both?”
      • Spatial footprint calculation for vague geographies: “Is this spatial footprint for [item] correct? If not, is it too large, too small, or wrong shape, or wrong place?”
      • Spatio-temporal clustering: “Does this [item] belong to a cluster named [event] in [location]? If not, what’s wrong: Event, Location, or both?”
  • 61. OUTLOOK: HYBRID GEO-INFORMATION PROCESSING WORKFLOW
  • 62. OUTLOOK: HYBRID GEO-INFORMATION PROCESSING MODEL
  • 63. OUTLOOK: HYBRID GEO-INFORMATION PROCESSING METHODS
      • Key techniques: Decision Trees, Naive Bayes, Support Vector Machines
      • Key technologies: Apache Spark / Storm (analytical geoprocessing tasks), Pybossa (crowdsourced supervision), Cloud Computing
  • 64. CHALLENGES AND OPPORTUNITIES OF GEO-SOCIAL MEDIA: EARTH OBSERVATION WITH UNCALIBRATED IN-SITU SENSORS
      • Thank you!
      • f.o.ostermann@utwente.nl, @f_ostermann, nl.linkedin.com/in/foost