SlideShare a Scribd company logo
1
Geographies of
crowdsourced information
and their implications
@a_ballatore
VGI-Alive, AGILE
June 2018, Lund, Sweden
Andrea Ballatore*
Department of Geography
Birkbeck, University of London
Stefano De Sabbata
School of Geography, Geology and the Env.,
University of Leicester
aballatore.space
2
Outline
1. Where are we?
2. Core research questions
3. Paradigm limitations
4. Beyond the usual suspects
5. Case studies: Search behaviour
6. [Stefano will do the rest]
3
Where are we?
4
The success of
crowdsourcing
"85% of the top global brands have used
crowdsourcing in the last ten years. […] According
to Gartner, 75% of the world’s high performing
enterprises will be using crowdsourcing by 2018."
(Deloitte, 2016)
It is when new, successful technologies withdraw
into the "woodwork of everyday banality" that
their effects become real and profound.
(Vincent Mosco, 2004)
https://ideascale.com
5
The success of
crowdsourcing
"85% of the top global brands have used
crowdsourcing in the last ten years. […] According
to Gartner, 75% of the world’s high performing
enterprises will be using crowdsourcing by 2018."
(Deloitte, 2016)
It is when new, successful technologies withdraw
into the "woodwork of everyday banality" that
their effects become real and profound.
(Vincent Mosco, 2004)
https://ideascale.com
6
Crowdsourcing + geolocation:
A mature field
(See et al, 2016)
7
Volunteered or Crowdsourced GI?
8
Placing the crowds
• Crowdsourced
geographic information
(CGI)
• From experimental phase
to oligopoly (e.g.,
Google, Facebook)
• From cyberoptimism to
cyberpessimism
https://www.cagle.com
9
Placing the crowds
• Crowdsourced
geographic information
(CGI)
• From experimental phase
to oligopoly (e.g.,
Google, Facebook)
• From cyberoptimism to
cyberpessimism
(2016)
10
Researching CGI
as domain
Using CGI for
other domains
11
Core CGI research
1. Who are the contributors and why do they engage
in spatial information production, and what
incentives work or do not work? How do they
collaborate and organise? How do we include
marginalised communities?
(Budhathoki and Haythornthwaite 2013)
2. How can we calculate the quality and fitness for
purpose of crowdsourced data in a reliable,
preferably intrinsic way? (Goodchild and Li 2012)
3. What are the limitations of such models and what
are their spatial, epistemic and cultural biases?
(Dodge and Kitchin 2013)
12
Core CGI research
1. Who are the contributors and why do they engage
in spatial information production, and what
incentives work or do not work? How do they
collaborate and organise? How do we include
marginalised communities?
(Budhathoki and Haythornthwaite 2013)
2. How can we calculate the quality and fitness for
purpose of crowdsourced data in a reliable,
preferably intrinsic way? (Goodchild and Li 2012)
3. What are the limitations of such models and what
are their spatial, epistemic and cultural biases?
(Dodge and Kitchin 2013)
13
CGI for other domains
Natural sciences: biology,
climate science, Earth
sensing
Social sciences: urban planning,
transportation,
public health, economics,
human geography
Humanities: digital humanities,
history, cultural analytics
14
Limitations of
crowdsourcing
15
• "Ignorance of crowds"
(Carr, 2007)
• Conditions for wisdom
• Menial work, no real
innovation/creativity
• Undermining paid work
• Variable quality
Limitations of the paradigm
2004
16
• "Ignorance of crowds"
(Carr, 2007)
• Conditions for wisdom
• Menial work, no real
innovation/creativity
• Undermining paid work
• Variable quality
Limitations of the paradigm
2007
17
• Large volumes of information
do not imply usefulness or
fitness for purpose
• We need representative
samples, not large samples
(e.g., random sample of
1,000 > 1M non-random)
Limitations of the paradigm
18
Each CGI source is a particular viewpoint
and will return a different image of the
social and natural world.
suifaijohnmak.wordpress.com
19
Diversity/biases of CGI
• Thematic: e.g., tourism, outdoors,
typical/atypical behaviour, sharing bias
• Demographic: Western Educated Industrialised
Rich Democratic (WEIRD) (not always!)
• Social: 90%-9%-1%, hyperactive minorities of
contributors
• Geographic: urban/rural,
developed/developing, central/peripheral,
human/natural
20
Diversity/biases of CGI
• Thematic: e.g., tourism, outdoors,
typical/atypical behaviour, sharing bias
• Demographic: Western Educated Industrialised
Rich Democratic (WEIRD) (not always!)
• Social: 90%-9%-1%, hyperactive minorities of
contributors
• Geographic: urban/rural,
developed/developing, central/peripheral,
human/natural
21
• Without centralised planning and protocols,
data quality remains uneven (coverage!)
• Wikipedia replaced Britannica, but
OpenStreetMap is not replacing Google
Maps
• CGI cannot replace established data
collection protocols and sources, but can
provide useful ancillary data
CGI strictures
22
• Without centralised planning and protocols,
data quality remains uneven (coverage!)
• Wikipedia replaced Britannica, but
OpenStreetMap is not replacing Google
Maps
• CGI cannot replace established data
collection protocols and sources, but can
provide useful ancillary data
CGI strictures
23
Reinventing wheels
• Some CGI replicates work
that has been done better
by professionals
• More useful to focus on
"missing" data:
bookscrounger.com
24
Open data vs CGI
Authoritative datasets
are becoming
cheaper/free
25
Broadening our
horizons
26
Most studies
on OSM,
Wikipedia,
Twitter, Flickr.
There's
more out
there!
https://fineartamerica.com
The usual suspects
27
28
• Google Maps
• From 5M to 50M
contributors in
2017
• 700K new places
monthly
• Gamification
29
• Hundreds of millions of users, billions
of reviews
• Measurable effects on spatial and
economic behaviour
• Sentiment about points of interest,
cities, and neighbourhoods
30
• Micro-economic data (e.g. price of onions
in India, new shops in Ghana)
• For profit, contributors are paid
• Applications: International Development,
Government, Global Security, and Business
31
Case studies
32
33
Online visibility of CGI projects
• Search engines are the key entry point
to discover new information
• Feedback loop between Wikipedia and
Google Search to attract new
contributors
• Making CGI findable and consumable
for search engines and social media
• Study on CGI on Google Search (2018)
34
Online visibility of CGI projects
• Search engines are the key entry point
to discover new information
• Feedback loop between Wikipedia and
Google Search to attract new
contributors
• Making CGI findable and consumable
for search engines and social media
• Study on CGI on Google Search (2018)
35
Interest in CGI projects
(Ballatore & Jokar Arsanjani, 2018)
(Ballatore & Jokar Arsanjani, 2018)
Search Interest in
Wikimapia/OSM combined
37
Search Interest in
Wikimapia vs OSM
(Ballatore & Jokar Arsanjani, 2018)
Andrea Ballatore
Department of Geography,
Birkbeck, University of London
Stefano De Sabbata*
School of Geography, Geology and the Env.,
University of Leicester
S c h o o l o f
G e o g r a p h y ,
G e o l o g y a n d t h e
E n v i r o n m e n t
Geographies of
crowdsourced information
and their implications
AGILE 2018 – Lund, Sweden
Case studies
• Operationalising quality in health studies
– with J Bright, S Lee, B Ganesh, DK Humphreys
– OpenStreetMap
– impact of availability on alcohol-related harm
• Analysis of global gazetteers
– with E Acheson and RS Purves
– GeoNames and Getty TGN
– patterns of coverage
• Urban Information Geographies
– with A Ballatore
– with N Tate
– quality and representativeness of VGI
Operationalising
VGI in health
studies
• with J Bright, S Lee, B
Ganesh, DK Humphreys
• OpenStreetMap
• impact of availability on
alcohol-related harm
• https://www.sciencedirect.com/scie
nce/article/pii/S1353829217305804
Quality indicator
Values Transformation
Num. of map features per square kilometre Log transformed, scaled
Average num. of edits per map feature Log transformed, scaled
Num. of users who made at least one edit Log transformed, scaled
Num. of features edited by more than one user Log transformed, scaled
Num. of days with at least one edit Log transformed, scaled
Days since the last edit in the area Inversed, scaled
Proportion of buildings with full address -1 if 0%, 0 if no buildings
• Intrinsic quality indicator based on Barron et al. (2014)
• Calculated as a linear combination of:
Replicating CRESH study
The overall conclusions are essentially identical
• significant positive correlation between density of
alcohol outlets and incidence of alcohol related harms
• an effect size which is comparable (though smaller)
The correlation
between OSM-based
estimates and CRESH
is stronger where the
quality index is higher
Analysis of global
gazetteers
• with E Acheson and RS
Purves
• GeoNames and
Getty TGN
• patterns of coverage
• https://www.sciencedirect.com/scie
nce/article/pii/S0198971516302496
Comparing GeoNames and TGN
Two distinct levels of coverages
• in “high coverage” countries, TGN same order of
magnitude as counts in GeoNames
• in “low coverage” countries, GeoNames has over 60 times
as much content as TGN
GeoNames and TGN in Great Britain
Gazetteer Features Pop place
GeoNames 54,701 16,475
Getty TGN 24,003 16,816
Ordnance Survey 50k 248,626
Ordnance Survey OpenNames 41,490
Feature
types
Urban
Information
Geographies
• with A Ballatore
• with N Tate
• quality and
representativeness of
VGI
Findings
• Overall confirmed (some) hypothesis
– representation similar biases as participation
– Higher qualifications strongest factor
– Wealth (house prices) strong factor in both, more so for Wikipedia
– Twitter strongly influenced by perc. of ppl. aged 30-44 (positively) and
households with de-pendent children (negatively)
• However
– models account only for about 44–55% of variability
– need for more explanatory factors, e.g., tourism-related activities
• …or can these difference be used as indicators?
– ethnic composition is not a factor in the UK
• Twitter and Wikipedia similar but distinct geographies
– only representative of themselves
Comparing Twitter and Wikipedia
Number of tweets
(951 obs.)
~ Number of articles
Adj. R2 = 0.490
De Sabbata
and Tate
(forthcoming)
A geodemographic approach
Conclusions
• More interaction between research on CGI
core and domain applications
– beyond technology, towards social domains
– beyond the usual suspects (OSM, etc)
• Access to data: sampling and scraping
• Systematic work on limitations is needed
– Urban Information Geographies
• Impact of GDPR on access and use of data
Thank you for your attention.
Any questions?
Stefano De Sabbata
School of Geography, Geology and the Env.,
University of Leicester
Andrea Ballatore
Department of Geography,
Birkbeck, University of London
a.ballatore@bbk.ac.uk
aballatore.space
@a_ballatore
Thanks!

More Related Content

What's hot

GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2
crooksAndrew
 
2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse 2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse
UN Global Pulse
 
David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015
Wisconsin State Cartographer's Office
 
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
crooksAndrew
 
Not the Geography You Remember
Not the Geography You RememberNot the Geography You Remember
Not the Geography You Remember
Bill Bass
 
Data Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions EventData Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions Event
UN Global Pulse
 
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
ESD UNU-IAS
 
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISSGSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
Remetey-Fülöpp Gábor
 

What's hot (8)

GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2
 
2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse 2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse
 
David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015
 
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
 
Not the Geography You Remember
Not the Geography You RememberNot the Geography You Remember
Not the Geography You Remember
 
Data Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions EventData Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions Event
 
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
 
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISSGSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
 

Similar to Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund

Understanding the Volunteer in VGI
Understanding the Volunteer in VGIUnderstanding the Volunteer in VGI
Understanding the Volunteer in VGI
Christopher J. Parker
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
Maria Antonia Brovelli
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
Maria Antonia Brovelli
 
From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...
Muki Haklay
 
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as CatalystCurrent Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Martha Russell
 
Digitization and Landscape-Research Nexus
Digitization and Landscape-Research NexusDigitization and Landscape-Research Nexus
Digitization and Landscape-Research Nexus
Private
 
The Global Land Programme
The Global Land ProgrammeThe Global Land Programme
The Global Land Programme
CIFOR-ICRAF
 
Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media
Imran Ghaznavi
 
Open Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial ApplicationsOpen Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial Applications
Jody Garnett
 
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
Muki Haklay
 
Extreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen scienceExtreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen science
Muki Haklay
 
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
Vittorio Romano
 
Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...
Sonia Yeh
 
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen ScienceFree as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
Andrea Wiggins
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
michellep
 
Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...
Muki Haklay
 
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
Liliana Bounegru
 
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Carl Sack
 
AGIForesight _2020_Full Article
AGIForesight _2020_Full ArticleAGIForesight _2020_Full Article
AGIForesight _2020_Full Article
Martin Penney
 
COST Actions: ENERGIC, Mapping and the citizen sensor.
COST Actions: ENERGIC,  Mapping and the citizen sensor.COST Actions: ENERGIC,  Mapping and the citizen sensor.
COST Actions: ENERGIC, Mapping and the citizen sensor.
Vyron
 

Similar to Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund (20)

Understanding the Volunteer in VGI
Understanding the Volunteer in VGIUnderstanding the Volunteer in VGI
Understanding the Volunteer in VGI
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
 
From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...
 
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as CatalystCurrent Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
 
Digitization and Landscape-Research Nexus
Digitization and Landscape-Research NexusDigitization and Landscape-Research Nexus
Digitization and Landscape-Research Nexus
 
The Global Land Programme
The Global Land ProgrammeThe Global Land Programme
The Global Land Programme
 
Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media
 
Open Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial ApplicationsOpen Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial Applications
 
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
 
Extreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen scienceExtreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen science
 
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
 
Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...
 
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen ScienceFree as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
 
Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...
 
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
 
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
 
AGIForesight _2020_Full Article
AGIForesight _2020_Full ArticleAGIForesight _2020_Full Article
AGIForesight _2020_Full Article
 
COST Actions: ENERGIC, Mapping and the citizen sensor.
COST Actions: ENERGIC,  Mapping and the citizen sensor.COST Actions: ENERGIC,  Mapping and the citizen sensor.
COST Actions: ENERGIC, Mapping and the citizen sensor.
 

Recently uploaded

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 

Recently uploaded (20)

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 

Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund

  • 1. 1 Geographies of crowdsourced information and their implications @a_ballatore VGI-Alive, AGILE June 2018, Lund, Sweden Andrea Ballatore* Department of Geography Birkbeck, University of London Stefano De Sabbata School of Geography, Geology and the Env., University of Leicester aballatore.space
  • 2. 2 Outline 1. Where are we? 2. Core research questions 3. Paradigm limitations 4. Beyond the usual suspects 5. Case studies: Search behaviour 6. [Stefano will do the rest]
  • 4. 4 The success of crowdsourcing "85% of the top global brands have used crowdsourcing in the last ten years. […] According to Gartner, 75% of the world’s high performing enterprises will be using crowdsourcing by 2018." (Deloitte, 2016) It is when new, successful technologies withdraw into the "woodwork of everyday banality" that their effects become real and profound. (Vincent Mosco, 2004) https://ideascale.com
  • 5. 5 The success of crowdsourcing "85% of the top global brands have used crowdsourcing in the last ten years. […] According to Gartner, 75% of the world’s high performing enterprises will be using crowdsourcing by 2018." (Deloitte, 2016) It is when new, successful technologies withdraw into the "woodwork of everyday banality" that their effects become real and profound. (Vincent Mosco, 2004) https://ideascale.com
  • 6. 6 Crowdsourcing + geolocation: A mature field (See et al, 2016)
  • 8. 8 Placing the crowds • Crowdsourced geographic information (CGI) • From experimental phase to oligopoly (e.g., Google, Facebook) • From cyberoptimism to cyberpessimism https://www.cagle.com
  • 9. 9 Placing the crowds • Crowdsourced geographic information (CGI) • From experimental phase to oligopoly (e.g., Google, Facebook) • From cyberoptimism to cyberpessimism (2016)
  • 10. 10 Researching CGI as domain Using CGI for other domains
  • 11. 11 Core CGI research 1. Who are the contributors and why do they engage in spatial information production, and what incentives work or do not work? How do they collaborate and organise? How do we include marginalised communities? (Budhathoki and Haythornthwaite 2013) 2. How can we calculate the quality and fitness for purpose of crowdsourced data in a reliable, preferably intrinsic way? (Goodchild and Li 2012) 3. What are the limitations of such models and what are their spatial, epistemic and cultural biases? (Dodge and Kitchin 2013)
  • 12. 12 Core CGI research 1. Who are the contributors and why do they engage in spatial information production, and what incentives work or do not work? How do they collaborate and organise? How do we include marginalised communities? (Budhathoki and Haythornthwaite 2013) 2. How can we calculate the quality and fitness for purpose of crowdsourced data in a reliable, preferably intrinsic way? (Goodchild and Li 2012) 3. What are the limitations of such models and what are their spatial, epistemic and cultural biases? (Dodge and Kitchin 2013)
  • 13. 13 CGI for other domains Natural sciences: biology, climate science, Earth sensing Social sciences: urban planning, transportation, public health, economics, human geography Humanities: digital humanities, history, cultural analytics
  • 15. 15 • "Ignorance of crowds" (Carr, 2007) • Conditions for wisdom • Menial work, no real innovation/creativity • Undermining paid work • Variable quality Limitations of the paradigm 2004
  • 16. 16 • "Ignorance of crowds" (Carr, 2007) • Conditions for wisdom • Menial work, no real innovation/creativity • Undermining paid work • Variable quality Limitations of the paradigm 2007
  • 17. 17 • Large volumes of information do not imply usefulness or fitness for purpose • We need representative samples, not large samples (e.g., random sample of 1,000 > 1M non-random) Limitations of the paradigm
  • 18. 18 Each CGI source is a particular viewpoint and will return a different image of the social and natural world. suifaijohnmak.wordpress.com
  • 19. 19 Diversity/biases of CGI • Thematic: e.g., tourism, outdoors, typical/atypical behaviour, sharing bias • Demographic: Western Educated Industrialised Rich Democratic (WEIRD) (not always!) • Social: 90%-9%-1%, hyperactive minorities of contributors • Geographic: urban/rural, developed/developing, central/peripheral, human/natural
  • 20. 20 Diversity/biases of CGI • Thematic: e.g., tourism, outdoors, typical/atypical behaviour, sharing bias • Demographic: Western Educated Industrialised Rich Democratic (WEIRD) (not always!) • Social: 90%-9%-1%, hyperactive minorities of contributors • Geographic: urban/rural, developed/developing, central/peripheral, human/natural
  • 21. 21 • Without centralised planning and protocols, data quality remains uneven (coverage!) • Wikipedia replaced Britannica, but OpenStreetMap is not replacing Google Maps • CGI cannot replace established data collection protocols and sources, but can provide useful ancillary data CGI strictures
  • 22. 22 • Without centralised planning and protocols, data quality remains uneven (coverage!) • Wikipedia replaced Britannica, but OpenStreetMap is not replacing Google Maps • CGI cannot replace established data collection protocols and sources, but can provide useful ancillary data CGI strictures
  • 23. 23 Reinventing wheels • Some CGI replicates work that has been done better by professionals • More useful to focus on "missing" data: bookscrounger.com
  • 24. 24 Open data vs CGI Authoritative datasets are becoming cheaper/free
  • 26. 26 Most studies on OSM, Wikipedia, Twitter, Flickr. There's more out there! https://fineartamerica.com The usual suspects
  • 27. 27
  • 28. 28 • Google Maps • From 5M to 50M contributors in 2017 • 700K new places monthly • Gamification
  • 29. 29 • Hundreds of millions of users, billions of reviews • Measurable effects on spatial and economic behaviour • Sentiment about points of interest, cities, and neighbourhoods
  • 30. 30 • Micro-economic data (e.g. price of onions in India, new shops in Ghana) • For profit, contributors are paid • Applications: International Development, Government, Global Security, and Business
  • 32. 32
  • 33. 33 Online visibility of CGI projects • Search engines are the key entry point to discover new information • Feedback loop between Wikipedia and Google Search to attract new contributors • Making CGI findable and consumable for search engines and social media • Study on CGI on Google Search (2018)
  • 34. 34 Online visibility of CGI projects • Search engines are the key entry point to discover new information • Feedback loop between Wikipedia and Google Search to attract new contributors • Making CGI findable and consumable for search engines and social media • Study on CGI on Google Search (2018)
  • 35. 35 Interest in CGI projects (Ballatore & Jokar Arsanjani, 2018)
  • 36. (Ballatore & Jokar Arsanjani, 2018) Search Interest in Wikimapia/OSM combined
  • 37. 37 Search Interest in Wikimapia vs OSM (Ballatore & Jokar Arsanjani, 2018)
  • 38. Andrea Ballatore Department of Geography, Birkbeck, University of London Stefano De Sabbata* School of Geography, Geology and the Env., University of Leicester S c h o o l o f G e o g r a p h y , G e o l o g y a n d t h e E n v i r o n m e n t Geographies of crowdsourced information and their implications AGILE 2018 – Lund, Sweden
  • 39. Case studies • Operationalising quality in health studies – with J Bright, S Lee, B Ganesh, DK Humphreys – OpenStreetMap – impact of availability on alcohol-related harm • Analysis of global gazetteers – with E Acheson and RS Purves – GeoNames and Getty TGN – patterns of coverage • Urban Information Geographies – with A Ballatore – with N Tate – quality and representativeness of VGI
  • 40. Operationalising VGI in health studies • with J Bright, S Lee, B Ganesh, DK Humphreys • OpenStreetMap • impact of availability on alcohol-related harm • https://www.sciencedirect.com/scie nce/article/pii/S1353829217305804
  • 41. Quality indicator Values Transformation Num. of map features per square kilometre Log transformed, scaled Average num. of edits per map feature Log transformed, scaled Num. of users who made at least one edit Log transformed, scaled Num. of features edited by more than one user Log transformed, scaled Num. of days with at least one edit Log transformed, scaled Days since the last edit in the area Inversed, scaled Proportion of buildings with full address -1 if 0%, 0 if no buildings • Intrinsic quality indicator based on Barron et al. (2014) • Calculated as a linear combination of:
  • 42. Replicating CRESH study The overall conclusions are essentially identical • significant positive correlation between density of alcohol outlets and incidence of alcohol related harms • an effect size which is comparable (though smaller) The correlation between OSM-based estimates and CRESH is stronger where the quality index is higher
  • 43. Analysis of global gazetteers • with E Acheson and RS Purves • GeoNames and Getty TGN • patterns of coverage • https://www.sciencedirect.com/scie nce/article/pii/S0198971516302496
  • 44. Comparing GeoNames and TGN Two distinct levels of coverages • in “high coverage” countries, TGN same order of magnitude as counts in GeoNames • in “low coverage” countries, GeoNames has over 60 times as much content as TGN
  • 45. GeoNames and TGN in Great Britain Gazetteer Features Pop place GeoNames 54,701 16,475 Getty TGN 24,003 16,816 Ordnance Survey 50k 248,626 Ordnance Survey OpenNames 41,490
  • 47. Urban Information Geographies • with A Ballatore • with N Tate • quality and representativeness of VGI
  • 48.
  • 49. Findings • Overall confirmed (some) hypothesis – representation similar biases as participation – Higher qualifications strongest factor – Wealth (house prices) strong factor in both, more so for Wikipedia – Twitter strongly influenced by perc. of ppl. aged 30-44 (positively) and households with de-pendent children (negatively) • However – models account only for about 44–55% of variability – need for more explanatory factors, e.g., tourism-related activities • …or can these difference be used as indicators? – ethnic composition is not a factor in the UK • Twitter and Wikipedia similar but distinct geographies – only representative of themselves
  • 50. Comparing Twitter and Wikipedia Number of tweets (951 obs.) ~ Number of articles Adj. R2 = 0.490
  • 51. De Sabbata and Tate (forthcoming) A geodemographic approach
  • 52. Conclusions • More interaction between research on CGI core and domain applications – beyond technology, towards social domains – beyond the usual suspects (OSM, etc) • Access to data: sampling and scraping • Systematic work on limitations is needed – Urban Information Geographies • Impact of GDPR on access and use of data
  • 53. Thank you for your attention. Any questions? Stefano De Sabbata School of Geography, Geology and the Env., University of Leicester Andrea Ballatore Department of Geography, Birkbeck, University of London