Bogdan cirlugea master_thesis_poster
SSIE / ENAC / MASTER'S PROJECT, YEAR 2015
ENVIRONMENTAL SCIENCES AND ENGINEERING SECTION
Validation of OpenStreetMap road data for integration into the Global Roads
Open Access Data Set (gROADS)
1 Laboratory of Geographic Information Systems (LASIG), EPFL / 2 Center for International Earth Science Information Network (CIESIN), The
Earth Institute, Columbia University in the City of New York
Author : Bogdan-Mihai Cîrlugea
Advisers : Prof. François Golay 1 / Prof. Alex de Sherbinin 2 / Paola Kim-Blanco 2
Introduction
Methods & Results
Conclusion
Background: gROADS v1 is a global dataset that provides the best available open access road data by country, under a consistent and
internationally accepted data model. A second version of gROADS aims to improve the original dataset by including other freely available,
more accurate and more complete datasets. In this context, OpenStreetMap (OSM) is the best open dataset available for improving gROADS v1.
Country    Completeness     Positional accuracy    Attribute structure    Ingestion
           (improvement)    (RMSE < 50 m)          (improvement)          decision
Liberia    NO               YES                    NO                     NO
Guinea     NO               YES                    NO                     NO
Ghana      YES              YES                    YES                    YES
Senegal    NO               YES                    NO                     NO
Problem: As a Volunteered Geographic Information product, OSM has no systematic quality control. Moreover, previous studies have shown that
OSM’s quality is highly variable. A fast comparative assessment of the quality of OSM data is impossible due to the lack of reference datasets.
Objectives: Develop a set of diagnostics that give a sense of the overall quality of OSM, and a decision framework that
determines whether OSM country road networks should be ingested into gROADS and replace the existing data.
Erroneous predictions of incomplete regions

Country    Discrete classification (%)    Regression models (%)
Liberia    21                             31
Guinea     0                              11
Ghana      22                             23
Senegal    0                              0
Country    Total RMSE (m)    Urban RMSE (m)    Rural RMSE (m)
Liberia    31.57             7.97              43.93
Guinea     11.50             8.06              13.30
Senegal    7.46              4.10              8.99
Ghana      9.47              9.90              9.03
Process: Country-by-country assessment of different quality aspects of the gROADS & OSM road networks.
Analysis platform: R programming, ArcGIS, PostGIS
Case study: Liberia, Guinea, Ghana & Senegal
Attribute structure
Completeness assessment
Positional accuracy
Unspecified / unclassified roads

Country    gROADS (%)    OSM (%)
Liberia    20.98         32.89
Guinea     9.58          23.57
Ghana      98.37         14.94
Senegal    9.18          14.80
The ingestion decision can be made based on a comparison between OSM
and gROADS on the three described quality criteria.
OSM is not always superior to gROADS v1 for low-income countries.
Only one country out of four is suitable for ingestion: Ghana.
The need for a systematic validation process was reinforced.
If improved, the exploratory completeness assessment could serve as a tool
for the OSM community to guide further mapping.
Method: Determine the proportion of features that lack
classification. Compute the length of ‘unclassified’ (OSM) /
‘unspecified’ (gROADS) features out of the total road network
length (%).
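The length-based share described above can be sketched as follows. This is a minimal illustration, not the poster's actual R / PostGIS workflow: the function name `unclassified_share`, the `(road_class, length_km)` input layout and the toy segments are all assumptions made for the example.

```python
def unclassified_share(segments, unclassified_labels):
    """Return the percentage of total road length that lacks a classification.

    segments: iterable of (road_class, length_km) pairs (hypothetical layout).
    unclassified_labels: set of lowercase labels counted as "no classification",
        e.g. {"unclassified"} for OSM or {"unspecified"} for gROADS.
    """
    total = 0.0
    unclassified = 0.0
    for road_class, length_km in segments:
        total += length_km
        # A missing class is treated like an explicit "unclassified" label.
        if road_class is None or road_class.lower() in unclassified_labels:
            unclassified += length_km
    if total == 0:
        raise ValueError("road network has zero total length")
    return 100.0 * unclassified / total


# Toy network (made-up values, not data from the study).
osm_segments = [("primary", 12.0), ("unclassified", 3.0), ("residential", 5.0)]
osm_share = unclassified_share(osm_segments, {"unclassified"})
print(f"Unclassified share: {osm_share:.2f}%")
```

Running the same function on both networks, with the label set appropriate to each, would give the two percentages compared per country.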
Method: Compare position of OSM road intersections with the position of road
intersections digitized on imagery (ground truth). Provide one RMSE value for
each country.
Imagery: Google Earth
Sample size: 100 intersections
Sampling scheme: 10 random
admin units (urban + rural) & 10
random intersections in each.
Results: All cases present an RMSE suitable for gROADS (<50m).
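A per-country RMSE of this kind could be computed as in the sketch below. It assumes that each OSM intersection has already been paired with its digitized counterpart and that both are expressed in a projected coordinate system in metres; the function name `positional_rmse` and the sample coordinates are hypothetical.

```python
import math


def positional_rmse(osm_points, reference_points):
    """Root-mean-square positional error between paired points, in metres.

    Both arguments are equal-length lists of (x, y) tuples in a projected
    coordinate system (assumed here), where index i in one list is the same
    intersection as index i in the other.
    """
    if not osm_points or len(osm_points) != len(reference_points):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [
        (xo - xr) ** 2 + (yo - yr) ** 2
        for (xo, yo), (xr, yr) in zip(osm_points, reference_points)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy sample of two intersections (made-up coordinates).
osm = [(0.0, 0.0), (10.0, 10.0)]
reference = [(3.0, 4.0), (10.0, 10.0)]
rmse = positional_rmse(osm, reference)
suitable = rmse < 50.0  # the gROADS threshold quoted in the poster
```

With the 100 sampled intersections per country, the resulting value would be compared against the 50 m threshold in the same way.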
Assumption: Presence/absence of roads is influenced by 3 quantifiable
variables: Population, Wealth, Terrain Variability. If true, the 3 variables can be
used to predict regions with missing roads in OSM.
Identify suitable datasets: GPW (population density), DHS survey (wealth
index), SRTM 1 Arc-Second Global (terrain variability)
Aggregate datasets: Subnational administrative units, level 2 (provided by the GPW dataset)
Assess correlation: Pop. density, Wealth & Road density -> highly correlated
Develop prediction methods: Discrete classification & Regression model
Validate predictions: Visually inspect the predicted regions against Google
Earth imagery. If no roads are missing, the prediction is considered erroneous.
Workflow
1. Discrete classification: Classify admin units as Low or High in road density,
population density & wealth, using the median of each variable as the threshold.
Tag as incomplete the regions with Low road density but High pop. density & High wealth.
2. Regression model: Predict road density from pop. density and wealth using
a Spatial Durbin regression model. Tag as incomplete the regions with high
negative residuals (<25%).
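The discrete classification in step 1 can be illustrated with a short sketch. The field names, the function `tag_incomplete` and the toy admin units are assumptions for the example, and values equal to the median are treated as Low here, a tie-breaking choice the poster does not specify.

```python
from statistics import median


def tag_incomplete(units):
    """Return names of admin units flagged as potentially incomplete in OSM.

    units: list of dicts with the hypothetical keys "name", "road_density",
    "pop_density" and "wealth". A unit is flagged when its road density is
    Low while its population density and wealth are both High, with each
    variable split at its median.
    """
    road_med = median(u["road_density"] for u in units)
    pop_med = median(u["pop_density"] for u in units)
    wealth_med = median(u["wealth"] for u in units)
    return [
        u["name"]
        for u in units
        if u["road_density"] <= road_med   # Low road density
        and u["pop_density"] > pop_med     # High population density
        and u["wealth"] > wealth_med       # High wealth
    ]


# Toy admin units (made-up values).
admin_units = [
    {"name": "A", "road_density": 0.1, "pop_density": 50, "wealth": 0.2},
    {"name": "B", "road_density": 0.2, "pop_density": 300, "wealth": 0.8},
    {"name": "C", "road_density": 0.9, "pop_density": 400, "wealth": 0.9},
    {"name": "D", "road_density": 0.8, "pop_density": 40, "wealth": 0.1},
]
flagged = tag_incomplete(admin_units)
```

In this toy case only unit B combines few mapped roads with a dense, wealthy population, so it is the one a reviewer would check against imagery.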
Exploratory method
Results: Some patterns can be spotted, but the two methods generally give
different results. The number of erroneous predictions is high.
Comparative method
Results: Only in the case of Ghana does OSM represent an improvement over gROADS v1.
Results: OSM mapping is concentrated in urban regions.
Only for Ghana does OSM represent a significant improvement.
Method 1: Compare the total length of equivalent road networks
in OSM and gROADS.
Country    OSM road network length (km)    gROADS v1 road network length (km)
Liberia    32’457                          25’205
Guinea     101’733                         100’401
Ghana      57’613                          22’752
Senegal    41’622                          71’375
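As a worked illustration of Method 1, the sketch below computes the ratio of OSM to gROADS v1 network length from the figures in the table above. The variable names are hypothetical, and the ratio is only one possible summary; it is not presented in the poster and is not the ingestion decision itself.

```python
# Total network lengths in km, copied from the table above.
lengths_km = {
    "Liberia": {"osm": 32457, "groads": 25205},
    "Guinea": {"osm": 101733, "groads": 100401},
    "Ghana": {"osm": 57613, "groads": 22752},
    "Senegal": {"osm": 41622, "groads": 71375},
}

# Ratio > 1 means the OSM network is longer than the gROADS v1 network.
length_ratio = {
    country: values["osm"] / values["groads"]
    for country, values in lengths_km.items()
}

for country, ratio in length_ratio.items():
    print(f"{country}: OSM / gROADS length ratio = {ratio:.2f}")
```

A ratio above 1 only shows that OSM contains more road length overall; it says nothing about where those roads are, which is why a comparison at admin unit level is also needed.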
Method 2: Compare at admin unit level the road density (road
length / area) of equivalent road networks in OSM and gROADS.
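The admin-unit comparison of Method 2 can be sketched as below. The input layout (OSM length, gROADS length and area per unit), the function names and the toy values are assumptions for illustration, not the study's data or code.

```python
def road_density(length_km, area_km2):
    """Road density in km of road per square km of admin unit."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return length_km / area_km2


def density_difference(units):
    """Return {unit name: OSM density minus gROADS density}.

    units: dict mapping a unit name to a tuple
    (osm_length_km, groads_length_km, area_km2), a hypothetical layout.
    """
    return {
        name: road_density(osm_km, area) - road_density(groads_km, area)
        for name, (osm_km, groads_km, area) in units.items()
    }


# Toy admin units (made-up values).
toy_units = {
    "unit_a": (120.0, 80.0, 400.0),
    "unit_b": (30.0, 60.0, 300.0),
}
differences = density_difference(toy_units)
# Units where OSM maps more road per unit area than gROADS v1.
osm_denser = [name for name, diff in differences.items() if diff > 0]
```

Mapping the sign of the difference per admin unit shows where OSM adds roads and where it lags behind gROADS v1, which a single national total cannot reveal.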
[Figure panels: Discrete classification | Regression model]