4B_1_How many volunteers does it take to map an area well

  • The higher-level dataset used was OS MasterMap. The Ordnance Survey's MasterMap is a database that records every fixed feature of Great Britain larger than a few metres in one continuous digital map. It is a framework on which future OS products will be based. The ITN layer is one of four layers, and is the layer used for this project. ITN data was obtained from Edina and OSM data from CloudMade. The ITN data was obtained in GML format and then converted to shp file format; the required OSM data was downloaded as a shp file.
  • 25 sq km grid squares, based on the OS 1:10,000 grid tiles. The tiles were chosen to cover a range of London (North, South, East and West), two close to the centre and two further out, so that a good sample was obtained.
  • Requires the use of a reference source (ITN) and a test source (OSM). Here we have an example of an ITN road feature and the corresponding OSM road feature. ITN is the higher-level dataset, and therefore the ITN road feature is considered to be the actual centreline of the road. A buffer of width x is created around the ITN feature. The proportion of the OSM road that lies within the buffer is then calculated. This analysis was carried out for every A-road, B-road and motorway feature in the datasets.
  • The results are very high, indicating that the positional accuracy of OSM is very good. In particular, the South London tile had an average of 92.62% overlap for A-roads, and the North/Central London tile had 81.46% overlap for B-roads. There was only one motorway segment (very high overlap).
  • Results maps for each of the four test regions. The distribution of results in the histogram is clearly illustrated by the results maps: anything above 90% overlap is in red, between 80-90% in orange, and between 70-80% in yellow.
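The buffer comparison described in the notes (a buffer of width x around the reference feature, then the proportion of the test feature that falls inside it) can be sketched roughly as follows. This is a minimal illustration, not the code used in the study. It assumes planar coordinates in metres, and it approximates the overlap by cutting the test line into short pieces and checking the midpoint of each piece against the buffer. The function names, the `step` parameter and the sample coordinates are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Shortest distance from point p to a polyline given as [(x, y), ...]."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def buffer_overlap(reference, test, width, step=1.0):
    """Approximate share (0..1) of `test` length lying within `width` of `reference`."""
    inside = 0.0
    total = 0.0
    for (ax, ay), (bx, by) in zip(test, test[1:]):
        seg_len = math.hypot(bx - ax, by - ay)
        if seg_len == 0.0:
            continue
        pieces = max(1, math.ceil(seg_len / step))
        piece_len = seg_len / pieces
        for i in range(pieces):
            t = (i + 0.5) / pieces  # midpoint of the i-th piece
            mid = (ax + t * (bx - ax), ay + t * (by - ay))
            if distance_to_line(mid, reference) <= width:
                inside += piece_len
        total += seg_len
    return inside / total if total else 0.0


# Hypothetical example: a straight reference road and a test road that drifts away.
reference = [(0.0, 0.0), (1000.0, 0.0)]
test_line = [(0.0, 5.0), (500.0, 5.0), (1000.0, 45.0)]
share = buffer_overlap(reference, test_line, width=20.0)
print(f"{share:.1%} of the test line lies within the 20 m buffer")
```

A GIS library would normally build the buffer polygon and intersect it with the test line exactly; the sampling approach here is only meant to make the idea concrete without extra dependencies.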

    1. How Many Volunteers Does It Take To Map An Area Well?
       Dr Muki Haklay
       Department of Civil, Environmental and Geomatic Engineering, UCL
       m.haklay@ucl.ac.uk
       Aamer Ather (M.Eng 2009), Sofia Basiouka (MSc GIS 2009) and Naureen Zulfiqar (M.Eng 2008)
       Ordnance Survey data was kindly provided by the Ordnance Survey research unit.
       OSM data was provided by GeoFabrik & CloudMade
    2. Outline
       A bit about quality of geographical information
       Evaluation of OSM with Meridian data set
       Evaluation of OSM with MasterMap
       Linus’ Law – more users: higher quality?
    3. The quality issue
       How good is the data?
       First question: good for what? Subjective quality – fitness for purpose/use
       Second question: how to measure? Objective quality – but need to evaluate it in light of the first question
    4. The quality issue
       How good is the data?
       Positional accuracy – the position of features or geographic objects in either two or three dimensions
       Temporal accuracy – how up to date is the data? Does it present the existing situation and when will it be updated?
       Thematic/attribute accuracy – for quantitative attributes (width) and qualitative attributes (geographic names)
       Completeness – the presence and absence of objects in a dataset at a particular point in time
       Logical consistency – adherence to the logical rules of the data structure, attribution and relationships
    5. The ‘problem’
       “We know little about the people that collect it, their skills, knowledge or patterns of data collection”
       “Loose coordination and no top-down quality assurance processes – can’t produce good data”
       “It is not complete and comprehensive – there are white areas”
    6. Coverage and completeness
    7. Coverage and completeness
    8. Completeness – difference by user?
    9. Patterns of collaboration
    10. [Slide with no text]
    11. Users
       Limited ‘on the ground’ collaboration. Important as this can be the main source of quality assurance - ‘Given enough eyeballs, all bugs are shallow’ (Raymond, 2001)
       Translated to VGI, it might mean: “The more users there are per area, the better the positional and attribute quality”
       But does Linus’ law apply to OSM (and to VGI)?
    12. Accuracy and Completeness – Study I
       Comparing OSM to OS Meridian 2 roads layer
       Meridian 2 - Motorways, major and minor roads are... Complex junctions are collapsed to single nodes and multi-carriageways to single links... some minor roads and cul-de-sacs less than 200m are not represented... Private roads and tracks are not included...
       Nodes are derived from 1:1,250-1:2,500 mapping, with 20m filter around centre line generalisation
    13. Positional Accuracy
       Meridian 2 and OSM – Motorway comparison
    14. Goodchild and Hunter (1997), Hunter (1999) method
       Assuming that one dataset is of higher quality
       Create a buffer of known width around that dataset
       Calculate the percentage of the evaluated dataset that falls within the buffer
    15. Motorway comparison
       Buffer of 20m
       Average of 80% - ranging from 59.81% to 88.80%
    16. Comparison II – Ordnance Survey Master Map
       Data used for comparison: OS MasterMap Integrated Transport Network (ITN) layer
       ITN consists of road network information
       The most accurate and up-to-date geographic reference for Great Britain’s road structure
       Any major real world changes are updated within 6 months
       Used for numerous applications, e.g. transport management systems, road routing, emergency planning...
    17. Four test locations chosen:
       TQ28se
       TQ38se
       TQ17ne
       TQ37sw
    18. Buffer analysis – again based on Goodchild and Hunter (1997) buffer comparison technique
       Comparison methodology
       [Diagram labels: buffer width (X), ITN, OSM]
    19. Buffer overlap results:
       109 roads examined covering over 328 km
       Results of Master Map comparison
    20. TQ38se (East London)
       TQ28se (North/Central London)
    21. TQ37sw (South London)
       TQ17ne (West London)
    22. [Slide with no text]
    23. Quality not linked to length
    24. Completeness – bulk method
       Assumption: as Meridian 2 is generalised, for each sq km:
       If Total length(OSM roads) > Total length(Meridian 2 roads)
       Then OSM is more complete than Meridian 2
       The comparison can also include attributes, by testing for the number of objects with a complete set of values
    25. Methodology
       [Diagram with steps numbered 1 to 5]
    26. Change in completeness Mar 2008 – Mar 2010
    27. England – March 2008
    28. England – March 2009
    29. England – October 2009
    30. England – March 2010
    31. Completeness with attributes
       The test for completeness with attributes checks that road and street names have been completed
       Until the release of Ordnance Survey data on 1st April 2010, this was a good indication of a ground survey of an area
    32. England – March 2008
    33. England – March 2009
    34. England – October 2009
    35. England – March 2010
    36. [Slide with no text]
    37. Linus’ law and OSM
    38. Conclusions
       OSM quality is high – and it is assumed that the quality is coming from aerial imagery
       Linus’ Law does not seem to apply in a straightforward manner – at least not from 5 and above
       More research is required for lower numbers of participants and different quality of imagery
    39. Further reading
       Haklay, M., 2008, How good is OpenStreetMap information? A comparative study of OpenStreetMap and Ordnance Survey datasets for London and the rest of England, submitted to Environment and Planning B.
       Haklay, M. and Weber, P., 2008, OpenStreetMap – User Generated Street Map, IEEE Pervasive Computing.
       Haklay, M., Singleton, A., and Parker, C., 2008, Web mapping 2.0: the Neogeography of the Geoweb, Geography Compass.
       Haklay, M., 2008, Open Knowledge – learning from environmental information, presented at the Open Knowledge Conference (OKCon) 2008, London, 15 March.
       Haklay, M., 2007, OSM and the public - what barriers need to be crossed? Presented at State of the Map conference, Manchester, UK, 14-15 July.
       To get a copy, write to m.haklay@ucl.ac.uk, or get them on povesham.wordpress.com
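The bulk completeness test from slide 24 (compare total road length per square kilometre) and the attribute variant from slide 31 (count only roads whose name has been filled in) can be sketched as below. This is an illustrative sketch under stated assumptions, not the study's implementation: roads are assumed to be `(name, coordinates)` pairs in planar metres, length is assigned to 1 km cells by sampling piece midpoints, and every name here (`length_per_cell`, `compare_completeness`, `named_only`, the sample roads) is hypothetical. Restricting only the OSM side to named roads is also an assumption about how the attribute test is applied.

```python
import math
from collections import defaultdict

CELL = 1000.0  # grid cell size in metres (1 sq km cells)


def length_per_cell(roads, named_only=False, step=10.0):
    """Total road length per grid cell.

    `roads` is a list of (name, [(x, y), ...]) pairs; `name` may be None or "".
    Each segment is cut into pieces of about `step` metres, and each piece is
    credited to the cell that contains its midpoint.
    """
    totals = defaultdict(float)
    for name, line in roads:
        if named_only and not name:
            continue
        for (ax, ay), (bx, by) in zip(line, line[1:]):
            seg_len = math.hypot(bx - ax, by - ay)
            if seg_len == 0.0:
                continue
            pieces = max(1, math.ceil(seg_len / step))
            for i in range(pieces):
                t = (i + 0.5) / pieces
                x = ax + t * (bx - ax)
                y = ay + t * (by - ay)
                cell = (math.floor(x / CELL), math.floor(y / CELL))
                totals[cell] += seg_len / pieces
    return dict(totals)


def compare_completeness(osm_roads, meridian_roads, named_only=False):
    """Per-cell difference: length(OSM) minus length(Meridian 2).

    A positive value means OSM has more road length than Meridian 2 in that
    cell. With named_only=True, only OSM roads that carry a name are counted.
    """
    osm = length_per_cell(osm_roads, named_only=named_only)
    meridian = length_per_cell(meridian_roads)
    cells = set(osm) | set(meridian)
    return {c: osm.get(c, 0.0) - meridian.get(c, 0.0) for c in cells}


# Hypothetical sample data covering two neighbouring cells.
osm_roads = [
    ("High Street", [(0.0, 500.0), (1000.0, 500.0)]),
    (None, [(1200.0, 100.0), (1200.0, 900.0)]),  # traced but not yet named
]
meridian_roads = [
    ("A1", [(0.0, 500.0), (900.0, 500.0)]),
    ("B2", [(1100.0, 0.0), (1100.0, 1000.0)]),
]

by_length = compare_completeness(osm_roads, meridian_roads)
by_named_length = compare_completeness(osm_roads, meridian_roads, named_only=True)
for cell in sorted(by_length):
    print(cell, round(by_length[cell]), round(by_named_length[cell]))
```

Mapping the per-cell differences for successive OSM snapshots would give the kind of change-over-time picture shown on the England slides.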
