An authoritative crowd? #geomob
Short presentation on quality assurance and improvement of OpenStreetMap for @Geomob London on 24-11-2012

Can a quality assured product increase user confidence within the "professional" community and encourage increased contribution to fill in the gaps?



  • Note the emphasis on “how”. Can a crowdsourced geospatial database be considered authoritative? Indeed, can any dataset that describes the real world be considered authoritative, whether crowdsourced or “professionally compiled”? Who determines authority? What constitutes authority in geodata? Does authority matter and, if it does, why? What actions or processes might contribute to promoting crowdsourced geodata to a position of authority? I want to consider the nature of authority in geospatial data and whether it might be possible for a crowdsourced dataset such as OpenStreetMap (although these observations could apply to any crowdsourced geodata) to become authoritative or a primary reference source. These are some early musings on the topic, with more to follow. I also want to introduce you to a new project called OSM-GB, which might make a contribution to increasing the coverage and authority of OpenStreetMap for GB users.
  • If you are impatient, let me give you the executive summary: in a literal sense, a crowdsourced dataset is unlikely ever to be granted legal status as authoritative (e.g. for conveyancing), but that does not mean that it cannot attain a level of acceptance that is close to authoritative. In practice it may be more accurate, complete and up to date than data that has a formal stamp of authority.
  • Defining Authority. Let me start by considering what authority means in terms of geodata. The Oxford English Dictionary, which in itself would be considered an authority on the English language, defines “authoritative” as: CLICK 1. able to be trusted as being accurate or true; reliable. CLICK 2. commanding and self-confident; likely to be respected and obeyed. Several different concepts are merged in these definitions: “accurate”, “true” and “reliable” all seem to have an absolute quality, while “best of its kind” and “unlikely” or “likely” are relative terms. There are also differing ways that authority can be manifested: reliable, commanding and self-confident. Does a dataset become authoritative if I assert its authority with self-confidence? Perhaps the different aspects of the definition highlight the challenge of determining what constitutes authority in geodata: is it absolute or relative, and is authority granted, assumed or objectively defined?
  • Let’s look at some geodata and consider what we think about that data in terms of authority. CLICK: this is OS OpenData, authoritative but not very detailed (of course there is much more detail available if you pay). CLICK: this is Google, more detail and some nice buildings. CLICK: OSM, much more detail and attribution, but is it accurate, can we trust it? Let’s look at the 3 main criteria that we would want to assess data against in terms of authority.
  • In the context of authoritative geodata I suggest that we would expect it to be geometrically and positionally accurate (within the scale/specification of capture)
  • Complete: no features or objects within the scope of the dataset are omitted. OS has an SLA to capture 99.6% of real world change within 6 months
  • Correctly attributed (features are correctly named and classified according to a pre-determined but inevitably evolving scheme or taxonomy)
  • Authority is more than accuracy. Accuracy alone does not guarantee authority. Accuracy and completeness are not the sole determinants of authority: change detection, capture standards and quality assurance processes will all affect our willingness to “trust” or “respect” a dataset. Authority implies having some visible quality specifications and processes for testing the data against those specs. It is important to distinguish between data that has authority and data that is “accurate” or deemed to be fit for purpose. The latter may be good enough, or even very good, but still may not have the implied safety/reliability seal that comes with being classed as authoritative. The opposite could also be true: data that has some official seal of authority may not be accurate, complete and current.
  • We have seen that authority is about trust and respect in addition to accuracy, and we know that even OS is not perfect, so what other data might gain our trust and respect? Let’s turn our attention to OSM. Is it possible for a crowdsourced dataset such as OSM to be “trusted as being accurate or true” or “considered to be the best of its kind and unlikely to be improved upon”? Let’s consider the 3 criteria for authoritative geodata outlined above.
    The challenges
    1. Geometrically and positionally accurate. OSM data is captured by a combination of handheld GPS surveys and “armchair surveys” tracing over aerial imagery donated by Yahoo or Bing (more up to date). In principle it should be possible to capture data to about 5m accuracy or slightly better using these tools. Whether this is sufficient to be relied upon will depend upon the proposed use of the data.
    2. Complete, no features or objects within the scope of the dataset are omitted. The community based approach to data capture does not allow for volunteers to be directed to cover specific areas in a planned manner, although over time it does appear that completeness is improving. A lack of completeness will limit the use of the data in applications which require broad cover; however, that might not be a concern to an organisation wishing to build an application for, say, Greater London only.
    3. Correctly attributed and classified. Attribution and classification are more dependent on “on the ground” observations than the other criteria above. Consequently the level of attribution and classification has lagged behind the simple capture of geometry. Furthermore, the classification model within OSM, known as tags, can be confusing for new contributors, resulting in some potential errors or omissions in classification.
    CLICK But there are no formal QA processes. Does that mean we cannot trust OSM?
  • Muki Haklay has undertaken several quantitative studies of the accuracy and completeness of OSM data, which suggest that the data that has been captured is accurate but not yet complete or fully attributed. “By the end of March 2010, OpenStreetMap coverage of England grown to 69.8% from 51.2% a year ago. CLICK When attribute information is taken into account, the coverage grown to 24.3% from 14.7% a year ago.” CLICK Although there is a continually improving trend in completeness and attribution, it would appear that the demographics and geographic distribution of volunteers may prevent the map ever having full, or even close to full, attribution and GB cover. See Haklay, M., 2010, “How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets”.
  • This question needs to be considered within the context of the constraints of an informal organisation of volunteer contributors. To become a reliable and trusted source of information within GB, OSM would need to broaden the range of contributors and identify the means to motivate contributors to focus on completing the map to a consistent level for the whole of GB. It is unclear whether this is something that the current mapping community is able to achieve, let alone wishes to do.
    Accuracy and attribution
    There is a wide range of quality evaluation tools and services developed by the OSM community for bug reporting, error detection, monitoring and analysing tags. Specific tools range from checking network continuity and analysing relationships to visualising turn restrictions and identifying duplicate nodes. There are also tools to mark potential errors and analyse data by contributor, and many that are country specific. However, there is no mandatory set of processes that data pass through prior to release, and it is difficult to determine the extent to which these tools are used by volunteers. The OSM philosophy on quality can perhaps be summarised as “the wisdom of the crowd will ultimately correct any errors or omissions”, whether that is through observation or through the use of the tools available. If a combination of automated QA tools were applied in a consistent process to OSM edits, then potential errors could be flagged, in some way prioritised for further examination, and either corrected or verified.
    Completeness
    Muki Haklay has identified that the level of completeness of OSM is greater in urban areas and that it also inversely correlates with the level of deprivation within an area. “… the analysis of OSM shows is that deprived communities and rural areas are not well covered, especially when attributes are considered” To rectify these biases OSM would need to find ways either to encourage existing volunteer contributors to step outside of their current areas of activity or to attract new contributors in these under-mapped areas. See Haklay & Ellul, “Completeness in volunteered geographical information”.
  • Users as producers: explain the shift from producer centric community. There is no formal mission statement or outline of quality and coverage objectives for OSM; however, this description on the OpenStreetMap Foundation’s web site is probably as close as we will get: “OpenStreetMap is an open initiative to create and provide free geographic data such as street maps to anyone who wants them. It is a massive online collaboration, with hundreds of thousands of registered users worldwide.” It is focussed on producing maps that are available without charge or constraint and, interestingly, refers to its contributors as “users” rather than producers. I would say that it is producer centric, not user centric. The direction of OSM is largely driven by an active community of volunteers who have taken on the mission to map the world for a variety of reasons, which range from the producer centric “because we can” or “because it is fun” to more commercial or humanitarian motivations. The organisation has been highly producer centric and has, up till now, resisted the influence of large potential users of its data (corporates or governments). A recent blog post by Martijn van Exel makes the case for OSM to focus on “warm” geography rather than seeking to emulate what he describes as the “cold” geography of national mapping agencies and navigation data suppliers. “… the extremely high churn rate that OpenStreetMap is coping with: less than one tenth of everyone who ever created an OpenStreetMap account continue to become active contributors. … OpenStreetMap needs those flesh and blood contributors, because it is ‘Warm Geography’ at its core: real people mapping what is important to them, as opposed to the ‘Cold Geography’ of the thematic geodata churned out by the national mapping agencies and commercial street data providers; data that is governed by volumes of specifications and elaborate QA rules.” This is one contributor’s view, but in my opinion it will resonate with many current contributors. If the current contributors do not want to create data that conforms to a specification, then OSM is unlikely to become a trusted and reliable source of geodata. Perhaps by attracting potential users of OSM who are concerned with that “cold” geography to become contributors, the challenges of a consistent approach to QA and a more structured approach to completeness can be resolved. OSM-GB is one possible way of attracting such users.
  • OSM-GB is a project being initiated at the Centre for Geospatial Sciences (CGS) at Nottingham University. It is a collaboration between CGS and 1Spatial that will apply 1Spatial’s rules based geodata quality tools to a GB extract of OSM. The resulting “improved” and structured data will be projected into British National Grid (BNG) and served as an OGC Web Map Service and Web Feature Service. For the duration of this project (approximately 15 months) these services will be available at no charge.
  • The project has 2 main strands of research:
    1. Applying rules based quality improvement processes to OSM to identify possible errors and, after some experiment and refining of the rules, potentially to correct some geometric and attribute errors automatically. The “improved” dataset will be available for download from the OSM-GB web site and could be offered back to the main OSM database (probably as a basis for further inspection prior to incorporation).
    2. Making the “improved” data available via standards based web services, in the hope that public sector users in both central and local government will be encouraged to experiment with OSM and identify potential use cases for OSM that are not met by the geodata currently available through the PSMA. A number of organisations have already confirmed interest in accessing OSM-GB.
    The objective of making data available to so called professional users, whose expectations have been set by using authoritative geodata, is to encourage them to become contributors to OSM, motivated by the potential use cases identified, the flexibility of the range of data that can be captured and the data model. These users will often have a great deal of local knowledge (particularly those working within local government) that could help to address the challenges of completeness detailed above. In the longer term it may even be possible to encourage these users to incorporate contributing to OSM as part of their routine workflows.
  • Blame: one of the criticisms most frequently levelled at OSM. Responsibility for the quality of OSM is often raised as a concern by potential users (much less so by people actually using the data): “who would I blame if something goes wrong?” The answer inevitably is no one. However, it should be noted that most data providers, including OS, do not warrant that their data is accurate or even fit for purpose and exclude any liability for errors. CLICK For example the PSMA says: “9.4 Ordnance Survey excludes to the fullest extent permissible by law all warranties, conditions, representations or terms, whether implied by, or expressed in, common law or statute including, but not limited to, any regarding the accuracy, compatibility, fitness for purpose, performance, satisfactory quality or use of the Licensed Data.”
  • Wrapping up. OSM is unlikely ever to be considered authoritative within a legal context. CLICK But I hope that I have shown how, in the more conversational sense of the term authoritative, OSM data could become an alternative trusted and reliable source of geodata for “professional users”, offering a wide range of content which differs from and complements other sources. For this level of trust to be achieved, a more formal approach to quality assurance and a more structured and consistent approach to data capture (content, geography and attribution) will be needed. The current OSM contributors may not choose to move in this direction, but projects like OSM-GB may attract a new group of user/contributors who recognise the opportunities that OSM offers them and their organisations and who are able to help improve quality and extend coverage and attribution.
  • Informal mapping can communicate a lot of local knowledge without being accurate. CLICK Does it really matter? So remember: CLICK
  • Rules based quality improvement combined with OpenStreetMap could produce a trusted dataset that encourages users to become contributors – a winning team! Don’t forget to look at
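
The notes above suggest running automated QA checks over OSM edits in a consistent process, flagging potential errors and prioritising them for review. As a rough illustration only, the Python sketch below applies two made-up rules (duplicate nodes, and amenities with no name) to a simplified node structure. The `Node` class, the rule names and the priorities are all assumptions for the example; they are not 1Spatial's rules engine or any existing OSM tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Simplified stand-in for an OSM node (hypothetical structure)."""
    node_id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


def find_duplicate_nodes(nodes, precision=6):
    """Return groups of node ids that share coordinates after rounding."""
    by_position = defaultdict(list)
    for node in nodes:
        key = (round(node.lat, precision), round(node.lon, precision))
        by_position[key].append(node.node_id)
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


def find_missing_names(nodes):
    """Return ids of nodes tagged as an amenity but lacking a name tag."""
    return [n.node_id for n in nodes if "amenity" in n.tags and "name" not in n.tags]


def run_checks(nodes):
    """Apply both illustrative rules and return issues, highest priority first."""
    report = []
    for group in find_duplicate_nodes(nodes):
        report.append({"rule": "duplicate_node", "priority": 1, "node_ids": group})
    for node_id in find_missing_names(nodes):
        report.append({"rule": "missing_name", "priority": 2, "node_ids": [node_id]})
    return sorted(report, key=lambda issue: issue["priority"])


# Made-up sample data: nodes 1 and 2 overlap, node 3 has no name.
sample = [
    Node(1, 51.5074, -0.1278, {"amenity": "pub", "name": "The Example Arms"}),
    Node(2, 51.5074, -0.1278),
    Node(3, 51.5150, -0.1410, {"amenity": "cafe"}),
]

for issue in run_checks(sample):
    print(issue)
```

A report like this only flags candidates; as the notes say, each item would still need to be corrected or verified by a contributor.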

An authoritative crowd? #geomob Presentation Transcript

  • Can the crowd be authoritative? Steven Feldman
  • A quick judgement
  • Whose Authority?
    able to be trusted as being accurate or true; reliable: “clear, authoritative information and advice”; “an authoritative source”
    (of a text) considered to be the best of its kind and unlikely to be improved upon: “this is likely to become the authoritative study of the subject”
    commanding and self-confident; likely to be respected and obeyed: “his voice was calm and authoritative”
    proceeding from an official source and requiring compliance or obedience: “authoritative directives”
  • What makes geodata authoritative? © OpenStreetMap contributors, CC-BY-SA
  • Accurate
  • Complete
  • Correctly attributed
  • Authority > Accuracy
  • How accurate is OSM in GB? No formal and consistent QA processes applied!
  • Muki says … In 2010 coverage at 69.8% but attribution 24.3%. “the analysis of OSM shows that deprived communities and rural areas are not well covered, especially when attributes are considered”
  • Could OSM be an authoritative source for GB?
  • Users vs Producers
  • OSM-GB: Projected to OSGB; Rules based quality improved; WMS, WFS and Tile Service; Data available for download/reuse on OSM terms
  • A step through the OSM-GB workflow (diagram): OSM Master Database; OSM-GB Raw Database; OSM-GB QA/QI DB; 1Spatial Rules Engine; Quality Rules; OSM-GB QA Report for potential improvement actions; Rules based improvement action; Download Data; WMS; WFS; Tile Svc
  • Some Research Questions: Will automated rules based processes improve quality? Will formal QA increase confidence/authority? How can improved QA & confidence increase contribution? Can “professional” contributors be “motivated” to fill in gaps? Are there use cases that will support a sustainable commercial model for OSM-GB?
  • There is no one to blame
  • Can the crowd be authoritative? Yes and No
  • A final thought: Does it matter? Maps don’t have to be authoritative, they can just be fun and useful. Martin Usborne
  • Thank You
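
The OSM-GB slides describe serving the data as an OGC Web Map Service projected to OSGB (British National Grid, EPSG:27700). As a minimal sketch of what a client request might look like, the Python below builds a standard WMS 1.3.0 `GetMap` URL. The endpoint `https://example.org/osm-gb/wms` and the layer name `osm_gb_roads` are placeholders invented for illustration; the presentation does not give the real service address or layer names.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL for a bounding box in British National Grid.

    bbox is (min_easting, min_northing, max_easting, max_northing) in metres.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",              # empty value requests the server's default style
        "CRS": "EPSG:27700",       # British National Grid
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name (assumptions, not the real OSM-GB service).
url = build_getmap_url(
    "https://example.org/osm-gb/wms",
    "osm_gb_roads",
    (529000, 179000, 531000, 181000),  # a 2 km square in central London
)
print(url)
```

Any standards based GIS client would normally construct this request itself; the point is only that data served this way can be consumed with no OSM-specific tooling.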