How authoritative can the crowd be?


Can a crowdsourced geospatial database be considered authoritative? Indeed, can any dataset that describes the real world be considered authoritative, whether crowdsourced or “professionally compiled”? Who determines authority? What constitutes authority in geodata? Does authority matter and, if it does, why? What actions or processes might contribute to promoting crowdsourced geodata to a position of authority?
I want to consider the nature of authority in geospatial data and whether it might be possible for a crowdsourced dataset such as OpenStreetMap (although these observations could apply to any crowdsourced geodata) to become authoritative or a primary reference source.

  • Note the emphasis on “how”.
    Can a crowdsourced geospatial database be considered authoritative? Indeed, can any dataset that describes the real world be considered authoritative, whether crowdsourced or “professionally compiled”? Who determines authority? What constitutes authority in geodata? Does authority matter and, if it does, why? What actions or processes might contribute to promoting crowdsourced geodata to a position of authority?
    I want to consider the nature of authority in geospatial data and whether it might be possible for a crowdsourced dataset such as OpenStreetMap (although these observations could apply to any crowdsourced geodata) to become authoritative or a primary reference source. These are some early musings on the topic, with more to follow.
    I also want to introduce you to a new project called OSM-GB, which might make a contribution to increasing the coverage and authority of OpenStreetMap for GB users.
  • If you are impatient, let me give you the executive summary:
    In a literal sense a crowdsourced dataset is unlikely ever to be granted legal status as authoritative (e.g. for conveyancing), but that does not mean that it cannot attain a level of acceptance that is close to authoritative, and it may in practice be more accurate, complete and up to date than data that has a formal stamp of authority.
  • Defining Authority
    Let me start by considering what authority means in terms of geodata.
    The Oxford English Dictionary, which would itself be considered an authority on the English language, defines “authoritative” as:
    [CLICK] 1. able to be trusted as being accurate or true; reliable; (of a text) considered to be the best of its kind and unlikely to be improved upon
    [CLICK] 2. commanding and self-confident; likely to be respected and obeyed
    Several different concepts are merged in these definitions: accurate, true and reliable all seem to have an absolute quality, while “best of its kind” and “unlikely” or “likely” are relative terms. There are also differing ways that authority can be manifested: reliable, commanding and self-confident. Does a dataset become authoritative if I assert its authority with self-confidence? Perhaps the different aspects of the definition highlight the challenge of determining what constitutes authority in geodata: is it absolute or relative, and is authority granted, assumed or objectively defined?
    See http://oxforddictionaries.com/definition/authoritative
  • Wikipedia is often cited as the foremost example of a crowdsourced dataset and probably has the highest level of recognition and acceptance of any crowdsourced project. Most people are aware that not all of the content within Wikipedia is absolutely accurate: some of it may be opinion masquerading as fact and some is certainly not written by “recognised authorities”. Notwithstanding these known limitations, Wikipedia is widely used, quoted and trusted. Encyclopaedia Britannica, which was until recently considered the authoritative encyclopaedic reference source, can also be subject to error and author bias.
    In 2005 Nature undertook a comparison of the accuracy of scientific articles in Wikipedia and Britannica using independent reviewers and found that both contained errors, with Wikipedia having marginally more (3.86 errors per article compared with 2.93). Commenting on some of the errors identified within Britannica, the Wikipedia authors say:
    “These examples can serve as useful reminders of the fact that no encyclopedia can ever expect to be perfectly error-free (which is sometimes forgotten, especially when Wikipedia is compared to traditional encyclopedias), and as an illustration of the advantages of an editorial process where anybody can correct an error at any time.”
    See http://en.wikipedia.org/wiki/Wikipedia:External_peer_review#Nature for a summary of analyses of comparative accuracy of Wikipedia and Britannica
    See http://en.wikipedia.org/wiki/Wikipedia:Errors_in_the_Encyclop%C3%A6dia_Britannica_that_have_been_corrected_in_Wikipedia
  • In a 2008 judgement the 8th US Circuit Court of Appeals ruled that the Department of Homeland Security could not rely upon Wikipedia as a source in deciding whether to admit asylum seekers. The court went on to quote Wikipedia:
    “… The site acknowledges [that articles], ‘may become caught up in a heavily unbalanced viewpoint and can take some time – months perhaps – to regain a better-balanced consensus.’ As a consequence, Wikipedia observes, the website’s ‘radical openness means that any given article may be, at any given moment, in a bad state: for example, it could be in the middle of a large edit or it could have been recently vandalized’”
    United States Court of Appeals for the 8th Circuit No. 07-2276 http://www.ca8.uscourts.gov/opndir/08/08/072276P.pdf
  • So in the context of encyclopaedias it would appear that even the gold standard is far from perfect, but the nature of crowdsourcing and the continuous process of improvement and correction render Wikipedia unsuitable to be relied upon as an information source within a US court (it is perhaps worth noting that the US court did not suggest a more authoritative reference source as an acceptable alternative to Wikipedia). That said, many commentators have countered that the “wisdom of the crowd” [CLICK] ensures that errors are identified and rectified much more rapidly within Wikipedia than within a traditional printed encyclopaedia.
  • In the context of authoritative geodata I suggest that we would expect it to be geometrically and positionally accurate (within the scale/specification of capture)
  • Complete: no features or objects within the scope of the dataset are omitted
    OS has an SLA to capture 99.6% of real-world change within 6 months
  • Correctly attributed (features are correctly named and classified according to a pre-determined but inevitably evolving scheme or taxonomy)
  • Authority is more than accuracy
    Accuracy alone does not guarantee authority.
    Accuracy and completeness are not the sole determinants of authority: change detection, capture standards and processes, and quality assurance processes will all affect our willingness to “trust” or “respect” a dataset.
    Authority implies having some visible quality specifications and processes for testing the data against those specs.
    It is important to distinguish between data that has authority and data that is “accurate” or deemed to be fit for purpose; the latter may be good enough or even very good but still may not have the implied safety/reliability seal that comes with being classed as authoritative. The opposite could also be true: data that has some official seal of authority may not be accurate, complete and current.
  • There is no statutory instrument conferring authority on the OS, but I understand that the Land Registry does have guidance to use OS mapping as the basis of its records.
    In Great Britain, the Ordnance Survey has been designated by government as the National Mapping Agency (not the national mapping authority):
    “Ordnance Survey is the national mapping agency of Great Britain, collecting, maintaining, managing and distributing the definitive record of the features of the natural, built and planned environment, the definitive record of official boundaries and the record of such other national geographic datasets as required by government and the private sector.”
    “Ordnance Survey will work with and consult with others in the geographic information community to help determine and advise upon the standards and quality of its data in relation to present and future national needs. This data will provide the framework to which other geographical data in Britain is referenced.”
  • Clearly the OS is the authoritative source of geographic information, providing a “definitive record” of features and boundaries. OS data is the only basis for determining legal disputes about GB geography (e.g. land ownership, political and administrative boundaries) and it is most unlikely that our courts will accept any alternative reference source whilst OS has this status. Does this mean that OS is the sole authority in all other contexts and that no other data can be considered authoritative? I would suggest not, for several reasons:
    Other organisations could collect similar data to OS at the same or a higher standard of accuracy etc. Navteq and TeleAtlas would probably claim with justification that their navigation datasets contain more attribution (e.g. turn and height restrictions), which is maintained to a higher level of currency than OS.
    OS only captures a subset of geographic information, usually described as reference data. Other organisations may capture different information (e.g. Environment Agency, British Geological Survey).
    But who could be the arbiter of authority outside of the context of core reference data? In an academic context authority is granted following some process of peer review. Perhaps the crowd could determine the accuracy and completeness of alternative geodata sources through mass observations and determine the extent to which data could be relied upon?
  • An authority can be wrong
    What happens when OS omits data or makes a mistake? Even the current data capture SLA for the OS only seeks to record 99.6% of real-world change within 6 months, a target that is met or bettered. This implies that 0.4% omissions are acceptable, so what other tolerances in absolute quality might be acceptable in an authoritative dataset? Without doubt the authority of OS data is closely linked with the accuracy and detail of their maps and their data capture and QA processes, which are based on over 200 years of experience, state-of-the-art technology and 300 specialist surveyors. Whilst OS data is considered authoritative and a “definitive record”, it is still not absolutely correct or accurate at any point in time. QA processes tend to focus on what is included within a dataset rather than on omissions; inevitably the ultimate quality check on any dataset’s completeness will be its users’ local knowledge.
    So let me turn to a crowdsourced geodataset and how we might assess that under similar criteria.
    See Ordnance Survey Annual Report (Business Performance) http://www.ordnancesurvey.co.uk/oswebsite/docs/annual-reports/ordnance-survey-annual-report-and-accounts-2009-10.pdf
  • We have seen that authority is about trust and respect in addition to accuracy, and we know that even OS is not perfect, so what other data might gain our trust and respect?
    Let’s turn our attention to OSM.
    Is it possible for a crowdsourced dataset such as OSM to be “trusted as being accurate or true” or “considered to be the best of its kind and unlikely to be improved upon”? Let’s consider the 3 criteria for authoritative geodata outlined above.
    The challenges
    1. Geometrically and positionally accurate
    OSM data is captured by a combination of handheld GPS surveys and “armchair surveys” tracing over aerial imagery donated by Yahoo or Bing (more up to date). In principle it should be possible to capture data to about 5m accuracy or slightly better using these tools. Whether this is sufficient to be relied upon will depend upon the proposed use of the data.
    2. Complete, no features or objects within the scope of the dataset are omitted
    The community-based approach to data capture does not allow for volunteers to be directed to cover specific areas in a planned manner, although over time it does appear that completeness is improving. A lack of completeness will limit the use of the data in applications which require broad cover; however, that might not be a concern to an organisation wishing to build an application for, say, Greater London only.
    3. Correctly attributed and classified
    Attribution and classification are more dependent on “on the ground” observations than the other criteria above. Consequently the level of attribution and classification has lagged behind the simple capture of geometry. Furthermore, the classification model within OSM, known as tags, can be confusing for new contributors, resulting in some potential errors or omissions in classification.
    [CLICK] But there are no formal QA processes. Does that mean we cannot trust OSM?
  • Muki Haklay has undertaken several quantitative studies of the accuracy and completeness of OSM data which suggest that the data that has been captured is accurate but not yet complete or fully attributed.
    “By the end of March 2010, OpenStreetMap coverage of England grown to 69.8% from 51.2% a year ago. [CLICK] When attribute information is taken into account, the coverage grown to 24.3% from 14.7% a year ago.” [CLICK]
    Although there is a continually improving trend in completeness and attribution, it would appear that the demographics and geographic distribution of volunteers may prevent the map ever having full or even close to full attribution and GB cover.
    See Haklay, M., 2010, “How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets” at http://povesham.wordpress.com/2010/08/03/how-good-is-vgi-a-comparative-study-of-openstreetmap-and-ordnance-survey-datasets-published/
    See Haklay at http://povesham.wordpress.com/2010/04/04/openstreetmap-completeness-evaluation-march-2010/
  • This question needs to be considered within the context of the constraints of an informal organisation of volunteer contributors. To become a reliable and trusted source of information within GB, OSM would need to broaden the range of contributors and identify the means to motivate contributors to focus on completing the map to a consistent level for the whole of GB. It is unclear whether this is something that the current mapping community is able to achieve, let alone wishes to do.
    Accuracy and attribution
    There is a wide range of quality evaluation tools and services developed by the OSM community for bug reporting, error detection, monitoring and analysing tags. Specific tools range from checking network continuity and analysing relationships to visualising turn restrictions and identifying duplicate nodes. There are also tools to mark potential errors and analyse data by contributor, and many that are country specific. However, there is no mandatory set of processes that data pass through prior to release and it is difficult to determine the extent to which these tools are used by volunteers.
    The OSM philosophy on quality can perhaps be summarised as “the wisdom of the crowd will ultimately correct any errors or omissions”, whether that is through observation or through the use of the tools available.
    If a combination of automated QA tools were applied in a consistent process to OSM edits, then potential errors could be flagged, in some way prioritised for further examination, and either corrected or verified.
    Completeness
    Muki Haklay has identified that the level of completeness of OSM is greater in urban areas and that it also inversely correlates with the level of deprivation within an area.
    “… the analysis of OSM shows is that deprived communities and rural areas are not well covered, especially when attributes are considered”
    To rectify these biases OSM would need to find ways either to encourage existing volunteer contributors to step outside of their current areas of activity or to attract new contributors in these under-mapped areas.
    See http://wiki.openstreetmap.org/wiki/List_of_OSM_based_Services#Quality_Assurance and http://wiki.openstreetmap.org/wiki/Quality_Assurance
    See Haklay & Ellul “Completeness in volunteered geographical information”
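The kind of automated check mentioned in these notes (flagging duplicate nodes for later inspection) can be illustrated with a minimal sketch. It is not how any particular OSM QA tool works: the function name `find_duplicate_nodes`, the `(node_id, lat, lon)` tuple layout and the rounding precision are all assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, precision=7):
    """Return groups of node ids that share the same rounded position.

    `nodes` is an iterable of (node_id, lat, lon) tuples; this layout is
    a hypothetical simplification, not the OSM data model.
    """
    by_position = defaultdict(list)
    for node_id, lat, lon in nodes:
        # Round so that coordinates differing only by float noise collide.
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    # Only positions occupied by more than one node are potential errors.
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


# Illustrative data only: nodes 1 and 2 sit on exactly the same position.
sample_nodes = [
    (1, 52.9519, -1.1870),
    (2, 52.9519, -1.1870),
    (3, 52.9530, -1.1900),
]
flagged = find_duplicate_nodes(sample_nodes)
print(flagged)
```

A check like this only flags candidates; whether two coincident nodes are really an error still needs a human or a further rule to decide, which is the "flag, prioritise, then correct or verify" process described above.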
  • Users as producers – explain the shift from producer centric community
    There is no formal mission statement or outline of quality and coverage objectives for OSM; however, this description on the OpenStreetMap Foundation’s web site is probably as close as we will get:
    “OpenStreetMap is an open initiative to create and provide free geographic data such as street maps to anyone who wants them. It is a massive online collaboration, with hundreds of thousands of registered users worldwide.”
    It is focussed on producing maps that are available without charge or constraint and, interestingly, refers to its contributors as “users” rather than producers. I would say that it is producer centric, not user centric.
    The direction of OSM is largely driven by an active community of volunteers who have taken on the mission to map the world for a variety of reasons, which range from the producer centric “because we can” or “because it is fun” to more commercial or humanitarian motivations. The organisation has been highly producer centric and has, up till now, resisted the influence of large potential users of its data (corporates or governments). A recent blog post by Martijn van Exel makes the case for OSM to focus on “warm” geography rather than seeking to emulate what he describes as the “cold” geography of national mapping agencies and navigation data suppliers.
    “… the extremely high churn rate that OpenStreetMap is coping with – less than one tenth of everyone who ever created an OpenStreetMap account continue to become active contributors. … OpenStreetMap needs those flesh and blood contributors, because it is ‘Warm Geography’ at its core: real people mapping what is important to them – as opposed to the ‘Cold Geography’ of the thematic geodata churned out by the national mapping agencies and commercial street data providers; data that is governed by volumes of specifications and elaborate QA rules.”
    This is one contributor’s view, but in my opinion it will resonate with many current contributors. If the current contributors do not want to create data that conforms to a specification, then OSM is unlikely to become a trusted and reliable source of geodata.
    Perhaps by attracting potential users of OSM who are concerned with that “cold” geography to become contributors, the challenges of a consistent approach to QA and a more structured approach to completeness can be resolved. OSM-GB is one possible way of attracting such users.
    See http://blog.osmfoundation.org/faq/
    See http://oegeo.wordpress.com/2011/08/30/openstreetmap-and-warm-vs-cold-geography-2/
  • OSM-GB is a project being initiated at the Centre for Geospatial Sciences (CGS) at Nottingham University. It is a collaboration between CGS and 1Spatial that will apply 1Spatial’s rules-based geodata quality tools to a GB extract of OSM. The resulting “improved” and structured data will be projected into BNG (British National Grid) and served as an OGC Web Map Service and Web Feature Service. For the duration of this project (approximately 15 months) these services will be available at no charge.
  • The project has 2 main strands of research:
    1. Applying rules-based quality improvement processes to OSM to identify possible errors and, after some experiment and refining of the rules, potentially to automatically correct some geometric and attribute errors. The “improved” dataset will be available for download from the OSM-GB web site and could be offered back to the main OSM database (probably as a basis for further inspection prior to incorporation).
    2. By making the “improved” data available via standards-based web services, it is hoped that public sector users in both central and local government will be encouraged to experiment with OSM and identify potential use cases for OSM that are not met by the geodata currently available through the PSMA. A number of organisations have already confirmed interest in accessing OSM-GB.
    The objective of making data available to so-called professional users, whose expectations have been set by using authoritative geodata, is to encourage them to become contributors to OSM, motivated by the potential use cases identified, the flexibility of the range of data that can be captured and the data model. These users will often have a great deal of local knowledge (particularly those working within local government) that could help to address the challenges of completeness detailed above. In the longer term it may even be possible to encourage these users to incorporate contributing to OSM as part of their routine workflows.
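To make “standards-based web services” concrete, here is a rough sketch of how a client might build an OGC WMS 1.3.0 GetMap request for data projected to British National Grid (EPSG:27700). The endpoint `https://example.org/osmgb/wms` and the layer name `osm_gb` are placeholders invented for illustration; the notes do not give the real OSM-GB service address or layer names. Only the request parameters themselves come from the WMS standard.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL for a bounding box in EPSG:27700.

    `bbox` is (min_easting, min_northing, max_easting, max_northing)
    in metres on the British National Grid.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "EPSG:27700",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer; the bounding box is an arbitrary
# 4 km square given in BNG metres.
url = build_getmap_url(
    "https://example.org/osmgb/wms",
    "osm_gb",
    (456000, 338000, 460000, 342000),
)
print(url)
```

Because the request is plain WMS, any desktop GIS or web mapping library that already consumes other WMS layers could in principle display such a service alongside them without custom code, which is the point of publishing through open standards.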
  • Blame – 1 of the most frequently levelled criticisms
    Responsibility for the quality of OSM is often raised as a concern by potential users (much less so by people actually using the data): “who would I blame if something goes wrong?” The answer inevitably is no one. However, it should be noted that most data providers, including OS, do not warrant that their data is accurate or even fit for purpose and exclude any liability for errors. [CLICK] For example, the PSMA says:
    “9.4 Ordnance Survey excludes to the fullest extent permissible by law all warranties, conditions, representations or terms, whether implied by, or expressed in, common law or statute including, but not limited to, any regarding the accuracy, compatibility, fitness for purpose, performance, satisfactory quality or use of the Licensed Data.”
  • Wrapping up
    OSM is unlikely ever to be considered authoritative within a legal context. [CLICK]
    I hope that I have shown how, in the more conversational sense of the term authoritative, OSM data could become an alternative trusted and reliable source of geodata, offering a wide range of content which differs from and complements other sources. For this level of trust to be achieved, a more formal approach to quality assurance and a more structured and consistent approach to data capture (content, geography and attribution) will be needed. The current OSM community may not choose to move in this direction, but projects like OSM-GB may attract a new group of user/contributors who recognise the opportunities that OSM offers them and their organisations and who are able to help improve quality and extend coverage and attribution.
  • Informal mapping can communicate a lot of local knowledge without being accurate. [CLICK]
    Does it really matter?
    So remember [CLICK]
  • Don’t forget to look at osmgb.org.uk

Transcript

  • 1. How authoritative can the crowd be?
    Steven Feldman
  • 2. A quick judgement
    http://www.flickr.com/photos/safari_vacation/5929769873/
  • 3. Whose Authority?
    able to be trusted as being accurate or true; reliable:
    “clear, authoritative information and advice”
    “an authoritative source”
    (of a text) considered to be the best of its kind and unlikely to be improved upon:
    “this is likely to become the authoritative study of the subject”
    commanding and self-confident; likely to be respected and obeyed:
    “his voice was calm and authoritative”
    proceeding from an official source and requiring compliance or obedience:
    “authoritative directives”
    http://www.flickr.com/photos/spunter/2907888414/
  • 4. http://www.flickr.com/photos/dorkomatic/5317150831/
    http://www.flickr.com/photos/topaz33/4735755952/
    A brief diversion into encyclopaedias
  • 5. The legal view
  • 6. http://www.flickr.com/photos/earthworm/189013050/
    Hmmmm
  • 7. What makes geodata authoritative?
    http://www.flickr.com/photos/rdmillar/1325739265/
  • 8. Complete
    http://www.flickr.com/photos/lwr/2113499196/
  • 9. Correctly attributed
    http://www.flickr.com/photos/scottmarkwell/2207949953/
  • 10. Authority > Accuracy
  • 11. The National Mapping Agency
    “Ordnance Survey is the national mapping agency of Great Britain, collecting, maintaining, managing and distributing the definitive record of the features of the natural, built and planned environment, the definitive record of official boundaries and the record of such other national geographic datasets as required by government and the private sector.
    … This data will provide the framework to which other geographical data in Britain is referenced.”
    http://www.flickr.com/photos/osmapping/5201291064/
  • 12. Could another map be authoritative?
    http://www.flickr.com/photos/harpreetsingh/48978787/
  • 13. Even the OS isn’t perfect
    http://www.flickr.com/photos/wwworks/2943810776/
  • 14. How accurate is OSM in GB?
    No formal and consistent QA processes applied!
  • 15. Muki says …
    the analysis of OSM shows that deprived communities and rural areas are not well covered, especially when attributes are considered
    In 2010 coverage at 69.8% but attribution 24.3%
    http://www.flickr.com/photos/chrisfleming/5942012099/
  • 16. Could OSM be an authoritative source for GB?
    http://www.flickr.com/photos/thorinside/194806347/
  • 17. Users vs Producers
  • 18. OSM-GB
    Tile Service
    WMS
    Projected to OSGB
    WFS
    Rules based quality improved
    Data available for download/reuse on OSM terms
    FREE!
  • 19. Some Research Questions
    Will automated rules based processes improve quality?
    Will formal QA increase confidence/authority?
    Will improved QA & confidence increase contribution?
    Can “professional” contributors be “motivated” to fill in gaps?
    Are there use cases that will support a sustainable commercial model for OSM-GB?
  • 20. There is no one to blame
    http://www.flickr.com/photos/a2gemma/1448178195/
  • 21. Can the crowd be authoritative?
    Yes and No
    http://www.flickr.com/photos/wallyg/160534588/
  • 22. Does it matter?
    A final thought
    Maps don’t have to be authoritative, they can just be fun and useful
    Martin Usborne http://londonist.com
  • 23. Thank You
    www.osmgb.org.uk
    www.knowhwerconsulting.co.uk