Data Quality and Neogeography


A review of the role played by User Generated Content in creating or augmenting spatial databases.

Published in: Technology
  • Data Quality and Neogeography

    1. 2. Before We Start <ul><li>I am not here to persuade you about the usefulness or limitations of Neogeography or User Generated Content </li></ul><ul><li>I am here to share my views on issues relating to the topic of spatial data quality and neogeography </li></ul><ul><li>Disclaimer - In general, my observations derive from my familiarity with mapping, navigation and local search </li></ul>
    2. 3. My Background <ul><li>PhD in Geography, specializing in Cartography </li></ul><ul><li>Attended AutoCarto 1 in 1974 (and gave the keynote in 2008) </li></ul><ul><li>Associate Professor of mapping and geography at SUNY Albany (1972–1985) </li></ul><ul><li>Associate at Spad Systems </li></ul><ul><li>Chief Cartographer, Chief Technologist and VP of BizDev for Rand McNally (1986-1999) </li></ul><ul><li>CTO and EVP of Engineering for go2 Systems (YP over cell phones) </li></ul><ul><li>Now run a consulting business focused on geospatial, especially local search, mapping and navigation applications </li></ul>
    3. 4. Data Quality and Neogeography Dr. Mike Dobson President TeleMapics LLC [email_address]
    4. 5. Spatial Data Quality? <ul><li>Overall concern regarding the “fitness” of data for a particular use </li></ul><ul><ul><li>Accuracy of position </li></ul></ul><ul><ul><ul><li>resolution </li></ul></ul></ul><ul><ul><li>Accuracy of Attribution </li></ul></ul><ul><ul><ul><li>Logical Consistency </li></ul></ul></ul><ul><ul><li>Completeness </li></ul></ul><ul><ul><ul><li>Including spatial coverage </li></ul></ul></ul><ul><ul><li>Temporal relevance </li></ul></ul><ul><ul><li>Metadata </li></ul></ul>
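One way to read "fitness for a particular use" is as a comparison between measured quality dimensions and the requirements of a given application. The sketch below is only an illustration of that idea: the field names, the requirement parameters, and the example thresholds are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    """Hypothetical summary of one dataset along the dimensions listed above."""
    positional_error_m: float   # accuracy of position (typical error, metres)
    attribute_accuracy: float   # fraction of attributes verified as correct
    completeness: float         # fraction of expected features present
    age_days: int               # temporal relevance (days since last update)


def unmet_requirements(report, max_error_m, min_attribute_accuracy,
                       min_completeness, max_age_days):
    """Return the names of the quality dimensions that fail a use's requirements."""
    failures = []
    if report.positional_error_m > max_error_m:
        failures.append("position")
    if report.attribute_accuracy < min_attribute_accuracy:
        failures.append("attribution")
    if report.completeness < min_completeness:
        failures.append("completeness")
    if report.age_days > max_age_days:
        failures.append("temporal relevance")
    return failures


# Illustrative values only: the same dataset judged against two different uses.
report = QualityReport(positional_error_m=15.0, attribute_accuracy=0.9,
                       completeness=0.8, age_days=400)
for_navigation = unmet_requirements(report, 10.0, 0.95, 0.9, 180)
for_overview_map = unmet_requirements(report, 50.0, 0.8, 0.7, 730)
```

The same report can fail every requirement for one use and pass all of them for another, which is the sense in which quality is tied to use rather than being a single absolute score.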
    5. 6. Spatial Data’s Emerging Popularity <ul><li>World of spatial data is exploding </li></ul><ul><ul><ul><li>Accessibility to spatial data increasing </li></ul></ul></ul><ul><ul><ul><li>Availability of spatial data increasing </li></ul></ul></ul><ul><ul><ul><li>Today’s online environment provides </li></ul></ul></ul><ul><ul><ul><ul><li>Easy-to-use tools for collecting spatial data </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Easy-to-use tools for analyzing spatial data </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Easy-to-use tools for presenting spatial data </li></ul></ul></ul></ul>
    6. 7. Why Is This of Concern? <ul><li>The quality of spatial data mediates the success of communicating spatial concepts </li></ul><ul><ul><li>Could this explosive growth have an influence on the quality of spatial data? </li></ul></ul>
    7. 8. Why Data Quality Is Key
    8. 9. No Integrity!
    9. 10. Neogeography <ul><li>Neogeography </li></ul><ul><ul><li>“new” geography using non-traditional tools </li></ul></ul><ul><ul><li>Neogeographers </li></ul></ul><ul><ul><ul><li>Want to communicate/share their interests in geography and are willing to do something about it </li></ul></ul></ul>
    10. 11. NeoGeos <ul><li>What Roles do Neogeographers play in the process of communicating spatial data? </li></ul><ul><ul><li>Data collectors – database creators </li></ul></ul><ul><ul><li>Data analyzers </li></ul></ul><ul><ul><li>Data Presenters </li></ul></ul><ul><li>While all three roles impact or are influenced by “data quality”, today I will focus on neogeographers and data collection /database creation </li></ul>
    11. 12. Spatial Data Quality and Neogeography <ul><li>In order to help you understand my persuasion on data quality and neogeography, I would like to explore User Generated Content </li></ul><ul><ul><li>UGC is one of the primary means that neogeographers use to express their interest in Geography </li></ul></ul><ul><ul><ul><li>On this journey we will loop outside of geography and then fall back in through mapping and other uses of spatial data. </li></ul></ul></ul>
    12. 13. User Generated Content? <ul><li>Content that is produced by users of web sites and digital media </li></ul><ul><ul><li>Contrasted with traditional media producers such as broadcasters, production companies, publishing companies and map database companies </li></ul></ul>
    13. 14. So What’s Important About UGC? <ul><li>Equality of opportunity to publish </li></ul><ul><li>Coupled with one of the most significant demographic trends in the last century: </li></ul><ul><ul><li>“It’s about me” (e.g. use of YouTube, MySpace, Facebook) </li></ul></ul><ul><ul><ul><li>“Especially in respect to the streets, roads and trails I travel, as well as the POIs I frequent and the spatial topics of interest to me” </li></ul></ul></ul>
    14. 15. Social Networking
    15. 16. How Did This Happen? <ul><li>Technology that allows you to be “connected”, as well as to communicate and collaborate on your own terms </li></ul><ul><ul><li>Internet </li></ul></ul><ul><ul><li>Cellular telephony </li></ul></ul><ul><li>Development of comprehensive spatial databases </li></ul><ul><ul><li>Pushing geospatial into the mainstream: Neogeography </li></ul></ul>
    16. 17. How Did This Happen? <ul><li>Networks provide for </li></ul><ul><ul><li>Collective intelligence – the hive mentality or perhaps the Borg </li></ul></ul><ul><ul><li>Aggregated knowledge from decentralized sources (Wikipedia – Wikinomics) </li></ul></ul><ul><ul><li>Low cost collaboration </li></ul></ul>
    17. 18. UGC Potential Benefits <ul><li>Linus’s law </li></ul><ul><ul><li>With enough eyes, all bugs (spatial errors) become trivial </li></ul></ul><ul><li>Contributors exhibit </li></ul><ul><ul><li>Self selection </li></ul></ul><ul><ul><li>Focus </li></ul></ul><ul><ul><li>Self benefit </li></ul></ul><ul><li>Numerousness </li></ul><ul><ul><li>There should be more interested spatial data contributors than professional map editors </li></ul></ul><ul><li>Spatial distribution </li></ul><ul><ul><li>The distribution of UGCers is more ubiquitous than that of professional map editors. </li></ul></ul>
    18. 19. Criticisms Of UGC <ul><li>Some error situations are too complex to be understood in real time </li></ul><ul><li>Usability may be low </li></ul><ul><li>May require extensive error checking </li></ul><ul><li>User priorities may lead to unreliability </li></ul><ul><li>Prejudice in responses </li></ul>
    19. 20. Lake What Road?
    20. 21. Not enough Contributors -Data Points?
    21. 22. User Priorities - Oooops
    22. 23. Prejudice in Response?
    23. 24. Prejudice in Response
    24. 25. UGC And Spatial Databases
    25. 26. Spatial Database Creation
    26. 27. What’s Being Optimized In The Previous Process? <ul><li>spatial data quality </li></ul><ul><ul><li>Accuracy of position </li></ul></ul><ul><ul><ul><li>resolution </li></ul></ul></ul><ul><ul><li>Accuracy of Attribution </li></ul></ul><ul><ul><ul><li>Logical Consistency </li></ul></ul></ul><ul><ul><li>Completeness </li></ul></ul><ul><ul><ul><li>Including spatial coverage </li></ul></ul></ul><ul><ul><li>Temporal relevance </li></ul></ul><ul><ul><li>Metadata </li></ul></ul>
    27. 28. How Optimized? <ul><li>Data Quality is an integral part of the process </li></ul><ul><ul><li>Initially </li></ul></ul><ul><ul><ul><li>Data collected according to specifications </li></ul></ul></ul><ul><ul><ul><ul><li>Bad data re-collected or placed in the update queue </li></ul></ul></ul></ul><ul><ul><li>Ongoing </li></ul></ul><ul><ul><ul><li>Every year significant spatial changes are accommodated. </li></ul></ul></ul><ul><ul><ul><li>Areas of high change are identified and updated. </li></ul></ul></ul><ul><ul><ul><li>Other changes are found by systematically working research teams through the entire coverage over time </li></ul></ul></ul><ul><ul><li>The overall assignment is designed to maximize the time value of money, while increasing the integrity of the database. </li></ul></ul>
    28. 29. Harmonization <ul><li>It is this attempt to actively harmonize all data that distinguishes database building efforts. </li></ul><ul><li>Important Issues </li></ul><ul><ul><ul><li>Who directs crowdsourced data from an editorial perspective? </li></ul></ul></ul><ul><ul><ul><li>Who sets standards for crowdsourced data? </li></ul></ul></ul><ul><ul><ul><li>Who quality-controls crowdsourced data? </li></ul></ul></ul><ul><ul><ul><li>What external guidance exists in crowdsourced systems? </li></ul></ul></ul>
    29. 30. Three Categories of Spatial Data <ul><li>Controlled data </li></ul><ul><ul><ul><li>OS, Navteq, TeleAtlas, INFOusa </li></ul></ul></ul><ul><li>Hybrid (a mix of controlled and uncontrolled data) </li></ul><ul><ul><ul><li>Google, Yahoo, MSN, TomTom </li></ul></ul></ul><ul><li>Crowdsourced (uncontrolled) </li></ul><ul><ul><ul><li>OSM, Flickr, etc </li></ul></ul></ul>
    30. 31. Issue <ul><li>It is possible to manage controlled data quality to meet specific requirements </li></ul><ul><li>It is possible to manage hybrid data quality to meet specific requirements </li></ul><ul><li>But can you manage crowdsourced data quality to meet specific requirements on a reliable basis? </li></ul><ul><li>Let’s look at database compilation for some insights </li></ul>
    31. 32. Compilation <ul><li>Commercial </li></ul><ul><ul><li>Training in compilation </li></ul></ul><ul><ul><li>Specialization </li></ul></ul><ul><ul><li>Staff size limited </li></ul></ul><ul><ul><li>Research limited </li></ul></ul><ul><ul><li>Sweat of the brow </li></ul></ul><ul><ul><ul><li>But salaried sweat of the brow </li></ul></ul></ul><ul><li>Wiki </li></ul><ul><ul><li>Self Selection </li></ul></ul><ul><ul><li>Local experience </li></ul></ul><ul><ul><li>Staff size potentially unlimited </li></ul></ul><ul><ul><li>Research hours potentially unlimited </li></ul></ul><ul><ul><li>Avocation </li></ul></ul>
    32. 33. Compare and Contrast <ul><li>Commercial </li></ul><ul><ul><li>What are my coverage goals? </li></ul></ul><ul><ul><li>What are my accuracy goals? </li></ul></ul><ul><ul><li>How Much can I spend on updating? </li></ul></ul><ul><ul><li>What size of capable staff can I afford? </li></ul></ul><ul><ul><ul><li>How well can I pay them? </li></ul></ul></ul><ul><ul><ul><li>How can I otherwise incent them to create the best database possible? </li></ul></ul></ul><ul><li>WIKI </li></ul><ul><ul><li>How many people will contribute? </li></ul></ul><ul><ul><ul><li>How many are capable? </li></ul></ul></ul><ul><ul><li>Where are they located? </li></ul></ul><ul><ul><ul><li>Does this match areas of weak coverage? </li></ul></ul></ul><ul><ul><li>How long will it take to get good results over large coverages? </li></ul></ul><ul><ul><li>How to motivate these collaborators over long periods? </li></ul></ul>
    33. 34. What Are The Potential Weaknesses of WIKI? <ul><li>Common issues </li></ul><ul><ul><li>Not enough data gatherers to validate the data </li></ul></ul><ul><ul><ul><li>Or a method to redeploy them </li></ul></ul></ul><ul><ul><li>Not enough coverage to meet the need (the distribution of the UGCers) </li></ul></ul><ul><ul><ul><li>Or a method to redeploy them </li></ul></ul></ul><ul><ul><li>Lack of Standards </li></ul></ul><ul><ul><li>Lack of Quality Control </li></ul></ul><ul><li>But all of these limitations can be accommodated </li></ul>
    34. 35. Getting Around Some UGC Issues
    35. 36. Are Other Types of Spatial Databases Superior? <ul><li>Even with the benefits of Moolah ($), major navigation databases are </li></ul><ul><ul><li>Out of date </li></ul></ul><ul><ul><li>Inaccurate </li></ul></ul><ul><ul><li>Non-comprehensive </li></ul></ul><ul><ul><li>Variable quality </li></ul></ul><ul><ul><li>Too expensive to maintain </li></ul></ul><ul><ul><ul><li>Navteq database extension and update costs in 2007 were over $300,000,000 </li></ul></ul></ul>
    36. 37. www.refnum.com/osm/gmaps/ Haywards Heath
    37. 38. And That’s Why UGC and Neogeographers <ul><li>Will become an integral part of building spatial databases </li></ul><ul><li>Hybrid data collection systems using UGC and controlled data are where geospatial is going </li></ul><ul><ul><li>Let’s look </li></ul></ul>
    38. 39. Old Information Sharing
    39. 40. New Information Sharing
    40. 41. What’s The New Process
    41. 42. Social Networking Tools Of Interest in Compilation
    42. 43. Spatial Data Collection <ul><ul><li>Some UGC will be active </li></ul></ul><ul><ul><ul><li>User connects to an app and enters relevant spatial data for updating or extending a spatial database </li></ul></ul></ul><ul><ul><li>Some UGC will be passive </li></ul></ul><ul><ul><ul><li>Device tracks and reports (anonymously) user paths, builds database by merging path information over time </li></ul></ul></ul><ul><ul><ul><ul><li>Passive is particularly useful in building navigation databases </li></ul></ul></ul></ul>
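The passive approach described on this slide, merging anonymous device paths over time, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any actual navigation vendor: the grid cell size, the `min_traces` threshold, and the function names are hypothetical. A cell is treated as a candidate piece of road only once several independent traces have crossed it.

```python
from collections import defaultdict

# Hypothetical grid resolution in degrees; chosen only for illustration.
CELL_SIZE_DEG = 0.001


def cell_of(lat, lon, cell_size=CELL_SIZE_DEG):
    """Snap a coordinate to an integer grid cell index."""
    return (int(lat // cell_size), int(lon // cell_size))


def merge_traces(traces, min_traces=2):
    """Return the grid cells crossed by at least `min_traces` distinct traces.

    `traces` maps an anonymous trace id to a list of (lat, lon) points.
    """
    seen_by = defaultdict(set)
    for trace_id, points in traces.items():
        for lat, lon in points:
            seen_by[cell_of(lat, lon)].add(trace_id)
    return {cell for cell, ids in seen_by.items() if len(ids) >= min_traces}


# Made-up sample data: traces "a" and "b" follow the same path, "c" is a lone outlier.
traces = {
    "a": [(51.0005, -0.1005), (51.0015, -0.1005)],
    "b": [(51.0004, -0.1006), (51.0016, -0.1004)],
    "c": [(51.0095, -0.1505)],
}
confirmed = merge_traces(traces)
```

Requiring agreement between independent traces is one simple way to filter GPS noise and one-off detours; the cost is that lightly travelled roads take longer to appear, which connects to the contributor-coverage concerns raised earlier in the deck.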
    43. 44. Relative Cost
    44. 45. Relative Accuracy
    45. 46. Summing Up <ul><li>Data Collection Systems </li></ul><ul><ul><li>Closed – commercial compilation efforts, no UGC </li></ul></ul><ul><ul><li>Open – WIKI approaches, no proprietary data </li></ul></ul><ul><ul><li>Hybrid – where geospatial is going </li></ul></ul><ul><ul><ul><li>Improves spatial data accuracy by combining the best of both approaches. </li></ul></ul></ul>
    46. 47. Raises These Questions <ul><li>Will the winners be </li></ul><ul><ul><li>Established commercial companies that capitalize on UGC to augment their data? </li></ul></ul><ul><ul><li>New competitors that commercialize UGC and augment these data to compete with established commercial systems? </li></ul></ul>
    47. 48. PND Data Flow – A Winner
    48. 49. UGC Open Street Data Flow – No Medal
    49. 50. Commercializing UGC
    50. 51. Relative Benefits Of Types Of UGC By Device
    51. 52. Why We Need UGC and Neogeographers
    52. 53. Thanks
