Data Quality and Neogeography


    Data Quality and Neogeography - Presentation Transcript

    1.  
    2. Before We Start
      • I am not here to persuade you about the usefulness or limitations of Neogeography or User Generated Content
      • I am here to share my views on issues relating to the topic of spatial data quality and neogeography
      • Disclaimer - In general, my observations derive from my familiarity with mapping, navigation and local search
    3. My Background
      • PhD in Geography, specializing in Cartography
      • Attended AutoCarto 1 in 1974 (and gave the keynote in 2008)
      • Associate Professor of mapping and geography at SUNY Albany (1972–1985)
      • Associate at Spad Systems
      • Chief Cartographer, Chief Technologist and VP of BizDev for Rand McNally (1986-1999)
      • CTO and EVP of Engineering for go2 Systems (YP over cell phones)
      • Now run a consulting business focused on geospatial, especially local search, mapping and navigation applications
    4. Data Quality and Neogeography Dr. Mike Dobson President TeleMapics LLC [email_address]
    5. Spatial Data Quality?
      • Overall concern regarding the “fitness” of data for a particular use
        • Accuracy of position
          • resolution
        • Accuracy of Attribution
          • Logical Consistency
        • Completeness
          • Including spatial coverage
        • Temporal relevance
        • Metadata
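Two of the dimensions listed above, accuracy of position and completeness, can be expressed as simple numbers when a trusted reference dataset is available. The following is a minimal sketch, not a method taken from the presentation: it assumes both datasets are dictionaries mapping a shared feature id to a `(lat, lon)` pair, and the function names `positional_rmse_m` and `completeness` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def positional_rmse_m(candidate, reference):
    """Root-mean-square positional error, in metres, over features in both datasets."""
    shared = candidate.keys() & reference.keys()
    if not shared:
        raise ValueError("no shared features to compare")
    squared = [haversine_m(*candidate[k], *reference[k]) ** 2 for k in shared]
    return math.sqrt(sum(squared) / len(squared))


def completeness(candidate, reference):
    """Fraction of reference features that also appear in the candidate dataset."""
    if not reference:
        raise ValueError("reference dataset is empty")
    return len(candidate.keys() & reference.keys()) / len(reference)
```

Attribute accuracy, logical consistency and temporal relevance would need their own checks; this sketch covers only the two dimensions that reduce to a distance and a count.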
    6. Spatial Data’s Emerging Popularity
      • World of spatial data is exploding
          • Accessibility to spatial data increasing
          • Availability of spatial data increasing
          • Today’s online environment provides
            • Easy-to-use tools for collecting spatial data
            • Easy-to-use tools for analyzing spatial data
            • Easy-to-use tools for presenting spatial data
    7. Why Is This of Concern?
      • The quality of spatial data limits how successfully spatial concepts can be communicated
        • Could this explosive growth have an influence on the quality of spatial data?
    8. Why Data Quality Is Key
    9. No Integrity!
    10. Neogeography
      • Neogeography
        • “new” geography using non-traditional tools
        • Neogeographers
          • Want to communicate/share their interests in geography and are willing to do something about it
    11. NeoGeos
      • What Roles do Neogeographers play in the process of communicating spatial data?
        • Data collectors – database creators
        • Data analyzers
        • Data Presenters
      • While all three roles impact or are influenced by “data quality”, today I will focus on neogeographers and data collection/database creation
    12. Spatial Data Quality and Neogeography
      • To help you understand my position on data quality and neogeography, I would like to explore User Generated Content
        • UGC is one of the primary means that neogeographers use to express their interest in Geography
          • On this journey we will loop outside of geography and then fall back in through mapping and other uses of spatial data.
    13. User Generated Content?
      • Content that is produced by users of web sites and digital media
        • Contrasted with traditional media producers such as broadcasters, production companies, publishing companies and map database companies
    14. So What’s Important About UGC?
      • Equality of opportunity to publish
      • Coupled with one of the most significant demographic trends in the last century:
        • “It’s about me” (e.g. use of YouTube, MySpace, Facebook)
          • “Especially in respect to the streets, roads and trails I travel, as well as the POIs I frequent and the spatial topics of interest to me”
    15. Social Networking
    16. How Did This Happen?
      • Technology that allows you to be “connected”, as well as to communicate and collaborate on your own terms
        • Internet
        • Cellular telephony
      • Development of comprehensive spatial databases
        • Pushing geospatial into the mainstream: Neogeography
    17. How Did This Happen?
      • Networks provide for
        • Collective intelligence – the hive mentality or perhaps the Borg
        • Aggregated knowledge from decentralized sources (Wikipedia – Wikinomics)
        • Low cost collaboration
    18. UGC Potential Benefits
      • Linus’s law
        • With enough eyes, all bugs (spatial errors) become trivial
      • Contributors exhibit
        • Self selection
        • Focus
        • Self benefit
      • Numerousness
        • There should be more interested spatial data contributors than professional map editors
      • Spatial distribution
        • UGCers are more widely distributed than professional map editors.
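The "enough eyes" and "numerousness" points above suggest one simple way to use a crowd: treat a reported map error as credible only when several independent contributors report the same thing. The sketch below is an illustration under assumptions, not a procedure described in the presentation; the function name `confirmed_reports`, the `(contributor_id, issue_key)` input shape and the default threshold of three are all hypothetical.

```python
from collections import defaultdict


def confirmed_reports(reports, min_contributors=3):
    """Return issue keys reported by at least `min_contributors` distinct contributors.

    `reports` is an iterable of (contributor_id, issue_key) pairs.
    """
    contributors_by_issue = defaultdict(set)
    for contributor, issue in reports:
        # A set ignores repeat reports of the same issue from one contributor.
        contributors_by_issue[issue].add(contributor)
    return sorted(
        issue
        for issue, contributors in contributors_by_issue.items()
        if len(contributors) >= min_contributors
    )
```

Counting distinct contributors rather than raw reports keeps one enthusiastic user from confirming their own submission, which touches on the "user priorities" and "prejudice" concerns raised in the next slide.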
    19. Criticisms Of UGC
      • Some error situations are too complex to be understood in real time
      • Usability may be low
      • May require extensive error checking
      • User priorities may lead to unreliability
      • Prejudice in responses
    20. Lake What Road?
    21. Not Enough Contributors - Data Points?
    22. User Priorities - Oooops
    23. Prejudice in Response?
    24. Prejudice in Response
    25. UGC And Spatial Databases
    26. Spatial Database Creation
    27. What’s Being Optimized In The Previous Process?
      • spatial data quality
        • Accuracy of position
          • resolution
        • Accuracy of Attribution
          • Logical Consistency
        • Completeness
          • Including spatial coverage
        • Temporal relevance
        • Metadata
    28. How Optimized?
      • Data Quality is an integral part of the process
        • Initially
          • Data collected according to specifications
            • Bad data re-collected or placed in the update queue
        • Ongoing
          • Every year significant spatial changes are accommodated.
          • Areas of high change are identified and updated.
          • Other changes are found by systematically working research teams through the entire coverage over time
        • The overall assignment is designed to maximize the time value of money, while increasing the integrity of the database.
    29. Harmonization
      • It is this attempt to actively harmonize all data that distinguishes database building efforts.
      • Important Issues
          • Who directs crowdsourced data from an editorial perspective?
          • Who sets standards for crowdsourced data?
          • Who Quality Controls crowdsourced data?
          • What external guidance exists in crowdsourced systems?
    30. Three Categories of Spatial Data
      • Controlled data
          • OS, Navteq, TeleAtlas, INFOusa
      • Hybrid (a mix of controlled and uncontrolled data)
          • Google, Yahoo, MSN, TomTom
      • Crowdsourced (uncontrolled)
          • OSM, Flickr, etc
    31. Issue
      • It is possible to manage controlled data quality to meet specific requirements
      • It is possible to manage hybrid data quality to meet specific requirements
      • But can you manage crowdsourced data quality to meet specific requirements on a reliable basis?
      • Let’s look at database compilation for some insights
    32. Compilation
      • Commercial
        • Training in compilation
        • Specialization
        • Staff size limited
        • Research limited
        • Sweat of the brow
          • But salaried sweat of the brow
      • Wiki
        • Self Selection
        • Local experience
        • Staff size potentially unlimited
        • Research hours potentially unlimited
        • Avocation
    33. Compare and Contrast
      • Commercial
        • What are my coverage goals?
        • What are my accuracy goals?
        • How much can I spend on updating?
        • What size of capable staff can I afford?
          • How well can I pay them?
          • How can I otherwise incent them to create the best database possible?
      • WIKI
        • How many people will contribute?
          • How many are capable?
        • Where are they located?
          • Does this match areas of weak coverage?
        • How long will it take to get good results over large coverages?
        • How to motivate these collaborators over long periods?
    34. What Are The Potential Weaknesses of WIKI?
      • Common issues
        • Not enough data gatherers to validate the data
          • or a method to redeploy them
        • Not enough coverage to meet the need (the distribution of the UGCers)
          • Or a method to redeploy them
        • Lack of Standards
        • Lack of Quality Control
      • But all of these limitations can be accommodated
    35. Getting Around Some UGC Issues
    36. Are Other Types of Spatial Databases Superior?
      • Even with the benefits of Moolah ($), major navigation databases are
        • Out of date
        • Inaccurate
        • Non-comprehensive
        • Variable quality
        • Too expensive to maintain
          • Navteq database extension and update costs in 2007 were over $300,000,000
    37. www.refnum.com/osm/gmaps/ Haywards Heath
    38. And That’s Why UGC and Neogeographers
      • Will become an integral part of building spatial databases
      • Hybrid data collection systems using UGC and controlled data are where geospatial is going
        • Let’s look
    39. Old Information Sharing
    40. New Information Sharing
    41. What’s The New Process
    42. Social Networking Tools Of Interest in Compilation
    43. Spatial Data Collection
        • Some UGC will be active
          • User connects to an app and enters relevant spatial data for updating or extending a spatial database
        • Some UGC will be passive
          • Device tracks and reports (anonymously) user paths, builds database by merging path information over time
            • Passive is particularly useful in building navigation databases
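The passive approach above, merging anonymously reported paths over time, can be sketched as snapping each reported point to a grid cell and keeping the cells that several independent traces pass through. This is a minimal illustration under stated assumptions, not the pipeline of any vendor named in the presentation; `merge_traces`, the grid size `cell_deg` and the `min_devices` threshold are hypothetical.

```python
from collections import defaultdict


def merge_traces(traces, cell_deg=0.0001, min_devices=2):
    """Return the grid cells crossed by at least `min_devices` distinct traces.

    `traces` maps an anonymous trace id to a list of (lat, lon) points in degrees.
    Each cell is an integer pair (row, col) on a grid of `cell_deg` degrees.
    """
    traces_by_cell = defaultdict(set)
    for trace_id, points in traces.items():
        for lat, lon in points:
            cell = (round(lat / cell_deg), round(lon / cell_deg))
            traces_by_cell[cell].add(trace_id)
    # Cells seen by only one trace are dropped as possible noise or private drives.
    return {cell for cell, ids in traces_by_cell.items() if len(ids) >= min_devices}
```

A real system would also need map matching, direction of travel and speed to turn cells into a navigable road network; the sketch shows only the merging step.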
    44. Relative Cost
    45. Relative Accuracy
    46. Summing Up
      • Data Collection Systems
        • Closed – commercial compilation efforts, no UGC
        • Open – WIKI approaches, no proprietary data
        • Hybrid – where geospatial is going
          • Improves spatial data accuracy by combining the best of both approaches.
    47. Raises These Questions
      • Will the winners be
        • Established commercial companies that capitalize on UGC to augment their data?
        • New competitors that commercialize UGC and augment these data to compete with established commercial systems?
    48. PND Data Flow – A Winner
    49. UGC Open Street Data Flow – No Medal
    50. Commercializing UGC
    51. Relative Benefits Of Types Of UGC By Device
    52. Why We Need UGC and Neogeographers
    53. Thanks

    + mdobmdob, 2 years ago


    A review of the role played by User Generated Cont more
