Scaling GIS Data in Non-relational Data Stores


As the amount of GIS data we need to keep track of grows, along with the number of devices accessing it and the number of GIS writes, we’re finding that, much like real-time web applications, normal RDBMSs are not well suited to scaling. This talk covers why GIS data is hard to scale in a normal RDBMS, what nonrelational stores exist out there, and some basic examples of how to do spatial queries within a nonrelational store.

Published in: Technology

  1. Scaling GIS Data in Nonrelational Data Stores, featuring Mike Malone (Tuesday, March 30, 2010)
  2. Mike Malone (@mjmalone)
  4. SimpleGeo
     Scalable turnkey location infrastructure
     Allows you to easily add geo-aware features to an existing application
     The result: we need to store and query lots of data (the data set is already approaching 1 TB, and we haven’t launched)
  5. Scaling HTTP is easy
     No shared state - shared-nothing architecture
       • HTTP requests contain all of the information necessary to generate a response
       • HTTP responses contain all of the information necessary for clients to interpret them
       • In other words, requests are self-contained and different requests can be routed to different servers
     Uniform interface - allows middleware applications to proxy requests, creating a tiered architecture and making load balancing trivial
  6. So what’s the problem?
     Individual HTTP requests have no shared state, but the applications that communicate via HTTP can and do
     Application state has to live somewhere
       • Path of least resistance is usually a relational database
       • But RDBMSs aren’t always the best tool for the job
  7. Desirable Data Store Characteristics
     Massively distributed
     Horizontally scalable
     Fault tolerant
     Fast
     Always available
  8. Relational Databases
     Based on the “relational model” first proposed by E.F. Codd in 1969
     Tons of implementation experience and lots of robust open source and proprietary implementations
  9. RDBMS Strengths
     Theoretically pure
     Clean abstraction
     Declarative syntax
     Mostly standardized
     Easy to reason about data
  10. ACID
      Atomicity - if one part of a transaction fails, the entire transaction fails
      Consistency - all data constraints must be met for a transaction to be successful
      Isolation - other operations can’t see a transaction that has not yet completed
      Durability - once the client has been notified that a transaction succeeded, the transaction will not be lost
  11. RDBMS Weaknesses
      SQL is opaque, and query parsers don’t always do the right thing
        • Geospatial SQL is particularly bad
      The best ones are crazy expensive
      Really bad at scaling writes
      Strong consistency requirements make horizontal scaling difficult
  12. RDBMS Writes
      Relational databases almost always use B-Tree (or some other tree-based) indexes
      Writes are typically implemented by doing an in-place update on disk
        • Requires a random seek to a specific location on disk
        • May require additional seeks to read indexes if they outgrow the disk cache
      Disk seeks are bad.
  13. CAP Theorem
      There are three desirable characteristics of a shared data system that is deployed in a distributed environment like the web.
  14. CAP Theorem
      1. Consistency - every node in the system contains the same data (e.g., replicas are never out of date)
      2. Availability - every request to a non-failing node in the system returns a response
      3. Partition Tolerance - system properties (consistency and/or availability) hold even when the system is partitioned and messages between nodes are lost
  15. CAP Theorem
      Choose two.
  16. [Diagram: Client, Node A, Node B; arrows labeled “reads & writes”, “reads & writes”, “replicates”]
  17. [Diagram: Client, Node A, Node B; arrows labeled “writes”, “replicates”]
  18. [Diagram: Client, Node A, Node B; arrows labeled “responds”, “acknowledges”]
  19. [Diagram: Client, Node A, Node B; arrows labeled “responds”, “o noes!”]
  20. What now?
      1. Write fails: data store is unavailable
      2. Write succeeds on Node A: data is inconsistent
  21. RDBMS Consistency
      Relational databases prioritize consistency
      Large scale distributed systems need to be highly available
        • As we add servers, a network partition or node failure goes from a possibility to an inevitability
      We could write an abstraction layer on top of a relational data store that trades consistency for availability
      Or we could switch to a data store that prioritizes the characteristics we really want
  22. Nonrelational DBs
      Over the past couple of years, a number of specialized data stores have emerged
        • CouchDB
        • Redis
        • Cassandra
        • MongoDB
        • Dynamo
        • SimpleDB
        • BigTable
        • Memcached
        • Riak
        • MemcacheDB
  23. Also Known As NoSQL
      Not entirely appropriate, since SQL can be implemented on non-relational DBs
      But SQL is an opaque abstraction with lots of features that are difficult or impossible to efficiently distribute
  24. So what’s different?
      Most “non-relational” stores specifically emphasize partition tolerance and availability
      Typically provide a more relaxed guarantee of eventual consistency
  25. NoACID
  26. BASE
      Basically Available
      Soft State
      Eventually Consistent
  27. Eventual Consistency
      Write operations are attempted on n nodes that are “authoritative” for the provided key
      In the event of a network partition, data is written to another node in the cluster
      When the network heals and nodes become available again, inconsistent data is updated
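The three steps on the eventual consistency slide can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the `Node` class, `write`, and `replay_hints` are hypothetical names, not Cassandra's API, and the technique of parking a write on another node until the target returns is the one often called hinted handoff.

```python
class Node:
    """Hypothetical in-memory node; not a real Cassandra object."""

    def __init__(self, name):
        self.name = name
        self.up = True      # False simulates a node cut off by a partition
        self.data = {}      # key -> value stored on this node
        self.hints = []     # writes parked here on behalf of unreachable nodes


def write(key, value, replicas, fallback):
    """Attempt the write on every authoritative replica.

    If a replica is unreachable, park the write on the fallback node instead.
    """
    for node in replicas:
        if node.up:
            node.data[key] = value
        else:
            fallback.hints.append((node, key, value))


def replay_hints(fallback):
    """Once the network heals, deliver parked writes to their real owners."""
    still_pending = []
    for target, key, value in fallback.hints:
        if target.up:
            # A real system would compare timestamps before overwriting.
            target.data[key] = value
        else:
            still_pending.append((target, key, value))
    fallback.hints = still_pending
```

Between the failed write and the replay, the two replicas disagree. That window is the inconsistency the store accepts in exchange for staying available.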
  28. SimpleGeo Cassandra
      No single point of failure
      Efficient online cluster rebalancing allows for incremental scalability
      Emphasizes availability and partition tolerance
        • Eventually consistent
        • Tradeoff between consistency and latency exposed to the client
      Battle tested - large clusters at Facebook, Digg, and Twitter
  29. Cassandra Data Model
      Column - a tuple containing a name, value, and timestamp
      Column Family - a group of columns that are stored together on disk
      Row - identifier for a specific group of columns in a column family
      Super Column - a column that has columns
  30. Cassandra Data Model
      {
          '9xj5ss824mzyv.12345': {
              'Record': {
                  'lat': 40.0149856,
                  'lon': -105.2705456,
                  'city': 'Boulder',
                  'state': 'CO'
              }
          },
          'dr5regy3zcfgr.67890': {
              'Record': {
                  'lat': 40.7142691,
                  'lon': -74.0059729,
                  'city': 'New York',
                  'state': 'NY'
              }
          }
      }
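The nested structure on the data model slide maps naturally onto dictionaries. The sketch below is a minimal in-memory stand-in, not Cassandra client code: `Column` and `ToyKeyspace` are hypothetical names, and resolving conflicting writes by keeping the newest timestamp is an assumption about why each column carries one.

```python
import time
from collections import defaultdict
from typing import Any, NamedTuple


class Column(NamedTuple):
    """A column as described on the slide: name, value, and timestamp."""
    name: str
    value: Any
    timestamp: int


class ToyKeyspace:
    """Hypothetical model: row key -> column family -> column name -> Column."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def insert(self, row_key, family, name, value, timestamp=None):
        ts = timestamp if timestamp is not None else time.time_ns() // 1000
        current = self.rows[row_key][family].get(name)
        # Assumption: the write with the newest timestamp wins.
        if current is None or ts >= current.timestamp:
            self.rows[row_key][family][name] = Column(name, value, ts)

    def get_row(self, row_key, family):
        columns = self.rows.get(row_key, {}).get(family, {})
        return {name: col.value for name, col in columns.items()}
```

For example, inserting `'city': 'Boulder'` under row key `'9xj5ss824mzyv.12345'` and family `'Record'` makes `get_row` return `{'city': 'Boulder'}`, mirroring the first entry on the slide.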
  34. Writes are crazy fast
      Writes are appended to a commit log in the order they’re received - serial I/O
      New data is stored in an in-memory table
      Memory table is periodically synced to a file
      Files are occasionally merged
      Reads may end up checking multiple files (bloom filter helps) and merging results
        • That’s okay because reads are pretty easy to scale
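The write path on this slide can be illustrated with a toy log-structured store. Everything here is a simplified sketch: `ToyStore` and its method names are hypothetical, files are plain in-memory dicts, and the bloom filter is omitted, so a read simply checks each file from newest to oldest.

```python
class ToyStore:
    """Hypothetical log-structured store; files are modeled as dicts."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []        # append-only, in arrival order
        self.memtable = {}          # newest data, held in memory
        self.files = []             # flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # sequential append, no seek
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Sync the in-memory table to a new immutable, key-sorted file."""
        if self.memtable:
            self.files.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def compact(self):
        """Merge all files into one; newer values overwrite older ones."""
        merged = {}
        for table in self.files:
            merged.update(table)
        self.files = [dict(sorted(merged.items()))] if merged else []

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.files):      # newest file first
            if key in table:
                return table[key]
        return None
```

Note that no write ever modifies an existing file, which is what keeps the write path to serial I/O. The cost moves to reads, which may touch several files until a merge runs.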
  35. How can I query?
      Depends on the partitioner you use
        • Random partitioner: makes it really easy to keep a cluster balanced, but can only do lookups by row key
        • Order-preserving partitioner: stores data ordered by row key, so it can query for ranges of keys, but it’s a lot harder to keep balanced
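A rough way to see the tradeoff between the two partitioners: hashing a row key scatters neighboring keys across nodes, while keeping keys sorted makes a range query a contiguous slice. The sketch below is a simplification with hypothetical function names. In particular, taking the hash modulo the node count stands in for a real token ring.

```python
import bisect
import hashlib


def random_partition(row_key, node_count):
    """Pick a node from a hash of the key: balanced, but order is destroyed."""
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def range_scan(sorted_keys, start, end):
    """With keys kept in order, a range query is one contiguous slice.

    Returns every key k with start <= k < end.
    """
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]
```

With sorted keys `['9xj5ss', 'dr5reg', 'dr5rez', 'dr7abc']`, `range_scan(keys, 'dr5', 'dr6')` returns the two keys that start with `dr5`. Under `random_partition` those same two keys may land on different nodes, so only per-key lookups are practical.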
  36. BYOI
        • If you need an index on something other than the row key, you need to build an inverted index yourself
        • Row key: attribute you’re interested in plus row key being indexed
        • “dr5regy3zcfgr:com.simplegeo/1”
        • But what about indexing multiple attributes..?
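The index row key format on this slide (attribute value, a colon, then the key of the record being indexed) can be sketched as follows. `ToyIndex` is a hypothetical in-memory stand-in for an index column family stored in key order. It assumes the attribute value itself contains no colon.

```python
import bisect


def index_key(attribute_value, record_key):
    """Build an index row key such as 'dr5regy3zcfgr:com.simplegeo/1'."""
    return f"{attribute_value}:{record_key}"


class ToyIndex:
    """Hypothetical inverted index: a sorted list of index row keys."""

    def __init__(self):
        self.keys = []

    def add(self, attribute_value, record_key):
        key = index_key(attribute_value, record_key)
        pos = bisect.bisect_left(self.keys, key)
        if pos == len(self.keys) or self.keys[pos] != key:
            self.keys.insert(pos, key)      # keep the list sorted, no duplicates

    def lookup_prefix(self, prefix):
        """Return record keys whose indexed attribute starts with prefix."""
        results = []
        start = bisect.bisect_left(self.keys, prefix)
        for key in self.keys[start:]:
            if not key.startswith(prefix):
                break
            results.append(key.split(":", 1)[1])
        return results
```

Because the attribute value leads the key, all records sharing a value (or a value prefix) sit next to each other, so a lookup is a single ordered scan.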
  37. The Curse of Dimensionality
      Location data is multidimensional
      Traditional GIS software typically uses some variation of a Quadtree or R-Tree for indexes
      Like B-Trees, R-Trees need to be updated in-place and are expensive to manipulate when they outgrow memory
  38. Dimensionality Reduction
      If we think of the world as a two-dimensional Cartesian plane, we can think of latitude and longitude as coordinates for that plane
      Instead of using (x, y) coordinates, we can break the plane into a grid and number each box
        • Space-filling curve: a continuous line that intersects every point in a two-dimensional plane
  40. Geohash
      A convenient dimensionality reduction mechanism for (latitude, longitude) coordinates that uses a Z-Curve
      Simply interleave the bits of a (latitude, longitude) pair and base32 encode the result
      Interesting characteristics
        • Easy to calculate and to reverse
        • Represents a bounding box
        • Truncating bits from the end of a geohash results in a larger geohash bounding the original
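The interleave-and-encode step can be sketched in a few lines. The slide does not spell out the details, so two things here are assumptions taken from the commonly published geohash convention: the base32 alphabet (which omits a, i, l, and o) and the bit order, which starts with longitude. The function name `geohash_encode` is hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet, no a/i/l/o


def geohash_encode(lat, lon, length=12):
    """Interleave longitude and latitude bits, then base32 encode them."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0            # the 5-bit group currently being assembled
    bit_count = 0
    use_lon = True      # even-numbered bits come from longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid        # keep the upper half of the range
            else:
                bits <<= 1
                lon_hi = mid        # keep the lower half of the range
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # five bits make one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Each extra bit halves the bounding box, which is why a shorter geohash is always a prefix of a longer one for the same point, and why truncation yields a larger enclosing box.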
  41. Geohash Drawbacks
      Z-Curves are not necessarily the most efficient space-filling curve for range queries
        • Points on either end of the Z’s diagonal seem close together when they’re not
        • Points next to each other on the spherical earth may end up on opposite sides of our plane
      These inefficiencies mean we sometimes have to run multiple queries, or expand bounding box queries to cover very large expanses
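The Z-Curve drawback is easy to see on a small integer grid. The sketch below computes a Z-order (Morton) code by interleaving the bits of a cell's x and y coordinates. The function name `morton` and the choice to put x in the even bit positions are illustrative assumptions, not something the slides specify.

```python
def morton(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x fills even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y fills odd bit positions
    return code


# Consecutive codes can be far apart on the grid:
#   morton(3, 1) and morton(0, 2) differ by one, yet the cells are three columns apart.
# Adjacent cells can have distant codes:
#   morton(1, 0) and morton(2, 0) are neighbors on the grid, but their codes are not consecutive.
```

A range scan over codes therefore picks up cells outside the area of interest and can miss true neighbors, which is why a bounding box query may need to be split into several key ranges.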
  42. Geohash Alternatives
      Hilbert curves: improve on Z-Curves but have different drawbacks
      Non-algorithmic unique identifiers
        • Provide unique identifiers for geopolitical and colloquial bounding polygons
        • Yahoo! GeoPlanet’s WOEIDs are a good example
  43. Other stuff we use
  44. Memcache
      Useful for storing ephemeral or short-lived data and for caching
      Super crazy extra fast
      Robust support from pretty much every language in the world
  45. MemcacheDB
      Berkeley DB (BDB) backed memcache
      We use it for statistics
        • Can’t use Cassandra because it doesn’t support eventually consistent increment and decrement operations (yet)
      Giant con: it’s pretty much impossible to rebalance if you add a node
  46. Pushpin Service
      Custom storage solution
      R-Tree index for fast lookups
      Mostly fixed data sets so it’s ok that we can’t update data efficiently
  47. MySQL!
      Our website still uses MySQL for some stuff... though we’re moving away from it
  48. Thanks!
  49. Ask me questions! @mjmalone