Non-Relational Databases at ACCU2011

Slides from my talk at ACCU2011 in Oxford on 16th April 2011. A whirlwind tour of the non-relational database families, with a little more detail on Redis, MongoDB, Neo4j and HBase.

  • 13 rules, numbered 0 to 12. No popular DBMS is actually ‘relational’ by the 12 rules - they all break some of them. Leading commercial - Oracle, MS, IBM (DB2). Leading open-source - MySQL, PostgreSQL, SQLite.
  • If one part of a transaction fails, it all fails and the DB is left unchanged. Failures: HW, system, DB (disk etc.), application (violating constraints on data).
  • The DB will enforce the consistency rules and relationships/constraints that have been specified in the schema - everything else is the responsibility of the application.
  • Dirty reads - allow other transactions to read, but not modify, uncommitted data, to improve performance.
  • DB creates a new version of the data for a TX. Other TXes read the old version until the TX has completed. MVCC is also used by some non-relational databases.
  • Usually uses a transaction log that can be replayed to rebuild data in the event of failure.
  • What most of these companies have in common is scale. How would an RDBMS handle the size of data they deal with? Most of the big companies have built their own solutions. Most of them also use RDBMSes - Facebook is a huge MySQL user.
  • Scaling - RDBMSes don’t scale linearly - big box == $$$$. E.g. graph relationships don’t map to tables & rows easily. Semi-structured/unstructured data, lots of columns, lots of nulls.
  • Caching - e.g. memcacheDB, store common queries in memory. Denormalise - add redundant data and grouped data to reduce table joins, reduce load on physical hardware and improve locality of reference. So... you choose a fancy modern distributed NoSQL DB.
  • Not really...
  • C - all nodes see the same data at the same time. A - survivors continue to operate when nodes fail. P - system continues to operate despite message loss between nodes. Many systems relax consistency.
  • Also by Eric Brewer. A BASE system relaxes the C in CAP. BA - might lose access to some data if nodes fail. SS - system state might change over time without input (eventual consistency, propagation).
  • Different ways to consider whether a write has succeeded and whether the new value is returned.
  • Consistent Smashing - video from Basho/Riak
  • Lots of overlap between families - esp. column & key-value/DHT
  • Schema-less way of looking at data as documents rather than fields - all related data in the document. Maps very well to a lot of applications.
  • huMONGOus. 10gen.
  • Can be ACID if using replication for durability
  • Object mapper - not ORM
  • FlockDB - Twitter, social graph - simpler than Neo4j. Neo4j - dual open-source/commercial license. Hama - Apache project.
  • ACID transactions, persistence, concurrency, scalable
  • Tokyo Tyrant - network access protocol for Tokyo Cabinet DB. Voldemort - LinkedIn.
  • Can be ACID if the AOF is fsynced on every command
  • Replication is non-blocking on the master; writes will work even if a slave is blocked. Replication for scaling (read-only slaves) or for redundancy. AOF log - everything that changes the dataset. If the server crashes, Redis replays the AOF. BGREWRITEAOF optimises the AOF to the minimum steps needed to rebuild the dataset in memory. Configurable fsync options - every command, every second, never.
  • Oracle Berkeley DB, Berkeley DB Java, Berkeley DB XML. Memcache + Berkeley DB = MemcacheDB, a bit like Redis, for KV.
  • OSDI 2006 (MapReduce was 2004)
  • Bigtable - column families, distributed, scale
  • Consider a whiteboard overview of Hadoop here. Real-time (low-latency), as opposed to Hadoop & MapReduce batch jobs. Not ACID - effect of distributed writes on consistency and isolation of views. Relaxes the A of CAP - consistent & partition tolerant.
  • Partitioned on row count/size. Region is the basic unit of availability.
  • Queries - no support for complex queries. Compute the query in the application (MapReduce, etc.). All necessary data is denormalised in the row - wide table with lots of columns. “Versioned get” returns an older version of a row.
  • Couchbase - combination of CouchDB, Membase, Memcached. Kyoto Cabinet - C++ implementation by the Tokyo Cabinet author.
  • Impedance Mismatch. CAP Theorem, Eventual Consistency. Redis, MongoDB, Neo4j, HBase.

    1. 1. SELECT * FROM databases WHERE query_language <> ‘SQL’; Gavin Heavyside - ACCU Conference - 16 April 2011
    2. 2. SELECT * FROM databases WHERE query_language <> ‘SQL’ LIMIT 4;
    3. 3. Me• Director of Engineering at MyDrive• Hands-on coding in Ruby, C++ & others• Big data, SW architecture, robustness, tdd, devops, data analysis• Background of SW for telecoms, mobile, embedded• @gavinheavyside
    4. 4. MyDrive Solutions• Driver behaviour analysis and scoring for telematics-based insurance• Large-scale geospatial processing of GPS and map data• Relational DBs - PostgreSQL, MySQL• Non-relational DBs - Redis, HBase• Big Data tools - Hadoop• Built on Linux and open-source stack
    5. 5. RDBMS
    6. 6. What is an RDBMS• Codd’s relational model, 1970; “Codd’s 12 Rules”, 1985• Relations • e.g. tables, rows, columns• Relational Operators • Manipulate data in tabular form
    7. 7. ACID• Atomicity• Consistency• Isolation• Durability
    8. 8. Atomicity• All or nothing• Maintain atomicity across failures
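A minimal sketch of the "all or nothing" behaviour on the Atomicity slide, using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the account names are hypothetical and only serve the illustration; they are not from the talk.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` inside a single transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # every statement in it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

The credit is deliberately applied before the debit, so an overdrawn transfer fails halfway through; the rollback leaves both balances as they were.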
    9. 9. Consistency• DB moves from one consistent state to another• Only valid data is written to DB• It can only enforce rules it knows about
    10. 10. Isolation• Transactions can’t see data from other incomplete transactions• Blocking & Deadlocks • Dirty reads • MVCC
    11. 11. Locking• Row locking• Whole table locking• TX might require lots of locks• Blocking
    12. 12. MVCC• Multi-Version Concurrency Control• Maintain several versions of objects• Read & write timestamps on transactions• Reads never blocked
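The MVCC idea can be sketched as a toy in-memory store that keeps several timestamped versions of each value, so readers work from a snapshot and are never blocked by writers. `MVCCStore` and its methods are hypothetical names for this illustration; the sketch ignores write conflicts and garbage collection of old versions, which real databases must handle.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: each commit adds new versions, never overwrites."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = itertools.count(1)
        self._last_commit = 0

    def begin(self):
        """Return a snapshot timestamp covering all commits made so far."""
        return self._last_commit

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before the snapshot."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def commit(self, writes):
        """Apply a dict of writes as one new version and return its timestamp."""
        ts = next(self._clock)
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts
```

A reader that took its snapshot before a later commit keeps seeing the old value, while a reader that begins afterwards sees the new one.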
    13. 13. Durability• Data from successful tx is never lost
    14. 14. What’s wrong with relational DBs?
    15. 15. http://www.flickr.com/photos/exfordy/4734358134/
    16. 16. All the cool kids use non-relational DBs...• Facebook• LinkedIn• Twitter• Google
    17. 17. ...and relational DBs
    18. 18. What’s wrong with relational DBs?• Nothing• ‘Impedance Mismatch’• Scaling
    19. 19. Scaling an RDBMS• Launch successful service• Read saturation - add caching• Write saturation - add hardware (£££)• Queries slow - denormalise• Reads still too slow - prematerialise common queries, stop joining• Writes too slow - drop secondary indexes and triggers
    20. 20. Denormalising• Normalise logical data design • Joins • Materialised views can optimise queries• Denormalise logical data design • Eliminate joins • Application must ensure data consistency
    21. 21. Scaling a distributed DB• Just add more commodity servers...• ...we wish
    22. 22. CAP Theorem• Eric Brewer, 2000• Distributed System can’t simultaneously be • Consistent • Available • Partition-tolerant
    23. 23. BASE• Basically Available• Soft state• Eventually consistent• Relaxation of the C in CAP
    24. 24. Eventual Consistency• All nodes eventually see the same data• Different strategies • One • Quorum • All
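The One / Quorum / All strategies can be illustrated with a toy replicated store, where N is the number of replicas, W the acknowledgements required for a write and R the replicas consulted for a read. `QuorumStore`, the explicit version numbers and the `reachable` lists are assumptions made for this sketch and do not correspond to any particular database's API.

```python
class QuorumStore:
    """Toy replicated store with N replicas, write level W and read level R."""

    def __init__(self, n, w, r):
        self.replicas = [{} for _ in range(n)]  # each maps key -> (version, value)
        self.w = w
        self.r = r

    def write(self, key, value, version, reachable):
        """Write to every reachable replica; succeed only with at least W acks."""
        for i in reachable:
            self.replicas[i][key] = (version, value)
        return len(reachable) >= self.w

    def read(self, key, reachable):
        """Ask R reachable replicas and return the value with the newest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas reachable for this read level")
        replies = [self.replicas[i].get(key, (0, None)) for i in reachable[: self.r]]
        return max(replies, key=lambda reply: reply[0])[1]
```

With N=3, choosing W=2 and R=2 gives R + W > N, so every read overlaps at least one replica that saw the latest successful write. With R=1 ("One") a read can land on a replica that missed the write and return a stale value until the update propagates.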
    25. 25. Horizontal Scaling• Partitioning• Sharding• Dynamo-style
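Dynamo-style partitioning typically relies on consistent hashing, which can be sketched as a ring of hashed points where each key belongs to the next node point clockwise. `HashRing`, the `vnodes` count and the node names are hypothetical; this is a simplified sketch without replication or weighting.

```python
import bisect
import hashlib


def _hash(text):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes to even out the key distribution."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(point, n) for point, n in self._ring if n != node]

    def node_for(self, key):
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key),))
        if idx == len(self._ring):
            idx = 0  # wrap around to the start of the ring
        return self._ring[idx][1]
```

The point of the design is that adding a node only moves keys onto the new node, and removing it hands them back; the rest of the key space stays where it was, unlike `hash(key) % node_count`.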
    26. 26. http://vimeo.com/13667174
    27. 27. Non-relational Database Families• Document-oriented• Graph• Column-oriented• Key-value & DHT• Others
    28. 28. Document Databases
    29. 29. Document Databases• IBM Lotus• CouchDB• MongoDB• Riak
    30. 30. http://mongodb.org
    31. 31. MongoDB• JSON-style documents• Indexes on any field• Replication, auto-sharding• Map/Reduce
    32. 32. MongoDB Demo
    33. 33. Other Features• Document linking & embedding• GridFS - store large files• Geospatial indexes and searches
    34. 34. OM
    35. 35. Graph DBs http://www.flickr.com/photos/thefangmonster/2301364418/
    36. 36. Graph Databases• Nodes, relationships & properties• Query by traversing graph• Natural fit for recommendations, shortest paths, social graph
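"Query by traversing graph" can be illustrated with a plain breadth-first search over an adjacency dictionary, which finds a shortest path between two nodes. This is generic Python, not the Neo4j or FlockDB API, and the `follows` graph with its names is invented for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the back-pointers to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical "follows" relationships in a small social graph.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}
```

In a relational schema the same question would need one self-join per hop, which is the kind of mismatch that makes graph stores a natural fit for social graphs and recommendations.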
    37. 37. Graph DBs• FlockDB• Neo4j• Apache Hama• Google Pregel
    38. 38. Neo4j• Embedded• Server• REST• Components - indexing, management, rdf, geospatial
    39. 39. Key-Value & DHT
    40. 40. Key-Value & DHT• Amazon Dynamo• Project Voldemort• Redis• Tokyo Cabinet• Amazon SimpleDB
    41. 41. http://redis.io
    42. 42. redis• By Salvatore Sanfilippo (@antirez)• Sponsored by VMware• data-structure server• strings, hashes, lists• sets, sorted sets• All operations in memory, backed by disk
    43. 43. Interactive Documentation
    44. 44. Redis Demo
    45. 45. Other features• Replication (master/slaves)• Persistence • Snapshotting • Append-only log file
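The append-only log idea can be sketched with a toy key-value store that writes every change to a log file and replays it on start-up. This is not Redis's file format or API; `TinyAOFStore`, the JSON-lines log and the `fsync_every_write` flag are assumptions made for the illustration.

```python
import json
import os


class TinyAOFStore:
    """Toy in-memory key-value store backed by an append-only log file."""

    def __init__(self, path, fsync_every_write=True):
        self.path = path
        self.fsync_every_write = fsync_every_write
        self.data = {}
        self._replay()  # rebuild the in-memory dataset from the log
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, command):
        op, key = command[0], command[1]
        if op == "SET":
            self.data[key] = command[2]
        elif op == "DEL":
            self.data.pop(key, None)

    def _append(self, command):
        # Log the change first, then apply it in memory.
        self._log.write(json.dumps(command) + "\n")
        self._log.flush()
        if self.fsync_every_write:
            os.fsync(self._log.fileno())  # force the change onto disk
        self._apply(command)

    def set(self, key, value):
        self._append(["SET", key, value])

    def delete(self, key):
        self._append(["DEL", key])

    def close(self):
        self._log.close()
```

Syncing on every write is the safest and slowest choice; syncing less often trades a window of possible data loss for throughput. A log like this also grows without bound, which is why a rewrite step that keeps only the commands needed to rebuild the current dataset is useful.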
    46. 46. Object Hash Mappers• cf ORM• OHM
    47. 47. Other KV Stores• Berkeley DB• Memcache• Microsoft Dynomite
    48. 48. Column-Oriented DBs http://www.flickr.com/photos/nationalmediamuseum/3588099765/
    49. 49. Column-Oriented Databases• Google Bigtable• Cassandra• Hypertable• HBase
    50. 50. HBase http://www.flickr.com/photos/negativz/14470756/
    51. 51. • Apache top-level project• Implementation of Google Bigtable• Distributed• High write throughput• ‘real-time’ read/write
    52. 52. HBase• Automatic partitioning• Scale linearly and automatically• Commodity HW• Fault tolerant• MapReduce
    53. 53. Data Model• Schema-less• Versioned cells• key/column family/cell qualifier/timestamp• Column Families
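The key / column family / qualifier / timestamp model can be sketched as nested dictionaries with versioned cells. `TinyTable` is a hypothetical in-memory model for illustration only, not the HBase client API; the row key, family and qualifier names are invented.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of a table with fixed column families and versioned cells."""

    def __init__(self, column_families):
        # Column families are declared up front; qualifiers inside them are free-form.
        self.column_families = set(column_families)
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, timestamp):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row][f"{family}:{qualifier}"][timestamp] = value

    def get(self, row, family, qualifier, max_timestamp=None):
        """Return the newest value, or the newest at or before max_timestamp."""
        versions = self._rows.get(row, {}).get(f"{family}:{qualifier}", {})
        candidates = [
            ts for ts in versions if max_timestamp is None or ts <= max_timestamp
        ]
        return versions[max(candidates)] if candidates else None
```

Passing `max_timestamp` models a "versioned get": older versions of a cell stay readable after a newer value has been written.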
    54. 54. http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
    55. 55. http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
    56. 56. Other DBs• Couchbase• Kyoto Cabinet• Many more I’ve omitted
    57. 57. Wrap Up• RDBMS vs non-relational• Distributed DBs• Non-relational families
    58. 58. The End• @gavinheavyside• gavin.heavyside@mydrivesolutions.com
