What database

Slides from my talk at the Hasgeek FifthElephant conference 2017 - a practical guide to database selection

  1. What database? - a practical guide to selection from NoSQL, SQL and Polyglot data stores
     Regunath B (twitter.com/RegunathB, github.com/regunathb)
     Engineering @ CureFit-HealthFace, ex-Flipkart Infra services, Built Aadhaar
  2. State of the Database Landscape - the options
  3. What to look for? Over-the-hood considerations
  4. Data Manipulation
     • Query Language
       • SQL [1]
       • KV operations: put(k,v), get(k), remove(k)
       • Graph query: g.V().has('name','hercules').out('father').out('father').values('name')
     • Bulk Processing
       • Loading
       • Data Export & Transfer
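The three KV operations named on this slide can be sketched as a tiny in-memory store. This is a minimal illustration and not the API of any particular product: the class name `InMemoryKVStore` and the return conventions (None for a missing key, a boolean from `remove`) are assumptions; only `put`, `get` and `remove` come from the slide.

```python
from typing import Dict, Generic, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class InMemoryKVStore(Generic[K, V]):
    """Hypothetical in-memory store exposing put/get/remove."""

    def __init__(self) -> None:
        self._data: Dict[K, V] = {}

    def put(self, key: K, value: V) -> None:
        # Insert the value, overwriting any previous value for the key.
        self._data[key] = value

    def get(self, key: K) -> Optional[V]:
        # Return the stored value, or None when the key is absent.
        return self._data.get(key)

    def remove(self, key: K) -> bool:
        # Delete the key and report whether it was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store: InMemoryKVStore[str, str] = InMemoryKVStore()
store.put("user:1", "hercules")
name = store.get("user:1")
```

Unlike SQL or a graph traversal, this interface has no way to query by anything other than the key, which is the trade-off the slide's comparison points at.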
  5. ACID properties (Property - Impact)
     • ACID 1.0
       • Atomicity - Transaction Support
       • Consistency - Data Staleness
       • Isolation - Ordering
       • Durability - Surviving Crashes
     • ACID 2.0 (For Scaling)
       • Associative - Transaction Support (Limited)
       • Commutative - Relaxed Ordering (High-Throughput) - CRDTs [2]
       • Idempotent - At-least-once delivery (Eventually-Consistent)
       • Distributed - High-Availability
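To make the "ACID 2.0" properties concrete, here is a minimal sketch of a grow-only counter (G-Counter), a textbook CRDT. It is an illustration chosen for this note, not code from the talk or from the Riak reference [2]; the class and method names are assumptions. Its merge takes an element-wise maximum, which is associative, commutative and idempotent, so replicas converge regardless of message order or duplicate delivery.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one non-negative slot per replica."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max: associative, commutative and idempotent.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
a.merge(b)  # duplicate (at-least-once) delivery changes nothing
b.merge(a)
print(a.value(), b.value())  # prints: 5 5
```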
  6. Wire-protocol, Standard interfaces
     • Wire Protocol
       • Custom protocols over TCP/IP, HTTP, gRPC
       • Support for popular database protocols
         • Postgres - e.g. CockroachDB
         • Memcached - e.g. Couchbase
     • Standard Interfaces
       • JDBC - e.g. Apache Hive, Phoenix for HBase, Vitess
  7. Schema(less) support - Schema is not Evil
     • Why Schema-less
       • Sparse-Metric and Entity-Attribute-Value storage needs
       • Frequent changes
     • Why Schema
       • Understanding structure of data
       • Referential data integrity, quality of data controlled by a data dictionary
     • In-between
       • Schema-less but requires column indexes (e.g. ColumnFamily model of KV stores)
  8. CAP theorem critique
     • “one example of a fundamental trade-off between safety and liveness in fault-prone systems” [3]
     • Too simplistic [4]
     • Choosing CA is mostly impractical (single-node database); the critique therefore applies to CP or AP
     • CAP-Availability and CAP-Consistency are a spectrum, not binary
       • e.g. AP-Reads, AP-Writes, Strong Consistency vs. Eventual Consistency
     • Define application tradeoffs, validate impact on NFRs - Latency, Throughput
     • Good starting point for considering Polyglot persistence
  9. Polyglot Persistence - Tiered Data stores
     Source: Aadhaar technology white-paper
  10. Polyglot Persistence - CQRS
      Source: Microsoft MSDN
      Source: Flipkart catalog Write & Read [5]
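The CQRS slide is mostly diagrams, so a rough sketch of the pattern may help: commands go to a write model, which emits events, and a separate read model keeps a denormalised view for queries. Everything below (the `PriceChanged` event, the class names, the product example) is hypothetical and is not taken from the MSDN or Flipkart sources; in a polyglot setup the two models would typically live in different data stores.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PriceChanged:
    """Hypothetical event emitted by the write side."""
    product_id: str
    price: float


class WriteModel:
    """Command side: validates changes and records them as events."""

    def __init__(self) -> None:
        self.events: List[PriceChanged] = []

    def change_price(self, product_id: str, price: float) -> PriceChanged:
        if price < 0:
            raise ValueError("price must be non-negative")
        event = PriceChanged(product_id, price)
        self.events.append(event)
        return event


class ReadModel:
    """Query side: a denormalised view kept up to date from events."""

    def __init__(self) -> None:
        self.price_by_product: Dict[str, float] = {}

    def apply(self, event: PriceChanged) -> None:
        self.price_by_product[event.product_id] = event.price

    def price_of(self, product_id: str) -> float:
        return self.price_by_product[product_id]


writes, reads = WriteModel(), ReadModel()
reads.apply(writes.change_price("sku-1", 499.0))
reads.apply(writes.change_price("sku-1", 449.0))
```

Here the event is applied synchronously for brevity; when it is delivered asynchronously, the read side is eventually consistent with the write side.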
  11. Polyglot Persistence - pluggable storage, secondary indices
      • Healthcare Graph data (Conditions, Symptoms) on Apache Titan
      • Mostly read-only queries - point lookups, one-hop traversals
      • AP-Read data (storage engine: Cassandra)
      • Also queried by properties of Vertex/Edge (secondary indices in ES)
      Source: CureFit Symptoms & Conditions datastore
  12. Others
      • Performance benchmarks - Latency, Throughput, Concurrency - e.g. Graph DBs benchmark [6]
      • Operations & Maintenance - e.g. MySQL as backend data store for Facebook TAO [7], LinkedIn Espresso [8]
      • Support - Paid (single vendor vs. multiple), Community (size, composition)
      • Hosted service - on public clouds as a managed service
  13. What to look for? Under-the-hood considerations
  14. Database Type
      • Relational (row-oriented)
        • All field values of a row stored together
        • Common storage format: B-Tree
        • Better suited for OLTP
      • Columnar
        • All values of a column stored together
        • More efficient data compression
        • OLAP queries perform better
      Source: https://gerardnico.com/wiki/relation/structure/column_store
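A small sketch can show why storing a column's values together compresses well. The sample records and the use of run-length encoding are assumptions made for illustration (real columnar engines combine several encodings); the point is only that adjacent values in one column tend to repeat, while adjacent fields in one row do not.

```python
from itertools import groupby
from typing import Any, Dict, List, Tuple

# Hypothetical row-oriented records: all fields of a row sit together.
rows: List[Dict[str, Any]] = [
    {"id": 1, "city": "Bengaluru", "plan": "gold"},
    {"id": 2, "city": "Bengaluru", "plan": "gold"},
    {"id": 3, "city": "Bengaluru", "plan": "silver"},
    {"id": 4, "city": "Chennai", "plan": "silver"},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list of values per column."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values: List[Any]) -> List[Tuple[Any, int]]:
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows)
encoded_city = run_length_encode(columns["city"])
print(encoded_city)  # prints: [('Bengaluru', 3), ('Chennai', 1)]
```

An aggregate over one column, the typical OLAP query, also only has to read that column's list rather than every full row.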
  15. Database Type
      • Document
        • Sub-class of a KV store
        • Often hierarchical (DB -> Collection -> Document)
        • Often hard to optimise storage, due to the lack of a data dictionary (schema-free)
      • KV
        • Often RAM based
        • Durability through replication (sync) and persistence to disk
        • Preference for LSM over in-place updates when designed for SSD
      Source: https://blog.mlab.com/2014/01/how-big-is-your-mongodb/
      Source: http://www.aerospike.com/technologies/
  16. Data Organisation
      • B-Tree
        • Better suited for in-place updates
      • Log-Structured Merge (LSM)
        • Better suited for high insert volume
        • Better suited for SSD (for reducing write amplification)
        • Achieve high data locality of reference through good row-key design [9]
      Source: http://www.programering.com/a/MTMwAzMwATM.html
      Source: http://www.cyanny.com/2014/03/13/hbase-architecture-analysis-part1-logical-architecture/
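The LSM idea can be sketched in a few lines: writes land in an in-memory table and are flushed as sorted, immutable runs, so the disk only ever sees sequential writes rather than in-place updates. This is a toy under stated assumptions (the name `TinyLSM`, a flush threshold counted in entries, runs held in Python lists instead of files, no compaction and no deletes), not how any specific engine is implemented.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple


class TinyLSM:
    """Toy LSM store: a memtable plus sorted immutable runs."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit
        self.memtable: Dict[str, str] = {}
        self.runs: List[List[Tuple[str, str]]] = []  # newest run last

    def put(self, key: str, value: str) -> None:
        # Writes only touch memory until the memtable fills up.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self) -> None:
        # Emit the memtable as one sorted run (a sequential write on disk).
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        # Search newest run first so later writes shadow older ones.
        for run in reversed(self.runs):
            # (key,) sorts just before any (key, value) pair in the run.
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", "1")
db.put("b", "2")  # second entry triggers a flush
db.put("a", "3")  # newer value for "a" stays in the memtable
```

Because each run is sorted by key, keys that share a prefix end up adjacent on disk, which is why row-key design drives locality of reference.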
  17. Replication, Consensus
      • Replication
        • Sync vs. Async
        • No. of replicas, min. replicas, journalling, guaranteed writes with hinted handoff
        • Single-master read-write (CP) vs. replica reads (AP)
      • Consensus
        • Used in
          • Leader election
          • Committing transactions / log replication
        • Strength of protocol - Paxos, Raft, Zab etc.
      • Jepsen Tests (https://jepsen.io/) - test the 'Safety' of distributed databases
        • e.g. CockroachDB, MongoDB, VoltDB, Solr, Elasticsearch etc.
      Source: https://martin.kleppmann.com
      Source: https://raft.github.io/raft.pdf
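The difference between sync and async replication, and why replica reads can be stale, can be sketched as follows. The `Primary` and `Replica` classes and the explicit `ship_backlog` step are hypothetical simplifications: there is no network, failure handling or consensus here, only the ordering of "acknowledge" versus "replicate".

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class Replica:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        self.data[key] = value

    def read(self, key: str) -> Optional[str]:
        return self.data.get(key)


class Primary:
    def __init__(self, replicas: List[Replica], synchronous: bool) -> None:
        self.data: Dict[str, str] = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.backlog: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        if self.synchronous:
            # Sync: every replica applies the write before we return.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Async: return immediately; replicas catch up later.
            self.backlog.append((key, value))

    def ship_backlog(self) -> None:
        # Deliver queued writes to the replicas in order.
        while self.backlog:
            key, value = self.backlog.popleft()
            for replica in self.replicas:
                replica.apply(key, value)


sync_replica, async_replica = Replica(), Replica()
sync_primary = Primary([sync_replica], synchronous=True)
async_primary = Primary([async_replica], synchronous=False)
sync_primary.write("k", "v1")
async_primary.write("k", "v1")
stale_read = async_replica.read("k")  # None until the backlog is shipped
```

Sync replication pays write latency for fresh replica reads; async replication returns sooner but lets replica reads lag, which is the AP-read behaviour the slide refers to.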
  18. Operations
      • Data export & restore (RPO, RTO) - Disaster Recovery (DR)
        • Tools for full export vs. incremental snapshots
        • Tools for restoring from exports, logs
        • Piggy-back on XDC replication support to create continuous/ongoing backup & restore
        • Large scale data migration [10]
      • Mean Time to Recovery (MTTR) - Node failure / minor outages
        • e.g. promoting hot-standby to master
        • Tools to detect failure, validate data, promote new master/leader
  19. Cost
      • Disk-Memory ratio
        • Database architecture to support disk storage, size of on-disk data w.r.t. RAM
      • Compute required
        • No. of compute nodes required to keep data on-line
      • Power consumption
        • SSD-based databases are generally more energy efficient than HDD-based ones
      • Density of storage
        • Relevant when storing large data over extended periods of time
        • e.g. Aadhaar enrolment raw data, Facebook photos [11]
  20. DB-specific Optimisations to leverage RAM, reduce Disk I/O
      • Data block-cache / buffer-pool
        • Reduces disk I/O
        • Provides lower latency on repeat reads
        • Provides potentially lower latency for reads with high data locality of reference
      • Bloom Filters
        • Reduce disk I/O and row scanning in random key lookups
      Source: https://sematext.com
      Source: Cloudera
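A Bloom filter cuts disk I/O because it can answer "definitely not present" from memory, so a lookup for a missing key never has to touch the data file. The sketch below is a minimal illustration, not any database's implementation: the sizes, the number of hashes and the use of salted SHA-256 to derive bit positions are all assumptions.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True means "maybe".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


bloom = BloomFilter()
bloom.add("row-42")
present = bloom.might_contain("row-42")  # always True for an added key
```

A storage engine would consult `might_contain` before opening a data file and skip the read entirely on a False answer; a True answer still requires the real lookup because false positives are possible.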
  21. References
      • [1] - Google Spanner becoming a SQL System
      • [2] - CRDTs in Riak
      • [3] - Perspectives on the CAP Theorem
      • [4] - Martin Kleppmann CP or AP
      • [5] - Flipkart Catalog System, Datastore
      • [6] - Do We Need Specialised Graph Databases?
      • [7] - Facebook TAO social graph data store
      • [8] - LinkedIn Espresso
      • [9] - Facebook style notifications using HBase
      • [10] - Flipkart DC migration
      • [11] - Facebook cold storage system