CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Cloudant - StampedeCon 2013

At the StampedeCon 2013 Big Data conference in St. Louis, Adam Kocoloski, Co-Founder and CTO of Cloudant and an Apache CouchDB committer, presented "CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Cloudant". Cloudant operates database clusters comprising 100+ nodes based on BigCouch, the company’s fork of CouchDB. Key elements of CouchDB’s design have proven instrumental to success at this scale, including version histories, append-only storage, and multi-master replication. In this talk, Kocoloski discusses lessons learned from running production CouchDB clusters bigger than many well-publicized Hadoop deployments, and how Cloudant’s experience at scale is informing development work on the next release of Apache CouchDB.

Published in Technology

Transcript

  • 1. CouchDB at Its Core: Global Data Storage and Rich Incremental Indexing at Cloudant. Adam Kocoloski, StampedeCon 2013
  • 2. What is Cloudant?
      • Founded by “big data” scientists
      • Particle physicists @ MIT analyzing petabytes of collider data
      • Frustrated by inadequate tools, founders became experts in scaling CouchDB (“BigCouch”)
      • Started Cloudant in 2008 as a managed data layer
      • Premise: Apps should grow into their data layer, not out of it
      • Built: Scalable, global, fault-tolerant data layer managed service
      • Funded by Avalon, Devonshire (Fidelity), IQT, Rackspace, Samsung Ventures, Toba Capital, Y Combinator
  • 3. Cloudant Overview
      • Operational JSON document store
      • Web service
      • Advanced APIs
      • Replication & Sync
      • Full-text Search
      • Geospatial
      • Incremental MapReduce
      • Scalable, Highly Available Performance
      • Cross-data center data distribution & failover
      • Geo load balancing
      • Multi-tenant and single-tenant clusters
      • Monitoring, admin & dev dashboards
      • Managed 24x7 by experts
  • 4. Cloudant: 34 locations on 5 hosting providers
  • 5. Anatomy of the Cloudant Data Network (diagram)
      • Labels: US-EAST; AP-JP; EU-NL; “Node”; Single-tenant cluster; Multi-tenant cluster; Edge Database Cluster; Mobile Devices; Secondary Data Centers (for DR & distributed access)
      • Flows: HTTP POST, GET,… {JSON doc}; Filtered Replication & Sync
  • 6. How CouchDB Fits In (diagram)
      • Layers and components: Visualization; Lucene Search; Chainable MapReduce; Management; Monitoring; Geospatial Indexing; Geo-Load Balancing; Horizontal Clustering Framework (IOQ, Fabric, Mem3, Rexi); Apache CouchDB; Docs: JSON, Attachments; Developer APIs
      • Annotations: Prioritizing IO types, prevents “noisy neighbors” in multi-tenancy; Clustering API, Sharding, Intra-cluster messaging; GET/PUT docs, Views, Replication…; Connects users to closest copy of data; Dashboards: Monitoring, Admin, Development
  • 7. Why CouchDB?
      • Durable append-only storage engine
      • Sequence tree enabling incremental processing of updates
      • Data structures supporting eventual consistency
      • Sophisticated replication & synchronization
      • The right primitives for a global data network
  • 8. Append-only Storage
  • 9. Append-only Storage
      • Rewrite path to root in each index on document update
      • Large sequential writes, smaller random reads
      • Wasted space must be periodically vacuumed
      • Disk is cheap
      • SSD-friendly access pattern
      • We build what we run ➜ we make things that are easy to run
      • (We automated the heck out of the compactor)
      • This used to be controversial, now everyone does it
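The trade-off on the append-only slide (updates are appended, stale versions accumulate, and a compactor reclaims the space) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not CouchDB's file format: the `AppendOnlyStore` class and its methods are hypothetical names, a Python list stands in for the database file, and a flat dictionary replaces the B-tree whose root path CouchDB rewrites on each update.

```python
import json


class AppendOnlyStore:
    """Toy append-only document store (illustrative only, not CouchDB's format)."""

    def __init__(self):
        self.log = []      # stands in for the on-disk file; entries are never modified
        self.index = {}    # doc id -> position of the latest version in the log

    def put(self, doc_id, doc):
        # Every update is appended; the previous version stays behind as garbage.
        self.log.append(json.dumps({"id": doc_id, "doc": doc}))
        self.index[doc_id] = len(self.log) - 1

    def get(self, doc_id):
        return json.loads(self.log[self.index[doc_id]])["doc"]

    def wasted_entries(self):
        # Log entries that no index entry points to any more.
        return len(self.log) - len(self.index)

    def compact(self):
        # Copy only the live versions into a fresh log, then swap it in.
        new_log, new_index = [], {}
        for doc_id, pos in sorted(self.index.items(), key=lambda kv: kv[1]):
            new_log.append(self.log[pos])
            new_index[doc_id] = len(new_log) - 1
        self.log, self.index = new_log, new_index


store = AppendOnlyStore()
store.put("foo", {"n": 1})
store.put("bar", {"n": 1})
store.put("foo", {"n": 2})          # supersedes the first "foo" entry
wasted_before = store.wasted_entries()
store.compact()
wasted_after = store.wasted_entries()
print(wasted_before, wasted_after, store.get("foo"))
```

Writes only ever touch the end of the log, which is what makes the access pattern sequential; the cost is the stale entry that `compact()` has to copy around later.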
  • 10. Sequence Index
  • 11. Sequence Index
      • Index contents: 1 → foo, 2 → bar, 3 → baz, 4 → bif
      • GET /db/_changes
        {"seq":1, "id": "foo", "rev":"1-..."}
        {"seq":2, "id": "bar", "rev":"1-..."}
        {"seq":3, "id": "baz", "rev":"1-..."}
        {"seq":4, "id": "bif", "rev":"1-..."}
  • 12. Sequence Index
      • Index contents after bar is updated: 1 → foo, 3 → baz, 4 → bif, 5 → bar (the entry 2 → bar is superseded)
      • GET /db/_changes
        {"seq":1, "id": "foo", "rev":"1-..."}
        {"seq":3, "id": "baz", "rev":"1-..."}
        {"seq":4, "id": "bif", "rev":"1-..."}
        {"seq":5, "id": "bar", "rev":"2-..."}
      • OR GET /db/_changes?since=4
        {"seq":5, "id": "bar", "rev":"2-..."}
  • 13. Sequence Index
      • Index each document in order of most recent update
      • Allows incremental, resumable processing in the background
      • Originally, MapReduce views
      • First class API endpoint ➜ DIY integrations (cf. ElasticSearch)
      • Lucene-based text search
      • Geospatial indexes and querying
      • First class internal service ➜ add additional consumers as need arises
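The behaviour shown on the sequence index slides can be modelled in memory as follows. This is a minimal sketch, assuming a hypothetical `SequenceIndex` class; the revision strings are placeholders in the style of the slides ("1-...", "2-..."), and the real feed is served over HTTP by `GET /db/_changes` rather than by a Python method.

```python
class SequenceIndex:
    """Toy by-sequence index: each document appears once, at its latest update."""

    def __init__(self):
        self.update_seq = 0
        self.by_seq = {}   # seq -> (doc id, placeholder rev)
        self.by_id = {}    # doc id -> (seq, generation)

    def update(self, doc_id):
        self.update_seq += 1
        generation = 1
        old = self.by_id.get(doc_id)
        if old is not None:
            old_seq, old_generation = old
            del self.by_seq[old_seq]          # drop the superseded entry
            generation = old_generation + 1
        self.by_id[doc_id] = (self.update_seq, generation)
        self.by_seq[self.update_seq] = (doc_id, f"{generation}-...")
        return self.update_seq

    def changes(self, since=0):
        # Everything updated after `since`, in update order.
        return [
            {"seq": seq, "id": doc_id, "rev": rev}
            for seq, (doc_id, rev) in sorted(self.by_seq.items())
            if seq > since
        ]


index = SequenceIndex()
for doc_id in ["foo", "bar", "baz", "bif"]:
    index.update(doc_id)
index.update("bar")                      # bar moves from seq 2 to seq 5

full_feed = index.changes()
incremental_feed = index.changes(since=4)
print([row["seq"] for row in full_feed])
print(incremental_feed)
```

A consumer such as an indexer only needs to remember the last `seq` it processed; passing that value as `since` resumes exactly where it left off.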
  • 14. Eventual Consistency
  • 15. Eventual Consistency
      • CAP theorem (Brewer)
      • Often over-simplified
      • I’ll offer my own oversimplification: “You must choose P”
      • When faced with a network partition, you optimize for consistency or availability
      • Cloudant is an ODS
      • Availability is paramount
      • Strong consistency across geographies introduces unacceptable latency*
      • * Unless you’re Google and you install atomic clocks in your data centers
  • 16. Eventual Consistency: Hash Histories
      • Multiple concurrent versions of data will happen
      • Default strategy cannot be to discard user data
      • Hash histories track versions of a document
      • Baked into every document
      • Think git
      • Document versions derived from contents + edit history
      • Same series of edits, applied in same order, yield same version ID
      • History comparison detects divergences and how the versions fit into the “family tree”
      • Example revision tree (diagram): 1-5a4..., 2-ab6..., 3-085..., 3-f57..., 4-7ba..., 4-8bf..., 5-d4e...
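The idea that a version ID is derived from contents plus edit history can be illustrated with a small hash chain. This is a sketch under stated assumptions: the helper names `next_rev`, `apply_edits`, and `common_ancestor` are hypothetical, and hashing a JSON encoding with MD5 is only an illustrative choice, not necessarily the exact scheme CouchDB uses to compute revision IDs.

```python
import hashlib
import json


def next_rev(parent_rev, doc):
    """Derive a revision id from the parent revision and the new contents."""
    generation = int(parent_rev.split("-")[0]) + 1 if parent_rev else 1
    payload = json.dumps([parent_rev, doc], sort_keys=True).encode("utf-8")
    return f"{generation}-{hashlib.md5(payload).hexdigest()}"


def apply_edits(edits):
    """Return the list of revision ids produced by applying edits in order."""
    history, rev = [], None
    for doc in edits:
        rev = next_rev(rev, doc)
        history.append(rev)
    return history


def common_ancestor(history_a, history_b):
    """Last revision shared by two histories, or None if they share nothing."""
    ancestor = None
    for rev_a, rev_b in zip(history_a, history_b):
        if rev_a != rev_b:
            break
        ancestor = rev_a
    return ancestor


replica_a = apply_edits([{"v": 1}, {"v": 2}, {"v": 3}])
replica_b = apply_edits([{"v": 1}, {"v": 2}, {"v": "other"}])
replay = apply_edits([{"v": 1}, {"v": 2}, {"v": 3}])

print(replica_a == replay)                      # same edits, same order
print(common_ancestor(replica_a, replica_b) == replica_a[1])
```

Because each ID folds in its parent's ID, two replicas that agree on a revision also agree on everything before it, so finding where the histories stop matching is enough to locate the divergence.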
  • 17. Replication & Synchronization
  • 18. Replication & Sync (diagram)
      • /db1/foo: 1-5a4..., 2-ab6..., 3-085..., 4-7ba..., 4-8bf..., 5-d4e...
      • /db2/foo: 1-5a4..., 2-ab6..., 3-085..., 3-f57...
  • 19. Replication & Sync (diagram)
      • /db1/foo: 1-5a4..., 2-ab6..., 3-085..., 3-f57..., 4-7ba..., 4-8bf..., 5-d4e...
      • /db2/foo: 1-5a4..., 2-ab6..., 3-085..., 3-f57...
  • 20. Replication & Sync (diagram)
      • /db1/foo: 1-5a4..., 2-ab6..., 3-085..., 3-f57..., 4-7ba..., 4-8bf..., 5-d4e...
      • /db2/foo: 1-5a4..., 2-ab6..., 3-085..., 3-f57..., 4-7ba..., 4-8bf..., 5-d4e...
  • 21. Replication & Sync
      • Not your RDBMS’ notion of replication
      • Transfers updates from any source DB to any target DB
      • Builds on earlier primitives
      • Leverages sequence index to determine what’s changed
      • Leverages hash histories to determine what’s missing on the target
      • Critical “anti-entropy” element in clusters
      • DBs are divided into partitions, copies of each partition are stored on multiple distinct nodes
      • Partition copies replicate with each other to ensure that documents are durably stored and that consistency is achieved ... eventually
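The two steps named on the replication slide (read the source's changes since a checkpoint, then send only the revisions the target lacks) can be sketched as below. This is a simplified model, not the CouchDB replicator: the `Database` class and `replicate` function are hypothetical, revisions are kept as flat sets rather than trees, the change log keeps every write instead of one entry per document, and the revision IDs are the truncated labels from the slides.

```python
class Database:
    """Toy database: a set of revision ids per document plus a change log."""

    def __init__(self):
        self.revs = {}      # doc id -> set of revision ids stored here
        self.changes = []   # append-only list of (seq, doc id, rev)

    def write(self, doc_id, rev):
        known = self.revs.setdefault(doc_id, set())
        if rev in known:
            return False
        known.add(rev)
        self.changes.append((len(self.changes) + 1, doc_id, rev))
        return True

    def changes_since(self, since):
        return [change for change in self.changes if change[0] > since]

    def missing(self, doc_id, revs):
        known = self.revs.get(doc_id, set())
        return [rev for rev in revs if rev not in known]


def replicate(source, target, checkpoint=0):
    """Copy revisions the target lacks; return (new checkpoint, revisions copied)."""
    copied = 0
    for seq, doc_id, rev in source.changes_since(checkpoint):
        for missing_rev in target.missing(doc_id, [rev]):
            target.write(doc_id, missing_rev)
            copied += 1
        checkpoint = seq
    return checkpoint, copied


db1, db2 = Database(), Database()
for rev in ["1-5a4", "2-ab6", "3-085"]:
    db1.write("foo", rev)
    db2.write("foo", rev)
for rev in ["4-7ba", "4-8bf", "5-d4e"]:
    db1.write("foo", rev)
db2.write("foo", "3-f57")

checkpoint_2_to_1, pulled = replicate(db2, db1)   # db1 gains 3-f57
checkpoint_1_to_2, pushed = replicate(db1, db2)   # db2 gains the other branch
print(pulled, pushed, db1.revs == db2.revs)
```

Running `replicate` again with the returned checkpoint transfers nothing, which is the resumable behaviour the sequence index makes possible; running it in both directions leaves both copies with the same set of revisions, as in the three diagram slides.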
  • 22. Why CouchDB Recap
      • Durable append-only storage engine
      • Sequence tree enabling incremental processing of updates
      • Data structures supporting eventual consistency
      • Sophisticated replication & synchronization
  • 23. What’s Next?
      • BigCouch ➜ CouchDB
      • Cloudant will continue development under ASF umbrella
      • Fewer code forks ➜ better velocity
      • New CouchDB web UI “Fauxton”
      • Better developer tooling for server-side code
      • Plugins for Cloudant-specific functionality
      • Cloudant is betting on data “at the edge”
  • 24. Thank You adam@cloudant.com @kocolosk