CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Cloudant - StampedeCon 2013

CouchDB at Its Core
Global Data Storage and Rich Incremental Indexing at Cloudant
Adam Kocoloski
StampedeCon 2013

What is Cloudant?
• Founded by “big data” scientists
• Particle physicists @ MIT analyzing petabytes of collider data
• Frustrated by inadequate tools, founders became experts in scaling CouchDB (“BigCouch”)
• Started Cloudant in 2008 as a managed data layer
• Premise: Apps should grow into their data layer, not out of it
• Built: Scalable, global, fault-tolerant data layer managed service
• Funded by Avalon, Devonshire (Fidelity), IQT, Rackspace, Samsung Ventures, Toba Capital, Y Combinator

Cloudant Overview
• Operational JSON document store
• Web service
• Advanced APIs
• Replication & Sync
• Full-text Search
• Geospatial
• Incremental MapReduce
• Scalable, Highly Available Performance
• Cross-data center data distribution & failover
• Geo load balancing
• Multi-tenant and single-tenant clusters
• Monitoring, admin & dev dashboards
• Managed 24x7 by experts

Cloudant: 34 locations on 5 hosting providers

Anatomy of the Cloudant Data Network
[Diagram: a US-EAST “node” containing a single-tenant cluster and a multi-tenant cluster, accessed over HTTP (POST, GET, … with {JSON doc} bodies). Filtered replication & sync connects it to an edge database cluster, mobile devices, and secondary data centers for DR & distributed access (labeled AP-JP and EU-NL).]

How CouchDB Fits In
• Developer APIs: Visualization, Lucene Search, Chainable MapReduce, Geospatial Indexing, Management, Monitoring
• IOQ: Prioritizing IO types; prevents “noisy neighbors” in multi-tenancy
• Fabric, Mem3, Rexi (Horizontal Clustering Framework): Clustering API, Sharding, Intra-cluster messaging
• Apache CouchDB: Docs (JSON, Attachments); GET/PUT docs, Views, Replication…
• Geo-Load Balancing: Connects users to closest copy of data
• Dashboards: Monitoring, Admin, Development

Why CouchDB?
• Durable append-only storage engine
• Sequence tree enabling incremental processing of updates
• Data structures supporting eventual consistency
• Sophisticated replication & synchronization
The right primitives for a global data network

Append-only Storage
• Rewrite path to root in each index on document update
• Large sequential writes, smaller random reads
• Wasted space must be periodically vacuumed
• Disk is cheap
• SSD-friendly access pattern
• We build what we run ➜ we make things that are easy to run
• (We automated the heck out of the compactor)
This used to be controversial; now everyone does it
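The "rewrite path to root" idea can be sketched in a few lines. This is a toy illustration, not CouchDB's storage engine: a plain binary search tree stands in for the B+tree indexes, a Python list stands in for the database file, and the names `AppendOnlyTree`, `put`, `get`, and `compact` are hypothetical.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: str
    value: str
    left: Optional[int]   # log offset of the left child, or None
    right: Optional[int]  # log offset of the right child, or None


class AppendOnlyTree:
    """Toy append-only index: nodes are never modified in place."""

    def __init__(self):
        self.log = []      # stands in for the database file; append-only
        self.root = None   # log offset of the current root

    def _append(self, node):
        self.log.append(node)
        return len(self.log) - 1

    def _put(self, offset, key, value):
        # Copy every node on the path from the changed key up to the root.
        if offset is None:
            return self._append(Node(key, value, None, None))
        node = self.log[offset]
        if key == node.key:
            return self._append(node._replace(value=value))
        if key < node.key:
            return self._append(node._replace(left=self._put(node.left, key, value)))
        return self._append(node._replace(right=self._put(node.right, key, value)))

    def put(self, key, value):
        self.root = self._put(self.root, key, value)

    def get(self, key, root=None):
        # Passing an older root offset reads an older, still intact snapshot.
        offset = self.root if root is None else root
        while offset is not None:
            node = self.log[offset]
            if key == node.key:
                return node.value
            offset = node.left if key < node.key else node.right
        return None

    def compact(self):
        # "Vacuum": copy only the nodes reachable from the current root.
        fresh = AppendOnlyTree()

        def copy(offset):
            if offset is None:
                return None
            node = self.log[offset]
            left, right = copy(node.left), copy(node.right)
            return fresh._append(Node(node.key, node.value, left, right))

        fresh.root = copy(self.root)
        return fresh
```

Every update only appends, so an old root still describes a consistent snapshot, and the superseded nodes are the wasted space that compaction reclaims.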

Sequence Index
Index entries (seq ➜ doc): 1 ➜ foo, 2 ➜ bar, 3 ➜ baz, 4 ➜ bif
GET /db/_changes
{"seq":1, "id": "foo", "rev":"1-..."}
{"seq":2, "id": "bar", "rev":"1-..."}
{"seq":3, "id": "baz", "rev":"1-..."}
{"seq":4, "id": "bif", "rev":"1-..."}

Sequence Index
Index entries after updating bar (seq ➜ doc): 1 ➜ foo, 3 ➜ baz, 4 ➜ bif, 5 ➜ bar (the old entry 2 ➜ bar is superseded)
GET /db/_changes
{"seq":1, "id": "foo", "rev":"1-..."}
{"seq":3, "id": "baz", "rev":"1-..."}
{"seq":4, "id": "bif", "rev":"1-..."}
{"seq":5, "id": "bar", "rev":"2-..."}
OR
GET /db/_changes?since=4
{"seq":5, "id": "bar", "rev":"2-..."}

Sequence Index
• Index each document in order of most recent update
• Allows incremental, resumable processing in the background
• Originally, MapReduce views
• First-class API endpoint ➜ DIY integrations (cf. ElasticSearch)
• Lucene-based text search
• Geospatial indexes and querying
• First-class internal service ➜ add additional consumers as need arises
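A minimal in-memory sketch of the sequence index described above, assuming a simplified model in which each document has one current revision. `SequenceIndex` and `Indexer` are hypothetical names for illustration, not CouchDB internals.

```python
class SequenceIndex:
    """Toy sequence index: each doc appears once, at its latest update seq."""

    def __init__(self):
        self.update_seq = 0
        self.by_id = {}    # doc id -> (seq, rev)
        self.by_seq = {}   # seq -> (doc id, rev)

    def update(self, doc_id, rev):
        old = self.by_id.get(doc_id)
        if old is not None:
            # Drop the doc's previous entry so it is listed only once.
            del self.by_seq[old[0]]
        self.update_seq += 1
        self.by_id[doc_id] = (self.update_seq, rev)
        self.by_seq[self.update_seq] = (doc_id, rev)
        return self.update_seq

    def changes(self, since=0):
        # Analogous to GET /db/_changes?since=N in the slides.
        return [
            {"seq": seq, "id": doc_id, "rev": rev}
            for seq, (doc_id, rev) in sorted(self.by_seq.items())
            if seq > since
        ]


class Indexer:
    """Hypothetical background consumer that resumes from a checkpoint."""

    def __init__(self):
        self.checkpoint = 0
        self.seen = []

    def catch_up(self, index):
        for row in index.changes(since=self.checkpoint):
            self.seen.append(row["id"])      # real work (e.g. indexing) goes here
            self.checkpoint = row["seq"]     # remember how far we got
```

A consumer that stores its checkpoint can stop at any time and later process only the rows with a higher sequence number, which is what makes the background processing incremental and resumable.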

Eventual Consistency
• CAP theorem (Brewer)
• Often over-simplified
• I’ll offer my own oversimplification: “You must choose P”
• When faced with a network partition, you optimize for consistency or availability
• Cloudant is an operational data store (ODS)
• Availability is paramount
• Strong consistency across geographies introduces unacceptable latency*
* Unless you’re Google and you install atomic clocks in your data centers

Eventual Consistency: Hash Histories
• Multiple concurrent versions of data will happen
• Default strategy cannot be to discard user data
• Hash histories track versions of a document
• Baked into every document
• Think git
• Document versions derived from contents + edit history
• Same series of edits, applied in same order, yield same version ID
• History comparison detects divergences and how the versions fit into the “family tree”
[Diagram: revision tree of one document. 1-5a4... ➜ 2-ab6... ➜ 3-085... and 3-f57... (divergent siblings); later revisions 4-7ba... and 4-8bf..., then 5-d4e...]
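The properties listed above can be illustrated with a short sketch. The hashing scheme here (MD5 over the parent revision plus the JSON body) is an assumption chosen for illustration and is not CouchDB's actual revision algorithm; `next_rev`, `apply_edits`, and `compare` are hypothetical helpers.

```python
import hashlib
import json


def next_rev(parent_rev, body):
    """Derive a revision ID from the parent revision and the new contents."""
    generation = 1 if parent_rev is None else int(parent_rev.split("-")[0]) + 1
    payload = json.dumps([parent_rev, body], sort_keys=True).encode("utf-8")
    return f"{generation}-{hashlib.md5(payload).hexdigest()}"


def apply_edits(bodies):
    """Apply a series of edits in order and return the resulting history."""
    history, rev = [], None
    for body in bodies:
        rev = next_rev(rev, body)
        history.append(rev)
    return history


def compare(history_a, history_b):
    """Return (shared ancestry, revisions only in a, revisions only in b)."""
    n = 0
    while n < min(len(history_a), len(history_b)) and history_a[n] == history_b[n]:
        n += 1
    return history_a[:n], history_a[n:], history_b[n:]
```

Two replicas that apply the same edits in the same order end up with identical histories without coordinating, while a differing edit yields a different ID at the same generation, so `compare` can report where the branches diverge.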

Replication & Synchronization

Replication & Sync
/db1/foo: 1-5a4..., 2-ab6..., 3-085..., 4-7ba... 4-8bf..., 5-d4e...
/db2/foo: 1-5a4..., 2-ab6..., 3-085... 3-f57...

Replication & Sync
/db1/foo: 1-5a4..., 2-ab6..., 3-085... 3-f57..., 4-7ba... 4-8bf..., 5-d4e...
/db2/foo: 1-5a4..., 2-ab6..., 3-085... 3-f57...

Replication & Sync
/db1/foo: 1-5a4..., 2-ab6..., 3-085... 3-f57..., 4-7ba... 4-8bf..., 5-d4e...
/db2/foo: 1-5a4..., 2-ab6..., 3-085... 3-f57..., 4-7ba... 4-8bf..., 5-d4e...

Replication & Sync
• Not your RDBMS’ notion of replication
• Transfers updates from any source DB to any target DB
• Builds on earlier primitives
• Leverages sequence index to determine what’s changed
• Leverages hash histories to determine what’s missing on the target
• Critical “anti-entropy” element in clusters
• DBs are divided into partitions, copies of each partition are stored on multiple distinct nodes
• Partition copies replicate with each other to ensure that documents are durably stored and that consistency is achieved ... eventually
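The two steps named above (read the source's sequence index, then ask the target which revisions it lacks) can be sketched as follows. This is a simplified model that tracks revision IDs only, with no document bodies or tree structure; `Database`, `missing_revs`, and `replicate` are hypothetical names, and the revision IDs are the abbreviated ones from the diagrams.

```python
class Database:
    """Toy database that records which revision IDs it holds per document."""

    def __init__(self):
        self.update_seq = 0
        self.feed = []    # (seq, doc_id, rev) in update order
        self.revs = {}    # doc_id -> set of known revision IDs

    def add_rev(self, doc_id, rev):
        known = self.revs.setdefault(doc_id, set())
        if rev in known:
            return  # already stored; storing a revision is idempotent
        known.add(rev)
        self.update_seq += 1
        self.feed.append((self.update_seq, doc_id, rev))

    def changes_since(self, since):
        return [row for row in self.feed if row[0] > since]

    def missing_revs(self, doc_id, revs):
        known = self.revs.get(doc_id, set())
        return [rev for rev in revs if rev not in known]


def replicate(source, target, since=0):
    """Copy revisions the target lacks; return (transferred, checkpoint)."""
    transferred, checkpoint = [], since
    for seq, doc_id, rev in source.changes_since(since):
        # Sequence index says what changed; the target says what is missing.
        if target.missing_revs(doc_id, [rev]):
            target.add_rev(doc_id, rev)
            transferred.append(rev)
        checkpoint = seq
    return transferred, checkpoint
```

Running `replicate` in both directions converges the two copies, and passing the returned checkpoint as `since` on the next run skips everything already examined.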

Why CouchDB Recap
• Durable append-only storage engine
• Sequence tree enabling incremental processing of updates
• Data structures supporting eventual consistency
• Sophisticated replication & synchronization

What’s Next?
• BigCouch ➜ CouchDB
• Cloudant will continue development under ASF umbrella
• Fewer code forks ➜ better velocity
• New CouchDB web UI “Fauxton”
• Better developer tooling for server-side code
• Plugins for Cloudant-specific functionality
• Cloudant is betting on data “at the edge”

Thank You
adam@cloudant.com
@kocolosk
