CouchDB

An overview of CouchDB. Originally presented internally at University of Calgary IT.

Transcript

  • 1. CouchDB. King Chung Huang, Information Technologies, University of Calgary
  • 2. Relax
  • 3. Today’s Talk: Document-oriented Databases, CouchDB Overview, Demonstrations
  • 4. Document-oriented Databases
  • 5. Databases
  • 6. Databases: Flat, Hierarchical, Network, Relational
  • 7. Post-Relational Databases: Dimensional, Object, Document-oriented
  • 8. Document-oriented Databases
      • Comparable to documents in the real world
      • Records are stored as schema-less documents
          ■ Each document is uniquely named
          ■ Documents are the primary unit of storage
      • Structures are not explicitly defined
          ■ No tables with uniform, pre-defined fields
          ■ Every document can have varying fields of different types
      • Documents are self-contained
          ■ Data is not decomposed into tables with relations
          ■ Documents contain the context needed to understand them
  • 9. Document-oriented Databases
      • Examples
          ■ Lotus Notes
          ■ Amazon SimpleDB
          ■ CouchDB
      • Key-Value Stores
          ■ Amazon S3
          ■ Dynamo: Amazon’s Highly Available Key-value Store, DeCandia, et al., 2007
          ■ Facebook Cassandra
          ■ Recently accepted as an Apache incubation project
          ■ Google BigTable
          ■ Bigtable: A Distributed Storage System for Structured Data, Chang, et al., 2006
  • 10. CouchDB Overview
  • 11. What is CouchDB? Document database server, REST API, JSON documents, Views with MapReduce, Highly Scalable
  • 12. Document Database Server
      • Implemented in Erlang
          ■ Ericsson Language
          ■ Highly concurrent, functional programming language
      • Designed with modern web applications in mind
      • Atomic Consistent Isolated Durable (ACID)
      • “Crash-only” design
      • Supports external handlers
          ■ Change notification
          ■ Custom processing
  • 13. REST HTTP API
      • Representational State Transfer
          ■ A set of principles about how resources are defined and addressed
      • World Wide Web (HTTP) is RESTful
          ■ Uniform interface for accessing resources
          ■ Resources identified by URI
          ■ Actions transmitted in HTTP methods
          ■ Status communicated in status codes
  • 14. REST HTTP API: CRUD
      • Create, Read, Update, and Delete
      • In HTTP
          ■ POST /some/resource/id
          ■ GET /some/resource/id
          ■ PUT /some/resource/id
          ■ DELETE /some/resource/id
  • 15. JSON Documents
      • JavaScript Object Notation
          ■ Considered language-independent
      • CouchDB stored XML documents before version 0.8
          ■ Suitable if content is already in XML
          ■ Human readable, but can be onerous to type
          ■ Markup language, requires transformation from/to data structures
      • Represents primitive data types and structures
          ■ Strings, numbers, booleans
          ■ Arrays, dictionaries
          ■ Null
      • Documents can have attachments
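The CRUD-to-HTTP mapping on slide 14 can be sketched with Python's standard library. This is only an illustration: the server address, port, database name `blog`, and document id are assumptions, and the requests are built but never sent. It follows the slide's mapping of POST to create; CouchDB itself also accepts PUT to create a document under a chosen id.

```python
import json
from urllib.request import Request

# Hypothetical server address and database name, chosen for illustration only.
BASE = "http://localhost:5984/blog"


def crud_requests(doc_id, doc):
    """Build (but do not send) one HTTP request per CRUD operation."""
    url = f"{BASE}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return {
        "create": Request(url, data=body, headers=headers, method="POST"),
        "read": Request(url, method="GET"),
        "update": Request(url, data=body, headers=headers, method="PUT"),
        "delete": Request(url, method="DELETE"),
    }


reqs = crud_requests("post1", {"title": "A Blog Post"})
for name, req in reqs.items():
    # Each operation targets the same URI; only the HTTP method differs.
    print(name, req.get_method(), req.full_url)
```

The same resource URI appears in all four requests, which is the "uniform interface" point from slide 13: the action lives in the method, not in the address.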
  • 16. JSON Documents: Example
      {
        "_id": "post1",
        "_rev": "123456",
        "title": "A Blog Post",
        "tags": ["blue", "glue"],
        "post_date": 1239910768,
        "body": "Once upon a time…",
        "is_published": true
      }
  • 20. JSON Documents: Example
      {
        "_id": "post1",
        "_rev": "123456",
        …
        "_attachments": {
          "picture.png": {
            "stub": true,
            "content_type": "image/png",
            "length": 384
          }
        }
      }
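To make the data types from slide 15 concrete, the blog post document from slide 16 can be parsed with Python's `json` module. This is a small sketch; the keys are double-quoted here because strict JSON requires it, and the printed type names are Python's, not CouchDB's.

```python
import json

# The example document from slide 16, written as strict JSON.
raw = """
{
  "_id": "post1",
  "_rev": "123456",
  "title": "A Blog Post",
  "tags": ["blue", "glue"],
  "post_date": 1239910768,
  "body": "Once upon a time…",
  "is_published": true
}
"""

doc = json.loads(raw)

# Show which native type each JSON value was decoded into.
for field, value in doc.items():
    print(f"{field}: {type(value).__name__}")
```

Strings, numbers, booleans, and arrays map directly onto native types, with no schema declared anywhere.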
  • 21. Views
      • Used to sort and filter through data
      • Lazily evaluated, highly efficient
          ■ Similar to indexing in relational databases
      • Defined in design documents
          ■ Documents named _design/…
      • Consist of map and reduce functions
          ■ Language independent
          ■ JavaScript supported by default
          ■ Mozilla SpiderMonkey included
  • 22. Data Processing with MapReduce
      • Programming model for processing and generating large data sets
      • Related, but not equivalent, to map and reduce operations in functional languages
      • Take and produce key/value pairs with map and reduce functions
      • Map functions
          ■ Take input key/value pairs and produce an intermediate set of key/value pairs
      • Reduce functions
          ■ Take an intermediate key and a set of values for that key, and merge them into a possibly smaller set of values
      • MapReduce: Simplified Data Processing on Large Clusters, Jeff Dean, Sanjay Ghemawat, Google Inc.
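As a rough illustration of slide 21, a design document is an ordinary JSON document whose `views` field holds map and reduce functions as source strings. The sketch below builds one as a Python dictionary. The names `posts`, `by_date`, and `count_by_tag` are hypothetical, and the JavaScript bodies are minimal examples, not taken from the talk.

```python
import json

# Hypothetical design document; the view names are illustrative only.
design_doc = {
    "_id": "_design/posts",
    "language": "javascript",
    "views": {
        # Map only: one row per post, keyed by its date.
        "by_date": {
            "map": "function (doc) { emit(doc.post_date, doc); }",
        },
        # Map plus reduce: one row per tag, then summed per key.
        "count_by_tag": {
            "map": "function (doc) { doc.tags.forEach(function (t) { emit(t, 1); }); }",
            "reduce": "function (keys, values) { return sum(values); }",
        },
    },
}

# The functions travel as plain strings, so the whole thing serializes as JSON.
print(json.dumps(design_doc, indent=2))
```

Because the functions are stored as text inside a document, views are versioned and replicated like any other data.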
  • 23. Data Processing with MapReduce Example { _id: “post1”, _rev: “123456”, title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…”, is_published: true }
  • 24. Data Processing with MapReduce Example “post1” = { _id: “post1”, _rev: “123456”, title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…”, is_published: true }
  • 25. Data Processing with MapReduce Example “post1” = { title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…” }
  • 26. Data Processing with MapReduce: Emit Posts by post_date
      “post1” = { title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…” }
      1239910768 = { title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…” }
  • 27. Data Processing with MapReduce: Emit Posts by post_date
      1208456184  {title: “A bloody long time ago”, …}
      1215421546  {title: “A blue moon ago”, …}
      1222654641  {title: “Just Yesterday”, …}
      1239910768  {title: “A Blog Post”, …}
      1246816518  {title: “That was Then”, …}
      1251687980  {title: “This is Now”, …}
      1264836981  {title: “When Will Then Be Now?”, …}
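The "emit posts by post_date" step on slides 26 and 27 can be simulated in plain Python. This is a sketch of the idea, not CouchDB's implementation: the ids `post2` and `post3` and the helper names `build_view` and `query_range` are made up for illustration, and the view is rebuilt eagerly rather than lazily.

```python
# Three posts from the slides; "post2" and "post3" are hypothetical ids.
posts = {
    "post1": {"title": "A Blog Post", "tags": ["blue", "glue"], "post_date": 1239910768},
    "post2": {"title": "Just Yesterday", "tags": ["blue", "clue"], "post_date": 1222654641},
    "post3": {"title": "This is Now", "tags": ["flue"], "post_date": 1251687980},
}


def map_by_date(doc):
    """Map function: emit one (key, value) pair per post, keyed by date."""
    yield doc["post_date"], doc


def build_view(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [
        (key, doc_id, value)
        for doc_id, doc in docs.items()
        for key, value in map_fn(doc)
    ]
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


def query_range(rows, startkey, endkey):
    """Return the rows whose key lies in the inclusive range."""
    return [row for row in rows if startkey <= row[0] <= endkey]


by_date = build_view(posts, map_by_date)
titles = [value["title"] for _, _, value in by_date]
recent = [value["title"] for _, _, value in query_range(by_date, 1230000000, 1260000000)]
print(titles)
print(recent)
```

Sorting the emitted rows by key is what makes the view behave like an index: a date range becomes a contiguous slice of rows.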
  • 28. Data Processing with MapReduce: Emit Posts by tag
      “post1” = { title: “A Blog Post”, tags: [“blue”, “glue”], post_date: 1239910768, body: “Once upon a time…” }
      “blue” = { title: “A Blog Post”, … }
      “glue” = { title: “A Blog Post”, … }
  • 29. Data Processing with MapReduce: Emit Posts by tag
      blue   {title: “Just Yesterday”, …}
      blue   {title: “A Blog Post”, …}
      clue   {title: “Just Yesterday”, …}
      flue   {title: “When Will Then Be Now?”, …}
      flue   {title: “This is Now”, …}
      glue   {title: “A Blog Post”, …}
      wazoo  {title: “That was Then”, …}
  • 30. Data Processing with MapReduce: Emit Posts by tag, Reduced
      blue   {title: “Just Yesterday”, …}, {title: “A Blog Post”, …}
      clue   {title: “Just Yesterday”, …}
      flue   {title: “When Will Then Be Now?”, …}, {title: “This is Now”, …}
      glue   {title: “A Blog Post”, …}
      wazoo  {title: “That was Then”, …}
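The tag grouping on slides 28 to 30 can be reproduced with a short Python sketch. It uses the titles and tags shown on slide 29. The map function emits one row per tag, and the reduce step here simply collects the values for each key. The function names are illustrative, and a real CouchDB reduce function would be written in JavaScript.

```python
from itertools import groupby

# Titles and tags as shown on slide 29.
posts = [
    {"title": "Just Yesterday", "tags": ["blue", "clue"]},
    {"title": "A Blog Post", "tags": ["blue", "glue"]},
    {"title": "When Will Then Be Now?", "tags": ["flue"]},
    {"title": "This is Now", "tags": ["flue"]},
    {"title": "That was Then", "tags": ["wazoo"]},
]


def map_by_tag(doc):
    """Map function: emit one (tag, title) pair for every tag on a post."""
    for tag in doc["tags"]:
        yield tag, doc["title"]


def reduce_collect(values):
    """Reduce function: merge the values for one key into a single list."""
    return list(values)


# Map phase: emit rows, then sort by key (the sort is stable within a key).
rows = sorted(
    (pair for doc in posts for pair in map_by_tag(doc)),
    key=lambda row: row[0],
)

# Reduce phase: group adjacent rows sharing a key and reduce each group.
reduced = {
    tag: reduce_collect(title for _, title in group)
    for tag, group in groupby(rows, key=lambda row: row[0])
}

for tag, titles in reduced.items():
    print(tag, titles)
```

A post with two tags appears under both keys, which is why seven rows come out of five documents before the reduce step merges them into five keys.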
  • 31. Scalability
      • Incremental MapReduce
      • Multiversion Concurrency Control (MVCC)
          ■ Achieves serializability through multiversioning instead of locking
          ■ Eliminates waits to access objects
          ■ Updates create new document revisions
          ■ Tradeoff point: no waits, increased data storage
      • Incremental Distributed Replication
      • Eventual Consistency
          ■ Changes eventually propagate through distributed systems
          ■ Tradeoff point: increased availability and tolerance, decreased freshness
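The MVCC points on slide 31 can be illustrated with a toy in-memory store. This is a deliberately simplified sketch, not how CouchDB is implemented: the class `ToyStore`, the exception `Conflict`, and the integer revision numbers are all assumptions made for the example. The idea it shows is that writers must name the revision they started from, readers never block, and old revisions are kept at the cost of extra storage.

```python
class Conflict(Exception):
    """Raised when an update is based on a stale revision."""


class ToyStore:
    """Hypothetical in-memory store that keeps every revision of a document."""

    def __init__(self):
        self._revisions = {}  # doc_id -> list of (rev, body), oldest first

    def get(self, doc_id):
        """Return the latest revision; reads never wait on writers."""
        rev, body = self._revisions[doc_id][-1]
        return {"_id": doc_id, "_rev": rev, **body}

    def put(self, doc_id, body, rev=None):
        """Append a new revision if `rev` matches the current one."""
        history = self._revisions.get(doc_id, [])
        current = history[-1][0] if history else None
        if rev != current:
            raise Conflict(f"expected revision {current}, got {rev}")
        new_rev = len(history) + 1
        self._revisions[doc_id] = history + [(new_rev, dict(body))]
        return new_rev


store = ToyStore()
first = store.put("post1", {"title": "Draft"})

# Two clients read the same revision, then both try to update it.
client_a = store.get("post1")
client_b = store.get("post1")
second = store.put("post1", {"title": "A Blog Post"}, rev=client_a["_rev"])

try:
    store.put("post1", {"title": "Another Title"}, rev=client_b["_rev"])
    conflict_detected = False
except Conflict:
    conflict_detected = True

print(first, second, conflict_detected, store.get("post1")["title"])
```

The second writer is rejected instead of being made to wait on a lock, and it would have to re-read the document and retry. A real server reports this situation through an HTTP conflict status rather than a Python exception.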
  • 32. Demonstrations