CouchDB

An overview of CouchDB. Originally presented internally at University of Calgary IT.

Published in Technology
Transcript

  • 1. CouchDB. King Chung Huang, Information Technologies, University of Calgary
  • 2. Relax
  • 3. Today’s Talk
      • Document-oriented Databases
      • CouchDB Overview
      • Demonstrations
  • 4. Document-oriented Databases
  • 5. Databases
  • 6. Databases
      • Flat
      • Hierarchical
      • Network
      • Relational
  • 7. Post-Relational Databases
      • Dimensional
      • Object
      • Document-oriented
  • 8. Document-oriented Databases
      • Comparable to documents in the real world
      • Records are stored as schema-less documents
          ■ Each document is uniquely named
          ■ Documents are the primary unit of storage
      • Structures are not explicitly defined
          ■ No tables with uniform, pre-defined fields
          ■ Every document can have varying fields of different types
      • Documents are self-contained
          ■ Data is not decomposed into tables with relations
          ■ Documents contain the context needed to understand them
  • 9. Document-oriented Databases
      • Examples
          ■ Lotus Notes
          ■ Amazon SimpleDB
          ■ CouchDB
      • Key-Value Stores
          ■ Amazon S3
          ■ Dynamo: Amazon’s Highly Available Key-value Store, DeCandia, et al., 2007
          ■ Facebook Cassandra
          ■ Recently accepted as an Apache incubation project
          ■ Google BigTable
          ■ Bigtable: A Distributed Storage System for Structured Data, Chang, et al., 2006
  • 10. CouchDB Overview
  • 11. What is CouchDB?
      • Document database server
      • REST API
      • JSON documents
      • Views with MapReduce
      • Highly scalable
  • 12. Document Database Server
      • Implemented in Erlang
          ■ Ericsson Language
          ■ Highly concurrent, functional programming language
      • Designed with modern web applications in mind
      • Atomic Consistent Isolated Durable (ACID)
      • “Crash-only” design
      • Supports external handlers
          ■ Change notification
          ■ Custom processing
  • 13. REST HTTP API
      • Representational State Transfer
          ■ A set of principles about how resources are defined and addressed
      • World Wide Web (HTTP) is RESTful
          ■ Uniform interface for accessing resources
          ■ Resources identified by URI
          ■ Actions transmitted in HTTP methods
          ■ Status communicated in status codes
  • 14. REST HTTP API
      • CRUD: Create, Read, Update, and Delete
      • In HTTP
          ■ POST /some/resource/id
          ■ GET /some/resource/id
          ■ PUT /some/resource/id
          ■ DELETE /some/resource/id
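As a rough illustration of the CRUD-to-HTTP mapping on this slide, the sketch below builds (but does not send) one request per method with Python's standard library. The host is an assumption; 5984 is CouchDB's default port, and the path is the placeholder used on the slide.

```python
import json
from urllib.request import Request

# Hypothetical resource URL; adjust host and path for a real server.
RESOURCE = "http://localhost:5984/some/resource/id"

def build_request(method, body=None):
    """Build an HTTP request for the given method without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(RESOURCE, data=data, headers=headers, method=method)

# One request per CRUD action, following the slide's mapping.
crud = {
    "create": build_request("POST", {"title": "A Blog Post"}),
    "read": build_request("GET"),
    "update": build_request("PUT", {"title": "A Blog Post", "is_published": True}),
    "delete": build_request("DELETE"),
}

for action, req in crud.items():
    print(action, req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it. In CouchDB itself, a PUT to a document URL is also commonly used to create a document with a chosen id.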
  • 15. JSON Documents
      • JavaScript Object Notation
          ■ Considered language-independent
      • CouchDB stored XML documents before version 0.8
          ■ Suitable if content is already in XML
          ■ Human readable, but can be onerous to type
          ■ Markup language, requires transformation from/to data structures
      • Represents primitive data types and structures
          ■ Strings, numbers, booleans
          ■ Arrays, dictionaries
          ■ Null
      • Documents can have attachments
  • 16-19. JSON Documents Example
      {
        "_id": "post1",
        "_rev": "123456",
        "title": "A Blog Post",
        "tags": ["blue", "glue"],
        "post_date": 1239910768,
        "body": "Once upon a time…",
        "is_published": true
      }
  • 20. JSON Documents Example
      {
        "_id": "post1",
        "_rev": "123456",
        …
        "_attachments": {
          "picture.png": {
            "stub": true,
            "content_type": "image/png",
            "length": 384
          }
        }
      }
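To make the type mapping concrete, here is a minimal sketch that parses the example blog post as strict JSON (quoted keys, straight quotes) with Python's standard `json` module and prints the native type each field becomes. It only illustrates the data format and does not talk to CouchDB.

```python
import json

# The example document from the slides, written as strict JSON.
raw = """
{
  "_id": "post1",
  "_rev": "123456",
  "title": "A Blog Post",
  "tags": ["blue", "glue"],
  "post_date": 1239910768,
  "body": "Once upon a time…",
  "is_published": true
}
"""

doc = json.loads(raw)

# Each JSON type maps onto a native Python type (str, list, int, bool).
for field, value in doc.items():
    print(f"{field}: {type(value).__name__}")

# Serializing and parsing again yields the same structure.
round_tripped = json.loads(json.dumps(doc))
```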
  • 21. Views
      • Used to sort and filter through data
      • Lazily evaluated, highly efficient
          ■ Similar to indexing in relational databases
      • Defined in design documents
          ■ Documents named _design/…
      • Consist of map and reduce functions
          ■ Language independent
          ■ JavaScript supported by default
          ■ Mozilla SpiderMonkey included
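A design document is itself a JSON document whose view functions are stored as source-code strings. The sketch below builds one as a Python dictionary and prints it as JSON. The names `_design/blog` and `by_date` are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical design document: the id suffix "blog" and the view name
# "by_date" are illustrative. The map function is stored as a string of
# JavaScript source, which the server's view engine evaluates.
design_doc = {
    "_id": "_design/blog",
    "language": "javascript",
    "views": {
        "by_date": {
            "map": "function(doc) { emit(doc.post_date, doc); }"
        }
    },
}

print(json.dumps(design_doc, indent=2))
```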
  • 22. Data Processing with MapReduce
      • Programming model for processing and generating large data sets
      • Related, but not equivalent, to map and reduce operations in functional languages
      • Take and produce key/value pairs with map and reduce functions
      • Map functions
          ■ Take input key/value pairs and produce an intermediate set of key/value pairs
      • Reduce functions
          ■ Take an intermediate key and a set of values for that key, and merge them into a possibly smaller set of values
      • MapReduce: Simplified Data Processing on Large Clusters, Jeff Dean and Sanjay Ghemawat, Google Inc.
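The model on this slide can be simulated in a few lines of plain Python. This is a sketch of the programming model only, not CouchDB's view engine: `map_reduce`, `tags_map`, `count_reduce`, and the sample documents are all hypothetical names and data.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    """Run map_fn over every document, group the emitted pairs by key,
    then reduce each key's values to a single result."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}

def tags_map(doc):
    # Emit one (tag, 1) pair per tag on the document.
    for tag in doc.get("tags", []):
        yield tag, 1

def count_reduce(key, values):
    # Merge the values for one key into a smaller result: their sum.
    return sum(values)

# Hypothetical sample documents.
posts = [
    {"_id": "post1", "tags": ["blue", "glue"]},
    {"_id": "post2", "tags": ["blue", "clue"]},
]

print(map_reduce(posts, tags_map, count_reduce))
# → {'blue': 2, 'clue': 1, 'glue': 1}
```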
  • 23. Data Processing with MapReduce: Example
      { _id: "post1", _rev: "123456", title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…", is_published: true }
  • 24. Data Processing with MapReduce: Example
      "post1" = { _id: "post1", _rev: "123456", title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…", is_published: true }
  • 25. Data Processing with MapReduce: Example
      "post1" = { title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…" }
  • 26. Data Processing with MapReduce: Emit Posts by post_date
      "post1" = { title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…" }
      1239910768 = { title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…" }
  • 27. Data Processing with MapReduce: Emit Posts by post_date
      1208456184  {title: "A bloody long time ago", …}
      1215421546  {title: "A blue moon ago", …}
      1222654641  {title: "Just Yesterday", …}
      1239910768  {title: "A Blog Post", …}
      1246816518  {title: "That was Then", …}
      1251687980  {title: "This is Now", …}
      1264836981  {title: "When Will Then Be Now?", …}
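Because emitted rows are kept sorted by key, a view keyed on post_date can answer date-range queries cheaply. The sketch below simulates this in plain Python. The titles and dates come from the slide, while the function names are hypothetical and the value is shortened to the title where the slide emits the whole document.

```python
from bisect import bisect_left, bisect_right

# Titles and dates taken from the slide.
posts = [
    {"title": "A Blog Post", "post_date": 1239910768},
    {"title": "Just Yesterday", "post_date": 1222654641},
    {"title": "This is Now", "post_date": 1251687980},
]

def by_date_map(doc):
    # Emit the post date as the key and the title as the value.
    yield doc["post_date"], doc["title"]

# Build the view: all emitted rows, kept sorted by key.
rows = sorted(row for doc in posts for row in by_date_map(doc))
keys = [key for key, _ in rows]

def query(startkey, endkey):
    """Return rows whose key lies in [startkey, endkey]."""
    return rows[bisect_left(keys, startkey):bisect_right(keys, endkey)]

print(query(1230000000, 1260000000))
# → [(1239910768, 'A Blog Post'), (1251687980, 'This is Now')]
```

The `startkey` and `endkey` parameter names mirror the query parameters CouchDB views accept; the binary search here is only a stand-in for the server's index lookup.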
  • 28. Data Processing with MapReduce: Emit Posts by tag
      "post1" = { title: "A Blog Post", tags: ["blue", "glue"], post_date: 1239910768, body: "Once upon a time…" }
      "blue" = { title: "A Blog Post", … }
      "glue" = { title: "A Blog Post", … }
  • 29. Data Processing with MapReduce: Emit Posts by tag
      blue   {title: "Just Yesterday", …}
      blue   {title: "A Blog Post", …}
      clue   {title: "Just Yesterday", …}
      flue   {title: "When Will Then Be Now?", …}
      flue   {title: "This is Now", …}
      glue   {title: "A Blog Post", …}
      wazoo  {title: "That was Then", …}
  • 30. Data Processing with MapReduce: Emit Posts by tag, Reduced
      blue   {title: "Just Yesterday", …}, {title: "A Blog Post", …}
      clue   {title: "Just Yesterday", …}
      flue   {title: "When Will Then Be Now?", …}, {title: "This is Now", …}
      glue   {title: "A Blog Post", …}
      wazoo  {title: "That was Then", …}
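The grouping shown on these slides can be reproduced with a small simulation: the map step emits one row per tag, and the reduce step collapses rows that share a key. This is a plain-Python sketch, not CouchDB's reduce API. The post/tag pairings are inferred from the rows on the slides, and the function names are hypothetical.

```python
from itertools import groupby

# Posts and tags as implied by the rows on the slides.
posts = [
    {"title": "Just Yesterday", "tags": ["blue", "clue"]},
    {"title": "A Blog Post", "tags": ["blue", "glue"]},
    {"title": "That was Then", "tags": ["wazoo"]},
    {"title": "When Will Then Be Now?", "tags": ["flue"]},
    {"title": "This is Now", "tags": ["flue"]},
]

def by_tag_map(doc):
    # Emit one (tag, title) row for every tag on the post.
    for tag in doc["tags"]:
        yield tag, doc["title"]

# Map step: one row per (tag, title), sorted by tag only (stable sort).
rows = sorted((row for doc in posts for row in by_tag_map(doc)),
              key=lambda row: row[0])

# Reduce step: collapse rows sharing a key into one entry per tag.
reduced = {tag: [title for _, title in group]
           for tag, group in groupby(rows, key=lambda row: row[0])}

for tag, titles in reduced.items():
    print(tag, titles)
```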
  • 31. Scalability
      • Incremental MapReduce
      • Multiversion Concurrency Control (MVCC)
          ■ Achieves serializability through multiversioning instead of locking
          ■ Eliminates waits to access objects
          ■ Updates create new versions of documents
          ■ Tradeoff point: no waits, increased data storage
      • Incremental Distributed Replication
      • Eventual Consistency
          ■ Changes eventually propagate through distributed systems
          ■ Tradeoff point: increased availability and tolerance, decreased freshness
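A minimal sketch of the idea behind MVCC's "no waits" tradeoff: writers never block, each update must name the revision it was based on, and a stale revision is rejected rather than queued. `TinyStore`, `Conflict`, and the integer revision numbers are hypothetical simplifications; CouchDB's real revision strings and storage differ, and a server reports this situation with an HTTP 409 Conflict response.

```python
class Conflict(Exception):
    """Raised when an update names a revision that is no longer current."""

class TinyStore:
    """Hypothetical in-memory store keeping every revision of each document."""

    def __init__(self):
        self._revisions = {}  # doc id -> list of (rev, body), oldest first

    def get(self, doc_id):
        rev, body = self._revisions[doc_id][-1]
        return {"_id": doc_id, "_rev": rev, **body}

    def put(self, doc_id, body, rev=None):
        history = self._revisions.setdefault(doc_id, [])
        current = history[-1][0] if history else None
        if rev != current:
            # The writer worked from a stale copy: reject instead of blocking.
            raise Conflict(f"expected rev {current}, got {rev}")
        new_rev = len(history) + 1
        history.append((new_rev, dict(body)))  # older revisions are kept
        return new_rev

store = TinyStore()
rev1 = store.put("post1", {"title": "A Blog Post"})
rev2 = store.put("post1", {"title": "A Better Blog Post"}, rev=rev1)

try:
    store.put("post1", {"title": "Stale edit"}, rev=rev1)
    rejected = False
except Conflict as err:
    rejected = True
    print("conflict:", err)
```

Keeping old revisions is what trades storage for the absence of locks: readers can keep using the version they started with while a newer one is written.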
  • 32. Demonstrations