Going Schema-free
  Document-Oriented Databases
Schema-free?
Structure :)
Structures :(
Replication
Conflicts
Documents!
(not the bad kind)
•name
 •A Document
•abstract
 •A Document for the purpose of
   demonstration

•attachments
 •realdoc.doc
•name
 •Another Document
•author
 •Biz Stone
•attachments
 •fulldoc.txt
json documents
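As a rough sketch, the two example documents from the previous slides could be written as JSON like this. The field names come from the slides; the `_id` values are made up for illustration, and real CouchDB keeps attachments under a reserved `_attachments` field rather than a plain list like the one assumed here.

```javascript
// Two documents in the same database with different fields.
// The _id values are hypothetical; the other fields come from the slides.
const docA = {
  _id: "a-document",
  name: "A Document",
  abstract: "A Document for the purpose of demonstration",
  attachments: ["realdoc.doc"]
};

const docB = {
  _id: "another-document",
  name: "Another Document",
  author: "Biz Stone",
  attachments: ["fulldoc.txt"]
};

// Fields present in one document but not in the other.
function fieldsOnlyIn(a, b) {
  return Object.keys(a).filter((key) => !(key in b));
}

console.log(JSON.stringify(docA, null, 2));
console.log(fieldsOnlyIn(docA, docB)); // [ 'abstract' ]
console.log(fieldsOnlyIn(docB, docA)); // [ 'author' ]
```

Neither document has to declare `abstract` or `author` anywhere up front; each one simply carries the fields it needs.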
map-reduce
views
"map": function (doc)
{
  emit("id", "value");
}
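A slightly more concrete map function might look like the sketch below. Inside CouchDB, `emit` is supplied by the view server; here a stand-in `emit` collects the rows so the example can run on its own. The sample documents and the choice of `name` as the key are assumptions for illustration, and the plain string sort is a simplification of CouchDB's own collation rules.

```javascript
// Stand-in for the emit() that CouchDB provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: index every document that has a name, keyed by that name.
function map(doc) {
  if (doc.name) {
    emit(doc.name, doc.author || null);
  }
}

// Sample documents (hypothetical, based on the slides).
const docs = [
  { _id: "2", name: "Another Document", author: "Biz Stone" },
  { _id: "1", name: "A Document", abstract: "A Document for the purpose of demonstration" },
  { _id: "3", note: "no name, so the map function skips it" }
];

docs.forEach(map);

// A view returns its rows sorted by key.
rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
console.log(rows);
```

Documents without the field being indexed are simply left out of the view, which is how a map function copes with documents of different shapes.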
"reduce": function (keys, values, rereduce)
{
  return {"result": true};
}
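The `rereduce` argument is there because CouchDB may run reduce over raw map output first and then run it again over the results of those earlier calls. A minimal sketch of a reduce function that counts rows, with the two passes simulated by hand (the keys and chunk sizes are invented for the example):

```javascript
// Reduce function that counts rows.
// First pass (rereduce === false): values are the values emitted by map.
// Later passes (rereduce === true): keys is null and values are earlier results.
function reduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce((total, count) => total + count, 0);
  }
  return values.length;
}

// Simulate the database reducing two chunks of map output separately...
const chunk1 = reduce([["a", "doc1"], ["b", "doc2"]], [null, null], false);
const chunk2 = reduce([["c", "doc3"]], [null], false);

// ...and then combining the partial counts.
const total = reduce(null, [chunk1, chunk2], true);
console.log(total); // 3
```

A reduce function that ignored `rereduce` and always returned `values.length` would report 2 here (the number of chunks) instead of 3.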
View options
• key/keys
• startkey/endkey
• startkey_docid/endkey_docid
• descending
• skip
• group
• group_level
• limit
• reduce
• stale
• include_docs
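View options are passed as query-string parameters when a view is requested. The sketch below builds such a URL; the detail it tries to show is that key-style options (`key`, `keys`, `startkey`, `endkey`) are sent as JSON, so a string key keeps its quotes. The helper name `buildViewUrl`, the database `docs`, the design document `example` and the view `by_name` are all hypothetical.

```javascript
// Options whose values must be JSON-encoded in the query string.
const JSON_OPTIONS = ["key", "keys", "startkey", "endkey"];

// Build the URL for querying a view.
// baseUrl, db, designDoc and viewName are placeholders supplied by the caller.
function buildViewUrl(baseUrl, db, designDoc, viewName, options) {
  const query = Object.keys(options || {})
    .map((name) => {
      const raw = options[name];
      const value = JSON_OPTIONS.includes(name) ? JSON.stringify(raw) : String(raw);
      return encodeURIComponent(name) + "=" + encodeURIComponent(value);
    })
    .join("&");
  const path = [db, "_design", designDoc, "_view", viewName]
    .map((part) => encodeURIComponent(part))
    .join("/");
  return baseUrl + "/" + path + (query ? "?" + query : "");
}

const url = buildViewUrl("http://localhost:5984", "docs", "example", "by_name", {
  startkey: "A",
  endkey: "B",
  limit: 10,
  descending: false
});
console.log(url);
// http://localhost:5984/docs/_design/example/_view/by_name?startkey=%22A%22&endkey=%22B%22&limit=10&descending=false
```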
(demo)
More Information


•http://couchdb.apache.org/
•http://books.couchdb.org/
Published in: Technology

  • what’s our course? where are we starting, where are we going?
    (image: somewhere on flickr)
  • Where do we start?
    I’m approaching this as someone whose primary/only database use has been with relational databases, using SQL.
  • What are we looking for?
    An understanding of what document-oriented databases are, how they differ from relational databases, why you’d use them over relational databases, and what some of the options are.
  • What pitfalls might we encounter?
    Matt is not an expert, so I will probably miss things and might not argue for document-oriented databases very eloquently. Hopefully I won’t totally mislead anybody.
  • back to the title. are we really going to talk about and seriously consider schema-free databases? what’s the point of that?
    the short answer is yes. hopefully this presentation will show why schema-free databases are sometimes very useful.
  • quick review: relational databases are made up of relations.
    roughly, attributes are columns, tuples are rows. relations are collections of tuples with the same set of attributes, so tables.
    nice, structured data.
    (image: http://en.wikipedia.org/wiki/File:Relational_database_terms.svg)
  • You could say that a relational database is defined by its structure.
    It’s right there in the name: Structured Query Language.
    For this presentation, it’s analogous to statically typed programming languages (like C, C++, C#, Java).



    so, what are some of the challenges?
  • ironically, some structured data can be difficult or tedious to implement.
    For example, parent-child relationships can be difficult to represent and/or query on (select all work items where area path is in “Top-Level Component”)
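For comparison, here is one way a document database could handle the same query. This is only a sketch under assumptions: each work item stores its area path as an array (the `areaPath` and `title` fields and the sample items are invented), the map function emits one row per ancestor path, and `emit` is a local stand-in for the one CouchDB provides.

```javascript
// Stand-in for the emit() that CouchDB provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: emit one row per ancestor path, so a work item can be found
// by any component it sits under.
function map(doc) {
  if (Array.isArray(doc.areaPath)) {
    for (let depth = 1; depth <= doc.areaPath.length; depth++) {
      emit(doc.areaPath.slice(0, depth), doc.title);
    }
  }
}

// Hypothetical work items.
const workItems = [
  { _id: "wi-1", title: "Fix login", areaPath: ["Top-Level Component", "Auth"] },
  { _id: "wi-2", title: "Tune index", areaPath: ["Top-Level Component", "Storage", "Views"] },
  { _id: "wi-3", title: "Update logo", areaPath: ["Website"] }
];
workItems.forEach(map);

// Equivalent of querying the view with key=["Top-Level Component"].
function lookup(path) {
  const wanted = JSON.stringify(path);
  return rows
    .filter((row) => JSON.stringify(row.key) === wanted)
    .map((row) => row.value);
}

console.log(lookup(["Top-Level Component"])); // [ 'Fix login', 'Tune index' ]
```

The hierarchy query becomes a single key lookup, at the cost of storing one index row per ancestor.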
  • Relational databases typically aren’t designed for replication and scale-out from the beginning. As with any feature, leaving it out of the original design makes it harder to add later. Merging in source control tools is a good comparison (git vs. svn): a tool that sets out to support it from the start does better than one that adds it as a feature later.
  • one of the reasons that replication or distribution is difficult is that conflicts are sure to arise. two edits could conflict. two identical ids could be autogenerated... the application can solve these things, but the database isn’t going to provide too much out of the box.
  • Relational databases do solve problems for us, and they’re a powerful tool. I don’t want to discount that.
  • document-oriented databases.
    can anyone tell me what document-oriented databases are made up of?



    (image: http://www.flickr.com/photos/janodecesare/2978128591/sizes/o/)


  • we’re not doing waterfall, here
  • attributes.
  • notice the differences. there’s no schema to follow!
    flexibility
  • couchdb is a very popular open-source document-oriented database.
  • JavaScript Object Notation
  • CAP theorem.
    consistency: every read returns the same, “right” result, no matter which server answers it. This ends up being a challenge for lots of big web 2.0 properties -- I’ve read about how flickr and facebook deal with this.
    availability: data is returned when requested, i.e. writes don’t block reads.
    partition tolerance: the database keeps working when it is split across servers that can’t always reach each other.
    choose two.



    “eventual consistency”



    as you can see, one of the differences between CouchDB and a relational database is the consistency/availability tradeoff. CouchDB is written in Erlang, so some of its features have an erlangish feel to them: data is always there (old revisions are immutable and stay around until the database is compacted), and new versions get layered on top.
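The "layered revisions" idea can be sketched with a toy in-memory store. This is not CouchDB's implementation: the `RevisionStore` class is invented for illustration, and revisions are plain integers here, whereas CouchDB uses `_rev` strings and answers a write based on a stale revision with an HTTP 409 Conflict.

```javascript
// Toy in-memory store: every write adds a new immutable revision on top of
// the old ones, and a write based on a stale revision is rejected.
class RevisionStore {
  constructor() {
    this.history = new Map(); // id -> array of frozen revisions, oldest first
  }

  latest(id) {
    const revisions = this.history.get(id) || [];
    return revisions[revisions.length - 1] || null;
  }

  // doc.rev must match the latest stored revision (or be absent for a new doc).
  put(id, doc) {
    const current = this.latest(id);
    const currentRev = current ? current.rev : undefined;
    if (doc.rev !== currentRev) {
      return { ok: false, error: "conflict" };
    }
    const next = Object.freeze({ ...doc, rev: (currentRev || 0) + 1 });
    this.history.set(id, [...(this.history.get(id) || []), next]);
    return { ok: true, rev: next.rev };
  }
}

const store = new RevisionStore();
const first = store.put("doc", { name: "A Document" });               // becomes rev 1
const second = store.put("doc", { name: "Renamed", rev: first.rev }); // becomes rev 2
const stale = store.put("doc", { name: "Too late", rev: first.rev }); // rejected
console.log(first, second, stale, store.history.get("doc").length);
```

The second writer who still holds revision 1 is told about the conflict instead of silently overwriting the newer version, and both stored revisions remain readable.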
  • can someone describe map-reduce?
  • enables parallelization.
  • views in couch are map/reduce




  • stale=ok means the view is returned as it was last built, without first checking whether documents have changed and the index needs to be updated.
    reduce=false skips the reduce function, if one was supplied.



