NoSQL
Tomas Bosak (@yojimbo87), July 2011
NoSQL

  1. NoSQL
     Tomas Bosak (@yojimbo87), July 2011
  2. WHAT IS NoSQL?
     "Not only SQL" movement around distributed DBs.
     Flexible schema changes, join-less querying, horizontally scalable.
     Various types of DBs with different data models.
     More options when building solutions and solving problems than classic RDBMS.
  3. CAP THEOREM
     Three core system properties:
       1. Consistency
          ACID - what you write is what you will read.
          CAP - all nodes see the same piece of data.
       2. Availability
          Service is available and responsive.
       3. Partition Tolerance
          System continues to operate even if some of the nodes are unavailable.
  4. SYSTEM REQUIREMENTS
     Only 2 out of 3 properties of the CAP theorem can be achieved at any given time:
     - All NoSQL systems should be partition tolerant
     - C or A depends on the level of consistency

     Strong Consistency         Eventual Consistency
     - no stale reads           - stale reads possible
     - higher read latency      - lowest read latency
     - lower read throughput    - highest read throughput
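The difference between the two columns above can be sketched with a toy two-copy store. This is only an illustration under stated assumptions: `ToyReplicatedStore`, its method names, and the explicit `replicate()` step are all hypothetical and do not correspond to any real database API. A strong read goes to the copy that accepted the write, while an eventual read goes to a replica that may lag behind.

```python
class ToyReplicatedStore:
    """Hypothetical two-copy store used only to illustrate stale reads."""

    def __init__(self):
        self.primary = {}   # copy that accepts writes
        self.replica = {}   # copy that lags behind until replicate() runs
        self.pending = []   # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Apply queued writes to the replica, in the order they were made.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_strong(self, key):
        # Always sees the latest write (no stale reads).
        return self.primary.get(key)

    def read_eventual(self, key):
        # May return an old value or None until replication catches up.
        return self.replica.get(key)


store = ToyReplicatedStore()
store.write("x", 1)
stale = store.read_eventual("x")      # replica has not seen the write yet
fresh = store.read_strong("x")
store.replicate()
caught_up = store.read_eventual("x")  # replica now agrees with the primary
```

In a real system the eventual read is cheaper because any nearby replica can answer it, which is where the latency and throughput advantage in the table comes from.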
  5. VARIOUS DATA MODELS
     Document Oriented
     - collection of documents
     - flexible schema, programmer/web friendly, REST
     - MongoDB, CouchDB, RavenDB
     Key-Value Stores
     - collection of key-value pairs
     - handles size well, very fast
     - Riak, Redis, Membase
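A rough way to picture the two models above, using plain Python structures rather than any real database client: a key-value store treats the value as an opaque blob found only by its key, while a document store keeps structured records whose fields can be queried. The sample data and the `find` helper are hypothetical.

```python
import json

# Key-value model: the store sees only a key and an opaque value.
kv_store = {"user:1": b'{"name": "Ada", "city": "Prague"}'}

# Document model: records are structured and need not share the same fields.
documents = [
    {"_id": 1, "name": "Ada", "city": "Prague"},
    {"_id": 2, "name": "Linus", "tags": ["admin"]},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


by_key = json.loads(kv_store["user:1"])   # lookup by key, decode in the client
in_prague = find(documents, city="Prague")  # query by field value
```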
  6. VARIOUS DATA MODELS
     Graph Oriented
     - graph theory based
     - complex relationships, fast
     - Neo4j, OrientDB*
     Object Oriented
     - objects
     - complex objects, direct serialization, fast
     - db4o, Objectivity, Versant
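To make "complex relationships" concrete, here is a minimal sketch of the kind of traversal a graph database is built for, written as a breadth-first search over an in-memory adjacency list. The `follows` data and the `within_hops` function are hypothetical and are not the query interface of Neo4j or OrientDB.

```python
from collections import deque

# Hypothetical "who follows whom" graph as an adjacency list.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each node reachable within max_hops to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the start node is not part of the answer
    return distances


two_hops = within_hops(follows, "alice", 2)
```

In a relational schema the same question typically needs one self-join per hop, which is the cost the graph model tries to avoid.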
  7. MongoDB
     Main Properties
     - written in C++ (AGPL license)
     - binary protocol, JSON style documents
     - Master/slave replication, auto sharding
     - JavaScript based ad hoc and map/reduce querying
     Use Cases
     - general purpose NoSQL system
     - caching, high volume data store
     - real-time statistics, archiving, logging, commerce
     - users: foursquare, bit.ly, SourceForge, GitHub, ...
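The map/reduce querying mentioned above follows a general pattern that can be sketched without a database. MongoDB expresses the map and reduce steps in JavaScript; the version below is a plain Python stand-in for the pattern, not MongoDB's API, and the `log_documents` data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical JSON-style log documents.
log_documents = [
    {"page": "/home", "ms": 120},
    {"page": "/about", "ms": 80},
    {"page": "/home", "ms": 100},
]


def map_fn(doc):
    # Emit one (key, value) pair per document: count one hit per page.
    yield doc["page"], 1


def reduce_fn(key, values):
    # Combine all values emitted for the same key.
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group to one result."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


hits = map_reduce(log_documents, map_fn, reduce_fn)
```

Because each document is mapped independently, the map step can run on every shard in parallel, which is why the pattern suits horizontally scaled stores.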
  8. CouchDB
     Main Properties
     - written in Erlang (Apache license)
     - REST interface, JSON style documents
     - MVCC, N master replication
     - map/reduce querying, reliable design
     Use Cases
     - N master replication on various systems
     - scenarios where versioning is important
     - scenarios where ad hoc queries are not required
     - users: BBC, Ubuntu One, Engine Yard, CERN, ...
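The MVCC idea above can be sketched as optimistic concurrency on a revision number: a writer must name the revision it read, and the write is rejected if someone else updated the document in the meantime. This toy uses integer revisions and hypothetical names (`ToyMvccStore`, `ConflictError`); CouchDB itself uses revision strings and reports conflicts over HTTP, so treat this as an illustration of the principle only.

```python
class ConflictError(Exception):
    """Raised when a write is based on an outdated revision."""


class ToyMvccStore:
    """Hypothetical in-memory store with per-document revision checks."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if expected_rev != current_rev:
            raise ConflictError(
                f"expected revision {expected_rev}, found {current_rev}"
            )
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = ToyMvccStore()
rev1 = store.put("note", {"text": "draft"})                    # creates revision 1
rev2 = store.put("note", {"text": "final"}, expected_rev=rev1)  # creates revision 2
try:
    store.put("note", {"text": "late edit"}, expected_rev=rev1)  # stale revision
    conflict = False
except ConflictError:
    conflict = True
```

No locks are held between read and write, so readers never block; the cost is that a losing writer has to re-read and retry.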
  9. Riak
     Main Properties
     - written in Erlang (Apache license)
     - REST interface, binary protocol
     - shard-partitioned storage, tunable consistency
     - map/reduce querying, full-text search
     Use Cases
     - scenarios with tunable level of consistency
     - built-in search requirement
     - performance oriented systems
     - users: Mozilla, Yammer, Aol, Voxer, ...
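"Tunable consistency" in Dynamo-style stores such as Riak is usually described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The slide does not spell this out, so the following is a sketch of the standard quorum rule and not Riak's client API; `quorums_overlap` is a hypothetical helper. When R + W > N, every read quorum shares at least one replica with every write quorum, so a read is guaranteed to reach a replica holding the latest acknowledged write.

```python
def quorums_overlap(n, r, w):
    """Return True when any R-replica read must intersect any W-replica write."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


balanced = quorums_overlap(n=3, r=2, w=2)     # overlapping quorums
fast_reads = quorums_overlap(n=3, r=1, w=1)   # fastest, stale reads possible
write_all = quorums_overlap(n=3, r=1, w=3)    # cheap reads, expensive writes
```

Lowering R or W trades consistency for latency and availability, which is the C-versus-A dial from the CAP slides.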
  10. Redis
      Main Properties
      - written in C (BSD license)
      - binary protocol
      - disk-backed in-memory database (VM support)
      - advanced data structures, pub/sub, very fast
      Use Cases
      - caching
      - real-time data, analytics, statistics
      - scenarios where performance matters
      - users: GitHub, StackOverflow, Blizzard, Disqus, ...
  11. Neo4j
      Main Properties
      - written in Java (GPL license)
      - REST interface, embedding
      - optimized for reads, Gremlin traversal language
      - deployable as a full server or a very slim database
      Use Cases
      - graph style data
      - social relations, network topologies
      - tagging, metadata annotations, hierarchic data
      - users: Schor.ly, Namesake, Face2Face, ...
  12. OrientDB
      Main Properties
      - written in Java (Apache license)
      - REST interface, binary protocol
      - object/document/graph database hybrid
      - embeddable, SQL-like querying
      Use Cases
      - general purpose NoSQL solution
      - scenarios where fast insertion matters
      - cross-platform requirements
      - users: NuvolaBase
  13. db4o
      Main Properties
      - written in Java/.NET (GPL license)
      - REST interface, binary protocol
      - embeddable, low footprint (~1MB)
      - data access through Native Queries, LINQ
      Use Cases
      - ORM-free data manipulation
      - database installation-free scenarios
      - cross-platform requirements
      - users: Boeing, BOSCH, Seagate, IBM, Intel, ...
  14. BENCHMARKS IN GENERAL
      Real benchmarks require real-world data/load.
      Speed versus data durability.
      More operations per second != better system.
      Cost of I/O:
        L1 cache              3 cycles
        L2 cache             14 cycles
        RAM                 250 cycles
        Disk         41 000 000 cycles
        Network     240 000 000 cycles
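The cycle counts above are easier to compare once converted to wall-clock time. The sketch below does that arithmetic under one loudly stated assumption: a 3 GHz clock, which the slide does not specify. With that assumption a disk access costs roughly 14 ms and a network round trip 80 ms, and a disk access is 164,000 times more expensive than a RAM access.

```python
ASSUMED_CLOCK_HZ = 3_000_000_000  # assumption: 3 GHz CPU, not stated on the slide

# Cycle counts copied from the slide.
COST_IN_CYCLES = {
    "L1 cache": 3,
    "L2 cache": 14,
    "RAM": 250,
    "Disk": 41_000_000,
    "Network": 240_000_000,
}


def cycles_to_seconds(cycles, clock_hz=ASSUMED_CLOCK_HZ):
    """Convert a cycle count to seconds at the given clock frequency."""
    return cycles / clock_hz


for name, cycles in COST_IN_CYCLES.items():
    seconds = cycles_to_seconds(cycles)
    ratio_to_ram = cycles / COST_IN_CYCLES["RAM"]
    print(f"{name:<10} {seconds:.9f} s  ({ratio_to_ram:,.3f}x RAM)")
```

This is why a benchmark that never forces data to disk or across the network can report far more operations per second than a durable, replicated configuration of the same system.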
  15. RESOURCES
      - NoSQL Ecosystem
      - Brewer's CAP Theorem
      - Systemic Requirements
      - Choosing consistency
      - Eventually Consistent - Revisited
      - List of NoSQL Databases
      - NoSQL Databases Comparison
      - Choosing SQL, NoSQL or Both
      - Use Cases For Choosing NoSQL Database
      - NoSQL, NewSQL and Beyond