WHAT IS NoSQL?
"Not only SQL" movement around distributed DBs.
Flexible schema changes, join-less querying, horizontally scalable.
Various types of DBs with different data models.
More options when building solutions and solving problems than classic RDBMS.
2/15
CAP THEOREM
Three core system properties:
1. Consistency
   ACID - what you write is what you will read.
   CAP - all nodes see the same piece of data.
2. Availability
   Service is available and responsive.
3. Partition Tolerance
   System continues to operate even if some of the nodes are unavailable or cannot reach each other.
3/15
SYSTEM REQUIREMENTS
Only 2 out of 3 properties of CAP theorem can be achieved at any given time:
- All NoSQL systems should be partition tolerant
- C or A depends on the level of consistency

Strong Consistency         Eventual Consistency
- no stale reads           - stale reads possible
- higher read latency      - lowest read latency
- lower read throughput    - highest read throughput
4/15
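
The trade-off between the two consistency levels can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `ReplicaSet` class, its method names and the sample data are hypothetical and do not correspond to any particular database. Writes land on one replica first and reach the others later, so a cheap single-replica read can be stale, while a strong read pays for synchronisation first.

```python
# Toy model of replicated storage. All names here are hypothetical.
class ReplicaSet:
    def __init__(self, replica_count):
        # Each replica is a plain dict acting as a tiny key-value store.
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the followers

    def write(self, key, value):
        # The write lands on replica 0 first; followers receive it later.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Background replication: copy pending writes to every follower.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

    def read_eventual(self, key, replica_index):
        # Cheap read from a single replica; it may return stale data.
        return self.replicas[replica_index].get(key)

    def read_strong(self, key):
        # Synchronise before reading, so the read is never stale,
        # at the price of extra work on the read path.
        self.replicate()
        return self.replicas[0].get(key)


cluster = ReplicaSet(replica_count=3)
cluster.write("user:1", "Alice")
cluster.replicate()
cluster.write("user:1", "Bob")  # not yet replicated

stale = cluster.read_eventual("user:1", replica_index=2)
fresh = cluster.read_strong("user:1")
print(stale, fresh)  # prints "Alice Bob"
```

After the strong read has forced replication, the same eventual read on replica 2 also returns the new value, which is the "eventual" part of eventual consistency.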
VARIOUS DATA MODELS
Document Oriented
- collection of documents
- flexible schema, programmer/web friendly, REST
- MongoDB, CouchDB, RavenDB
Key-Value Stores
- collection of key-value pairs
- handles size well, very fast
- Riak, Redis, Membase
5/15
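
To make the difference between the two models concrete, here is a minimal sketch using plain Python data structures. The `find` helper, the keys and the sample records are made up for illustration and are not the API of any of the listed databases. The key-value store treats the value as an opaque blob reachable only by key, while the document collection can be filtered on fields, and documents need not share the same fields.

```python
import json

# Key-value store: the value is opaque to the store; lookups go by key only.
kv_store = {
    "session:42": json.dumps({"user": "alice", "cart": ["book", "pen"]}),
}

# Document collection: documents are structured, so fields can be queried.
documents = [
    {"_id": "1", "type": "user", "name": "alice", "city": "Paris"},
    {"_id": "2", "type": "user", "name": "bob"},  # no "city": flexible schema
]


def find(collection, **criteria):
    """Return the documents whose fields match all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# The application, not the store, has to decode a key-value payload.
session = json.loads(kv_store["session:42"])
in_paris = find(documents, city="Paris")
print(session["user"], [doc["name"] for doc in in_paris])  # alice ['alice']
```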
VARIOUS DATA MODELS
Graph Oriented
- graph theory based
- complex relationships, fast
- Neo4j, OrientDB*
Object Oriented
- objects
- complex objects, direct serialization, fast
- db4o, Objectivity, Versant
6/15
CouchDB
Main Properties
- written in Erlang (Apache license)
- REST interface, JSON style documents
- MVCC, N-master replication
- map/reduce querying, reliable design
Use Cases
- N-master replication on various systems
- scenarios where versioning is important
- scenarios where ad hoc queries are not required
- users: BBC, Ubuntu One, Engine Yard, CERN, ...
8/15
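
The idea behind map/reduce querying can be sketched in a few lines. This is an illustration of the general technique in Python, not CouchDB's own view API; the function names (`map_tags`, `reduce_count`, `run_view`) and the sample documents are assumptions made for the example. A map function emits key/value pairs per document, the pairs are grouped by key, and a reduce function collapses each group.

```python
from collections import defaultdict

# Sample JSON-style documents (invented for this example).
documents = [
    {"_id": "a", "type": "post", "author": "ana", "tags": ["nosql", "couchdb"]},
    {"_id": "b", "type": "post", "author": "ivan", "tags": ["nosql"]},
    {"_id": "c", "type": "comment", "author": "ana"},
]


def map_tags(doc):
    # Map step: emit one (tag, 1) pair for every tag of every post.
    if doc.get("type") == "post":
        for tag in doc.get("tags", []):
            yield tag, 1


def reduce_count(values):
    # Reduce step: collapse all values emitted for one key.
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    # Group the emitted pairs by key, then reduce each group.
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


tag_counts = run_view(documents, map_tags, reduce_count)
print(tag_counts)  # {'couchdb': 1, 'nosql': 2}
```

Because the map and reduce functions are fixed up front, the result can be computed once and kept up to date, which fits the slide's note that this model suits scenarios where ad hoc queries are not required.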
Riak
Main Properties
- written in Erlang (Apache license)
- REST interface, binary protocol
- shard-partitioned storage, tunable consistency
- map/reduce querying, full-text search
Use Cases
- scenarios with tunable level of consistency
- built-in search requirement
- performance oriented systems
- users: Mozilla, Yammer, AOL, Voxer, ...
9/15
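
Tunable consistency is commonly described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The sketch below assumes this simplified quorum model; the function name `quorums_overlap` is hypothetical, and real systems add details (such as fallback nodes during failures) that this check ignores. In the simplified model, a read is guaranteed to see the latest acknowledged write when R + W > N, because any read set and any write set then share at least one replica.

```python
def quorums_overlap(n, r, w):
    """Return True when every read quorum intersects every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# Compare a few settings for N = 3 replicas.
for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
    if quorums_overlap(3, r, w):
        label = "quorums overlap, reads see the latest write"
    else:
        label = "stale reads possible"
    print(f"N=3 R={r} W={w}: {label}")
```

Low R and W values favour latency and availability; raising them until the quorums overlap moves the system toward the strong-consistency end of the spectrum.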
Redis
Main Properties
- written in C (BSD license)
- binary protocol
- disk-backed in-memory database (VM support)
- advanced data structures, pub/sub, very fast
Use Cases
- caching
- real-time data, analytics, statistics
- scenarios where performance matters
- users: GitHub, StackOverflow, Blizzard, Disqus, ...
10/15
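
The caching use case usually follows a cache-aside pattern: look in the fast in-memory store first, fall back to the slow source on a miss, then store the result with an expiry time. The sketch below uses a small in-process `TTLCache` class as a stand-in for an in-memory store so that it runs without a server; the class, `get_profile` and `load_profile_from_database` are hypothetical names and not part of any Redis client.

```python
import time


class TTLCache:
    """In-process stand-in for an in-memory store with key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be tested

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries are dropped lazily
            return None
        return value


def load_profile_from_database(user_id):
    # Hypothetical slow lookup standing in for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}


def get_profile(cache, user_id, ttl_seconds=60):
    # Cache-aside: try the cache, fall back to the source, then populate.
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    profile = load_profile_from_database(user_id)
    cache.set(key, profile, ttl_seconds)
    return profile, "miss"


cache = TTLCache()
first = get_profile(cache, 7)
second = get_profile(cache, 7)
print(first[1], second[1])  # prints "miss hit"
```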
Neo4j
Main Properties
- written in Java (GPL license)
- REST interface, embedding
- optimized for reads, Gremlin traversal language
- deployable as a full server or a very slim database
Use Cases
- graph style data
- social relations, network topologies
- tagging, metadata annotations, hierarchical data
- users: Schor.ly, Namesake, Face2Face, ...
11/15
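
The kind of question a graph database is built for, such as "who is within two hops of this person?", can be sketched with a breadth-first traversal over an adjacency list. This is plain Python for illustration only, not Gremlin or any Neo4j API; the `follows` graph and the `within_hops` function are invented for the example.

```python
from collections import deque

# Directed "follows" relationships stored as an adjacency list.
follows = {
    "ana": ["ivan", "mia"],
    "ivan": ["luka"],
    "mia": ["luka", "ana"],
    "luka": [],
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable node to its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # the start node is not part of the answer
    return distances


print(within_hops(follows, "ana", 2))  # {'ivan': 1, 'mia': 1, 'luka': 2}
```

In a relational schema the same query needs one self-join per hop, which is why relationship-heavy data such as social relations and network topologies is listed as a use case.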
OrientDB
Main Properties
- written in Java (Apache license)
- REST interface, binary protocol
- object/document/graph database hybrid
- embeddable, SQL-like querying
Use Cases
- general purpose NoSQL solution
- scenarios where fast insertion matters
- cross-platform requirements
- users: NuvolaBase
12/15
db4o
Main Properties
- written in Java/.NET (GPL license)
- REST interface, binary protocol
- embeddable, low footprint (~1MB)
- data access through Native Queries, LINQ
Use Cases
- ORM-free data manipulation
- database installation-free scenarios
- cross-platform requirements
- users: Boeing, BOSCH, Seagate, IBM, Intel, ...
13/15
BENCHMARKS IN GENERAL
Real benchmarks require real-world data/load.
Speed versus data durability.
More operations per second != better system.
Cost of I/O.

L1 cache             3 cycles
L2 cache            14 cycles
RAM                250 cycles
Disk        41 000 000 cycles
Network    240 000 000 cycles
14/15
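
The cycle counts above are easier to compare as ratios and as wall-clock time. The sketch below takes the figures from the slide as given; the 3 GHz clock rate is an assumption introduced only to convert cycles into milliseconds, and the helper names are hypothetical.

```python
# Access costs in CPU cycles, copied from the slide.
ACCESS_COST_CYCLES = {
    "L1 cache": 3,
    "L2 cache": 14,
    "RAM": 250,
    "Disk": 41_000_000,
    "Network": 240_000_000,
}

# Assumption: a 3 GHz clock, used only to turn cycles into time.
ASSUMED_CLOCK_HZ = 3_000_000_000


def cycles_to_milliseconds(cycles, clock_hz=ASSUMED_CLOCK_HZ):
    return cycles / clock_hz * 1000


def relative_cost(slower, faster):
    """How many times more cycles one access type costs than another."""
    return ACCESS_COST_CYCLES[slower] / ACCESS_COST_CYCLES[faster]


for name, cycles in ACCESS_COST_CYCLES.items():
    print(f"{name:<9}{cycles:>12,} cycles  ~{cycles_to_milliseconds(cycles):.6f} ms")

print(relative_cost("Disk", "RAM"))     # 164000.0
print(relative_cost("Network", "RAM"))  # 960000.0
```

With these numbers a single disk access costs as much as 164,000 RAM accesses, which is why a benchmark that never touches the disk says little about a system configured for durability.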