No sql landscape_nosqltips
Published in: Technology
  • NoSQL does not mean no SQL, or that it is against SQL or RDBMS databases. NoSQL is better characterized as non-RDBMS data stores, but even that is not completely true.
  • SQL and NoSQL are very compatible and often used together. SQL usually takes the OLTP role while NoSQL slots in for special purposes.
  • Brewer's Theorem (Inktomi): Consistency, Availability, Partition tolerance. You can have any two but not all three. A single-node system gets C and A; add P and you must choose between C and A.
  • Membase is a distributed (elastic) map. CouchDB is a document store. The two companies combined to form Couchbase.
  • RDF = Resource Description Framework
  • RDF – Resource Description Framework. Triplestore – subject, predicate, object; the predicate is the relationship. OWL – Web Ontology Language, used for the semantic web.
  • Transcript

    • 1. The NoSQL Landscape
        • Objective – A reasonable understanding of non-relational (NoSQL) data stores and how they relate to the relational databases we are all used to working with.
    • 2. About Me
        • Chief Architect – youwho.com
        • Former dot com CTO
        • NoSQL advocate
        • nosqltips.blogspot.com
        • @nosqltips on Twitter
    • 3. Agenda
        • What is NoSQL?
        • Landscape
        • Vocabulary and concepts
        • CAP Theorem
        • SQL vs NoSQL comparison
        • Overview of each type w/ examples
        • Question and Answer
    • 4.  
    • 5.  
    • 6.  
    • 7.  
    • 8.  
    • 9. Vocabulary
        • CAP Theorem – Consistency, Availability, Partition tolerance
        • ACID – Atomic, Consistent, Isolated, Durable
        • BASE – Basically Available, Soft state, Eventually consistent
        • RDF – Resource Description Framework
        • Sharding – Partitioning, distributed
        • Web Scale – Google, Twitter, Facebook, etc.
    • 10.  
    • 11. CAP Tuning
        • NRW
            • N: Number of Data Copies
            • R: Read Quorum
            • W: Write Quorum
        • Hard Consistency – RDBMS
        • Soft Consistency – No Guarantees
        • Eventual Consistency – Most NoSQL
    • 12. CAP Tuning Chart (NRW setting: outcome)
        • N=3: Magic number of data replicas.
        • W=N, R=1: Read optimized – strong consistency.
        • W=1, R=N: Write optimized – strong consistency.
        • W+R > N: Strong consistency on read and write.
        • W+R <= N: Weak eventual consistency. A read may not see the latest data.
        • N > W > 1: Eventual consistency – most NoSQL data stores live here.
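The W+R > N rule from the chart can be sketched in a few lines. This is a minimal illustration rather than any particular data store's API; the function name `classify_quorum` and its return labels are assumptions made for the example.

```python
def classify_quorum(n: int, r: int, w: int) -> str:
    """Classify an N/R/W setting using the chart's W + R > N rule.

    Hypothetical helper for illustration only; real stores expose
    these settings through their own configuration.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    if r + w > n:
        # Any R replicas read must overlap any W replicas written,
        # so a read always touches at least one up-to-date copy.
        return "strong"
    # Read and write sets can miss each other; replicas converge later.
    return "eventual"


if __name__ == "__main__":
    for r, w in [(1, 3), (3, 1), (2, 2), (1, 2), (1, 1)]:
        print(f"N=3 R={r} W={w}: {classify_quorum(3, r, w)}")
```

With N=3, the read-optimized (R=1, W=3) and write-optimized (R=3, W=1) rows both satisfy the rule, while R=1 with W=2 falls in the N > W > 1 row and is only eventually consistent.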
    • 13. Eventual Consistency
        • All replicas have same data – eventually
        • Milliseconds to seconds
        • Not all applications are compatible
        • Various ways to ensure latest data
            • Vector Clocks, Read Repair, Gossiping
            • Application determines correct data
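As a rough sketch of how vector clocks let a store (or the application) decide which replica value is newer, the following compares two clocks and detects conflicting concurrent writes. The names `increment`, `compare`, and `merge` are assumptions for this example and do not come from any specific product.

```python
from typing import Dict

Clock = Dict[str, int]  # node name -> number of updates seen from that node


def increment(clock: Clock, node: str) -> Clock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a: Clock, b: Clock) -> str:
    """Report whether a is 'before', 'after', 'equal' to, or 'concurrent' with b."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # conflicting writes; the application must resolve
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a: Clock, b: Clock) -> Clock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}
```

When `compare` returns "concurrent", neither version descends from the other, which is the case where the application determines the correct data.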
    • 14.  
    • 15. Comparison
        • SQL: Prefers big-box, self redundant | NoSQL: Prefers commodity hardware, distributed
        • SQL: Keep things from breaking | NoSQL: Assume things break or are broken
        • SQL: Solidly in CA land | NoSQL: Mostly AP, some tunable
        • SQL: P is difficult and expensive | NoSQL: P generally easy
        • SQL: Query by SQL | NoSQL: Custom API, SQLish
        • SQL: Stored procedures | NoSQL: Map/Reduce
    • 16. Comparison
        • SQL: ACID transactions | NoSQL: BASE transactions
        • SQL: Advanced indexing | NoSQL: Key only to Advanced
        • SQL: Foreign key support | NoSQL: Usually none
        • SQL: Strong lock support | NoSQL: Usually none
        • SQL: Schema centric | NoSQL: Usually schema-less
        • SQL: API – usually JPA or JDBC | NoSQL: Depends on implementation
        • SQL: Strong access control | NoSQL: Usually none
    • 17. Comparison
        • SQL: Complex disk store, random access | NoSQL: Usually append only, 1 seek, 1 read
        • SQL: Easy for dev with JPA/Hibernate/SQL | NoSQL: Puts more work on application dev
        • SQL: Multi-platform | NoSQL: Favors Linux/Unix
        • SQL: General purpose | NoSQL: More special purpose
        • SQL: Strong commercial support | NoSQL: Strong to no commercial support
        • SQL: Great tool support | NoSQL: Not so much
    • 18.  
    • 19. Column Stores
        • Data stored by column instead of row
        • Schema-less
        • Non-relational, data is de-normalized
        • Column format stores sparse data efficiently
        • Column families cannot change
        • 10,000+ columns by 100 million+ rows
        • Easy sharding (partitioning)
        • Usually not ACID compliant
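The combination of fixed column families, schema-less columns, and sparse rows can be illustrated with a small in-memory model. This is only a sketch of the logical data model, not of any on-disk columnar layout; the class `ColumnFamilyTable` and its methods are hypothetical names chosen for the example.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy wide-column table: families are fixed, columns inside them are free-form."""

    def __init__(self, families):
        # Column families are declared up front and cannot change afterwards.
        self.families = frozenset(families)
        # row key -> family -> column qualifier -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Any qualifier is accepted: there is no per-column schema.
        self._rows[row][family][qualifier] = value

    def get(self, row, family, qualifier, default=None):
        # Using .get avoids creating empty rows for missing keys.
        return self._rows.get(row, {}).get(family, {}).get(qualifier, default)

    def cell_count(self):
        """Number of stored cells; absent cells take no space (sparse storage)."""
        return sum(
            len(columns)
            for families in self._rows.values()
            for columns in families.values()
        )
```

Two rows with entirely different columns cost only the cells they actually hold, which is why sparse data fits this model well.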
    • 20. Column stores
        • BigTable – Google, 2006 paper
        • Hadoop/HBase – Part of Apache Hadoop
        • Cassandra – Facebook, LAN/WAN replication
        • Hypertable – Pluggable DFS, HQL
        • Vertica – Full SQL implementation
        • Amazon SimpleDB – Cloud store
    • 21. Document Stores
        • CAP tunable
        • Either key/value or bucket/key/value
        • Easy/Auto sharding – Consistent hashing
        • Usually ACID compliant
        • Not SQL compliant, maybe custom query
        • Easy implementation via map or custom API
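Consistent hashing, mentioned above as the basis for auto sharding, can be sketched as a hash ring with virtual nodes. This is a minimal illustration under assumed names (`HashRing`, `vnodes`, `node_for`); production stores add replication and rebalancing on top of the same idea.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; each node owns several points on the ring."""

    def __init__(self, nodes, vnodes=64):
        # Each node is hashed to `vnodes` positions to even out the key spread.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The useful property is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, so most data stays where it is when the cluster grows.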
    • 22. Document stores
        • Amazon – Dynamo and S3 (cloud based)
        • Riak – CAP tunable, built in map/reduce
        • CouchDB – ACID, REST API
        • MongoDB – Indexing, query support
        • Voldemort – Java, pluggable serialization
        • MySQL – Key access, denormalize schema, kill indexes
    • 23. Memory Stores
        • Mostly in the CA realm
        • P can be tough depending on implementation
        • Some are distributed, some local only
        • Usually key-value stores
        • Many are disk backed, append only files
        • Designed for very high-speed access
    • 24. Memory stores
        • Couchbase – Membase + CouchDB
        • Memcached – Local map
        • Coherence – Commercial Oracle, distributed
        • Redis – Supports hash, list, set, and sorted set, data structure server
        • Tokyo/Kyoto Cabinet – disk backed map
        • Infinispan – JSR-107 JCache impl
        • Scalaris – Erlang, strong consistency
    • 25. Graph/Triple Store
        • Model relationships well, bi-directional
        • Node/edges – edges can be weighted or not
        • RDF Triple – subject -> predicate -> object, W3C standard for semantic web
        • Many implement SPARQL, object API
        • Sharding can be difficult because of graph nature
        • Schema-less – nodes, edges, properties
        • Fast set operations
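The subject -> predicate -> object model can be sketched as a set of tuples with wildcard pattern matching, which is roughly what a basic triple pattern query does. The class `TripleStore` and its `match` method are hypothetical names for this example, and the sketch has no indexing or SPARQL support.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Toy in-memory triple store with wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )
```

Because any position can be the wildcard, the same relationship can be followed in either direction: `match(subject="alice")` lists what alice points to, and `match(obj="bob")` lists what points to bob.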
    • 26. Graph/Triple Stores
        • Neo4j – ACID transactions, object API
        • AllegroGraph – Reference impl of SPARQL
        • Bigdata – dynamic sharding
        • Trinity – Microsoft Research
        • InfiniteGraph – Distributed, cross-platform
        • FlockDB – Twitter, fast set operations
        • InfoGrid – Object based, REST API
    • 27. Interesting Integrations
        • Lucene – Document Store with Search as Query Language
        • Solr and Elasticsearch – Scalable Lucene
        • Riak Search – Erlang impl of Lucene APIs
        • Solandra – Lucene on Cassandra backend
        • Couchdb-lucene – Integration
        • DistributedLucene – Lucene on Hadoop
        • Neo4j – Full Text Search on Graph Store
    • 28. Worth Mentioning
        • Configuration DBs – ZooKeeper, Doozer
            • Distributed configuration, locks, synchronization
            • Used to make other apps scalable
        • XML DBs – eXist, BaseX, Xindice
            • XML only, XQuery, XPath, ACID, GUI support
            • Non-distributed
    • 29.  
    • 30.  
    • 31. Case Study - HBase
        • Apache – part of Hadoop/HDFS
        • Requires ZooKeeper
        • Java based
        • Runs well on Amazon EC2
        • Excellent language support
        • Supports REST interface
    • 32. HBase Continued
        • Map/Reduce via Hadoop
        • Schema-less, column families fixed
        • Nearly unlimited columns and rows
        • HBQL – partial SQL + JDBC support
        • Some ACID support, atomicity, durability
        • Integration with Hive for data warehousing, ad-hoc query support – HiveQL
    • 33. Case Study - Riak
        • Data Model – Bucket/Key/Value
        • Value has MIME type, byte[]
        • Value supports one-way Links, basic graph
        • Erlang, Protocol Buffers, REST interfaces
        • Pre/Post Commit Hooks
        • CAP Tunable per bucket
        • Map/Reduce – Erlang and JavaScript
    • 34. Riak Continued
        • Vector Clocks
        • Read repair for R < N
        • Peer-to-Peer, Nothing Shared Architecture
        • Replication across data centers
        • Pluggable storage
        • API for Most Languages + REST
        • Commercial Support
    • 35. Case Study - Redis
        • Supports hash, list, set, and sorted set
        • Fast set operations
        • Atomic updates
        • Everything stored in memory
        • Persistence to disk – periodic save, append only file, can be compacted
        • Good API support, JDBC subset driver
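The idea of keeping everything in memory while persisting through an append-only file that can be compacted can be sketched as follows. This is not Redis's actual file format or API; the class `AppendOnlyStore`, its JSON log lines, and the method names are assumptions made purely for illustration.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy in-memory key-value store persisted through an append-only log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            self._replay()

    def _replay(self):
        # Rebuild the in-memory state by re-applying every logged command.
        with open(self.path, "r", encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, op):
        if op["cmd"] == "set":
            self.data[op["key"]] = op["value"]
        elif op["cmd"] == "incr":
            self.data[op["key"]] = self.data.get(op["key"], 0) + op["by"]

    def _log(self, op):
        # Append to the log first, then update memory.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(op) + "\n")
        self._apply(op)

    def set(self, key, value):
        self._log({"cmd": "set", "key": key, "value": value})

    def incr(self, key, by=1):
        self._log({"cmd": "incr", "key": key, "by": by})
        return self.data[key]

    def compact(self):
        """Rewrite the log as one 'set' per live key, dropping old history."""
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as log:
            for key, value in self.data.items():
                log.write(json.dumps({"cmd": "set", "key": key, "value": value}) + "\n")
        os.replace(tmp_path, self.path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        store = AppendOnlyStore(os.path.join(folder, "store.aof"))
        store.set("name", "demo")
        store.incr("hits")
        print(AppendOnlyStore(store.path).data)
```

Writes are sequential appends, and compaction shrinks the log to the current state, which is the trade-off the slide's "append only file, can be compacted" bullet points at.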
    • 36. Redis Continued
        • Master–slave replication, read scalability, redundancy, slave can sync to disk
        • Can swap out values, keys must be in memory
        • Can be used as pub/sub messaging system
        • Can send multiple commands in single request
        • Built to be extremely fast
        • Supports very high speed atomic counters
    • 37. Case Study - Neo4j
        • Java based – cross platform
        • ACID transactions
        • Durable persistence
        • Handles billions of nodes/edges on a single machine
        • Supports bulk data loading
        • Good language support
    • 38. Neo4j Continued
        • Spatial index support
        • RDF triples/OWL/SPARQL support
        • Replication and HA – commercial version
        • Object oriented API
        • Sharding at client level
        • Dual open source and commercial license
    • 39. Resources
        • fallabs.com/tokyocabinet
        • fallabs.com/kyotocabinet
        • redis.io
        • www.membase.org
        • neo4j.org
        • en.wikipedia.org/wiki/Triplestore
        • en.wikipedia.org/wiki/Graph_theory
        • research.microsoft.com/en-us/projects/trinity
    • 40. Resources
        • www.jboss.org/infinispan
        • basho.com
        • nosqlpedia.com/wiki/Consistency_models_in_nonrelational_dbs
        • www.hypertable.org
        • project-voldemort.com
        • www.allthingsdistributed.com/2007/10/amazons_dynamo.html
    • 41. Resources
        • nosql-database.org
        • couchdb.apache.org
        • engineering.twitter.com/2010/05/introducing-flockdb.html
        • infinitegraph.com
        • http://www.w3.org/TR/rdf-concepts/
