Lviv EDGE 2 - NoSQL

Presentation from the Lviv EDGE #2 User Group meeting by Zenyk Matchyshyn, Staff Engineer and Scalable Java Lab Lead at Lohika

Published in: Technology

  1. NoSQL
     By Zenyk Matchyshyn, Staff Engineer, Lohika
  2. Agenda
     • History
     • Architecture vs Technology
     • Classification
     • Pros and Cons of usage
     • Trends
     • Q/A
  3. HISTORY
  4. [slide without text]
  5. History
     • NoSQL technologies are not new
     • Many ideas originate from distributed computing, grid computing and parallel computing
     • Main drivers:
       • Scalability
       • Parallelization
       • Costs
  6. Google
     • In the beginning… there was Google!
     • Google shared scientific papers:
       • “The Google File System”, October 2003
       • “MapReduce: Simplified Data Processing on Large Clusters”, December 2004
       • “Bigtable: A Distributed Storage System for Structured Data”, November 2006
       • “The Chubby Lock Service for Loosely-Coupled Distributed Systems”, November 2006
  7. Amazon
     • … and Amazon!
     • “Dynamo: Amazon’s Highly Available Key-value Store”, October 2007
  8. New technologies!
     • Creators of Lucene wanted to create a full search solution
     • Ended up with Hadoop and the Hadoop Distributed File System (HDFS)
     • Success helped adoption, and new solutions emerged
  9. ARCHITECTURE VS TECHNOLOGY
  10. Architecture vs Technology
      • SQL is not bad, it’s just different
      • You can use a SQL DB in a NoSQL way, e.g. MySQL as a key-value database
      • You can run SQL queries on Hadoop data
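
The point about using a SQL database in a NoSQL way can be illustrated with a minimal sketch. The slide names MySQL; the example below swaps in SQLite from the Python standard library so that it runs without a server. The class name `SqlKeyValueStore` and the table layout are assumptions made for illustration, not something taken from the talk.

```python
import sqlite3


class SqlKeyValueStore:
    """Hypothetical key-value facade over a single two-column SQL table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE gives upsert semantics: the last write wins.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))


store = SqlKeyValueStore()
store.put("user:1", '{"name": "Ann"}')
store.put("user:1", '{"name": "Ann B."}')  # overwrites the earlier value
```

No joins, no secondary indexes, no schema beyond key and value: the technology is relational, but the architecture is key-value.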
  11. Architecture
      • The way you store data
      • The way you query data
      • Technology environment
  12. CLASSIFICATION
  13. Terms
      • ACID: Atomicity, Consistency, Isolation, Durability
      • CAP Theorem: Consistency, Availability, Partition tolerance
      • Eventual consistency
      • Hashing
      • Schema
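
The slide lists hashing only as a term. One common use in distributed stores is consistent hashing, which decides which node owns a key. The sketch below is an illustration under that assumption; the `HashRing` class, the choice of MD5 and the `vnodes` parameter are hypothetical and not taken from the talk.

```python
import bisect
import hashlib


def _hash(value):
    # Map a string to a large integer position on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring at `vnodes` positions.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # The owner is the first ring position clockwise from the key's hash;
        # the modulo wraps around past the last position.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
full_ring = HashRing(["node-a", "node-b", "node-c"])
before = {key: full_ring.node_for(key) for key in keys}

# Removing a node only relocates the keys that node used to own.
smaller_ring = HashRing(["node-a", "node-b"])
after = {key: smaller_ring.node_for(key) for key in keys}
```

The design choice worth noting is that keys owned by the surviving nodes do not move when a node leaves, which a plain `hash(key) % node_count` scheme does not guarantee.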
  14. Classification
      • Column oriented stores
      • Key/Value stores
      • Key/Value stores with configurable consistency
      • Document stores
      • Graph stores
  15. Chart (figure; axes: Scalability & Performance, Depth of Functionality; plotted: memcached, Key/value, Column oriented, Document store, RDBMS)
  16. Column oriented
      • Based on Google Bigtable
      • Column oriented is the reverse of row oriented
      • Assumes data centers are transcontinental and connected over the standard Internet
      • C and P from CAP Theorem
      • Data is consistent and partition tolerant, at the cost of availability
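
The statement that column oriented is the reverse of row oriented can be pictured with a small in-memory sketch. It only shows the layout idea; it is not how Bigtable or HBase store data on disk. The sample records and the helper `to_columns` are made up for illustration, and the helper assumes every row has the same fields.

```python
# Row-oriented layout: one record per entry, all fields kept together.
rows = [
    {"id": 1, "name": "Ann", "age": 31},
    {"id": 2, "name": "Bob", "age": 27},
    {"id": 3, "name": "Eve", "age": 45},
]


def to_columns(records):
    """Pivot row records into one list of values per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


# Column-oriented layout: one list per field, values in row order.
columns = to_columns(rows)

# Scanning a single field touches only that field's list.
average_age = sum(columns["age"]) / len(columns["age"])
```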
  17. HBase
      • Spin-off from the Hadoop project - http://hbase.apache.org/
      • Written in Java
      • A lot of interfaces: Thrift, REST, JRuby, etc.
      • SQL-like access through Hive - http://hive.apache.org/
      • HBase ORM: Surus - https://github.com/mushkevych/surus
      • Used by Facebook, Hulu, Yahoo!, Ning, etc.
  18. Hypertable
      • Developed by Zvents, open sourced
      • Written in C++
      • Runs on top of a distributed file system
      • Used by Baidu
  19. Key/Value
      • Key/Value stores: Oracle Berkeley DB (Oracle NoSQL), Redis, Kyoto Cabinet
      • Can store strings, arrays, hashes
  20. Oracle NoSQL
      • Sign of things to come!
      • http://www.oracle.com/technetwork/database/nosqldb/overview/index.html
      • Written in Java
      • Configurable consistency
      • Berkeley DB as a backend
      • No single point of failure
      • Transactions
  21. Redis
      • http://redis.io/
      • Lots of bindings
      • Written in C
      • In-memory, with optional durability
      • Also a document store
  22. Key/Value – eventual consistency
      • Key/value stores that favor availability over consistency
      • Inspired by Amazon Dynamo
      • Dynamo assumes high-speed network links between data centers that are close to each other
      • A and P from CAP Theorem
      • Achieve eventual consistency through replication and verification
      • Consistency is eventual
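
The idea of reaching eventual consistency through replication and verification can be sketched as a toy model. This is a deliberate simplification: plain integer versions stand in for the vector clocks that Dynamo-style systems use, and the `Replica` class and `anti_entropy` function are hypothetical names introduced for illustration.

```python
class Replica:
    """Toy replica that keeps the highest-versioned value per key."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Accept the write only if it is newer than what is stored.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key)


def anti_entropy(replicas):
    """Verification pass: push the newest version of each key everywhere."""
    newest = {}
    for replica in replicas:
        for key, (version, value) in replica.data.items():
            if key not in newest or version > newest[key][0]:
                newest[key] = (version, value)
    for replica in replicas:
        for key, (version, value) in newest.items():
            replica.write(key, value, version)


a, b, c = Replica(), Replica(), Replica()
for replica in (a, b, c):
    replica.write("cart", "book", version=1)

# A later write reaches only one replica, e.g. because of a partition.
b.write("cart", "book,pen", version=2)
stale_read = a.read("cart")    # still the old value: the store stays available

anti_entropy([a, b, c])        # replicas converge once they can talk again
fresh_read = a.read("cart")
```

Between the write and the verification pass, readers may see either value depending on which replica answers. That window is the price of choosing A and P.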
  23. Cassandra
      • http://cassandra.apache.org/
      • Multidimensional map indexed by key
      • No single point of failure
      • Decentralized
      • Tunable consistency
      • Used by Facebook, Cisco, IBM, Rackspace
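
Tunable consistency is commonly reasoned about with the quorum rule: with N replicas, a read that contacts R of them and a write that waits for W of them are guaranteed to overlap in at least one replica when R + W > N. The helper below is a minimal sketch of that rule; the function name is hypothetical, and how a particular Cassandra consistency level maps to R and W is not covered by the slide.

```python
def quorum_overlaps(n, r, w):
    """Return True when every read set must intersect every write set."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# With three replicas:
majority_both = quorum_overlaps(3, 2, 2)    # reads see the latest acknowledged write
fast_and_loose = quorum_overlaps(3, 1, 1)   # lowest latency, reads may be stale
write_all = quorum_overlaps(3, 1, 3)        # cheap reads, writes need every replica
```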
  24. Voldemort
      • http://project-voldemort.com/
      • Developed by LinkedIn
      • Written in Java
      • Developer oriented: a lot of modules are pluggable
      • Strictly key/value
  25. Document stores
      • Document databases
      • Document oriented stores are semi-structured
      • Mostly JSON oriented
      • Also called schema-free rows
      • Can query by field
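
Querying semi-structured documents by field can be sketched in a few lines over plain Python dictionaries. This shows the idea only; it is not the query API of any particular document database, and the `find` helper and sample documents are made up for illustration.

```python
documents = [
    {"_id": 1, "type": "user", "name": "Ann", "city": "Lviv"},
    {"_id": 2, "type": "user", "name": "Bob"},  # no "city" field: schema free
    {"_id": 3, "type": "order", "user_id": 1, "total": 42.5},
]


def find(docs, **criteria):
    """Return documents that contain every given field with the given value."""
    return [
        doc
        for doc in docs
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]


users = find(documents, type="user")
lviv_users = find(documents, type="user", city="Lviv")
```

Documents that lack a queried field simply do not match, which is how the sketch copes with records of different shapes in the same collection.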
  26. MongoDB
      • http://www.mongodb.org/
      • Schema-free, document-oriented
      • Written in C++
      • Lots of interfaces
      • JSON documents
      • Query language, supports indexing
      • Map/Reduce
  27. CouchDB
      • http://couchdb.apache.org/
      • RESTful API
      • JSON documents
      • Written in Erlang
      • Supports ACID
      • Map/Reduce
      • Eventual consistency
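
Both document stores above list Map/Reduce. The pattern itself can be sketched in-process: a map step emits key/value pairs per document, the pairs are grouped by key, and a reduce step folds each group. The `map_reduce` function and the sample data are illustrative assumptions; neither MongoDB nor CouchDB exposes this exact API.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over each record, group emitted pairs by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


orders = [
    {"customer": "ann", "total": 10.0},
    {"customer": "bob", "total": 5.5},
    {"customer": "ann", "total": 2.5},
]


def emit_total(order):
    # Map step: one (customer, total) pair per order document.
    yield order["customer"], order["total"]


def sum_totals(customer, totals):
    # Reduce step: fold all values emitted for one key.
    return sum(totals)


totals_by_customer = map_reduce(orders, emit_total, sum_totals)
```

In a real cluster the grouping step is where data moves between machines; here it is a dictionary.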
  28. Graph
      • Provide ways to store graphs
      • Provide traversal
      • Graph oriented functionality
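
Storing a graph and traversing it can be sketched with an adjacency list and a breadth-first search. This is a generic illustration, not the API of Neo4j or any other graph store; the `follows` data and the `bfs` function are hypothetical.

```python
from collections import deque

# Adjacency list: each node maps to the nodes it points to.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def bfs(graph, start):
    """Return nodes reachable from start, in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


reachable_from_alice = bfs(follows, "alice")
```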
  29. Neo4j
      • http://neo4j.org/
      • Written in Java
      • Stores and navigates graphs
      • Stable and proven
      • Commercial and free licenses
  30. PROS AND CONS OF USAGE
  31. Pros and Cons
      • Scalability
      • Transactional Integrity and Consistency
      • Data Modeling
      • Query Support
      • Access and Interface Availability
  32. Typical Usage
      • Large amount of data
      • Read/Write balanced?
      • Read Heavy
      • Write Heavy
      • Scan
      • Geospatial
      • Map/Reduce
      • Social data
  33. Is it for you?
      • Technology is still developing
      • Be ready to patch
      • SQL is easier
      • Not all startups will end up being Facebooks
      • Some problems can be solved only with NoSQL
  34. TRENDS
  35. Trends
      • Oracle released Oracle NoSQL!
      • Adoption of Hadoop soars
      • SQL-like access to NoSQL stores is taking form
        • UnQL - http://www.unqlspec.org/display/UnQL/Home
      • You can participate!
  36. Opportunities
      • Spring Data - http://www.springsource.org/spring-data
      • Cloud Foundry PaaS - http://www.cloudfoundry.com/
      • ORM/Simplification
  37. Q/A
