Lviv EDGE 2 - NoSQL
Upcoming SlideShare
Loading in...5
×

Like this? Share it with your network

Share

Lviv EDGE 2 - NoSQL

  • 3,643 views
Uploaded on

Presentation from Lviv EDGE #2 User Group meeting by Lohika Staff Engineer / Scalable Java Lab Lead- Zenyk Matchyshyn

Presentation from Lviv EDGE #2 User Group meeting by Lohika Staff Engineer / Scalable Java Lab Lead- Zenyk Matchyshyn

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
3,643
On Slideshare
1,384
From Embeds
2,259
Number of Embeds
15

Actions

Shares
Downloads
3
Comments
0
Likes
0

Embeds 2,259

http://www.rozrobka.com 1,195
http://edge-lviv.blogspot.com 977
http://asyncionews.com 49
http://edge-lviv.blogspot.ru 20
http://edge-lviv.blogspot.de 4
http://gag.org.ua 3
http://www.newsblur.com 2
http://translate.googleusercontent.com 2
http://edge-lviv.blogspot.ie 1
http://edge-lviv.blogspot.nl 1
http://edge-lviv.blogspot.dk 1
http://a0.twimg.com 1
http://us-w1.rockmelt.com 1
http://paper.li 1
http://edge-lviv.blogspot.in 1

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. NoSQL By Zenyk Matchyshyn Staff Engineer, Lohika 1
  • 2. Agenda • History • Architecture vs Technology • Classification • Pros and Cons of usage • Trends • Q/A 2
  • 3. HISTORY 3
  • 4. 4
  • 5. History • NoSQL Technologies are not new • Many ideas originate from distributed computing, grid computing and parallel computing • Main drivers: • Scalability • Parallelization • Costs 5
  • 6. Google • In the beginning… there was Google! • Google shared scientific papers: • “The Google File System”, October 2003 • “MapReduce: Simplified Data Processing on Large Clusters”, December 2004 • “Bigtable: A Distributed Storage System for Structured Data”, November 2006 • “The Chubby Lock Service for Loosely- Coupled Distributed Systems”, November 2006 6
  • 7. Amazon • … and Amazon! • “Dynamo: Amazon Highly Available key/value Store”, October 2007 7
  • 8. New technologies! • Creators of Lucene wanted to create a full search solution • Ended up with Hadoop and Hadoop Distributed File System (HDFS) • Success helped adoption and new solutions emerged 8
  • 9. ARCHITECTURE VS TECHNOLOGY 9
  • 10. Architecture vs Technology • SQL is not bad, it’s just different • You can use SQL DB in NoSQL way, e.g. MySQL as a key-value database • You can do SQL queries on Hadoop data 10
  • 11. Architecture • The way you store data • The way you query data • Technology environment 11
  • 12. CLASSIFICATION 12
  • 13. Terms • ACID – Atomicity, Consistency, Isolation, Durability • CAP Theorem – Consistency, Availability, Partition tolerance • Eventual consistency • Hashing • Schema 13
  • 14. Classification • Column oriented stores • Key/Value stores • Key/Value stores with configurable consistency • Document stores • Graph stores 14
  • 15. Chart memcachedScalability & Performance Key/value Column oriented Document store RDBMS Depth of Functionality 15
  • 16. Column oriented • Based on Google Bigtable • Column oriented is a revers of Row oriented • Assumption is that datacenters are transcontinental and connected using standard Internet • C and P from CAP Theorem • Data consistent and partitioned but trouble with availability 16
  • 17. HBase • Spin off from Hadoop project - http://hbase.apache.org/ • Written in Java • A lot of interfaces – Thrift, REST, JRuby, etc. • SQL-like access through Hive - http://hive.apache.org/ • HBase ORM – Surus - https://github.com/mushkevych/surus • Used by Facebook, Hulu, Yahoo!, Ning, etc. 17
  • 18. Hypertable • Developed by Zvents, open sourced • Written in C++ • Running on top of distributed file system • Used by Baidu 18
  • 19. Key/Value • Key/Value Store – Oracle Berkley DB (Oracle NoSQL), Redis, Kyoto Cabinet • Can store strings, arrays, hashes 19
  • 20. Oracle NoSQL • Sign of things to come! • http://www.oracle.com/technetwork/database/ nosqldb/overview/index.html • Written in Java • Configurable consistency • BerkleyDB as a backend • No single node of failure • Transactions 20
  • 21. Redis • http://redis.io/ • Lots of bindings • Written in C • In-memory, with optional durability • Also a document store 21
  • 22. Key/Value – eventual consistency • K/V Availability over Consistency • Inspired by Amazon Dynamo • Dynamo based on assumption of high speed network links between data centers and datacenters are close to each other • A and P from CAP Theorem • Achieve eventual consistency through replication and verification • Consistency is eventual 22
  • 23. Cassandra • http://cassandra.apache.org/ • Multidimensional map indexed by key • No single point of failure • Decentralized • Tunable consistency • Used by Facebook, Cisco, IBM, Rackspace 23
  • 24. Voldemort • http://project-voldemort.com/ • Developed by LinkedIn • Written in Java • Developers oriented – a lot of modules are pluggable • Strictly key/value 24
  • 25. Document stores • Document Databases • Document oriented stores are semi structured • Mostly JSON oriented • Also called schema free rows • Can query by field 25
  • 26. MongoDB • http://www.mongodb.org/ • Schema-free, document-oriented • Written in C++ • Lots of interfaces • JSON documents • Query language, supports indexing • Map/Reduce 26
  • 27. CouchDB • http://couchdb.apache.org/ • RESTful API • JSON documents • Written in Erlang • Supports ACID • Map/Reduce • Eventual consistency 27
  • 28. Graph • Provide ways to store graphs • Provide traversing • Graph oriented functionality 28
  • 29. Neo4j • http://neo4j.org/ • Written in Java • Stores and navigates graphs • Stable and proven • Commercial and free licenses 29
  • 30. PROS AND CONS OF USAGE 30
  • 31. Pros and Cons • Scalability • Transactional Integrity and Consistency • Data Modeling • Query Support • Access and Interface Availability 31
  • 32. Typical Usage • Large amount of data • Read/Write balanced? • Read Heavy • Write Heavy • Scan • Geospatial • Map/Reduce • Social data 32
  • 33. Is it for you? • Technology is still developing • Be ready to patch • SQL is easier • Not all startups will end up being Facebooks • Some things can be solvable only with NoSQL 33
  • 34. TRENDS 34
  • 35. Trends • Oracle released Oracle NoSQL! • Adoption of Hadoop soars • SQL like access to NoSQL stores taking form – UnSQL - http://www.unqlspec.org/display/UnQL/Home • You can participate! 35
  • 36. Opportunities • Spring Data - http://www.springsource.org/spring-data • Cloud Foundry PaaS - http://www.cloudfoundry.com/ • ORM/Simplification 36
  • 37. Q/A 37