HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database

NoSQL covers a wide range of database technologies that were developed in response to the surging volume of stored data. Relational databases cannot cope with this volume and face agility challenges. This is where NoSQL databases come into play, and they are popular because of their features. The session covers the following topics to help you choose the right NoSQL database:

Traditional databases
Challenges with traditional databases
CAP Theorem
NoSQL to the rescue
A BASE system
Choose the right NoSQL database

Transcript

  • 1. Slide 1: HBase Vs Cassandra Vs MongoDB - Choose the right NoSQL database. View NoSQL database courses at www.edureka.in
  • 2. Slide 2: Objectives of this Session
      • Traditional databases
      • Challenges with traditional databases
      • CAP Theorem
      • NoSQL to the rescue
      • A BASE system
      • Choose the right NoSQL database
    For queries during the session and class recording: post on Twitter @edurekaIN with #askEdureka, or on Facebook /edurekaIN. www.edureka.in
  • 3. Slide 3: Database Categories
      • RDBMS/OLTP/Real Time: Oracle, MySQL, MS SQL, DB2
      • DSS/OLAP/DW: Netezza, SAP Hana, Oracle Express
      • NoSQL/New SQL/BigData: MongoDB, HBase, Cassandra, CouchDB
  • 4. Slide 4: A Traditional database solution
    [Diagram: a web application in front of a caching layer and RDBMS instances. Labels: 1000 TPS, 5000 TPS, 100 ~ 200 SQL Transaction, 300 ~ 500 SQL Transaction, Applications Changing Data, Elastic Scale.]
  • 5. Slide 5: A NoSQL database solution
    [Diagram: a web application in front of Cassandra. Labels: 1000 TPS, 5000 TPS, 100 ~ 200 SQL Transaction, 300 ~ 500 SQL Transaction, Applications Changing Data, Elastic Scale.]
  • 6. Slide 6: Challenges with traditional databases
      • Not a good fit for large data volumes (petabytes of data) with varying data types, e.g. images, videos, text
      • Can't scale for large data volumes, e.g. the 15 - 20 petabytes of data in the Govt. of India “AADHAAR” project
      • Scale-up: limited by memory and processing (CPU) capabilities
      • Scale-out: cache-dependent 'Read' and 'Write' operations
      • Complex RDBMS model: parsing, locking, logging, buffer pool, threads, etc.
      • Sharding causes operational problems, e.g. managing a shard failure
      • Consistency: a bottleneck for scalability in RDBMS
      • Satisfying ACID is a hindrance to scaling
      • NoSQL databases relax consistency in order to scale out
  • 7. Slide 7: CAP Theorem
    We must understand the CAP theorem when we talk about NoSQL databases, or in fact when designing any distributed system. The CAP theorem states that there are 3 basic requirements which exist in a special relation when designing applications for a distributed architecture:
      • Consistency: the data in the database remains consistent after the execution of an operation. For example, after an update operation all clients see the same data.
      • Availability: the system is always on (guaranteed service availability), with no downtime.
      • Partition Tolerance: the system continues to function even if the communication among the servers is unreliable, i.e. the servers may be partitioned into multiple groups that cannot communicate with one another.
  • 8. Slide 8: CAP Theorem and NoSQL databases
      • According to CAP, a distributed system can satisfy only 2 of the 3 requirements at once.
      • In theory, it is impossible to fulfill all 3 requirements.
      • Therefore the current NoSQL databases follow different combinations of C, A and P from the CAP theorem:
          • CA: single-site cluster, therefore all nodes are always in contact. When a partition occurs, the system blocks.
          • CP: some data may not be accessible, but the rest is still consistent/accurate.
          • AP: the system is still available under partitioning, but some of the data returned may be inaccurate.
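The CP and AP behaviours described on this slide can be illustrated with a toy model. The sketch below is not how any particular database works; `Replica`, `TwoNodeStore` and the `mode` flag are hypothetical names made up for the illustration. It assumes two replicas of a single value and a switch that simulates a network partition.

```python
class Replica:
    """One copy of a single value (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Two replicas of one value; `mode` is "CP" or "AP" (illustrative only)."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True simulates a network partition

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write, so the replicas never disagree.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the other replica becomes stale.
            replica.value = value
            return
        # No partition: the write reaches both replicas.
        replica.value = value
        other.value = value

    def read(self, replica):
        return replica.value
```

With `mode="CP"` a write during a partition is rejected and both replicas keep the last agreed value; with `mode="AP"` the write succeeds, and a read from the other replica returns the older value until the partition heals.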
  • 9. Slide 9: NoSQL to the rescue
      • A scale-out, shared-nothing architecture, capable of running on a large number of nodes
      • A non-locking concurrency control mechanism, so real-time reads will not conflict with writes
      • Scalable replication and distribution
      • Thousands of machines with distributed data
      • An architecture providing much higher per-node performance than is available from traditional SQL-based databases
      • Schema-less data model
      • Mostly queries and few updates
  • 10. Slide 10: NoSQL database - A BASE not ACID system
      • Basically Available: the system does guarantee availability, in terms of the CAP theorem.
      • Soft State: the state of the system may change over time, even without input. This is because of the eventual consistency model.
      • Eventual Consistency: the system will become consistent over time, given that it doesn't receive input during that time.
    A BASE system gives up on immediate consistency.
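Soft state and eventual consistency can be sketched with replicas that accept writes independently and converge later. The example below is a minimal illustration under stated assumptions: each write carries a caller-supplied timestamp, conflicts are resolved by "newest timestamp wins", and `Replica` and `anti_entropy` are hypothetical names. Real systems use more careful reconciliation, and ties between equal timestamps are not handled here.

```python
class Replica:
    """Holds key -> (timestamp, value); accepts writes without coordination."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Keep the entry with the newest timestamp (last write wins).
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Merge all replicas so each ends up with the newest entry per key."""
    merged = {}
    for replica in replicas:
        for key, (ts, value) in replica.data.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    for replica in replicas:
        replica.data = dict(merged)


r1, r2 = Replica(), Replica()
r1.write("status", "old", timestamp=1)
r2.write("status", "new", timestamp=2)
before = (r1.read("status"), r2.read("status"))  # replicas disagree
anti_entropy([r1, r2])
after = (r1.read("status"), r2.read("status"))   # replicas agree
```

Between the two writes and the `anti_entropy` call the replicas return different answers; once no new input arrives and the merge runs, both return the same value.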
  • 11. Slide 11: ~150 NoSQL databases - Not a Panacea
    There are about 150 NoSQL databases on the market.
  • 12. Slide 12: NoSQL Database Types (NoSQL Database - Storage Architecture)
      • Document Data Store Databases
          • Example: CouchDB, MongoDB
          • Data model: collections of key-value collections
          • Strength: tolerant of incomplete data
          • Weakness: query performance, no standard query syntax
      • Columnar NoSQL Databases
          • Example: HBase, Cassandra
          • Data model: column families
          • Strength: fast look-ups
          • Weakness: very low-level API
      • Key Value Databases
          • Example: Amazon SimpleDB, Redis
          • Data model: collection of key-value pairs
          • Strength: fast look-ups
          • Weakness: stored data has no schema
      • Graph NoSQL Databases
          • Example: InfoGrid, Infinite Graph
          • Data model: “Property Graph” of nodes
          • Strength: graph algorithms such as shortest path, connectedness, etc.
          • Weakness: not easy to cluster; must traverse the whole graph to get an answer
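To make the four data models on this slide concrete, the sketch below lays out the same kind of user record in each shape using plain Python structures. It only illustrates the shapes, not any product's storage format or API; all names and sample values (`document`, `key_value`, `column_family`, `nodes`, `edges`, `followed_by`, the users) are made up for the example.

```python
import json

# Document store: one self-describing, possibly nested record per entity.
document = {"_id": "u1", "name": "Asha", "city": "Pune", "tags": ["admin", "beta"]}

# Key-value store: the value is an opaque blob, reachable only through its key.
key_value = {"user:u1": json.dumps({"name": "Asha", "city": "Pune"})}

# Column-family store: row key -> column family -> column -> value.
# Rows may carry different columns (sparse / wide rows).
column_family = {
    "u1": {"profile": {"name": "Asha", "city": "Pune"}, "stats": {"logins": 42}},
    "u2": {"profile": {"name": "Ravi"}},
}

# Property graph: nodes with properties plus labelled, directed edges.
nodes = {"u1": {"name": "Asha"}, "u2": {"name": "Ravi"}}
edges = [("u1", "FOLLOWS", "u2")]


def followed_by(user_id):
    """Names of the users that `user_id` follows, found by walking the edges."""
    return [nodes[dst]["name"] for src, label, dst in edges
            if src == user_id and label == "FOLLOWS"]
```

The key-value form can only be fetched whole by key, the document form can be inspected field by field, the column-family form tolerates rows with missing columns, and the graph form answers relationship questions by following edges.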
  • 13. Slide 13: Selecting a NoSQL database
      • Step 1: Right data model
      • Step 2: Pros and cons of consistency
      • Step 3: Compromising features of RDBMS
  • 14. Slide 14: Where to Use Cassandra?
      • If looking for simple setup, maintenance and code
      • Very high velocity random reads and writes
      • Flexible sparse / wide column requirements
      • No need for multiple secondary indexes
  • 15. Slide 15: Cassandra Use Case - Twitter: Massive Scale, High Availability
  • 16. Slide 16: Where NOT to Use Cassandra?
    Do not use Cassandra if your application has:
      • Secondary indexes
      • Relational data
      • Transactional needs (rollback, commit)
      • Primary and financial records
      • Stringent security and authorization needs on data
      • Dynamic queries on columns
      • Searching column data
      • Low latency
  • 17. Slide 17: Where to Use HBase
      • Optimized for reads
      • Well suited for range-based scans
      • Applications with strict consistency requirements
      • Applications needing fast reads and writes with scalability
      • Facebook uses it to manage its user statuses, photos, chat messages, etc.
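Range-based scans suit HBase because it keeps rows sorted by row key, so all rows sharing a key prefix sit next to each other. The sketch below imitates that idea in memory with a sorted key list and binary search; `SortedTable`, `put` and `scan` are hypothetical names for the illustration and are not the HBase client API. The scan treats the start key as inclusive and the stop key as exclusive.

```python
from bisect import bisect_left


class SortedTable:
    """In-memory table that keeps row keys in sorted order (illustrative only)."""

    def __init__(self):
        self._keys = []   # row keys, always sorted
        self._rows = {}   # row key -> row data

    def put(self, row_key, row):
        if row_key not in self._rows:
            # Insert the new key at its sorted position.
            self._keys.insert(bisect_left(self._keys, row_key), row_key)
        self._rows[row_key] = row

    def scan(self, start_key, stop_key):
        """Yield (key, row) for start_key <= key < stop_key, in key order."""
        i = bisect_left(self._keys, start_key)
        while i < len(self._keys) and self._keys[i] < stop_key:
            key = self._keys[i]
            yield key, self._rows[key]
            i += 1


table = SortedTable()
table.put("user2#2014-01", {"msg": "c"})
table.put("user1#2014-02", {"msg": "b"})
table.put("user1#2014-01", {"msg": "a"})

# All rows for user1: "$" is the character right after "#", so it bounds the prefix.
user1_rows = list(table.scan("user1#", "user1$"))
```

Because the keys are sorted, the scan jumps to the first matching row and reads forward until the stop key, without touching the rows of other users.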
  • 18. Slide 18: HBase Use Case - Facebook Messenger: Consistency and Scale
  • 19. Slide 19: Where Not to use HBase
      • It is not optimized for classic transactional applications or even relational analytics
      • Applications that need:
          • full table scans
          • data to be aggregated, rolled up or analysed across rows
  • 20. Slide 20: Where to Use MongoDB
      • RDBMS replacement for web applications
      • Semi-structured content management
      • Real-time analytics and high-speed logging
      • Caching and high scalability
      • Web 2.0, media, SaaS, gaming
    http://www.mongodb.org/about/production-deployments/
  • 21. Slide 21: MongoDB Use Cases: High-performance and Schema-free
      • MySQL for active posts
      • MongoDB for archived posts
      • Migrated two billion plus posts to MongoDB
      • Migrated from RDBMS to MongoDB
      • Storage of venues and check-ins
  • 22. Slide 22: Where Not to use MongoDB
      • Highly transactional applications
      • Applications with traditional database system requirements, such as foreign-key constraints
  • 23. Slide 23: HBase Vs Cassandra Vs MongoDB
      • HBase
          • Distributed and scalable big data store
          • Strong consistency
          • Built on top of the Hadoop Distributed File System (HDFS)
          • CP on CAP
      • Cassandra
          • High availability
          • Incremental scalability
          • Eventually consistent
          • Trade-offs between consistency and latency
          • Minimal administration
          • No SPOF (Single Point of Failure)
          • AP on CAP
      • MongoDB
          • Schemas can change as applications evolve (schema-free)
          • Full index support for high performance
          • Replication and failover for high availability
          • Auto-sharding for easy scalability
          • Rich document-based queries for easy readability
          • CP on CAP
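The guidance from the "where to use" and "where not to use" slides can be condensed into a rough rule of thumb. The sketch below is only one possible reading of those slides, not a definitive selection method: the function name `suggest_database`, the requirement tags and the order in which the rules are checked are all assumptions made for the illustration.

```python
def suggest_database(needs):
    """Return a coarse suggestion from a set of requirement tags.

    The tags are hypothetical labels for the requirements named in the slides.
    """
    needs = set(needs)
    # Transactions and foreign keys are listed as poor fits for the NoSQL options.
    if needs & {"transactions", "foreign_keys"}:
        return "RDBMS"
    # Index support and rich document queries are listed under MongoDB.
    if needs & {"secondary_indexes", "document_queries"}:
        return "MongoDB"
    # Strict consistency and range scans are listed under HBase.
    if needs & {"strict_consistency", "range_scans"}:
        return "HBase"
    # High write velocity and high availability are listed under Cassandra.
    if needs & {"high_write_velocity", "always_available"}:
        return "Cassandra"
    return "undecided"


choice = suggest_database({"range_scans", "strict_consistency"})
```

A real decision would weigh several requirements at once rather than stopping at the first match, so the result is best read as a starting point for the three selection steps listed earlier in the session.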
  • 24. Slide 24: Questions? Buy NoSQL database courses at www.edureka.in