Data, data, data. I cannot make bricks without clay.  Sherlock Holmes, Sherlock Holmes [2009]
Data: Qualitative or quantitative attributes of a variable or set of variables. The lowest level of abstraction, from which information and then knowledge are derived. A representation of a fact, figure, or idea.
A well organized newspaper or a clumsy, cluttered one?
Data explosion: From gigabytes to terabytes to petabytes to perhaps (I’m out of nomenclature)-bytes
NoSQL = Not Only SQL != No to SQL != Never SQL
Open Source: An abridged version of this presentation and notes will be available to everyone. Distributed under no license. FREE AS IN SPEECH AND BEER
Web 2.0, DDBMS, RDBMS performance, OODB, R&D, cloud computing: multiple solutions. Necessity is the mother of invention.
SQL Databases, the ‘Hammer’: It’s a wonderful tool
Commercial SQL Databases: Even gods use them. Design, power, ergonomics, ease of use, features, warranty, upgrades. Everything apart from the hole in the pocket.
A nail is a nail, a screw is a screw. Hammering a screw or screwdriving a nail is FOOLISHNESS!
What? Non-relational, next-generation operational data stores and databases. NoSQL is a new look at data to deliver:
Unlimited horizontal scalability
Operation on economical, commonly available, unreliable hardware
Auto sharding
Support for a wide range of data
Recursive, hierarchical
Non-rigid
High availability
NoSQL, the ‘screwdriver’: Yet another tool in our repository to go along with the hammer
NoSQL is about choice. Not all problems are nails. Not all screws are the same. GOOD PROGRAMMING PRACTICE: Know your tools and use them appropriately
SQL Databases. Data: relational, tabular (rows/columns). Interface: SQL. Basic design inspiration: set theory. Design: ACID, scale up.
MySQL
Teradata
SQLite
SQL Server
And many more
Why?
If consistency is ensured, do we have to enforce/check it again at the database level?
Are RDBMSs ready for the challenges of the future, like:
Dynamic schema/metadata
Huge amounts of data
Through horizontal auto scaling
Ability to handle complex data types
Images, videos, audio, and much more
Not really!
Why? (Continued…) RDBMS drawbacks: scalability, CRUD, performance, write overhead, the limits of a single-disk architecture, lack of in-memory design, rigid schema design, and more…
HAMMERS are under some hammering
DRAWBACKS: DEEP DIVE
Scalability. True scalability: horizontal scaling, transparency to the application, no single point of failure. Problems with SQL databases: vertical scaling, partitioning (aka sharding), read slaves. Anti-patterns: normalized data, joins, ACID transactions.
No Breadcrumbs: CRUD is crude. The delete/update strategy is improper because it overwrites or discards the earlier state. CRA (Create, Read, Archive) is the way to go. Audit information is lost in CRUD but not in CRA.
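The Create, Read, Archive idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ArchiveStore` class, its method names, and the `user:1` key are hypothetical and do not come from the slides or from any particular database. An "update" appends a new version and a "delete" appends an archive marker, so the full history stays readable.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One immutable entry in a record's history (hypothetical type)."""
    value: Any
    archived: bool = False


class ArchiveStore:
    """Append-only store: a write never overwrites, it adds a new version."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Version]] = {}

    def create(self, key: str, value: Any) -> None:
        # An "update" is just another create; older versions stay in place.
        self._history.setdefault(key, []).append(Version(value))

    def archive(self, key: str) -> None:
        # Instead of deleting, append a marker saying the record is retired.
        self._history.setdefault(key, []).append(Version(None, archived=True))

    def read(self, key: str) -> Optional[Any]:
        # The current state is the newest version, unless it is archived.
        versions = self._history.get(key, [])
        if not versions or versions[-1].archived:
            return None
        return versions[-1].value

    def audit_trail(self, key: str) -> List[Version]:
        # Every past state is still available for auditing.
        return list(self._history.get(key, []))


store = ArchiveStore()
store.create("user:1", {"email": "a@example.com"})
store.create("user:1", {"email": "b@example.com"})
store.archive("user:1")
```

After these three calls the record reads as absent, yet `store.audit_trail("user:1")` still holds both e-mail addresses and the archive marker. The cost of this design is storage that grows with every write.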
Naive Data Support: Not designed for complex data structures (recursive, hierarchical, ordered list, circular) or dynamic metadata.
Logical/physical separation concerns: The relational model is a logical model, but RDBMSs implement it at the physical level using multiple indices. This adds artificial overhead in managing the database, such as frequently dropping and re-creating indexes to make the DB perform.
Spinning Disk Storage: A design flaw for most RDBMS systems. With cheaper memory, a memory-based approach should also be included in the design. Defiance of Moore’s law: disk reads grew only 12.5 times in about 50 years, disk writes much less. Disk writes are expensive, and RDBMSs make things worse by writing more. ACID rains are UNHEALTHY
Think ‘Out of the ROM’
At Snail’s Pace: RDBMS engine growth is SLOW. Optimizations have been minor since the initial days. The majority of growth is due to Moore’s law: faster hardware, slightly faster storage, faster memory. What happens when Moore’s law diminishes thanks to external factors like heat generation?
Database size limits: RDBMSs are too slow over multi-terabyte and petabyte databases. Purpose-designed parallel processing would be needed to handle such volumes of data in an RDBMS.
RDBMS has been around for years and is proven technology. What about NoSQL?
RDBMS grew fast, but its growth slowed over time and might eventually stall. NoSQL, unarguably a new and immature tool, has been growing faster than RDBMS ever did and is being supported by the Big Players.
Did you say BIG PLAYERS! WHO?
NoSQL Real World Implementations
Facebook – HBase
Digg – Cassandra
Amazon – Dynamo
Trend Micro – HBase
Netflix – Amazon SimpleDB
Shutterfly – MongoDB
LinkedIn – Voldemort
and more
Microsoft is considering NoSQL for Azure services as well, and so is Twitter. Are we next? Major IT companies have implemented or, even better, created their own NoSQL systems to manage huge data stores that couldn’t be managed by SQL databases.
We are used to SQL and relatedness, so why can’t they just fix RDBMS to handle Big Data? STORAGE SEEK RATES: large writes and ACID are a huge limitation. Big Data can be handled via scale-out/partitionability across multiple nodes.
CAP Theorem: Applies to distributed shared-data systems
CAP THEOREM
A Deeper Look. Consistency: the system is in a consistent state after an operation; all clients see the same data; strong consistency (ACID) vs. eventual consistency (BASE). Availability: ‘always on’ mode, no downtime; all clients can find some available replica; tolerance of software/hardware upgrades. Partition tolerance: the system continues to function even when split into disconnected subsets (by a network disruption); reads and writes combined.
[Figure: CAP diagram. CP: sharded database (TERADATA comes here). CA and AP regions. Other labels: RDBMS, Paxos, NoSQL.]
BASE: Basically Available, Soft state, Eventually consistent. When availability and partitionability are prioritized over consistency, think in terms of BASE.
Eventual Consistency: If no new updates are made to the object, eventually all accesses will return the last updated value. Ex: Domain Name System (DNS)
Types of Eventual Consistency: read-your-writes consistency, session consistency, monotonic read consistency, monotonic write consistency, causal consistency. Practically, read-your-writes consistency and monotonic read consistency are desirable in an eventually consistent system.
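A toy simulation may make eventual consistency more concrete. This is a sketch under assumptions, not how any real system replicates: the `Replica` and `Cluster` classes, the single logical clock, and the last-writer-wins rule are all illustrative choices. A write lands on one replica at once and reaches the others only when `sync()` delivers the queued updates.

```python
from typing import Dict, List, Optional, Tuple


class Replica:
    def __init__(self) -> None:
        self.data: Dict[str, Tuple[int, str]] = {}  # key -> (version, value)

    def apply(self, key: str, version: int, value: str) -> None:
        # Last-writer-wins: keep only the entry with the highest version.
        if key not in self.data or self.data[key][0] < version:
            self.data[key] = (version, value)


class Cluster:
    def __init__(self, size: int) -> None:
        self.replicas = [Replica() for _ in range(size)]
        # Updates not yet delivered: (replica index, key, version, value).
        self.pending: List[Tuple[int, str, int, str]] = []
        self.clock = 0

    def write(self, key: str, value: str, at: int = 0) -> None:
        self.clock += 1
        self.replicas[at].apply(key, self.clock, value)
        for index in range(len(self.replicas)):
            if index != at:
                self.pending.append((index, key, self.clock, value))

    def read(self, key: str, at: int) -> Optional[str]:
        entry = self.replicas[at].data.get(key)
        return entry[1] if entry else None

    def sync(self) -> None:
        # Asynchronous replication, modelled as an explicit delivery step.
        for index, key, version, value in self.pending:
            self.replicas[index].apply(key, version, value)
        self.pending.clear()


cluster = Cluster(3)
cluster.write("example.com", "10.0.0.1", at=0)
own_read = cluster.read("example.com", at=0)    # same replica as the write
stale_read = cluster.read("example.com", at=2)  # this replica has not heard yet
cluster.sync()
fresh_read = cluster.read("example.com", at=2)  # converged after delivery
```

Reading from the replica that took the write is the simplest way to get read-your-writes behaviour. The stale read in between is the price paid for accepting the write without waiting for every replica.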
Hash(): Different apps, different CAP requirements. Prioritize among: Consistency – Availability, Availability – Partitionability, Consistency – Partitionability
WHERE? So will NoSQL eventually replace RDBMSs everywhere? No, RDBMSs are here to stay. NoSQL is here to help.
Wherever you want to take advantage of NoSQL
Big Data: Denormalize, shard, scale out, and look no further than NoSQL
Write-Intensive Applications: IOPS of the best storage device <<< n * IOPS of relatively cheaper storage devices. In simple terms: ‘HARNESS THE POWER OF YOUR CLOUD’
Fast Key-Value Access: NoSQL – ‘User, you are looking for $value’ RDBMS – ‘Query executing ….’ An O(1) hash operation versus O(log n) B/B+ tree traversals.
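The O(1) versus O(log n) contrast can be illustrated with Python's built-in structures. This is a rough analogy under assumptions: a `dict` stands in for a hash index, and binary search over a sorted list (via the standard `bisect` module) stands in for a B-tree traversal. The key names and helper functions are made up for the example.

```python
import bisect
from typing import List, Optional

pairs = sorted((f"user:{i:06d}", f"value-{i}") for i in range(100_000))
sorted_keys = [key for key, _ in pairs]
sorted_values = [value for _, value in pairs]

# Hash index: one hash computation and, on average, one probe.
hash_index = dict(pairs)


def tree_like_get(key: str) -> Optional[str]:
    """Ordered lookup: about log2(n) comparisons, like walking a tree."""
    pos = bisect.bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return sorted_values[pos]
    return None


def range_scan(low: str, high: str) -> List[str]:
    """Values for keys in [low, high): cheap on an ordered index only."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_left(sorted_keys, high)
    return sorted_values[start:end]
```

Both `hash_index["user:000042"]` and `tree_like_get("user:000042")` find the same value. The ordered structure pays its extra comparisons back with `range_scan`, which a plain hash table cannot answer without visiting every key.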
Flexible Schema and Data Types: Field – ‘I once was an integer, then a string, then a date; what am I?’ RDBMS – ‘WTH! Whatever you are, you are beyond my scope’
Transient Data: Data – ‘I’m here only for a while and want to get my work done fast’ RDBMS – ‘You are data and you shall be treated like the rest’ NoSQL – ‘Okay, I’ll allot you space in RAM using Memcached if available; otherwise you still have my cloud’
High Write Availability: Warning – incoming data …. NoSQL – ‘Anytime you like, user’ RDBMS – ‘This is insane, I’m already busy with other things’
ECONOMICS: RDBMS – ‘I’m powered by a wonderful, beautiful rabbit’ NoSQL – ‘I’m powered by many cute little hamsters’
No Single Point of Failure: Designed to run on economical, commonly available, unreliable hardware
Full Table Scan Operations: MapReduce. Map: divide your problem into subproblems that can be computed in parallel and reduced later. Reduce: merge the solutions of the subproblems into the result. Divide and conquer your way to victory. Powered by MapReduce! Or something similar.
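The map and reduce steps described above can be sketched with a word count, the usual introductory example. This is a single-process illustration under assumptions: the function names are hypothetical, and a real MapReduce framework would run the map calls on many machines and do the grouping over the network.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_words(chunk: str) -> List[Tuple[str, int]]:
    # Map: each chunk is processed independently, so chunks could run in parallel.
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Group all intermediate values by key before reducing.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: List[int]) -> int:
    # Reduce: merge the partial results for one key into the final answer.
    return sum(values)


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    mapped = [pair for chunk in chunks for pair in map_words(chunk)]
    return {key: reduce_counts(key, values) for key, values in shuffle(mapped).items()}


counts = word_count(["NoSQL is a tool", "SQL is a tool too"])
```

Here `counts` maps each word to its total, for example `counts["tool"]` is 2. Because no map call depends on another, adding machines shortens the scan roughly in proportion, which is the point of the slide.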
Ability to restore, maintain, and repair itself: a ‘no DBA required’ design
HOW? Let us welcome keys, values, collections, data structures, objects, documents, and graphs
NoSQL View. The basic approach to data: a key/value store; run on multiple machines; partition and replicate across these machines; relax consistency; aim at eventual consistency; replicate asynchronously. But not all NoSQL systems take the same path.
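Partitioning and replication across machines can be sketched as a function that maps a key to the nodes holding it. This is a minimal sketch under assumptions: the node names, the `owners` function, and the replication factor of 2 are hypothetical, and the placement uses plain modulo hashing for brevity. Dynamo-style stores use consistent hashing instead, so that adding a node moves only a fraction of the keys.

```python
import hashlib
from typing import List


def owners(key: str, nodes: List[str], replicas: int = 2) -> List[str]:
    """Return the nodes that should hold `key`: a primary plus its successors."""
    if replicas > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    # A stable hash: the same key gives the same placement in every process.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Replicas go to the next nodes in ring order, wrapping around at the end.
    return [nodes[(start + offset) % len(nodes)] for offset in range(replicas)]


cluster_nodes = ["node-a", "node-b", "node-c"]
placement = owners("user:42", cluster_nodes)
```

Any client can compute `placement` locally, so no central directory is needed to find a key. With two copies on distinct nodes, losing one machine does not lose the data.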
NoSQL: Document Store, Key-Value Store, Object, Multivalue, Graph Stores, BigTable Clones, Tuple Store
Key-Value Stores: One key, one value, no duplicates, and crazy fast. Distributed hash tables. The value is stored as a binary object (BLOB); the DB doesn’t understand it and doesn’t want to. Ex: Amazon Dynamo, MemcacheDB
[Figure: keys Key1 to Key4, each pointing to a value. The key/value store doesn’t know what is in here.]
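The opaque-value idea can be shown with a very small in-memory store. This is a sketch under assumptions: the `KeyValueStore` class and its methods are hypothetical and do not mirror the API of Dynamo or MemcacheDB. The store accepts only bytes, so any structure inside the value is the application's business.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """One key, one value; the value is an opaque blob of bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        if not isinstance(value, bytes):
            raise TypeError("values must be bytes; serialize before storing")
        self._data[key] = value  # a second put on the same key replaces the first

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


kv = KeyValueStore()
# The application decides the encoding; here it happens to be JSON.
kv.put("Key1", json.dumps({"name": "Matt", "age": 32}).encode("utf-8"))
blob = kv.get("Key1")
profile = json.loads(blob.decode("utf-8"))
```

The store can return `blob` quickly, but it cannot answer a question like "who is 32?" without the application fetching and decoding every value. That limitation is what document stores address.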
Document Store: A key-value store, but the value is structured and understood by the DB. Querying data is possible, and not just on the key. Ex: MongoDB, CouchDB, Riak, etc.

NoSQL

  • 85. Each database has collections. Each collection has a set of documents. They are well designed for access through applications and suitable for web applications. A few document databases now provide a SQL-like query interface.
  • 86. [Figure: keys Key1 to Key4, each pointing to a document with fields Name: $Name, Value: $Value, Version: $Version, Type: $Type, and embedded objects Emb Object1 and Emb Object2.] Objects inside objects. CRAZY!
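Because a document store understands the value, it can filter on fields inside it. The sketch below illustrates that with an in-memory collection. Everything here is an assumption made for the example: the `Collection` class, the `lookup` helper, the dotted-path convention, and the sample documents are hypothetical and are not the query API of MongoDB or CouchDB.

```python
from typing import Any, Dict, List, Optional


def lookup(doc: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'Author.Name' into embedded objects."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


class Collection:
    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def insert(self, doc_id: str, doc: Dict[str, Any]) -> None:
        self._docs[doc_id] = doc

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        return self._docs.get(doc_id)

    def find(self, criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
        # A document matches when every listed field has the requested value.
        return [
            doc for doc in self._docs.values()
            if all(lookup(doc, path) == value for path, value in criteria.items())
        ]


docs = Collection()
docs.insert("Key1", {"Name": "Intro", "Type": "post", "Version": 1})
docs.insert("Key2", {"Name": "CAP", "Type": "post", "Version": 2,
                     "Author": {"Name": "Matt"}})
docs.insert("Key3", {"Name": "logo", "Type": "image"})

all_posts = docs.find({"Type": "post"})
by_matt = docs.find({"Author.Name": "Matt"})
```

The three documents have different shapes, and no schema had to be declared first. A real document database would back `find` with indexes rather than scanning every document.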
  • 87. BigTable & Its Clones: Database, tables, rows, columns, and ‘SuperColumns’. A row consists of columns and SuperColumns. A few SuperColumns can be made mandatory. Each SuperColumn holds an arbitrary set of columns. Rows are typically versioned by a system-assigned timestamp.
  • 88. Intended for tables with a huge number of columns; millions can be supported very easily. ‘a sparse, distributed multi-dimensional sorted map’. Also referred to as wide-column stores. Ex: Google BigTable, Cassandra, HBase, Voldemort, Azure Tables
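The ‘sparse, multi-dimensional sorted map’ description can be modelled as a map from (row, column, timestamp) to value. The sketch below is an in-memory illustration under assumptions: the `SparseTable` class, its methods, and the sample row keys are hypothetical, timestamps are supplied by the caller rather than assigned by the system, and SuperColumns and distribution are left out.

```python
from typing import Dict, List, Optional, Tuple


class SparseTable:
    def __init__(self) -> None:
        # (row key, column name) -> list of (timestamp, value), oldest first.
        # Only cells that were written exist, which is what makes it sparse.
        self._cells: Dict[Tuple[str, str], List[Tuple[int, str]]] = {}

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort()  # keep versions ordered by timestamp

    def get(self, row: str, column: str, at: Optional[int] = None) -> Optional[str]:
        # Newest version, or the newest one written at or before `at`.
        versions = self._cells.get((row, column), [])
        visible = [value for stamp, value in versions if at is None or stamp <= at]
        return visible[-1] if visible else None

    def scan(self, start: str, end: str) -> List[str]:
        # Row keys in [start, end), in sorted order.
        rows = {row for row, _ in self._cells}
        return sorted(row for row in rows if start <= row < end)


table = SparseTable()
table.put("com.example/index", "contents:html", 1, "<html>v1</html>")
table.put("com.example/index", "contents:html", 2, "<html>v2</html>")
table.put("com.example/about", "anchor:home", 1, "About")
```

Each row may use a different set of columns at no cost, and `table.get("com.example/index", "contents:html", at=1)` still returns the older version. Keeping row keys sorted is what makes the prefix scan in `scan` cheap in a real wide-column store.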
  • 90. Graph Databases: Nodes, edges, and properties replace traditional tables, columns, and rows. A graph database can be implemented in different ways: as a key/value store, columnar store, BigTable clone, or even a combination of these. Fields directly store the id of another entity, forming the edge.
  • 91. A graph database is a multi-relational graph. No need for secondary indexes. Relationships in RDBMS are ‘weak’; relationships in graphs are ‘strong’. The rest don’t really care about relations at the DB level.
  • 92. [Figure: example graph. Labels: Address, Age: 32, Matt, Mobile, April, Is related to, SSN, Spouse, owns, Drives, Honda, Model, City registration.]
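Nodes, edges, and properties can be sketched with two small containers. This is an illustration under assumptions: the `PropertyGraph` class is hypothetical, and because the original figure is flattened, the choice of which labels are nodes (Matt, April, Honda), which are properties (age), and which are edges (spouse, drives) is a guess made for the example.

```python
from typing import Any, Dict, List, Optional, Tuple


class PropertyGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}   # node id -> properties
        self.edges: List[Tuple[str, str, str]] = []  # (source, label, target)

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = properties

    def add_edge(self, source: str, label: str, target: str) -> None:
        # The edge stores the ids of both entities directly.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        self.edges.append((source, label, target))

    def neighbours(self, node_id: str, label: Optional[str] = None) -> List[str]:
        # Follow outgoing edges, optionally only those with one label.
        return [
            target for source, edge_label, target in self.edges
            if source == node_id and (label is None or edge_label == label)
        ]


graph = PropertyGraph()
graph.add_node("Matt", age=32)
graph.add_node("April")
graph.add_node("Honda")
graph.add_edge("Matt", "spouse", "April")
graph.add_edge("Matt", "drives", "Honda")
```

A question such as "what does Matt drive?" becomes `graph.neighbours("Matt", "drives")`, a direct hop along an edge rather than a join between tables. A production graph database would index edges per node instead of scanning one list.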
  • 93. Chart with axes Size and Complexity, plotting: Key-Value Store, Document Store, BigTable Clone, Graph Databases
  • 94. Too Many Cooks and Recipes: No specific recipe! Major implementations: Graph, Document store, Tabular, Key value store, Eventually consistent, Hierarchical, Ordered. Other known recipes: Multivalue, Object, Tuple Store
  • 95. The Menu. On Disk: BigTable, Membase, Tokyo Cabinet. In RAM: Memcached, Velocity. Eventually Consistent: Cassandra, Dynamo, Riak. Hierarchical: GT.M. Ordered: Berkeley DB, NMDB, C-ISAM. Multivalue: eXe, OpenQM. Document Store: CouchDB, Lotus Notes, MongoDB. Graph: AllegroGraph, Neo4j, DEX. Tabular: BigTable, HBase, Hypertable. The list isn’t even a quarter of the whole
  • 96. _theOpenSourceIssue: Most of them are open source and thus forkable, like Linux. The first of the lot: Google’s BigTable and Amazon’s Dynamo. All in all, there are about 10 roots with 4 major ones.
  • 97. No single database to rule them all
  • 98. Real World Implementations: Digg’s 3TB for Green Badges [CASSANDRA]; Facebook’s 50TB for Inbox Search [CASSANDRA]; eBay’s 2PB overall data; Google’s
  • 100. MongoDB: Document store. JSON storage. REST ….. not out of the box. Map/Reduce. Master-slave replication. Strong suite of query APIs. Good support for SQL. Work in progress: autosharding-based scalability, failover support. Open Source, Non-Relational, Scalable, Schemaless, Queryable
  • 101. Document Oriented: Mongo stores documents in collections. Documents are slightly enhanced JSON objects. Complex data structures are very much possible. Data modelling is a more natural process
  • 102. Embeddable Objects: Complexity.begin(). Embed objects within a single document. A document is an enhanced form of object, as mentioned earlier. The same thing can be achieved in an RDBMS using multiple tables and joining them together. Suppose our requirement is to store a blog post with this information: Post Content, Post Title, Post Author, Comments (Comment order, Comment content, Comment author)
  • 104. MongoDB Solution Documents …. Each one of them is a post { Name: $name, Author: $author, Comment: [ { Author: $author1, Comment: $comment1} , { Author: $author2, Comment: $comment2, Replies: [ { Author: $author3, Comment: $comment3} ] } ] }
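The nested post document above can be tried out as plain Python data, with no database involved. The field names follow the slide; the sample values and the helper `walk_comments` are assumptions added for illustration.

```python
# One post is one document; comments and replies are embedded inside it.
post = {
    "Name": "Why NoSQL?",
    "Author": "alice",
    "Comment": [
        {"Author": "bob", "Comment": "Nice post"},
        {
            "Author": "carol",
            "Comment": "I disagree",
            "Replies": [
                {"Author": "alice", "Comment": "Why?"},
            ],
        },
    ],
}


def walk_comments(comments, depth=0):
    """Yield (depth, author, text) for each comment and its nested replies."""
    for comment in comments:
        yield depth, comment["Author"], comment["Comment"]
        # "Replies" is optional: a comment without replies simply omits it.
        yield from walk_comments(comment.get("Replies", []), depth + 1)


thread = list(walk_comments(post["Comment"]))
```

Comment order is simply the order of the list, and the whole thread comes back with the post in a single read, where the relational design would join a posts table to a comments table.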
  • 108. Schema-less: No database-enforced schema. Adding and deleting columns is simple. It’s about how the application uses the APIs. The data definition need not be declared up front.
  • 109. Other Features: Data Tagging, Caching, Real Time Analytics, Image Storage, Dynamic Queries, Binary Storage
  • 110. MongoDB - Why Not? Lacks transactions. Doesn’t completely support SQL. Lacks a built-in revisioning system like CouchDB’s. Lacks full-text searching features
  • 111. Try MongoDB @ http://try.mongodb.org/
  • 112. EOL
  • 113. Calm down! Eventually Answered System All your questions will be answered eventually

Editor's Notes

  1. SQL databases approach data in the form of sets and tables. Incidentally, this strength soon becomes a weakness. Assumptions made: Data is represented in the form of tables, with rows and columns. Data in each table can be related to data in another. Data can be, and has to be, searchable through all columns. Strengths: Data manipulation through set theory. Relational constraints are enforced by the management system. Weaknesses: Relational-ness becomes an overhead once the data becomes really huge. A large volume of writes in a SQL database puts a heavy burden on the DBMS as well as on the storage disk.
  2. NoSQL is a collection of databases that avoid the drawbacks of RDBMS without completely giving up on relational models. They are not stringent about certain core RDBMS concepts such as ACID compliance and other integrity constraints. The priority is to support high levels of scalability through easy partitioning across multiple cheap commodity machines, giving up some of the consistency that SQL databases aim to deliver, along with some of the relatedness of the data.
  3. The CAP theorem states that any shared-data system can guarantee at most two of these three properties at once. Consistency (all database clients see the same data, even with concurrent updates). Availability (all database clients are able to access some version of the data). Partition tolerance (the database can be split over multiple servers and keeps working when they cannot all reach each other). References: http://www.julianbrowne.com/article/viewer/brewers-cap-theorem http://devblog.streamy.com/2009/08/24/cap-theorem/ http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/
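The trade-off in the CAP note can be illustrated with a deliberately tiny simulation: two replicas, a switch that cuts the link between them, and two policies for handling writes while the link is down. This is a teaching sketch under assumed names (`Replica`, `TwoNodeStore`, the "CP" and "AP" mode strings), not a model of how any real database implements replication.

```python
class Replica:
    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Two replicas that normally copy every write to each other."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True means the replicas cannot communicate

    def write(self, replica, value):
        """Write through one replica; return True if the write was accepted."""
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            replica.value = value
            other.value = value   # replication succeeds, replicas stay equal
            return True
        if self.mode == "CP":
            return False          # stay consistent by refusing the write
        replica.value = value     # stay available; replicas now disagree
        return True


cp = TwoNodeStore("CP")
cp.write(cp.a, "v1")
cp.partitioned = True
cp_accepted = cp.write(cp.a, "v2")   # refused: consistency is kept

ap = TwoNodeStore("AP")
ap.write(ap.a, "v1")
ap.partitioned = True
ap_accepted = ap.write(ap.a, "v2")   # accepted: the replicas diverge
```

During the partition the CP store gives up availability (the write is rejected and both replicas still hold "v1"), while the AP store gives up consistency (one replica holds "v2", the other "v1") and would need some reconciliation step once the link returns.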