
How big data moved the needle from monolithic SQL RDBMS to distributed NoSQL



We will see which factors drove each step in the evolution of databases and which design choices engineers made along the way. We will also see what was given up (the tradeoffs) during that evolution, and which kinds of applications are best suited to each type of database.

Published in: Technology


  1. 1. Sunil Sayyaparaju, Citrusleaf Inc
  2. 2. Agenda: Evolution of SQL RDBMS; Need to break out; Fresh Thinking; Spectrum of databases; Future
  3. 3. Evolution of SQL RDBMS: Data management started with flat files. 1960: Navigational DBMS (iterate over the entire file on tape; no search). 1970: Relational DBMS (God sent Codd; then came tables, keys, and normalization; tuple calculus was adopted as the basis for SQL; System R and Ingres were born, which gave birth to DB2, Sybase, Informix, and Oracle). 1980: Object-oriented databases. 2000: In-memory and XML databases. 2000: Distributed shared-disk databases.
  4. 4. Need to break out: More and more data continued to pour in. Storage costs went up (offset by cheaper and larger disks). Speed went down (offset by more powerful machines and by several optimizations). Cost went up (large businesses could bear it, but what about small businesses?). 24x7 uptime became necessary (uh-oh). Flexibility of the DB schema became necessary (uh-oh).
  5. 5. Distributed Shared-disk Model: Multiple machines share a disk, with data copies in each cache and a single copy on disk. Advantages: scales well for reads; individual nodes can be added or removed. Hauntings: write scalability and locking; maintaining transaction semantics; communication between nodes; invalidating old replicated data on write. Workaround: redesign applications to exploit this model (so-called well-partitioned applications). Million-dollar question: if I redesign my application, why not move to a totally new model?
  6. 6. Evils of 24x7 uptime: The evils are software or hardware upgrades, failures, routine maintenance, and DB schema changes. Workaround: replicate the data and switch over. Problem: this needs manual intervention.
  7. 7. Fresh Thinking: I want: 24x7 uptime without manual intervention; flexibility in my database schema; speed and predictability; vertical and horizontal scalability. I don’t want: to splurge money on software and hardware; overheads unrelated to my use case. I can lose (most important): Attitude: I know how to manage my data (several applications already do that, e.g. SAP R/3); joins and multi-record transactions; complex query functionality; SQL altogether.
  8. 8. Let us do some housecleaning
        Full-blown RDBMS      Cut-down RDBMS
        Query Compilation     Query Compilation
        Query Optimization    -
        Query Execution       Query Execution
        Transaction Engine    Transaction Engine
        Storage & Access      Storage & Access
  9. 9. Who does not want features?
        Question                               Formula 1 car   Sedan
        Fuel efficient?                        No              Yes
        Can it carry my family?                No              Yes
        Does it have a 6-disc audio player?    No              Yes
        Does it have airbags?                  No              Yes
        Then why would someone buy an F1 car? Because it goes amazingly fast. It does best what it is designed for.
        Trivia: Why don’t F1 cars have airbags?
  10. 10. Let there be NoSQL: Started as No-SQL; some evolved into Not-Only-SQL. Horizontal scalability is assumed. Supports the latest hardware, such as SSDs. Different flavors of NoSQL target different use cases: key-value stores, ordered key-value stores, document stores with text search, and graph databases.
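The difference between two of the flavors named on this slide, key-value stores and ordered key-value stores, can be illustrated with a minimal in-memory sketch. The class names, the `scan` method, and the `user:`/`video:` key scheme are all hypothetical and chosen only for illustration; they are not the API of Citrusleaf or of any other product mentioned in the deck.

```python
import bisect


class KeyValueStore:
    """Plain key-value store: point lookups only, no notion of key order."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class OrderedKeyValueStore(KeyValueStore):
    """Also keeps the keys sorted, which makes range scans possible."""

    def __init__(self):
        super().__init__()
        self._keys = []  # sorted list of all keys (illustrative, not efficient)

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)  # keep the key list sorted
        super().put(key, value)

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]


store = OrderedKeyValueStore()
for key, value in [("user:3", "carol"), ("user:1", "alice"),
                   ("user:2", "bob"), ("video:9", "intro")]:
    store.put(key, value)

# ";" is the character right after ":" in ASCII, so this range covers
# exactly the keys that start with "user:".
users = store.scan("user:", "user;")
```

A plain key-value store can only answer `get("user:2")`; the ordered variant pays for maintaining the sort order on every write in exchange for being able to answer range queries such as "all users".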
  11. 11. Spectrum of Databases [Figure: databases plotted by query model (vertical axis: SQL, SQL/NoSQL, NoSQL) against distributedness (horizontal axis: monolithic, distributed shared-disk, distributed shared-nothing). NoSQL: Lotus Notes, Citrusleaf, ObjectDB, Mongo, Versant, Cassandra, Zope, Redis. SQL/NoSQL: MySQL NDB. SQL: Oracle, Oracle RAC, HP Nonstop, DB2, Sybase SDC, VoltDB, MS-SQL, IBM PureScale, Sybase ASE, ScaleDB, MySQL.]
  12. 12. NoSQL Datamodels
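  13. 13. Future: Fortunate/Unfortunate? [Figure: the same chart of query model (SQL, SQL/NoSQL, NoSQL) against distributedness (monolithic, distributed shared-disk, distributed shared-nothing), with fewer databases. NoSQL: Citrusleaf, Mongo, Cassandra, Redis. SQL/NoSQL: MySQL NDB. SQL: Oracle, DB2, MS-SQL, Sybase ASE, MySQL.]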
  14. 14. Future: More Storage roles [Figure: an application on top of four Hadoop jobs, with HDFS, HDFS, Mongo, and Citrusleaf as the storage layers beneath them.]
  15. 15. Conclusion: You cannot just replace SQL with NoSQL. You lose some features when you go to NoSQL. You have to put in extra effort to use NoSQL. Make sure that NoSQL is not as fat as SQL. NoSQL solves a specific subset of problems, but solves them well. NoSQL is lean and mean. NoSQL is designed to be highly available. NoSQL does not demand powerful hardware.
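The "extra effort" mentioned in the conclusion can be made concrete with a small sketch of a join done in application code, one of the features the deck lists as given up. The two dictionaries stand in for two key-value collections; the record layout (`user_id`, `name`, `item`) and the function name are hypothetical and do not come from the slides.

```python
# Two collections held as plain key-value maps (stand-ins for a key-value store).
users = {
    "u1": {"name": "Alice"},
    "u2": {"name": "Bob"},
}
orders = {
    "o1": {"user_id": "u1", "item": "disk"},
    "o2": {"user_id": "u2", "item": "ssd"},
    "o3": {"user_id": "u1", "item": "ram"},
    "o4": {"user_id": "u9", "item": "tape"},  # no matching user
}


def orders_with_user_names(orders, users):
    """Application-side equivalent of an inner join of orders and users
    on orders.user_id = users key, returned in order-id order."""
    joined = []
    for order_id in sorted(orders):
        order = orders[order_id]
        user = users.get(order["user_id"])  # one point lookup per order
        if user is None:
            continue  # inner-join semantics: drop orders without a user
        joined.append((order_id, user["name"], order["item"]))
    return joined


result = orders_with_user_names(orders, users)
```

With an RDBMS this would be a single SQL statement that the query optimizer plans; here the application chooses the lookup strategy itself and handles missing records, which is the kind of work that moves into the application when joins are dropped.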