Scalability: RDBMS vs. Other Data Stores


Published in: Technology, Business

  1. RDBMS vs. Other Data Stores for Scalability
     TechTalk 2009, IIIT Hyderabad
  2. Scalability
     Increase Resources → Increase Performance (Linearly)
     Performance?
       Latency, Capacity, Throughput
     Vertical Scalability (Scaling Up)
       Divide the functionality
     Horizontal Scalability (Scaling Out)
       Divide the data
  3. Relational Database
     Table, Row, Column
     Set, Item, Property
  4. Relational Theory
     Projection: SELECT
     Selection (filter): WHERE
     Join: JOIN, LEFT JOIN, RIGHT JOIN
     Correlation: SELECT a FROM A WHERE A.b IN (SELECT b FROM B WHERE B.a > A.a)
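
The operations on this slide can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the `customer` and `orders` tables and their rows are hypothetical and not taken from the talk.

```python
import sqlite3

# In-memory database with two small, made-up tables (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Asha', 'Hyderabad'), (2, 'Ravi', 'Pune'), (3, 'Meena', 'Hyderabad');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 40.0), (12, 2, 90.0);
""")

# SELECT picks the columns, WHERE filters the rows.
names_in_hyderabad = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = 'Hyderabad' ORDER BY id"
    )
]

# LEFT JOIN keeps customers that have no matching order (amount is NULL/None).
left_join = conn.execute("""
    SELECT c.name, o.amount
    FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# Correlated subquery: the inner query refers to the outer row (o.customer_id),
# so it is logically re-evaluated for every outer row.
above_own_average = conn.execute("""
    SELECT o.id FROM orders o
    WHERE o.amount > (SELECT AVG(o2.amount) FROM orders o2
                      WHERE o2.customer_id = o.customer_id)
    ORDER BY o.id
""").fetchall()

print(names_in_hyderabad)
print(left_join)
print(above_own_average)
```

The correlated subquery is the costly one: it ties the evaluation of one table to each row of another, which is part of why such queries are hard to spread across machines.
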
  5. Relational Theory
     Aggregation
     Set Operators
       Union, Intersection, Minus
     Group By
       MAX, MIN, SUM, AVG
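
A short sketch of the aggregates and set operators listed above, again using Python's `sqlite3`. The tables `sale`, `a`, and `b` are hypothetical. Note that SQLite (like standard SQL) spells the "Minus" operator `EXCEPT`; `MINUS` is the Oracle spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (region TEXT, amount INTEGER);
    INSERT INTO sale VALUES ('north', 10), ('north', 30), ('south', 5);
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

# GROUP BY with the four aggregates named on the slide.
per_region = conn.execute("""
    SELECT region, MAX(amount), MIN(amount), SUM(amount), AVG(amount)
    FROM sale GROUP BY region ORDER BY region
""").fetchall()


def set_op(op):
    """Apply a set operator (UNION, INTERSECT, EXCEPT) to tables a and b."""
    rows = conn.execute(f"SELECT x FROM a {op} SELECT x FROM b ORDER BY x").fetchall()
    return [row[0] for row in rows]


print(per_region)
print(set_op("UNION"), set_op("INTERSECT"), set_op("EXCEPT"))
```
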
  6. Transactions: Atomicity
     Transaction Level
       The entire logical operation is a transaction
       Multiple statements
     Statement Level
       Each statement either succeeds or fails; no partial success
       Multiple records
     Record Level
       All modifications to a record succeed or none do
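
Transaction-level atomicity can be illustrated with `sqlite3`: two statements that form one logical operation either both take effect or neither does. The `account` table, the `transfer` helper, and the `CHECK` constraint are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one transaction; return True on success."""
    try:
        # The connection context manager commits on success and
        # rolls back if any statement inside raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


# The debit would make alice's balance negative, so the CHECK fails
# and the earlier credit to bob is rolled back as well.
failed = transfer(conn, "alice", "bob", 150)
after_failed = balances(conn)

succeeded = transfer(conn, "alice", "bob", 60)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```
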
  7. Transactions: Consistency
     Integrity Constraints
     Referential Integrity
  8. Transactions: Isolation Levels
     Serializable
       A definite order of mutations/transactions exists that leads from state A to state B
     Repeatable Read
       Any data read by a transaction remains unchanged until the transaction completes
     Non-Repeatable Read, aka Read Committed
       Two reads within a transaction may give different results
     Dirty Read
       A transaction might read data that is later rolled back
  9. RDBMS Luxuries
     Multiple Indexes
     Auto Increments/Sequences
     Triggers
  10. Scalability in RDBMS
      Replication
        Read Replication (Master-Slave)
        Read-Write Replication (Master-Master)
      Cluster
      Distributed Transaction
        Two-phase commits
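
The two-phase commit mentioned above can be sketched in a few lines. This is a toy, in-process model: the `Participant` class and `two_phase_commit` function are hypothetical names, and the sketch leaves out what makes real implementations hard (timeouts, durable logs, coordinator failure).

```python
class Participant:
    """A resource manager that can vote on and then finish a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (completion): a single "no" vote aborts the whole transaction.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("db1"), Participant("db2")]
one_failing = [Participant("db1"), Participant("db2", can_commit=False)]
print(two_phase_commit(healthy), [p.state for p in healthy])
print(two_phase_commit(one_failing), [p.state for p in one_failing])
```

Every participant has to be reachable and answer in both phases, so the slowest or least available node bounds the whole transaction.
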
  11. Scalability Impediments
      Performance
        Sub-Queries/Correlation, Joins, Aggregates
        Referential Integrity constraints
      Basic Guarantee
        Consistency
        Availability
  12. CAP?
      Conjecture: Distributed systems cannot ensure all three of the following properties at once
        Consistency: The client perceives that a set of operations has occurred all at once.
        Availability: Every operation must terminate in an intended response.
        Partition tolerance: Operations will complete, even if individual components are unavailable.
  13. ACID to BASE
      Basically Available - system seems to work all the time
      Soft State - it doesn't have to be consistent all the time
      Eventually Consistent - becomes consistent at some later time
  14. BASE: An Example
      BEGIN Transaction
      INSERT INTO ORDER( oid, timestamp, customer)
      FOREACH item IN itemList
        INSERT INTO ORDER_ITEM ( oid,, item.quantity, item.unitprice)
        //UPDATE INVENTORY SET quantity=quantity- item.quantity WHERE item =
      COMMIT
      END Transaction
      Assume each statement is queued for execution
      You will get COMMIT success
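
The idea in this example, that COMMIT succeeds while the statements are only queued, can be sketched as follows. `QueuedStore` and its methods are hypothetical and stand in for whatever queueing mechanism a real system would use; a real queue would be durable and drained by a background worker.

```python
from collections import deque


class QueuedStore:
    """Toy store that acknowledges writes before applying them."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)
        self.orders = []
        self.pending = deque()  # operations accepted but not yet applied

    def place_order(self, oid, items):
        # Each statement is queued, not executed.
        self.pending.append(("insert_order", oid))
        for item, quantity in items:
            self.pending.append(("decrement_inventory", item, quantity))
        # The caller gets success right away (basically available).
        return "COMMIT OK"

    def drain(self):
        # Applied later; until then readers may see stale data (soft state).
        while self.pending:
            op = self.pending.popleft()
            if op[0] == "insert_order":
                self.orders.append(op[1])
            else:
                _, item, quantity = op
                self.inventory[item] -= quantity


store = QueuedStore({"pen": 10})
ack = store.place_order(1, [("pen", 3)])
stale_quantity = store.inventory["pen"]   # inventory not updated yet
store.drain()
final_quantity = store.inventory["pen"]   # eventually consistent
print(ack, stale_quantity, final_quantity)
```
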
  15. Alternate Implementations
      BigTable – Google – CP
      HBase – Apache – CP
      Hypertable – Community – CP
      Dynamo – Amazon – AP
      SimpleDB – Amazon – AP
      Voldemort – LinkedIn – AP
      Cassandra – Facebook – AP
      MemcacheDB – Community – CP/AP
  16. Data Models
      Key/Value Pairs
        Dynamo, MemcacheDB, Voldemort
      Row-Column
        BigTable, Cassandra, SimpleDB, Hypertable, HBase
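
The difference between the two data models can be shown with plain Python dictionaries. This is only an analogy: the keys, column names, and values are made up, and none of the listed systems exposes exactly this interface.

```python
import json

# Key/value model: the store sees an opaque blob per key.
kv_store = {}
kv_store["user:42"] = json.dumps({"name": "Asha", "city": "Hyderabad"}).encode()

# Reading one field means fetching and decoding the whole value.
city_from_kv = json.loads(kv_store["user:42"])["city"]

# Row-column model: each row key maps to named columns that the store
# understands, so a single column can be read or written on its own.
row_store = {}
row_store.setdefault("user:42", {})["profile:name"] = "Asha"
row_store["user:42"]["profile:city"] = "Hyderabad"

city_from_row = row_store["user:42"]["profile:city"]
print(city_from_kv, city_from_row)
```
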
  17. Programming Models
      // Open the table
      Table *T = OpenOrDie("/bigtable/web/webtable");
      // Write a new anchor and delete an old anchor
      RowMutation r1(T, "com.cnn.www");
      r1.Set(";, "CNN");
      r1.Delete(";);
      Operation op;
      Apply(&op, &r1);
  18. BigTable: Consistent yet Infinitely Scalable
      Single Master
      B+ tree based data distribution
  19. BigTable: Transactions
      Entities and Entity Groups
        Invoice
        Invoice Item
        Delivery Note
  20. Dynamo: Highly Available and Infinitely Scalable
      Consistent Hashing
      Peer-to-Peer Distributed
      Gossip-based member discovery
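
Consistent hashing, the first item on this slide, can be sketched as a sorted ring of hash points. The `HashRing` class, the virtual-node count, and the use of MD5 are choices made for this illustration, not details of Dynamo itself.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # virtual points per node, to even out load
        self._ring = []       # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning the first ring point at or after the key."""
        if not self._ring:
            raise KeyError("ring is empty")
        i = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]  # wrap around the ring


keys = [f"key-{i}" for i in range(200)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

Adding a node only takes keys away from existing nodes and hands them to the new one; keys never shuffle between the old nodes, which is what lets such a cluster grow without rehashing everything.
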
  21. RDBMS or Other?
      Nature of Business
      Maturity of the Product
      Cost of Adoption
      Maturity of the Alternative Data Stores
  22. Q&A