Scalability: RDBMS vs. Other Data Stores
  • May have to discuss Queuing Systems, Idempotency and so on

Scalability: RDBMS vs. Other Data Stores: Presentation Transcript

  • 1. RDBMS vs. Other Data Stores for Scalability
    ramki.g@directi.com
    TechTalk 2009, IIIT Hyderabad
  • 2. Scalability
    Increase Resources → Increase Performance (Linearly)
    Performance?
    Latency, Capacity, Throughput
    Vertical Scalability (Scaling Up)
    Divide the functionality
    Horizontal Scalability (Scaling Out)
    Divide the data
  • 3. Relational Database
    Table, Row, Column
    Set, Item, Property
  • 4. Relational Theory
    Selection: SELECT
    Filter: WHERE
    Join: JOIN, LEFT JOIN, RIGHT JOIN
    Correlation: SELECT a FROM A WHERE A.b IN (SELECT b FROM B WHERE B.a > A.a)
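
The operators on this slide can be tried out with a minimal sketch using Python's built-in `sqlite3` module. The `customer` and `orders` tables and their rows are hypothetical, invented only for illustration; RIGHT JOIN is left out because older SQLite versions do not support it.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Asha', 'Hyderabad'), (2, 'Ravi', 'Mumbai'),
                                (3, 'Meena', 'Hyderabad');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 50.0), (12, 2, 400.0);
""")

# Selection with a filter.
hyderabad = conn.execute(
    "SELECT name FROM customer WHERE city = 'Hyderabad' ORDER BY id"
).fetchall()

# LEFT JOIN keeps customers that have no matching order (total is NULL).
joined = conn.execute("""
    SELECT c.name, o.total
    FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# Correlated subquery: the inner query refers to the outer row (o.customer_id).
# It finds orders larger than the same customer's average order.
above_avg = conn.execute("""
    SELECT o.id FROM orders o
    WHERE o.total > (SELECT AVG(o2.total) FROM orders o2
                     WHERE o2.customer_id = o.customer_id)
    ORDER BY o.id
""").fetchall()

print(hyderabad)  # [('Asha',), ('Meena',)]
print(joined)     # [('Asha', 250.0), ('Asha', 50.0), ('Ravi', 400.0), ('Meena', None)]
print(above_avg)  # [(10,)]
```

The correlated subquery has to be re-evaluated per outer row, which is one reason such queries become expensive once the data is spread over several machines.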
  • 5. Relational Theory
    Aggregation
    Set Operators
    Union, Intersection, Minus
    Group By
    MAX, MIN, SUM, AVG
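
As a rough illustration of aggregation and set operators, here is a small `sqlite3` sketch. The tables `sale`, `a` and `b` are made up for the example, and SQLite spells the Minus operator as `EXCEPT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (region TEXT, amount INTEGER);
    INSERT INTO sale VALUES ('north', 10), ('north', 30), ('south', 5);
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

# GROUP BY with the aggregate functions named on the slide.
per_region = conn.execute("""
    SELECT region, MIN(amount), MAX(amount), SUM(amount), AVG(amount)
    FROM sale GROUP BY region ORDER BY region
""").fetchall()

def column(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

union = column("SELECT x FROM a UNION SELECT x FROM b ORDER BY x")
intersection = column("SELECT x FROM a INTERSECT SELECT x FROM b ORDER BY x")
minus = column("SELECT x FROM a EXCEPT SELECT x FROM b ORDER BY x")

print(per_region)    # [('north', 10, 30, 40, 20.0), ('south', 5, 5, 5, 5.0)]
print(union)         # [1, 2, 3, 4]
print(intersection)  # [2, 3]
print(minus)         # [1]
```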
  • 6. Transactions: Atomicity
    Transaction Level
    The entire logical operation is a transaction
    Multiple statements
    Statement level
    Each statement is either successful or not, no partial success
    Multiple records
    Record Level
    All modifications to a record are successful or not
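
Transaction-level atomicity can be sketched with `sqlite3` as follows. The `account` table, the `transfer` helper and the CHECK constraint are assumptions made for the example; the point is only that a failure in the second statement also undoes the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src has insufficient funds.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()

failed = transfer(conn, 1, 2, 150)   # debit fails, so the credit is rolled back
after_failed = balances(conn)
ok = transfer(conn, 1, 2, 60)
after_ok = balances(conn)

print(failed, after_failed)  # False [(1, 100), (2, 0)]
print(ok, after_ok)          # True [(1, 40), (2, 60)]
```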
  • 7. Transactions: Consistency
    Integrity Constraints
    Referential Integrity
  • 8. Transactions: Isolation Levels
    Serializable
    There is a definite serial order of transactions that takes the database from state A to state B
    Repeatable Read
    Any data read by a transaction stays unchanged until the transaction completes
    Non-Repeatable Read (aka Read Committed)
    Two reads within a transaction may give different results
    Dirty Read (aka Read Uncommitted)
    A transaction might read data which might then be rolled back
  • 9. RDBMS Luxuries
    Multiple Indexes
    Auto Increments/Sequences
    Triggers
  • 10. Scalability in RDBMS
    Replication
    Read Replication (Master-Slave)
    Read Write Replication (Master-Master)
    Cluster
    Distributed Transaction
    Two-phase commits
  • 11. Scalability Impediments
    Performance
    Sub-Queries/Correlation, Joins, Aggregates,
    Referential Integrity constraints
    Basic Guarantee
    Consistency
    Availability
  • 12. CAP?
    Conjecture: Distributed systems cannot ensure all three of the following properties at once
    Consistency The client perceives that a set of operations has occurred all at once.
    Availability Every operation must terminate in an intended response.
    Partition tolerance Operations will complete, even if individual components are unavailable.
  • 13. ACID to BASE
    Basically Available - system seems to work all the time
    Soft State - it doesn't have to be consistent all the time
    Eventually Consistent - becomes consistent at some later time
  • 14. BASE: An Example
    BEGIN Transaction
    INSERT INTO ORDER( oid, timestamp, customer)
    FOREACH item IN itemList
    INSERT INTO ORDER_ITEM ( oid, item.id, item.quantity, item.unitprice)
    //UPDATE INVENTORY SET quantity = quantity - item.quantity WHERE item = item.id
    COMMIT
    END Transaction
    Assume each statement is queued for execution instead of being executed immediately
    You will still get COMMIT success, although the writes are applied later
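
The queued-execution idea above can be sketched in plain Python. This is not the presenter's implementation: `QueuedStore`, its methods and the operation ids are hypothetical, and the inventory update (commented out on the slide) is included here as an assumption to make the soft state visible. Operation ids make redelivered messages harmless, which is the idempotency concern that comes with queuing.

```python
from collections import deque

class QueuedStore:
    """Toy store: writes are queued, acknowledged at once, applied later."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)  # item id -> quantity in stock
        self.orders = {}                  # order id -> list of (item, qty)
        self.pending = deque()            # queued operations
        self.applied = set()              # ids of operations already applied

    def place_order(self, oid, items):
        # Queue one operation per statement, then report success immediately.
        self.pending.append((f"{oid}:order", "order", (oid,)))
        for item_id, qty in items:
            self.pending.append((f"{oid}:{item_id}", "item", (oid, item_id, qty)))
        return "COMMIT success"

    def apply(self, op_id, kind, args):
        if op_id in self.applied:  # idempotent: a redelivery changes nothing
            return
        if kind == "order":
            self.orders.setdefault(args[0], [])
        else:
            oid, item_id, qty = args
            self.orders.setdefault(oid, []).append((item_id, qty))
            self.inventory[item_id] -= qty
        self.applied.add(op_id)

    def drain(self):
        # Stand-in for a background worker consuming the queue.
        while self.pending:
            self.apply(*self.pending.popleft())

store = QueuedStore({"pen": 10})
status = store.place_order(1, [("pen", 3)])
stock_before = store.inventory["pen"]         # still 10: not yet consistent
store.drain()
stock_after = store.inventory["pen"]          # 7: eventually consistent
store.apply("1:pen", "item", (1, "pen", 3))   # duplicate delivery
stock_after_dup = store.inventory["pen"]      # still 7

print(status, stock_before, stock_after, stock_after_dup)
```

Between `place_order` returning and `drain` running, a reader sees the old stock level; that window is the "soft state" of the previous slide.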
  • 15. Alternate Implementations
    BigTable – Google – CP
    HBase – Apache – CP
    Hypertable – Community – CP
    Dynamo – Amazon – AP
    SimpleDB – Amazon – AP
    Voldemort – LinkedIn – AP
    Cassandra – Facebook – AP
    MemcacheDB – Community – CP/AP
  • 16. Data Models
    Key/Value Pairs
    Dynamo, MemcacheDB, Voldemort
    Row-Column
    BigTable, Cassandra, SimpleDB, Hypertable, HBase
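
To make the two data models concrete, here is a rough in-memory sketch using plain Python dictionaries. It mirrors only the shape of the data, not any real client API; the keys and values are illustrative.

```python
from collections import defaultdict

# Key/value model: the store sees one opaque value per key.
kv_store = {}
kv_store["user:42"] = b'{"name": "Asha", "city": "Hyderabad"}'

# Row-column model: row key -> column name -> value.
# Rows are sparse, so each row may carry a different set of columns.
table = defaultdict(dict)
table["com.cnn.www"]["anchor:www.c-span.org"] = "CNN"
table["com.cnn.www"]["contents:"] = "<html>...</html>"
table["org.example"]["anchor:www.cnn.com"] = "Example"

# A row-column store can address part of a row, here one column family.
anchors = {
    col: val
    for col, val in table["com.cnn.www"].items()
    if col.startswith("anchor:")
}

print(anchors)  # {'anchor:www.c-span.org': 'CNN'}
```

With the key/value model the application must fetch and decode the whole value to read one field, whereas the row-column model lets the store return individual columns.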
  • 17. Programming Models
    // Open the table
    Table *T = OpenOrDie("/bigtable/web/webtable");
    // Write a new anchor and delete an old anchor
    RowMutation r1(T, "com.cnn.www");
    r1.Set("anchor:www.c-span.org", "CNN");
    r1.Delete("anchor:www.abc.com");
    Operation op;
    Apply(&op, &r1);
  • 18. BigTable: Consistent yet Infinitely Scalable
    Single Master
    B+ tree based data distribution
  • 19. BigTable: Transactions
    Entities and Entity Groups
    Invoice
    Invoice Item
    Delivery Note
  • 20. Dynamo: Highly available and Infinitely Scalable
    Consistent Hashing
    Peer to Peer Distributed
    Gossip based member discovery
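
Consistent hashing can be illustrated with a minimal ring. This is a generic sketch, not Dynamo's actual code: the `HashRing` class, the node names, the choice of SHA-256 and the 64 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._keys = []    # sorted positions on the ring
        self._owner = {}   # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        # Each node owns several positions to even out the key distribution.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._keys, pos)

    def lookup(self, key):
        # A key belongs to the first node position clockwise from its hash.
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]

ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Only keys claimed by the new node change owner; the rest stay put.
print(len(moved), all(after[k] == "node-d" for k in moved))
```

With a plain `hash(key) % N` scheme, growing from three to four nodes would remap most keys; on the ring only the share taken over by the new node moves, which is what makes adding peers cheap.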
  • 21. RDBMS or Other?
    Nature of Business
    Maturity of the Product
    Cost of Adoption
    Maturity of the alternative Datastores
  • 22. Q&A