Scalability: RDBMS vs. Other Data Stores

  • May have to discuss Queuing Systems, Idempotency and so on

Transcript

  • 1. RDBMS vs. Other Data Stores for Scalability
    ramki.g@directi.com
    TechTalk 2009, IIIT Hyderabad
  • 2. Scalability
    Increase Resources → Increase Performance (Linearly)
    Performance?
    Latency, Capacity, Throughput
    Vertical Scalability (Scaling Up)
    Divide the functionality
    Horizontal Scalability (Scaling Out)
    Divide the data
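To make "divide the data" concrete, here is a minimal sketch of hash-based horizontal partitioning. The names `shard_for` and `ShardedStore` are hypothetical and not from the talk, and in-memory dicts stand in for separate database servers.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process and would route the same key differently on each run.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy key/value store that spreads keys across several shards."""

    def __init__(self, shard_count: int):
        # Each dict stands in for one database server (an assumption).
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

A simple modulo scheme like this remaps most keys when the shard count changes, which is one reason systems such as Dynamo use consistent hashing instead.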
  • 3. Relational Database
    Table, Row, Column
    Set, Item, Property
  • 4. Relational Theory
    Projection (choose columns): SELECT
    Selection (filter rows): WHERE
    Join: JOIN, LEFT JOIN, RIGHT JOIN
    Correlation: SELECT a FROM A WHERE A.b IN (SELECT b FROM B WHERE B.a > A.a)
  • 5. Relational Theory
    Aggregation
    Set Operators
    Union, Intersection, Minus
    Group By
    MAX, MIN, SUM, AVG
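The relational operations on the two slides above can be tried out with Python's built-in `sqlite3` module. This is only an illustrative sketch: the `customer` and `orders` tables and their rows are made up for the example, and SQLite spells the MINUS operator as `EXCEPT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'asha'), (2, 'ravi'), (3, 'meena');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 70.0), (12, 2, 20.0);
""")

# Join + Group By + aggregate: total order amount per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customer c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Correlated subquery: orders larger than that customer's own average.
above_avg = conn.execute("""
    SELECT o.id FROM orders o
    WHERE o.amount > (SELECT AVG(o2.amount) FROM orders o2
                      WHERE o2.customer_id = o.customer_id)
""").fetchall()

# Set operator: customers who have placed no orders.
no_orders = conn.execute("""
    SELECT id FROM customer
    EXCEPT
    SELECT customer_id FROM orders
""").fetchall()
```

Each of these queries needs to see rows from more than one table at once, which is what makes them expensive once the tables live on different machines.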
  • 6. Transactions: Atomicity
    Transaction Level
    An entire logical operation is a transaction
    Multiple statements
    Statement level
    Each statement is either successful or not, no partial success
    Multiple records
    Record Level
    All modifications to a record are successful or not
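Transaction-level atomicity can be sketched with `sqlite3`: either both updates of a transfer are applied or neither is. The `account` table, the `CHECK` constraint, and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would have made the balance negative, so the credit
        # already executed above was rolled back as well.
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

Calling `transfer(conn, 1, 2, 150)` fails on the second statement, and the first statement's credit disappears with it.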
  • 7. Transactions: Consistency
    Integrity Constraints
    Referential Integrity
  • 8. Transactions: Isolation Levels
    Serializable
    The outcome is equivalent to running the transactions one at a time, in some definite order, to get from state A to state B
    Repeatable Read
    Any data read by a transaction stays unchanged until that transaction completes
    Non-Repeatable Read, aka Read Committed
    Two reads of the same data within a transaction may give different results
    Dirty Read, aka Read Uncommitted
    A transaction might read data that is later rolled back
  • 9. RDBMS Luxuries
    Multiple Indexes
    Auto Increments/Sequences
    Triggers
  • 10. Scalability in RDBMS
    Replication
    Read Replication (Master-Slave)
    Read Write Replication (Master-Master)
    Cluster
    Distributed Transaction
    Two-phase commits
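The two-phase commit mentioned above can be sketched as a coordinator that first collects votes and only then tells every participant to commit. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch leaves out what a real implementation needs most: durable logs, timeouts, and recovery from coordinator failure.

```python
class Participant:
    """One resource manager (for example, a database node) in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a local failure when False
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if all participants committed, False if all rolled back."""
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Every participant has to wait for the slowest vote, which is part of why distributed transactions limit scalability.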
  • 11. Scalability Impediments
    Performance
    Sub-Queries/Correlation, Joins, Aggregates,
    Referential Integrity constraints
    Basic Guarantee
    Consistency
    Availability
  • 12. CAP?
    Conjecture: Distributed systems cannot ensure all three of the following properties at once
    Consistency: The client perceives that a set of operations has occurred all at once.
    Availability: Every operation must terminate in an intended response.
    Partition tolerance: Operations will complete, even if individual components are unavailable.
  • 13. ACID to BASE
    Basically Available - system seems to work all the time
    Soft State - it doesn't have to be consistent all the time
    Eventually Consistent - becomes consistent at some later time
  • 14. BASE: An Example
    BEGIN Transaction
    INSERT INTO ORDER( oid, timestamp, customer)
    FOREACH item IN itemList
    INSERT INTO ORDER_ITEM ( oid, item.id, item.quantity, item.unitprice)
    //UPDATE INVENTORY SET quantity = quantity - item.quantity WHERE item = item.id
    COMMIT
    END Transaction
    Assume each statement is queued for execution
    COMMIT returns success before the queued statements have been applied
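The queued-execution idea can be sketched as below, together with the idempotency check that a queue with redelivery usually requires. `QueuedStore`, its operation ids, and the in-memory queue are all assumptions made for illustration, not the speaker's design.

```python
from collections import deque


class QueuedStore:
    """Accepts inventory updates immediately and applies them later."""

    def __init__(self, inventory=None):
        self.inventory = dict(inventory or {})
        self.queue = deque()
        self.applied_ids = set()  # remembers which operations already ran

    def submit(self, op_id, item, delta):
        # The caller gets success as soon as the operation is queued,
        # so the stored state is temporarily stale (soft state).
        self.queue.append((op_id, item, delta))
        return "accepted"

    def drain(self):
        # Apply queued operations; this is where the store catches up
        # and becomes consistent again.
        while self.queue:
            op_id, item, delta = self.queue.popleft()
            if op_id in self.applied_ids:
                continue  # duplicate delivery: applying twice would be wrong
            self.inventory[item] = self.inventory.get(item, 0) + delta
            self.applied_ids.add(op_id)
```

Between `submit` and `drain` a reader sees the old quantity, which is the "eventually consistent" window in miniature.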
  • 15. Alternate Implementations
    BigTable – Google – CP
    HBase – Apache – CP
    Hypertable – Community – CP
    Dynamo – Amazon – AP
    SimpleDB – Amazon – AP
    Voldemort – LinkedIn – AP
    Cassandra – Facebook – AP
    MemcacheDB – Community – CP/AP
  • 16. Data Models
    Key/Value Pairs
    Dynamo, MemcacheDB, Voldemort
    Row-Column
    BigTable, Cassandra, SimpleDB, Hypertable, HBase
  • 17. Programming Models
    // Open the table
    Table *T = OpenOrDie("/bigtable/web/webtable");
    // Write a new anchor and delete an old anchor
    RowMutation r1(T, "com.cnn.www");
    r1.Set("anchor:www.c-span.org", "CNN");
    r1.Delete("anchor:www.abc.com");
    Operation op;
    Apply(&op, &r1);
  • 18. BigTable: Consistent yet Infinitely Scalable
    Single Master
    B+ tree based data distribution
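Range-based data distribution can be sketched as a sorted list of split keys searched with binary search, a flattened stand-in for the tree-shaped lookup named on the slide. `TabletMap`, the split keys, and the server names are hypothetical, and the boundary convention here is a simplification rather than BigTable's actual metadata layout.

```python
import bisect


class TabletMap:
    """Maps a row key to the server holding its contiguous key range."""

    def __init__(self, split_keys, servers):
        # N split keys divide the sorted key space into N + 1 ranges.
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one more server than split keys")
        self.split_keys = sorted(split_keys)
        self.servers = list(servers)

    def locate(self, row_key):
        # Range i holds keys k with split_keys[i-1] <= k < split_keys[i].
        index = bisect.bisect_right(self.split_keys, row_key)
        return self.servers[index]
```

Because rows are kept in key order, neighbouring keys such as `com.cnn.www` and `com.cnn.www/sports` land on the same server, which keeps range scans local.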
  • 19. BigTable: Transactions
    Entities and Entity Groups
    Invoice
    Invoice Item
    Delivery Note
  • 20. Dynamo: Highly available and Infinitely Scalable
    Consistent Hashing
    Peer to Peer Distributed
    Gossip based member discovery
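Consistent hashing can be sketched as a sorted ring of hashed points, where each key belongs to the first node point at or after its own hash. `HashRing` and the virtual-node count are assumptions for this example, and the sketch ignores the replication and gossip that Dynamo layers on top.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # points per node, to even out the load
        self._ring = []       # sorted list of (position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key's position, wrapping around.
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

When a node is removed, only the keys it owned move to other nodes. Every other key keeps its owner, unlike simple modulo-based sharding.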
  • 21. RDBMS or Other?
    Nature of Business
    Maturity of the Product
    Cost of Adoption
    Maturity of the alternative Datastores
  • 22. Q&A