Sergejus Barinovas
Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment
  • Atomicity. All of the operations in the transaction will complete, or none will.
    Consistency. The database will be in a consistent state when the transaction begins and ends.
    Isolation. The transaction will behave as if it is the only operation being performed upon the database.
    Durability. Upon completion of the transaction, the operation will not be reversed.
  • Consistency. The client perceives that a set of operations has occurred all at once.
    Availability. Every operation must terminate in an intended response.
    Partition tolerance. Operations will complete, even if individual components are unavailable.
    http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  • Basically Available. Supporting partial failures without total system failure.
    Soft state. The state can be inconsistent for a given period of time.
    Eventual consistency. After some time all replicas will have consistent data.
    For a given accepted update and a given replica, eventually either the update reaches the replica or the replica retires from service.

Sergejus Barinovas Presentation Transcript

  • 1.
  • 2. Web Scale with NoSQL
    Sergejus Barinovas (@sergejusb)
    http://sergejus.blogas.lt
  • 3. Who Am I?
    Architect at
    Running NoSQL servers in production
    Blogger (http://sergejus.blogas.lt, @sergejusb)
    Community member (http://dotnetgroup.lt)
    Contact me via sergejus.barinovas@gmail.com
  • 4. Powered by RDBMS
    Used everywhere…
    …even where it shouldn’t
    Used for 30+ years!
  • 5. Back to the 1980s…
  • 6. Data boom
  • 7. in numbers
    600 000 000 users
    30 000 servers
    20+ TB raw data per day
    >20 PB stored data
  • 8. You really think they use RDBMS?
  • 9. RDBMS Scaling Example
  • 10. Simple usage
    Customers
    master
    Reads / Writes
  • 11. Scale reads
    Customers
    master
    Writes
    Reads
    slave
    slave
  • 12. Scale writes
    master
    Reads / Writes [N-Z]
    Customers [N-Z]
    master
    Customers [A-M]
    Reads / Writes [A-M]
  • 13. Scale reads / writes
    master
    Customers [A-M]
    Writes [A-M]
    Reads [A-M]
    slave
    slave
    master
    Customers [N-Z]
    Writes [N-Z]
    Reads [N-Z]
    slave
    slave
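The layout on slides 11 to 13 can be sketched as a small routing layer: writes go to the master that owns the key range, reads are spread over that master's slaves. This is only a minimal illustration under assumptions; the `Shard` and `Router` classes and the server names are hypothetical and do not come from the slides.

```python
import itertools


class Shard:
    """One master plus its read-only slaves for a contiguous key range."""

    def __init__(self, first, last, master, slaves):
        self.first, self.last = first, last
        self.master = master
        # Cycle through the slaves so reads are spread round-robin.
        self._slaves = itertools.cycle(slaves)

    def owns(self, name):
        # Range partitioning on the first letter, as in [A-M] / [N-Z].
        return self.first <= name[0].upper() <= self.last

    def next_slave(self):
        return next(self._slaves)


class Router:
    """Sends writes to the owning master and reads to one of its slaves."""

    def __init__(self, shards):
        self.shards = shards

    def _shard_for(self, name):
        for shard in self.shards:
            if shard.owns(name):
                return shard
        raise KeyError(f"no shard owns {name!r}")

    def route_write(self, name):
        return self._shard_for(name).master

    def route_read(self, name):
        return self._shard_for(name).next_slave()


# Hypothetical server names, for illustration only.
router = Router([
    Shard("A", "M", "master-am", ["slave-am-1", "slave-am-2"]),
    Shard("N", "Z", "master-nz", ["slave-nz-1", "slave-nz-2"]),
])
```

Every application query now has to pass through such a layer, and a failed master leaves its whole key range unwritable, which is the operational burden the next slide alludes to.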
  • 14. Pray your system won’t fail
  • 15. Enter the NoSQL
  • 16. Why NoSQL
    Limited SQL scalability
    Sharding and vertical partitioning
    Limited SQL availability
    Master / slave configuration
    Limited SQL speed of read operations
    Multiple read replicas
    SQL limitations for huge amounts of data
    Key / value / type columns
  • 17. NoSQL history
    2009, Eric Evans, no:sql(est)
    NoSQL – open source distributed databases, not relational SQL databases
    NoSQL – not only SQL
    NoSQL -> Big Data
  • 18. NoSQL characteristics (1/2)
    Scalability
    The ability to horizontally scale simple-operation throughput over many servers
    BASE
    A “weaker” concurrency model than the ACID transactions in most SQL systems
  • 19. NoSQL characteristics (2/2)
    Distributed
    Efficient use of distributed indexes and RAM for data storage
    Schema-less
    The ability to dynamically define new attributes or data schema
  • 20. ACID (transactions)
    Atomicity – all or nothing
    Consistency – state integrity
    Isolation – no reads of uncommitted data
    Durability – recover committed transactions
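As a rough illustration of atomicity and consistency, the sketch below uses Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the account names are assumptions made up for this example; the slides do not show any code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

An overdraft attempt fails on the second `UPDATE`, and the rollback also undoes the first one, so no partial transfer is ever visible.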
  • 21. CAP theorem
    2000, Eric Brewer
    It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
    Consistency
    Availability
    Partition tolerance
  • 22. BASE (eventual consistency)
    Basically Available – partial system failures are OK
    Soft state – inconsistency is OK
    Eventual consistency – stale data is OK
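Soft state and eventual consistency can be illustrated with two in-memory replicas that exchange updates later. This is a toy sketch under assumptions: the `Replica` class, the explicit version numbers and the last-write-wins rule are choices made for this example, not something the slides prescribe.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: keep the entry with the higher version.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Anti-entropy pass: exchange entries so both replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("status", "online", version=1)
r2.write("status", "online", version=1)

r1.write("status", "away", version=2)  # r2 has not seen this update yet
stale = r2.read("status")              # soft state: r2 still serves old data

sync(r1, r2)                           # after some time the replicas converge
fresh = r2.read("status")
```

Between the write and the `sync` call a client reading from `r2` gets stale data, which is exactly the trade-off BASE accepts in exchange for availability.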
  • 23.
  • 24. NoSQL Databases
  • 25. NoSQL categories
    Key / value store
    Document database
    Graph database
    Columnar database
  • 26. Key / value store
    <key, value> or Tuple<key, v1, ..., vn>
    Simple operations
    Get
    Put
    Delete
    Key | Value
    Byte[] | Byte[]
  • 27. Key / value store
    Key | Value
    “current_date” | 2011.04.04
    “sergejusb” | Binary Object
    “sergejusb” | JSON Object
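The Get / Put / Delete interface over opaque byte arrays can be sketched in a few lines. The `KeyValueStore` class below is a hypothetical in-memory stand-in, not the API of any product named in this deck; the store never inspects the value, so the caller decides whether it is a date string, a binary object or JSON.

```python
import json


class KeyValueStore:
    """Minimal in-memory key / value store: keys and values are bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # True if something was actually removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put(b"current_date", b"2011.04.04")

# A JSON object is serialized by the client; the fields are illustrative.
profile = {"name": "Sergejus", "twitter": "@sergejusb"}
store.put(b"sergejusb", json.dumps(profile).encode("utf-8"))
```

Because values are opaque, there is no way to query by a field inside the JSON without reading and decoding it on the client side.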
  • 28. Key / value stores
    Redis
    (+) messaging
    (-) no shards
    Voldemort
    Membase
    (+) memcache interface
    Riak
  • 29. Document database
    Document == complex object
    XML
    YAML
    JSON / BSON
    Support for secondary indexes
    Schema can be defined at runtime
    Optional support for simple querying using Map / Reduce
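A document collection with a runtime-defined schema and a secondary index might look roughly like the sketch below. `DocumentCollection` and its methods are hypothetical names for this illustration and do not mirror the MongoDB or CouchDB APIs; indexed field values are assumed to be hashable.

```python
from collections import defaultdict


class DocumentCollection:
    def __init__(self):
        self._docs = {}     # doc id -> document (a plain dict)
        self._indexes = {}  # field -> {value -> set of doc ids}

    def create_index(self, field):
        """Build a secondary index over documents already stored."""
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def insert(self, doc_id, doc):
        if doc_id in self._docs:
            raise KeyError(f"duplicate document id {doc_id!r}")
        self._docs[doc_id] = doc
        # Keep every existing index up to date.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        if field in self._indexes:
            ids = self._indexes[field].get(value, set())
        else:
            # No index: fall back to a full scan.
            ids = [i for i, d in self._docs.items() if d.get(field) == value]
        return [self._docs[i] for i in sorted(ids)]


posts = DocumentCollection()
posts.insert(1, {"title": "Web Scale with NoSQL", "author": "sergejusb"})
posts.create_index("author")
# A new attribute ("tags") appears at runtime without any schema change.
posts.insert(2, {"title": "Hello", "author": "tdagys", "tags": ["nosql"]})
posts.insert(3, {"title": "More", "author": "sergejusb"})
```

The lookup by `author` uses the index, while a lookup by `title` would scan every document, which is why secondary indexes matter for this category.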
  • 30. Document databases
    MongoDB
    (+) shards
    CouchDB
    (+) master / master replication
  • 31. Graph database
    Graph == network
    Basic constructs
    Node
    Edge
    Properties
    [Diagram: nodes sergejus, tdagys and sergejus.blogas.lt, connected by edges labeled knows, reads and authors]
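The three constructs (node, edge, properties) can be sketched as a small in-memory graph. The `Graph` class is hypothetical, and the edge directions used below are an assumption: the transcript keeps only the labels from the diagram, not which nodes each edge connects.

```python
class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties dict
        self.edges = []  # (source id, label, target id, properties dict)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, label, target, properties))

    def neighbors(self, source, label):
        """Follow outgoing edges with the given label."""
        return [t for s, l, t, _ in self.edges if s == source and l == label]


g = Graph()
g.add_node("sergejus", kind="person")
g.add_node("tdagys", kind="person")
g.add_node("sergejus.blogas.lt", kind="blog")

# Assumed directions, for illustration only.
g.add_edge("sergejus", "authors", "sergejus.blogas.lt")
g.add_edge("tdagys", "reads", "sergejus.blogas.lt")
g.add_edge("sergejus", "knows", "tdagys")
g.add_edge("tdagys", "knows", "sergejus")
```

Traversals such as "which blogs are read by people sergejus knows" become a chain of `neighbors` calls rather than a series of relational joins.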
  • 32. Graph databases
    Neo4j
    (-) paid version required for scaling
    FlockDB
    (+) fast
    (-) limited functionality
  • 33. Columnar database
    For HUGE amounts of data
    Columns are added at runtime
    Great scalability
    Horizontal
    Vertical
  • 34. Columnar database
    Unusual data model
    Keyspace -> Database
    Column Family -> Table
    Columns and Super Columns
    Super Column -> array of Columns
    Column -> Tuple<Key, Value, Timestamp, TTL>
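The mapping above can be made concrete with a toy in-memory model: a keyspace holds column families, a column family holds rows, and each row holds columns carrying a timestamp and an optional TTL. This is only a sketch of the structure listed on the slide, not the Cassandra API; super columns are omitted, and the row keys and values are made up.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    key: str
    value: bytes
    timestamp: float
    ttl: Optional[float] = None  # seconds; None means it never expires

    def is_live(self, now: float) -> bool:
        return self.ttl is None or now < self.timestamp + self.ttl


class ColumnFamily:
    """Plays the role of a table: row key -> {column key -> Column}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, column: Column):
        row = self.rows.setdefault(row_key, {})
        existing = row.get(column.key)
        # The newest timestamp wins when the same column is written twice.
        if existing is None or column.timestamp >= existing.timestamp:
            row[column.key] = column

    def get_row(self, row_key, now=None):
        now = time.time() if now is None else now
        row = self.rows.get(row_key, {})
        return {k: c.value for k, c in sorted(row.items()) if c.is_live(now)}


# The keyspace plays the role of a database.
keyspace = {"Users": ColumnFamily()}
users = keyspace["Users"]
users.insert("row1", Column("email", b"old@example.com", timestamp=100.0))
users.insert("row1", Column("email", b"new@example.com", timestamp=200.0))
users.insert("row1", Column("session", b"abc", timestamp=200.0, ttl=60.0))
users.insert("row2", Column("city", b"somewhere", timestamp=150.0))
```

Rows in the same column family carry different columns (`row1` has `email` and `session`, `row2` only `city`), which is what adding columns at runtime means in practice.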
  • 35. Columnar database
    Simple column
  • 36. Columnar database
    Simple column
  • 37. Columnar database
    Cassandra
    (+) easily scalable
    HBase
    (+) consistent
    (+) part of Hadoop
    Hypertable
  • 38. NoSQL is Cool! But…
  • 39.
  • 40. NoSQL limitations
    ORDER BY ?
    Natural key order
    GROUP BY ?
    Map / Reduce*
    JOIN ?
    Multiple Map / Reduce*
    SELECT * ?
    Multi-machine Map / Reduce*
    *if possible
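To show what replacing GROUP BY with Map / Reduce involves, here is a single-process sketch of the pattern. The `map_reduce` helper and the sample `orders` data are assumptions for illustration; a real store would run the map and reduce phases across many machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    # Emitting results in sorted key order mirrors "natural key order".
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


# Hypothetical data; roughly
# SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer
orders = [
    {"customer": "tdagys", "total": 20},
    {"customer": "sergejusb", "total": 15},
    {"customer": "tdagys", "total": 5},
    {"customer": "sergejusb", "total": 25},
]


def by_customer(order):
    yield order["customer"], order["total"]


def sum_totals(customer, totals):
    return sum(totals)


totals = map_reduce(orders, by_customer, sum_totals)
```

A one-line SQL query turns into two user-written functions plus a shuffle step, and a JOIN would need several such passes chained together.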
  • 41. NoSQL Limitations
    Maturity
    Tooling
    Specificity
  • 42. SQL vs. NoSQL
    Choose the right tool for the task
    You can use BOTH
  • 43. Thank you!
    Sergejus Barinovas (@sergejusb)
    sergejus.barinovas@gmail.com
    http://sergejus.blogas.lt