NoSQL - what's that
Overview of NoSQL in general, its types and available most pop


Usage Rights

© All Rights Reserved

  • Atomicity. All of the operations in the transaction will complete, or none will.
    Consistency. The database will be in a consistent state when the transaction begins and ends.
    Isolation. The transaction will behave as if it is the only operation being performed upon the database.
    Durability. Upon completion of the transaction, the operation will not be reversed.
  • Consistency. The client perceives that a set of operations has occurred all at once.
    Availability. Every operation must terminate in an intended response.
    Partition tolerance. Operations will complete, even if individual components are unavailable.
    http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  • Basically Available. Supporting partial failures without total system failure.
    Soft state. The state can be inconsistent for a given period of time.
    Eventual consistency. After some time all replicas will have consistent data.
    For a given accepted update and a given replica, eventually either the update reaches the replica or the replica retires from service.
  • http://labs.google.com/papers/bigtable.html
    http://labs.google.com/papers/gfs.html
    http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

NoSQL - what's that: Presentation Transcript

  • NoSQL – What’s that?
    Sergejus Barinovas | Microsoft MVP
    @sergejusb, sergejus.blogas.lt
  • NoSQL
  • WHY?
    • Limited SQL scalability
    Horizontal partitioning (sharding)
    Vertical partitioning
    NoSQL – Why?
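
A minimal sketch of horizontal partitioning (sharding), assuming a simple hash-based routing scheme. The shard names, the `shard_for`, `put` and `get` helpers and the sample row are illustrative assumptions, not something taken from the slides.

```python
import hashlib

# Hypothetical shard names; a real deployment would list database servers here.
SHARDS = ["shard-0", "shard-1", "shard-2"]

def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key (stable across processes, unlike hash())."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# Each shard holds only the rows whose key maps to it.
stores = {name: {} for name in SHARDS}

def put(key, row):
    stores[shard_for(key)][key] = row

def get(key):
    return stores[shard_for(key)].get(key)

put("sergejusb", {"blog": "sergejus.blogas.lt"})
print(shard_for("sergejusb"), get("sergejusb"))
```

Simple modulo routing like this forces most keys to move when the number of shards changes, which is one reason manual sharding of a relational database is hard to operate.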
    • Limited SQL availability
    Master / slave configuration
    NoSQL – Why?
    • SQL limitations for storing huge amounts of data
    Key / value / type columns
    NoSQL – Why?
    • Limited SQL speed of read/write operations
    Multiple read replicas
    NoSQL – Why?
    • 2009, Eric Evans
    • NoSQL – open source distributed databases, not relational SQL databases
    • NoSQL – not only SQL
    • NoSQL->Big Data
    NoSQL History
    • The ability to horizontally scale simple-operation throughput over many servers
    NoSQL Characteristics (scalability)
    • A “weaker” concurrency model than the ACID transactions in most SQL systems
    NoSQL Characteristics (BASE)
    • Efficient use of distributed indexes and RAM for data storage
    NoSQL Characteristics (distributed)
    • The ability to dynamically define new attributes or data schema
    NoSQL Characteristics (schema-less)
    • Atomicity – all or nothing
    • Consistency – state integrity
    • Isolation – no reads of uncommitted data
    • Durability – committed transactions can be recovered
    ACID (transactions)
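
To make "all or nothing" concrete, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, its `CHECK` constraint and the `transfer` helper are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint when src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "a", "b", 30)        # both updates are applied
rejected = transfer(conn, "a", "b", 500)  # the credit to 'b' is rolled back too
print(ok, rejected, balances(conn))
```

The failed transfer leaves no trace: the credit that was already applied inside the transaction is undone together with the rejected debit.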
    • 2000, Eric Brewer
    It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
    • Consistency
    • Availability
    • Partition tolerance
    CAP Theorem
    • Basically Available – partial system failures are OK
    • Soft state – inconsistency is OK
    • Eventual consistency – stale data is OK
    BASE (eventual consistency)
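
The following toy simulation sketches soft state and eventual consistency with two in-memory replicas. The `Replica` class, the last-write-wins rule and the explicit `sync` step are simplifying assumptions; real systems use more elaborate conflict resolution and background replication.

```python
class Replica:
    """One copy of the data; each key maps to a (timestamp, value) pair."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Last-write-wins: keep the version with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(a, b):
    """Anti-entropy pass: exchange versions so both replicas converge."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)

r1, r2 = Replica(), Replica()
r1.write("current_date", "2011.01.15", timestamp=1)
sync(r1, r2)
r1.write("current_date", "2011.01.16", timestamp=2)  # accepted by r1 only
stale = r2.read("current_date")  # r2 still serves the old value (soft state)
sync(r1, r2)
fresh = r2.read("current_date")  # after syncing, the replicas agree
print(stale, fresh)
```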
  • NoSQL Databases
    • Key / value store
    • Document database
    • Graph database
    • Columnar database
    NoSQL Categories
    • <key, value> or Tuple<key, v1, ..., vn>
    • Simple operations
    Get
    Put
    Delete
    Key / value store
    Key | Value
    Byte[] | Byte[]
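
The Get / Put / Delete interface over opaque byte arrays can be sketched as a thin wrapper around a dictionary. The `KeyValueStore` class is a hypothetical in-memory stand-in, not the API of any product listed in this deck.

```python
import json

class KeyValueStore:
    """Toy in-memory store whose keys and values are opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        # Returns True when the key existed.
        return self._data.pop(key, None) is not None

store = KeyValueStore()
store.put(b"current_date", b"2011.01.16")
# The store never inspects values, so structured data must be serialized first.
store.put(b"sergejusb", json.dumps({"blog": "sergejus.blogas.lt"}).encode("utf-8"))
profile = json.loads(store.get(b"sergejusb").decode("utf-8"))
removed = store.delete(b"current_date")
print(profile, removed)
```

Because the store treats values as bytes, any querying by value content has to happen in the client, which is the main trade-off of this model.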
  • Key / value store
    Key | Value
    “current_date” | 2011.01.16
    “sergejusb” | Binary Object
    “sergejusb” | JSON Object
    • Dynamo*
    • Membase
    • Voldemort
    • Redis
    • Azure Table Storage
    • Riak
    Key / value store
  • Name: Dynamo
    Created: 2007, Amazon (proprietary)
    Implementation: ?
    Distributed: Yes
    Replication: Multiple Servers
    CAP: AP
    API: ?
    Key / value store
  • Name: Membase
    Created: 2010, sponsored by Zynga
    Implementation: C / C++ / Erlang
    Distributed: Yes
    Replication: Multiple Servers
    CAP: CP
    API: Memcached API, JSON
    Key / value store
  • Name: Voldemort
    Created: 2008, LinkedIn
    Implementation: Java
    Distributed: Yes
    Replication: Multiple Servers
    CAP: AP
    API: Java
    Key / value store
  • Name: Redis
    Created: 2009, sponsored by VMware
    Implementation: C
    Distributed: No
    Replication: Master / Slave
    CAP: CP
    API: Various Languages
    Key / value store
  • Name: Azure Table Storage
    Created: 2008, Microsoft
    Implementation: ?
    Distributed: Yes
    Replication: Multiple Servers (DFS)
    CAP: CP
    API: .NET API, JSON
    Key / value store
  • Name: Riak
    Created: 2008, Basho (from Akamai)
    Implementation: Erlang
    Distributed: Yes
    Replication: Multiple Servers
    CAP: AP
    API: JSON
    Key / value store
    • Document == complex object
    XML
    YAML
    JSON / BSON
    • Support for secondary indexes
    • Schema can be defined at runtime
    • Optional support for simple querying using Map / Reduce
    Document database
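
As a rough illustration of schema-less documents with a secondary index, the sketch below stores nested dictionaries and indexes one field. `DocumentStore`, its methods and the sample documents are assumptions for this example and do not mirror the API of MongoDB, CouchDB or RavenDB.

```python
from collections import defaultdict

class DocumentStore:
    def __init__(self):
        self.docs = {}     # document id -> document (a nested dict)
        self.indexes = {}  # field name -> {field value -> set of document ids}

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self.indexes[field] = index

    def insert(self, doc_id, doc):
        # Simplification: assumes doc_id is new, so no stale index entries.
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        if field in self.indexes:  # indexed lookup
            ids = self.indexes[field].get(value, set())
            return [self.docs[i] for i in sorted(ids)]
        # Fallback: full scan over every document.
        return [d for _, d in sorted(self.docs.items()) if d.get(field) == value]

db = DocumentStore()
db.create_index("type")
db.insert("d1", {"type": "post", "title": "NoSQL", "tags": ["db"]})
db.insert("d2", {"type": "user", "login": "sergejusb"})  # different shape, no schema change
posts = db.find("type", "post")
users = db.find("login", "sergejusb")
print(posts, users)
```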
    • MongoDB
    • CouchDB
    • RavenDB
    Document database
  • Name: MongoDB
    Created: 2008, 10gen
    Implementation: C++
    Distributed: Yes via Shards
    Replication: Master / Slave
    CAP: CP
    API: BSON
    Document database
  • Name: CouchDB
    Created: 2005
    Implementation: Erlang
    Distributed: Sort of
    Replication: Master / Master
    CAP: AP
    API: JSON
    Document database
  • Name: RavenDB
    Created: 2010, Ayende Rahien
    Implementation: C#
    Distributed: Yes via Shards
    Replication: Master / Master
    CAP: AP
    API: .NET API, JSON
    Document database
    • Graph == network
    • Basic constructs
    Node
    Edge
    Properties
    Graph database
    [Diagram: nodes sergejus, tdagys and sergejus.blogas.lt, connected by edges labeled knows, authors and reads]
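
The node / edge / properties model can be sketched with plain Python collections. The edge directions below (who authors, who reads, who knows whom) and the `kind` property are assumptions, since the transcript keeps only the labels from the diagram; the `Graph` class is likewise hypothetical.

```python
class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties dict
        self.edges = []  # (source id, label, target id, properties dict)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, label, dst, **props):
        self.edges.append((src, label, dst, props))

    def neighbors(self, node_id, label):
        """Targets of outgoing edges with the given label."""
        return [dst for src, lbl, dst, _ in self.edges
                if src == node_id and lbl == label]

g = Graph()
g.add_node("sergejus", kind="person")
g.add_node("tdagys", kind="person")
g.add_node("sergejus.blogas.lt", kind="blog")
g.add_edge("sergejus", "authors", "sergejus.blogas.lt")  # assumed direction
g.add_edge("tdagys", "reads", "sergejus.blogas.lt")      # assumed direction
g.add_edge("sergejus", "knows", "tdagys")
g.add_edge("tdagys", "knows", "sergejus")

# Traversal: which blogs do the people sergejus knows read?
read_by_friends = [blog
                   for friend in g.neighbors("sergejus", "knows")
                   for blog in g.neighbors(friend, "reads")]
print(read_by_friends)
```

A query like this follows edges directly, whereas a relational schema would need one join per hop.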
    • FlockDB
    • Neo4J
    Graph database
  • Name: FlockDB
    Created: 2010, Twitter
    Implementation: Scala
    Distributed: Yes
    Replication: Multiple Servers
    CAP: AP
    API: Thrift, Ruby
    Graph database
  • Name: Neo4J
    Created: 2003, Neo Technology
    Implementation: Java
    Distributed: No
    Replication: Master / Slave
    CAP: CP
    API: JSON, Various Languages
    Graph database
    • For HUGE amounts of data
    • Columns are added at runtime
    • Great scalability
    Horizontal
    Vertical
    Columnar database
    • Unusual data model
    Key Space == Database
    Column Family == Table
    Columns and Super Columns
    Super Column == array of Columns
    Column == Tuple<Key, Value, Timestamp, TTL>
    Columnar database
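
The data model above can be written down as a few Python types. This is only a sketch: the `Users` column family, the timestamp unit (seconds) and the newest-timestamp-wins rule in `insert` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Column:
    key: str
    value: bytes
    timestamp: int              # used to pick the newest version of a column
    ttl: Optional[int] = None   # seconds to live; None means no expiry

    def expired(self, now):
        return self.ttl is not None and now >= self.timestamp + self.ttl

@dataclass
class SuperColumn:
    key: str
    columns: List[Column]       # Super Column == array of Columns

# Column Family == Table: row key -> {column name -> Column}
ColumnFamily = Dict[str, Dict[str, Column]]
# Key Space == Database: column family name -> ColumnFamily
keyspace: Dict[str, ColumnFamily] = {"Users": {}}

def insert(cf, row_key, col):
    """Add a column to a row; the newest timestamp wins on conflict."""
    row = cf.setdefault(row_key, {})
    existing = row.get(col.key)
    if existing is None or col.timestamp >= existing.timestamp:
        row[col.key] = col

users = keyspace["Users"]
insert(users, "sergejusb", Column("blog", b"sergejus.blogas.lt", timestamp=100))
insert(users, "sergejusb", Column("blog", b"old-value", timestamp=50))  # ignored: older
# Rows in the same family may carry different columns, added at runtime.
insert(users, "tdagys", Column("session", b"abc", timestamp=100, ttl=60))
print(users["sergejusb"]["blog"].value, users["tdagys"]["session"].expired(now=160))
```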
  • Columnar database
    • Simple Column
  • Columnar database
    • Super Column
    • BigTable*
    • Cassandra
    • HBase
    • Hypertable
    Columnar database
  • Name: BigTable
    Created: 2006, Google
    Implementation: C++
    Distributed: Yes
    Replication: Multiple Servers (GFS)
    CAP: CP
    API: C++
    Columnar database
  • Name: Cassandra
    Created: 2008, Facebook
    Implementation: Java
    Distributed: Yes
    Replication: Multiple Servers
    CAP: AP
    API: Thrift, Avro
    Columnar database
  • Name: HBase
    Created: 2007, Powerset
    Implementation: Java
    Distributed: Yes
    Replication: Multiple Servers (HDFS)
    CAP: CP
    API: Thrift, Java, JSON
    Columnar database
  • Name: Hypertable
    Created: 2007, Zvents
    Implementation: C
    Distributed: Yes
    Replication: Multiple Servers
    CAP: CP
    API: Thrift
    Columnar database
    • ORDER BY ?
    “Natural Key Order”
    NoSQL Limitations
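
Without ORDER BY, sorted results usually come from how the keys themselves are laid out. The sketch below assumes a store that keeps keys in sorted order and supports range scans; `OrderedStore` and the date-shaped keys are hypothetical.

```python
import bisect

class OrderedStore:
    """Keeps keys sorted so range scans come back in natural key order."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]

store = OrderedStore()
store.put("2011-01-16", "c")
store.put("2011-01-14", "a")
store.put("2011-01-15", "b")
rows = store.scan("2011-01-14", "2011-01-16")
print(rows)
```

The consequence is that the sort order must be decided when the key is designed; sorting by any other attribute needs a second copy of the data keyed differently.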
    • GROUP BY ?
    Map / Reduce
    NoSQL Limitations
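
A GROUP BY can be expressed as one map step that emits a grouping key and one reduce step that aggregates per key. The single-process `map_reduce` helper and the `visits` records below are made up for this sketch; a real system would run the same two functions across many machines.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}

visits = [
    {"user": "sergejus", "page": "/nosql"},
    {"user": "tdagys", "page": "/nosql"},
    {"user": "sergejus", "page": "/cap"},
]

# Roughly: SELECT user, COUNT(*) FROM visits GROUP BY user
counts = map_reduce(
    visits,
    mapper=lambda r: [(r["user"], 1)],
    reducer=lambda key, values: sum(values),
)
print(counts)
```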
    • JOIN ?
    Multiple Map / Reduce
    NoSQL Limitations
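
One common way to emulate a JOIN is a reduce-side join: tag each record with its source, group both sources by the join key, and pair them up in the reducer. The sketch below then chains a second pass over the joined rows. The helper, the `users` and `posts` records and their field names are assumptions for illustration.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

users = [{"id": 1, "name": "sergejus"}, {"id": 2, "name": "tdagys"}]
posts = [{"author_id": 1, "title": "NoSQL"}, {"author_id": 1, "title": "CAP"}]

def tag_by_join_key(record):
    """Emit (join key, tagged value) so both sources meet in one group."""
    source, row = record
    if source == "users":
        yield row["id"], ("user", row["name"])
    else:
        yield row["author_id"], ("post", row["title"])

def pair_up(key, values):
    names = [v for tag, v in values if tag == "user"]
    titles = [v for tag, v in values if tag == "post"]
    return [(name, title) for name in names for title in titles]

# Pass 1: inner join of users and posts on users.id = posts.author_id
tagged = [("users", u) for u in users] + [("posts", p) for p in posts]
joined = map_reduce(tagged, tag_by_join_key, pair_up)
pairs = [pair for rows in joined.values() for pair in rows]

# Pass 2: aggregate over the joined rows (posts per author name)
posts_per_author = map_reduce(
    pairs, lambda p: [(p[0], 1)], lambda key, values: sum(values)
)
print(pairs, posts_per_author)
```

Each extra join or aggregation adds another pass over the data, which is why queries that are one line of SQL can become multi-stage jobs.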
    • SELECT * ?
    Multi-Machine Map / Reduce
    NoSQL Limitations
    • Maturity
    • Tooling
    • Specificity
    NoSQL Limitations
    • Choose the right tool for the task
    • You can use BOTH
    SQL vs. NoSQL
  • Q & A