4. Traditional Database
• Relational Database – MySQL, Oracle.
• Issues with Relational database
– Weak clustering technology
– Does not scale horizontally
• Adding one more node to a single-instance MySQL database server does not
double its performance.
– Strict data format – suitable for structured data only
• Why?
– Strict ACID rules in Relational Database
• Atomicity, Consistency, Isolation and Durability
• Under ACID rules, all data must be synchronized across every node in the cluster
before a transaction is completed
– This adds overhead to the database system, making linear scaling very hard to achieve
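The atomicity part of ACID can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the account names and the `transfer` helper are assumptions made up for this example; they do not come from the slides. The point is only that a failed transaction leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction (illustrative helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

# Hypothetical data for the sketch
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # would overdraw, so both updates are rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

On a single node this guarantee is cheap. The overhead described on the slide appears when the same all-or-nothing commit has to be coordinated across several machines.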
7. No-SQL Database
• Relaxed ACID property
• Distributed across multiple nodes
• Scaling is more important than perfect synchronization
• Semi-strict data format – suitable for unstructured data
• ACID vs. BASE
– ACID: Atomicity, Consistency, Isolation and Durability
– BASE: Basically available, soft-state, eventually consistent
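The "eventually consistent" part of BASE can be sketched with a toy in-memory store. The class names, the `sync` step and the key `cart:42` are hypothetical and exist only for this illustration; real NoSQL systems replicate in the background with far more machinery.

```python
class Replica:
    """One copy of the data (toy stand-in for a database node)."""
    def __init__(self):
        self.data = {}

class EventuallyConsistentStore:
    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = []  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # Acknowledge the write after updating a single replica;
        # the other replicas are brought up to date later.
        self.replicas[0].data[key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].data.get(key)

    def sync(self):
        # Apply queued updates in order; afterwards all replicas agree.
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("cart:42", ["book"])
stale = store.read("cart:42", replica=2)  # replica 2 has not seen the write yet
store.sync()
fresh = store.read("cart:42", replica=2)  # after syncing, replica 2 returns the value
```

The write returns without waiting for every replica, which is what lets the system scale, at the cost of a window in which some reads return old data.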
10. Key-Value Stores
• Distributed hash-table
– Key – alphanumeric; lookups are by key only
– Value – text, lists, sets or complex objects
– Example
• Redis (http://redis.io/)
• Voldemort (LinkedIn)
• Berkeley DB
• Riak
• DynamoDB from Amazon
– Usage
• User profiles
• Session data
• Product information
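A distributed hash table can be sketched as a set of per-node dictionaries, with a hash of the key deciding which node owns it. The `KeyValueCluster` class and the node names below are assumptions for illustration, not the API of Redis, Riak or any other store listed above. Simple modulo placement is used here for brevity; production systems typically use consistent hashing so that adding a node moves fewer keys.

```python
import hashlib

class KeyValueCluster:
    """Toy key-value store partitioned across named nodes (illustrative only)."""
    def __init__(self, node_names):
        self.names = sorted(node_names)
        self.nodes = {name: {} for name in self.names}  # node name -> local dict

    def node_for(self, key):
        # Hash the key and map it onto one of the nodes.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def put(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self.node_for(key)].get(key, default)

cluster = KeyValueCluster(["node-a", "node-b", "node-c"])
cluster.put("session:abc", {"user": "alice", "cart": ["book"]})
cluster.put("profile:alice", {"name": "Alice"})
session = cluster.get("session:abc")
missing = cluster.get("session:unknown")
```

Values are opaque to the store: it can fetch `session:abc` by key, but it cannot search for "all sessions belonging to alice" without scanning every value.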
11. Document Database
• Both keys and values are searchable
• Value – semi-structured data – (name, value) pairs
• Value columns may vary from row to row
– Different rows may have a different number and type of attributes
• Typical value – JSON, XML, BSON (Binary JSON)
• Example
– CouchDB (JSON)
• http://couchdb.apache.org/
– MongoDB (BSON)
• https://www.mongodb.org/
• Usage – storing and managing text documents,
email messages and XML documents
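The difference from a key-value store, namely that values are searchable and rows need not share the same attributes, can be sketched with plain Python dictionaries standing in for JSON documents. The sample documents and the `find` helper are hypothetical; they mimic the idea of a document query, not the actual CouchDB or MongoDB query API.

```python
# Three documents with different sets of attributes (made-up sample data)
documents = [
    {"_id": 1, "name": "Alice", "email": "alice@example.com"},
    {"_id": 2, "name": "Bob", "tags": ["admin", "ops"], "age": 41},
    {"_id": 3, "subject": "Hello", "body": "Meeting at 10", "tags": ["ops"]},
]

_MISSING = object()  # sentinel for "field not present in this document"

def find(docs, **criteria):
    """Return documents whose fields match every criterion.

    A list-valued field matches if it contains the wanted value.
    """
    results = []
    for doc in docs:
        for field, wanted in criteria.items():
            actual = doc.get(field, _MISSING)
            if isinstance(actual, list):
                matched = wanted in actual
            else:
                matched = actual == wanted
            if not matched:
                break
        else:
            results.append(doc)
    return results

ops_ids = [d["_id"] for d in find(documents, tags="ops")]
alice = find(documents, name="Alice")
ops_aged_41 = [d["_id"] for d in find(documents, tags="ops", age=41)]
```

Documents that lack a queried field are simply skipped, so no schema change is needed when a new attribute appears in only some rows.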
12. Column-Family Stores
• Key-Value pair
– Value – wide column
• Multiple (column, value) pairs
• Super column – a collection of columns
• Schema-less, so each "row"
can contain a different number of columns
• Column Family – analogous to a table
• Super Column Family / Super Column – a column
family within a column family
• Example –
– Google BigTable
• https://cloud.google.com/bigtable/docs/
– Cassandra
• http://www.datastax.com/
• http://cassandra.apache.org/
– DynamoDB (Amazon)
• http://aws.amazon.com/dynamodb/getting-started/
– HBase
• http://hbase.apache.org/
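The wide-column layout can be sketched as a nested mapping: row key to a dictionary of (column, value) pairs, where each row carries only the columns it actually uses. The `ColumnFamily` class, its methods and the sample rows are assumptions for illustration and do not reflect the Cassandra, HBase or Bigtable client APIs.

```python
from collections import defaultdict

class ColumnFamily:
    """Toy column family: row key -> {column name: value} (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(dict)

    def insert(self, row_key, column, value):
        # Columns are created on first write; there is no fixed schema.
        self.rows[row_key][column] = value

    def get_row(self, row_key):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self.rows.get(row_key, {}))

    def get_column(self, row_key, column, default=None):
        return self.rows.get(row_key, {}).get(column, default)

users = ColumnFamily("users")
users.insert("u1", "name", "Alice")
users.insert("u1", "email", "alice@example.com")
users.insert("u1", "city", "Oslo")
users.insert("u2", "name", "Bob")  # this row has only one column

u1 = users.get_row("u1")
u2 = users.get_row("u2")
no_email = users.get_column("u2", "email")
```

A super column would add one more level of nesting, with a column name mapping to another dictionary of columns.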
16. Node
[Figure: three storage configurations]
• Single node computing with a single large disk
• Single node computing with multiple disks in RAID
• Multiple node computing with multiple disks in a distributed file system