Cassandra
Presentation Transcript

  • Cassandra
    Jahangir Mohammed
    md.jahangir27@gmail.com
  • What is Cassandra?
    Distributed data store
    O(1) DHT
    Column-oriented
    Dynamo + Big Table
  • Why not RDBMS?
    Many-to-many relationships -> Joins -> Denormalization -> Multiple copies of data or redundancy
    Rigid schema
    Vertical scaling is easier than horizontal
    ACID, Distributed transaction, Two-phase commit
    Slower writes
  • CAP Theorem
    Consistency – All clients read the same data at the same time.
    Availability – The service is always up and running.
    Partition tolerance – The system as a whole keeps operating despite network failures.
  • Features
    Proven
    Rich data model
    Scalable
    Distributed & Decentralized
    Cross datacenter support
    High performance writes/reads
    No SPOF
    Schema free
    Tunable consistency
  • Limitations
    No ACID transactions (if needed)
    Eventually consistent (tunable consistency, trade-off with performance)
  • ARCHITECTURE
    Ring
    Each node – unique token
    Tokens range from 0 to 2**127
    The MD5 hash of a key determines which node owns it
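  • RING
    [Figure: ring of six nodes, A to F, spanning token positions 0 to 1 (midpoint 1/2); h(key1) and h(key2) mark where two keys hash onto the ring; replication factor N=3]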
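
The ring lookup described above can be sketched in a few lines. This is a simplified illustration, not Cassandra's actual code: the names `token_for`, `replicas_for`, and `TOKEN_SPACE`, the six evenly spaced node tokens, and the reduction of the MD5 digest modulo 2**127 are all assumptions made for the example.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 127  # size of the token space assumed in this sketch


def token_for(key: str) -> int:
    """Map a row key to a token by MD5-hashing it (simplified)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for(key: str, ring: dict[int, str], n: int = 3) -> list[str]:
    """Return the n nodes that hold the key, walking the ring clockwise."""
    tokens = sorted(ring)
    # First node whose token is >= the key's token; wrap around past the end.
    start = bisect.bisect_left(tokens, token_for(key)) % len(tokens)
    count = min(n, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


# Six hypothetical nodes A..F with evenly spaced tokens.
ring = {i * TOKEN_SPACE // 6: name for i, name in enumerate("ABCDEF")}
print(replicas_for("key1", ring))  # three consecutive nodes on the ring
```

With N=3, each key lands on the first node at or after its token plus the next two nodes clockwise, which is the rack-unaware placement the later slides call Simple Strategy.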
  • ARCHITECTURE
    P2P:
    All nodes are identical
    No “master” node
    Gossip:
    Protocol for intra-ring communication
    Each node has state information about other nodes
    Anti-entropy & Read Repair:
    Replica synchronization mechanism
    Occurs during major compaction
    Uses Merkle trees
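
To show why Merkle trees help replica synchronization, here is a minimal sketch that reduces a replica's rows to one root hash, so two replicas can detect a difference by comparing a single value. The use of SHA-256, the row serialization, and the function names are assumptions for illustration; Cassandra's own tree construction differs in detail.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash leaves pairwise, level by level, until one root hash remains."""
    if not leaves:
        return _h(b"")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last hash
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Two hypothetical replicas; the second holds a stale value for k2.
replica_a = [b"k1=v1", b"k2=v2", b"k3=v3"]
replica_b = [b"k1=v1", b"k2=STALE", b"k3=v3"]
print(merkle_root(replica_a) == merkle_root(replica_b))  # False: replicas differ
```

When the roots differ, comparing child hashes level by level narrows the mismatch to a small range of rows, so only that range has to be exchanged.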
  • READ REPAIR
    [Figure: the client sends a query to the Cassandra cluster; the closest replica returns the result, while the other replicas answer digest queries with digest responses (replicas A, B, and C shown); read repair runs if the digests differ]
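
The read-repair flow in the figure can be sketched as follows. This is a toy model under stated assumptions: replicas are plain dictionaries, the first replica in the list stands in for the "closest" one, the digest is an MD5 of value and timestamp, and the repair happens inline rather than in the background. None of these names come from Cassandra itself.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    timestamp: int


def digest(v: Versioned) -> str:
    return hashlib.md5(f"{v.value}:{v.timestamp}".encode()).hexdigest()


def read_with_repair(key: str, replicas: list[dict[str, Versioned]]) -> str:
    """Read from the first ("closest") replica, compare digests, repair if needed."""
    closest, others = replicas[0], replicas[1:]
    result = closest[key]
    if any(digest(r[key]) != digest(result) for r in others):
        # Digests differ: pick the newest version and push it to every replica.
        newest = max((r[key] for r in replicas), key=lambda v: v.timestamp)
        for r in replicas:
            r[key] = newest
        result = newest
    return result.value


a = {"k": Versioned("old", 1)}
b = {"k": Versioned("new", 2)}
c = {"k": Versioned("old", 1)}
print(read_with_repair("k", [a, b, c]))  # new
```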
  • WRITE PATH
    Commit log: Every write is recorded here first
    Memtable: In-memory data structure, written after the commit log
    SSTable:
    Immutable table
    Memtable flushed to disk
  • WRITE PATH
    [Figure: a write for Key (CF1, CF2, CF3) is appended, binary serialized, to the commit log on a dedicated disk and applied to the per-column-family memtables (CF1 and CF2 shown); a memtable is flushed to a data file on disk based on data size, number of objects, and lifetime]
    Data file row layout:
    <Key name><Size of key Data><Index of columns/supercolumns><Serialized column family>
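
The write path above can be modelled with a toy class. This is a sketch under assumptions, not Cassandra's implementation: the class name `ColumnFamilyStore`, the in-memory lists standing in for on-disk files, and the flush trigger based only on the number of rows are all chosen for illustration.

```python
class ColumnFamilyStore:
    """Toy write path: commit log append, memtable update, flush to SSTable."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # row key -> {column name: value}, in memory
        self.sstables = []     # flushed tables, never modified afterwards
        self.flush_threshold = flush_threshold

    def write(self, key: str, column: str, value: str) -> None:
        self.commit_log.append((key, column, value))        # 1. durability first
        self.memtable.setdefault(key, {})[column] = value   # 2. then memory
        if len(self.memtable) >= self.flush_threshold:      # 3. flush when full
            self.flush()

    def flush(self) -> None:
        # SSTable: rows sorted by key, frozen as a tuple so it is never modified.
        rows = tuple((k, dict(self.memtable[k])) for k in sorted(self.memtable))
        self.sstables.append(rows)
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replay


store = ColumnFamilyStore(flush_threshold=2)
store.write("bob", "email", "bob@example.com")
store.write("alice", "email", "alice@example.com")
print(len(store.sstables), store.memtable)
```

Writes never touch existing SSTables, which is why they stay cheap: the only disk work on the write path is a sequential append to the commit log.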
  • READ PATH
  • ARCHITECTURE
    Bloom filter:
    Performance booster
    Fast, probabilistic data structure (false positives possible, never false negatives)
    In memory
    Used during read operation
    Tombstones:
    Deletion marker
    Soft delete
    Markers older than a configured grace period are garbage-collected
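
A minimal Bloom filter sketch shows how a read can skip SSTables that definitely do not contain a key. The bit-array size, the number of hash functions, and the way MD5 is salted to derive several positions are assumptions for this example; they are not Cassandra's parameters.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            h = hashlib.md5(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h, "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


bf = BloomFilter()
bf.add("row1")
print(bf.might_contain("row1"))  # True: an added key is never reported absent
```

A `False` answer lets the read path skip the disk lookup entirely; a `True` answer still requires checking the SSTable, because it may be a false positive.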
  • HINTED HANDOFF & COMPACTION
    Hinted Handoff:
    The node responsible for a write is down
    The coordinator stores a hint and replays the write when the node comes back
    Compaction:
    Merge SSTables.
    Keys merged
    Columns combined
    Tombstones discarded
    New index created
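
The compaction steps listed above can be sketched as a merge over dictionaries. This is a simplification under assumptions: each SSTable is modelled as a dict of rows, `TOMBSTONE` is a hypothetical sentinel, and every tombstone is discarded immediately, whereas a real system only drops markers that are older than the configured grace period.

```python
TOMBSTONE = object()  # sentinel marking a deleted column in this sketch


def compact(sstables):
    """Merge SSTables into one, with the newest timestamp winning per column.

    Each SSTable maps row key -> {column name: (value, timestamp)}.
    """
    merged = {}
    for table in sstables:
        for key, columns in table.items():
            row = merged.setdefault(key, {})
            for name, (value, ts) in columns.items():
                if name not in row or ts > row[name][1]:
                    row[name] = (value, ts)
    # Drop tombstoned columns, then empty rows, and emit keys in sorted order.
    result = {}
    for key in sorted(merged):
        live = {n: v for n, v in merged[key].items() if v[0] is not TOMBSTONE}
        if live:
            result[key] = live
    return result


old = {"bob": {"email": ("b@x.org", 1), "phone": ("555", 1)}}
new = {"bob": {"phone": (TOMBSTONE, 2)}, "al": {"email": ("a@x.org", 2)}}
print(compact([old, new]))
```

Here the deleted `phone` column disappears from the merged table, and the rows for `al` and `bob` end up in a single sorted structure from which a new index could be built.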
  • PARTITIONER
    Decides where a row key (and its data) is placed in the ring.
    Random Partitioner:
    MD5 hash
    Spreads keys evenly
    Inefficient range queries
    Order-Preserving Partitioner:
    Rows sorted by key
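
The difference between the two partitioners comes down to how a key is turned into a token. The sketch below uses hypothetical function names and treats the token of the order-preserving partitioner as the key itself, which is a simplification.

```python
import hashlib


def random_partitioner_token(key: str) -> int:
    """Token is the MD5 hash of the key, so neighbouring keys scatter."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def order_preserving_token(key: str) -> str:
    """Token is the key itself, so ring order equals key order."""
    return key


keys = ["user1", "user2", "user3", "user4"]
print(sorted(keys, key=random_partitioner_token))  # hash-determined order
print(sorted(keys, key=order_preserving_token))    # key order
```

Because hashing destroys key order, a range such as user1 to user3 is spread over the whole ring under the random partitioner, while the order-preserving partitioner keeps it contiguous at the cost of a less even spread of data.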
  • DATA MODEL
    Keyspace:
    Like a database.
    Container for CFs.
    Column Family:
    Like a table (but not exactly a relational database table).
    Container of rows.
    Row:
    Sorted collection of columns.
    Column:
    Basic unit of data structure.
    Triplet of name, value and timestamp
  • DATA MODEL
    Super Column:
    Special column.
    Sorted associative array of columns.
    Map of maps.
    Only one level deep.
    Super Column Family:
    Container of rows having super columns.
    4-D DHT = Standard CF:
    [Keyspace][ColumnFamily][Key][Column].
    5-D DHT = Super CF:
    [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn].
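
The 4-D and 5-D lookups above map naturally onto nested dictionaries. The keyspace, column family, and row names below are invented for illustration, and plain dicts do not keep columns sorted by name the way a real row does.

```python
# Each column is stored under its name as a (value, timestamp) pair.
store = {
    "MyApp": {                                   # keyspace
        "Users": {                               # standard column family
            "jsmith": {                          # row key
                "email": ("jsmith@example.com", 1700000000),
                "name": ("John Smith", 1700000000),
            }
        },
        "UserAddresses": {                       # super column family
            "jsmith": {                          # row key
                "home": {"city": ("Austin", 1700000000)},   # super column
                "work": {"city": ("Dallas", 1700000000)},
            }
        },
    }
}

# 4-D lookup: [Keyspace][ColumnFamily][Key][Column]
email, _ = store["MyApp"]["Users"]["jsmith"]["email"]

# 5-D lookup: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
city, _ = store["MyApp"]["UserAddresses"]["jsmith"]["home"]["city"]

print(email, city)  # jsmith@example.com Austin
```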
  • REPLICATION & CONSISTENCY
    Replication: Number of copies of data in the system.
    Consistency level: Number of replicas that must respond.
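
The interplay between replication factor and consistency level can be checked with simple arithmetic: a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the number of replicas. The function names below are hypothetical helpers written for this sketch.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when every read set intersects every write set (R + W > N)."""
    return r + w > n


n = 3
print(quorum(n))                                        # 2
print(read_overlaps_write(n, w=quorum(n), r=quorum(n)))  # True
print(read_overlaps_write(n, w=1, r=1))                  # False
```

Writing and reading at quorum gives overlapping replica sets, while writing and reading a single replica is faster but only eventually consistent, which is the trade-off the Limitations slide refers to.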
  • WRITE
  • READ
  • REPLICA PLACEMENT STRATEGY
    Simple Strategy:
    Rack-Unaware
    Fast
    Single D.C.
  • SIMPLE STRATEGY
  • OLD NETWORK TOPOLOGY STRATEGY (OLD NTS)
    Rack-aware
    Same D.C.
  • NETWORK TOPOLOGY STRATEGY (NTS)
    Rack-aware
    D.C. aware
  • IMAGE REFERENCES
    Nathan Hurst’s Blog
    http://2.bp.blogspot.com/_YGilJHLjrrI/TJy3K0wshLI/AAAAAAAAAOI/ogAvf8Ckq3k/s1600/cassandra-ring2.png
    SIGMOD presentation: Avinash et al., Facebook
    Datastax
    http://answers.oreilly.com/topic/2408-replica-placement-strategies-when-using-cassandra/
  • QUESTIONS?