Cassandra Presentation for San Antonio JUG
 

1 April 2010 given to the San Antonio Java User Group by Gary Dusbabek

  • Hello World
  • RandomPartitioner: takes the key, uses its MD5 hash as the real key, then stores the row on the appropriate node.
    OrderPreservingPartitioner: gives cheap range scans, but takes more work.
  • Eric Brewer
  • Need to describe hinted handoff better.
  • Keyspace == like a namespace
    CF == like a table
    Keyspace and Table are used interchangeably in the code.
  • Key cache: keys whose locations are kept in memory to avoid an index scan.
    Row cache: entire rows kept in memory.
  • Avro: Doug Cutting
  • Mmap – index and data files (read only)
  • Goal is low pause times and high throughput. References:
    java.sun.com/performance/reference/whitepapers/tuning.html
    http://java.sun.com/javase/technologies/hotspot/gc/index.jsp
    -XX:TargetSurvivorRatio=90: allows 90% of the survivor spaces to be occupied instead of the default 50%, giving better utilization of the survivor space memory.
    -XX:SurvivorRatio=128: sets the survivor space ratio to 1:128, resulting in small survivor spaces. With smaller survivor spaces, short-lived objects spend less time in the young generation (they die faster).
    -XX:+AggressiveOpts: turns on point optimizations that are expected to be on in later releases. Experimental, and sometimes reveals JDK bugs.
    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC: parallel young generation collector. Similar to -XX:+UseParallelGC, except that it can be used with the concurrent collector. The benefits show on multiway systems. Two pauses instead of one long pause (mark, then sweep). First (mark): directly reachable objects (young). Second (the remark): objects missed due to concurrent execution of threads.
    -XX:+CMSParallelRemarkEnabled: works with UseParNewGC to decrease the remark pauses.

Presentation Transcript

  • MySQL for Beginners
    Gary Dusbabek
    Rackspace
    April Fools!!!11
  • Apache
    Gary Dusbabek
    Rackspace
  • What is Cassandra?
    Key-value store (with some structure)
    Highly scalable
    Eventually consistent
    Distributed
    Tunable
    Partitioning
    Replication
  • Where did it come from?
    Created at Facebook
    Dynamo: distribution architecture
    BigTable: data model
    Open-sourced in 2008
    Apache incubator in early 2009
    Graduation in March 2010
  • Who uses it?
    Rackspace
    Facebook (of course)
    Twitter
    Digg
    Reddit
    IBM
    Others…
  • What problems does it solve?
    Reliability at scale
    No single point of failure (all nodes are identical)
    Simple scaling
    linear
    High write throughput
    Large data sets
  • What problems can’t it solve?
    No flexible indices
    No querying on non-PK values
    Not good for big binary data (>64 MB) unless you chunk
    Row contents must fit in available memory
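
The "unless you chunk" workaround can be sketched as below. This is a minimal illustration, not Cassandra's API: the helper names `chunk_blob` and `reassemble` and the `chunk-000000` column naming scheme are assumptions made for the example.

```python
from typing import Dict


def chunk_blob(blob: bytes, chunk_size: int) -> Dict[str, bytes]:
    """Split a large value into fixed-size pieces, one per column.

    Column names are zero-padded so that they sort in chunk order
    (hypothetical naming scheme, chosen for this example only).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        "chunk-%06d" % (i // chunk_size): blob[i:i + chunk_size]
        for i in range(0, len(blob), chunk_size)
    }


def reassemble(columns: Dict[str, bytes]) -> bytes:
    """Concatenate the chunks back together in column-name order."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each chunk would then be written as its own column (or its own row), so that no single value exceeds the size limit.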
  • Concepts: CAP
    CAP Theorem
    Consistency
    Availability
    Partition tolerance
    Choose two
    Cassandra chooses A and P, but lets you tune for more C.
  • Concepts: Denormalization
    Ditch joins
    Duplicate data
    Structure data around queries
    Normalized
    Denormalized
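
A toy illustration of "structure data around queries", using plain Python dicts rather than a real store. The `users` and `posts` data and both function names are hypothetical.

```python
# Normalized: each fact is stored once, and a read has to "join".
users = {1: "alice", 2: "bob"}
posts = [
    {"id": 10, "author_id": 1, "text": "hello"},
    {"id": 11, "author_id": 2, "text": "hi there"},
    {"id": 12, "author_id": 1, "text": "bye"},
]


def posts_by_author_normalized(name):
    """Join at read time: find the user id, then scan all posts."""
    ids = {uid for uid, uname in users.items() if uname == name}
    return [p["text"] for p in posts if p["author_id"] in ids]


# Denormalized: one row per query, keyed by what we will look up.
# The post text is duplicated into a row named after its author.
posts_by_author = {}
for p in posts:
    row = posts_by_author.setdefault(users[p["author_id"]], {})
    row[p["id"]] = p["text"]


def posts_by_author_denormalized(name):
    """Single key lookup, no join."""
    return list(posts_by_author.get(name, {}).values())
```

The price of the cheap read is that every write has to update each duplicated copy.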
  • Concepts: Replication & Consistency
    You specify replication factor
    You specify consistency level for read/write operations
    ZERO, ONE, QUORUM, ALL, ANY
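
How replication factor and consistency level interact can be worked out with a little arithmetic. The sketch below is simplified: it covers only ONE, QUORUM and ALL (ZERO and ANY are left out), and the function names are invented for the example. The rule it checks is the usual one: a read is guaranteed to see the latest write when the replicas read plus the replicas written exceed the replication factor.

```python
def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replica responses to wait for at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError("level not modelled in this sketch: %s" % level)


def read_sees_latest_write(read_level, write_level, replication_factor):
    """True when the read set and the write set must share a replica."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With RF=3, QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), while ONE plus ONE does not (1 + 1 <= 3), which is where the "eventually" in eventually consistent comes from.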
  • Ring Topology
    Storage ring
    Every node gets a token
    Defines its place in the storage ring
    And which keys it is responsible for (its ranges)
    [Diagram: storage ring with nodes a, d, g, j; RF=3]
  • Ring Topology
    Storage ring
    Every node gets a token
    Defines its place in the storage ring
    And which keys it is responsible for (its ranges)
    [Diagram: storage ring with nodes a, d, g, j; RF=2]
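
A minimal sketch of how tokens map keys to nodes, assuming the four nodes in the diagrams hold the tokens a, d, g and j, and that replicas are simply the next nodes clockwise. `replicas_for` is a hypothetical helper, not part of Cassandra. Keys are compared directly against tokens here, as an order-preserving partitioner would; a RandomPartitioner would compare the MD5 hash of the key instead.

```python
from bisect import bisect_left


def replicas_for(key, tokens, rf):
    """Return the tokens of the nodes that store `key`.

    A node with token T owns the range (previous token, T], so the
    primary is the first token >= key, wrapping around the ring.
    The remaining rf - 1 replicas are the next nodes clockwise.
    """
    ring = sorted(tokens)
    start = bisect_left(ring, key) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

For example, `replicas_for("e", ["a", "d", "g", "j"], 3)` gives `["g", "j", "a"]`, and with `rf=2` it gives `["g", "j"]`.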
  • Ring: New Node
    New node
    Ranges are adjusted
    [Diagram: storage ring with new node m joining nodes a, d, g, j; RF=3]
  • Ring: New Node
    New node
    Ranges are adjusted
    [Diagram: storage ring with new node m joining nodes a, d, g, j; RF=2]
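
"Ranges are adjusted" can be made concrete with a small sketch. It assumes the existing tokens are a, d, g and j and that the new node takes token m, and it tracks primary ranges only (with RF greater than 1 the replica ranges shift as well). Both helper names are hypothetical.

```python
def primary_ranges(tokens):
    """Map each node token to the (start, end] range it owns."""
    ring = sorted(tokens)
    # ring[i - 1] wraps to the last token when i == 0.
    return {tok: (ring[i - 1], tok) for i, tok in enumerate(ring)}


def changed_ranges(old_tokens, new_tokens):
    """Ranges that differ after the ring changes, as token -> (old, new)."""
    old = primary_ranges(old_tokens)
    new = primary_ranges(new_tokens)
    return {
        tok: (old.get(tok), rng)
        for tok, rng in new.items()
        if old.get(tok) != rng
    }
```

Adding m touches only two entries: m takes over (j, m], and a shrinks from (j, a] to (m, a]. Nodes d, g and j keep their primary ranges.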
  • Ring Partition
    Node dies or becomes isolated from the ring
    Hints
    Handoff
    [Diagram: storage ring with nodes a, d, g, j, m; RF=3]
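
The slide only names hints and handoff, so the following is a toy model of the general idea rather than Cassandra's actual mechanism: a write that cannot reach a replica is remembered as a hint on a live node and replayed once the replica is back. The `Node` class and both functions are invented for this sketch.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value
        self.hints = []   # (target node, key, value) held for down nodes


def write(replicas, key, value):
    """Write to every live replica; keep a hint for each one that is down."""
    live = [n for n in replicas if n.up]
    if not live:
        raise RuntimeError("no live replica to accept the write")
    for node in live:
        node.data[key] = value
    for node in replicas:
        if not node.up:
            # Assumption: the first live replica holds the hint.
            live[0].hints.append((node, key, value))


def deliver_hints(holder):
    """Hand off stored hints to targets that are reachable again."""
    remaining = []
    for target, key, value in holder.hints:
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining
```

In this model a node that was isolated from the ring catches up on missed writes without the client having to retry them.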
  • Data Model
    Keyspace: contains column families
    ColumnFamily
    Standard or Super
    Two levels of indexes (key and column name)
  • Data Model
    Column and subcolumn sorting
    Specify your own comparator:
    TimeUUID
    LexicalUUID
    UTF8
    Long
    Bytes
    CreateYourOwn
  • Data Model
    Standard Column Family
  • Data Model
    Super Column Family
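
One way to picture the data model is as nested maps. The sketch below uses plain dicts with made-up names and values (`Users`, `Timeline`, `gary`), and models a comparator as a Python sort-key function, which is a simplification of the comparator types listed above.

```python
# keyspace -> column family -> row key -> column name -> value
keyspace = {
    "Users": {                      # standard column family
        "gary": {"email": "gary@example.com", "city": "San Antonio"},
    },
    "Timeline": {                   # super column family
        "gary": {                   # one extra level: super column -> subcolumns
            "2010-04-01": {"title": "JUG talk", "topic": "Cassandra"},
        },
    },
}


def sorted_columns(row, comparator=str):
    """Return (name, value) pairs ordered by the column comparator.

    `comparator` converts a column name into its sort key: `str` gives
    UTF8-like ordering, `int` gives Long-like ordering.
    """
    return sorted(row.items(), key=lambda item: comparator(item[0]))
```

The comparator matters: the column names "9" and "10" sort as "10", "9" under `str` but as "9", "10" under `int`.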
  • Inserting: Overview
    Simple: put(key, col, value)
    Complex: put(key, [col:value, …, col:value])
    Batch: multi key.
  • Inserting: Writes
    Commit log for durability
    Memtable – no disk access (no reads or seeks)
    SSTables are final (become read-only)
    Index
    Bloom filter
    Raw data
    Atomic within a ColumnFamily
    Bottom line: FAST!!!
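
The write path above can be sketched as a toy in-memory model. `ToyStore` and its flush threshold are assumptions made for illustration; the index and bloom filter that accompany a real SSTable are left out, and clearing the commit log on flush is a simplification.

```python
class ToyStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record, for durability
        self.memtable = {}     # in-memory rows, absorbs writes
        self.sstables = []     # immutable, sorted snapshots
        self.memtable_limit = memtable_limit

    def put(self, key, col, value):
        """Append to the commit log, then update the memtable."""
        self.commit_log.append((key, col, value))
        self.memtable.setdefault(key, {})[col] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a read-only SSTable and start afresh."""
        frozen = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(frozen)
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed data no longer needs the log
```

A write never reads or seeks: it is one sequential append plus an in-memory update, which is why writes are fast.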
  • Querying: Overview
    You need a key or keys:
    Single: key=‘a’
    Range: key=‘a’ through ‘f’
    And columns to retrieve:
    Slice: cols={bar through kite}
    By name: key=‘b’ cols={bar, cat, llama}
    Nothing like SQL “WHERE col=‘faz’”
    But secondary indices are being worked on (see CASSANDRA-749)
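
The three access patterns can be mimicked over an in-memory dict. The data and function names below are hypothetical, bounds are assumed to be inclusive, and this is not the Thrift API.

```python
from bisect import bisect_left, bisect_right

rows = {
    "a": {"ant": 1, "bar": 2, "cat": 3, "kite": 4, "llama": 5},
    "b": {"bar": 6, "cat": 7, "llama": 8, "zebra": 9},
    "g": {"bar": 10},
}


def get_slice(key, start, finish):
    """Columns of one row whose names fall in [start, finish]."""
    names = sorted(rows[key])
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return {name: rows[key][name] for name in names[lo:hi]}


def get_by_name(key, names):
    """Only the named columns of one row."""
    row = rows[key]
    return {name: row[name] for name in names if name in row}


def get_key_range(start_key, end_key):
    """Row keys in [start_key, end_key]."""
    return [k for k in sorted(rows) if start_key <= k <= end_key]
```

Every call starts from a key or a key range; there is no way to ask for "all rows where some column equals a value" without scanning.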
  • Querying: Reads
    Not as fast as writes
    Read repair when out of sync
    New in 0.6:
    Row cache (avoid sstable lookup)
    Key cache (avoid index scan)
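
Read repair can be sketched as follows, assuming each replica stores a (timestamp, value) pair per key and that the newest timestamp wins. The function is a hypothetical illustration of the idea, not Cassandra's implementation.

```python
def read_with_repair(replicas, key):
    """Read `key` from every replica, return the newest value, and
    overwrite any replica that holds an older version or none at all.

    Each replica is a dict mapping key -> (timestamp, value).
    """
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the out-of-sync replica
    return newest[1]
```

After one such read the replicas agree again, which is part of why reads cost more than writes.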
  • Client API (Low level)
    Fat Client
    Maybe too low level, not well-tested
    Thrift (currently best-supported)
    Many language bindings
    Not much of a community
    No streaming
    Fast transport
    Avro
    Just getting started
    Shows promise
  • Client API (High Level)
    Rapidly changing, getting feature-rich
    Connection pools
    Load balancing/Failover
    Reduces the verbosity of working with thrift
    For Java, see Hector
    http://github.com/rantav/hector
    Also Ruby, Python, C++, C#, Perl, PHP
    http://wiki.apache.org/cassandra/ClientExamples
  • Java Bits: JMX
    Relatively easy to expose objects and services as MBeans
    Simplifies aspects of cluster and node management
    Easy monitoring
    You choose the JMX-enabled system management tool (jconsole is alright)
  • Java Bits: available libraries
    Excellent:
    Google collections
    Multimap, BiMap, Iterators
    java.util.concurrent
    nio files (including mmap)
    Meh:
    nio sockets
  • Java Bits: Heap & GC
    Cassandra tweaks the default GC settings quite a bit:
    -XX:+UseParNewGC
    -XX:+UseConcMarkSweepGC
    -XX:+CMSParallelRemarkEnabled
    -XX:TargetSurvivorRatio=90
    -XX:SurvivorRatio=128
    -XX:MaxTenuringThreshold=0
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:+AggressiveOpts
  • Java Bits: code management
    Library versioning
    No standard way
    Mostly declarative
    Not readily queryable
    Must ship every dependency
    Or use ant/mvn.
    Now you have two (or more!) problems.
  • Java Bits: daemonization
    Java doesn’t make it easy re: stdout, stderr
    After setting up, System.out and System.err are close()d
    Windows: don’t ask
  • Future Direction
    Range delete (delete these cols from those keys)
    Vector clocks (including server-side conflict resolution)
    Altering keyspace/column family definitions on a live cluster
    Byte[] keys
    Compression
    Multi-tenant support
    Fewer memory restrictions
  • Linky
    wiki.apache.org/cassandra
    cassandra.apache.org
    Google BigTable
    labs.google.com/papers/bigtable.html
    Amazon Dynamo
    s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
    Facebook Cassandra
    www.facebook.com/note.php?note_id=24413138919
    Java tuning:
    java.sun.com/performance/reference/whitepapers/tuning.html
    java.sun.com/javase/technologies/hotspot/gc/index.jsp
    Me
    gdusbabek@gmail.com
    gdusbabek on twitter and just about everything else.