Cassandra




Jonathan Ellis
@spyced
jbellis@riptano.com
Tuesday, April 20, 2010
“NoSQL”

                          Performance   Reliability




                            Scaling        B&D



Performance





Each Cassandra node manages its own storage locally, so it is neither limited by
obsolete systems nor slowed by layering on top of a distributed file system (DFS).
b-trees





read-before-write
index in RAM
random I/O
Memtable / SSTable




Durable

                     • Write to commitlog
                      • fsync is cheap since it’s append-only
                     • Write to memtable
                     • [amortized] flush memtable to sstable


Cassandra is one of the few NoSQL systems suitable for use when data loss is
unacceptable.
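
The write path on the "Durable" slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: the class name `TinyStore`, the `flush_threshold` parameter, and the tab-separated file format are all hypothetical. It only shows the order of operations: append to the commit log and fsync, update the memtable, and flush the memtable to a sorted file once in a while.

```python
import os
import tempfile


class TinyStore:
    """Toy write path: commit log first, then memtable, flushed in bulk."""

    def __init__(self, directory, flush_threshold=3):
        self.directory = directory
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}
        self.sstables = []  # paths of flushed files, oldest first
        self.log = open(os.path.join(directory, "commit.log"), "a")

    def write(self, key, value):
        # 1. Append to the commit log and force it to disk. The file is
        #    append-only, so no seek is needed before the write.
        self.log.write(f"{key}\t{value}\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Apply the write to the in-memory table.
        self.memtable[key] = value
        # 3. Flush only occasionally, so the cost is spread over many writes.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}.txt")
        with open(path, "w") as out:
            for key in sorted(self.memtable):  # flushed files are sorted by key
                out.write(f"{key}\t{self.memtable[key]}\n")
        self.sstables.append(path)
        self.memtable = {}

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as directory:
    store = TinyStore(directory)
    for key, value in [("b", "2"), ("a", "1"), ("c", "3"), ("d", "4")]:
        store.write(key, value)
    with open(store.sstables[0]) as sstable:
        flushed = sstable.read()
    pending = dict(store.memtable)
    store.close()
```

After the four writes, the first three keys have been flushed in sorted order and the fourth is still in the memtable, protected only by the commit log.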
SSTable format, briefly
    <key 127>    <row data 0>
    <key 255>    <row data 1>
    ...          ...
                 <row data 127>
                 ...
                 <row data 255>
                 ...
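
One way to read the layout above is as a sparse index: only every 128th key is kept in the index, and a lookup scans the one block of rows that the index points at. The sketch below illustrates that reading; the interval of 128, the key format, and the function name `lookup` are assumptions made for illustration, not the on-disk format.

```python
import bisect

INDEX_INTERVAL = 128  # assumption: one index entry per 128 rows

# Rows sorted by key; keys are zero-padded so string order matches numeric order.
rows = [(f"key{i:04d}", f"row data {i}") for i in range(512)]

# Keep the last key of each block of 128 rows (rows 127, 255, ...).
index_keys = [rows[i][0] for i in range(INDEX_INTERVAL - 1, len(rows), INDEX_INTERVAL)]


def lookup(key):
    """Find the block that may hold `key`, then scan only that block."""
    block = bisect.bisect_left(index_keys, key)
    start = block * INDEX_INTERVAL
    for row_key, data in rows[start:start + INDEX_INTERVAL]:
        if row_key == key:
            return data
    return None
```

The index holds 4 entries for 512 rows, so it stays small enough to keep in memory while each lookup touches at most one block.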



Scaling





How managing our own data helps scaling
Scaling

                     • Facebook: grew from fewer than 80 machines
                          to 150+
                     • SimpleGeo: grew from 20 EC2 Large instances
                          to 50+




How it works




[Figure: ring of nodes A, L, T, W]


[Figure: ring of nodes A, F, L, T, W]


[Figure: ring of nodes A, F, L, T, W, with the ranges (A-F] and (F-L] labeled next to F and L]


[Figure: key “C” placed on the ring of nodes A, F, L, T, W]
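
The ring slides can be read as a lookup rule: a node with token t owns the range (previous token, t], so a key belongs to the first token at or after it, wrapping around at the end. A minimal sketch, assuming keys are compared directly as strings (an order-preserving layout) and using the hypothetical function name `owner`:

```python
import bisect

# Tokens of the five nodes on the ring, in sorted order.
tokens = sorted(["A", "F", "L", "T", "W"])


def owner(key):
    """Return the token of the node whose range (previous, token] holds `key`."""
    position = bisect.bisect_left(tokens, key)
    # Keys past the last token wrap around to the first node.
    return tokens[position % len(tokens)]
```

Here `owner("C")` is `"F"`, because "C" falls in (A-F], and `owner("X")` wraps around to `"A"`.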


Reliability

                     • No single points of failure
                     • Multiple datacenters
                     • Monitorable


Design




The opposite of heroes


                     • “If your software wakes people up at 4 AM
                          to fix it, you’re doing it wrong.”




[Figure: ring of nodes A, L, T, W]



Every node is equal
[Figure: key “C” placed on a ring of nodes A, F, L, P, T, U, W, Y]



Always at least one copy in each datacenter
Alternate datacenters on the ring
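
The two notes above fit together: if neighbouring nodes on the ring sit in different datacenters, then simply taking the next few nodes clockwise from a key already puts a copy in each datacenter. The sketch below illustrates that idea only. The datacenter labels `dc1` and `dc2`, the replication factor of 3, and the function name `replicas` are assumptions, and this is not Cassandra's actual placement strategy.

```python
import bisect

# Hypothetical assignment: tokens in ring order alternate between two datacenters.
ring = [("A", "dc1"), ("F", "dc2"), ("L", "dc1"), ("P", "dc2"),
        ("T", "dc1"), ("U", "dc2"), ("W", "dc1"), ("Y", "dc2")]
tokens = [token for token, _ in ring]


def replicas(key, replication_factor=3):
    """Return the next `replication_factor` nodes clockwise from the key."""
    start = bisect.bisect_left(tokens, key)
    return [ring[(start + i) % len(ring)] for i in range(replication_factor)]


placement = replicas("C")
datacenters = {dc for _, dc in placement}
```

For key “C” the replicas are F, L and P, which covers both datacenters. With any replication factor of 2 or more, an alternating ring gives the same guarantee.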
Monitorable




Events




JMX




Bondage & Discipline


                     • Twitter: “Fifteen months ago, it took two
                          weeks to perform ALTER TABLE on the
                          statuses [tweets] table.”




ColumnFamilies
                            Columns




SuperColumns
                          SuperColumns




Twissandra
    User = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            'id': 'a4a70900-24e1-11df-8924-001ff3591711',
            'username': 'ericflo',
            'password': '****',
        },
    }

    Followers = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # friend id: timestamp of when the followership was added
            '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
            '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
            '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
        },
    }


    Tweet = {
        '7561a442-24e2-11df-8924-001ff3591711': {
            'id': '89da3178-24e2-11df-8924-001ff3591711',
            'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
            'body': 'Trying out Twissandra. This is awesome!',
            '_ts': '1267414173047880',
        },
    }

    Timeline = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # timestamp of tweet: tweet id
            1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
            1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
            1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
            1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
        },
    }
Denormalize
    Userline = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # timestamp of tweet: tweet id
            1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
            1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
            1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
            1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
        },
    }
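
Denormalizing means doing the extra work at write time: the tweet body is stored once, and its id is copied into every row that will later be read as a list. A minimal sketch of that fan-out, using plain dicts as stand-ins for the column families above. The function name `post_tweet`, the short user names, and the choice to also write to the author's own timeline are assumptions for illustration, not Twissandra's code.

```python
import time
import uuid

# In-memory stand-ins for the column families (assumption: plain dicts,
# not a real Cassandra client).
followers = {"alice": {"bob": "1267413962580791", "carol": "1267413990076949"}}
tweets, timeline, userline = {}, {}, {}


def post_tweet(user_id, body, ts=None):
    """Store the tweet once, then copy its id into every list that shows it."""
    ts = int(time.time() * 1_000_000) if ts is None else ts
    tweet_id = str(uuid.uuid1())
    tweets[tweet_id] = {"id": tweet_id, "user_id": user_id,
                        "body": body, "_ts": str(ts)}
    # Userline: the author's own tweets, keyed by timestamp.
    userline.setdefault(user_id, {})[ts] = tweet_id
    # Timeline: one copy per reader, so showing a timeline reads a single row.
    for reader in [user_id, *followers.get(user_id, {})]:
        timeline.setdefault(reader, {})[ts] = tweet_id
    return tweet_id


tweet_id = post_tweet("alice", "Trying out Twissandra.", ts=1267414247561777)
```

One write touches four rows here (one Userline row and three Timeline rows), which is the price paid for reading any user's timeline without a join.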




A note on UUIDs

                     • TimeUUID = Version 1 UUID
                     • LexicalUUID = any UUID
                      • usually version 4



UUIDs are better than timestamps
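
One likely reading of this note: when bare timestamps are used as column names, two events in the same microsecond overwrite each other, while version 1 UUIDs carry the time and still stay distinct. The sketch below shows both UUID versions with Python's standard `uuid` module. The variable names and the two-tweet scenario are illustrative assumptions.

```python
import uuid

time_uuid = uuid.uuid1()    # version 1: embeds a timestamp (TimeUUID)
random_uuid = uuid.uuid4()  # version 4: random bits (usable as a LexicalUUID)

# Two events that happen to share one timestamp.
ts = 1267414247561777
by_timestamp = {}
by_uuid = {}
for tweet in ("first", "second"):
    by_timestamp[ts] = tweet       # the second write overwrites the first
    by_uuid[uuid.uuid1()] = tweet  # each write gets its own column name

# UUID.time is the embedded count of 100 ns intervals since 1582-10-15,
# so version 1 UUIDs can still be ordered by time.
ordered = sorted(by_uuid, key=lambda u: u.time)
```

The timestamp-keyed row ends up with one column and the UUID-keyed row with two.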
Questions




