Radical NoSQL Scalability with Cassandra - Tim Berglund

JAX London
Radical NoSQL Scalability with Cassandra

                                      @tlberglund
Tuesday, October 16, 12
Data Model


                                       for developers
Column

                          full_name: “Tim Berglund”       Key/Value pair
                          20120425T1832                   Timestamp



          ‣ Key-value pair
          ‣ Optionally typed
          ‣ Timestamped
          ‣ Fundamental unit
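
The bullets above can be sketched as a small data type. This is a minimal illustration, not Cassandra's internal representation: the `Column` class, the `resolve` helper, and the timestamp values are hypothetical. The idea that the newer timestamp wins when two versions of a column meet is background knowledge about Cassandra rather than something stated on the slide.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    name: str        # the key half of the key/value pair
    value: Any       # optionally typed; left untyped in this sketch
    timestamp: int   # write time (hypothetical microsecond values below)


def resolve(a: Column, b: Column) -> Column:
    """Return the version of a column with the newer timestamp."""
    return a if a.timestamp >= b.timestamp else b


old = Column("full_name", "Tim Berglund", 1335378720000000)
new = Column("full_name", "Timothy Berglund", 1335378725000000)
print(resolve(old, new).value)
```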



Row


                 row key   column   column   column




           Row Key          Columns

           tlberglund       name: Tim    bday: 06-15    role: teacher




           tlberglund       bday: 06-15    name: Tim    role: teacher

           Columns are sorted by name (UTF8Type comparator)
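
A rough way to picture the sorted row is a map whose keys are kept in comparator order. The sketch below uses Python's default string ordering as a stand-in for a UTF8Type comparator; the variable names are illustrative only.

```python
row_key = "tlberglund"

# Columns in the order they happened to be written.
columns = {"name": "Tim", "bday": "06-15", "role": "teacher"}

# Stand-in for the comparator: order the columns by name as strings.
sorted_columns = dict(sorted(columns.items()))

print(row_key, list(sorted_columns))
```

Whatever order the columns are written in, they read back ordered by name, which is what makes slices over a range of column names possible.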




Table

           Outer hash key   Inner hash key (columns)

           tim              name: Tim        role: teacher      status: Cool
           kristen          name: Kristen    role: marketing
           billy            role: CEO
           matt             name: Matt       role: founder      status: ubercool

           Sparse: rows do not all carry the same columns
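
The "hash of hashes" picture maps naturally onto nested dictionaries. This is only a mental model under that assumption; the `get` helper is hypothetical and not part of any Cassandra client.

```python
# Outer hash key: row key. Inner hash key: column name.
table = {
    "tim": {"name": "Tim", "role": "teacher", "status": "Cool"},
    "kristen": {"name": "Kristen", "role": "marketing"},
    "billy": {"role": "CEO"},
    "matt": {"name": "Matt", "role": "founder", "status": "ubercool"},
}


def get(table, row_key, column_name, default=None):
    """Look up one column; a missing column is simply absent (sparse)."""
    return table.get(row_key, {}).get(column_name, default)


print(get(table, "tim", "status"))
print(get(table, "billy", "name"))
```

A row with no `name` column stores nothing for it, rather than a NULL placeholder.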




Database

                   Tables:   Accounts
                             ClickStream
                             Orders
                             InventoryEvents




Cluster

                          System Database

                          Application Database




Secondary Indexes

          ‣ Ubiquitous in relational databases
          ‣ Supported in Cassandra, with
            qualifications




           tim              name: Tim        email: tb@a.com    role: teacher
           kristen          name: Kristen    email: k@ds.com
           billy            role: CEO
           matt             name: Matt       email: m@ds.com    status: ubercool




          ‣ In relational databases: performant for
            high cardinality
          ‣ In Cassandra: the reverse (performant for
            low cardinality)
          ‣ Not suitable for lookup by email
          ‣ Suitable for lookup by region code,
            gender, state, etc.
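
The slides do not spell out why cardinality matters. One commonly given reason, assumed here, is that each node indexes only the rows it stores, so a lookup by indexed value has to ask every node. The sketch below models that assumption with hypothetical `Node`, `build`, and `lookup` names and made-up data; it is not how Cassandra implements its indexes.

```python
from collections import defaultdict


class Node:
    def __init__(self):
        self.rows = {}                  # row key -> {column: value}
        self.index = defaultdict(set)   # indexed value -> row keys on THIS node

    def put(self, key, row, indexed_column):
        self.rows[key] = row
        if indexed_column in row:
            self.index[row[indexed_column]].add(key)


def build(indexed_column, node_count=4, user_count=8):
    """Spread hypothetical users over nodes, indexing one column locally."""
    nodes = [Node() for _ in range(node_count)]
    for i in range(user_count):
        row = {"email": f"user{i}@example.com",
               "state": "CO" if i % 2 == 0 else "NY"}
        nodes[i % node_count].put(f"user{i}", row, indexed_column)
    return nodes


def lookup(nodes, value):
    """Ask every node, since any of them might hold matching rows."""
    contacted, matches = 0, []
    for node in nodes:
        contacted += 1
        matches.extend(sorted(node.index.get(value, ())))
    return contacted, matches


print(lookup(build("email"), "user3@example.com"))  # many nodes asked, one row found
print(lookup(build("state"), "CO"))                 # same nodes asked, many rows found
```

Under this model a high-cardinality lookup pays the full fan-out cost to retrieve a single row, while a low-cardinality lookup spreads that cost over many results.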

Why BigTable?

                            Flexibility
                          Performance


Query Language


                                      for developers
CQL
               (Cassandra Query Language)



CREATE KEYSPACE


     CREATE KEYSPACE DemoKeyspace
      WITH strategy_class='SimpleStrategy'
      AND strategy_options:replication_factor=1;




CREATE TABLE

                          CREATE TABLE accounts
                           (KEY text PRIMARY KEY)
                             WITH comparator=text
                             AND default_validation=text;





                          CREATE TABLE accounts
                           (KEY text PRIMARY KEY,
                            name text,
                            email text,
                            signed_up_at timestamp)
                           WITH comparator=text;




INSERT

                          INSERT INTO accounts
                            (KEY, name, email, signed_up_at)
                            VALUES
                            ('tlberglund',
                             'Tim Berglund',
                             'tlberglund@gmail.com',
                             '2012-04-25');





                          INSERT INTO events
                            (KEY, 0, 1, 2, 3, 4)
                            VALUES
                            ('2012-04-25T11:04:34-0700',
                             55.4, 56.2, 59.6, 65.3, 79)
                            USING CONSISTENCY QUORUM
                            AND TTL 86400;




SELECT


                          SELECT *
                            FROM accounts
                            WHERE KEY='tlberglund';





                  SELECT 1..3
                    FROM events
                    WHERE KEY='2012-04-25T11:04:34-0700';
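
The `1..3` slice works because columns within a row are kept sorted by name. A minimal sketch of that idea follows, using the values from the `events` insert; the `column_slice` helper is hypothetical, and treating both ends of the range as inclusive is an assumption about this CQL version.

```python
from bisect import bisect_left, bisect_right


def column_slice(names, values, start, end):
    """Return (name, value) pairs whose name lies in [start, end].

    `names` must already be sorted, as columns in a row are.
    """
    lo = bisect_left(names, start)
    hi = bisect_right(names, end)
    return list(zip(names[lo:hi], values[lo:hi]))


names = [0, 1, 2, 3, 4]
values = [55.4, 56.2, 59.6, 65.3, 79]
print(column_slice(names, values, 1, 3))
```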





                          SELECT *
                            FROM accounts
                            WHERE KEY='tlberglund'
                            USING CONSISTENCY ONE;




UPDATE

     UPDATE accounts
       SET last_login='2012-04-25T09:37:35-0700'
       WHERE KEY='tlberglund';




Distribution Model


Hash Ring


                                    0000

                             E000          2000

                          C000                4000

                             A000          6000

                                    8000


Writing a Key

           name: Tim       key hashes to 3D97    stored on node 4000
           role: Teacher   key hashes to 9C4F    stored on node A000

Reading a Key

           3D97?    returns name: Tim
           9C4F?    returns role: Teacher

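
The placement shown on the ring slides (3D97 lands on 4000, 9C4F lands on A000) is consistent with "walk clockwise to the first node whose token is at or past the key's hash". The sketch below assumes that rule and reuses the four-digit hex tokens from the slides; the `owner` function is a hypothetical name, and the hash function itself is not modelled.

```python
from bisect import bisect_left

# Node tokens as drawn on the ring, in clockwise order.
TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def owner(key_hash):
    """First node token >= key_hash, wrapping around to 0000 past E000."""
    i = bisect_left(TOKENS, key_hash)
    return TOKENS[i % len(TOKENS)]


print(f"{owner(0x3D97):04X}")
print(f"{owner(0x9C4F):04X}")
```

Reads use the same computation, which is why a request for 3D97 can be routed to the node that stored it without any central directory.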
Replication


                          Replication factor (N)
         Pluggable placement strategies



Replication Strategies


                                  Simple
                          Network Topology Aware



Simple Strategy

           N=3

           3D97: Tim   stored on node 4000, then 6000, then 8000

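
With N=3 the slides place 3D97 on 4000, 6000, and 8000, which matches "the owning node plus the next N-1 nodes clockwise". A minimal sketch under that reading, with a hypothetical `replicas` helper and the ring tokens from the slides:

```python
from bisect import bisect_left

TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def replicas(key_hash, n):
    """Owning node first, then the next n-1 nodes clockwise on the ring."""
    start = bisect_left(TOKENS, key_hash)
    return [TOKENS[(start + i) % len(TOKENS)] for i in range(n)]


print([f"{t:04X}" for t in replicas(0x3D97, 3)])
```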
Topology Aware

           DC1: 2000, 6000, A000, E000
           DC2: 0000, 4000, 8000, C000

           3D97: Tim   stored on 4000 (DC2), 6000 (DC1),
                       8000 (DC2), A000 (DC1)

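
The topology-aware placement can be approximated by walking the ring clockwise and accepting a node only while its data center still needs replicas. The sketch below assumes two replicas per data center, as the slides appear to show, and ignores racks; `RING` and `replicas` are illustrative names, not a real API.

```python
from bisect import bisect_left

# (token, data center) in clockwise order, following the DC1/DC2 columns on the slide.
RING = [
    (0x0000, "DC2"), (0x2000, "DC1"), (0x4000, "DC2"), (0x6000, "DC1"),
    (0x8000, "DC2"), (0xA000, "DC1"), (0xC000, "DC2"), (0xE000, "DC1"),
]


def replicas(key_hash, per_dc):
    """Walk clockwise from the owning node, filling each DC's replica quota."""
    tokens = [token for token, _ in RING]
    start = bisect_left(tokens, key_hash)
    chosen, counts = [], {}
    for i in range(len(RING)):
        token, dc = RING[(start + i) % len(RING)]
        if counts.get(dc, 0) < per_dc.get(dc, 0):
            chosen.append((token, dc))
            counts[dc] = counts.get(dc, 0) + 1
    return chosen


for token, dc in replicas(0x3D97, {"DC1": 2, "DC2": 2}):
    print(f"{token:04X} {dc}")
```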
Writing


Client Connections

           Client Machine  ->  ?              ->  ring
           Client Machine  ->  Load Balancer  ->  ring

           Requests shown: 3D97?, 9C4F?

Load Balancer
                            Hardware VIP
                            HAProxy
                            Round-robin DNS
                            Client-side (Hector does this)
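
Client-side balancing can be as simple as rotating through the known nodes. The sketch below shows plain round-robin only; it is not Hector's API (Hector is a Java client), and the `RoundRobinPool` class and host names are hypothetical.

```python
from itertools import cycle


class RoundRobinPool:
    """Hand out node addresses in rotation so no single node takes every request."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = cycle(hosts)

    def next_host(self):
        return next(self._hosts)


pool = RoundRobinPool(["node-0000", "node-2000", "node-4000"])
print([pool.next_host() for _ in range(4)])
```

A production client would also need to skip nodes that are down, which this sketch leaves out.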


Write Consistency
                                                       0000


                          Coordinator           E000          2000




                                             C000                4000
     Client Machine


                                                A000          6000


                           Load Balancer               8000

                                           14C7
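
One way to think about the coordinator's job is counting replica acknowledgements against the consistency level the client asked for (the CQL examples earlier use `ONE` and `QUORUM`). The sketch below assumes the usual definition of a quorum as a majority of the replicas; `required_acks` and `write_succeeds` are hypothetical names, and real coordinators also deal with timeouts and retries.

```python
def required_acks(level, replication_factor):
    """Number of replica acknowledgements needed for a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1   # a majority of the replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def write_succeeds(replica_acked, level):
    """replica_acked holds one boolean per replica: did it acknowledge the write?"""
    acks = sum(1 for acked in replica_acked if acked)
    return acks >= required_acks(level, len(replica_acked))


# Three replicas, one of them unavailable.
print(write_succeeds([False, True, True], "QUORUM"))
print(write_succeeds([False, True, True], "ALL"))
```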

Write Consistency
                                                    0000


                                             E000          ----




                                                           X
                                          C000                 4000
     Client Machine
                                 14C7
                                             A000          6000


                          Load Balancer             8000




                            Coordinator
                          stores a “hint”
Tuesday, October 16, 12
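The hint in the diagram above can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: the `Node` and `Coordinator` classes, their fields, and the choice of nodes 2000, 4000 and 6000 as the replicas for key 14C7 are all assumptions made for the example.

```python
class Node:
    """A replica that is either up or down (hypothetical, for illustration)."""

    def __init__(self, token):
        self.token = token
        self.up = True
        self.data = {}


class Coordinator:
    """Forwards writes to replicas and keeps hints for the ones that are down."""

    def __init__(self):
        self.hints = []  # (target node, key, value) awaiting delivery

    def write(self, replicas, key, value):
        """Write to every live replica; store a hint for each dead one."""
        acks = 0
        for node in replicas:
            if node.up:
                node.data[key] = value
                acks += 1
            else:
                self.hints.append((node, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to replicas that have come back up."""
        remaining = []
        for node, key, value in self.hints:
            if node.up:
                node.data[key] = value
            else:
                remaining.append((node, key, value))
        self.hints = remaining


# Assumed replica set for key 14C7, with node 2000 down as in the diagram.
replicas = [Node(0x2000), Node(0x4000), Node(0x6000)]
replicas[0].up = False
coordinator = Coordinator()
acks = coordinator.write(replicas, "14C7", "Tim")
```

Once node 2000 is marked up again, calling `coordinator.replay_hints()` hands it the write it missed.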
Write Consistency

          ‣ ANY
                 At least one node (hinted handoffs allowed)
          ‣ ONE
                 At least one replica node (a stored hint does not count)
          ‣ QUORUM
                 (N/2)+1 replicas, where N is the replication factor

Write Consistency

          ‣ LOCAL_QUORUM
                 (N/2)+1 replicas in the local data center
          ‣ EACH_QUORUM
                 (N/2)+1 replicas in every data center
          ‣ ALL
                 Write successfully to all replicas

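The arithmetic behind these levels is small enough to write out. The sketch below assumes two data centers named `dc1` and `dc2` with a replication factor of 3 each, and a coordinator in `dc1`; those names and numbers are made up for the example. The division in (N/2)+1 is integer division.

```python
def quorum(replication_factor):
    """(N/2)+1 with integer division, where N is the replication factor."""
    return replication_factor // 2 + 1


# Hypothetical replication factors per data center.
rf_by_dc = {"dc1": 3, "dc2": 3}
total_rf = sum(rf_by_dc.values())

# Acknowledgements needed before a write is reported as successful.
required = {
    "ANY": 1,                                 # a stored hint is enough
    "ONE": 1,                                 # must be a real replica
    "QUORUM": quorum(total_rf),               # counted across all replicas
    "LOCAL_QUORUM": quorum(rf_by_dc["dc1"]),  # coordinator assumed to be in dc1
    "EACH_QUORUM": {dc: quorum(rf) for dc, rf in rf_by_dc.items()},
    "ALL": total_rf,
}
```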
Reading


Read Consistency

[Figure: the ring of eight nodes (0000 through E000). A read for key 9C4F arrives at a Coordinator. Three replicas each answer “Tim” with timestamp TODAY, so the responses agree.]

Read Consistency

[Figure: the same ring and the same read for key 9C4F. Two replicas answer “Tim” with timestamp TODAY and one answers “Jim” with timestamp YESTERDAY. Caption: Inconsistent.]

Passive Read Repair
          ‣ Initiated by coordinator
          ‣ Cleans up entropy in a single row
          ‣ Happens regardless of consistency
            level
          ‣ Just reading the database reduces its
            entropy
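A minimal sketch of what the coordinator does with the inconsistent responses from the earlier diagram. It assumes each replica is a plain dictionary mapping a row key to a `(value, timestamp)` pair and that timestamps are integers; the function name `read_with_repair` is made up for the example.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and overwrite any stale replica copies."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    present = [seen for _, seen in responses if seen is not None]
    if not present:
        return None
    # The copy with the most recent timestamp wins.
    newest = max(present, key=lambda pair: pair[1])
    for replica, seen in responses:
        if seen != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[0]


YESTERDAY, TODAY = 1, 2
replica_a = {"9C4F": ("Tim", TODAY)}
replica_b = {"9C4F": ("Jim", YESTERDAY)}
replica_c = {"9C4F": ("Tim", TODAY)}
result = read_with_repair([replica_a, replica_b, replica_c], "9C4F")
```

After the call, `replica_b` holds the same `("Tim", TODAY)` pair as the other two, which is the sense in which reading reduces entropy.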

Read Consistency

          ‣ ONE
                 Get response from the closest replica
          ‣ QUORUM
                 Read from (N/2)+1 replicas, return the value with
                 the most recent timestamp

Read Consistency

          ‣ LOCAL_QUORUM
                 (N/2)+1 replicas in the local data center
          ‣ EACH_QUORUM
                 (N/2)+1 replicas in every data center
          ‣ ALL
                 Wait for all replicas to respond

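Read and write levels interact. The slides do not spell this out, but a commonly stated rule is that a read is guaranteed to touch at least one replica holding the latest acknowledged write when the number of replicas written plus the number read exceeds the replication factor. A small sketch of that check, with function names chosen for the example:

```python
def quorum(replication_factor):
    """(N/2)+1 with integer division."""
    return replication_factor // 2 + 1


def read_overlaps_write(write_acks, read_replicas, replication_factor):
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_replicas > replication_factor


rf = 3
quorum_both = read_overlaps_write(quorum(rf), quorum(rf), rf)  # QUORUM write, QUORUM read
one_both = read_overlaps_write(1, 1, rf)                       # ONE write, ONE read
all_then_one = read_overlaps_write(rf, 1, rf)                  # ALL write, ONE read
```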
But what about Column Families?




Replication For Real

                            Rows are replicated
                              “Key” is row key
                          “Value” is the row data
                          Implications for row size
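Because the row key alone decides where a row lives, the whole row lands on the same set of nodes. A minimal sketch of that lookup on the eight-node ring from the diagrams, assuming a replication factor of 3, replicas taken from consecutive nodes around the ring, and keys such as 14C7 and 9C4F treated as already-hashed tokens:

```python
from bisect import bisect_left

# Node tokens from the ring diagrams, in ascending order.
RING = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def replicas_for(key_token, ring=RING, replication_factor=3):
    """Return the tokens of the nodes that store the row with this key token."""
    # The first replica is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring past the last node.
    start = bisect_left(ring, key_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(replication_factor)]


write_replicas = replicas_for(0x14C7)
read_replicas = replicas_for(0x9C4F)
```

Every column in a row travels with that one key, so a very wide row puts all of its data on the same few nodes. That is the implication for row size.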

Gossip

Gossip
          ‣ Naive heartbeats don’t scale
          ‣ Cluster still needs to know its state
          ‣ Computes a real-valued “suspicion”
            for each node
          ‣ Probabilistic
          ‣ Just like the real thing
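The slides do not say how the “suspicion” value is computed. One well-known scheme of this kind is the phi accrual failure detector. The sketch below assumes heartbeat gaps follow an exponential distribution, and the threshold of 8.0 is only an illustrative value; treat both as assumptions rather than a description of Cassandra's code.

```python
import math


def suspicion(seconds_since_last_heartbeat, mean_interval):
    """phi = -log10(P(gap >= t)) for exponentially distributed heartbeat gaps.

    With P(gap >= t) = exp(-t / mean), this simplifies to t / (mean * ln 10).
    """
    return seconds_since_last_heartbeat / (mean_interval * math.log(10))


def looks_dead(seconds_since_last_heartbeat, mean_interval, threshold=8.0):
    """A node is suspected once its phi value crosses the threshold."""
    return suspicion(seconds_since_last_heartbeat, mean_interval) > threshold


recent = suspicion(1.0, mean_interval=1.0)   # heard from it about on schedule
silent = suspicion(30.0, mean_interval=1.0)  # thirty missed intervals
```

The value grows steadily as a node stays silent, so the cluster gets a degree of doubt rather than a binary up or down answer.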

[Figure: a gossip exchange on the ring of eight nodes (0000 through E000), drawn as a sequence of speech bubbles inside the ring:
     “How are you, E000?”
     “I'm cool.”
     “How about you, A000?”
     “Oh, I'm fine.”
     “I trust E000. Not so sure about A000...”
     “C000, what do you know?”
     “I'm great!”
     “E000 is doing well.”
     “Poor A000 is having trouble lately.”
     “Hmmm, so...”
     ✔ C000
     ✔ E000
     ✘ A000]

Gossip Config

          ‣ A new node needs “seed nodes”
          ‣ Seed nodes configured in
            $CASSANDRA_HOME/conf/cassandra.yaml

Storage


Storage Engine
          ‣ “Log-structured storage”
          ‣ All writes are sequential
          ‣ Data on disk is immutable once written
          ‣ Designed to avoid seeks
          ‣ Writes are faster than reads

Write Sequence

[Figure: flow diagram. Write from Coordinator → Commit Log → Memtable → SSTable(s), with Compaction acting on the SSTable(s). Annotations: “(lots of tuning here)” on the Memtable-to-SSTable step and “(more tuning here)” on Compaction.]

Commit Log
          ‣ Writes go here first
          ‣ Commit succeeds before node
            reports successful write
          ‣ Append-only, sequential writes
          ‣ One per server
          ‣ Good to have a dedicated spindle

MemTable
          ‣ An in-memory structure
          ‣ One per column family
          ‣ Holds most recent row changes
          ‣ Tunable memory use
          ‣ Flushed to disk when “full”


SSTable
          ‣ Memtables flushed to disk here
          ‣ Many SSTable files per column family
          ‣ Every SSTable is immutable
          ‣ SSTables are accessed during reads
          ‣ Must be compacted
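The commit log, memtable and SSTable fit together roughly as follows. This is a toy in-memory sketch: the `StorageEngine` class, its `memtable_limit` parameter (counted in rows) and the clearing of the commit log after a flush are simplifications invented for the example, not Cassandra's actual behaviour or configuration.

```python
class StorageEngine:
    """Toy write path: commit log first, then memtable, then flush to an SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []  # append-only record of every write
        self.memtable = {}    # row key -> {column: (value, timestamp)}
        self.sstables = []    # oldest first; a flushed table is never modified
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold, in rows

    def write(self, key, column, value, timestamp):
        # 1. Append to the commit log before anything else.
        self.commit_log.append((key, column, value, timestamp))
        # 2. Apply the change to the in-memory structure.
        row = self.memtable.setdefault(key, {})
        current = row.get(column)
        if current is None or timestamp >= current[1]:
            row[column] = (value, timestamp)
        # 3. Flush when the memtable is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted, immutable SSTable."""
        frozen = tuple(
            sorted((key, tuple(sorted(columns.items())))
                   for key, columns in self.memtable.items())
        )
        self.sstables.append(frozen)
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed writes need no replay


engine = StorageEngine(memtable_limit=2)
engine.write("tlberglund", "name", "Tim", 1)
engine.write("tlberglund", "role", "teacher", 2)
```

Each flush adds another file, which is why a column family ends up with many SSTables and why they must eventually be compacted.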


Read Sequence

          Read from Coordinator

          Memtable
               Are all columns here?
                     YES: stop and return result.
                     NO: continue.

          Newest SSTable
               Are all columns here?
                     YES: stop and return result.
                     NO: continue.

          Next Oldest SSTable
               How about now?
                     YES: great!
                     NO: keep looking...

          And so on.

          Doesn’t this get old?

Read Sequence

          ‣ Check Memtable first
          ‣ Read SSTables from newest to oldest
          ‣ Bloom filters prevent most reads
          ‣ Compaction shrinks number of files
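The read path can be sketched the same way. Everything here is illustrative: the `BloomFilter` and `SSTable` classes, the filter size, and the `disk_reads` counter are assumptions for the example. The point is only that a Bloom filter can answer “definitely not here” without touching the table.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> position) & 1 for position in self._positions(key))


class SSTable:
    """Immutable table of rows with a Bloom filter over its row keys."""

    def __init__(self, rows):
        self.rows = dict(rows)  # row key -> {column: value}
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)
        self.disk_reads = 0     # counts how often the table itself is consulted

    def get(self, key):
        self.disk_reads += 1
        return self.rows.get(key, {})


def read(key, wanted, memtable, sstables_newest_first):
    """Collect the wanted columns, checking the memtable and then newer tables first."""
    found = {c: v for c, v in memtable.get(key, {}).items() if c in wanted}
    for table in sstables_newest_first:
        if wanted.issubset(found):
            break               # all columns found: stop and return result
        if not table.bloom.might_contain(key):
            continue            # the filter rules this table out, skip the read
        for column, value in table.get(key).items():
            if column in wanted:
                found.setdefault(column, value)  # newer sources win
    return found


memtable = {"tlberglund": {"role": "teacher"}}
newest = SSTable({"tlberglund": {"name": "Tim"}})
older = SSTable({"tlberglund": {"name": "Jim", "bday": "06-15"}})
row = read("tlberglund", {"name", "role"}, memtable, [newest, older])
```

In this run the older table is never consulted, because the memtable and the newest SSTable already supply both columns.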



Compaction

Compaction
          ‣ Combine many SSTables into one
          ‣ Performed in the background
          ‣ Node still operates
          ‣ Requires extra disk space
          ‣ Three tunable varieties


Compaction
          ‣ Major
                 All SSTables are merged into a single clean SSTable
          ‣ Minor
                 Similarly sized SSTables are merged together
                 once enough of them accumulate (a configurable threshold)
          ‣ Leveled
                 http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
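Whatever the variety, the core of compaction is a merge that keeps only the newest version of each column. A minimal sketch, assuming each SSTable is a dictionary of rows whose columns carry a `(value, timestamp)` pair; the `compact` function is hypothetical and ignores deletes.

```python
def compact(sstables):
    """Merge several SSTables into one, keeping the newest version of each column."""
    merged = {}
    for table in sstables:
        for key, columns in table.items():
            row = merged.setdefault(key, {})
            for column, (value, timestamp) in columns.items():
                if column not in row or timestamp > row[column][1]:
                    row[column] = (value, timestamp)
    # SSTables are sorted by row key, so the result is returned in key order.
    return dict(sorted(merged.items()))


old_table = {"tlberglund": {"name": ("Jim", 1), "bday": ("06-15", 1)}}
new_table = {"tlberglund": {"name": ("Tim", 2)}}
compacted = compact([old_table, new_table])
```

The inputs are left untouched and a new table is produced, which matches the slides' note that compaction requires extra disk space while it runs.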




http://www.datastax.com/products/community

                          Download this!

Tim Berglund
    tlberglund@github.com
    @tlberglund


   Thank You
Tuesday, October 16, 12
1 of 276

Recommended

Cassandra ppt 2 by
Cassandra ppt 2Cassandra ppt 2
Cassandra ppt 2Skillwise Group
674 views8 slides
Nuove professioni dell'editoria by
Nuove professioni dell'editoriaNuove professioni dell'editoria
Nuove professioni dell'editoriaFlavio Pintarelli
838 views18 slides
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa... by
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...DataStax Academy
3.5K views54 slides
Cassandra datamodel by
Cassandra datamodelCassandra datamodel
Cassandra datamodellurga
582 views7 slides
Apache Cassandra by
Apache CassandraApache Cassandra
Apache CassandraSperasoft
1.4K views25 slides
DataStax: Backup and Restore in Cassandra and OpsCenter by
DataStax: Backup and Restore in Cassandra and OpsCenterDataStax: Backup and Restore in Cassandra and OpsCenter
DataStax: Backup and Restore in Cassandra and OpsCenterDataStax Academy
4.3K views19 slides

More Related Content

Viewers also liked

Apache Cassandra, part 2 – data model example, machinery by
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryAndrey Lomakin
5.2K views52 slides
Cassandra Data Modeling by
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data ModelingBen Knear
767 views50 slides
Cassandra Data Modeling - Practical Considerations @ Netflix by
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflixnkorla1share
15.5K views27 slides
Cassandra into by
Cassandra intoCassandra into
Cassandra intoDataArt
240 views16 slides
Cassandra Data Model by
Cassandra Data ModelCassandra Data Model
Cassandra Data Modelebenhewitt
14.1K views47 slides
Webinar | Introducing DataStax Enterprise 4.6 by
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6DataStax
2.8K views28 slides

Viewers also liked(14)

Apache Cassandra, part 2 – data model example, machinery by Andrey Lomakin
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
Andrey Lomakin5.2K views
Cassandra Data Modeling by Ben Knear
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data Modeling
Ben Knear767 views
Cassandra Data Modeling - Practical Considerations @ Netflix by nkorla1share
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflix
nkorla1share15.5K views
Cassandra into by DataArt
Cassandra intoCassandra into
Cassandra into
DataArt240 views
Cassandra Data Model by ebenhewitt
Cassandra Data ModelCassandra Data Model
Cassandra Data Model
ebenhewitt14.1K views
Webinar | Introducing DataStax Enterprise 4.6 by DataStax
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6
DataStax2.8K views
Data Modeling Basics for the Cloud with DataStax by DataStax
Data Modeling Basics for the Cloud with DataStaxData Modeling Basics for the Cloud with DataStax
Data Modeling Basics for the Cloud with DataStax
DataStax557 views
Cassandra Community Webinar | The World's Next Top Data Model by DataStax
Cassandra Community Webinar | The World's Next Top Data ModelCassandra Community Webinar | The World's Next Top Data Model
Cassandra Community Webinar | The World's Next Top Data Model
DataStax4.7K views
Overview of DataStax OpsCenter by DataStax
Overview of DataStax OpsCenterOverview of DataStax OpsCenter
Overview of DataStax OpsCenter
DataStax2.3K views
Webinar | Target Modernizes Retail with Engaging Digital Experiences by DataStax
Webinar | Target Modernizes Retail with Engaging Digital ExperiencesWebinar | Target Modernizes Retail with Engaging Digital Experiences
Webinar | Target Modernizes Retail with Engaging Digital Experiences
DataStax2K views
Cassandra by Edureka!
CassandraCassandra
Cassandra
Edureka!3K views
Advanced data modeling with apache cassandra by Patrick McFadin
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
Patrick McFadin16.3K views
Cassandra at NoSql Matters 2012 by jbellis
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
jbellis5.2M views
Cassandra Performance and Scalability on AWS by Adrian Cockcroft
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWS
Adrian Cockcroft41.8K views


Radical NoSQL Scalability with Cassandra - Tim Berglund

  • 1. Radical NoSQL Scalability with @tlberglund Tuesday, October 16, 12
  • 2. Data Model for developers
  • 7. Column: full_name: “Tim Berglund” (Key/Value pair), 20120425T1832 (Timestamp)
  • 12. Column ‣ Key-value pair ‣ Optionally typed ‣ Timestamped ‣ Fundamental unit
  • 17. Row: row key, column, column, column
  • 20. Row: Row Key tlberglund; Columns name: Tim, bday: 06-15, role: teacher
  • 23. Row: tlberglund → bday: 06-15, name: Tim, role: teacher (Sorted by UTF8Type comparator)
  • 33. Table (Outer hash key: row key; Inner hash key: column name; Sparse)
        tim → name: Tim, role: teacher, status: Cool
        kristen → name: Kristen, role: marketing
        billy → role: CEO
        matt → name: Matt, role: founder, status: ubercool
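
The outer-hash/inner-hash picture above can be sketched as a map of maps. This is only an illustration of the data model, not how Cassandra stores data: the `insert` and `read_row` helpers and the integer timestamps are hypothetical, and Python's string ordering stands in for the UTF8Type comparator.

```python
from typing import Dict, List, Tuple

# A column value is stored with its timestamp: (value, timestamp).
Column = Tuple[str, int]

# Outer hash key: row key. Inner hash key: column name.
table: Dict[str, Dict[str, Column]] = {}


def insert(row_key: str, name: str, value: str, timestamp: int) -> None:
    """Write one column; an older timestamp never overwrites a newer one."""
    row = table.setdefault(row_key, {})
    current = row.get(name)
    if current is None or timestamp >= current[1]:
        row[name] = (value, timestamp)


def read_row(row_key: str) -> List[Tuple[str, str]]:
    """Return (column name, value) pairs ordered by column name."""
    row = table.get(row_key, {})
    return [(name, row[name][0]) for name in sorted(row)]


# Rows are sparse: each row holds only the columns that were written to it.
insert("tlberglund", "name", "Tim", 1)
insert("tlberglund", "bday", "06-15", 1)
insert("tlberglund", "role", "teacher", 1)
insert("billy", "role", "CEO", 2)

print(read_row("tlberglund"))
print(read_row("billy"))
```

Reading the `tlberglund` row returns its columns in name order (bday, name, role) regardless of insertion order, which mirrors the sorted-row slide.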
  • 35. Database: Tables Accounts, ClickStream, Orders, InventoryEvents
  • 36. Cluster: System Database, Application Database
  • 39. Secondary Indexes ‣ Ubiquitous in relational databases ‣ Supported in Cassandra, with qualifications
  • 42. Secondary Indexes (example table)
        tim → name: Tim, email: tb@a.com, role: teacher
        kristen → name: Kristen, email: k@ds.com
        billy → role: CEO
        matt → name: Matt, email: m@ds.com, status: ubercool
  • 47. Secondary Indexes ‣ In relational databases: performant for high cardinality ‣ In Cassandra: the reverse ‣ Not suitable for lookup-by-email ‣ Suitable for lookup by: region code, gender, state, etc.
  • 52. Why BigTable? Flexibility, Performance
  • 53. Query Language for developers
  • 56. CQL (Cassandra Query Language)
  • 59. CREATE KEYSPACE: CREATE KEYSPACE DemoKeyspace WITH strategy_class='SimpleStrategy' AND strategy_options:replication_factor=1;
  • 61. CREATE TABLE: CREATE TABLE accounts (KEY text PRIMARY KEY) WITH comparator=text AND default_validation=text;
  • 63. CREATE TABLE: CREATE TABLE accounts (KEY text PRIMARY KEY, name text, email text, signed_up_at timestamp) WITH comparator=text;
  • 66. INSERT: INSERT INTO accounts (KEY, name, email, signed_up_at) VALUES ('tlberglund', 'Tim Berglund', 'tlberglund@gmail.com', '2012-04-25');
  • 68. INSERT: INSERT INTO events (KEY, 0, 1, 2, 3, 4) VALUES ('2012-04-25T11:04:34-0700', 55.4, 56.2, 59.6, 65.3, 79) USING CONSISTENCY QUORUM AND TTL 86400;
  • 71. SELECT: SELECT * FROM accounts WHERE KEY='tlberglund';
  • 73. SELECT: SELECT 1..3 FROM events WHERE KEY='2012-04-25T11:04:34-0700';
  • 75. SELECT: SELECT * FROM accounts WHERE KEY='tlberglund' USING CONSISTENCY ONE;
  • 78. UPDATE: UPDATE accounts SET last_login='2012-04-25T09:37:35-0700' WHERE KEY='tlberglund';
  • 81. [Ring diagram: eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, E000]
  • 82. Writing a Key
  • 86. [Ring diagram] name: Tim is hashed to 3D97: Tim, which lands next to node 4000
  • 89. [Ring diagram] role: Teacher is hashed to 9C4F: Teacher, which lands next to node A000
  • 90. Reading a Key
  • 92. [Ring diagram] 3D97? → name: Tim
  • 94. [Ring diagram] 9C4F? → role: Teacher
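
The ring lookup in the slides above can be sketched as a search over sorted node tokens. This is a minimal sketch that assumes a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around at the end of the ring; that rule matches where 3D97 and 9C4F land in the diagrams. The 16-bit tokens are the slides' simplification, and `owner` is a hypothetical helper name.

```python
from bisect import bisect_left

# Node tokens from the ring diagram, kept in sorted order.
NODE_TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def owner(token: int) -> int:
    """Return the token of the node responsible for a key token."""
    # Index of the first node token >= the key token.
    index = bisect_left(NODE_TOKENS, token)
    # Tokens above the last node wrap around to the first node.
    return NODE_TOKENS[index % len(NODE_TOKENS)]


print(f"{owner(0x3D97):04X}")
print(f"{owner(0x9C4F):04X}")
```

Reads use the same function as writes: the coordinator hashes the key, finds the owning node, and asks it for the value.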
  • 98. Replication: Replication factor (N); Pluggable placement strategies
  • 101. Replication Strategies: Simple; Network Topology Aware
  • 106. Simple Strategy, N=3 [ring diagram]: 3D97: Tim is written next to node 4000, then next to nodes 6000 and 8000
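
The Simple Strategy animation above can be sketched as "find the owning node, then keep walking the ring". This is a sketch under the assumption that the N replicas are the owner plus the next N-1 nodes in token order, which is what the diagram shows for N=3; `simple_replicas` is a hypothetical name.

```python
from bisect import bisect_left

# Node tokens from the ring diagram, kept in sorted order.
NODE_TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def simple_replicas(token: int, n: int = 3) -> list:
    """Return the tokens of the n nodes that hold copies of a key."""
    start = bisect_left(NODE_TOKENS, token)
    size = len(NODE_TOKENS)
    # The modulo wraps the walk around the end of the ring.
    return [NODE_TOKENS[(start + step) % size] for step in range(n)]


print([f"{t:04X}" for t in simple_replicas(0x3D97, n=3)])
```

For 3D97 this yields nodes 4000, 6000, and 8000, the three nodes that receive "3D97: Tim" in the slides.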
  • 113. Topology Aware [diagram: data centers DC1 and DC2 with nodes 2000, 0000, 6000, 4000, A000, 8000, E000, C000]: 3D97: Tim is written next to node 4000, then 6000, then 8000, then A000
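
A topology-aware placement can be sketched by walking the ring and filling a per-data-center quota. Everything specific here is an assumption made for illustration: the assignment of nodes to DC1 and DC2 is a guess from the diagram layout, the quota of two replicas per data center is inferred from the four copies shown, and racks are ignored entirely. `topology_replicas` is a hypothetical name.

```python
from bisect import bisect_left

# (token, data center) in token order. The DC assignment is an assumption.
RING = [
    (0x0000, "DC2"), (0x2000, "DC1"), (0x4000, "DC2"), (0x6000, "DC1"),
    (0x8000, "DC2"), (0xA000, "DC1"), (0xC000, "DC2"), (0xE000, "DC1"),
]


def topology_replicas(token: int, per_dc: dict) -> list:
    """Walk the ring from the owner and pick nodes until each DC quota is met."""
    tokens = [node_token for node_token, _ in RING]
    start = bisect_left(tokens, token)
    remaining = dict(per_dc)  # copy so the caller's quota is untouched
    chosen = []
    for step in range(len(RING)):
        node_token, dc = RING[(start + step) % len(RING)]
        if remaining.get(dc, 0) > 0:
            chosen.append(node_token)
            remaining[dc] -= 1
        if not any(remaining.values()):
            break
    return chosen


picked = topology_replicas(0x3D97, {"DC1": 2, "DC2": 2})
print([f"{t:04X}" for t in picked])
```

Under these assumptions the walk picks 4000, 6000, 8000, and A000, the same order in which the copies appear in the slides.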
  • 116. Client Connections [ring diagram: the eight-node ring and a Client Machine]
  • 117. Client Connections [ring diagram]: Client Machine → ? (which node to connect to)
  • 118. Client Connections [ring diagram]: Client Machine → Load Balancer → ring
  • 122. Client Connections [ring diagram, animation]: request 3D97? sent through the Load Balancer
  • 127. Client Connections [ring diagram, animation]: request 9C4F? sent through the Load Balancer
  • 133. Load Balancer: Hardware VIP; HAProxy; Round-robin DNS; Client-side (Hector does this)
  • 141. Write Consistency [ring diagram: Client Machine → Load Balancer → ring]: write 14C7 arrives at a node, labeled Coordinator
  • 147. Write Consistency [ring diagram: node 2000 shown as ----]: write 14C7 is marked X at the missing node; Coordinator stores a “hint”
  • 151. Write Consistency ‣ ANY: At least one node (hinted handoffs allowed) ‣ ONE: At least one node (no hinted handoffs) ‣ QUORUM: (N/2)+1 nodes
  • 155. Write Consistency ‣ LOCAL_QUORUM: (N/2)+1 nodes in this availability zone ‣ EACH_QUORUM: (N/2)+1 nodes in all availability zones ‣ ALL: Write successfully to all replicas
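
The (N/2)+1 rule above uses integer division, so a quorum is always a strict majority of the N replicas. A small sketch of the arithmetic follows; `quorum` and `required_acks` are hypothetical helper names, and the data-center-scoped levels are left out because they need per-data-center replica counts.

```python
def quorum(n: int) -> int:
    """Smallest strict majority of n replicas: (N/2)+1 with integer division."""
    return n // 2 + 1


def required_acks(level: str, n: int) -> int:
    """Acknowledgements the coordinator waits for at a consistency level."""
    if level in ("ANY", "ONE"):
        # ANY also accepts a stored hint; ONE needs a real replica.
        return 1
    if level == "QUORUM":
        return quorum(n)
    if level == "ALL":
        return n
    raise ValueError(f"unsupported level in this sketch: {level}")


for n in (3, 4, 5):
    print(n, quorum(n))
```

With N=3 a QUORUM write needs two acknowledgements, so it still succeeds with one replica down, while ALL does not.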
  • 163. Read Consistency [ring diagram]: read 9C4F? arrives at the Coordinator; three replicas answer Tim / TODAY, Tim / TODAY, Tim / TODAY
  • 169. Read Consistency [ring diagram]: read 9C4F?; three replicas answer Tim / TODAY, Jim / YESTERDAY, Tim / TODAY: Inconsistent
  • 174. Passive Read Repair ‣ Initiated by coordinator ‣ Cleans up entropy in a single row ‣ Happens regardless of consistency level ‣ Just reading the database reduces its entropy
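
The repair step above can be sketched as "pick the answer with the newest timestamp, then overwrite any replica that disagrees". This is a toy model, not Cassandra's implementation: replicas are a plain dictionary, the node names and integer timestamps (1 for YESTERDAY, 2 for TODAY) are made up, and `read_with_repair` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Each replica's answer: (value, timestamp).
Answer = Tuple[str, int]


def read_with_repair(replicas: Dict[str, Answer]) -> Tuple[str, List[str]]:
    """Return the newest value and the list of replicas that were repaired."""
    # The answer with the most recent timestamp wins.
    newest = max(replicas.values(), key=lambda answer: answer[1])
    stale = [node for node, answer in replicas.items() if answer != newest]
    # The coordinator pushes the winning answer to every stale replica.
    for node in stale:
        replicas[node] = newest
    return newest[0], stale


replicas = {"A000": ("Tim", 2), "C000": ("Jim", 1), "E000": ("Tim", 2)}
value, repaired = read_with_repair(replicas)
print(value, repaired)
```

After the call all three replicas hold the same answer, which is the sense in which reading the database reduces its entropy.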
  • 177. Read Consistency ‣ ONE: Get response from the closest replica ‣ QUORUM: Get (N/2)+1 nodes, return most recent timestamp
  • 181. Read Consistency ‣ LOCAL_QUORUM: (N/2)+1 nodes in this availability zone ‣ EACH_QUORUM: (N/2)+1 nodes in all availability zones ‣ ALL: Wait for all replicas to respond
  • 183. But what about Column Families?
  • 188. Replication For Real: Rows are replicated; “Key” is row key; “Value” is the row data; Implications for row size
  • 195. Gossip ‣ Naive heartbeats don’t scale ‣ Cluster still needs to know its state ‣ Computes a real-valued “suspicion” for each node ‣ Probabilistic ‣ Just like the real thing
  • 196. [Ring diagram: the eight-node ring]
  • 209. [Ring diagram, gossip exchange between nodes] “How are you, E000?” “I'm cool.” “How about you, A000?” “Oh, I'm fine.” “I trust E000. Not so sure about A000...” “C000, what do you know?” “I'm Great!” “E000 is doing well.” “Poor A000 is having trouble lately.” “Hmmm, so...” Result: ✔ C000 ✔ E000 ✘ A000
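
One way to turn heartbeats into a real-valued suspicion is an accrual-style score that grows the longer a node stays silent. The sketch below assumes heartbeat gaps are exponentially distributed, so the score is -log10 of the probability that a healthy node would be this late. The `suspicion` and `is_suspect` names and the threshold of 8 are illustrative choices, not taken from the slides.

```python
import math


def suspicion(now: float, last_heartbeat: float, mean_interval: float) -> float:
    """Score how unlikely the current silence is for a healthy node.

    With exponentially distributed gaps, P(gap > t) = exp(-t / mean), so
    -log10(P) simplifies to t / (mean * ln 10).
    """
    elapsed = now - last_heartbeat
    return elapsed / (mean_interval * math.log(10))


def is_suspect(score: float, threshold: float = 8.0) -> bool:
    """Treat a node as down once its suspicion passes the threshold."""
    return score > threshold


# A node that is one interval late is barely suspicious ...
print(round(suspicion(now=11.0, last_heartbeat=10.0, mean_interval=1.0), 3))
# ... while twenty missed intervals pushes it past the threshold.
print(is_suspect(suspicion(now=30.0, last_heartbeat=10.0, mean_interval=1.0)))
```

Because the score is continuous, each node can pick its own threshold rather than relying on a fixed timeout, which is the probabilistic behavior the Gossip slide describes.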
  • 212. Gossip Config ‣ A new node needs “seed nodes” ‣ Seed nodes configured in $CASSANDRA_HOME/conf/cassandra.yaml
  • 219. Storage Engine ‣ “Log-structured storage” ‣ All writes are sequential ‣ All writes are immutable ‣ Designed to avoid seeks ‣ Writes are faster than reads
  • 224. Write Sequence
      Write from Coordinator → Commit Log → Memtable → SSTable(s) (lots of tuning here) → Compaction (more tuning here)
  • 230. Commit Log
      ‣ Writes go here first
      ‣ Commit succeeds before node reports successful write
      ‣ Append-only, sequential writes
      ‣ One per server
      ‣ Good to have a dedicated spindle
  • 236. MemTable
      ‣ An in-memory structure
      ‣ One per column family
      ‣ Holds most recent row changes
      ‣ Tunable memory use
      ‣ Flushed to disk when “full”
  • 242. SSTable
      ‣ Memtables flushed to disk here
      ‣ Many SSTable files per column family
      ‣ Every SSTable is immutable
      ‣ SSTables are accessed during reads
      ‣ Must be compacted
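To make the write sequence concrete, here is a minimal in-memory sketch of the path described on the Commit Log, MemTable and SSTable slides. Everything in it is an assumption made for illustration: `ColumnFamilyStore` is a made-up name, the commit log is a plain Python list standing in for an append-only file, "full" is approximated by a row count, and the row key `another_user` is invented. Real flush thresholds and on-disk formats are far more involved.

```python
import json


class ColumnFamilyStore:
    """Hypothetical stand-in for one column family's write path."""

    def __init__(self, commit_log, flush_threshold=2):
        self.commit_log = commit_log    # shared, append-only (one per server)
        self.memtable = {}              # row key -> {column: (value, timestamp)}
        self.sstables = []              # flushed, immutable tables, oldest first
        self.flush_threshold = flush_threshold  # assumed: "full" = this many rows

    def write(self, row_key, column, value, timestamp):
        # 1. The write is appended to the commit log first.
        self.commit_log.append(json.dumps([row_key, column, value, timestamp]))
        # 2. Then the memtable is updated in memory.
        self.memtable.setdefault(row_key, {})[column] = (value, timestamp)
        # 3. A "full" memtable is flushed to a new SSTable.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()
        return True  # success is reported only after the log append

    def flush(self):
        # Freeze the memtable into a sorted, immutable tuple of rows.
        frozen = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(frozen)
        self.memtable = {}


log = []
users = ColumnFamilyStore(log, flush_threshold=2)
users.write("tlberglund", "name", "Tim", 1)
users.write("tlberglund", "role", "teacher", 2)
users.write("another_user", "name", "Someone", 3)  # second row triggers a flush

print(len(log), len(users.sstables), users.memtable)
```

Nothing is ever updated in place: the log only grows, and each flush produces a new table rather than modifying an old one, which is why reads later have to consult several SSTables.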
  • 243. Read Sequence: Read from Coordinator
  • 247. Read Sequence: Memtable
      Are all columns here?
      YES: stop and return result.
      NO: continue.
  • 251. Read Sequence: Newest SSTable
      Are all columns here?
      YES: stop and return result.
      NO: continue.
  • 255. Read Sequence: Next Oldest SSTable
      How about now?
      YES: great!
      NO: keep looking...
  • 257. Read Sequence
      And so on. Doesn’t this get old?
  • 262. Read Sequence
      ‣ Check Memtable first
      ‣ Read SSTables from newest to oldest
      ‣ Bloom filters prevent most reads
      ‣ Compaction shrinks number of files
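The read sequence above can be sketched as follows. This is a simplified model under stated assumptions: `BloomFilter`, `SSTable` and `read` are hypothetical names, the filter is a tiny hand-rolled one built on SHA-256, and "newest wins" is decided purely by table order rather than by per-column timestamps. The row key `other_user` and the older values are invented for the example.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


class SSTable:
    """Hypothetical immutable table: row key -> {column: value}."""

    def __init__(self, rows):
        self.rows = rows
        self.bloom = BloomFilter()
        for key in rows:
            self.bloom.add(key)
        self.disk_reads = 0  # counts how often the table itself is consulted

    def get(self, row_key):
        self.disk_reads += 1
        return self.rows.get(row_key, {})


def read(row_key, wanted, memtable, sstables_newest_first):
    # Check the memtable first.
    result = {c: v for c, v in memtable.get(row_key, {}).items() if c in wanted}
    for table in sstables_newest_first:
        if len(result) == len(wanted):
            break  # all columns found: stop and return
        if not table.bloom.might_contain(row_key):
            continue  # the filter says the row is definitely not in this file
        for column, value in table.get(row_key).items():
            if column in wanted and column not in result:
                result[column] = value  # newer sources were seen first
    return result


memtable = {"tlberglund": {"role": "teacher"}}
newest = SSTable({"tlberglund": {"name": "Tim", "role": "student"}})
older = SSTable({"other_user": {"name": "Someone"}})
oldest = SSTable({"tlberglund": {"name": "Timothy", "bday": "06-15"}})
tables = [newest, older, oldest]

first = read("tlberglund", {"name", "role"}, memtable, tables)
reads_after_first = (newest.disk_reads, older.disk_reads, oldest.disk_reads)
second = read("tlberglund", {"bday"}, memtable, tables)
print(first, reads_after_first, second)
```

The first read is satisfied by the memtable plus the newest SSTable, so the two older files are never touched. The second read has to walk all the way back to the oldest file, which is the cost that Bloom filters and compaction are meant to keep down.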
  • 269. Compaction
      ‣ Combine many SSTables into one
      ‣ Performed in the background
      ‣ Node still operates
      ‣ Requires extra disk space
      ‣ Three tunable varieties
  • 273. Compaction
      ‣ Major: all SSTables are merged into one clean one
      ‣ Minor: similarly sized SSTables are merged together after reaching a threshold
      ‣ Leveled: http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
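As a rough illustration of what "merged into one clean one" means, the sketch below combines several SSTables and keeps only the newest version of each column, using the column timestamps from the data model. The `compact` function and the dict-based table layout are assumptions made for the example. Deletion markers, size thresholds and the leveled strategy are deliberately left out.

```python
def compact(sstables):
    """Merge SSTables (row key -> {column: (value, timestamp)}) into one.

    Hypothetical helper: for each column, the entry with the highest
    timestamp wins, so superseded values disappear from the output.
    """
    merged = {}
    for table in sstables:
        for row_key, columns in table.items():
            row = merged.setdefault(row_key, {})
            for name, (value, ts) in columns.items():
                if name not in row or ts > row[name][1]:
                    row[name] = (value, ts)
    # Emit rows and columns in sorted order, as a sorted table would store them.
    return {key: dict(sorted(cols.items())) for key, cols in sorted(merged.items())}


old = {"tlberglund": {"name": ("Timothy", 1), "bday": ("06-15", 1)}}
new = {
    "tlberglund": {"name": ("Tim", 2)},
    "other_user": {"role": ("student", 2)},  # invented row for the example
}

compacted = compact([old, new])
print(compacted)
```

The inputs are left untouched and a new table is produced, so the old and new files coexist until the merge finishes, which is one reason compaction needs extra disk space.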
  • 275. http://www.datastax.com/products/community Download this!
  • 276. Tim Berglund tlberglund@github.com @tlberglund Thank You