Bucket your partitions wisely
Cassandra Summit 2016
Markus Höfer
IT Consultant
2
Recap c* partitions
3
• A partition defines on which c* node the data resides
• Identified by the partition key
• Nodes "own" token ranges, which are directly related to partitions
• Tokens are calculated by hashing the partition key
Recap c* partitions
4
• DataStax recommendations for partitions:
  • Maximum number of rows: hundreds of thousands
  • Disk size: 100s of MB
Recap c* partitions
5
What's the problem with big partitions?
• Every request for these partitions hits the same nodes -> not scalable!
• Frequent deletes will slow down your reads or even lead to TombstoneOverwhelmingExceptions
Recap c* partitions
6
Use case "notebook"
7
Use case – Environment
(Diagram: environment overview showing three "Keyspace µ" instances)
8
• Many concurrent processes
• Scalability important
• Load peaks will happen!
Use case – Load and requirements
9
• A user (owner) can create a notebook
• An owner can create notes belonging to a notebook
• Users can fetch notes (ideally only once), not necessarily in a certain order
• Users can delete notes
Use case
Note_by_notebook
P Notebook [text]
C Title [text]
Comment [text]
Owner [text]
CreatedOn [timestamp]
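For reference, a minimal CQL sketch of this table could look like the following (the keyspace name notebook_ks is an assumption; column names follow the slide):

CREATE TABLE IF NOT EXISTS notebook_ks.note_by_notebook (
    notebook  text,        -- P: partition key
    title     text,        -- C: clustering column
    comment   text,
    owner     text,
    createdon timestamp,
    PRIMARY KEY ((notebook), title)
);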
10
First things first:
Dev: "How many notes per notebook?"
PO: "I assume a maximum of 100,000 notes"
Use case
11
Use case – Let's do the math
How many values do we store with 100,000 rows per notebook?

Nv = Nr × (Nc − Npk − Ns) + Ns

Note_by_notebook
P Notebook [text]
C Title [text]
Comment [text]
Creator [text]
CreatedOn [timestamp]

num_rows * num_regular_columns + num_static_columns = values_per_notebook
100,000 * 3 + 0 = 300,000
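The same calculation written out, with the column counts taken from the table above (5 columns in total, of which 2 belong to the primary key and 0 are static):

N_v = N_r \times (N_c - N_{pk} - N_s) + N_s = 100{,}000 \times (5 - 2 - 0) + 0 = 300{,}000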
12
Use case – Size assumptions
Note_by_notebook
P Notebook [text] 16 bytes
C Title [text] 60 bytes
Comment [text] 200 bytes
Owner [text] 16 bytes
CreatedOn [timestamp] 8 bytes
13
Use case – Let's do the math
OK, so how much data is that on disk?

St = Σi sizeOf(ck_i) + Σj sizeOf(cs_j) + Nr × (Σk sizeOf(cr_k) + Σl sizeOf(cc_l)) + 8 × Nv

Note_by_notebook
P Notebook [text]
C Title [text]
Comment [text]
Owner [text]
CreatedOn [timestamp]

sizeof(P)
+ sizeof(S)
+ num_rows * (sizeof(C) + sizeof(regular_columns))
+ 8 * num_values
= bytes_per_partition

16 bytes
+ 0 bytes
+ 100,000 * (60 bytes + 224 bytes)
+ 8 bytes * 300,000
= 30,800,016 bytes
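Written out with the size assumptions from the previous slide (Title, 60 B, is the clustering column; Comment 200 B + Owner 16 B + CreatedOn 8 B = 224 B of regular columns per row):

S_t = 16 + 0 + 100{,}000 \times (60 + 224) + 8 \times 300{,}000
    = 16 + 28{,}400{,}000 + 2{,}400{,}000
    = 30{,}800{,}016 \text{ bytes} \approx 31 \text{ MB}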
14
Use case
Dev: "31 MB for 100,000 rows on a partition"
PO: "Sorry 'bout that, but it's going to be 300,000 rows. Is that a problem?"
15
Use case – Let's do the math
How much data do we store with 300,000 rows per notebook?
Note_by_notebook
P Notebook [text]
C Title [text]
Comment [text]
Owner [text]
CreatedOn [timestamp]
92,400,016 bytes
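The same substitution with 300,000 rows (and therefore 900,000 values):

S_t = 16 + 0 + 300{,}000 \times (60 + 224) + 8 \times 900{,}000
    = 16 + 85{,}200{,}000 + 7{,}200{,}000
    = 92{,}400{,}016 \text{ bytes} \approx 92 \text{ MB}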
16
Use case
Dev: "That might be OK if we don't delete too much; it'll be around 93 MB for 300,000 rows on a partition"
PO: "Small mistake on my side... it could actually happen that someone inserts 20 million notes."
17
Use case – Let's do the math
OK, just for fun: how much data is that on disk?

St = Σi sizeOf(ck_i) + Σj sizeOf(cs_j) + Nr × (Σk sizeOf(cr_k) + Σl sizeOf(cc_l)) + 8 × Nv

Note_by_notebook
P Notebook [text]
C Title [text]
Comment [text]
Owner [text]
CreatedOn [timestamp]

sizeof(P)
+ sizeof(S)
+ num_rows * (sizeof(C) + sizeof(regular_columns))
+ 8 * num_values
= bytes_per_partition

16 bytes
+ 0 bytes
+ 20,000,000 * (60 bytes + 224 bytes)
+ 8 bytes * 60,000,000
= 6,160,000,016 bytes (≈ 6.2 GB)
18
Bucketing strategies
19
Bucketing strategies – Incrementing bucket id
Incrementing bucket "counter" based on the row count inside the partition
+ Good if the client is able to track the count
- Not very scalable
- Possibly unreliable counter
(Flow: insertNote -> bucket full? -> no: keep bucket / yes: bucket++)
notebook | bucket
n1       | 0
n1       | 1
Note_by_notebook
P Notebook [text]
P bucket [int]
C Title [text]
...
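A minimal CQL sketch of the bucketed table, assuming the client tracks the counter and increments the bucket once it is full (keyspace name is an assumption):

CREATE TABLE IF NOT EXISTS notebook_ks.note_by_notebook (
    notebook  text,
    bucket    int,         -- incremented by the client once the current bucket is considered full
    title     text,
    comment   text,
    owner     text,
    createdon timestamp,
    PRIMARY KEY ((notebook, bucket), title)   -- composite partition key: one notebook spans many partitions
);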
20
Bucketing strategies – Unique bucketing
Identify buckets using UUIDs
+ Good if clients are able to track the count
+ Scales better
- Possibly unreliable counter
- Lookup table(s) needed
(Flow: insertNote -> bucket full? -> no: keep bucket / yes: new bucket uuid2)
notebook | bucket
n1       | uuid1
n1       | uuid2
Note_by_notebook
P Notebook [text]
P bucket [uuid]
C Title [text]
...
21
Bucketing strategies – Time-based bucketing
Split partitions into discrete timeframes, e.g. a new bucket every 10 minutes
+ Number of buckets per day is defined (10-minute buckets -> 144 per day)
+ Fast on insert
- Not very scalable
Time        | notebook | bucket
0:00 – 0:10 | n1       | 0
0:10 – 0:20 | n1       | 1
0:20 – 0:30 | n1       | 2
Note_by_notebook
P Notebook [text]
P bucket [int]
C Title [text]
...
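With time-based bucketing the schema stays the same; only the bucket value is derived from the wall clock on the client. A sketch of an insert, assuming 10-minute windows numbered within the day (bucket = (hour * 60 + minute) / 10; table and keyspace names as above):

-- 02:17 falls into 10-minute window (2 * 60 + 17) / 10 = 13, so the client writes bucket 13
INSERT INTO notebook_ks.note_by_notebook (notebook, bucket, title, comment, owner, createdon)
VALUES ('n1', 13, 'note-42', 'some text', 'alice', '2016-09-22 02:17:00+0000');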
22
Bucketing strategies – Hash-based bucketing
Calculate buckets using the primary key
Example: number of buckets = 2000
hash(note key) = 9523 -> 9523 % 2000 = bucket 1523
hash(note key) = 7723 -> 7723 % 2000 = bucket 1723
notebook | bucket
n1       | 1523
n1       | 1723
+ Number of buckets is defined
+ Deterministic
+ Fast solution
- Not possible if the number of rows is unknown
Note_by_notebook (without bucket)
P Notebook [text]
C Title [text]
...
Note_by_notebook (with bucket)
P Notebook [text]
P bucket [int]
C Title [text]
...
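A sketch of a hash-based insert: the client derives the bucket from a hash of the note's key, here assumed to be the title, modulo the fixed bucket count of 2000 (hash function, keyspace and column names are assumptions):

-- bucket = abs(hash(title)) % 2000, computed on the client, e.g. 9523 % 2000 = 1523
INSERT INTO notebook_ks.note_by_notebook (notebook, bucket, title, comment, owner, createdon)
VALUES ('n1', 1523, 'note-42', 'some text', 'alice', '2016-09-22 02:17:00+0000');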
23
Bucketing strategies – Comparison
                         | Incrementing | Time based | Unique | Hash based
Unknown amount of notes  |      -       |     +      |   +    |     -
Scalable                 |      -       |     -      |   +    |     -
No lookup tables needed  |      -       |     -      |   -    |     +
Fast for writing         |      +       |     +      |   +    |     +
Amount of buckets known  |      -       |     +      |   -    |     +
24
Datamodel "notebook"
25
Datamodel – Unique bucketing

note_by_notebook
P Notebook [text]
P Bucket [timeuuid]
C Title [text]
Comment [text]
Creator [text]
CreatedOn [timestamp]

notebook_partitions_by_name
P Notebook [text]
C Bucket [timeuuid]

notebook_partitions_by_note
P Notebook [text]
P Note_title [text]
Bucket [timeuuid]
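A minimal CQL sketch of these three tables (keyspace name is an assumption; the comments describe the lookup role suggested by the slide):

CREATE TABLE IF NOT EXISTS notebook_ks.note_by_notebook (
    notebook  text,
    bucket    timeuuid,
    title     text,
    comment   text,
    creator   text,
    createdon timestamp,
    PRIMARY KEY ((notebook, bucket), title)
);

CREATE TABLE IF NOT EXISTS notebook_ks.notebook_partitions_by_name (
    notebook  text,
    bucket    timeuuid,
    PRIMARY KEY ((notebook), bucket)          -- lookup: which buckets exist for a notebook
);

CREATE TABLE IF NOT EXISTS notebook_ks.notebook_partitions_by_note (
    notebook   text,
    note_title text,
    bucket     timeuuid,                      -- lookup: which bucket holds a given note
    PRIMARY KEY ((notebook, note_title))
);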
Problems:
● How to make sure partitions don't grow too big?
● How to make sure notes are not picked twice?
26
How to make sure partitions don't grow too big?
● Client-side caching for writing
● A client instance "owns" a partition for a distinct amount of time
● It creates a new partition after this time
Datamodel
27
How to make sure notes are not picked twice?
● Fetch the whole partition, not only one note
● A partition is "owned" by one client instance for a certain amount of time
● After that time it can be fetched again
Datamodel
28
Conclusion
● Scalable
● Partition sizes of around 1,000 notes per notebook
● Fast writes
● Fast enough reads
Datamodel
29
Lessons learned
30
Lessons learned
• Annoy your PO!
• Be sure about your data model before going to production!
• Do the math!
• Be aware of the problems caused by oversized partitions and tombstones!
• Delete partitions, not rows, when possible!
31
Questions?
Markus Höfer
IT Consultant
markus.hoefer@codecentric.de
www.codecentric.de
blog.codecentric.de/en
HashtagMarkus

Editor's Notes

• #5: Cassandra 2.0 -> 100,000 rows and < 100 MB; now 2.x
• #22: 10 minutes = 144 buckets per day
• #24: Which strategy fits best to our requirements?