Scalable eCommerce Platform Solutions
Apache Cassandra
high level overview and lessons learned
2/14/14
Highlights
• Distributed column-family database
• No single point of failure (SPOF)
• Decentralized
• Data is both partitioned and replicated
• Optimized for high write throughput
• Availability vs consistency (A vs C in CAP) is tunable per query
• Staged event-driven architecture (SEDA)
Partitioning (Consistent Hashing)
Replication (RF 3)
Adding New Node
Partitioning (MOD N)
3 nodes:
Node 1 | Node 2 | Node 3
     0 |      1 |      2
     3 |      4 |      5
     6 |      7 |      8

4 nodes:
Node 1 | Node 2 | Node 3 | Node 4
     0 |      1 |      2 |      3
     4 |      5 |      6 |      7
     8 |        |        |
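The two tables above show keys 0 to 8 placed by key mod N before and after a fourth node joins. A minimal sketch counting how many keys change owner under MOD N, compared with a toy consistent-hashing ring. The function names, the node names, and the MD5-based token function are illustrative assumptions, not Cassandra code.

```python
import bisect
import hashlib


def mod_n_owner(key: int, nodes: int) -> int:
    """MOD N placement: the owner is simply key % nodes."""
    return key % nodes


def moved_with_mod_n(keys, old_nodes: int, new_nodes: int) -> int:
    """Count keys whose owner changes when the node count changes."""
    return sum(1 for k in keys if mod_n_owner(k, old_nodes) != mod_n_owner(k, new_nodes))


def token(value: str) -> int:
    """Toy token function (MD5 is an assumption made for illustration)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def build_ring(node_names) -> list[tuple[int, str]]:
    """One token per node, sorted by token."""
    return sorted((token(name), name) for name in node_names)


def ring_owner(key: int, ring: list[tuple[int, str]]) -> str:
    """First node whose token is >= the key's token, wrapping around the ring."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token(str(key))) % len(ring)
    return ring[i][1]


keys = range(9)
moved_mod = moved_with_mod_n(keys, 3, 4)

old_ring = build_ring(["node1", "node2", "node3"])
new_ring = build_ring(["node1", "node2", "node3", "node4"])
moved_ring = [k for k in keys if ring_owner(k, old_ring) != ring_owner(k, new_ring)]

print(f"MOD N: {moved_mod} of 9 keys move")
print(f"Consistent hashing: {len(moved_ring)} of 9 keys move, all to the new node")
```

With MOD N every key from 3 to 8 changes owner. On the ring, the only keys that move are the ones the new node takes over from its successor.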
Virtual Nodes
• Going from one token and range per node to
many tokens per node
• No manual assignments of tokens to nodes
• Load is evenly distributed when a node joins
and leaves cluster
• Improves the use of heterogeneous machines in
a cluster
Key Data Distribution Components
• Partitioner calculates a token from the row key
(determines where to place the first replica of a row)
• Replication Strategy determines the total number of
replicas and where to place them
• Snitch defines the network topology, such as the location
of nodes, grouping them by racks and data
centers. Used by
• Replication Strategy
• Routing Requests (+Dynamic Snitch)
Write Requests
• A coordinator node sends a write request to all
replicas regardless of Consistency Level (CL)
• It acknowledges request when CL is satisfied
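A minimal sketch of the rule above: the write goes to every replica, and the request succeeds once the number of acknowledgements reaches what the consistency level requires. The `Replica` class and function names are hypothetical, and the sketch is synchronous, whereas a real coordinator waits for responses asynchronously.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    up: bool = True

    def write(self, key: str, value: str) -> bool:
        """Return True if this replica acknowledged the write."""
        return self.up


def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Acknowledgements needed for a few common consistency levels."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def coordinate_write(replicas, key, value, consistency_level) -> bool:
    """Send the write to all replicas; succeed if enough of them acknowledge."""
    acks = sum(1 for r in replicas if r.write(key, value))
    return acks >= required_acks(consistency_level, len(replicas))


replicas = [Replica("a"), Replica("b", up=False), Replica("c")]
print(coordinate_write(replicas, "k", "v", "QUORUM"))  # True: 2 of 3 replicas acknowledged
print(coordinate_write(replicas, "k", "v", "ALL"))     # False: one replica is down
```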
Read Requests - Optimistic Flow
• A coordinator node sends direct read requests to
CL number of fastest replicas (Dynamic Snitch)
• 1 request for full read
• CL - 1 requests for digest reads
• If there is a match it is returned to client
• Background read repair requests are sent to
other owners of that row based on read repair
chance
Read Requests - Mismatch Case
• If there is a mismatch a coordinator node sends
direct full read requests to CL number of those
replicas
• Most recent copy returned to client
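The optimistic flow and the mismatch case can be sketched together. This is a toy model under stated assumptions: the `Versioned` type, the MD5 digest, and the function names are illustrative, and background read repair is left out.

```python
import hashlib
from typing import NamedTuple


class Versioned(NamedTuple):
    value: str
    timestamp: int


def digest(v: Versioned) -> str:
    """Digest of a replica's answer (MD5 chosen only for illustration)."""
    return hashlib.md5(f"{v.value}:{v.timestamp}".encode()).hexdigest()


def coordinate_read(replica_data: list[Versioned]) -> Versioned:
    """replica_data holds the answers of the CL replicas chosen for this read."""
    full = replica_data[0]                            # 1 full read
    digests = [digest(v) for v in replica_data[1:]]   # CL - 1 digest reads
    if all(d == digest(full) for d in digests):
        return full                                   # digests match: return the full read
    # Mismatch: compare full copies and return the most recent one.
    return max(replica_data, key=lambda v: v.timestamp)


print(coordinate_read([Versioned("old", 1), Versioned("new", 2)]))  # Versioned(value='new', timestamp=2)
```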
Write Path
• Memtable is flushed to disk when the memtable size threshold, commit log size
threshold, or heap utilization threshold is reached
• Never random disk IO or modification in place
• Compaction is in background
• A delete just marks a column with a tombstone
• commit log contains
all mutations
• memtable keeps
track of latest version
of data
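A toy model of the write path described above, assuming a memtable that flushes after a fixed number of distinct keys. The class, its fields, and the flush rule are hypothetical simplifications; real thresholds are size-based.

```python
class ToyTable:
    """Toy write path: commit log -> memtable -> immutable SSTables."""

    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []   # append-only record of every mutation
        self.memtable = {}     # latest version per key, held in memory
        self.sstables = []     # flushed snapshots, never modified in place
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[1]:
            self.memtable[key] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key, timestamp):
        # A delete is just a write of a tombstone marker (None here).
        self.write(key, None, timestamp)

    def flush(self):
        # One sequential write of a sorted snapshot; existing SSTables are untouched.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}


t = ToyTable()
t.write("a", "1", 1)
t.write("a", "2", 2)   # memtable keeps only the latest version of "a"
t.write("b", "x", 3)   # second distinct key reaches the limit and triggers a flush
t.delete("a", 4)       # tombstone lands in the new memtable
print(len(t.commit_log), t.sstables, t.memtable)
```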
Read Path
• Each SSTable is read, results are combined with unflushed memtable(s), latest version
returned
• KeyCache is fixed size and shared among all tables
• are stored off heap (v1.2.X)
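The merge step of the read path can be sketched as follows, assuming each SSTable and the memtable map a key to a (value, timestamp) pair. The function and data layout are hypothetical, and the sketch ignores the caches and filters that let a real node skip SSTables.

```python
def read(key, memtable, sstables):
    """Combine every SSTable with the memtable and return the latest version."""
    candidates = [s[key] for s in sstables if key in s]
    if key in memtable:
        candidates.append(memtable[key])
    if not candidates:
        return None
    value, _timestamp = max(candidates, key=lambda c: c[1])
    return value  # None here also means the latest version is a tombstone


sstables = [{"a": ("1", 1)}, {"a": ("2", 5)}]
print(read("a", {"a": ("3", 3)}, sstables))    # 2: the newest timestamp wins
print(read("a", {"a": (None, 9)}, sstables))   # None: a newer tombstone hides older values
```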
ACID
• Atomicity
• a write is atomic at the row-level
• doesn’t roll back if a write fails on some replicas
• Consistency
• tunable through CL requirements (C vs A)
• Strong Consistency W + R > N
• Isolation
• row-level
• Durability
• yes, but
• commit log is fsynced every 10 seconds by default
• Lightweight transactions in Cassandra 2.0
• For INSERT, UPDATE statements
• using IF clause
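The strong consistency condition W + R > N from the list above can be checked directly. A small sketch with hypothetical helper names, where N is the replication factor:

```python
def is_strongly_consistent(write_replicas: int, read_replicas: int, replication_factor: int) -> bool:
    """W + R > N means every read set overlaps every write set in at least one replica."""
    return write_replicas + read_replicas > replication_factor


def quorum(replication_factor: int) -> int:
    """Majority of the replicas."""
    return replication_factor // 2 + 1


rf = 3
print(is_strongly_consistent(quorum(rf), quorum(rf), rf))  # True: QUORUM writes + QUORUM reads
print(is_strongly_consistent(1, 1, rf))                    # False: ONE + ONE
print(is_strongly_consistent(rf, 1, rf))                   # True: ALL writes + ONE reads
```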
Built-in Repair Tools
• Hinted handoff
• does not count towards CL requirement
• if CL.ANY is used, not readable until at least
one normal owner is recovered
• Read repair
• Anti-entropy node repair
Data Modeling
• Read by partition key
• Reduce number of reads
• aggregate data used together in a single row
• even at expense of number of writes to
duplicate some data
• Writes should not depend on reads
• Keep metadata overhead low
CQL3 Overview
• It looks like SQL
• Compound keys
• Standard data types are built-in
• Collection type
• Asynchronous queries
• Tracing of queries
• … and more
Simple Row / CQL3
CREATE TABLE simple_table (
my_key int PRIMARY KEY,
my_field_1 text,
my_field_2 boolean
);
INSERT INTO simple_table (my_key, my_field_1, my_field_2) VALUES ( 1, 'my value 1', false);
INSERT INTO simple_table (my_key, my_field_1, my_field_2) VALUES ( 2, 'my value 2', true);
SELECT * FROM simple_table ;
my_key | my_field_1 | my_field_2
--------+------------+------------
1 | my value 1 | False
2 | my value 2 | True
Simple Row / Internal
[default@test] list simple_table;
-------------------
RowKey: 1
=> (name=, value=, timestamp=1395180822477000)
=> (name=my_field_1, value=6d792076616c75652031, timestamp=1395180822477000)
=> (name=my_field_2, value=00, timestamp=1395180822477000)
-------------------
RowKey: 2
=> (name=, value=, timestamp=1395180822480000)
=> (name=my_field_1, value=6d792076616c75652032, timestamp=1395180822480000)
=> (name=my_field_2, value=01, timestamp=1395180822480000)
1. A column name (size is proportional to column name length) and a timestamp are stored for each column
2. There is an additional “empty” column per row
Compound Key / CQL3
CREATE TABLE compound_key_table (
my_part_key int,
my_clust_key text,
my_field int,
PRIMARY KEY (my_part_key, my_clust_key)
);
INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES ( 1, 'my value 2', 2);
INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES ( 1, 'my value 1', 1);
INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES ( 1, 'my value 3', 3);
SELECT * FROM compound_key_table ;
my_part_key | my_clust_key | my_field
-------------+--------------+----------
1 | my value 1 | 1
1 | my value 2 | 2
1 | my value 3 | 3
Compound Key / Internal
[default@test] list compound_key_table;
-------------------
RowKey: 1
=> (name=my value 1:, value=, timestamp=1395192704575000)
=> (name=my value 1:my_field, value=00000001, timestamp=1395192704575000)
=> (name=my value 2:, value=, timestamp=1395192704572000)
=> (name=my value 2:my_field, value=00000002, timestamp=1395192704572000)
=> (name=my value 3:, value=, timestamp=1395192704577000)
=> (name=my value 3:my_field, value=00000003, timestamp=1395192704577000)
1. All three CQL3 rows are in the same physical row, thus a single read operation can read all of them
2. They can still be read or updated individually (need to know the PK - use a lookup table)
3. The value of ‘my_clust_key’ is joined with the ‘my_field’ column name and becomes the name of the column that stores my_field’s value
4. The value of ‘my_clust_key’ doesn’t have an associated timestamp, since it is part of the PK
5. The CQL3 rows are sorted by the value of ‘my_clust_key’, which can be used in a ‘where’ clause
6. There is an additional “empty” column per CQL3 row
7. PK column names are hidden in system.schema_columnfamilies
Collection Type / CQL3
CREATE TABLE collection_type_table (
my_key int PRIMARY KEY,
my_set set<int>,
my_map map<int, int>,
my_list list<int>,
);
INSERT INTO collection_type_table (my_key, my_set, my_map, my_list)
VALUES ( 1, {1, 2}, {1:2, 3:4}, [1, 2]);
SELECT * FROM collection_type_table ;
my_key | my_list | my_map | my_set
--------+---------+--------------+--------
1 | [1, 2] | {1: 2, 3: 4} | {1, 2}
Collection Type / Internal
[default@test] list collection_type_table;
-------------------
RowKey: 1
=> (name=, value=, timestamp=1395253516706000)
=> (name=my_list:d1da8820af9311e38f4e97aee9b28d0c, value=00000001, timestamp=1395253516706000)
=> (name=my_list:d1da8821af9311e38f4e97aee9b28d0c, value=00000002, timestamp=1395253516706000)
=> (name=my_map:00000001, value=00000002, timestamp=1395253516706000)
=> (name=my_map:00000003, value=00000004, timestamp=1395253516706000)
=> (name=my_set:00000001, value=, timestamp=1395253516706000)
=> (name=my_set:00000002, value=, timestamp=1395253516706000)
1. Each element of each collection gets its own column
2. Each element of List type additionally consumes 16 bytes to maintain order of elements
3. Map key goes to column name
4. Set value goes to column name
Column Overhead
• name : 2 bytes (length as short int) + byte[]
• flags : 1 byte
• if counter column : 8 bytes (timestamp of last
delete)
• if expiring column : 4 bytes (TTL) + 4 bytes
(local deletion time)
• timestamp : 8 bytes (long)
• value : 4 bytes (len as int) + byte[]
http://btoddb-cass-storage.blogspot.ru/2011/07/column-overhead-and-sizing-every-column.html
Metadata Overhead
• Simple case (no TTL, not a counter column):
• regular_column_size = column_name_size +
column_value_size + 15 bytes
• a row has 23 bytes of overhead
• A column with name “my_column” of type int stores
your 4 bytes and incurs 24 bytes of overhead
• Keep in mind when internal columns created for CQL3
structures like Compound Keys or Collection Types
• Keep in mind when column value is used as column
name for many other columns
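The arithmetic above can be reproduced with a short sketch. The 15 fixed bytes are the sum of the per-column fields from the Column Overhead slide (2 + 1 + 8 + 4); the function names are illustrative.

```python
COLUMN_FIXED_OVERHEAD = 15  # 2 (name length) + 1 (flags) + 8 (timestamp) + 4 (value length)


def regular_column_size(name: str, value_size: int) -> int:
    """Size in bytes of a regular column (no TTL, not a counter)."""
    return len(name.encode("utf-8")) + value_size + COLUMN_FIXED_OVERHEAD


def column_overhead(name: str) -> int:
    """Bytes stored in addition to the value itself."""
    return regular_column_size(name, 0)


print(column_overhead("my_column"))         # 24: 9 bytes of name + 15 fixed bytes
print(regular_column_size("my_column", 4))  # 28: a 4-byte int plus 24 bytes of overhead
```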
JSON vs Separate Columns
• Drastically reduces metadata overhead
• A column with name “my_column” of type
text which stores your 1 kB JSON
object and incurs 24 bytes of overhead
sounds much better!
• Saves CPU cycles and reduces read latency
• Supports complex hierarchical structures
• But it loses in partial reads / updates and
complicates schema versioning
Use Case 1: Products and Upcs
CREATE TABLE product (
pid int,
upc int,
value text,
rstat text,
PRIMARY KEY(pid, upc)
);
pid | upc | rstat | value
-----+-----+---------------------+---------------------
123 | 0 | Reviews JSON Object | Product JSON Object
123 | 456 | null | Upc JSON Object
123 | 789 | null | Upc JSON Object
Use Case 2: Availability
CREATE TABLE online_inventory (
pid int, upc int, available boolean,
PRIMARY KEY (pid, upc)
);
INSERT INTO online_inventory (pid, upc, available)
VALUES ( 123, 456, true) USING TIMESTAMP 5;
INSERT INTO online_inventory (pid, upc, available)
VALUES ( 123, 456, false) USING TIMESTAMP 4;
!
pid | upc | available | writetime(available)
-----+-----+-----------+----------------------
123 | 456 | True | 5
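The result above follows from timestamp-based conflict resolution: the write with the higher timestamp wins even if it arrives first. A minimal sketch of that rule, with a hypothetical `apply_write` helper and ties ignored:

```python
def apply_write(cell, value, timestamp):
    """Keep the write with the highest timestamp, whatever the arrival order.

    cell is None or a (value, timestamp) tuple.
    """
    if cell is None or timestamp > cell[1]:
        return (value, timestamp)
    return cell


cell = None
cell = apply_write(cell, True, 5)    # USING TIMESTAMP 5
cell = apply_write(cell, False, 4)   # arrives later but carries an older timestamp
print(cell)  # (True, 5)
```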
Use Case 3: Product Pagination
CREATE TABLE product_pagination (
filter text,
pid int,
PRIMARY KEY (filter, pid)
);
INSERT INTO product_pagination (filter, pid ) VALUES ( 'ACTIVE', 45);
INSERT INTO product_pagination (filter, pid ) VALUES ( 'ACTIVE', 25);
INSERT INTO product_pagination (filter, pid ) VALUES ( 'ACTIVE', 75);
INSERT INTO product_pagination (filter, pid ) VALUES ( 'ACTIVE', 15);
SELECT * FROM product_pagination where filter = 'ACTIVE' and pid > 15 limit 2 ;
filter | pid
--------+-----
ACTIVE | 25
ACTIVE | 45
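The query above works because rows inside a partition are kept sorted by the clustering key, so "pid greater than the last one seen, limit N" yields the next page. A sketch of the same paging logic over an in-memory sorted list; the `next_page` helper is hypothetical.

```python
import bisect


def next_page(sorted_pids: list[int], last_seen_pid: int, page_size: int) -> list[int]:
    """Equivalent of: WHERE filter = ? AND pid > last_seen_pid LIMIT page_size."""
    start = bisect.bisect_right(sorted_pids, last_seen_pid)
    return sorted_pids[start:start + page_size]


partition = sorted([45, 25, 75, 15])       # clustering order inside the 'ACTIVE' partition
page = next_page(partition, 15, 2)
print(page)                                # [25, 45]
print(next_page(partition, page[-1], 2))   # [75]
```

Passing the last pid of one page as the lower bound of the next avoids re-reading rows that were already returned.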
DataStax Java Driver
• Flexible load balancing policies
• includes token aware load balancing
• Connection pooling
• Flexible retry policy
• can retry on other nodes
• or reduce CL requirement
• Non-blocking I/O
• up to 128 simultaneous requests per connection
• asynchronous API
• Node discovery
Multi-gets
• When you have N keys and want to read them all
• Built-in token-aware load balancer evaluates the first
key and sends all N keys to that node! oops…
• We preferred sending N fine-grained single-get queries in
async mode
• retries only those which failed
• can return partial result
• smart route for each key
• We tried multi-get-aware token-aware load balancer
• worked worse
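The preferred approach, N single-key reads issued concurrently with retries only for the failed keys and a partial result on the way out, can be sketched with asyncio. This is not the DataStax driver API: `fetch_one` is a hypothetical coroutine standing in for a single-key driver call, and the retry count is an assumption.

```python
import asyncio


async def multi_get(keys, fetch_one, max_attempts: int = 2):
    """Issue one single-key read per key concurrently; retry only the failures.

    Returns (results, failed_keys), so a partial result is still usable.
    """
    results, pending = {}, list(keys)
    for _ in range(max_attempts):
        if not pending:
            break
        outcomes = await asyncio.gather(
            *(fetch_one(k) for k in pending), return_exceptions=True
        )
        still_failing = []
        for key, outcome in zip(pending, outcomes):
            if isinstance(outcome, Exception):
                still_failing.append(key)
            else:
                results[key] = outcome
        pending = still_failing
    return results, pending


async def demo():
    attempts = {}

    async def flaky_fetch(key):
        attempts[key] = attempts.get(key, 0) + 1
        if key == "b" and attempts[key] == 1:
            raise TimeoutError("simulated timeout")        # first try for "b" fails
        if key == "c":
            raise TimeoutError("simulated permanent failure")
        return f"value-{key}"

    return await multi_get(["a", "b", "c"], flaky_fetch)


results, failed = asyncio.run(demo())
print(results, failed)
```

Because each key is its own request, a token-aware policy can route every read to a replica that owns that key.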
Data Loader
• partitions the whole
data set (MOD N)
• sorts all result sets by
product id
• accumulates assembled
products and executes
batch write to C*
• single connection per
reader thread
Cassandra 1.2.X Known Issues
OOM #1
• select count (*) from product limit 75000000;
• wait for timeout
• hmm, try again (arrow up, enter)
• select count (*) from product limit 75000000;
• wait for timeout
• again
OOM #2
• Try the following in production and get
permanent vacation
• truncate, drop, create table
• load data there
• start light read load
• Up to all C* nodes can get OOM simultaneously
• That is called high availability!
DROP/CREATE without TRUNCATE
• SSTable files are still on disk after DROP
• CREATE triggers reading of the files
• and C* fails…
 

Cooking Cassandra

  • 1. Scalable eCommerce Platform Solutions: Apache Cassandra, high-level overview and lessons learned
  • 2. Apache Cassandra (2/14/14)
  • 3. Highlights
      • Distributed column-family database
      • No single point of failure (SPOF)
          • decentralized
          • data is both partitioned and replicated
      • Optimized for high write throughput
      • Availability vs. consistency (A vs. C in CAP) is tunable at query time
      • Staged event-driven architecture (SEDA)
  • 4. Partitioning (Consistent Hashing)
  • 5. Replication (RF 3)
  • 6. Adding New Node
  • 7. Partitioning (MOD N)
      Three nodes:
       Node 1 | Node 2 | Node 3
      --------+--------+--------
            0 |      1 |      2
            3 |      4 |      5
            6 |      7 |      8
      After adding a fourth node:
       Node 1 | Node 2 | Node 3 | Node 4
      --------+--------+--------+--------
            0 |      1 |      2 |      3
            4 |      5 |      6 |      7
            8 |        |        |
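The contrast between MOD N placement and consistent hashing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Ring` class, the node names, and the use of MD5 as the token function are hypothetical stand-ins and not Cassandra's actual partitioner, and each node gets a single token (no virtual nodes).

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a ring position (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy ring: one token per node; a key belongs to the first node token
    at or after the key's own token, wrapping around at the end."""

    def __init__(self, nodes):
        self._entries = sorted((token(name), name) for name in nodes)

    def owner(self, key: str) -> str:
        idx = bisect.bisect_left(self._entries, (token(key), ""))
        return self._entries[idx % len(self._entries)][1]


keys = list(range(9))

# MOD N: a key changes owner whenever key % 3 differs from key % 4.
moved_mod_n = [k for k in keys if k % 3 != k % 4]

# Consistent hashing: adding node4 can only move keys onto node4.
before = Ring(["node1", "node2", "node3"])
after = Ring(["node1", "node2", "node3", "node4"])
moved_ring = {
    k: after.owner(str(k))
    for k in keys
    if before.owner(str(k)) != after.owner(str(k))
}

print("MOD N moved keys:", moved_mod_n)  # [3, 4, 5, 6, 7, 8]
print("Ring moved keys and new owners:", moved_ring)
```

With MOD N, six of the nine keys from the slide change owner when the fourth node joins. On the ring, existing nodes never exchange keys with each other; any key that moves lands on the new node.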
  • 8. Virtual Nodes
      • Going from one token and range per node to many tokens per node
      • No manual assignment of tokens to nodes
      • Load is evenly distributed when a node joins or leaves the cluster
      • Improves the use of heterogeneous machines in a cluster
  • 9. Key Data Distribution Components
      • Partitioner calculates a token from the row key (determines where to place the first replica of a row)
      • Replication strategy determines the total number of replicas and where to place them
      • Snitch defines the network topology, such as the location of nodes, grouping them by racks and data centers. Used by
          • Replication strategy
          • Request routing (+ dynamic snitch)
  • 10. Write Requests
      • A coordinator node sends a write request to all replicas regardless of Consistency Level (CL)
      • It acknowledges the request once the CL is satisfied
  • 11. Read Requests - Optimistic Flow
      • A coordinator node sends direct read requests to the CL fastest replicas (as ranked by the dynamic snitch)
          • 1 request for a full read
          • CL - 1 requests for digest reads
      • If the digests match the full read, the result is returned to the client
      • Background read repair requests are sent to the other owners of that row based on read repair chance
  • 12. Read Requests - Mismatch Case
      • If there is a mismatch, the coordinator node sends direct full read requests to the same CL replicas
      • The most recent copy is returned to the client
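The two read flows above can be sketched as a small Python function. This is a simplified model, not Cassandra's implementation: `Cell`, `digest`, and `coordinate_read` are hypothetical names, the replica list is assumed to be pre-sorted fastest-first (the role the dynamic snitch plays), and background read repair is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int


def digest(cell: Cell) -> str:
    """Stand-in for a digest read: a hash of the data instead of the data."""
    return hashlib.md5(f"{cell.value}:{cell.timestamp}".encode("utf-8")).hexdigest()


def coordinate_read(replicas: List[Cell], cl: int) -> Cell:
    """`replicas` is assumed to be ordered fastest-first."""
    chosen = replicas[:cl]
    full = chosen[0]                            # 1 full read
    digests = [digest(c) for c in chosen[1:]]   # CL - 1 digest reads
    if all(d == digest(full) for d in digests):
        return full                             # optimistic flow: match
    # Mismatch: full reads from the same replicas, most recent copy wins.
    return max(chosen, key=lambda c: c.timestamp)


stale = Cell("v1", 1)
fresh = Cell("v2", 2)
print(coordinate_read([stale, fresh, fresh], cl=2))  # mismatch -> fresh copy
print(coordinate_read([stale, fresh, fresh], cl=1))  # CL ONE -> stale copy
```

The second call shows why a low read CL can return stale data: only the fastest replica is consulted, so there is nothing to compare against.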
  • 13. Write Path
      • A flush to disk happens when the memtable size threshold, the commit log size threshold, or the heap utilization threshold is reached
      • Never random disk I/O or modification in place
      • Compaction runs in the background
      • A delete just marks a column with a tombstone
      • The commit log contains all mutations
      • The memtable keeps track of the latest version of the data
  • 14. Read Path
      • Each SSTable is read, results are combined with unflushed memtable(s), latest version returned
      • KeyCache is fixed size and shared among all tables
      • … are stored off heap (v1.2.X)
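The write and read paths described on the two slides above can be illustrated with a toy in-memory model. Everything here is an assumption made for the sketch: `ToyTable`, its `memtable_limit` flush trigger (a simple key count rather than the size and heap thresholds the slide lists), and the dictionaries standing in for SSTables. Compaction, caches, and Bloom filters are not modeled.

```python
TOMBSTONE = object()  # marker written by deletes


class ToyTable:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every mutation
        self.memtable = {}     # key -> (value, timestamp), latest version only
        self.sstables = []     # flushed, never modified afterwards
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[1]:
            self.memtable[key] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key, timestamp):
        # A delete is just another write that stores a tombstone.
        self.write(key, TOMBSTONE, timestamp)

    def flush(self):
        # Sequential write of a new "SSTable"; nothing is updated in place.
        self.sstables.append(dict(self.memtable))
        self.memtable = {}

    def read(self, key):
        # Consult every SSTable plus the unflushed memtable, newest version wins.
        candidates = [t[key] for t in self.sstables if key in t]
        if key in self.memtable:
            candidates.append(self.memtable[key])
        if not candidates:
            return None
        value, _ = max(candidates, key=lambda c: c[1])
        return None if value is TOMBSTONE else value


table = ToyTable(memtable_limit=2)
table.write("a", "1", timestamp=1)
table.write("b", "2", timestamp=2)   # second key triggers a flush
table.write("a", "3", timestamp=3)   # newer version lives in the memtable
table.delete("b", timestamp=4)       # tombstone; triggers another flush
print(table.read("a"), table.read("b"), len(table.sstables))
```

Note how the old value of `b` is still present in the first flushed table after the delete; only the tombstone's newer timestamp hides it at read time.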
  • 15. ACID
      • Atomicity
          • a write is atomic at the row level
          • doesn't roll back if a write fails on some replicas
      • Consistency
          • tunable through CL requirements (C vs A)
          • strong consistency when W + R > N
      • Isolation
          • row-level
      • Durability
          • yes, but
          • commit log fsync every 10 seconds by default
      • Lightweight transactions in Cassandra 2.0
          • for INSERT and UPDATE statements
          • using the IF clause
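The W + R > N rule can be checked with simple arithmetic, where W and R are the number of replicas that must acknowledge a write and a read, and N is the replication factor. The helper names below are hypothetical and exist only for this sketch.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas in a QUORUM: a strict majority."""
    return replication_factor // 2 + 1


def is_strongly_consistent(w: int, r: int, n: int) -> bool:
    """True when every read set must overlap every write set."""
    return w + r > n


n = 3  # RF 3, as in the replication slide
print(is_strongly_consistent(quorum(n), quorum(n), n))  # QUORUM + QUORUM
print(is_strongly_consistent(1, 1, n))                  # ONE + ONE
print(is_strongly_consistent(n, 1, n))                  # ALL + ONE
```

With RF 3, QUORUM writes plus QUORUM reads give 2 + 2 > 3, so at least one replica in every read has seen the latest acknowledged write. ONE plus ONE gives 1 + 1 = 2, which does not exceed 3.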
  • 16. Built-in Repair Tools
      • Hinted handoff
          • does not count towards the CL requirement
          • if CL.ANY is used, the write is not readable until at least one normal owner has recovered
      • Read repair
      • Anti-entropy node repair
  • 17. Data Modeling
  • 18. Data Modeling
      • Read by partition key
      • Reduce the number of reads
          • aggregate data that is used together in a single row
          • even at the expense of extra writes to duplicate some data
      • Writes should not depend on reads
      • Keep metadata overhead low
  • 19. CQL3 Overview
      • It looks like SQL
      • Compound keys
      • Standard data types are built-in
      • Collection types
      • Asynchronous queries
      • Tracing of queries
      • … and more
  • 20. Simple Row / CQL3
      CREATE TABLE simple_table (
          my_key int PRIMARY KEY,
          my_field_1 text,
          my_field_2 boolean
      );

      INSERT INTO simple_table (my_key, my_field_1, my_field_2) VALUES (1, 'my value 1', false);
      INSERT INTO simple_table (my_key, my_field_1, my_field_2) VALUES (2, 'my value 2', true);

      SELECT * FROM simple_table;

       my_key | my_field_1 | my_field_2
      --------+------------+------------
            1 | my value 1 |      False
            2 | my value 2 |       True
  • 21. Simple Row / Internal
      [default@test] list simple_table;
      -------------------
      RowKey: 1
      => (name=, value=, timestamp=1395180822477000)
      => (name=my_field_1, value=6d792076616c75652031, timestamp=1395180822477000)
      => (name=my_field_2, value=00, timestamp=1395180822477000)
      -------------------
      RowKey: 2
      => (name=, value=, timestamp=1395180822480000)
      => (name=my_field_1, value=6d792076616c75652032, timestamp=1395180822480000)
      => (name=my_field_2, value=01, timestamp=1395180822480000)

      1. A column name (its size is proportional to the column name length) and a timestamp are stored for each column
      2. There is an additional “empty” column per row
  • 22. Compound Key / CQL3
      CREATE TABLE compound_key_table (
          my_part_key int,
          my_clust_key text,
          my_field int,
          PRIMARY KEY (my_part_key, my_clust_key)
      );

      INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES (1, 'my value 2', 2);
      INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES (1, 'my value 1', 1);
      INSERT INTO compound_key_table (my_part_key, my_clust_key, my_field) VALUES (1, 'my value 3', 3);

      SELECT * FROM compound_key_table;

       my_part_key | my_clust_key | my_field
      -------------+--------------+----------
                 1 |   my value 1 |        1
                 1 |   my value 2 |        2
                 1 |   my value 3 |        3
  • 23. Compound Key / Internal
      [default@test] list compound_key_table;
      -------------------
      RowKey: 1
      => (name=my value 1:, value=, timestamp=1395192704575000)
      => (name=my value 1:my_field, value=00000001, timestamp=1395192704575000)
      => (name=my value 2:, value=, timestamp=1395192704572000)
      => (name=my value 2:my_field, value=00000002, timestamp=1395192704572000)
      => (name=my value 3:, value=, timestamp=1395192704577000)
      => (name=my value 3:my_field, value=00000003, timestamp=1395192704577000)

      1. All CQL3 rows with the same partition key are stored in the same physical row, so a single read operation can read all of them
      2. They can still be read or updated individually (you need to know the PK; use a lookup table)
      3. The value of ‘my_clust_key’ is joined with the ‘my_field’ column name to form the internal column name under which my_field’s value is stored
      4. The value of ‘my_clust_key’ has no associated timestamp, since it is part of the PK
      5. The CQL3 rows are sorted by the value of ‘my_clust_key’, which can be used in a ‘where’ clause
      6. There is an additional “empty” column per CQL3 row
      7. PK column names are hidden in system.schema_columnfamilies
  • 24. Collection Type / CQL3
      CREATE TABLE collection_type_table (
          my_key int PRIMARY KEY,
          my_set set<int>,
          my_map map<int, int>,
          my_list list<int>
      );

      INSERT INTO collection_type_table (my_key, my_set, my_map, my_list) VALUES (1, {1, 2}, {1:2, 3:4}, [1, 2]);

      SELECT * FROM collection_type_table;

       my_key | my_list | my_map       | my_set
      --------+---------+--------------+--------
            1 |  [1, 2] | {1: 2, 3: 4} | {1, 2}
  • 25. Collection Type / Internal
      [default@test] list collection_type_table;
      -------------------
      RowKey: 1
      => (name=, value=, timestamp=1395253516706000)
      => (name=my_list:d1da8820af9311e38f4e97aee9b28d0c, value=00000001, timestamp=1395253516706000)
      => (name=my_list:d1da8821af9311e38f4e97aee9b28d0c, value=00000002, timestamp=1395253516706000)
      => (name=my_map:00000001, value=00000002, timestamp=1395253516706000)
      => (name=my_map:00000003, value=00000004, timestamp=1395253516706000)
      => (name=my_set:00000001, value=, timestamp=1395253516706000)
      => (name=my_set:00000002, value=, timestamp=1395253516706000)

      1. Each element of each collection gets its own column
      2. Each element of a List additionally consumes 16 bytes to maintain the order of elements
      3. A Map key goes into the column name
      4. A Set value goes into the column name
  • 26. Column Overhead
      • name: 2 bytes (length as short int) + byte[]
      • flags: 1 byte
          • if counter column: 8 bytes (timestamp of last delete)
          • if expiring column: 4 bytes (TTL) + 4 bytes (local deletion time)
      • timestamp: 8 bytes (long)
      • value: 4 bytes (length as int) + byte[]
      Source: http://btoddb-cass-storage.blogspot.ru/2011/07/column-overhead-and-sizing-every-column.html
  • 27. Metadata Overhead
      • Simple case (neither an expiring (TTL) column nor a counter column):
          • regular_column_size = column_name_size + column_value_size + 15 bytes
          • a row has 23 bytes of overhead
      • A column named “my_column” of type int stores your 4 bytes and incurs 24 bytes of overhead
      • Keep this in mind when internal columns are created for CQL3 structures such as compound keys or collection types
      • Keep this in mind when a column value is used as part of the column name for many other columns
  • 28. JSON vs Separate Columns
      • Drastically reduces metadata overhead
      • A column named “my_column” of type text that stores your 1 kB JSON object and incurs 24 bytes of overhead sounds much better!
      • Saves CPU cycles and reduces read latency
      • Supports complex hierarchical structures
      • But it loses out on partial reads / updates and complicates schema versioning
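The sizing formula from the Metadata Overhead slide can be worked through for both cases, the 4-byte int and the 1 kB JSON value. The function names are hypothetical, 1 kB is taken as 1024 bytes, and only the simple case (no TTL, not a counter) is covered.

```python
# Fixed per-column bytes: name length (2) + flags (1) + timestamp (8) + value length (4)
COLUMN_FIXED_OVERHEAD = 2 + 1 + 8 + 4


def regular_column_size(name: str, value_size: int) -> int:
    """regular_column_size = column_name_size + column_value_size + 15 bytes"""
    return len(name.encode("utf-8")) + value_size + COLUMN_FIXED_OVERHEAD


def column_overhead(name: str) -> int:
    """Everything stored for the column except the value bytes themselves."""
    return regular_column_size(name, 0)


int_total = regular_column_size("my_column", 4)      # int value
json_total = regular_column_size("my_column", 1024)  # 1 kB JSON value
overhead = column_overhead("my_column")

print(overhead, int_total, json_total)
print(f"int column:  {overhead / int_total:.0%} of stored bytes are metadata")
print(f"JSON column: {overhead / json_total:.1%} of stored bytes are metadata")
```

The overhead is the same 24 bytes in both cases (9 bytes of name plus 15 fixed bytes); what changes is how much payload it is amortized over.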
  • 29. Use Case 1: Products and Upcs
      CREATE TABLE product (
          pid int,
          upc int,
          value text,
          rstat text,
          PRIMARY KEY (pid, upc)
      );

       pid | upc | rstat               | value
      -----+-----+---------------------+---------------------
       123 |   0 | Reviews JSON Object | Product JSON Object
       123 | 456 |                null |     Upc JSON Object
       123 | 789 |                null |     Upc JSON Object
  • 30. Use Case 2: Availability
      CREATE TABLE online_inventory (
          pid int,
          upc int,
          available boolean,
          PRIMARY KEY (pid, upc)
      );

      INSERT INTO online_inventory (pid, upc, available) VALUES (123, 456, true) USING TIMESTAMP 5;
      INSERT INTO online_inventory (pid, upc, available) VALUES (123, 456, false) USING TIMESTAMP 4;

       pid | upc | available | writetime(available)
      -----+-----+-----------+----------------------
       123 | 456 |      True |                    5
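The availability example relies on the write with the higher timestamp winning regardless of arrival order. A minimal sketch of that reconciliation rule follows; `reconcile` is a hypothetical helper, a cell is modeled as a `(value, timestamp)` tuple, and tie-breaking on equal timestamps is deliberately not modeled.

```python
def reconcile(current, incoming):
    """Keep whichever (value, timestamp) cell carries the higher timestamp."""
    if current is None or incoming[1] > current[1]:
        return incoming
    return current


# Same order as the two INSERT statements above: timestamp 5, then timestamp 4.
cell = None
cell = reconcile(cell, (True, 5))
cell = reconcile(cell, (False, 4))   # older write arrives later and is ignored
print(cell)

# Reversed arrival order converges to the same result.
reversed_cell = None
reversed_cell = reconcile(reversed_cell, (False, 4))
reversed_cell = reconcile(reversed_cell, (True, 5))
print(reversed_cell)
```

Supplying the timestamp from the source system makes out-of-order updates safe, because a late-arriving stale update cannot overwrite a newer one.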
  • 31. Use Case 3: Product Pagination
      CREATE TABLE product_pagination (
          filter text,
          pid int,
          PRIMARY KEY (filter, pid)
      );

      INSERT INTO product_pagination (filter, pid) VALUES ('ACTIVE', 45);
      INSERT INTO product_pagination (filter, pid) VALUES ('ACTIVE', 25);
      INSERT INTO product_pagination (filter, pid) VALUES ('ACTIVE', 75);
      INSERT INTO product_pagination (filter, pid) VALUES ('ACTIVE', 15);

      SELECT * FROM product_pagination WHERE filter = 'ACTIVE' AND pid > 15 LIMIT 2;

       filter | pid
      --------+-----
       ACTIVE |  25
       ACTIVE |  45
  • 32. DataStax Java Driver
  • 33. DataStax Java Driver
      • Flexible load balancing policies
          • includes token-aware load balancing
      • Connection pooling
      • Flexible retry policy
          • can retry on other nodes
          • or reduce the CL requirement
      • Non-blocking I/O
          • up to 128 simultaneous requests per connection
          • asynchronous API
      • Node discovery
  • 34. Multi-gets
      • When you have N keys and want to read them all
      • The built-in token-aware load balancer evaluates the first key and sends all N keys to that node! oops…
      • We preferred sending N fine-grained single-get queries in async mode
          • retries only those that failed
          • can return a partial result
          • smart routing for each key
      • We tried a multi-get-aware token-aware load balancer
          • it performed worse
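The "N single-gets in async mode" approach can be sketched as follows. Python's asyncio stands in here for the Java driver's asynchronous API, so none of this is driver code: `multi_get`, `fetch_one`, and the `fake_fetch` demo are hypothetical, and routing each request to the right node is assumed to happen inside `fetch_one`.

```python
import asyncio


async def multi_get(keys, fetch_one, retries=1):
    """Issue one async single-get per key, retry only the failures,
    and return (results, keys_that_still_failed)."""
    results = {}
    pending = list(keys)
    for _ in range(retries + 1):
        outcomes = await asyncio.gather(
            *(fetch_one(k) for k in pending), return_exceptions=True
        )
        failed = []
        for key, outcome in zip(pending, outcomes):
            if isinstance(outcome, Exception):
                failed.append(key)       # retry this key only
            else:
                results[key] = outcome
        pending = failed
        if not pending:
            break
    return results, pending


attempts = {}


async def fake_fetch(key):
    """Demo stand-in: key 2 fails once, key 3 always fails."""
    attempts[key] = attempts.get(key, 0) + 1
    if key == 2 and attempts[key] == 1:
        raise TimeoutError("transient failure")
    if key == 3:
        raise TimeoutError("replica down")
    return f"row-{key}"


results, failed = asyncio.run(multi_get([1, 2, 3], fake_fetch))
print(results, failed)
```

The caller gets a partial result plus the list of keys that could not be read, instead of one all-or-nothing multi-get failure.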
  • 35. Data Loader
  • 36. Data Loader
      • partitions the whole data set (MOD N)
      • sorts all result sets by product id
      • accumulates assembled products and executes batch writes to C*
      • single connection per reader thread
  • 37. Cassandra 1.2.X Known Issues
  • 38. OOM #1
      • select count (*) from product limit 75000000;
      • wait for timeout
      • hmm, try again (arrow up, enter)
      • select count (*) from product limit 75000000;
      • wait for timeout
      • again
  • 39. OOM #2
      • Try the following in production and get a permanent vacation
          • truncate, drop, create table
          • load data there
          • start a light read load
      • Potentially all C* nodes can hit OOM simultaneously
      • That is called high availability!
  • 40. DROP/CREATE without TRUNCATE
      • SSTable files are still on disk after DROP
      • CREATE triggers reading of the files
      • and C* fails…