2013 © Trivadis
BASEL BERN BRUGES LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MUNICH STUTTGART VIENNA
Cassandra Architecture and Data Model
Geneva, 26.01.2015
Ulises Fasoli
Senior Consultant
Trivadis AG
January 2016
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
History of Databases
1960s File-based, Network (CODASYL) and Hierarchical Databases
1970s Relational Database
1980s SQL became the standard query language
Early 1990s Object Databases
Late 1990s XML Databases
2004 NoSQL Databases
What's wrong with Relational Databases?
• SQL provides a rich, declarative query language
• Databases enforce referential integrity
• ACID semantics
• Well understood by developers, database administrators
• Well supported by different languages, frameworks and tools
• Hibernate, JPA, JDBC, iBATIS, Entity Framework
• Well understood and accepted by operations people (DBAs)
• Configuration
• Monitoring
• Backup and Recovery
• Tuning
• Design
They are great ….
Relational Databases are great ... But!
New trends
Big Data
Concurrency
Connectivity
Diversity
P2P Knowledge
Cloud/Grid
Relational Databases are great ... But!
Problem: Complex Object Graphs
Object/Relational impedance mismatch
Complicated to map rich domain model
to relational schema
Performance issues
• Many rows in many tables
• Many joins
• Eager vs. lazy loading
[Figure: an Order object graph mapped onto the tables ORDER, ADDRESS, CUSTOMER and ORDER_LINES]
Order: ID 1001, Order Date 15.9.2012
Customer: First Name Peter, Last Name Sample
Billing Address: Street Somestreet 10, City Somewhere, Postal Code 55901
Line Items:
Name | Quantity | Price
Ipod Touch | 1 | 220.95
Monster Beat | 2 | 190.00
Apple Mouse | 1 | 69.90
Relational Databases are great ... But!
Problem: Schema evolution
Adding attributes to an object => have to add columns to the table
Expensive if the table holds a lot of data
• Locks are held on the table for a long time
• If new values should be mandatory, a NOT NULL constraint cannot be enforced
• Application downtime ...
Relational Databases are great ... But!
Problem: Semi-structured data
A relational schema doesn't easily handle semi-structured data
Common solutions
• Name/Value table
- Poor performance
- Lack of constraints
• Serialize as BLOB
- Fewer joins, but no query capabilities
Relational Databases are great ... But!
Problem: Scaling
Scaling writes difficult/expensive/impossible => Big Data
Scaling a relational database:
• Vertical scaling is limited and expensive
• Horizontal scaling is limited and expensive
Single DB => Partitioned Table => Database Sharding => Database Cluster
So, what’s Wrong With RDBMS?
• Many programmers are already
familiar with it.
• Transactions and ACID make
development easy.
• Lots of tools to use.
• Rigid schema design.
• Harder to scale.
• Replication.
Nothing
No one size fits all
Solution: NoSQL ?
No standard definition of what NoSQL means
• Not Only SQL and not No SQL
• Not only relational would have been better
Term began in a workshop organized in 2009
Use the right tools (DBs) for the job
It is more like a feature set, or even the absence of a feature set
Use Cases for NoSQL
• Massive write performance.
• Fast key value look ups.
• Flexible schema and data types.
• No single point of failure.
• Fast prototyping and development.
• Out of the box scalability.
• Easy maintenance.
Brewer's CAP Theorem
Any networked shared-data system can have at most two of the three
desirable properties:
• Consistency: all of the nodes see the same data at the same time, regardless of where the data is stored
• Availability: node failures do not prevent survivors from continuing to operate
• Network Partition tolerance: the system continues to operate despite arbitrary message loss
[Figure: CAP triangle with the corners Availability, Consistency and Network Partition Tolerance; the pairings are labelled CA, CP and AP, the centre n/a]
Data Store Positioning
[Figure: data stores positioned by Scalability against Standardized Model, Tooling, Complexity. Shown: Key-value, Wide Column (Column Families / Extensible Records), Document, Graph, Relational. Region labels: SQL Comfort Zone, Multi Dimensional]
Polyglot Persistence
In 2006, Neal Ford coined the term Polyglot Programming
• Applications should be written in a mix of languages to take advantage of the fact that different languages are suitable for tackling different problems
Polyglot Persistence defines a hybrid approach to persistence
• Using multiple data storage technologies
• Selected based on the way data is being used by individual applications
• Why store binary images in an RDBMS when there are better storage systems?
Polyglot Programmer
Polyglot Persistence
Today we use the same database for all kinds of data
• Business transactions, session management data, reporting, logging information, content information, ...
Not all of this data needs the same availability, consistency or backup guarantees
Polyglot data storage allows mixing and matching relational and NoSQL data stores
[Figure: two persistence models for an e-commerce application]
Polyglot persistence model
• Data: shopping cart data, user sessions, product catalog, recommendations, completed orders
• Stores: Key-Value, RDBMS, Document, Graph
"Traditional" persistence model
• Data: shopping cart data, user sessions, product catalog, recommendations, completed orders
• Store: RDBMS
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Definition of Cassandra
Apache Cassandra™ is a free
• Distributed…
• High performance…
• Extremely scalable…
• Fault tolerant (i.e. no single point of failure)…
post-relational database solution.
Cassandra can serve both as a real-time datastore (the "system of record")
for online/transactional applications and as a read-intensive database for
business intelligence systems.
History of Cassandra
[Figure: Bigtable and Dynamo]
Architecture Overview
Cassandra was designed with the understanding that system/hardware
failures can and do occur:
• Peer-to-peer, distributed system
• All nodes are the same
• Data partitioned among all nodes in the cluster
• Custom data replication to ensure fault tolerance
• Read/Write-anywhere design
Big Data Scalability
• Capable of comfortably scaling to petabytes
• New nodes = Linear performance increases
• Add new nodes online
Who is using Cassandra?
Largest publicly known cluster has over 300 TB of data spanning 400
machines
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Why Cassandra?
Tunable data consistency
Flexible schema design
Data Compression
CQL language (like SQL)
Support for key languages and
platforms
No need for special hardware or
software
Gigabyte to Petabyte scalability
Linear performance gains through
adding nodes
No single point of failure
Easy replication / data distribution
Multi-data center and Cloud
capable
No need for separate caching layer
Cassandra Use Cases
Product Catalog / Playlists
Personalization
• Ads
• Recommendations
• Ratings
Fraud Detection
Time Series
• Finance
• Smart Meter
IoT / Sensor Data
Graph / Network data
DataStax Enterprise Edition (DSE)
DataStax OpsCenter
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Architecture Overview
Nodes communicate with each other through the Gossip protocol,
which exchanges information across the cluster every second
A commit log is used on each node to capture write activity, which
assures data durability
Data is also written to an in-memory structure (the memtable) and then
flushed to disk as an SSTable once the memtable is full
No Single Point of Failure
All nodes the same
Customized replication affords tunable
data redundancy
Read/write from any node
Can replicate data among different
physical data center racks
Easy Replication / Data Distribution
Transparently handled by
Cassandra
Multi-data center capable
Exploits all the benefits of Cloud
computing
Able to do hybrid Cloud/On-premise setup
Partitioning
• Nodes are logically structured in Ring Topology.
• Hashed value of key associated with data partition is used to assign it to
a node in the ring.
• Lightly loaded nodes move position on the ring to relieve heavily loaded nodes.
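The ring assignment described in these bullets can be sketched in a few lines of Python. This is only an illustration, not Cassandra's implementation: the node names, their token positions, and the use of MD5 reduced to a 0..99 token space are assumptions chosen for readability.

```python
import bisect
import hashlib

# Hypothetical ring: (token, node name), sorted by token.
RING = [(0, "A"), (20, "B"), (40, "C"), (60, "D"), (80, "E")]
TOKENS = [token for token, _ in RING]

def token_for(key: str) -> int:
    """Hash a partition key onto the 0..99 token space (assumption: MD5 mod 100)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100

def owner_for_token(token: int) -> str:
    """Return the node with the first token at or after the given token,
    wrapping around to the first node past the end of the ring."""
    index = bisect.bisect_left(TOKENS, token)
    return RING[index % len(RING)][1]

def owner(key: str) -> str:
    """Node responsible for a partition key."""
    return owner_for_token(token_for(key))
```

Because the owner depends only on the hash of the key, any node can compute where a row lives without asking a central directory.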
Data Replication
Replication for high availability and data durability
• Replication factor N: each row is replicated at N nodes
• Each row key k is assigned to a coordinator node
• The coordinator node is responsible for replicating the rows within its
key range
Partitioning and Replication
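[Figure: ring over the hash range 0 to 1 with nodes A to F and replication factor N=3; h(key1) and h(key2) mark where two keys fall on the ring]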
Data Replication
Each data item is replicated at N (replication factor) nodes.
Different Replication Policies
• Rack Unaware: replicate data at the N-1 successive nodes after its coordinator
• Rack Aware: uses 'Zookeeper' to choose a leader, which tells nodes the range they are replicas for
• Datacenter Aware: similar to Rack Aware, but the leader is chosen at Datacenter level instead of Rack level
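The Rack Unaware policy is simple enough to sketch directly. The following Python snippet is a minimal illustration under assumptions: the ring is a plain list of hypothetical node names in ring order, and the coordinator is identified by its index in that list.

```python
def replicas(ring, coordinator_index, n):
    """Rack-unaware placement: the coordinator plus the n-1 nodes
    that follow it on the ring, wrapping around at the end."""
    n = min(n, len(ring))  # cannot have more replicas than nodes
    return [ring[(coordinator_index + i) % len(ring)] for i in range(n)]

# Hypothetical six-node ring with replication factor N=3.
ring = ["A", "B", "C", "D", "E", "F"]
placement = replicas(ring, coordinator_index=1, n=3)  # coordinator is "B"
```

The rack-aware and datacenter-aware policies need topology information and are not covered by this sketch.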
Write Path
When a write occurs, Cassandra stores the data in a structure in memory,
the Memtable, and also appends writes to the commit log on disk,
providing configurable durability.
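A toy model of this write path may help. The Python class below is a sketch under assumptions, not Cassandra code: the commit log is a list standing in for an append-only file, the memtable is flushed after a hypothetical number of rows (`memtable_limit`) rather than by memory size, and commit log cleanup after a flush is not modelled.

```python
class Node:
    """Toy write path: commit log append, memtable update, flush to SSTable."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory structure: row key -> columns
        self.sstables = []     # flushed tables, oldest first, never modified
        self.memtable_limit = memtable_limit

    def write(self, row_key, columns):
        # 1. Append to the commit log so the write survives a crash.
        self.commit_log.append((row_key, dict(columns)))
        # 2. Apply the write to the memtable.
        self.memtable.setdefault(row_key, {}).update(columns)
        # 3. Flush once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable is written sorted by row key.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
```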
Write Requests
Coordinator sends a write request to all replicas that own the row being
written
Write Consistency
The consistency level for writing to Cassandra specifies on how many replicas
the write must succeed before an ACK is returned to the client
• Quorum: (replication_factor / 2) + 1, with the division rounded down
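As a worked example of the quorum formula, the small Python function below uses integer division; the function names are chosen for illustration only.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge at consistency level QUORUM."""
    return replication_factor // 2 + 1

def tolerated_failures(replication_factor: int) -> int:
    """Replicas that may be down while a quorum can still be reached."""
    return replication_factor - quorum(replication_factor)
```

With a replication factor of 3 the quorum is 2, so one replica may be unavailable; with a replication factor of 5 the quorum is 3, so two may be unavailable.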
Read Path
When a read request for a row
comes in to a node, the row
must be combined from all
SSTables on that node that
contain columns from the row in
question
as well as from any unflushed
memtables, to produce the
requested data
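The merge step can be sketched as follows. This Python snippet is a simplified illustration: each SSTable and the memtable are modelled as dictionaries of row key to columns, every column carries a (value, timestamp) pair, and for each column the version with the newest timestamp wins. The sample rows and timestamps are made up.

```python
def read_row(row_key, sstables, memtable):
    """Combine one row from all SSTables and the memtable;
    per column, the version with the newest timestamp wins."""
    merged = {}
    for source in list(sstables) + [memtable]:
        for column, (value, timestamp) in source.get(row_key, {}).items():
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return {column: value for column, (value, _) in merged.items()}

# Hypothetical data: the row is spread over two SSTables and the memtable.
sstables = [
    {"john": {"age": (37, 1), "role": ("dev", 1)}},
    {"john": {"age": (38, 5)}},
]
memtable = {"john": {"role": ("lead", 9)}}
row = read_row("john", sstables, memtable)
```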
Read Requests
There are two types of read requests that a coordinator can send to a
replica:
• A direct read request
• A background read repair request
The number of replicas contacted by a direct read request is determined by
the consistency level specified by the client.
Read Consistency
The consistency level for reading from Cassandra specifies how many
replicas must respond before a result is returned to the client
• Quorum: (replication_factor / 2) + 1, with the division rounded down
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Cassandra Data Model
• Table is a multi dimensional map indexed by key (row key).
• Columns are grouped into Column Families
• Dynamic schema design allows for much more flexible data storage
than rigid RDBMS
• Each Column has
- Name
- Value
- Timestamp
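One way to picture this model is as nested maps. The Python sketch below is an analogy only: the column family name `music`, the helper names, and the timestamps are assumptions, and the sample values are illustrative.

```python
from collections import namedtuple

# Each column carries a name, a value and a timestamp.
Column = namedtuple("Column", ["name", "value", "timestamp"])

def make_row(timestamp, **values):
    """Build a row as a map of column name -> Column."""
    return {name: Column(name, value, timestamp) for name, value in values.items()}

# column family -> row key -> column name -> Column
keyspace = {
    "music": {
        "John Lennon": make_row(1, born=1940, country="England", died=1980),
        "The Beatles": make_row(2, country="England", founded=1957),
    }
}

def get_value(column_family, row_key, column_name):
    """Look up a single column value by its three map keys."""
    return keyspace[column_family][row_key][column_name].value
```

Note that the two rows do not share the same set of columns, which is what the dynamic schema bullet refers to.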
How Cassandra stores data
• Model brought from Google Bigtable
• Row Key and a lot of columns
• Column names sorted (UTF8, Int, Timestamp, etc.)
[Figure: a row consists of a Row Key and 1 to 2 billion columns; each column has a Column Name, Column Value, Timestamp and TTL; a table can hold billions of rows]
Cassandra Data Model
[Figure: a Keyspace containing several Column Families]
Row, row key, column key, and column value
[Figure: a row with its row key and the column keys (or column names) cola, colb, colc, cold holding the column values (or cells) va, vb, vc, vd]
• Rows: individual rows constitute a column family
• Row key: uniquely identifies a row in a column family
• Row: stores pairs of column keys and column values
• Column key: uniquely identifies a column value in a row
• Column value: stores one value or a collection of values
Static vs. Dynamic Column Family
Static column family (skinny rows)
• Contains a predefined set of columns with metadata
• Number of columns can vary across multiple rows within the column family
• Similar to RDBMS, except no NULL values
Row key | Columns
John Lennon | born: 1940, country: England, died: 1980, style: Rock, type: artist
The Beatles | country: England, founded: 1957, style: Rock, type: band
What is a wide row?
Rows may be described as “skinny” or “wide”
• Wide row: has a relatively large number of column keys (hundreds or thousands); this number may increase as new data values are inserted
- For example, a row that stores all bands of the same style
- The number of such bands will increase as new bands are formed
• Note that column values do not exist in this example
- The column key (in this case a band name) stores all the data desired
- Could have stored the number of albums, or year founded, etc., as column values
©2014 DataStax Training. Use only with permission.
[Figure: wide row with row key Rock and column keys The Animals, The Beatles, ...]
What are composite row keys and composite column keys?
Composite row key: multiple components separated by a colon
• ‘Revolver’ and 1966 are the album title and year
• The ‘tracks’ value is a collection (map)
Composite column key: multiple components separated by a colon
• Composite column keys are sorted by each component
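[Figure: two rows, both with the composite row key Revolver:1966]
Row with simple column keys: genre = Rock, performer = The Beatles, tracks = {1: 'Taxman', ..., 14: 'Tomorrow Never Knows'}
Row with composite column keys: 1:title = Taxman, 2:title = Eleanor Rigby, ..., 14:title = Tomorrow Never Knows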
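Sorting composite column keys by each component behaves like tuple comparison. The Python sketch below models the composite column keys of the Revolver row as (track number, field name) tuples; the `slice_columns` helper is a hypothetical name used only to show a range read over sorted columns.

```python
# Composite column keys modelled as tuples: (track number, field name).
columns = {
    (14, "title"): "Tomorrow Never Knows",
    (2, "title"): "Eleanor Rigby",
    (1, "title"): "Taxman",
}

# Tuples compare component by component, so track 2 sorts before track 14.
ordered_keys = sorted(columns)

def slice_columns(columns, start, end):
    """Return the columns whose keys fall within [start, end], in sorted order."""
    return [(key, columns[key]) for key in sorted(columns) if start <= key <= end]

first_two = slice_columns(columns, (1, ""), (2, "~"))
```

Comparing the keys as plain strings would put "14:title" before "2:title"; comparing each component with its own type avoids that.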
Data Modelling with Cassandra
• De-normalize, De-normalize, De-normalize
• Forget about old-school 3NF
• De-normalize wherever you can for quicker retrieval and let application logic
handle the responsibility of reliably updating redundancies
• Rows are gigantic and sorted
• Giga-sized rows (2 billion columns max) can be used to store sortable and
sliceable columns
• Comments by timestamp, ordered bids by quoted price, Ratings by product, ..
• One row, one machine
• Each row stays on one machine
• Rows are not shared across nodes
• Beware of this, don't create hotspots with a high demand row!
From Query to Model
Remember this
• Cassandra finds rows fast
• Cassandra scans columns fast
• Cassandra does not scan rows
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Cassandra API – Thrift vs. CQL
Thrift
• exposes the internal storage structure of Cassandra pretty much directly
• Complicated, low-level, full control
• legacy
CQL
• New way to go
• Provides thin abstraction layer over Cassandra's internal structure
• Hides some distracting and useless implementation details
• Provides native syntax for common encodings/idioms (like
collections) instead of letting each client library re-implement them in its
own, different and thus incompatible way
CQL Language
Very similar to RDBMS SQL syntax
Create objects via DDL (e.g. CREATE…)
Core DML commands supported: INSERT, UPDATE, DELETE
Query data with SELECT
Current version is CQL3
CQL Shell for Apache Cassandra
cqlsh is the command line utility for executing CQL commands (think of
SQL*Plus for Cassandra)
CQL3 is default since Cassandra 1.2
$ cqlsh
Connected to DataStaxCluster at localhost:9160.
[cqlsh 4.1.0 | Cassandra 2.0.5.24 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>
The CQL/Cassandra Mapping – Static Table
name | age | role
-----+-----+-----
john | 37 | dev
eric | 38 | ceo
Internal storage:
row key | age | role
john | 37 | dev
eric | 38 | ceo
CREATE TABLE employee (
name text PRIMARY KEY,
age int,
role text);
Create a Dynamic table (wide-row) Employee
A Dynamic Table is also created with the CREATE TABLE statement but
using a composite primary key
cqlsh:training> CREATE TABLE employees (
company text,
name text,
age int,
role text,
PRIMARY KEY (company,name)
);
The CQL/Cassandra Mapping – Dynamic Table
company | name | age | role
--------+------+-----+-----
OSC | eric | 38 | ceo
OSC | john | 37 | dev
RKG | anya | 29 | lead
RKG | ben | 27 | dev
RKG | chad | 35 | ops
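Internal storage (one wide row per company):
row key | eric:age | eric:role | john:age | john:role
OSC | 38 | ceo | 37 | dev
row key | anya:age | anya:role | ben:age | ben:role | chad:age | chad:role
RKG | 29 | lead | 27 | dev | 35 | ops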
CREATE TABLE employees (
company text,
name text,
age int,
role text,
PRIMARY KEY (company,name)
);
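The mapping from the CQL rows to the wide-row layout can be reproduced with a short Python sketch. It is an illustration of the idea only, not of Cassandra's storage engine: the first primary key column (the partition key, `company`) becomes the row key, and the second (the clustering column, `name`) becomes a prefix of each column key. The function name `to_wide_rows` is hypothetical.

```python
cql_rows = [
    {"company": "OSC", "name": "eric", "age": 38, "role": "ceo"},
    {"company": "OSC", "name": "john", "age": 37, "role": "dev"},
    {"company": "RKG", "name": "anya", "age": 29, "role": "lead"},
    {"company": "RKG", "name": "ben", "age": 27, "role": "dev"},
    {"company": "RKG", "name": "chad", "age": 35, "role": "ops"},
]

def to_wide_rows(rows, partition_key, clustering_key):
    """Group CQL rows by partition key and prefix every remaining
    column name with the clustering key value."""
    storage = {}
    for row in rows:
        wide_row = storage.setdefault(row[partition_key], {})
        for column, value in row.items():
            if column not in (partition_key, clustering_key):
                wide_row[f"{row[clustering_key]}:{column}"] = value
    return storage

storage = to_wide_rows(cql_rows, "company", "name")
```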
Insert data into Employee
The INSERT command is similar to the SQL counterpart
The major difference is that the PRIMARY KEY is always required
If the same statement is executed twice, there is no error
If the same PRIMARY KEY value is reused with different values for the other columns,
the last write wins!
cqlsh:training> INSERT INTO employee (name, age, role)
VALUES ('john', 37, 'dev');
cqlsh:training> INSERT INTO employee (name, age, role)
VALUES ('eric', 38, 'ceo');
Retrieving data from Employee table (II)
A restriction on a column other than the PRIMARY KEY does not work without an index
This can be solved with an index (but be careful: de-normalization is usually the better choice)
cqlsh:training> SELECT * FROM employee
WHERE age = 37;
Bad Request: No indexed columns present in by-columns clause
with Equal operator
cqlsh:training> CREATE INDEX employee_age_idx
ON employee (age);
cqlsh:training> SELECT * FROM employee
WHERE age = 37;
name | age | role
------+-----+------
john | 37 | dev
(1 rows)
Update data in Employee
The UPDATE statement is similar to the SQL UPDATE command
Just as with the INSERT, the PRIMARY KEY column must be specified as
part of the UPDATE
In CQL, UPDATE does not check whether the row exists; if it does
not, CQL simply creates it
cqlsh:training> UPDATE employee SET age = 38
WHERE name = 'john';
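Because neither INSERT nor UPDATE checks for an existing row, both behave like an "upsert". The Python sketch below mimics that behaviour with a dictionary; it is an analogy, and the row for 'zoe' is a made-up example.

```python
table = {}  # primary key -> columns

def upsert(table, primary_key, **columns):
    """Create the row if it is missing, then overwrite the given columns."""
    table.setdefault(primary_key, {}).update(columns)

upsert(table, "john", age=37, role="dev")  # behaves like INSERT
upsert(table, "john", age=38)              # behaves like UPDATE ... WHERE name = 'john'
upsert(table, "zoe", age=41)               # an UPDATE of a missing row creates it
```

Columns that are not mentioned in the second call, such as `role`, keep their previous value.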
Cassandra Data Types
Category | CQL Data Type | Description
String | ascii | US-ASCII character string
String | text | UTF-8 encoded string, used most of the time for storing String data
String | varchar | UTF-8 encoded string
String | inet | Used for storing IP addresses
Numeric | int | 32-bit signed integer
Numeric | float | 32-bit IEEE-754 floating point
Numeric | double | 64-bit IEEE-754 floating point
Numeric | varint | Arbitrary precision integer
Numeric | bigint | 64-bit number, equivalent to long
Numeric | decimal | Variable-precision decimal
Numeric | counter | Distributed counter value (64-bit long)
Cassandra Data Types (II)
Category | CQL Data Type | Description
UUIDs | uuid | A UUID in standard UUID format
UUIDs | timeuuid | Type 1 UUID only, for storing unique time-based IDs
Collections | list | Ordered collection of one or more elements
Collections | map | Collection of arbitrary key-value pairs
Collections | set | Unordered collection of one or more unique elements
Miscellaneous | boolean | Boolean (true/false)
Miscellaneous | blob | Used for storing binary data written in hexadecimal
Miscellaneous | timestamp | Date/Time
Cassandra Data Types (III)
TimeUUID
• Have a few extra functions, that allow extracting the time information
• now() returns a new TimeUUID with the time of the current timestamp,
ensures globally unique values
• minTimeuuid() and maxTimeuuid() are used when querying ranges of
TimeUUIDs
Counter
• Cannot mix counter columns with other types
• Value cannot be set, only incremented/decremented by a specified amount
• Counters may not be part of the PRIMARY KEY of the table
WHERE event_time > maxTimeuuid('2013-01-01 00:05+0000')
AND event_time < minTimeuuid('2013-02-02 10:00+0000')
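To see what "time information" a TimeUUID carries, the Python standard library can generate a version 1 UUID and decode its timestamp. This is a client-side sketch, not a CQL function: `timeuuid_to_datetime` is a hypothetical helper, and the offset constant is the number of 100-nanosecond intervals between the UUID epoch (15 October 1582) and the Unix epoch.

```python
import uuid
from datetime import datetime, timezone

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
UUID_EPOCH_OFFSET = 0x01B21DD213814000

def timeuuid_to_datetime(value: uuid.UUID) -> datetime:
    """Extract the creation time from a version 1 (time-based) UUID."""
    if value.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    unix_seconds = (value.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

event_id = uuid.uuid1()                     # comparable to now() in CQL
created_at = timeuuid_to_datetime(event_id)
```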
Collections
CQL3 also supports collections for storing complex data structures
• Set {value,…}, List [value,…], Map {key:value,…}
cqlsh:training> CREATE TABLE collection_sample(
id int PRIMARY KEY,
string_set set<text>,
string_list list<text>,
string_map map<text, text>);
cqlsh:training> INSERT INTO collection_sample
(id, string_set, string_list, string_map)
VALUES (1,
{'text1','text2','text1'},
['text1','text2','text1'],
{'key1':'value1'});
Collections (II)
cqlsh:training> SELECT * FROM collection_sample;
id | string_list | string_map | string_set
----+-----------------------------+--------------------+--------------------
1 | ['text1', 'text2', 'text1'] | {'key1': 'value1'} | {'text1', 'text2'}
(1 rows)
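The SELECT output shows the duplicate 'text1' only in the list, not in the set. Python's built-in collections behave the same way, which makes for a quick analogy (the variable names simply mirror the column names of the sample table):

```python
# The same literals as in the INSERT statement, as Python collections.
string_set = {"text1", "text2", "text1"}   # duplicates collapse
string_list = ["text1", "text2", "text1"]  # order and duplicates are kept
string_map = {"key1": "value1"}            # key -> value pairs

set_size = len(string_set)
list_size = len(string_list)
```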
Counter Columns
Create a Counter Column Table that counts “favorite” events
cqlsh:training> CREATE TABLE favorites (
product_id int,
month int,
number COUNTER,
PRIMARY KEY (product_id, month));
cqlsh:training> UPDATE favorites SET number = number + 1
WHERE product_id = 4910 AND month = 06;
cqlsh:training> SELECT * FROM favorites;
product_id | month | number
------------+-------+--------
4910 | 6 | 1
Time-to-Live (TTL) on Insert
Insert a row with a TTL given in seconds (here 30 s); once the TTL has expired, the row is deleted
cqlsh:training> INSERT INTO employee (name, age, role)
VALUES ('bob', 29, 'dev')
USING TTL 30;
cqlsh:training> SELECT TTL(role)
FROM employee WHERE name='bob';
ttl(role)
-----------
22
cqlsh:training> SELECT TTL(role) FROM employee WHERE
name='bob';
(0 rows)
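The session above can be mimicked with a small Python model. It is a simplification under assumptions: the TTL is tracked per row rather than per column, and the clock is injected (a hypothetical `clock` callable) so that time can be advanced by hand.

```python
class TtlTable:
    """Toy table whose rows disappear once their time-to-live has elapsed."""

    def __init__(self, clock):
        self.clock = clock  # callable returning the current time in seconds
        self.rows = {}      # key -> (columns, expiry time)

    def insert(self, key, columns, ttl_seconds):
        self.rows[key] = (dict(columns), self.clock() + ttl_seconds)

    def ttl(self, key):
        """Remaining seconds, or None if the row is missing or expired."""
        row = self.rows.get(key)
        if row is None:
            return None
        remaining = row[1] - self.clock()
        return remaining if remaining > 0 else None

    def select(self, key):
        """Return the columns, or None once the row has expired."""
        return self.rows[key][0] if self.ttl(key) is not None else None

now = [0]                                    # manually advanced clock
employees = TtlTable(clock=lambda: now[0])
employees.insert("bob", {"age": 29, "role": "dev"}, ttl_seconds=30)
```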
Agenda
1. Introduction to NoSQL datastores and Polyglot Persistence
2. What is Apache Cassandra?
3. Why Cassandra, What is DataStax?
4. Cassandra Architecture
5. Cassandra Data Model
6. Cassandra Query Language (CQL)
7. Cassandra/DataStax @ Trivadis
Trivadis / DataStax Partnership
• Since December 2014 we are a DataStax silver partner
• DataStax Partner Network (DSPN)
• Available certifications
• Admin
• Developer
• Architect
• Currently only one other partner in Switzerland: Intersys
• http://www.datastax.com/partners
Questions and answers ...
Ulises Fasoli
Senior consultant
+41 21 321 47 00
ulises.fasoli@trivadis.com

Performance #1 memory
Performance #1   memoryPerformance #1   memory
Performance #1 memory
Vitali Pekelis
 
BMOR-DECA-Report
BMOR-DECA-ReportBMOR-DECA-Report
BMOR-DECA-Report
Shannon Hui
 
Gym tonik
Gym tonikGym tonik
Gym tonik
Panos Christias
 
Kevin_C_Wright_Final_Paper
Kevin_C_Wright_Final_PaperKevin_C_Wright_Final_Paper
Kevin_C_Wright_Final_Paper
Kevin Wright
 
12 ways to get ripped off when you sell your home
12 ways to get ripped off when you sell your home12 ways to get ripped off when you sell your home
12 ways to get ripped off when you sell your home
Carl Weisman
 
Les défis des architectures cloud sur OpenStack
Les défis des architectures cloud sur OpenStackLes défis des architectures cloud sur OpenStack
Les défis des architectures cloud sur OpenStack
Osones
 
Audience profile
Audience profileAudience profile
Audience profile
AmyShields00
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
Andrey Lomakin
 
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
Objectif Libre
 
OpenStack dans la pratique
OpenStack dans la pratiqueOpenStack dans la pratique
OpenStack dans la pratique
Osones
 
Valid &amp; invalid arguments
Valid &amp; invalid argumentsValid &amp; invalid arguments
Valid &amp; invalid arguments
Abdur Rehman
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Le Cloud IaaS & PaaS, OpenStack réseau et sécurité
Le Cloud IaaS & PaaS, OpenStack réseau et sécuritéLe Cloud IaaS & PaaS, OpenStack réseau et sécurité
Le Cloud IaaS & PaaS, OpenStack réseau et sécurité
Noureddine BOUYAHIAOUI
 
Openstack framework Iaas
Openstack framework IaasOpenstack framework Iaas
Openstack framework Iaas
Noureddine BOUYAHIAOUI
 
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
SlideShare
 

Viewers also liked (15)

Performance #1 memory
Performance #1   memoryPerformance #1   memory
Performance #1 memory
 
BMOR-DECA-Report
BMOR-DECA-ReportBMOR-DECA-Report
BMOR-DECA-Report
 
Gym tonik
Gym tonikGym tonik
Gym tonik
 
Kevin_C_Wright_Final_Paper
Kevin_C_Wright_Final_PaperKevin_C_Wright_Final_Paper
Kevin_C_Wright_Final_Paper
 
12 ways to get ripped off when you sell your home
12 ways to get ripped off when you sell your home12 ways to get ripped off when you sell your home
12 ways to get ripped off when you sell your home
 
Les défis des architectures cloud sur OpenStack
Les défis des architectures cloud sur OpenStackLes défis des architectures cloud sur OpenStack
Les défis des architectures cloud sur OpenStack
 
Audience profile
Audience profileAudience profile
Audience profile
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
 
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
20151118 Retour d'Expérience : déploiement Cloud OpenStack chez un opérateur
 
OpenStack dans la pratique
OpenStack dans la pratiqueOpenStack dans la pratique
OpenStack dans la pratique
 
Valid &amp; invalid arguments
Valid &amp; invalid argumentsValid &amp; invalid arguments
Valid &amp; invalid arguments
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Le Cloud IaaS & PaaS, OpenStack réseau et sécurité
Le Cloud IaaS & PaaS, OpenStack réseau et sécuritéLe Cloud IaaS & PaaS, OpenStack réseau et sécurité
Le Cloud IaaS & PaaS, OpenStack réseau et sécurité
 
Openstack framework Iaas
Openstack framework IaasOpenstack framework Iaas
Openstack framework Iaas
 
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
 

Similar to Architecture et modèle de données Cassandra

Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
ElifTech
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
DataStax
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
Umair Mansoob
 
Slides: Relational to NoSQL Migration
Slides: Relational to NoSQL MigrationSlides: Relational to NoSQL Migration
Slides: Relational to NoSQL Migration
DATAVERSITY
 
Apache Cassandra Lunch #64: Cassandra for .NET Developers
Apache Cassandra Lunch #64: Cassandra for .NET DevelopersApache Cassandra Lunch #64: Cassandra for .NET Developers
Apache Cassandra Lunch #64: Cassandra for .NET Developers
Anant Corporation
 
NoSQL Options Compared
NoSQL Options ComparedNoSQL Options Compared
NoSQL Options Compared
Sergey Bushik
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Johnny Miller
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
ijiert bestjournal
 
DSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and CassandraDSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and Cassandra
Shrikant Samarth
 
مقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيمقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربي
Mohamed Galal
 
Stargate, the gateway for some multi-models data API
Stargate, the gateway for some multi-models data APIStargate, the gateway for some multi-models data API
Stargate, the gateway for some multi-models data API
Data Con LA
 
Low-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
Low-Latency Analytics with NoSQL – Introduction to Storm and CassandraLow-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
Low-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
Caserta
 
No sql database
No sql databaseNo sql database
No sql database
vishal gupta
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migration
Ramkumar Nottath
 
Report 2.0.docx
Report 2.0.docxReport 2.0.docx
Report 2.0.docx
pinstechwork
 
Erciyes university
Erciyes universityErciyes university
Erciyes university
hothaifa alkhazraji
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
Francisco González Jiménez
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
Raul Chong
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
balwinders
 
Business Growth Is Fueled By Your Event-Centric Digital Strategy
Business Growth Is Fueled By Your Event-Centric Digital StrategyBusiness Growth Is Fueled By Your Event-Centric Digital Strategy
Business Growth Is Fueled By Your Event-Centric Digital Strategy
zitipoff
 

Similar to Architecture et modèle de données Cassandra (20)

Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
 
Slides: Relational to NoSQL Migration
Slides: Relational to NoSQL MigrationSlides: Relational to NoSQL Migration
Slides: Relational to NoSQL Migration
 
Apache Cassandra Lunch #64: Cassandra for .NET Developers
Apache Cassandra Lunch #64: Cassandra for .NET DevelopersApache Cassandra Lunch #64: Cassandra for .NET Developers
Apache Cassandra Lunch #64: Cassandra for .NET Developers
 
NoSQL Options Compared
NoSQL Options ComparedNoSQL Options Compared
NoSQL Options Compared
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
DSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and CassandraDSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and Cassandra
 
مقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيمقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربي
 
Stargate, the gateway for some multi-models data API
Stargate, the gateway for some multi-models data APIStargate, the gateway for some multi-models data API
Stargate, the gateway for some multi-models data API
 
Low-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
Low-Latency Analytics with NoSQL – Introduction to Storm and CassandraLow-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
Low-Latency Analytics with NoSQL – Introduction to Storm and Cassandra
 
No sql database
No sql databaseNo sql database
No sql database
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migration
 
Report 2.0.docx
Report 2.0.docxReport 2.0.docx
Report 2.0.docx
 
Erciyes university
Erciyes universityErciyes university
Erciyes university
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Business Growth Is Fueled By Your Event-Centric Digital Strategy
Business Growth Is Fueled By Your Event-Centric Digital StrategyBusiness Growth Is Fueled By Your Event-Centric Digital Strategy
Business Growth Is Fueled By Your Event-Centric Digital Strategy
 

Recently uploaded

Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
wyddcwye1
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 

Recently uploaded (20)

Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 

Architecture et modèle de données Cassandra

  • 1. Architecture et modèle de données Cassandra. Genève, 26.01.2015. Ulises Fasoli, Senior Consultant, Trivadis AG. 2013 © Trivadis (Basel, Bern, Bruges, Lausanne, Zürich, Düsseldorf, Frankfurt a.M., Freiburg i.Br., Hamburg, Munich, Stuttgart, Vienna). January 2016.
  • 2. Agenda
      1. Introduction to NoSQL datastores and Polyglot Persistence
      2. What is Apache Cassandra?
      3. Why Cassandra, What is DataStax?
      4. Cassandra Architecture
      5. Cassandra Data Model
      6. Cassandra Query Language (CQL)
      7. Cassandra/DataStax @ Trivadis
  • 3. History of Databases
      • 1960s: File-based, network (CODASYL) and hierarchical databases
      • 1970s: Relational databases
      • 1980s: SQL became the standard query language
      • Early 1990s: Object databases
      • Late 1990s: XML databases
      • 2004: NoSQL databases
  • 4. What's wrong with Relational Databases? They are great ...
      • SQL provides a rich, declarative query language
      • Databases enforce referential integrity
      • ACID semantics
      • Well understood by developers and database administrators
      • Well supported by different languages, frameworks and tools: Hibernate, JPA, JDBC, iBATIS, Entity Framework
      • Well understood and accepted by operations people (DBAs): configuration, monitoring, backup and recovery, tuning, design
  • 5. Relational Databases are great ... But! New trends: Big Data, Concurrency, Connectivity, Diversity, P2P Knowledge, Cloud/Grid.
  • 6. Relational Databases are great ... But! Problem: Complex Object Graphs
      • Object/relational impedance mismatch
      • Complicated to map a rich domain model to a relational schema
      • Performance issues: many rows in many tables, many joins, eager vs. lazy loading
      Diagram: one order object mapped to the tables ORDER, ADDRESS, CUSTOMER and ORDER_LINES.
      • Order: ID 1001, order date 15.9.2012
      • Customer: first name Peter, last name Sample
      • Billing address: street Somestreet 10, city Somewhere, postal code 55901
      • Line items (name, quantity, price): Ipod Touch, 1, 220.95; Monster Beat, 2, 190.00; Apple Mouse, 1, 69.90
  • 7. Relational Databases are great ... But! Problem: Schema evolution
      Adding attributes to an object means adding columns to the table. This is expensive if the table holds a lot of data:
      • Locks are held on the tables for a long time
      • If the new values should be mandatory, a NOT NULL constraint cannot be enforced
      • Application downtime ...
  • 8. Relational Databases are great ... But! Problem: Semi-structured data
      A relational schema doesn't easily handle semi-structured data. Common solutions:
      • Name/value table: poor performance, lack of constraints
      • Serialize as a BLOB: fewer joins, but no query capabilities
  • 9. Relational Databases are great ... But! Problem: Scaling
      Scaling writes is difficult, expensive or impossible => Big Data. Scaling a relational database:
      • Vertical scaling is limited and expensive
      • Horizontal scaling is limited and expensive
      Diagram: clients connecting to RDBMS databases at four stages: Single DB => Partitioned Table (P1, P2, P3) => Database Sharding => Database Cluster (Node 1, Node 2).
  • 10. So, what's wrong with RDBMS? Nothing. No one size fits all.
      • Many programmers are already familiar with it
      • Transactions and ACID make development easy
      • Lots of tools to use
      • Rigid schema design
      • Harder to scale
      • Replication
  • 11. Solution: NoSQL?
      • No standard definition of what NoSQL means: "Not Only SQL", not "No SQL"; "not only relational" would have been better
      • The term began in a workshop organized in 2009
      • Use the right tools (DBs) for the job
      • It is more like a feature set, or even the "not" of a feature set
  • 12. Use Cases for NoSQL
      • Massive write performance
      • Fast key-value lookups
      • Flexible schema and data types
      • No single point of failure
      • Fast prototyping and development
      • Out-of-the-box scalability
      • Easy maintenance
  • 13. Brewer's CAP Theorem
      Any networked shared-data system can have at most two of the three desirable properties:
      • Consistency: all of the nodes see the same data at the same time, regardless of where the data is stored
      • Availability: node failures do not prevent survivors from continuing to operate
      • Network partition tolerance: the system continues to operate despite arbitrary message loss
      Diagram: the pairwise combinations are labelled CA, CP and AP; the combination of all three is marked n/a.
  • 14. Data Store Positioning
      Figure: data stores positioned by scalability against standardized model, tooling and complexity. Labels: Key-value, Wide Column (Column Families / Extensible Records), Document, Graph, Relational, "SQL Comfort Zone", "Multi Dimensional".
  • 15. Polyglot Persistence
      In 2006, Neal Ford coined the term Polyglot Programming:
      • Applications should be written in a mix of languages, because different languages are suited to tackling different problems
      Polyglot Persistence defines a hybrid approach to persistence:
      • Using multiple data storage technologies
      • Selected based on the way data is used by individual applications
      • Why store binary images in an RDBMS when there are better storage systems?
  • 16. Polyglot Persistence
      • Today we use the same database for all kinds of data: business transactions, session management data, reporting, logging information, content information, ...
      • These kinds of data do not need the same availability, consistency or backup properties
      • Polyglot data storage lets us mix and match relational and NoSQL data stores
      Diagram, "traditional" persistence model: an e-commerce application keeps shopping cart data, user sessions, product catalog, recommendations and completed orders in one RDBMS.
      Diagram, polyglot persistence model: the same application spreads this data across key-value, RDBMS, document and graph stores.
  • 17. Agenda
      1. Introduction to NoSQL datastores and Polyglot Persistence
      2. What is Apache Cassandra?
      3. Why Cassandra, What is DataStax?
      4. Cassandra Architecture
      5. Cassandra Data Model
      6. Cassandra Query Language (CQL)
      7. Cassandra/DataStax @ Trivadis
  • 18. Definition of Cassandra
      Apache Cassandra™ is a free, distributed, high-performance, extremely scalable, fault-tolerant (i.e. no single point of failure) post-relational database solution.
      Cassandra can serve both as a real-time datastore (the "system of record") for online/transactional applications and as a read-intensive database for business intelligence systems.
  • 19. History of Cassandra. Figure labels: Bigtable, Dynamo.
  • 20. Architecture Overview
      Cassandra was designed with the understanding that system and hardware failures can and do occur:
      • Peer-to-peer, distributed system
      • All nodes are the same
      • Data is partitioned among all nodes in the cluster
      • Custom data replication ensures fault tolerance
      • Read/write-anywhere design
  • 21. Big Data Scalability
      • Capable of comfortably scaling to petabytes
      • New nodes = linear performance increases
      • New nodes can be added online
  • 22. Who is using Cassandra? The largest publicly known cluster has over 300 TB of data spanning 400 machines.
  • 23. Agenda
      1. Introduction to NoSQL datastores and Polyglot Persistence
      2. What is Apache Cassandra?
      3. Why Cassandra, What is DataStax?
      4. Cassandra Architecture
      5. Cassandra Data Model
      6. Cassandra Query Language (CQL)
      7. Cassandra/DataStax @ Trivadis
  • 24. Why Cassandra?
      • Tunable data consistency
      • Flexible schema design
      • Data compression
      • CQL language (like SQL)
      • Support for key languages and platforms
      • No need for special hardware or software
      • Gigabyte to petabyte scalability
      • Linear performance gains through adding nodes
      • No single point of failure
      • Easy replication / data distribution
      • Multi-data center and Cloud capable
      • No need for a separate caching layer
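"Tunable data consistency" can be made concrete with a little arithmetic. In Cassandra each read or write request names a consistency level (for example ONE, QUORUM or ALL), which sets how many replicas must acknowledge it. The sketch below is illustrative only: the function names are hypothetical, and it encodes the commonly cited rule that a read is guaranteed to overlap the latest write when the read and write acknowledgement counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when any set of read replicas must share a replica with any set of write replicas."""
    return write_acks + read_acks > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM writes combined with QUORUM reads overlap on at least one replica.
    print(q, read_overlaps_write(rf, q, q))
    # ONE/ONE is faster but may read a replica the write has not reached yet.
    print(read_overlaps_write(rf, 1, 1))
```

The trade-off is the one the CAP slide describes: higher acknowledgement counts buy stronger consistency at the cost of latency and availability when replicas are down.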
  • 25. Cassandra Use Cases
    • Product Catalog / Playlists
    • Personalization
      - Ads
      - Recommendations
      - Ratings
    • Fraud Detection
    • Time Series
      - Finance
      - Smart Meter
    • IoT / Sensor Data
    • Graph / Network data
  • 26. DataStax Enterprise Edition (DSE)
  • 27. DataStax OpsCenter
  • 29. Architecture Overview
    • Nodes communicate with each other through the gossip protocol, which exchanges state information across the cluster every second
    • A commit log on each node captures write activity, which ensures data durability
    • Data is also written to an in-memory structure (the memtable) and then flushed to disk (as an SSTable) once the memory structure is full
  • 30. No Single Point of Failure
    • All nodes are the same
    • Customized replication affords tunable data redundancy
    • Read/write from any node
    • Data can be replicated across different physical data center racks
  • 31. Easy Replication / Data Distribution
    • Transparently handled by Cassandra
    • Multi-data center capable
    • Exploits all the benefits of Cloud computing
    • Supports hybrid Cloud/on-premise setups
  • 32. Partitioning
    • Nodes are logically structured in a ring topology.
    • The hashed value of the key associated with a data partition is used to assign it to a node in the ring.
    • Lightly loaded nodes move position on the ring to relieve heavily loaded nodes.
  • 33. Data Replication
    Replication provides high availability and data durability:
    • Replication factor N: each row is replicated on N nodes
    • Each row key k is assigned to a coordinator node
    • The coordinator node is responsible for replicating the rows within its key range
  • 34. Partitioning and Replication
    [Figure: ring covering the hash range 0 to 1 with nodes A to F and replication factor N=3; keys are placed at positions h(key1) and h(key2)]
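The ring placement and replication described on the slides above can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's implementation: the `token` function, the 32-bit ring size, the MD5 hash and the `Ring` class are assumptions made for this example, and real partitioners use their own hash functions and token ranges.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a ring position in 0 .. 2**32 - 1 (assumed hash, for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class Ring:
    """Hypothetical ring: each node sits at the token derived from its name."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key):
        # The first node at or after the key's token coordinates the key;
        # the following nodes on the ring hold the remaining copies.
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        count = min(self.rf, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["A", "B", "C", "D", "E", "F"], replication_factor=3)
key1_replicas = ring.replicas("key1")  # three consecutive nodes on the ring
```

With N=3, every key ends up on its coordinator plus the next two nodes clockwise, which corresponds to the "Rack Unaware" placement.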
  • 35. Data Replication
    Each data item is replicated on N nodes, where N is the replication factor. Different replication policies:
    • Rack Unaware: replicates data on the N-1 successive nodes after its coordinator
    • Rack Aware: uses ZooKeeper to elect a leader, which tells nodes the ranges they are replicas for
    • Datacenter Aware: similar to Rack Aware, but the leader is chosen at datacenter level instead of rack level
    Note: these policies follow the original Facebook design. Apache Cassandra itself does not depend on ZooKeeper; replication is configured per keyspace with a replication strategy (SimpleStrategy or NetworkTopologyStrategy).
  • 36. Write Path
    When a write occurs, Cassandra stores the data in an in-memory structure, the memtable, and also appends the write to the commit log on disk, providing configurable durability.
  • 37. Write Requests
    The coordinator sends a write request to all replicas that own the row being written.
  • 38. Write Consistency
    The consistency level for writing to Cassandra specifies on how many replicas the write must succeed before an ACK is returned to the client.
    • Quorum: (replication_factor / 2) + 1, with the division rounded down to a whole number
  • 39. Read Path
    When a read request for a row arrives at a node, the row must be assembled from all SSTables on that node that contain columns of the row in question, as well as from any unflushed memtables, to produce the requested data.
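The write path and read path on a single node can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the `Node` class, the row-count flush threshold `memtable_limit` and the explicit integer timestamps are hypothetical, and real SSTables live on disk and are compacted over time.

```python
class Node:
    """Toy model of one node: commit log, memtable and immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # row key -> {column: (value, timestamp)}
        self.sstables = []     # flushed memtables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, row_key, column, value, timestamp):
        # 1. Append to the commit log for durability.
        self.commit_log.append((row_key, column, value, timestamp))
        # 2. Apply the write to the in-memory memtable.
        cells = self.memtable.setdefault(row_key, {})
        if column not in cells or timestamp >= cells[column][1]:
            cells[column] = (value, timestamp)
        # 3. Flush to a new SSTable once the memtable holds enough rows.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(self.memtable)
            self.memtable = {}

    def read(self, row_key):
        # Combine the row from every SSTable and the unflushed memtable,
        # keeping the cell with the newest timestamp for each column.
        merged = {}
        for table in self.sstables + [self.memtable]:
            for column, (value, ts) in table.get(row_key, {}).items():
                if column not in merged or ts >= merged[column][1]:
                    merged[column] = (value, ts)
        return {column: value for column, (value, _) in merged.items()}


node = Node(memtable_limit=2)
node.write("john", "age", 37, timestamp=1)
node.write("eric", "age", 38, timestamp=2)    # second row triggers a flush
node.write("john", "role", "dev", timestamp=3)
node.write("john", "age", 38, timestamp=4)    # newer value stays in the memtable
john = node.read("john")
```

Here the row for `john` is spread over one SSTable and the memtable, so the read has to merge both sources.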
  • 40. Read Requests
    There are two types of read requests that a coordinator can send to a replica:
    • A direct read request
    • A background read repair request
    The number of replicas contacted by a direct read request is determined by the consistency level specified by the client.
  • 41. Read Consistency
    The consistency level for reading from Cassandra specifies how many replicas must respond before a result is returned to the client.
    • Quorum: (replication_factor / 2) + 1, with the division rounded down to a whole number
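The quorum formula can be checked with a short calculation. The function names below are made up for this example. The overlap rule (a read sees the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor) is not stated on the slides and is added here as the usual reasoning behind choosing QUORUM for both reads and writes.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size: integer division by two, plus one."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set shares at least one replica with every write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)                                     # 2 of 3 replicas
quorum_both = read_overlaps_write(rf, q, q)        # True: 2 + 2 > 3
one_and_one = read_overlaps_write(rf, 1, 1)        # False: 1 + 1 <= 3
```

With a replication factor of 3, quorum reads and writes each tolerate one unavailable replica while still overlapping on at least one node.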
  • 43. Cassandra Data Model
    • A table is a multi-dimensional map indexed by a key (the row key)
    • Columns are grouped into column families
    • Dynamic schema design allows for much more flexible data storage than a rigid RDBMS schema
    • Each column has:
      - Name
      - Value
      - Timestamp
  • 44. How Cassandra stores data
    • Model brought from Google Bigtable
    • A row key and a lot of columns
    • Column names are sorted (UTF8, Int, Timestamp, etc.)
    [Figure: a row identified by its row key holds columns 1 to 2 billion, each with a column name, column value, timestamp and TTL; a table can hold billions of rows]
  • 45. Cassandra Data Model
    [Figure: a keyspace containing several column families]
  • 46. Row, row key, column key, and column value
    [Figure: a row with its row key, the column keys (or column names) cola, colb, colc, cold and the column values (or cells) va, vb, vc, vd]
    • Rows: individual rows constitute a column family
    • Row key: uniquely identifies a row in a column family
    • Row: stores pairs of column keys and column values
    • Column key: uniquely identifies a column value in a row
    • Column value: stores one value or a collection of values
  • 47. Static vs. Dynamic Column Family
    Static column family (skinny rows)
    • Contains a predefined set of columns with metadata
    • The number of columns can vary across rows within the column family
    • Similar to an RDBMS table, except that there are no NULL values
    Example rows:
      John Lennon | born: 1940 | country: England | died: 1980 | style: Rock | type: artist
      The Beatles | country: England | founded: 1957 | style: Rock | type: band
  • 48. What is a wide row?
    Rows may be described as “skinny” or “wide”
    • Wide row: has a relatively large number of column keys (hundreds or thousands); this number may increase as new data values are inserted
      - For example, a row that stores all bands of the same style
      - The number of such bands will increase as new bands are formed
    • Note that column values do not exist in this example
      - The column key, in this case a band name, stores all the data desired
      - The number of albums, the year founded, etc. could have been stored as column values
    Example row:
      Rock | The Animals | The Beatles | ...
    ©2014 DataStax Training. Use only with permission.
  • 49. What are composite row key and composite column key?
    • Composite row key: multiple components separated by a colon
      - ‘Revolver’ and 1966 are the album title and year
      - The ‘tracks’ value is a collection (map)
    • Composite column key: multiple components separated by a colon
      - Composite column keys are sorted by each component
    Example rows:
      Revolver:1966 | genre: Rock | performer: The Beatles | tracks: {1: 'Taxman', ..., 14: 'Tomorrow Never Knows'}
      Revolver:1966 | 1:title: Taxman | 2:title: Eleanor Rigby | ... | 14:title: Tomorrow Never Knows
    ©2014 DataStax Training. Use only with permission.
  • 50. Data Modelling with Cassandra: From Query to Model
    • De-normalize, de-normalize, de-normalize
      - Forget about old-school 3NF
      - De-normalize wherever you can for quicker retrieval, and let application logic handle the responsibility of reliably updating redundancies
    • Rows are gigantic and sorted
      - Giga-sized rows (2 billion columns max) can be used to store sortable and sliceable columns
      - Comments by timestamp, bids ordered by quoted price, ratings by product, ...
    • One row, one machine
      - Each row stays on one machine
      - Rows are not shared across nodes
      - Beware of this: don't create hotspots with a high-demand row!
  • 51. Remember this
    • Cassandra finds rows fast
    • Cassandra scans columns fast
    • Cassandra does not scan rows
  • 53. Cassandra API: Thrift vs. CQL
    Thrift
    • Exposes the internal storage structure of Cassandra pretty much directly
    • Complicated, low-level, full control
    • Legacy
    CQL
    • The new way to go
    • Provides a thin abstraction layer over Cassandra's internal structure
    • Hides some distracting and useless implementation details
    • Provides native syntax for common encodings/idioms (like collections) instead of letting each client library re-implement them in its own, different and thus incompatible way
  • 54. CQL Language
    • Very similar to RDBMS SQL syntax
    • Create objects via DDL (e.g. CREATE…)
    • Core DML commands supported: INSERT, UPDATE, DELETE
    • Query data with SELECT
    • Current version is CQL3
  • 55. CQL Shell for Apache Cassandra
    • cqlsh is the command-line utility for executing CQL commands (think of SQL*Plus for Cassandra)
    • CQL3 is the default since Cassandra 1.2
      $ cqlsh
      Connected to DataStaxCluster at localhost:9160.
      [cqlsh 4.1.0 | Cassandra 2.0.5.24 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
      Use HELP for help.
      cqlsh>
  • 56. The CQL/Cassandra Mapping: Static Table
      CREATE TABLE employee (
        name text PRIMARY KEY,
        age int,
        role text);
    CQL view:
       name | age | role
      ------+-----+------
       john |  37 | dev
       eric |  38 | ceo
    Internal storage view (row key, then one cell per column):
      john | age: 37 | role: dev
      eric | age: 38 | role: ceo
  • 57. Create a Dynamic table (wide-row) Employee
    A dynamic table is also created with the CREATE TABLE statement, but using a composite primary key
      cqlsh:training> CREATE TABLE employees (
                        company text,
                        name text,
                        age int,
                        role text,
                        PRIMARY KEY (company, name)
                      );
  • 58. The CQL/Cassandra Mapping: Dynamic Table
      CREATE TABLE employees (
        company text,
        name text,
        age int,
        role text,
        PRIMARY KEY (company, name)
      );
    CQL view:
       company | name | age | role
      ---------+------+-----+------
       OSC     | eric |  38 | ceo
       OSC     | john |  37 | dev
       RKG     | anya |  29 | lead
       RKG     | ben  |  27 | dev
       RKG     | chad |  35 | ops
    Internal storage view (one wide row per company, composite column keys name:column):
      OSC | eric:age: 38 | eric:role: ceo | john:age: 37 | john:role: dev
      RKG | anya:age: 29 | anya:role: lead | ben:age: 27 | ben:role: dev | chad:age: 35 | chad:role: ops
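The mapping from CQL rows to wide storage rows can be mimicked in Python to make the composite column keys explicit. This is an illustrative sketch only: `to_wide_rows` and `render` are hypothetical helpers, and the real storage format carries additional metadata such as timestamps.

```python
def to_wide_rows(cql_rows, partition_key, clustering_key):
    """Group CQL rows by partition key; each cell is keyed by (clustering value, column name)."""
    storage = {}
    for row in cql_rows:
        cells = storage.setdefault(row[partition_key], {})
        for column, value in row.items():
            if column not in (partition_key, clustering_key):
                cells[(row[clustering_key], column)] = value
    # Composite column keys are kept sorted component by component.
    return {key: dict(sorted(cells.items())) for key, cells in storage.items()}


def render(cells):
    """Show composite column keys in the name:column notation used on the slide."""
    return {f"{name}:{column}": value for (name, column), value in cells.items()}


employees = [
    {"company": "OSC", "name": "john", "age": 37, "role": "dev"},
    {"company": "OSC", "name": "eric", "age": 38, "role": "ceo"},
    {"company": "RKG", "name": "chad", "age": 35, "role": "ops"},
    {"company": "RKG", "name": "anya", "age": 29, "role": "lead"},
    {"company": "RKG", "name": "ben", "age": 27, "role": "dev"},
]
wide = to_wide_rows(employees, partition_key="company", clustering_key="name")
osc_row = render(wide["OSC"])
```

Each company becomes a single storage row, and the employees of that company become sorted groups of cells inside it.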
  • 59. Insert data into Employee
    • The INSERT command is similar to its SQL counterpart
    • The major difference is that the PRIMARY KEY is always required
    • If the same statement is executed twice, there will be no error
    • If the same PRIMARY KEY value is reused with different values for the other columns, the last write wins!
      cqlsh:training> INSERT INTO employee (name, age, role) VALUES ('john', 37, 'dev');
      cqlsh:training> INSERT INTO employee (name, age, role) VALUES ('eric', 38, 'ceo');
  • 60. Retrieving data from Employee table (II)
    • A restriction on a column other than the PRIMARY KEY won't work
    • This can be solved with an index (but be careful, de-normalization is usually the better choice)
      cqlsh:training> SELECT * FROM employee WHERE age = 37;
      Bad Request: No indexed columns present in by-columns clause with Equal operator
      cqlsh:training> CREATE INDEX employee_age_idx ON employee (age);
      cqlsh:training> SELECT * FROM employee WHERE age = 37;
       name | age | role
      ------+-----+------
       john |  37 | dev
      (1 rows)
  • 61. Update data in Employee
    • The UPDATE statement is similar to the SQL UPDATE command
    • Just as with INSERT, the PRIMARY KEY column must be specified as part of the UPDATE
    • In CQL, UPDATE does not check whether the row exists; if it does not exist, the row is simply created
      cqlsh:training> UPDATE employee SET age = 38 WHERE name = 'john';
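The "upsert" behaviour of INSERT and UPDATE, including the last-write-wins rule, can be modelled as follows. This is a sketch under assumptions: the `Table` class is hypothetical, and a simple counter stands in for the write timestamps that Cassandra attaches to each cell.

```python
import itertools


class Table:
    """Toy table where INSERT and UPDATE are the same operation (an upsert)."""

    def __init__(self):
        self.rows = {}                    # primary key -> {column: (value, timestamp)}
        self._clock = itertools.count(1)  # stand-in for write timestamps

    def upsert(self, key, **columns):
        ts = next(self._clock)
        row = self.rows.setdefault(key, {})  # creates the row if it does not exist
        for name, value in columns.items():
            if name not in row or ts >= row[name][1]:
                row[name] = (value, ts)      # the newest write wins

    insert = update = upsert  # no existence check in either direction

    def select(self, key):
        row = self.rows.get(key)
        return None if row is None else {name: value for name, (value, _) in row.items()}


employee = Table()
employee.insert("john", age=37, role="dev")
employee.insert("john", age=37, role="dev")  # repeating the statement raises no error
employee.update("john", age=38)              # overwrites only the age column
employee.update("anya", role="lead")         # UPDATE on a missing row creates it
john_row = employee.select("john")
```

Because neither statement reads before writing, the application cannot rely on INSERT failing for duplicate keys or on UPDATE failing for missing rows.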
  • 62. Cassandra Data Types
    Category | CQL Data Type | Description
    String   | ascii         | US-ASCII character string
    String   | text          | UTF-8 encoded string, used most of the time for storing string data
    String   | varchar       | UTF-8 encoded string
    String   | inet          | Used for storing IP addresses
    Numeric  | int           | 32-bit signed integer
    Numeric  | float         | 32-bit IEEE-754 floating point
    Numeric  | double        | 64-bit IEEE-754 floating point
    Numeric  | varint        | Arbitrary-precision integer
    Numeric  | bigint        | 64-bit signed integer, equivalent to a long
    Numeric  | decimal       | Variable-precision decimal
    Numeric  | counter       | Distributed counter value (64-bit long)
  • 63. Cassandra Data Types (II)
    Category      | CQL Data Type | Description
    UUIDs         | uuid          | A UUID in standard UUID format
    UUIDs         | timeuuid      | Type 1 UUID only, for storing unique time-based IDs
    Collections   | list          | Ordered collection of one or more elements
    Collections   | map           | Collection of arbitrary key-value pairs
    Collections   | set           | Unordered collection of one or more unique elements
    Miscellaneous | boolean       | Boolean (true/false)
    Miscellaneous | blob          | Used for storing binary data, written in hexadecimal
    Miscellaneous | timestamp     | Date/time
  • 64. Cassandra Data Types (III)
    TimeUUID
    • Has a few extra functions that allow extracting the time information
    • now() returns a new TimeUUID based on the current timestamp and ensures globally unique values
    • minTimeuuid() and maxTimeuuid() are used when querying ranges of TimeUUIDs
        WHERE event_time > maxTimeuuid('2013-01-01 00:05+0000')
          AND event_time < minTimeuuid('2013-02-02 10:00+0000')
    Counter
    • Counter columns cannot be mixed with non-counter columns in the same table (apart from the PRIMARY KEY columns)
    • The value cannot be set, only incremented or decremented by a specified amount
    • Counters may not be part of the PRIMARY KEY of the table
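The time information embedded in a TimeUUID can be inspected on the client side as well. The sketch below uses Python's standard `uuid` module to generate a version 1 UUID and recover its timestamp; the helper name `timeuuid_to_datetime` is made up for this example and is not part of any Cassandra driver.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100-nanosecond intervals since 1582-10-15.
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(value: uuid.UUID) -> datetime:
    """Extract the creation time from a version 1 (time-based) UUID."""
    if value.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    return UUID_EPOCH + timedelta(microseconds=value.time // 10)


event_id = uuid.uuid1()                    # client-side analogue of now()
created_at = timeuuid_to_datetime(event_id)
```

A random (version 4) UUID carries no timestamp, which is why the helper rejects it.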
  • 65. Collections
    CQL3 also supports collections for storing complex data structures
    • Set {value,…}, List [value,…], Map {key:value,…}
      cqlsh:training> CREATE TABLE collection_sample (
                        id int PRIMARY KEY,
                        string_set set<text>,
                        string_list list<text>,
                        string_map map<text, text>);
      cqlsh:training> INSERT INTO collection_sample (id, string_set, string_list, string_map)
                      VALUES (1, {'text1','text2','text1'}, ['text1','text2','text1'], {'key1':'value1'});
  • 66. Collections (II)
      cqlsh:training> SELECT * FROM collection_sample;
       id | string_list                 | string_map         | string_set
      ----+-----------------------------+--------------------+--------------------
        1 | ['text1', 'text2', 'text1'] | {'key1': 'value1'} | {'text1', 'text2'}
      (1 rows)
  • 67. Counter Columns
    Create a counter column table that counts “favorite” events
      cqlsh:training> CREATE TABLE favorites (
                        product_id int,
                        month int,
                        number COUNTER,
                        PRIMARY KEY (product_id, month));
      cqlsh:training> UPDATE favorites SET number = number + 1
                      WHERE product_id = 4910 AND month = 06;
      cqlsh:training> SELECT * FROM favorites;
       product_id | month | number
      ------------+-------+--------
             4910 |     6 |      1
  • 68. Time-to-Live (TTL) on Insert
    Insert a row with a TTL in seconds (30s); after that time the row is deleted
      cqlsh:training> INSERT INTO employee (name, age, role) VALUES ('bob', 29, 'dev') USING TTL 30;
      cqlsh:training> SELECT TTL(role) FROM employee WHERE name='bob';
       ttl(role)
      -----------
              22
      cqlsh:training> SELECT TTL(role) FROM employee WHERE name='bob';
      (0 rows)
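The TTL behaviour shown in the cqlsh session can be reproduced with a small model. This is a sketch, not how Cassandra expires data internally: the `TtlStore` class and its injectable `clock` are assumptions that make the example deterministic, and the elapsed times (8 and 30 seconds) are chosen to match the output above.

```python
class TtlStore:
    """Toy key-value store whose entries expire after a time-to-live."""

    def __init__(self, clock):
        self.clock = clock   # callable returning the current time in seconds
        self.cells = {}      # key -> (value, expiry time or None)

    def insert(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self.cells[key] = (value, expires_at)

    def _live(self, key):
        cell = self.cells.get(key)
        if cell is None:
            return None
        if cell[1] is not None and self.clock() >= cell[1]:
            del self.cells[key]   # expired entries disappear on access
            return None
        return cell

    def get(self, key):
        cell = self._live(key)
        return None if cell is None else cell[0]

    def ttl(self, key):
        """Remaining lifetime in seconds, or None if absent or without TTL."""
        cell = self._live(key)
        if cell is None or cell[1] is None:
            return None
        return cell[1] - self.clock()


now = [0]                                  # fake clock, advanced by hand
store = TtlStore(clock=lambda: now[0])
store.insert("bob", {"age": 29, "role": "dev"}, ttl=30)
now[0] = 8
remaining = store.ttl("bob")               # 22 seconds left
now[0] = 30
after_expiry = store.get("bob")            # None: the entry has expired
```

Passing the clock in as a parameter keeps the example testable without waiting 30 real seconds.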
  • 70. Trivadis / DataStax Partnership
    • We have been a DataStax silver partner since December 2014
    • DataStax Partner Network (DSPN)
    • Available certifications
      - Admin
      - Developer
      - Architect
    • Currently only one other partner in Switzerland: Intersys
    • http://www.datastax.com/partners
  • 71. Questions and answers ...
    Ulises Fasoli, Senior Consultant, +41 21 321 47 00, ulises.fasoli@trivadis.com