10. @clunven | @shiftConf_co | SHIFT DEV CONF 2019
Sweet spots
1. High Throughput (because we can keep up)
2. High Volume (because we scale linearly and still OLTP)
3. High Availability (replication, masterless)
4. Data distribution (read/write around the globe)
Application Workflow
R1: Find comments related to a target video, using its identifier
• Get most recent first
• Implement paging
R2: Find comments related to a target user, using their identifier
• Get most recent first
• Implement paging
R3: Implement CRUD operations
Mapping
Q1: Find comments for a video with a known id (show most recent first) → comments_by_video
Q2: Find comments posted by a user with a known id (show most recent first) → comments_by_user
Q3: CRUD operations
Logical Data Model
(Chebotko notation: K = partition key, C↓ = clustering column, descending order)

comments_by_user: userid (K), creationdate (C↓), commentid (C↓), videoid, comment
comments_by_video: videoid (K), creationdate (C↓), commentid (C↓), userid, comment
Physical Data Model

comments_by_user: userid UUID (K), commentid TIMEUUID (C↓), videoid UUID, comment TEXT
comments_by_video: videoid UUID (K), commentid TIMEUUID (C↓), userid UUID, comment TEXT

(The logical creationdate column folds into commentid: a TIMEUUID embeds the creation timestamp.)
Schema DDL
CREATE TABLE IF NOT EXISTS comments_by_user (
userid uuid,
commentid timeuuid,
videoid uuid,
comment text,
PRIMARY KEY ((userid), commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);
CREATE TABLE IF NOT EXISTS comments_by_video (
videoid uuid,
commentid timeuuid,
userid uuid,
comment text,
PRIMARY KEY ((videoid), commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);
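A sketch of how the three access patterns map onto this schema (the `?` bind markers and the page size of 10 are illustrative; real paging normally uses the driver's fetch size and paging state rather than LIMIT):

```sql
-- Q1: comments for a known video, newest first (the clustering order is
-- already DESC, so no ORDER BY is needed)
SELECT userid, commentid, comment
FROM comments_by_video
WHERE videoid = ?
LIMIT 10;

-- Q2: comments posted by a known user, newest first
SELECT videoid, commentid, comment
FROM comments_by_user
WHERE userid = ?
LIMIT 10;

-- R3 (create): the same comment is written to both tables; a logged batch
-- keeps the two denormalized copies in step. The commentid is one
-- client-generated TIMEUUID reused in both inserts.
BEGIN BATCH
  INSERT INTO comments_by_video (videoid, commentid, userid, comment)
  VALUES (?, ?, ?, ?);
  INSERT INTO comments_by_user (userid, commentid, videoid, comment)
  VALUES (?, ?, ?, ?);
APPLY BATCH;
```

Reads never need a batch: each query hits exactly one partition in one table, which is the point of modeling per query.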
How?
• Conceptual Data Model (Entities, Relations)
• Application Workflow (Queries)
• Database Family (Technologies + Tables)
REST
Strengths:
• Decoupling client / server (schema on read)
• API lifecycle (versioning)
• Tooling (API management, serverless)
Weaknesses:
• Verbose payloads (JSON, XML)
• No discoverability
• Not suitable for command-like (function) APIs
Good fit:
• CRUD superstar
• Relevant for mutations (OLTP)
• Public and web APIs
gRPC
Strengths:
• High performance (HTTP/2, binary serialisation)
• Multiple stubs: sync, async, streaming
• Multi-language interoperability
Weaknesses:
• Strongly coupled (schema shared via proto files)
• No discoverability
• Tied to the Protobuf serialization format
Good fit:
• Distributed networks of services (no waits)
• High-throughput & streaming use cases
• Command-like APIs (e.g. Slack)
GraphQL
Strengths:
• Discoverability, documentation
• Custom payloads (clients ask only for the fields they need)
• Matches standards (JSON | HTTP)
• Single endpoint (versioning, monitoring, security)
Weaknesses:
• Complex implementation (tooling still young)
• Nice for consumers, nasty for the DB (N+1 selects)
Good fit:
• Backend for frontend (JS)
• Service aggregation | composition (joins)
• When data volume matters (mobile phones)
Thank you
I will talk to you about …..
What do you do? 80% of the time you access your data, 20% of the time you execute functions.
One of Cassandra's fault-tolerance strategies is replication.
Replication is a matter of duplicating data across nodes.
The number of replicas is called the Replication Factor.
Let’s look at some examples…
<click> (RF=1 appears)
Let’s start with the simplest example:
a replication factor of 1 – only a single copy
It’s not something you would likely do in production,
but it's a good place to start the discussion.
<click> (data appears)
Here we’re showing a write request
Some data with a partition token of 59
<click> (data moves to node)
The top-right node will serve as the coordinator
<click> (data turns purple)
Notice 59 falls in the purple range
<click> (data move to the node)
So the coordinator forwards the data to the purple node
<click> (data clears)
<click> (RF=2 appears)
Let's increase the replication factor to 2
<click> (ring colors double)
This doubles the range that each node is responsible for.
For example, node 75 becomes responsible for
the red range
and the purple range
<click> (data appears)
Again our request to write partition with token 59 arrives.
But this time the coordinator sends it to two nodes.
<click> (data moves)
<click> (data fades)
Let's increase the replication factor to 3
<click> (RF=3 appears)
This means that each node is responsible for 3 ranges
<click> (3 ranges appear)
Once again, the data arrives at the coordinator
Where will the coordinator send the data this time?
<click> (data moves)
We see the data replicated to all three nodes
<click> (data fades)
So, in a nutshell, that's how replication works
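The replication factor from this walkthrough is set per keyspace. A minimal sketch, assuming a single-datacenter demo cluster (the keyspace name is illustrative):

```sql
-- SimpleStrategy is fine for a demo; production clusters normally use
-- NetworkTopologyStrategy with a replication factor per datacenter
CREATE KEYSPACE IF NOT EXISTS killrvideo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```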
Consistency level is different from replication factor.
On a read, the consistency level is how many replicas you read.
Each replica carries a timestamp.
The most recent replica wins.
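The write timestamp the coordinator compares is visible from CQL. A sketch using the comments table from earlier (the `?` bind marker is illustrative):

```sql
-- WRITETIME returns the microsecond timestamp stored with each column value;
-- when replicas disagree, the value with the highest write time wins
SELECT comment, WRITETIME(comment)
FROM comments_by_video
WHERE videoid = ?;
```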
In this example, imagine we have a replication factor of 3.
<click> (shows write arrows)
So when we write, we write 3 replicas of the data.
<click> (shows CL=ONE)
Now let's say we want to read from this cluster with a consistency level of 1.
In this case, we only need to read from a single node to resolve the data.
<click> (shows read arrows)
Now, we can change the consistency level to quorum,
<click> (CL=QUORUM appears)
Which means we want to read a majority of the replicas.
Quorum is ⌊RF/2⌋ + 1, so since the replication factor is 3, quorum implies reading 2 replicas.
<click> (second read line appears)
Notice, if the replicas disagree, the coordinator returns the data with the most recent time stamp
<click> (CL clears)
We can even specify a consistency level of ALL,
<click> (CL=ALL appears)
Which means we will read all replicas
<click> (third read line appears)
(pause to let people absorb)
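In cqlsh the consistency level from this walkthrough can be set per session; a sketch (CONSISTENCY is a cqlsh command, not CQL itself — drivers set consistency per statement instead; the videoid literal is a made-up example):

```sql
CONSISTENCY QUORUM;   -- with RF=3, a quorum read touches 2 replicas

SELECT comment
FROM comments_by_video
WHERE videoid = 12345678-1234-1234-1234-123456789012;
```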
ALEXANDRE
A few logos:
As we already said, we are using Java, so why not use the latest, Java 12?
Services are implemented and connected with Spring.
Everything is wrapped into a Spring Boot 2.1 application.
Services are exposed as REST endpoints with Spring MVC.
Did you see our gray hair and beards? We do Java; we are serious people and do not play with that teenager language, JavaScript.
ALEXANDRE
Show the repository, stress its simplicity and the blocking calls
Show the controller, same thing
Show the controller unit test, run the tests