2. @doanduyhai
Who am I?!
Duy Hai DOAN
Cassandra technical advocate
• talks, meetups, confs
• open-source devs (Achilles, …)
• OSS Cassandra point of contact
☞ duy_hai.doan@datastax.com
☞ @doanduyhai
3. @doanduyhai
Datastax!
• Founded in April 2010
• We contribute a lot to Apache Cassandra™
• 400+ customers (25 of the Fortune 100), 400+ employees
• Headquarters in the San Francisco Bay Area
• EU headquarters in London, offices in France and Germany
• Datastax Enterprise = OSS Cassandra + extra features
5. @doanduyhai
What is Apache Spark?!
Created at UC Berkeley (AMPLab)
Open-sourced in 2010, an Apache project since 2013
General data processing framework
Faster than Hadoop MapReduce thanks to in-memory processing
One-framework-many-components approach
15. @doanduyhai
What is Apache Cassandra?!
Created at Facebook
Apache project since 2009
Distributed NoSQL database
Eventual consistency (favors A & P of the CAP theorem)
Distributed table abstraction
16. @doanduyhai
Cassandra data distribution reminder!
Random partitioning: the partition key is hashed → token = hash(#partition)
Token range: ]-X, X]
X = a huge number (2^64 / 2 = 2^63)
[Diagram: ring of 8 nodes, n1 … n8, each owning a slice of the token range]
17. @doanduyhai
Cassandra token ranges!
A: ]0, X/8]
B: ]X/8, 2X/8]
C: ]2X/8, 3X/8]
D: ]3X/8, 4X/8]
E: ]4X/8, 5X/8]
F: ]5X/8, 6X/8]
G: ]6X/8, 7X/8]
H: ]7X/8, X]
Murmur3 hash function
[Diagram: ring of 8 nodes n1 … n8, each assigned one token range A … H]
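As a toy sketch of this split (an illustration only, not Cassandra code, assuming the simplified picture above: the positive part ]0, X] of the ring, eight equal ranges, no vnodes), finding the token range that owns a token is a simple interval lookup:

```python
# Illustrative sketch: Murmur3 maps each partition key to a 64-bit token;
# here the ]0, X] portion of the ring is split into the 8 ranges A..H
# from the table above.
X = 2 ** 63  # upper bound of the token space (2^64 / 2)

# range label -> half-open interval ]lo, hi]
RANGES = {chr(ord("A") + i): ((i * X) // 8, ((i + 1) * X) // 8) for i in range(8)}

def range_of(token):
    """Return the label of the token range ]lo, hi] containing `token`."""
    for label, (lo, hi) in RANGES.items():
        if lo < token <= hi:
            return label
    raise ValueError("token outside ]0, X]")
```

Note the intervals are open on the left and closed on the right, matching the `]lo, hi]` notation on the slide.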
20. @doanduyhai
Cassandra Query Language (CQL)!
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
UPDATE users SET age = 34 WHERE login = 'jdoe';
DELETE age FROM users WHERE login = 'jdoe';
SELECT age FROM users WHERE login = 'jdoe';
23. @doanduyhai
Connector architecture – Core API!
Cassandra tables exposed as Spark RDDs
Read from and write to Cassandra
Mapping of C* tables and rows to Scala objects
• CassandraRDD and CassandraRow
• Scala case class (object mapper)
• Scala tuples
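The object-mapper idea can be sketched like this (illustrative only, not the connector's code; the `User` class and `map_row` helper are hypothetical): a row is a set of named column values, and the mapper binds them to a typed object's fields by name, the way the connector binds CassandraRow columns to Scala case class fields.

```python
from dataclasses import dataclass, fields

@dataclass
class User:  # hypothetical table: users(login, name, age)
    login: str
    name: str
    age: int

def map_row(row, cls):
    """Bind the row's named column values to `cls` fields by name."""
    return cls(**{f.name: row[f.name] for f in fields(cls)})

user = map_row({"login": "jdoe", "name": "John DOE", "age": 33}, User)
```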
25. @doanduyhai
Connector architecture – Spark SQL !
Mapping of a Cassandra table to a SchemaRDD
• CassandraSQLRow → SparkRow
• custom query plan
• push predicates to CQL for early filtering
SELECT * FROM user_emails WHERE login = 'jdoe';
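Predicate push-down can be sketched as follows (a toy illustration, not the connector's query planner; the `push_down` helper is hypothetical): supported predicates are folded into the CQL statement sent to Cassandra instead of being evaluated in Spark after a full table fetch.

```python
def push_down(table, predicates):
    """Build a CQL query embedding (column, op, value) predicates."""
    where = " AND ".join(f"{col} {op} '{val}'" for col, op, val in predicates)
    return f"SELECT * FROM {table} WHERE {where};"

# The slide's example query, rebuilt from a pushed predicate:
query = push_down("user_emails", [("login", "=", "jdoe")])
```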
27. @doanduyhai
Connector architecture – Spark Streaming !
Streaming data INTO a Cassandra table
• trivial setup
• be careful with your Cassandra data model when ingesting an infinite stream!
Streaming data OUT of Cassandra tables (CDC)?
• work in progress …
33. @doanduyhai
Cassandra – Spark job lifecycle!
[Diagram: the Driver Program with its Spark Context runs on the Spark Client, talks to the Spark Master, which dispatches work to 4 Spark Workers, each co-located with a Cassandra node. ① Define your business logic in the driver program!]
43. @doanduyhai
Write to Cassandra without data locality!
Async batches fan out writes to Cassandra
Because of the shuffle, the original data locality is lost
44. @doanduyhai
Write to Cassandra with data locality!
Write to Cassandra
rdd.repartitionByCassandraReplica("keyspace","table")
45. @doanduyhai
Write data locality!
• either repartition data in the Spark layer using repartitionByCassandraReplica()
• or flush data to Cassandra with async batches
• in any case, there will be data movement over the network (sorry, no magic)
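What repartitionByCassandraReplica achieves can be sketched like this (an illustration with a hypothetical `owner` lookup, not the connector's implementation): rows are grouped by the node owning their partition key, so each group can then be written locally.

```python
NODES = ["n1", "n2", "n3", "n4"]  # hypothetical 4-node ring

def owner(partition_key):
    """Toy stand-in for the token(partition_key) -> replica lookup."""
    return NODES[hash(partition_key) % len(NODES)]

def repartition_by_replica(rows, key_of):
    """Group rows by the node owning their partition key."""
    groups = {node: [] for node in NODES}
    for row in rows:
        groups[owner(key_of(row))].append(row)
    return groups

# Rows keyed by login; each group ends up on the replica that owns it.
groups = repartition_by_replica([("jdoe", 33), ("asmith", 25)], lambda r: r[0])
```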
46. @doanduyhai
Joins with data locality!
CREATE TABLE artists(name text, style text, … PRIMARY KEY(name));
CREATE TABLE albums(title text, artist text, year int,… PRIMARY KEY(title));
val join: CassandraJoinRDD[(String,Int), (String,String)] =
  sc.cassandraTable[(String,Int)](KEYSPACE, ALBUMS)
    // Select only the columns useful for the join and processing
    .select("artist", "year")
    .as((_: String, _: Int))
    // Repartition the RDD by the "artists" PK, which is "name"
    .repartitionByCassandraReplica(KEYSPACE, ARTISTS)
    // Join with the "artists" table, selecting only "name" and "country"
    .joinWithCassandraTable[(String,String)](KEYSPACE, ARTISTS, SomeColumns("name","country"))
    .on(SomeColumns("name"))
54. @doanduyhai
Perfect data locality scenario!
• read locally from Cassandra
• use operations that do not require a shuffle in Spark (map, filter, …)
• repartitionByCassandraReplica()
• → to a table having the same partition key as the original table
• save back into this Cassandra table
USE CASE: sanitize, validate, normalize, transform data
57. @doanduyhai
Failure handling!
What if 1 node is down?
What if 1 node is overloaded?
☞ the Spark master will re-assign the job to another worker
[Diagram: Spark master and 5 Spark workers, each co-located with a Cassandra node]
62. @doanduyhai
Failure handling!
If RF > 1, the Spark master chooses the
next preferred location, which
is a replica 😎
Tune parameters:
① spark.locality.wait
② spark.locality.wait.process
③ spark.locality.wait.node
[Diagram: Spark master and 5 Spark workers, each co-located with a Cassandra node]
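A hedged example of tuning these in spark-defaults.conf (the values shown are Spark's defaults, given purely for illustration, not as a recommendation):

```
# How long to wait for a data-local task slot before the scheduler falls
# back to a less-local level (process-local -> node-local -> rack -> any)
spark.locality.wait          3s
spark.locality.wait.process  3s
spark.locality.wait.node     3s
```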
63. @doanduyhai
Cross-DC operations!

val confDC1 = new SparkConf(true)
  .setAppName("data_migration")
  .setMaster("master_ip")
  .set("spark.cassandra.connection.host", "DC_1_hostnames")
  .set("spark.cassandra.connection.local_dc", "DC_1")

val confDC2 = new SparkConf(true)
  .setAppName("data_migration")
  .setMaster("master_ip")
  .set("spark.cassandra.connection.host", "DC_2_hostnames")
  .set("spark.cassandra.connection.local_dc", "DC_2")

val sc = new SparkContext(confDC1)

sc.cassandraTable[Performer](KEYSPACE, PERFORMERS)
  .map[Performer](???)
  .saveToCassandra(KEYSPACE, PERFORMERS)(
    CassandraConnector(confDC2), implicitly[RowWriterFactory[Performer]])
64. @doanduyhai
Cross-cluster operations!

val confCluster1 = new SparkConf(true)
  .setAppName("data_migration")
  .setMaster("master_ip")
  .set("spark.cassandra.connection.host", "cluster_1_hostnames")

val confCluster2 = new SparkConf(true)
  .setAppName("data_migration")
  .setMaster("master_ip")
  .set("spark.cassandra.connection.host", "cluster_2_hostnames")

val sc = new SparkContext(confCluster1)

sc.cassandraTable[Performer](KEYSPACE, PERFORMERS)
  .map[Performer](???)
  .saveToCassandra(KEYSPACE, PERFORMERS)(
    CassandraConnector(confCluster2), implicitly[RowWriterFactory[Performer]])
66. @doanduyhai
Use Cases!
Load data from various sources
Analytics (join, aggregate, transform, …)
Sanitize, validate, normalize, transform data
Schema migration, data conversion
67. @doanduyhai
Data cleaning use-cases!
Bug in your application?
Dirty input data?
☞ a Spark job to clean it up! (perfect data locality)
Sanitize, validate, normalize, transform data
71. @doanduyhai
Analytics use-cases!
Given existing tables of performers and albums, I want:
① the top 10 most common music styles (pop, rock, RnB, …)
② performer productivity (album count) by origin country and by decade
☞ a Spark job to compute the analytics!
Analytics (join, aggregate, transform, …)
72. @doanduyhai
Analytics pipeline!
① Read from production transactional tables
② Perform aggregation with Spark
③ Save back data into dedicated tables for fast visualization
④ Repeat step ①
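Step ② for the "top 10 music styles" question can be sketched outside Spark (illustrative Python with hypothetical data, not the actual job): count styles across performers, keep the most common, and that small result is what gets saved to a dedicated table for fast visualization.

```python
from collections import Counter

def top_styles(performers, n=10):
    """Count styles across (name, style) performers; return the n most common."""
    return Counter(style for _, style in performers).most_common(n)

top = top_styles([("a", "rock"), ("b", "pop"), ("c", "rock")], n=2)
```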
78. @doanduyhai
Spark memory config!
The Spark master is lightweight, with few resource requirements
Spark workers & executors require much more memory
Spark memory is computed from settings in dse.yaml:
initial_spark_worker_resources: 0.7
79. @doanduyhai
Spark memory config!
Can be manually overridden in spark-env.sh:
# The amount of memory used by the Spark Worker, cluster-wide
export SPARK_WORKER_MEMORY="4096M"
# The amount of memory used by one Spark node
export SPARK_MEM="1024M"
# The amount of memory used by the Spark Shell (client)
export SPARK_REPL_MEM="256M"
80. @doanduyhai
Spark CPU (cores) config!
Spark cores are computed from settings in dse.yaml:
initial_spark_worker_resources: 0.7
81. @doanduyhai
Spark CPU (cores) config!
Can be manually overridden in spark-env.sh:
# Set the number of cores used by the Spark Worker
export SPARK_WORKER_CORES="2"
# The default number of cores assigned to each application.
# It can be modified in a particular application.
# If left blank, each Spark application will consume all
# available cores in the cluster.
export DEFAULT_PER_APP_CORES="3"
82. @doanduyhai
Important architecture considerations!
In theory
• 1 Spark job is run inside 1 Spark executor
• 1 Spark worker can have 1..N executors (1..N jobs)
In stand-alone mode
• 1 Spark worker cannot have more than 1 executor (see SPARK-1607)
• workaround: increase SPARK_WORKER_INSTANCES in spark-env.sh
83. @doanduyhai
Important resources considerations!
Memory
• the more the better for Spark (beware of JVM GC cycles)
• leave enough memory for the OS page cache (used by Cassandra)
Cores
• beware of context switching under highly contended load
• 1 Spark DStream = 1 core used!
General resources
• if there are not enough resources for all jobs → schedule them!
84. The “Big Data Platform”!
Analytics & Search!
Integrated High Availability!
85. @doanduyhai
Our vision!
We had a dream …
to provide an integrated Big Data Platform …
built for the performance & high availability demands
of IoT, web & mobile applications
86. @doanduyhai
Datastax Enterprise!
Cassandra + Solr in same JVM
Unlock full text search power for Cassandra
CQL syntax extension
SELECT * FROM users WHERE solr_query = 'age:[33 TO *] AND gender:male';
SELECT * FROM users WHERE solr_query = 'lastname:*schwei?er';
91. @doanduyhai
How to speed up Spark jobs ?!
Brute force solution for rich people:
• add more RAM (100 GB, 200 GB, …)
92. @doanduyhai
How to speed up Spark jobs ?!
Smart solution for broke people:
• filter the initial dataset as much as possible
93. @doanduyhai
The idea!
① filter as much as possible with DSE Search
② fetch only a small data set into memory
③ aggregate with Spark
☞ near-real-time interactive analytics queries become possible with restrictive criteria
98. @doanduyhai
Stand-alone search cluster caveat!
The ops team won't be your friend
• 3 clusters to manage: Spark, Cassandra & Solr/whatever
• lots of moving parts
Impacts on Spark jobs
• increased response time due to network latency
• the 99.9th percentile latency can be very high
100. @doanduyhai
Datastax Enterprise 4.7!
① compute Spark partitions using Cassandra token ranges
② on each partition, use Solr for local data filtering (no distributed query!)
③ fetch the data back into Spark for aggregations

val join: CassandraJoinRDD[(String,Int), (String,String)] =
  sc.cassandraTable[(String,Int)](KEYSPACE, ALBUMS)
    // Select only the columns useful for the join and processing
    .select("artist", "year").where("solr_query = 'style:*rock* AND ratings:[3 TO *]'")
    .as((_: String, _: Int))
    .repartitionByCassandraReplica(KEYSPACE, ARTISTS)
    .joinWithCassandraTable[(String,String)](KEYSPACE, ARTISTS, SomeColumns("name","country"))
    .on(SomeColumns("name")).where("solr_query = 'age:[20 TO 30]'")
101. @doanduyhai
Datastax Enterprise 4.7!
Token range: ]3X/8, 4X/8]

SELECT … FROM …
WHERE token(#partition) > 3X/8
AND token(#partition) <= 4X/8
AND solr_query = 'full text search expression';

Advantages of the same-JVM Cassandra + Solr integration:
① single-pass local full text search (no fan-out)
② local data retrieval
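Building the per-partition statement shown above can be sketched like this (illustrative only, not DSE internals; the `partition_query` helper and the `pk` column parameter are hypothetical): each Spark partition queries its local node, restricted to one token range, with the solr_query predicate for local filtering.

```python
def partition_query(table, pk, lo, hi, solr_query):
    """CQL for one Spark partition covering the token range ]lo, hi]."""
    return (f"SELECT * FROM {table} "
            f"WHERE token({pk}) > {lo} AND token({pk}) <= {hi} "
            f"AND solr_query = '{solr_query}';")

# Toy bounds stand in for the real 64-bit token range endpoints.
q = partition_query("albums", "title", 3, 4, "style:*rock*")
```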
105. @doanduyhai
Integrated High Availability!
DSE 4.7 will feature a new general leader election framework
• using Cassandra's robust Gossip protocol
• using Cassandra's battle-proven Phi Accrual failure detector
• using Cassandra's Lightweight Transaction (Paxos) primitive for leader election
106. @doanduyhai
Integrated High Availability!
Applied to Spark
• start the Spark master process inside DSE
• use Cassandra's Gossip & failure detector to track the Spark master's state
• use Cassandra's Lightweight Transactions for robust master election