Apache Spark
Mate Gulyas
CTO & Co-Founder
GULYÁS MÁTÉ
@gulyasm
Getting Started
Spark Core
Spark SQL
Spark Streaming
MLlib
GraphX
Cluster Managers
UNIFIED STACK
RDD API
DataFrame API
Dataset API
❏ Scala
❏ Java
❏ Python
❏ R
WHICH LANGUAGE TO SPARK ON?
SPARK INSTALL
DRIVER
SPARKCONTEXT
DRIVER PROGRAM
Your main function. This is what you write.
It launches parallel operations on the cluster. The
driver accesses Spark through SparkContext.
You access the computing cluster via SparkContext.
Via SparkContext you can create RDDs.
❏ INTERACTIVE
❏ STANDALONE
A “SPARK SOFTWARE”
Resilient Distributed
Dataset (RDD)
THE MAIN ATTRACTION
RDD
❏ TRANSFORMATION
❏ ACTION
OPERATIONS ON RDD
CREATES ANOTHER RDD
TRANSFORMATION
CALCULATES A VALUE AND RETURNS IT
TO THE DRIVER PROGRAM
ACTION
LAZY EVALUATION
INTERACTIVE
❏ The code: github.com/gulyasm/bigdata
❏ Apache Spark site: spark.apache.org
❏ User mailing list
❏ Spark books
MATERIALS
MATE GULYAS
gulyasm@enbrite.ly
@gulyasm
@enbritely
THANK YOU!
TRANSFORMATIONS
ACTIONS
LAZY EVALUATION
LIFECYCLE OF A SPARK PROGRAM
1. READ DATA FROM EXTERNAL SOURCE
2. CREATE LAZILY EVALUATED
TRANSFORMATIONS
3. CACHE ANY INTERMEDIATE RDD TO REUSE
4. KICK IT OFF BY CALLING SOME ACTION
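The four steps above can be sketched without a cluster. The following is a toy model in plain Python, not Spark's API: `ToyRDD`, `read_source`, and the sample data are hypothetical names invented for illustration. It only shows the idea that transformations record work, `cache()` marks a result for reuse, and an action is what actually runs the chain.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark code)."""

    def __init__(self, compute):
        self._compute = compute    # zero-argument callable returning an iterator
        self._want_cache = False
        self._cached = None

    def _iterator(self):
        # Serve from the cache if this dataset was already materialized.
        if self._cached is not None:
            return iter(self._cached)
        if self._want_cache:
            self._cached = list(self._compute())
            return iter(self._cached)
        return self._compute()

    # Transformations: build a new ToyRDD, compute nothing yet.
    def map(self, f):
        return ToyRDD(lambda: (f(x) for x in self._iterator()))

    def filter(self, pred):
        return ToyRDD(lambda: (x for x in self._iterator() if pred(x)))

    def cache(self):
        self._want_cache = True
        return self

    # Actions: run the recorded chain and return a value to the caller.
    def collect(self):
        return list(self._iterator())

    def count(self):
        return sum(1 for _ in self._iterator())


reads = []  # records every time the "external source" is really read


def read_source():
    reads.append("read")
    return iter(["spark", "", "rdd", "spark"])


lines = ToyRDD(read_source)                              # 1. read data (lazily)
nonempty = lines.filter(lambda s: len(s) > 0).cache()    # 2-3. transformation, cached
reads_before_action = list(reads)                        # still empty: nothing ran
lengths = nonempty.map(len).collect()                    # 4. first action runs the chain
total = nonempty.count()                                 # second action reuses the cache

print(reads_before_action, lengths, total, reads)
```

In this sketch the source is read once even though two actions are called, which is the reason step 3 exists.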
PARTITIONS
RDD INTERNALS
RDD INTERFACE
➔ set of PARTITIONS
➔ list of DEPENDENCIES on PARENT RDDs
➔ functions to COMPUTE a partition given parents
➔ preferred LOCATIONS (optional)
➔ PARTITIONER for K/V pairs (optional)
MULTIPLE RDDs
/**
 * :: DeveloperApi ::
 * Implemented by subclasses to compute a given partition.
 */
@DeveloperApi
def compute(split: Partition, context: TaskContext): Iterator[T]

/** Implemented by subclasses to return the set of partitions in this RDD. */
protected def getPartitions: Array[Partition]

/** Implemented by subclasses to return how this RDD depends on parent RDDs. */
protected def getDependencies: Seq[Dependency[_]] = deps

/** Optionally overridden by subclasses to specify placement preferences. */
protected def getPreferredLocations(split: Partition): Seq[String] = Nil

/** Optionally overridden by subclasses to specify how they are partitioned. */
@transient val partitioner: Option[Partitioner] = None
INTERNALS
THE IMPORTANT PART
❏ HOW EXECUTION WORKS
❏ TERMINOLOGY
❏ WHAT SHOULD WE CARE ABOUT
PIPELINING
❏ Parallel to CPU pipelining
❏ More steps at a time
❏ Recap: because of lazy evaluation, computation
kicks off only when an action is called
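The pipelining idea can be illustrated with plain Python generators. This is only an analogy under stated assumptions, not how Spark is implemented: the function names and the `trace` list are hypothetical. It shows each element passing through all chained steps before the next element is touched, with no intermediate collection built between steps.

```python
trace = []  # records which step touched which element, in order


def split_words(lines):
    # Plays the role of flatMap: one line in, many words out.
    for line in lines:
        for word in line.split(" "):
            trace.append(("flatMap", word))
            yield word


def keep_nonempty(words):
    # Plays the role of filter.
    for word in words:
        trace.append(("filter", word))
        if len(word) > 0:
            yield word


def to_pairs(words):
    # Plays the role of map.
    for word in words:
        trace.append(("map", word))
        yield (word, 1)


# Chaining the generators does no work yet (lazy evaluation).
pairs = to_pairs(keep_nonempty(split_words(["a b", "c"])))
trace_before = list(trace)

# Consuming the result is the "action" that pulls elements through.
result = list(pairs)

print(trace_before)
print(trace[:3])
print(result)
```

The first three trace entries all concern the word `a`: it goes through flatMap, filter, and map before `b` is even produced.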
PIPELINING
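# sc is the SparkContext (created for you in the PySpark shell)
text = sc.textFile("twit1.txt")
# split every line into words
words = text.flatMap(lambda x: x.split(" "))
# drop empty strings
fwords = words.filter(lambda x: len(x) > 0)
# pair every word with a count of 1
ones = fwords.map(lambda x: (x, 1))
# sum the counts per word
result = ones.reduceByKey(lambda l, r: l + r)
# action: triggers the computation
result.collect()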
PIPELINING
text = sc.textFile( )
words = text.flatMap( )
fwords = words.filter( )
ones = fwords.map( )
result = ones.reduceByKey( )
result.collect()
PIPELINING
sc.textFile( )
.flatMap( )
.filter( )
.map( )
.reduceByKey( )
PIPELINING
sc.textFile().flatMap().filter().map().reduceByKey()
[Diagram: each call yields an RDD: textFile() -> text, flatMap() -> words, filter() -> fwords, map() -> ones, reduceByKey() -> result]
PIPELINING
PIPELINING
def runJob[T, U](
    rdd: RDD[T],
    partitions: Seq[Int],
    func: (Iterator[T]) => U
): Array[U]
[Diagram: the chain of RDDs (text, words, fwords, ones, result), with the action collect() called on result]
PIPELINING
JOB
❏ Basically an action
❏ An action creates a job
❏ A whole computation with all
dependencies
[Diagram: the chain of RDDs (text, words, fwords, ones, result); collect() on result creates one Job that spans the whole chain]
STAGE
❏ Unit of execution
❏ Named after the last transformation
(the one runJob was called on)
❏ Transformations pipelined together into
stages
❏ Stage boundary usually means shuffling
[Diagram: the chain of RDDs (text, words, fwords, ones, result) and its Job, divided into Stage 1 and Stage 2]
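How a chain of transformations might be cut into stages can be sketched as follows. This is a deliberate simplification and not Spark's scheduler: the `WIDE` set is a hand-picked, hypothetical list of operations that need a shuffle, whereas Spark decides from the dependency types between RDDs. `split_into_stages` is likewise a name invented for this sketch.

```python
# Assumption for the sketch: these operations require a shuffle.
WIDE = {"reduceByKey", "groupByKey", "repartition"}


def split_into_stages(ops):
    """Group consecutive operations; start a new stage at every shuffle."""
    stages, current = [], []
    for op in ops:
        if op in WIDE and current:
            stages.append(current)   # close the pipelined stage
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


stages = split_into_stages(
    ["textFile", "flatMap", "filter", "map", "reduceByKey"]
)
for number, stage in enumerate(stages, start=1):
    print("Stage", number, stage)
```

For the word-count chain this toy model yields two stages, with the pipelined narrow transformations in the first and `reduceByKey` opening the second.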
[Diagram: before the Shuffle, each RDD is split into partitions PT1 and PT2; after the Shuffle, a single partition PT1 remains]
Repartitioning
text = sc.textFile("twit1.txt")
words = text.flatMap(lambda x: x.split(" "))
fwords = words.filter(lambda x: len(x) > 1)
ones = fwords.map(lambda x: (x, 1))
# redistribute the pairs across 6 partitions (this causes a shuffle)
rp = ones.repartition(6)
result = rp.reduceByKey(lambda l, r: l + r)
result.collect()
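What a shuffle does to key/value pairs can be sketched in plain Python. This is a toy model, not Spark code: `partition_of`, `shuffle`, and `reduce_by_key` are hypothetical helpers, and CRC32 is used only to get a hash that is stable between runs. It models the key-based routing behind an operation like reduceByKey; it is not meant to reproduce how repartition() itself spreads records.

```python
import zlib


def partition_of(key, num_partitions):
    # Deterministic hash, so every run routes a key to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(partitions, num_partitions):
    """Move every (key, value) pair to the partition chosen by its key."""
    out = [[] for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            out[partition_of(key, num_partitions)].append((key, value))
    return out


def reduce_by_key(partition, func):
    """Combine the values of equal keys inside one partition."""
    acc = {}
    for key, value in partition:
        acc[key] = func(acc[key], value) if key in acc else value
    return acc


# Two input partitions; the key "spark" appears in both.
before = [[("spark", 1), ("rdd", 1)], [("spark", 1), ("job", 1)]]
after = shuffle(before, 3)

counts = {}
for part in after:
    counts.update(reduce_by_key(part, lambda l, r: l + r))

print(counts)
```

Because all pairs with the same key end up in the same partition, each partition can be reduced on its own, which is why the work after the shuffle can again run in parallel.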
THE PROCESS
RDD Objects -> (DAG) -> DAG Scheduler -> (TaskSet) -> Task Scheduler -> (Task) -> Executor

RDD Objects
sc.textFile().map().groupBy().filter()
- Build DAG of operators

DAG Scheduler
- Splits the DAG into stages of tasks
- A stage is ready when ALL the tasks it depends on are finished

Task Scheduler
- Launches tasks
- Retries failed tasks

Executor (Block manager, Task threads)
- Stores and serves blocks
- Executes tasks

Budapest Spark Meetup - Basics of Spark coding