SlideShare a Scribd company logo
1 of 53
Spark says the darndest things
Sandy Ryza
Me
● Data science at Cloudera
● Recently lead Apache Spark development at
Cloudera
● Before that, committing on Apache YARN
and MapReduce
● Lead author on “Advanced Analytics with
Spark”
com.esotericsoftware.kryo.
KryoException: Unable to find class:
$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC
$$iwC$$iwC$$anonfun$4$$anonfun$apply$3
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on
host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by
zero
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016)
[...]
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.
org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages
(DAGScheduler.scala:1033)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.
apply(DAGScheduler.scala:1017)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.
apply(DAGScheduler.scala:1015)
[...]
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on
host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by
zero
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016)
[...]
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on
host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by
zero
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016)
[...]
val file = sc.textFile("hdfs://...")
file.filter(_.startsWith(“banana”))
.count()
Job
Stage
Task
Anatomy of a job
Task
Task Task
Stage
Task Task
Task Task
What the heck is a stage?
map
val rdd1 = sc.textFile(“hdfs://...”)
.map(someFunc)
.filter(filterFunc)
textFile filter
val rdd2 = sc.hadoopFile(“hdfs:
//...”)
.groupByKey()
.map(someOtherFunc)
groupByKeyhadoopFile map
val rdd3 = rdd1.join(rdd2)
.map(someFunc)
join map
rdd3.collect()
maptextFile filter
group
ByKey
hadoop
File
map
join map
maptextFile filter
group
ByKey
hadoop
File
map
join map
Stage
Task Task
Task Task
Stage
Task Task
Task Task
Stage
Task Task
Task Task
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on
host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by
zero
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016)
[...]
14/04/22 11:59:58 ERROR executor.Executor: Exception in task ID 2866
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:648)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:149)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
at org.apache.spark.scheduler.Task.run(Task.scala:53)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Spark architecture
YARN architecture
ResourceManager
NodeManager NodeManager
Container Container
Application
Master
Container
Client
Spark on YARN architecture
ResourceManager
NodeManager NodeManager
Container
Map Task
Container
Application
Master
Container
Reduce Task
Client
Spark on YARN architecture
ResourceManager
NodeManager NodeManager
Container
Map Task
Container
Application
Master
Container
Reduce Task
Client
Spark on YARN architecture
ResourceManager
NodeManager NodeManager
Container
Map Task
Container
Application
Master
Container
Reduce Task
Client
Container [pid=63375,
containerID=container_1388158490598_0001_01_00
0003] is running beyond physical memory
limits. Current usage: 2.1 GB of 2 GB physical
memory used; 2.8 GB of 4.2 GB virtual memory
used. Killing container.
yarn.nodemanager.resource.memory-mb
Executor Container
spark.yarn.
executor.
memoryOverhead
spark.executor.memory
spark.shuffle.
memoryFraction
spark.storage.
memoryFraction
14/04/22 11:59:58 ERROR executor.Executor: Exception in task ID 2866
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:648)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:149)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
at org.apache.spark.scheduler.Task.run(Task.scala:53)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
GC Stalls
Too much spilling
maptextFile filter
group
ByKey
hadoop
File
map
join map
Shuffle
Shuffle
Most perf issues are in shuffles!
Most perf issues are in shuffles!
Inside a Task: Fetch & Aggregation
ExternalAppend
OnlyMap
Block
Block
deserialize
deserialize
Inside a Task: Fetch & Aggregation
ExternalAppend
OnlyMap
key1 -> values
key2 -> values
key3 -> values
Inside a Task: Fetch & Aggregation
ExternalAppend
OnlyMap
key1 -> values
key2 -> values
key3 -> values
Inside a Task: Fetch & Aggregation
ExternalAppend
OnlyMap
key1 -> values
key2 -> values
key3 -> values
Sort & Spill
rdd.reduceByKey(reduceFunc,
numPartitions=1000)
numPartitions=Integer.MaxValue
Ok, but then why not:
Too much parallelism
● Task start overhead
● More fetches -> more disk seeks
● Driver needs to track state per-task
● Keep multiplying by 1.5 and see what works
●
So how to choose?
Is Spark bad?
● No
● Distributed systems are complicated
Shuffle: Sending (“map”) Side
Task
Task
key1 -> values
key2 -> values
key3 -> values
Write stuff
out
key1 -> values
key2 -> values
key3 -> values Task
Shuffle: Sending (“map”) Side
Task
Task
key1 -> values
key2 -> values
key3 -> values
Write stuff
out
key1 -> values
key2 -> values
key3 -> values Task
Sort-based shuffle
Records
Buffer
Spill
Single file
Buffer
Sort & Spill
Partition 1
Records
Partition 2
Records
Partition 3
Records
Index file
Thanks!

More Related Content

What's hot

Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Nitin S
 
A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices Nebulaworks
 
Terraform 0.9 + good practices
Terraform 0.9 + good practicesTerraform 0.9 + good practices
Terraform 0.9 + good practicesRadek Simko
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team ApachePatrick McFadin
 
Terraform modules restructured
Terraform modules restructuredTerraform modules restructured
Terraform modules restructuredAmi Mahloof
 
Hashiconf EU 2019 - A Tour of Terraform 0.12
Hashiconf EU 2019 - A Tour of Terraform 0.12Hashiconf EU 2019 - A Tour of Terraform 0.12
Hashiconf EU 2019 - A Tour of Terraform 0.12Mitchell Pronschinske
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
 
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Integrating icinga2 and the HashiCorp suite
Integrating icinga2 and the HashiCorp suiteIntegrating icinga2 and the HashiCorp suite
Integrating icinga2 and the HashiCorp suiteBram Vogelaar
 
Infrastructure as Code with Terraform
Infrastructure as Code with TerraformInfrastructure as Code with Terraform
Infrastructure as Code with TerraformTim Berry
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariAlejandro Fernandez
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...thelabdude
 
Spark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousSpark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousRussell Spitzer
 
Introductory Overview to Managing AWS with Terraform
Introductory Overview to Managing AWS with TerraformIntroductory Overview to Managing AWS with Terraform
Introductory Overview to Managing AWS with TerraformMichael Heyns
 

What's hot (20)

Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
 
A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices
 
Intro to Terraform
Intro to TerraformIntro to Terraform
Intro to Terraform
 
Refactoring terraform
Refactoring terraformRefactoring terraform
Refactoring terraform
 
Terraform 0.9 + good practices
Terraform 0.9 + good practicesTerraform 0.9 + good practices
Terraform 0.9 + good practices
 
Terraform at Scale
Terraform at ScaleTerraform at Scale
Terraform at Scale
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Apache SolrCloud
Apache SolrCloudApache SolrCloud
Apache SolrCloud
 
Terraform modules restructured
Terraform modules restructuredTerraform modules restructured
Terraform modules restructured
 
Hashiconf EU 2019 - A Tour of Terraform 0.12
Hashiconf EU 2019 - A Tour of Terraform 0.12Hashiconf EU 2019 - A Tour of Terraform 0.12
Hashiconf EU 2019 - A Tour of Terraform 0.12
 
HiveServer2
HiveServer2HiveServer2
HiveServer2
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
 
Terraform day02
Terraform day02Terraform day02
Terraform day02
 
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
 
Integrating icinga2 and the HashiCorp suite
Integrating icinga2 and the HashiCorp suiteIntegrating icinga2 and the HashiCorp suite
Integrating icinga2 and the HashiCorp suite
 
Infrastructure as Code with Terraform
Infrastructure as Code with TerraformInfrastructure as Code with Terraform
Infrastructure as Code with Terraform
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
 
Spark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousSpark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 Furious
 
Introductory Overview to Managing AWS with Terraform
Introductory Overview to Managing AWS with TerraformIntroductory Overview to Managing AWS with Terraform
Introductory Overview to Managing AWS with Terraform
 

Viewers also liked

Why your Spark job is failing
Why your Spark job is failingWhy your Spark job is failing
Why your Spark job is failingSandy Ryza
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsCloudera, Inc.
 
0bbleedingedge long-140614012258-phpapp02 lynn-langit
0bbleedingedge long-140614012258-phpapp02 lynn-langit0bbleedingedge long-140614012258-phpapp02 lynn-langit
0bbleedingedge long-140614012258-phpapp02 lynn-langitData Con LA
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Data Con LA
 
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleDebugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleData Con LA
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
 
Why your Spark Job is Failing
Why your Spark Job is FailingWhy your Spark Job is Failing
Why your Spark Job is FailingDataWorks Summit
 

Viewers also liked (7)

Why your Spark job is failing
Why your Spark job is failingWhy your Spark job is failing
Why your Spark job is failing
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
0bbleedingedge long-140614012258-phpapp02 lynn-langit
0bbleedingedge long-140614012258-phpapp02 lynn-langit0bbleedingedge long-140614012258-phpapp02 lynn-langit
0bbleedingedge long-140614012258-phpapp02 lynn-langit
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
 
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleDebugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Why your Spark Job is Failing
Why your Spark Job is FailingWhy your Spark Job is Failing
Why your Spark Job is Failing
 

Similar to Why is My Spark Job Failing? by Sandy Ryza of Cloudera

Innovation and Security in Ruby on Rails
Innovation and Security in Ruby on RailsInnovation and Security in Ruby on Rails
Innovation and Security in Ruby on Railstielefeld
 
AngularJS Tips&Tricks
AngularJS Tips&TricksAngularJS Tips&Tricks
AngularJS Tips&TricksPetr Bela
 
Spark Summit EU talk by Nimbus Goehausen
Spark Summit EU talk by Nimbus GoehausenSpark Summit EU talk by Nimbus Goehausen
Spark Summit EU talk by Nimbus GoehausenSpark Summit
 
Alfrescotomcat stderr.2013-03-05
Alfrescotomcat stderr.2013-03-05Alfrescotomcat stderr.2013-03-05
Alfrescotomcat stderr.2013-03-05NX21
 
Catalina.2013 03-05
Catalina.2013 03-05Catalina.2013 03-05
Catalina.2013 03-05NX21
 
Why Your Apache Spark Job is Failing
Why Your Apache Spark Job is FailingWhy Your Apache Spark Job is Failing
Why Your Apache Spark Job is FailingCloudera, Inc.
 
What happens in laravel 4 bootstraping
What happens in laravel 4 bootstrapingWhat happens in laravel 4 bootstraping
What happens in laravel 4 bootstrapingJace Ju
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesLindsay Holmwood
 
Forget about Index.php and build you applications around HTTP - PHPers Cracow
Forget about Index.php and build you applications around HTTP - PHPers CracowForget about Index.php and build you applications around HTTP - PHPers Cracow
Forget about Index.php and build you applications around HTTP - PHPers CracowKacper Gunia
 
Forget about index.php and build you applications around HTTP!
Forget about index.php and build you applications around HTTP!Forget about index.php and build you applications around HTTP!
Forget about index.php and build you applications around HTTP!Kacper Gunia
 
High Performance Ruby: Evented vs. Threaded
High Performance Ruby: Evented vs. ThreadedHigh Performance Ruby: Evented vs. Threaded
High Performance Ruby: Evented vs. ThreadedEngine Yard
 
Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Puppet
 
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...PROIDEA
 
Spark on Yarn
Spark on YarnSpark on Yarn
Spark on YarnQubole
 
Practical PHP 5.3
Practical PHP 5.3Practical PHP 5.3
Practical PHP 5.3Nate Abele
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in RustInfluxData
 

Similar to Why is My Spark Job Failing? by Sandy Ryza of Cloudera (20)

Innovation and Security in Ruby on Rails
Innovation and Security in Ruby on RailsInnovation and Security in Ruby on Rails
Innovation and Security in Ruby on Rails
 
AngularJS Tips&Tricks
AngularJS Tips&TricksAngularJS Tips&Tricks
AngularJS Tips&Tricks
 
Spark Summit EU talk by Nimbus Goehausen
Spark Summit EU talk by Nimbus GoehausenSpark Summit EU talk by Nimbus Goehausen
Spark Summit EU talk by Nimbus Goehausen
 
Alfrescotomcat stderr.2013-03-05
Alfrescotomcat stderr.2013-03-05Alfrescotomcat stderr.2013-03-05
Alfrescotomcat stderr.2013-03-05
 
Catalina.2013 03-05
Catalina.2013 03-05Catalina.2013 03-05
Catalina.2013 03-05
 
Why Your Apache Spark Job is Failing
Why Your Apache Spark Job is FailingWhy Your Apache Spark Job is Failing
Why Your Apache Spark Job is Failing
 
Sparkstreaming
SparkstreamingSparkstreaming
Sparkstreaming
 
What happens in laravel 4 bootstraping
What happens in laravel 4 bootstrapingWhat happens in laravel 4 bootstraping
What happens in laravel 4 bootstraping
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websites
 
Forget about Index.php and build you applications around HTTP - PHPers Cracow
Forget about Index.php and build you applications around HTTP - PHPers CracowForget about Index.php and build you applications around HTTP - PHPers Cracow
Forget about Index.php and build you applications around HTTP - PHPers Cracow
 
Forget about index.php and build you applications around HTTP!
Forget about index.php and build you applications around HTTP!Forget about index.php and build you applications around HTTP!
Forget about index.php and build you applications around HTTP!
 
High Performance Ruby: Evented vs. Threaded
High Performance Ruby: Evented vs. ThreadedHigh Performance Ruby: Evented vs. Threaded
High Performance Ruby: Evented vs. Threaded
 
Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014
 
Jmx capture
Jmx captureJmx capture
Jmx capture
 
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
 
Spark on Yarn
Spark on YarnSpark on Yarn
Spark on Yarn
 
infra-as-code
infra-as-codeinfra-as-code
infra-as-code
 
Practical PHP 5.3
Practical PHP 5.3Practical PHP 5.3
Practical PHP 5.3
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
 
Build application using sbt
Build application using sbtBuild application using sbt
Build application using sbt
 

More from Data Con LA

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA
 

More from Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Recently uploaded

Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Why is My Spark Job Failing? by Sandy Ryza of Cloudera

  • 1. Spark says the darndest things Sandy Ryza
  • 2. Me ● Data science at Cloudera ● Recently lead Apache Spark development at Cloudera ● Before that, committing on Apache YARN and MapReduce ● Lead author on “Advanced Analytics with Spark”
  • 3. com.esotericsoftware.kryo. KryoException: Unable to find class: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC $$iwC$$iwC$$anonfun$4$$anonfun$apply$3
  • 4.
  • 5.
  • 6.
  • 7. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by zero $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016) [...] Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler. org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages (DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1. apply(DAGScheduler.scala:1017) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1. apply(DAGScheduler.scala:1015) [...]
  • 8. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by zero $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016) [...]
  • 9. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by zero $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016) [...]
  • 10. val file = sc.textFile("hdfs://...") file.filter(_.startsWith(“banana”)) .count()
  • 11. Job Stage Task Anatomy of a job Task Task Task Stage Task Task Task Task
  • 12. What the heck is a stage?
  • 13. map val rdd1 = sc.textFile(“hdfs://...”) .map(someFunc) .filter(filterFunc) textFile filter
  • 14. val rdd2 = sc.hadoopFile(“hdfs: //...”) .groupByKey() .map(someOtherFunc) groupByKeyhadoopFile map
  • 15. val rdd3 = rdd1.join(rdd2) .map(someFunc) join map
  • 22. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: Exception failure in TID 6 on host bottou02-10g.pa.cloudera.com: java.lang.ArithmeticException: / by zero $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply$mcII$sp(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) $iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:13) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1016) [...]
  • 23. 14/04/22 11:59:58 ERROR executor.Executor: Exception in task ID 2866 java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565) at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:648) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706) at java.io.DataInputStream.read(DataInputStream.java:100) at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209) at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:149) at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102) at org.apache.spark.scheduler.Task.run(Task.scala:53) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211) at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42) at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724)
  • 25. YARN architecture ResourceManager NodeManager NodeManager Container Container Application Master Container Client
  • 26. Spark on YARN architecture ResourceManager NodeManager NodeManager Container Map Task Container Application Master Container Reduce Task Client
  • 27. Spark on YARN architecture ResourceManager NodeManager NodeManager Container Map Task Container Application Master Container Reduce Task Client
  • 28. Spark on YARN architecture ResourceManager NodeManager NodeManager Container Map Task Container Application Master Container Reduce Task Client
  • 29. Container [pid=63375, containerID=container_1388158490598_0001_01_00 0003] is running beyond physical memory limits. Current usage: 2.1 GB of 2 GB physical memory used; 2.8 GB of 4.2 GB virtual memory used. Killing container.
  • 31. 14/04/22 11:59:58 ERROR executor.Executor: Exception in task ID 2866 java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565) at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:648) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706) at java.io.DataInputStream.read(DataInputStream.java:100) at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209) at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:149) at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102) at org.apache.spark.scheduler.Task.run(Task.scala:53) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211) at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42) at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724)
  • 32.
  • 33.
  • 37. Most perf issues are in shuffles!
  • 38. Most perf issues are in shuffles!
  • 39. Inside a Task: Fetch & Aggregation ExternalAppend OnlyMap Block Block deserialize deserialize
  • 40. Inside a Task: Fetch & Aggregation ExternalAppend OnlyMap key1 -> values key2 -> values key3 -> values
  • 41. Inside a Task: Fetch & Aggregation ExternalAppend OnlyMap key1 -> values key2 -> values key3 -> values
  • 42. Inside a Task: Fetch & Aggregation ExternalAppend OnlyMap key1 -> values key2 -> values key3 -> values Sort & Spill
  • 44.
  • 46. Too much parallelism ● Task start overhead ● More fetches -> more disk seeks ● Driver needs to track state per-task
  • 47. ● Keep multiplying by 1.5 and see what works ● So how to choose?
  • 48. Is Spark bad? ● No ● Distributed systems are complicated
  • 49. Shuffle: Sending (“map”) Side Task Task key1 -> values key2 -> values key3 -> values Write stuff out key1 -> values key2 -> values key3 -> values Task
  • 50. Shuffle: Sending (“map”) Side Task Task key1 -> values key2 -> values key3 -> values Write stuff out key1 -> values key2 -> values key3 -> values Task
  • 52. Spill Single file Buffer Sort & Spill Partition 1 Records Partition 2 Records Partition 3 Records Index file