SlideShare a Scribd company logo
William Benton
Red Hat, Inc.
@willb • willb@redhat.com
CONTAINERIZED SPARK 

ON KUBERNETES
BACKGROUND
BACKGROUND
BACKGROUND
BACKGROUND
BACKGROUND
BACKGROUND
BACKGROUND
BACKGROUND
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Networked
POSIX FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Mesos
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Networked
POSIX FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Mesos
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Networked
POSIX FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
1
2
3
4
Mesos
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Networked
POSIX FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
1
2
3
4
1
1
2
3
3
4
Mesos
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
Networked
POSIX FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
1
2
3
4
1
1
2
3
3
4
Analytics is no longer a
separate workload.
Analytics is an essential
component of modern data-
driven applications.
OUR GOALS
OUR GOALS
OUR GOALS
OUR GOALS
OUR GOALS
git
OUR GOALS
git
FORECAST
Motivating containerized microservices
Architectures for analytics and applications
Spark clusters in containers: practicalities and pitfalls
Play along at home
Future work
MOTIVATING MICROSERVICES
A microservice architecture
employs lightweight, modular,
and typically stateless
components with well-defined
interfaces and contracts.
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
2 + 2
BENEFITS OF MICROSERVICE ARCHITECTURES
2 + 2 5
BENEFITS OF MICROSERVICE ARCHITECTURES
2 + 2 5
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
BENEFITS OF MICROSERVICE ARCHITECTURES
?
BENEFITS OF MICROSERVICE ARCHITECTURES
?
BENEFITS OF MICROSERVICE ARCHITECTURES
?
MICROSERVICES AND SPARK
executor
1 2 3
executor
4 5 6
executor
7 8 9
executor
10 11 12
master
MICROSERVICES AND SPARK
executor
1 2 3
executor
4 5 6
executor
7 8 9
executor
10 11 12
master
λ x: x * 2
MICROSERVICES AND SPARK
executor
1 2 3
executor
4 5 6
executor
7 8 9
executor
10 11 12
master
λ x: x * 22 4 6 8 10 12 14 16 18 20 22 24
λ x: x * 2 λ x: x * 2 λ x: x * 2 λ x: x * 2
MICROSERVICES AND SPARK
executor
1 2 3
executor
4 5 6
executor
7 8 9
executor
10 11 12
master
λ x: x * 22 4 6 8 10 12 14 16 18 20 22 24
λ x: x * 2 λ x: x * 2 λ x: x * 2 λ x: x * 2
MICROSERVICES AND SPARK
executor
1 2 3
executor
4 5 6
executor
7 8 9
executor
10 11 12
master
λ x: x * 22 4 6 8 10 12 14 16 18 20 22 24
λ x: x * 2 λ x: x * 2 λ x: x * 2 λ x: x * 2
ARCHITECTURES FOR 

ANALYTICS AND APPLICATIONS
APPLICATION RESPONSIBILITIES
transform
transform
transform
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
transform
transform
transform
aggregate
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
archive
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
archive
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
APPLICATION RESPONSIBILITIES
archive
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
web and mobile
reporting
developer UI
APPLICATION RESPONSIBILITIES
archive
trainmodels
transform
transform
transform
aggregate
events
databases
file, object
storage
management
web and mobile
reporting
developer UI
LEGACY ARCHITECTURES
CONVENTIONAL DATA WAREHOUSE
events
CONVENTIONAL DATA WAREHOUSE
transformevents
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS RDBMS
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS RDBMS
analysis
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS RDBMS
analysis
reporting
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS RDBMS
analysis
reporting
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS RDBMS
analysis
interactive

query
reporting
transaction

processing
CONVENTIONAL DATA WAREHOUSE
transformevents
UI
business
logic
RDBMS analytic

processing
RDBMS
analysis
interactive

query
reporting
HADOOP-STYLE “DATA LAKE”
HDFS HDFS HDFS
HADOOP-STYLE “DATA LAKE”
HDFS HDFS HDFS HDFS HDFS
HADOOP-STYLE “DATA LAKE”
HDFS
events
HDFS HDFS HDFS HDFS
HADOOP-STYLE “DATA LAKE”
HDFS
events
HDFS HDFS HDFS HDFS
HADOOP-STYLE “DATA LAKE”
HDFS
compute
events
HDFS
compute
HDFS
compute compute compute
HDFS HDFS
MODERN ARCHITECTURES
THE LAMBDA ARCHITECTURE
events
(imprecise)

analysistransform
THE LAMBDA ARCHITECTURE
events
(imprecise)

analysistransform
DFS
speed layer
THE LAMBDA ARCHITECTURE
events
(imprecise)

analysistransform
DFS
speed layer
THE LAMBDA ARCHITECTURE
events
(precise)

analysistransform
(imprecise)

analysistransform
DFS
speed layer
THE LAMBDA ARCHITECTURE
events
batch layer
(precise)

analysistransform
(imprecise)

analysistransform
DFS
speed layer
THE LAMBDA ARCHITECTURE
events
batch layer
federate
(precise)

analysistransform
(imprecise)

analysistransform
DFS
speed layer
THE LAMBDA ARCHITECTURE
events
batch layer
UIfederate
(precise)

analysistransform
(imprecise)

analysistransform
DFS
serving layerspeed layer
THE LAMBDA ARCHITECTURE
events
batch layer
UIfederate
(precise)

analysistransform
(imprecise)

analysistransform
DFS
THE KAPPA ARCHITECTURE
events
queue for “raw data” topic
THE KAPPA ARCHITECTURE
events
queue for “raw data” topic
THE KAPPA ARCHITECTURE
events
transform
queue for “preprocessed data” topic
queue for “raw data” topic
THE KAPPA ARCHITECTURE
events
transform analysis
queue for “preprocessed data” topic
queue for “analysis results” topic
queue for “raw data” topic
THE KAPPA ARCHITECTURE
events
transform analysis
queue for “preprocessed data” topic
queue for “analysis results” topic
reporting end-user UI
DATA FEDERATION IN THE COMPUTE LAYER
aggregate
trainmodels
archive
events
databases
file, object
storage
management
web and mobile
reporting
developer UItransform
transform
transform
DATA FEDERATION IN THE COMPUTE LAYER
aggregate
trainmodels
archive
events
databases
file, object
storage
management
web and mobile
reporting
developer UItransform
transform
transform
DATA FEDERATION IN THE COMPUTE LAYER
aggregate
trainmodels
archive
events
databases
file, object
storage
management
web and mobile
reporting
developer UItransform
transform
transform
Cluster scheduler
SIDEBAR: THE MONOLITHIC SPARK ANTIPATTERN
Shared FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Resource manager
app 1 app 2
app 4app 3
Cluster scheduler
SIDEBAR: THE MONOLITHIC SPARK ANTIPATTERN
Shared FS
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Spark executor
Resource manager
app 1 app 2
app 4app 3
Resource manager
ONE CLUSTER PER APPLICATION
Object stores
app 1 app 2
app 5app 4
app 3
app 6
Databases
Resource manager
ONE CLUSTER PER APPLICATION
Object stores
app 1 app 2
app 5app 4
app 3
app 6
app 2
app 4
Databases
PRACTICALITIES AND
POTENTIAL PITFALLS
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
Object stores
Databases
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1
SCHEDULING
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1
SECURITY
SECURITY
SECURITY
$SPARK_HOME/bin/spark-class 
org.apache.spark.deploy.worker.Worker 
master:7077
pid
root
net
SECURITY
$SPARK_HOME/bin/spark-class 
org.apache.spark.deploy.worker.Worker 
master:7077
pid
root
net
/tmp/foo
SECURITY
SECURITY
k8s namespace
SECURITY
k8s namespace*
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
POSIX
filesystem
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
POSIX
filesystem
✓ familiar interface
✓ interoperability with
other programs
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
POSIX
filesystem
✓ familiar interface
✓ interoperability with
other programs
✗ unnecessary
semantic guarantees
✗ difficult to manage
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
HDFS HDFS
HDFS HDFS
HDFS
HDFS
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
HDFS HDFS
HDFS HDFS
HDFS
HDFS
✓ support for legacy
Hadoop installations
✗ inelastic
✗ stateful
✗ can’t collocate
compute and data
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
HDFS HDFS
HDFS HDFS
HDFS
HDFS
✓ support for legacy
Hadoop installations
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
object store
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
object store
✓ interoperability
✓ fine-grained AC
✓ many implementations
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
object store
✓ interoperability
✓ fine-grained AC
✓ many implementations
✗ consistency model
✗ performance (?)
STORAGE
Kubernetes
app 1 app 2
app 5app 4
app 3
app 6
app 1 app 2
app 5app 4
app 3
app 6
object store
✓ interoperability
✓ fine-grained AC
✓ many implementations
✗ consistency model
✗ performance
“…in a cloud native architecture, the benefit of
HDFS is actually very small and that is why
many cloud-first organizations no longer run
HDFS, or only run it as a caching layer for S3.”
—Reynold Xin on Quora (http://qr.ae/TAF4cN)
NETWORKING
NETWORKING
NETWORKING
NETWORKING
http://app1:8080
NETWORKING
http://app1:8080
✗ can’t access worker web UI
(but wait for Spark 2.1!)
NETWORKING
http://app1:8080
✗ can’t access worker web UI
(but wait for Spark 2.1!)
NETWORKING
http://app1:8080
✗ can’t access worker web UI
(but wait for Spark 2.1!)
http://app1:80
NEXT STEPS: FUTURE WORK &
PLAYING ALONG AT HOME
NEXT STEPS
Further performance evaluation
Better developer experience
Improved scheduling of Spark tasks on Kubernetes
TRY IT OUT YOURSELF
Kubernetes standalone Spark example:

https://github.com/kubernetes/kubernetes/tree/master/examples/spark
Enabling Spark on OpenShift: https://github.com/radanalyticsio
Native Spark on Kubernetes proposal:

https://github.com/kubernetes/kubernetes/issues/34377
@willb • willb@redhat.com

https://chapeau.freevariable.com
THANKS!

More Related Content

What's hot

Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesDatabricks
 
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier LuraschiSparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier LuraschiDatabricks
 
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...Spark Summit
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis MagdaApache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis MagdaDatabricks
 
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...Databricks
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Databricks
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraSpark Summit
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesDatabricks
 
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan VolzArchiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan VolzDatabricks
 
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Spark Summit
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesYousun Jeong
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftLi Gao
 
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...Spark Summit
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Evan Chan
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouDatabricks
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftDatabricks
 

What's hot (20)

Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
 
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier LuraschiSparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
 
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis MagdaApache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
 
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on Kubernetes
 
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan VolzArchiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
 
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg Schad
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on Kubernetes
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
 
Migrating pipelines into Docker
Migrating pipelines into DockerMigrating pipelines into Docker
Migrating pipelines into Docker
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at Lyft
 

Viewers also liked

Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir DragievSpark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir DragievSpark Summit
 
Kubernetes Community Growth and Use Case
Kubernetes Community Growth and Use CaseKubernetes Community Growth and Use Case
Kubernetes Community Growth and Use CaseChris Gaun
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerEvan Chan
 
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...Spark Summit
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Spark Summit
 
Data driven innovation for growth and well being
Data driven innovation for growth and well beingData driven innovation for growth and well being
Data driven innovation for growth and well beinginnovationoecd
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
Operational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos ChristineOperational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos ChristineSpark Summit
 
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh ShastrySpark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh ShastrySpark Summit
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...Spark Summit
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureMapR Technologies
 
Keeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETLKeeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETLDatabricks
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSAmazon Web Services
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Spark Summit
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...Spark Summit
 
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...Spark Summit
 
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...Spark Summit
 
Next Generation Enterprise Architecture
Next Generation Enterprise ArchitectureNext Generation Enterprise Architecture
Next Generation Enterprise ArchitectureMapR Technologies
 

Viewers also liked (20)

Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir DragievSpark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
 
Kubernetes Community Growth and Use Case
Kubernetes Community Growth and Use CaseKubernetes Community Growth and Use Case
Kubernetes Community Growth and Use Case
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job Server
 
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
 
Running Spark in Production
Running Spark in ProductionRunning Spark in Production
Running Spark in Production
 
Data driven innovation for growth and well being
Data driven innovation for growth and well beingData driven innovation for growth and well being
Data driven innovation for growth and well being
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
Operational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos ChristineOperational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos Christine
 
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh ShastrySpark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data Architecture
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Keeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETLKeeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETL
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWS
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
 
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
 
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
 
Next Generation Enterprise Architecture
Next Generation Enterprise ArchitectureNext Generation Enterprise Architecture
Next Generation Enterprise Architecture
 

Similar to Spark Summit EU talk by William Benton

Tech huddle paas_session
Tech huddle paas_sessionTech huddle paas_session
Tech huddle paas_sessionRob Edwards
 
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.io
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.ioCompleting the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.io
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.ioCA Technologies
 
Leveraging a distributed architecture to your advantage
Leveraging a distributed architecture to your advantageLeveraging a distributed architecture to your advantage
Leveraging a distributed architecture to your advantageMichelangelo van Dam
 
Apache big-data-2017-spark-profiling
Apache big-data-2017-spark-profilingApache big-data-2017-spark-profiling
Apache big-data-2017-spark-profilingJayesh Thakrar
 
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014Amazon Web Services
 
Hands-on VeriFast with STM32 microcontroller
Hands-on VeriFast with STM32 microcontrollerHands-on VeriFast with STM32 microcontroller
Hands-on VeriFast with STM32 microcontrollerKiwamu Okabe
 
NSA for Enterprises Log Analysis Use Cases
NSA for Enterprises   Log Analysis Use Cases NSA for Enterprises   Log Analysis Use Cases
NSA for Enterprises Log Analysis Use Cases WSO2
 
Digital foundations - Fixing slow delivery of existing applications
Digital foundations - Fixing slow delivery of existing applicationsDigital foundations - Fixing slow delivery of existing applications
Digital foundations - Fixing slow delivery of existing applicationsEric D. Schabell
 
DevOps Days Tel Aviv - Serverless Architecture
DevOps Days Tel Aviv - Serverless ArchitectureDevOps Days Tel Aviv - Serverless Architecture
DevOps Days Tel Aviv - Serverless ArchitectureAntons Kranga
 
Scaling your apps with Kubernetes and Docker - TheConf 2018
Scaling your apps with Kubernetes and Docker - TheConf 2018Scaling your apps with Kubernetes and Docker - TheConf 2018
Scaling your apps with Kubernetes and Docker - TheConf 2018Erick Wendel
 
Graphs: Fabric of DevOps
Graphs: Fabric of DevOpsGraphs: Fabric of DevOps
Graphs: Fabric of DevOpsNeo4j
 
Vectorized R Execution in Apache Spark
Vectorized R Execution in Apache SparkVectorized R Execution in Apache Spark
Vectorized R Execution in Apache SparkDatabricks
 
【IVS CTO Night & Day】Amazon Container Services
【IVS CTO Night & Day】Amazon Container Services【IVS CTO Night & Day】Amazon Container Services
【IVS CTO Night & Day】Amazon Container ServicesAmazon Web Services Japan
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Databricks
 
Apache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDBApache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDBMongoDB
 
Hands-on VeriFast with STM32 microcontroller @ Osaka
Hands-on VeriFast with STM32 microcontroller @ OsakaHands-on VeriFast with STM32 microcontroller @ Osaka
Hands-on VeriFast with STM32 microcontroller @ OsakaKiwamu Okabe
 
Smart.js: JavaScript engine running on tiny MCU
Smart.js: JavaScript engine running on tiny MCUSmart.js: JavaScript engine running on tiny MCU
Smart.js: JavaScript engine running on tiny MCUKiwamu Okabe
 

Similar to Spark Summit EU talk by William Benton (20)

Tech huddle paas_session
Tech huddle paas_sessionTech huddle paas_session
Tech huddle paas_session
 
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.io
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.ioCompleting the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.io
Completing the Microservices Puzzle: Kubernetes, Prometheus and FreshTracks.io
 
Leveraging a distributed architecture to your advantage
Leveraging a distributed architecture to your advantageLeveraging a distributed architecture to your advantage
Leveraging a distributed architecture to your advantage
 
Apache big-data-2017-spark-profiling
Apache big-data-2017-spark-profilingApache big-data-2017-spark-profiling
Apache big-data-2017-spark-profiling
 
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014
(GAM304) How Riot Games re:Invented Their AWS Model | AWS re:Invent 2014
 
Hands-on VeriFast with STM32 microcontroller
Hands-on VeriFast with STM32 microcontrollerHands-on VeriFast with STM32 microcontroller
Hands-on VeriFast with STM32 microcontroller
 
NSA for Enterprises Log Analysis Use Cases
NSA for Enterprises   Log Analysis Use Cases NSA for Enterprises   Log Analysis Use Cases
NSA for Enterprises Log Analysis Use Cases
 
Digital foundations - Fixing slow delivery of existing applications
Digital foundations - Fixing slow delivery of existing applicationsDigital foundations - Fixing slow delivery of existing applications
Digital foundations - Fixing slow delivery of existing applications
 
DevOps Days Tel Aviv - Serverless Architecture
DevOps Days Tel Aviv - Serverless ArchitectureDevOps Days Tel Aviv - Serverless Architecture
DevOps Days Tel Aviv - Serverless Architecture
 
Scaling your apps with Kubernetes and Docker - TheConf 2018
Scaling your apps with Kubernetes and Docker - TheConf 2018Scaling your apps with Kubernetes and Docker - TheConf 2018
Scaling your apps with Kubernetes and Docker - TheConf 2018
 
Graphs: Fabric of DevOps
Graphs: Fabric of DevOpsGraphs: Fabric of DevOps
Graphs: Fabric of DevOps
 
Vectorized R Execution in Apache Spark
Vectorized R Execution in Apache SparkVectorized R Execution in Apache Spark
Vectorized R Execution in Apache Spark
 
【IVS CTO Night & Day】Amazon Container Services
【IVS CTO Night & Day】Amazon Container Services【IVS CTO Night & Day】Amazon Container Services
【IVS CTO Night & Day】Amazon Container Services
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
 
Cassandra and Spark SQL
Cassandra and Spark SQLCassandra and Spark SQL
Cassandra and Spark SQL
 
Docker - introduction
Docker - introductionDocker - introduction
Docker - introduction
 
Apache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDBApache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDB
 
Hands-on VeriFast with STM32 microcontroller @ Osaka
Hands-on VeriFast with STM32 microcontroller @ OsakaHands-on VeriFast with STM32 microcontroller @ Osaka
Hands-on VeriFast with STM32 microcontroller @ Osaka
 
Going serverless
Going serverlessGoing serverless
Going serverless
 
Smart.js: JavaScript engine running on tiny MCU
Smart.js: JavaScript engine running on tiny MCUSmart.js: JavaScript engine running on tiny MCU
Smart.js: JavaScript engine running on tiny MCU
 

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Recently uploaded

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxbenishzehra469
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?DOT TECH
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单vcaxypu
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIAlejandraGmez176757
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单yhkoc
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单enxupq
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单nscud
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsCEPTES Software Inc
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .NABLAS株式会社
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单ewymefz
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatheahmadsaood
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单nscud
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...elinavihriala
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单enxupq
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单ewymefz
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sMAQIB18
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单ewymefz
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Domenico Conte
 

Recently uploaded (20)

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
 

Spark Summit EU talk by William Benton