William Benton
Red Hat, Inc.
@willb • willb@redhat.com
CONTAINERIZED SPARK ON KUBERNETES
BACKGROUND
WHAT OUR SPARK CLUSTER LOOKED LIKE IN 2014
[Diagram, built up over several slides. Labels: Spark executor (six instances), Mesos, Networked POSIX FS; numbered items 1 to 4.]
Analytics is no longer a separate workload.
Analytics is an essential component of modern data-driven applications.
OUR GOALS
git
FORECAST
Motivating containerized microservices
Architectures for analytics and applications
Spark clusters in containers: practicalities and pitfalls
Play along at home
Future work
MOTIVATING MICROSERVICES
A microservice architecture
employs lightweight, modular,
and typically stateless
components with well-defined
interfaces and contracts.
BENEFITS OF MICROSERVICE ARCHITECTURES
[Image-only slides. The only surviving text labels are "2 + 2", "5", and "?".]
MICROSERVICES AND SPARK
[Diagram, built up over several slides]
master
executor: 1 2 3
executor: 4 5 6
executor: 7 8 9
executor: 10 11 12
λ x: x * 2 (one copy shown per executor)
2 4 6 8 10 12 14 16 18 20 22 24
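The diagram above (a master, four executors holding the values 1 to 12, and the function λ x: x * 2 copied to each executor) can be approximated in plain Python. This is only an illustrative sketch, not Spark code: the names `partition`, `run_on_executor`, and `distributed_map` are hypothetical, and threads stand in for executors.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n):
    """Split data into n contiguous, near-equal partitions."""
    size, extra = divmod(len(data), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def run_on_executor(func, part):
    """Stand-in for one executor: apply func to every element it holds."""
    return [func(x) for x in part]


def distributed_map(func, data, num_executors=4):
    """Stand-in for the master: send func to each executor, gather results in order."""
    parts = partition(data, num_executors)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        results = pool.map(run_on_executor, [func] * len(parts), parts)
        return [y for part in results for y in part]


doubled = distributed_map(lambda x: x * 2, list(range(1, 13)))
print(doubled)
```

Each executor only needs its own partition and a copy of the function, which is what makes the executors look like small, stateless services.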
ARCHITECTURES FOR ANALYTICS AND APPLICATIONS
APPLICATION RESPONSIBILITIES
[Diagram, built up over several slides]
events, databases, file and object storage
transform (three instances), aggregate, train models, archive
management, web and mobile, reporting, developer UI
LEGACY ARCHITECTURES
CONVENTIONAL DATA WAREHOUSE
[Diagram, built up over several slides]
events, transform
UI, business logic
RDBMS (transaction processing)
RDBMS (analytic processing)
analysis, interactive query, reporting
HADOOP-STYLE “DATA LAKE”
[Diagram, built up over several slides]
events
five nodes, each labeled HDFS and compute
MODERN ARCHITECTURES
THE LAMBDA ARCHITECTURE
[Diagram, built up over several slides]
events
DFS
batch layer: transform, (precise) analysis
speed layer: transform, (imprecise) analysis
serving layer: federate, UI
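The three layers named on the Lambda slides can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not an implementation from the talk: the class and function names are hypothetical, the workload is a plain event count, and the speed layer here is incremental rather than approximate, so it does not model the "(imprecise)" label.

```python
from collections import Counter


class BatchLayer:
    """Recomputes an exact view from the full archived event log."""

    def __init__(self):
        self.log = []          # stand-in for the DFS-backed event archive
        self.view = Counter()

    def append(self, event):
        self.log.append(event)

    def recompute(self):
        self.view = Counter(self.log)


class SpeedLayer:
    """Keeps a running view of events the batch view does not cover yet."""

    def __init__(self):
        self.view = Counter()

    def ingest(self, event):
        self.view[event] += 1

    def reset(self):
        self.view.clear()


def federate(batch_view, speed_view):
    """Serving-layer query: merge the batch and speed views."""
    return batch_view + speed_view


batch, speed = BatchLayer(), SpeedLayer()
for event in ["click", "view", "click"]:
    batch.append(event)   # every event is archived
    speed.ingest(event)   # and also counted immediately
before_batch_run = federate(batch.view, speed.view)

batch.recompute()         # the batch run catches up with the archive
speed.reset()             # so the speed view can start over
batch.append("view")
speed.ingest("view")
after_batch_run = federate(batch.view, speed.view)
print(before_batch_run, after_batch_run)
```

The cost of this design is visible even at this scale: the counting logic exists twice, once per layer, and a query has to merge the two views.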
THE KAPPA ARCHITECTURE
[Diagram, built up over several slides]
events
queue for “raw data” topic
transform
queue for “preprocessed data” topic
analysis
queue for “analysis results” topic
reporting, end-user UI
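The Kappa slides show every stage reading from one queue topic and writing to the next. A minimal in-memory sketch of that flow follows, with assumptions labeled: the `Log` class and `run_stage` helper are hypothetical stand-ins for a durable message queue and a stream processor, and the parsing and threshold logic is invented purely for illustration.

```python
from collections import defaultdict


class Log:
    """Append-only topics; a stand-in for a durable message queue."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, record):
        self.topics[topic].append(record)

    def read(self, topic, offset=0):
        return self.topics[topic][offset:]


def run_stage(log, source, sink, func, offset=0):
    """Consume `source` from `offset`, apply func, and publish to `sink`.

    Returns the next offset to resume from.
    """
    records = log.read(source, offset)
    for record in records:
        log.publish(sink, func(record))
    return offset + len(records)


log = Log()
for raw in [" 3 ", "4", " 5"]:
    log.publish("raw data", raw)

# transform: parse raw strings into integers
next_raw = run_stage(log, "raw data", "preprocessed data", lambda s: int(s.strip()))
# analysis: flag values above a hypothetical threshold of 3
run_stage(log, "preprocessed data", "analysis results", lambda n: (n, n > 3))

results = log.read("analysis results")
print(results)
```

Because the raw topic is retained, a changed transform or analysis step can be rerun from offset 0 into a fresh topic instead of maintaining a separate batch path.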
DATA FEDERATION IN THE COMPUTE LAYER
[Diagram, repeated over three slides]
events, databases, file and object storage
transform (three instances), aggregate, train models, archive
management, web and mobile, reporting, developer UI
SIDEBAR: THE MONOLITHIC SPARK ANTIPATTERN
[Diagram, repeated over two slides]
Cluster scheduler, Resource manager
app 1, app 2, app 3, app 4
Spark executor (six instances)
Shared FS
ONE CLUSTER PER APPLICATION
[Diagram, built up over two slides]
Resource manager
app 1, app 2, app 3, app 4, app 5, app 6 (app 2 and app 4 appear a second time in the final build)
Object stores, Databases
PRACTICALITIES AND POTENTIAL PITFALLS
SCHEDULING
[Diagram, built up over several slides]
Kubernetes
app 1, app 2, app 3, app 4, app 5, app 6 (later builds show app 1 on its own)
Object stores, Databases
SECURITY
$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker master:7077
pid
root
net
/tmp/foo
k8s namespace*
STORAGE
[Diagram on each slide: Kubernetes running app 1 through app 6, alongside the storage option under discussion]
POSIX filesystem
✓ familiar interface
✓ interoperability with other programs
✗ unnecessary semantic guarantees
✗ difficult to manage
HDFS
✓ support for legacy Hadoop installations
✗ inelastic
✗ stateful
✗ can’t collocate compute and data
object store
✓ interoperability
✓ fine-grained AC
✓ many implementations
✗ consistency model
✗ performance (?)
“…in a cloud native architecture, the benefit of
HDFS is actually very small and that is why
many cloud-first organizations no longer run
HDFS, or only run it as a caching layer for S3.”
—Reynold Xin on Quora (http://qr.ae/TAF4cN)
NETWORKING
http://app1:8080
✗ can’t access worker web UI (but wait for Spark 2.1!)
http://app1:80
NEXT STEPS: FUTURE WORK & PLAYING ALONG AT HOME
NEXT STEPS
Further performance evaluation
Better developer experience
Improved scheduling of Spark tasks on Kubernetes
TRY IT OUT YOURSELF
Kubernetes standalone Spark example: https://github.com/kubernetes/kubernetes/tree/master/examples/spark
Enabling Spark on OpenShift: https://github.com/radanalyticsio
Native Spark on Kubernetes proposal: https://github.com/kubernetes/kubernetes/issues/34377
@willb • willb@redhat.com

https://chapeau.freevariable.com
THANKS!

More Related Content

What's hot

Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Databricks
 
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier LuraschiSparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Databricks
 
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Spark Summit
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis MagdaApache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Databricks
 
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Databricks
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
DataWorks Summit/Hadoop Summit
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Databricks
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
Spark Summit
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Spark Summit
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on Kubernetes
Databricks
 
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan VolzArchiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Databricks
 
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Spark Summit
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg Schad
Spark Summit
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on Kubernetes
Yousun Jeong
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
Li Gao
 
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Spark Summit
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Evan Chan
 
Migrating pipelines into Docker
Migrating pipelines into DockerMigrating pipelines into Docker
Migrating pipelines into Docker
DataWorks Summit/Hadoop Summit
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Databricks
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at Lyft
Databricks
 

What's hot (20)

Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
 
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier LuraschiSparklyr: Recap, Updates, and Use Cases with Javier Luraschi
Sparklyr: Recap, Updates, and Use Cases with Javier Luraschi
 
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
Opaque: A Data Analytics Platform with Strong Security: Spark Summit East tal...
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis MagdaApache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT with Denis Magda
 
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on Kubernetes
 
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan VolzArchiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
Archiving, E-Discovery, and Supervision with Spark and Hadoop with Jordan Volz
 
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg Schad
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on Kubernetes
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
Migrating from Redshift to Spark at Stitch Fix: Spark Summit East talk by Sky...
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
 
Migrating pipelines into Docker
Migrating pipelines into DockerMigrating pipelines into Docker
Migrating pipelines into Docker
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at Lyft
 

Viewers also liked

Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir DragievSpark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit
 
Kubernetes Community Growth and Use Case
Kubernetes Community Growth and Use CaseKubernetes Community Growth and Use Case
Kubernetes Community Growth and Use Case
Chris Gaun
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job Server
Evan Chan
 
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
Spark Summit
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Spark Summit
 
Running Spark in Production
Running Spark in ProductionRunning Spark in Production
Running Spark in Production
DataWorks Summit/Hadoop Summit
 
Data driven innovation for growth and well being
Data driven innovation for growth and well beingData driven innovation for growth and well being
Data driven innovation for growth and well being
innovationoecd
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
 
Operational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos ChristineOperational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos Christine
Spark Summit
 
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh ShastrySpark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
Spark Summit
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data Architecture
MapR Technologies
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
Amazon Web Services
 
Keeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETLKeeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETL
Databricks
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWS
Amazon Web Services
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Spark Summit
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
Spark Summit
 
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Spark Summit
 
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
Spark Summit
 
Next Generation Enterprise Architecture
Next Generation Enterprise ArchitectureNext Generation Enterprise Architecture
Next Generation Enterprise Architecture
MapR Technologies
 

Viewers also liked (20)

Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir DragievSpark Summit EU talk by Patrick Baier and Stanimir Dragiev
Spark Summit EU talk by Patrick Baier and Stanimir Dragiev
 
Kubernetes Community Growth and Use Case
Kubernetes Community Growth and Use CaseKubernetes Community Growth and Use Case
Kubernetes Community Growth and Use Case
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job Server
 
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
No More “Sbt Assembly”: Rethinking Spark-Submit Using CueSheet: Spark Summit ...
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
 
Running Spark in Production
Running Spark in ProductionRunning Spark in Production
Running Spark in Production
 
Data driven innovation for growth and well being
Data driven innovation for growth and well beingData driven innovation for growth and well being
Data driven innovation for growth and well being
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
Operational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos ChristineOperational Tips for Deploying Spark by Miklos Christine
Operational Tips for Deploying Spark by Miklos Christine
 
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh ShastrySpark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
Spark Summit EU talk by Shaun Klopfenstein and Neelesh Shastry
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data Architecture
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Keeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETLKeeping Spark on Track: Productionizing Spark for ETL
Keeping Spark on Track: Productionizing Spark for ETL
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWS
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
 
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by T...
 
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by...
 
Next Generation Enterprise Architecture
Next Generation Enterprise ArchitectureNext Generation Enterprise Architecture
Next Generation Enterprise Architecture
 

Spark Summit EU talk by William Benton