SlideShare a Scribd company logo
Anirudh Ramananathan (foxish@google.com)
Software Engineer (Kubernetes)
Timothy Chen (tim@hyperpilot.io)
Co-founder & CTO (HyperPilot)
Apache Spark on
Kubernetes
Agenda
• Kubernetes & Containers
• Motivation
• Design
• Demo
• Deep Dive
• Roadmap
What is Kubernetes?
Kubernetes
Kubernetes is an open-source system
Kubernetes
Kubernetes is an open-source system for automating
deployment, scaling, and management
Kubernetes
Kubernetes is an open-source system for automating
deployment, scaling, and management of containerized
applications.
‘Containerized’
Containers
libs
app
kernel
libs
app
libs
app
libs
app
• Repeatable Builds and Workflows
• Application Portability
• High Degree of Control over
Software
• Faster Development Cycle
• Reduced dev-ops load
• Improved Infrastructure Utilization
• Large OSS Community - 1200+
contributors and 45k+ commits
• Ecosystem and Partners - 100+
organizations involved
• One of the top 100 projects overall on
GitHub - 23k+ stars
Kubernetes
Overview
At a Glance
kubelet
UI
kubeletCLI
API
users master nodes
etcd
kubelet
scheduler
controllers
apiserver
Nodes and Pods
Pod
Volume
Containers
Pod
Containers
8080 8080 8080
Volume
Node
• A pod is a set of co-located containers
• Created by a declarative specification
supplied to the master
• Each pod has its own IP address
• Volumes can be local or
network-attached
Motivation
Why Spark on Kubernetes?
• Resource sharing between batch, serving and stateful workloads
– Streamlined developer experience
– Reduced operational costs
– Improved infrastructure utilization
• Kubernetes and the Container Ecosystem
– Lots of addon services: third-party logging, monitoring, and security tools
– For example, the Istio project, announced May 24, by IBM, Google and Lyft,
provides a service mesh for authenticating, authorizing, tracing, and timing,
and rate-limiting container-to-container communication, and more.
Design
Spark, meet Kubernetes!
Spark Core
Kubernetes Standalone YARN Mesos
GraphX SparkSQL MLlib Streaming
Spark, meet Kubernetes!
Spark Core Kubernetes Scheduler Backend
Kubernetes Clusternew executors
remove executors
configuration
• Resource Requests
• Authnz
• Communication with K8s
Kubernetes, meet Spark!
Kubernetes Cluster
File Staging Server
• Staging server: component to
stage local files
• Spark Shuffle service:
component to store shuffle data
for dynamic allocation
• ThirdParty/CustomResources:
extend Kubernetes API with
Spark Knowledge
Shuffle Service
SparkJob API
Endpoint
Kubernetes
Integration
Dependencies
Container images with dependencies
baked in
Files from GCS/S3/HDFS/HTTP
File Staging Server
Staged files and
JARs
Several ways of running Spark Jobs along with their dependencies on
Kubernetes
Administration
Namespaces
Resource
Accounting
Logging
Monitoring
Resource
Quota
Pluggable
Authorization
Admission
Control
RBAC
• Launch Spark Jobs as a particular user
into a specific namespace
• RBAC and Namespace-level resource
quotas
• Audit logging for clusters
• Several monitoring solutions to see
node, cluster and pod-level statistics
Focus Areas
Wordcloud of the command-line options we added to spark-submit
on Kubernetes
Demo
Deep Dive
Deep Dive
spark-subm
it
kubernetes cluster
apiserver
scheduler
• Spark Submit submits job to K8s
• Spark Submit submits job to K8s
• K8s schedules the driver for job
Deep Dive
kubernetes cluster
apiserver
scheduler schedule driver pod
spark driver
• Spark Submit submits job to K8s
• K8s schedules the driver for job
Deep Dive
• Spark Submit submits job to K8s
• K8s schedules the driver for job
• Driver requests executors as needed
kubernetes cluster
apiserver
scheduler
spark driver
create executor
pods
• Spark Submit submits job to K8s
• K8s schedules the driver for job
Deep Dive
• Spark Submit submits job to K8s
• K8s schedules the driver for job
• Driver requests executors as needed
• Executors scheduled and created
kubernetes cluster
apiserver
scheduler
spark driver
schedule
executorpods
executors
• Spark Submit submits job to K8s
• K8s schedules the driver for job
Deep Dive
• Spark Submit submits job to K8s
• K8s schedules the driver for job
• Driver requests executors as needed
• Executors scheduled and created
• Executors run tasks
kubernetes cluster
apiserver
scheduler
spark driver
executors
• Spark Submit submits job to K8s
• K8s schedules the driver for job
Deep Dive
• Spark Submit submits job to K8s
• K8s schedules the driver for job
• Driver requests executors as needed
• Executors scheduled and created
• Executors run tasks
• Driver “completes” job and persists
logs
kubernetes cluster
apiserver
scheduler
spark driver
Roadmap
Spark Streaming
Spark Roadmap
Spark Shell
Client Mode
Python/R support
Cluster Mode
Java/Scala Support
Dynamic Allocation Local File Staging High Availability
Spark SQL
GraphX MLlib
Dec 2016
Development
Began
Mar 2017
Alpha
Release
June 2017
Beta
Release
Nov 2016
Design
= supported but untested = not yet supported
We’re just getting started...
• Kubernetes CustomResources
• Priorities and Preemption for Pods
• Batch Scheduling and Resource Sharing
• Cluster Federation and Multi-cloud deployments
• Ecosystem: Kafka, Cassandra, HDFS, etc
Contributors
Organizations Alphabetically:
• Google
• Haiwen
• Hyperpilot
• Intel
• Palantir
• Pepperdata
• Red Hat
Links:
• Spark 2.2.0 Documentation
• https://issues.apache.org/jira/browse
/SPARK-18278
• https://github.com/kubernetes/kubern
etes/issues/34377
Try it out today!
https://github.com/apache-spark-on-k8s/spark
Thank You.
HDFS on Kubernetes - Lessons Learned
June 7 at 11:00 AM in Room 2003
Join us Wednesdays at 10am PT at the SIG BigData meeting
https://github.com/kubernetes/community/

More Related Content

What's hot

CI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar DemriCI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar Demri
DoiT International
 
Setup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes FederationSetup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes Federation
inwin stack
 
What's new in Kubernetes
What's new in KubernetesWhat's new in Kubernetes
What's new in Kubernetes
Daniel Smith
 
Webcast - Making kubernetes production ready
Webcast - Making kubernetes production readyWebcast - Making kubernetes production ready
Webcast - Making kubernetes production ready
Applatix
 
Running and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStackRunning and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStack
Victor Palma
 
Spinnaker on Kubernetes
Spinnaker on KubernetesSpinnaker on Kubernetes
Spinnaker on Kubernetes
Jinwoong Kim
 
Implement Advanced Scheduling Techniques in Kubernetes
Implement Advanced Scheduling Techniques in Kubernetes Implement Advanced Scheduling Techniques in Kubernetes
Implement Advanced Scheduling Techniques in Kubernetes
Kublr
 
AKS
AKSAKS
Kubernetes for Serverless - Serverless Summit 2017 - Krishna Kumar
Kubernetes for Serverless  - Serverless Summit 2017 - Krishna KumarKubernetes for Serverless  - Serverless Summit 2017 - Krishna Kumar
Kubernetes for Serverless - Serverless Summit 2017 - Krishna Kumar
CodeOps Technologies LLP
 
DevOps with Azure, Kubernetes, and Helm Webinar
DevOps with Azure, Kubernetes, and Helm WebinarDevOps with Azure, Kubernetes, and Helm Webinar
DevOps with Azure, Kubernetes, and Helm Webinar
Codefresh
 
Autoscaling Kubernetes
Autoscaling KubernetesAutoscaling Kubernetes
Autoscaling Kubernetes
craigbox
 
DevOps with Kubernetes
DevOps with KubernetesDevOps with Kubernetes
DevOps with Kubernetes
EastBanc Tachnologies
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container Operations
Kublr
 
K8S in prod
K8S in prodK8S in prod
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
confluent
 
The Evolution of your Kubernetes Cluster
The Evolution of your Kubernetes ClusterThe Evolution of your Kubernetes Cluster
The Evolution of your Kubernetes Cluster
Kublr
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
Daniel Oliveira Filho
 
Cloud spanner architecture and use cases
Cloud spanner architecture and use casesCloud spanner architecture and use cases
Cloud spanner architecture and use cases
GDG Cloud Bengaluru
 
Crafting Kubernetes Operators
Crafting Kubernetes OperatorsCrafting Kubernetes Operators
Crafting Kubernetes Operators
Red Hat Developers
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and Kubernetes
Will Hall
 

What's hot (20)

CI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar DemriCI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar Demri
 
Setup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes FederationSetup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes Federation
 
What's new in Kubernetes
What's new in KubernetesWhat's new in Kubernetes
What's new in Kubernetes
 
Webcast - Making kubernetes production ready
Webcast - Making kubernetes production readyWebcast - Making kubernetes production ready
Webcast - Making kubernetes production ready
 
Running and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStackRunning and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStack
 
Spinnaker on Kubernetes
Spinnaker on KubernetesSpinnaker on Kubernetes
Spinnaker on Kubernetes
 
Implement Advanced Scheduling Techniques in Kubernetes
Implement Advanced Scheduling Techniques in Kubernetes Implement Advanced Scheduling Techniques in Kubernetes
Implement Advanced Scheduling Techniques in Kubernetes
 
AKS
AKSAKS
AKS
 
Kubernetes for Serverless - Serverless Summit 2017 - Krishna Kumar
Kubernetes for Serverless  - Serverless Summit 2017 - Krishna KumarKubernetes for Serverless  - Serverless Summit 2017 - Krishna Kumar
Kubernetes for Serverless - Serverless Summit 2017 - Krishna Kumar
 
DevOps with Azure, Kubernetes, and Helm Webinar
DevOps with Azure, Kubernetes, and Helm WebinarDevOps with Azure, Kubernetes, and Helm Webinar
DevOps with Azure, Kubernetes, and Helm Webinar
 
Autoscaling Kubernetes
Autoscaling KubernetesAutoscaling Kubernetes
Autoscaling Kubernetes
 
DevOps with Kubernetes
DevOps with KubernetesDevOps with Kubernetes
DevOps with Kubernetes
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container Operations
 
K8S in prod
K8S in prodK8S in prod
K8S in prod
 
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
 
The Evolution of your Kubernetes Cluster
The Evolution of your Kubernetes ClusterThe Evolution of your Kubernetes Cluster
The Evolution of your Kubernetes Cluster
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 
Cloud spanner architecture and use cases
Cloud spanner architecture and use casesCloud spanner architecture and use cases
Cloud spanner architecture and use cases
 
Crafting Kubernetes Operators
Crafting Kubernetes OperatorsCrafting Kubernetes Operators
Crafting Kubernetes Operators
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and Kubernetes
 

Similar to [Spark Summit 2017 NA] Apache Spark on Kubernetes

Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenApache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Databricks
 
Big data and Kubernetes
Big data and KubernetesBig data and Kubernetes
Big data and Kubernetes
Anirudh Ramanathan
 
Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1
Hao H. Zhang
 
Webinar kubernetes and-spark
Webinar  kubernetes and-sparkWebinar  kubernetes and-spark
Webinar kubernetes and-spark
cnvrg.io AI OS - Hands-on ML Workshops
 
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
VMUG IT
 
Serverless spark
Serverless sparkServerless spark
Serverless spark
MamathaBusi
 
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
DevDay.org
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
DataWorks Summit
 
CNCF Projects Overview
CNCF Projects OverviewCNCF Projects Overview
CNCF Projects Overview
Neependra Khare
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
DataWorks Summit
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
Rachit Arora
 
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd についてKubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
LINE Corporation
 
SpringOne Tour: An Introduction to Azure Spring Apps Enterprise
SpringOne Tour: An Introduction to Azure Spring Apps EnterpriseSpringOne Tour: An Introduction to Azure Spring Apps Enterprise
SpringOne Tour: An Introduction to Azure Spring Apps Enterprise
VMware Tanzu
 
Containerized architectures for deep learning
Containerized architectures for deep learningContainerized architectures for deep learning
Containerized architectures for deep learning
Antje Barth
 
Azure Kubernetes Service 2019 ふりかえり
Azure Kubernetes Service 2019 ふりかえりAzure Kubernetes Service 2019 ふりかえり
Azure Kubernetes Service 2019 ふりかえり
Toru Makabe
 
Kubernetes overview 101
Kubernetes overview 101Kubernetes overview 101
Kubernetes overview 101
Boskey Savla
 
Docker kubernetes fundamental(pod_service)_190307
Docker kubernetes fundamental(pod_service)_190307Docker kubernetes fundamental(pod_service)_190307
Docker kubernetes fundamental(pod_service)_190307
Inhye Park
 
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Vietnam Open Infrastructure User Group
 
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
Tokyo Azure Meetup
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
Athens Big Data
 

Similar to [Spark Summit 2017 NA] Apache Spark on Kubernetes (20)

Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenApache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
 
Big data and Kubernetes
Big data and KubernetesBig data and Kubernetes
Big data and Kubernetes
 
Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1
 
Webinar kubernetes and-spark
Webinar  kubernetes and-sparkWebinar  kubernetes and-spark
Webinar kubernetes and-spark
 
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
 
Serverless spark
Serverless sparkServerless spark
Serverless spark
 
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
[DevDay 2017] OpenShift Enterprise - Speaker: Linh Do - DevOps Engineer at Ax...
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
 
CNCF Projects Overview
CNCF Projects OverviewCNCF Projects Overview
CNCF Projects Overview
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
 
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd についてKubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
 
SpringOne Tour: An Introduction to Azure Spring Apps Enterprise
SpringOne Tour: An Introduction to Azure Spring Apps EnterpriseSpringOne Tour: An Introduction to Azure Spring Apps Enterprise
SpringOne Tour: An Introduction to Azure Spring Apps Enterprise
 
Containerized architectures for deep learning
Containerized architectures for deep learningContainerized architectures for deep learning
Containerized architectures for deep learning
 
Azure Kubernetes Service 2019 ふりかえり
Azure Kubernetes Service 2019 ふりかえりAzure Kubernetes Service 2019 ふりかえり
Azure Kubernetes Service 2019 ふりかえり
 
Kubernetes overview 101
Kubernetes overview 101Kubernetes overview 101
Kubernetes overview 101
 
Docker kubernetes fundamental(pod_service)_190307
Docker kubernetes fundamental(pod_service)_190307Docker kubernetes fundamental(pod_service)_190307
Docker kubernetes fundamental(pod_service)_190307
 
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
 
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
Tokyo Azure Meetup #7 - Introduction to Serverless Architectures with Azure F...
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
 

Recently uploaded

Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 

Recently uploaded (20)

Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 

[Spark Summit 2017 NA] Apache Spark on Kubernetes

  • 1. Anirudh Ramananathan (foxish@google.com) Software Engineer (Kubernetes) Timothy Chen (tim@hyperpilot.io) Co-founder & CTO (HyperPilot) Apache Spark on Kubernetes
  • 2. Agenda • Kubernetes & Containers • Motivation • Design • Demo • Deep Dive • Roadmap
  • 4. Kubernetes Kubernetes is an open-source system
  • 5. Kubernetes Kubernetes is an open-source system for automating deployment, scaling, and management
  • 6. Kubernetes Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
  • 8. Containers libs app kernel libs app libs app libs app • Repeatable Builds and Workflows • Application Portability • High Degree of Control over Software • Faster Development Cycle • Reduced dev-ops load • Improved Infrastructure Utilization
  • 9. • Large OSS Community - 1200+ contributors and 45k+ commits • Ecosystem and Partners - 100+ organizations involved • One of the top 100 projects overall on GitHub - 23k+ stars Kubernetes
  • 11. At a Glance kubelet UI kubeletCLI API users master nodes etcd kubelet scheduler controllers apiserver
  • 12. Nodes and Pods Pod Volume Containers Pod Containers 8080 8080 8080 Volume Node • A pod is a set of co-located containers • Created by a declarative specification supplied to the master • Each pod has its own IP address • Volumes can be local or network-attached
  • 14. Why Spark on Kubernetes? • Resource sharing between batch, serving and stateful workloads – Streamlined developer experience – Reduced operational costs – Improved infrastructure utilization • Kubernetes and the Container Ecosystem – Lots of addon services: third-party logging, monitoring, and security tools – For example, the Istio project, announced May 24, by IBM, Google and Lyft, provides a service mesh for authenticating, authorizing, tracing, and timing, and rate-limiting container-to-container communication, and more.
  • 16. Spark, meet Kubernetes! Spark Core Kubernetes Standalone YARN Mesos GraphX SparkSQL MLlib Streaming
  • 17. Spark, meet Kubernetes! Spark Core Kubernetes Scheduler Backend Kubernetes Clusternew executors remove executors configuration • Resource Requests • Authnz • Communication with K8s
  • 18. Kubernetes, meet Spark! Kubernetes Cluster File Staging Server • Staging server: component to stage local files • Spark Shuffle service: component to store shuffle data for dynamic allocation • ThirdParty/CustomResources: extend Kubernetes API with Spark Knowledge Shuffle Service SparkJob API Endpoint
  • 19. Kubernetes Integration Dependencies Container images with dependencies baked in Files from GCS/S3/HDFS/HTTP File Staging Server Staged files and JARs Several ways of running Spark Jobs along with their dependencies on Kubernetes
  • 20. Administration Namespaces Resource Accounting Logging Monitoring Resource Quota Pluggable Authorization Admission Control RBAC • Launch Spark Jobs as a particular user into a specific namespace • RBAC and Namespace-level resource quotas • Audit logging for clusters • Several monitoring solutions to see node, cluster and pod-level statistics
  • 21. Focus Areas Wordcloud of the command-line options we added to spark-submit on Kubernetes
  • 22. Demo
  • 25. • Spark Submit submits job to K8s • K8s schedules the driver for job Deep Dive kubernetes cluster apiserver scheduler schedule driver pod spark driver
  • 26. • Spark Submit submits job to K8s • K8s schedules the driver for job Deep Dive • Spark Submit submits job to K8s • K8s schedules the driver for job • Driver requests executors as needed kubernetes cluster apiserver scheduler spark driver create executor pods
  • 27. • Spark Submit submits job to K8s • K8s schedules the driver for job Deep Dive • Spark Submit submits job to K8s • K8s schedules the driver for job • Driver requests executors as needed • Executors scheduled and created kubernetes cluster apiserver scheduler spark driver schedule executorpods executors
  • 28. • Spark Submit submits job to K8s • K8s schedules the driver for job Deep Dive • Spark Submit submits job to K8s • K8s schedules the driver for job • Driver requests executors as needed • Executors scheduled and created • Executors run tasks kubernetes cluster apiserver scheduler spark driver executors
  • 29. • Spark Submit submits job to K8s • K8s schedules the driver for job Deep Dive • Spark Submit submits job to K8s • K8s schedules the driver for job • Driver requests executors as needed • Executors scheduled and created • Executors run tasks • Driver “completes” job and persists logs kubernetes cluster apiserver scheduler spark driver
  • 31. Spark Streaming Spark Roadmap Spark Shell Client Mode Python/R support Cluster Mode Java/Scala Support Dynamic Allocation Local File Staging High Availability Spark SQL GraphX MLlib Dec 2016 Development Began Mar 2017 Alpha Release June 2017 Beta Release Nov 2016 Design = supported but untested = not yet supported
  • 32. We’re just getting started... • Kubernetes CustomResources • Priorities and Preemption for Pods • Batch Scheduling and Resource Sharing • Cluster Federation and Multi-cloud deployments • Ecosystem: Kafka, Cassandra, HDFS, etc
  • 33. Contributors Organizations Alphabetically: • Google • Haiwen • Hyperpilot • Intel • Palantir • Pepperdata • Red Hat Links: • Spark 2.2.0 Documentation • https://issues.apache.org/jira/browse /SPARK-18278 • https://github.com/kubernetes/kubern etes/issues/34377
  • 34. Try it out today! https://github.com/apache-spark-on-k8s/spark
  • 35. Thank You. HDFS on Kubernetes - Lessons Learned June 7 at 11:00 AM in Room 2003 Join us Wednesdays at 10am PT at the SIG BigData meeting https://github.com/kubernetes/community/