Cloud-Native Spark Scheduling with YuniKorn Scheduler
Li Gao
Tech lead and engineer @ Databricks Compute Fabric
Previous tech lead at data infrastructure @ Lyft
Weiwei Yang
Tech Lead @ Cloudera Compute Platform
Apache Hadoop Committer & PMC member
Previous tech lead at Real-time Compute Infra @ Alibaba
Agenda
Li Gao
Why Lyft is choosing Spark on K8s
The need for custom k8s scheduler for Spark
Weiwei Yang
Spark Scheduling with YuniKorn
Deep Dive into YuniKorn Features
Community and Roadmap
Role of K8s in Lyft’s Data Landscape
Why Choose K8s for Spark
▪ Containerized Spark compute to provide shared resources across
different ML and ETL jobs
▪ Support for multiple Spark versions, Python versions, and
version-controlled containers on the shared K8s clusters for both faster
iteration and stable production
▪ A single, unified infrastructure for both the majority of our data compute
and our microservices, with advanced, unified observability and resource
isolation support
▪ Fine grained access controls on shared clusters
The Spark K8s Infra @ Lyft
Multi-step creation for a Spark K8s job
[Diagram: Resource Labels, Jobs, Cluster Pool, K8s Cluster, Namespace Group, Namespace, Spark CRD, Spark Pods, DataLake]
Problems of existing Spark K8s infrastructure
▪ Complex layers of custom K8s controllers are needed to handle the scale
of the Spark jobs
▪ Tight coupling between controller layers amplifies latency issues in
certain cases
▪ Priority queues between jobs, clusters, and namespaces are managed by
multiple layers of controllers to achieve the desired performance
Why we need a customized K8s Scheduler
▪ High latency (~100 seconds) using the default scheduler is observed on a single
K8s cluster for large volumes of batch workloads
▪ Large batch fair sharing in the same resource pool is unpredictable with the
default scheduler
▪ Mix of FIFO and FAIR requirements on shared job clusters
▪ Need for elastic, hierarchical priority management for jobs in K8s
▪ Richer, online user visibility into the scheduling behavior
▪ Simplified layers of controllers with a custom K8s scheduler
Spark Scheduling with YuniKorn
Flavors of Running Spark on K8s
▪ Native Spark on K8s: identify Spark jobs by the pod labels
▪ Spark K8s Operator: identify Spark jobs by CRDs (e.g., SparkApplication)
Resource Scheduling in K8s
Scheduler workflow in human language: the scheduler picks
up one pod at a time, finds the best-fit node, and then launches
the pod on that node.
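The one-pod-at-a-time workflow can be sketched in a few lines of Python. This is only an illustration of the filter-then-score idea, not the real kube-scheduler code: the `Node` and `Pod` classes, the resource units, and the "most free CPU wins" scoring rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: int  # millicores (hypothetical unit for this sketch)
    free_mem: int  # MiB
    pods: list = field(default_factory=list)


@dataclass
class Pod:
    name: str
    cpu: int
    mem: int


def schedule_one(pod, nodes):
    """Pick up a single pod, filter feasible nodes, score them, bind to the best."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [n for n in nodes if n.free_cpu >= pod.cpu and n.free_mem >= pod.mem]
    if not feasible:
        return None  # no node fits, so the pod stays pending
    # Score: assumed rule, prefer the node left with the most free CPU.
    best = max(feasible, key=lambda n: (n.free_cpu - pod.cpu, n.name))
    # Bind: record the placement and deduct the resources.
    best.free_cpu -= pod.cpu
    best.free_mem -= pod.mem
    best.pods.append(pod.name)
    return best.name


nodes = [Node("node-a", 2000, 4096), Node("node-b", 4000, 8192)]
pods = [Pod("driver", 1000, 2048), Pod("exec-1", 3000, 4096), Pod("exec-2", 3000, 4096)]
placements = [schedule_one(p, nodes) for p in pods]
print(placements)
```

Note that each pod is placed in isolation: nothing in this loop knows that the three pods belong to the same Spark job, which is the gap the following slides address.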
Spark on K8s: the scheduling challenges
▪ Job Scheduling Requirements
▪ Job ordering/queueing
▪ Job level priority
▪ Resource fairness (between jobs / queues)
▪ Gang scheduling
▪ Resource Sharing and Utilization Challenges
▪ Spark driver pods can occupy all resources in a
namespace
▪ Resource competition, deadlock between large jobs
▪ Misbehaving jobs could abuse resources
▪ High throughput
Workload types: ad-hoc queries, batch jobs, workflows (DAG), streaming
The need for a unified architecture across on-prem, cloud,
multi-cloud, and hybrid cloud
K8s default scheduler was NOT created to tackle these challenges
Apache YuniKorn (Incubating)
What is it:
▪ A standalone resource scheduler for K8s
▪ Focuses on building scheduling capabilities to empower Big Data
on K8s
Simple to use:
▪ A stateless service on K8s
▪ A secondary K8s scheduler or replacement of the default
scheduler
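When YuniKorn runs as a secondary scheduler, a pod has to opt in to it. The sketch below shows one way to do that by editing a pod manifest held as a plain Python dict. `schedulerName` is a standard Kubernetes pod spec field; the scheduler name `yunikorn` and the `applicationId` and `queue` label names are assumptions here and should be checked against the YuniKorn version actually deployed. The helper `with_yunikorn` is hypothetical.

```python
import json


def with_yunikorn(pod, app_id, queue):
    """Return a copy of a pod manifest routed to YuniKorn (names are assumptions)."""
    metadata = dict(pod.get("metadata", {}))
    labels = dict(metadata.get("labels", {}))
    # Assumed label names used to group pods into an app and place it in a queue.
    labels.update({"applicationId": app_id, "queue": queue})
    metadata["labels"] = labels
    spec = dict(pod.get("spec", {}))
    # Standard K8s field: which scheduler should handle this pod.
    spec["schedulerName"] = "yunikorn"
    return {**pod, "metadata": metadata, "spec": spec}


driver_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "spark-driver", "labels": {"spark-role": "driver"}},
    "spec": {"containers": [{"name": "spark", "image": "example/spark:latest"}]},
}
routed = with_yunikorn(driver_pod, "spark-job-001", "root.default")
print(json.dumps(routed["metadata"]["labels"], sort_keys=True))
```

Pods that do not set `schedulerName` keep going to the default scheduler, which is what makes the secondary-scheduler mode a low-risk way to try YuniKorn.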
Resource Scheduling in YuniKorn (and compare w/ default scheduler)
[Diagram: Apps, API Server, ETCD, Resource Scheduler, Kubelet. Default Scheduler: Filter, Score, Sort, Extensions. YuniKorn: Apps, Nodes, Queues, Request; pluggable policies: Queue Sort, App Sort, Node Sort.]
YuniKorn's QUEUE and APP concepts are critical for providing
advanced job scheduling and fine-grained resource management
capabilities
Main difference (YuniKorn vs. Default Scheduler)
Feature: Note
▪ Scheduling at app dimension: App is a first-class citizen in YuniKorn. YuniKorn schedules apps with respect to, e.g., their submission order, priority, and resource usage.
▪ Job ordering: YuniKorn supports FIFO/FAIR/Priority (WIP) job ordering policies.
▪ Fine-grained resource capacity management: Manage cluster resources with hierarchical queues; a queue provides the guaranteed resources (min) and the resource quota (max).
▪ Resource fairness: Inter-queue resource fairness.
▪ Natively support Big Data workloads: The default scheduler is mainly for long-running services. YuniKorn is designed for Big Data app workloads; it natively supports Spark, Flink, TensorFlow, etc.
▪ Scale & Performance: YuniKorn is optimized for performance; it is suitable for high-throughput, large-scale environments.
Run Spark with YuniKorn
Submit a Spark job
1) Run spark-submit
2) Create SparkApplication CRD
Spark job lifecycle (Spark-job-001):
▪ Job is starting: the api-server creates the driver pod (Spark driver pod: Pending)
▪ The Spark job is registered to YuniKorn in a leaf queue
▪ Sort queues -> sort apps -> select request -> select node
▪ YuniKorn asks the api-server to bind the pod to the node; the api-server binds the pod to the assigned node (Spark driver pod: Bound)
▪ Spark driver is running: the driver pod is started and requests Spark executor pods from the api-server
▪ Spark executors are created: the api-server creates the executor pods, and the new executors are added as pending requests (Spark executor pods: Pending)
▪ Spark job is running: YuniKorn schedules and binds the executors (Spark executor pods: Bound)
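The "sort queues -> sort apps -> select request -> select node" step can be sketched as a single allocation pass. This is a simplified illustration, not YuniKorn's implementation: the `App` and `Queue` classes, the single CPU resource, the fair queue ordering by used/guaranteed, the FIFO app ordering, and the "most free capacity" node choice are all assumptions for the example.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    app_id: str
    submit_time: int
    pending: list = field(default_factory=list)  # pending requests, in CPU units


@dataclass
class Queue:
    name: str
    guaranteed: int  # guaranteed CPU units, must be > 0
    used: int = 0
    apps: list = field(default_factory=list)


def allocate_once(queues, nodes):
    """One pass: sort queues -> sort apps -> select request -> select node."""
    # Sort queues: the least-served queue (used / guaranteed) goes first.
    for q in sorted(queues, key=lambda q: q.used / q.guaranteed):
        # Sort apps: FIFO by submission time inside the queue.
        for app in sorted(q.apps, key=lambda a: a.submit_time):
            if not app.pending:
                continue
            request = app.pending[0]  # select request
            fits = [n for n, free in nodes.items() if free >= request]
            if not fits:
                continue
            node = max(fits, key=lambda n: (nodes[n], n))  # select node
            app.pending.pop(0)
            nodes[node] -= request
            q.used += request
            return (q.name, app.app_id, request, node)
    return None  # nothing schedulable in this pass


nodes = {"node-a": 4, "node-b": 2}
etl = Queue("root.etl", guaranteed=10, used=5, apps=[
    App("spark-job-002", submit_time=2, pending=[2]),
    App("spark-job-001", submit_time=1, pending=[1, 1]),
])
adhoc = Queue("root.adhoc", guaranteed=10, used=8, apps=[
    App("spark-job-003", submit_time=0, pending=[1]),
])
first = allocate_once([adhoc, etl], nodes)
print(first)
```

Although `spark-job-003` was submitted earliest, `root.etl` is served first because it is further below its guaranteed share, and inside that queue the older `spark-job-001` wins.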
Deep Dive into YuniKorn Features/Performance
Job Ordering
Why does this matter?
▪ If I submit my job earlier, I want it to run first
▪ I don’t want my job to be starved because resources are used by others
▪ I have an urgent job, let me run first!
Per queue sorting policy
▪ FIFO - Order jobs by submission time
▪ FAIR - Order jobs by resource consumption
▪ Priority (WIP-0.9) - Order jobs by job-level priorities within the
same queue
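Each of the three per-queue policies amounts to a different sort key over the jobs in a queue. The sketch below makes that concrete; the job fields (`submitted`, `used_mem`, `priority`) and the tie-break on submission time for the priority policy are assumptions for illustration, not YuniKorn's data model.

```python
# Hypothetical jobs in one queue.
jobs = [
    {"name": "job-a", "submitted": 1, "used_mem": 300, "priority": 0},
    {"name": "job-b", "submitted": 2, "used_mem": 100, "priority": 5},
    {"name": "job-c", "submitted": 3, "used_mem": 200, "priority": 1},
]

# One sort key per policy.
POLICIES = {
    "fifo": lambda j: j["submitted"],                       # earliest submission first
    "fair": lambda j: j["used_mem"],                        # least resource consumption first
    "priority": lambda j: (-j["priority"], j["submitted"]),  # highest priority first
}


def order_jobs(jobs, policy):
    """Return job names in the order the given policy would serve them."""
    return [j["name"] for j in sorted(jobs, key=POLICIES[policy])]


for policy in POLICIES:
    print(policy, order_jobs(jobs, policy))
```

With FIFO the early `job-a` runs first, with FAIR the lightly served `job-b` jumps ahead so it is not starved, and with priority the urgent `job-b` runs first regardless of when it arrived.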
Resource Quota Management: K8s Namespace ResourceQuota
K8s Namespace Resource Quota
▪ Defines resource limits
▪ Enforced by the quota admission-controller
Problems
▪ Hard to control when resource quotas are overcommitted
▪ Users have no guaranteed resources
▪ Quota could be abused (e.g., by pending pods)
▪ No queueing for jobs…
▪ Low utilization?!
Namespace Resource Quota is suboptimal for
sharing resources among multiple tenants
Resource Quota Management: YuniKorn Queue Capacity
YuniKorn Queue provides an optimal solution to manage resource quotas
▪ A queue can map to one (or more) namespaces automatically
▪ Capacity is elastic from min to max
▪ Honors resource fairness
▪ Quota is only counted for pods that actually consume resources
▪ Enables job queueing
[Diagram: a Namespace mapped to a YuniKorn Queue holding three pods (CPU: 1, Memory: 1024Mi; CPU: 2, Memory: 2048Mi; CPU: 2, Memory: 2048Mi). Queue Max CPU: 5, Memory: 5120Mi]
-> better resource sharing, ensure guarantee, enforce max
-> zero config queue mgmt
-> avoid starving jobs/users
-> accurate resource counting, improve utilization
-> jobs can be queued in the scheduler, keep client side logic simple
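The numbers on this slide can be used for a small worked example of queue max enforcement. The sketch assumes the three resource boxes are pods already allocated in the queue and that only allocated pods count against the quota; the dict layout and the helper names `used` and `can_admit` are hypothetical.

```python
# Queue max taken from the slide.
QUEUE_MAX = {"cpu": 5, "memory_mi": 5120}

# Pods assumed to be already allocated in the queue (values from the slide).
allocated = [
    {"cpu": 1, "memory_mi": 1024},
    {"cpu": 2, "memory_mi": 2048},
    {"cpu": 2, "memory_mi": 2048},
]


def used(pods):
    """Sum the resources of pods that actually hold an allocation."""
    return {k: sum(p[k] for p in pods) for k in QUEUE_MAX}


def can_admit(pod, pods):
    """True if allocating the pod keeps the queue within its max."""
    current = used(pods)
    return all(current[k] + pod[k] <= QUEUE_MAX[k] for k in QUEUE_MAX)


new_pod = {"cpu": 1, "memory_mi": 1024}
print(used(allocated))                    # totals for the three pods
print(can_admit(new_pod, allocated))      # queue is at its max
print(can_admit(new_pod, allocated[:2]))  # room again once one pod finishes
```

A pod that cannot be admitted is not rejected; it simply waits in the queue, so the pending pod does not consume quota and the client does not need retry logic.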
Resource Fairness in YuniKorn Queues
Queue | Guaranteed Resource (Mem) | Requests (NumOfPods * Mem)
root.default | 500,000 | 1000 * 10
root.search | 400,000 | 500 * 10
root.test | 100,000 | 200 * 10
root.sandbox | 100,000 | 200 * 50
Scheduling workloads with different requests in 4 queues with
different guaranteed resources.
The usage ratios of the queues increased with a similar trend.
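A rough way to see why the usage ratios rise together is to simulate a scheduler that always serves the queue with the lowest used/guaranteed ratio. The queue figures below come from the table; the greedy rule, the tie-break by queue name, and the 400-step run length are assumptions for this sketch and not a description of YuniKorn's actual fairness algorithm.

```python
queues = {
    "root.default": {"guaranteed": 500_000, "pod_mem": 10, "pending": 1000, "used": 0},
    "root.search": {"guaranteed": 400_000, "pod_mem": 10, "pending": 500, "used": 0},
    "root.test": {"guaranteed": 100_000, "pod_mem": 10, "pending": 200, "used": 0},
    "root.sandbox": {"guaranteed": 100_000, "pod_mem": 50, "pending": 200, "used": 0},
}


def usage_ratio(q):
    """Share of the guaranteed memory that the queue is currently using."""
    return q["used"] / q["guaranteed"]


def next_queue(queues):
    """Pick the queue with pending pods and the lowest usage ratio."""
    candidates = [name for name, q in queues.items() if q["pending"] > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda name: (usage_ratio(queues[name]), name))


def run(queues, steps):
    """Allocate one pod per step to the least-served queue."""
    for _ in range(steps):
        name = next_queue(queues)
        if name is None:
            break
        q = queues[name]
        q["used"] += q["pod_mem"]
        q["pending"] -= 1


run(queues, 400)
ratios = {name: usage_ratio(q) for name, q in queues.items()}
spread = max(ratios.values()) - min(ratios.values())
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.5f}")
```

While every queue still has pending pods, the gap between the highest and lowest ratio never exceeds the largest single-pod step (50 / 100,000 for `root.sandbox`), so the ratios climb in lockstep even though pod sizes and guarantees differ.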
Scheduler Throughput Benchmark
Schedule 50,000 pods on 2,000/4,000 nodes.
Compare scheduling throughput (pods per second allocated by the scheduler).
Red line: YuniKorn; green line: Default Scheduler
▪ 50k pods on 2k nodes: 617 vs 263 (↑ 134%)
▪ 50k pods on 4k nodes: 373 vs 141 (↑ 164%)
Detail report:
https://github.com/apache/incubator-yunikorn-core/blob/master/docs/evaluate-perf-function-with-Kubemark.md
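The improvement percentages follow directly from the throughput pairs as relative gain over the default scheduler. The short check below recomputes them; the helper name `improvement_pct` and the dict layout are just for illustration.

```python
def improvement_pct(yunikorn, default):
    """Relative throughput gain of YuniKorn over the default scheduler, in percent."""
    return (yunikorn - default) / default * 100


# (YuniKorn, default scheduler) pods per second, from the slide.
results = {"2k nodes": (617, 263), "4k nodes": (373, 141)}

for label, (yunikorn, default) in results.items():
    print(f"50k pods on {label}: {improvement_pct(yunikorn, default):.1f}%")
```

Truncated to whole percentages, the computed values match the 134% and 164% quoted on the slide.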
Fully K8s Compatible
▪ Supports K8s predicates
  ▪ Node selector
  ▪ Pod affinity/anti-affinity
  ▪ Taints and tolerations
  ▪ …
▪ Supports PersistentVolumeClaim and PersistentVolume
  ▪ Volume bindings
  ▪ Dynamic provisioning
▪ Publishes key scheduling events to the K8s event system
▪ Works with the cluster autoscaler
▪ Supports management commands
  ▪ cordon nodes
YuniKorn Management Console
Compare YuniKorn with other K8s schedulers
Capabilities (v = supported, x = not supported)
Capability | K8s default scheduler | Kube-batch | YuniKorn
Resource Sharing: Hierarchy queues | x | x | v
Resource Sharing: Queue elastic capacity | x | x | v
Resource Fairness: Cross queue fairness | x | v | v
Resource Fairness: User level fairness | x | x | v
Resource Fairness: App level fairness | x | v | v
Preemption: Basic preemption | v | v | v
Preemption: With fairness | x | x | v
Gang Scheduling | x | v | v* (YUNIKORN-2)
Bin Packing | v | v | v
Throughput | 260 allocs/s (2k nodes) | ? Likely slower than kube-default, from [1] | 610 allocs/s (2k nodes)
[1] https://github.com/kubernetes-sigs/kube-batch/issues/930
Community, Summary and Next
Current Status
▪ Open-sourced on July 17, 2019, under the Apache 2.0 License
▪ Entered the Apache Incubator on Jan 21, 2020
▪ Latest stable version 0.8.0 released on May 4, 2020
▪ Diverse community with members from Alibaba, Cloudera,
Microsoft, LinkedIn, Apple, Tencent, Nvidia and more…
The Community
▪ Deployed in non-production K8s clusters
▪ Launched 100s of large jobs per day on some of the YuniKorn queues
▪ Reduced our large job scheduler latency by a factor of ~3x at peak time
▪ K8s cluster overall resource utilization efficiency (cost per compute) improved over the default kube-scheduler for mixed workloads
▪ FIFO and FAIR requests are more frequently met than before
▪ Shipping with Cloudera Public Cloud offerings
▪ Provides resource quota management and advanced job scheduling capabilities for Spark
▪ Responsible for both microservice and batch job scheduling
▪ Running on cloud with auto-scaling enabled
▪ Deployed on a pre-production on-prem cluster with ~100 nodes
▪ Plan to deploy on a 1000+ node production K8s cluster this year
▪ Leverages YuniKorn features such as hierarchical queues and resource fairness to run large-scale Flink jobs on K8s
▪ Gained 4x scheduling performance improvements
Roadmap
Current (0.8.0)
● Hierarchy queues
● Cross queue fairness
● Fair/FIFO job ordering policies
● Fair/Bin-packing node sorting policies
● Self queue management
● Pluggable app discovery
● Metrics system and Prometheus integration
Upcoming (0.9.0), 3rd quarter of 2020
● Gang Scheduling
● Job/task priority support (scheduling & preemption)
● Support Spark dynamic allocation
Our Vision - Resource Mgmt for Big Data & ML
▪ Computes: Data Engineering, Realtime Streaming, Machine Learning
▪ Types: Micro services, batch jobs, long running workloads, interactive sessions, model serving
▪ Targets: Multi-tenancy, SLA, Resource Utilization, Cost Mgmt, Budget
Unified Compute Platform for Big Data & ML
Join us in the
YuniKorn Community !!
▪ Project web site: http://yunikorn.apache.org/
▪ Github repo: apache/incubator-yunikorn-core
▪ Mailing list: dev@yunikorn.apache.org
▪ Slack channel:
▪ Bi-weekly/Monthly sync up meetings for different time zones
Thank you!!

More Related Content

What's hot

Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flinkdatamantra
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQLDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Databricks
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Databricks
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...Databricks
 
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...From Query Plan to Query Performance: Supercharging your Apache Spark Queries...
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...Databricks
 
Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Julien Le Dem
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeFlink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsDatabricks
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3DataWorks Summit
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceDatabricks
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationDatabricks
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
 

What's hot (20)

Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQL
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...From Query Plan to Query Performance: Supercharging your Apache Spark Queries...
From Query Plan to Query Performance: Supercharging your Apache Spark Queries...
 
Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 

Similar to Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler

Reliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesReliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesDatabricks
 
Powering Data Science and AI with Apache Spark, Alluxio, and IBM
Powering Data Science and AI with Apache Spark, Alluxio, and IBMPowering Data Science and AI with Apache Spark, Alluxio, and IBM
Powering Data Science and AI with Apache Spark, Alluxio, and IBMAlluxio, Inc.
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Spark Summit
 
Optimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsOptimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsYathiraj Udupi, Ph.D.
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudDatabricks
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftChester Chen
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processingconfluent
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesDataWorks Summit
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasMonal Daxini
 
Spark & Yarn better together 1.2
Spark & Yarn better together 1.2Spark & Yarn better together 1.2
Spark & Yarn better together 1.2Jianfeng Zhang
 
Optimized placement in Openstack for NFV
Optimized placement in Openstack for NFVOptimized placement in Openstack for NFV
Optimized placement in Openstack for NFVDebojyoti Dutta
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsRightScale
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftLi Gao
 
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...Chris Fregly
 
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015Mac Moore
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksRuslan Meshenberg
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoophitesh1892
 

Similar to Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler (20)

Reliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesReliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on Kubernetes
 
Powering Data Science and AI with Apache Spark, Alluxio, and IBM
Powering Data Science and AI with Apache Spark, Alluxio, and IBMPowering Data Science and AI with Apache Spark, Alluxio, and IBM
Powering Data Science and AI with Apache Spark, Alluxio, and IBM
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
 
Optimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsOptimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack Clouds
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the Cloud
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond Kubernetes
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paas
 
Spark & Yarn better together 1.2
Spark & Yarn better together 1.2Spark & Yarn better together 1.2
Spark & Yarn better together 1.2
 
Optimized placement in Openstack for NFV
Optimized placement in Openstack for NFVOptimized placement in Openstack for NFV
Optimized placement in Openstack for NFV
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...
Spark on Kubernetes - Advanced Spark and Tensorflow Meetup - Jan 19 2017 - An...
 
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
 
Kafka & Hadoop in Rakuten
Kafka & Hadoop in RakutenKafka & Hadoop in Rakuten
Kafka & Hadoop in Rakuten
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talks
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
 


Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler

  • 2. Cloud-Native Spark Scheduling with YuniKorn Scheduler
  • 3. Li Gao: Tech lead and engineer @ Databricks Compute Fabric; previous tech lead at data infrastructure @ Lyft. Weiwei Yang: Tech Lead @ Cloudera Compute Platform; Apache Hadoop Committer & PMC member; previous tech lead at Real-time Compute Infra @ Alibaba
  • 4. Agenda
      ▪ Li Gao: Why Lyft is choosing Spark on K8s; the need for a custom K8s scheduler for Spark
      ▪ Weiwei Yang: Spark scheduling with YuniKorn; deep dive into YuniKorn features; community and roadmap
  • 5. Role of K8s in Lyft’s Data Landscape
  • 6. Why Choose K8s for Spark
      ▪ Containerized Spark compute to provide shared resources across different ML and ETL jobs
      ▪ Support for multiple Spark versions, Python versions, and version-controlled containers on the shared K8s clusters, for both faster iteration and stable production
      ▪ A single, unified infrastructure for the majority of both our data compute and microservices, with advanced, unified observability and resource isolation support
      ▪ Fine-grained access controls on shared clusters
  • 7. The Spark K8s Infra @ Lyft
  • 8. Multi-step creation for a Spark K8s job (diagram labels: Resource Labels, Jobs, Cluster Pool, K8s Cluster, Namespace Group, Namespace, Spark CRD, Spark Pods, DataLake)
  • 9. Problems of existing Spark K8s infrastructure
      ▪ Complexity of layers of custom K8s controllers to handle the scale of the Spark jobs
      ▪ Tight coupling of controller layers amplifies latency issues in certain cases
      ▪ Priority queues between jobs, clusters, and namespaces are managed by multiple layers of controllers to achieve desired performance
  • 10. Why we need a customized K8s Scheduler
      ▪ High latency (~100 seconds) is observed with the default scheduler on a single K8s cluster for large volumes of batch workloads
      ▪ Large batch fair sharing in the same resource pool is unpredictable with the default scheduler
      ▪ Mix of FIFO and FAIR requirements on shared jobs clusters
      ▪ The need for elastic and hierarchical priority management for jobs in K8s
      ▪ Richer and online user visibility into the scheduling behavior
      ▪ Simplified layers of controllers with a custom K8s scheduler
  • 12. Flavors of Running Spark on K8s
      ▪ Native Spark on K8s: identify Spark jobs by the pod labels
      ▪ Spark K8s Operator: identify Spark jobs by CRDs (e.g. SparkApplication)
  • 13. Resource Scheduling in K8s. Scheduler workflow in human language: the scheduler picks up one pod at a time, finds the best-fit node, and then launches the pod on that node.
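The one-pod-at-a-time workflow on slide 13 can be illustrated with a toy sketch. This is not the kube-scheduler implementation: the `Node` and `Pod` classes, the `schedule_one` function, and the "most free memory left" score are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float  # free CPU cores
    free_mem: int    # free memory in MiB


@dataclass
class Pod:
    name: str
    cpu: float
    mem: int


def schedule_one(pod, nodes):
    """Filter nodes that fit the pod, score them, and place the pod on the best one.

    Returns the chosen node name, or None if no node fits.
    """
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [n for n in nodes if n.free_cpu >= pod.cpu and n.free_mem >= pod.mem]
    if not feasible:
        return None
    # Score (hypothetical policy): prefer the node with the most resources left over.
    best = max(feasible, key=lambda n: (n.free_mem - pod.mem, n.free_cpu - pod.cpu))
    # "Launch": account for the pod's resources on the chosen node.
    best.free_cpu -= pod.cpu
    best.free_mem -= pod.mem
    return best.name


nodes = [Node("node-a", 1, 1024), Node("node-b", 4, 8192)]
placed_on = schedule_one(Pod("spark-driver", 2, 2048), nodes)
```

Because each pod is handled in isolation, nothing in this loop knows which job or queue a pod belongs to, which is the gap the later slides address.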
  • 14. Spark on K8s: the scheduling challenges
      ▪ Job Scheduling Requirements
          ▪ Job ordering/queueing
          ▪ Job level priority
          ▪ Resource fairness (between jobs / queues)
          ▪ Gang scheduling
      ▪ Resource Sharing and Utilization Challenges
          ▪ Spark driver pods occupy all resources in a namespace
          ▪ Resource competition, deadlock between large jobs
          ▪ Misbehaving jobs could abuse resources
          ▪ High throughput
      ▪ Workloads: Ad-Hoc Queries, Batch Jobs, Workflow (DAG), Streaming
      ▪ The need for a unified architecture for on-prem, cloud, multi-cloud, and hybrid cloud
      ▪ The K8s default scheduler was NOT created to tackle these challenges
  • 15. Apache YuniKorn (Incubating)
      ▪ What is it:
          ▪ A standalone resource scheduler for K8s
          ▪ Focus on building scheduling capabilities to empower Big Data on K8s
      ▪ Simple to use:
          ▪ A stateless service on K8s
          ▪ A secondary K8s scheduler or a replacement of the default scheduler
  • 16. Resource Scheduling in YuniKorn (and compare w/ default scheduler)
      (diagram labels: Apps, API Server, ETCD, Resource Scheduler, master, Nodes, Queues, Request, Kubelet; Default Scheduler: Filter, Score, Sort, Extensions; YUNIKORN: Queue Sort, App Sort, Node Sort, Pluggable Policies)
      YuniKorn QUEUE, APP concepts are critical to provide advanced job scheduling and fine-grained resource management capabilities
  • 17. Main difference (YuniKorn vs. Default Scheduler). Feature: Note
      ▪ Scheduling at app dimension: App is the first-class citizen in YuniKorn. YuniKorn schedules apps with respect to, e.g., their submission order, priority, resource usage, etc.
      ▪ Job ordering: YuniKorn supports FIFO/FAIR/Priority (WIP) job ordering policies
      ▪ Fine-grained resource capacity management: Manage cluster resources with hierarchy queues; a queue provides the guaranteed resources (min) and the resource quota (max)
      ▪ Resource fairness: Inter-queue resource fairness
      ▪ Natively support Big Data workloads: The default scheduler is mainly for long-running services. YuniKorn is designed for Big Data app workloads; it natively supports Spark/Flink/Tensorflow, etc.
      ▪ Scale & Performance: YuniKorn is optimized for performance; it is suitable for high throughput and large scale environments
  • 18. Run Spark with YuniKorn
      ▪ Submit a Spark job: 1) Run spark-submit 2) Create SparkApplication CRD
      ▪ Api-server creates the driver pod (Spark driver pod: Pending)
      ▪ Spark job (Spark-job-001) is registered to YuniKorn in a leaf queue
      ▪ Sort queues -> sort apps -> select request -> select node
      ▪ Ask api-server to bind the pod to the node; api-server binds the pod to the assigned node (Spark driver pod: Bound)
      ▪ Driver pod is started; it requests Spark executor pods from api-server, and api-server creates the executor pods (Spark executor pods: Pending)
      ▪ New executors are added as pending requests
      ▪ Schedule, and bind executors (Spark executor pods: Bound)
      ▪ Job states: Job is Starting -> Spark driver is running -> Spark executors are created -> Spark job is running
  • 19. Deep Dive into YuniKorn Features/Performance
  • 20. Job Ordering
      ▪ Why does this matter?
          ▪ If I submit the job earlier, I want my job to run first
          ▪ I don't want my job to be starved because resources are used by others
          ▪ I have an urgent job, let me run first!
      ▪ Per queue sorting policy
          ▪ FIFO - Order jobs by submission time
          ▪ FAIR - Order jobs by resource consumption
          ▪ Priority (WIP-0.9) - Order jobs by job-level priorities within the same queue
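A minimal sketch of the two per-queue sorting policies named on slide 20. The `App` class and `order_apps` function are hypothetical, and the slide does not say how FAIR breaks ties or which resource it measures; here it is assumed that the app with the least allocated memory goes first, with submission time as the tie-breaker. This is not YuniKorn's code.

```python
from dataclasses import dataclass


@dataclass
class App:
    app_id: str
    submit_time: float  # submission timestamp, in seconds
    allocated_mem: int  # memory currently allocated to the app (assumed metric)


def order_apps(apps, policy):
    """Return the apps of one queue in the order they would be considered."""
    if policy == "fifo":
        # FIFO: earliest submission first.
        return sorted(apps, key=lambda a: a.submit_time)
    if policy == "fair":
        # FAIR (assumed): least resource consumption first, oldest first on ties.
        return sorted(apps, key=lambda a: (a.allocated_mem, a.submit_time))
    raise ValueError(f"unknown policy: {policy}")


apps = [
    App("job-a", submit_time=1.0, allocated_mem=500),
    App("job-b", submit_time=2.0, allocated_mem=100),
    App("job-c", submit_time=3.0, allocated_mem=300),
]
fifo_order = [a.app_id for a in order_apps(apps, "fifo")]
fair_order = [a.app_id for a in order_apps(apps, "fair")]
```

With FIFO, `job-a` stays first because it was submitted first; with the assumed FAIR ordering, `job-b` moves ahead because it currently holds the least memory.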
  • 21. Resource Quota Management: K8s Namespace ResourceQuota
      ▪ K8s Namespace Resource Quota
          ▪ Defines resource limits
          ▪ Enforced by the quota admission-controller
      ▪ Problems
          ▪ Hard to control when resource quotas are overcommitted
          ▪ Users have no guaranteed resources
          ▪ Quota could be abused (e.g. by pending pods)
          ▪ No queueing for jobs…
          ▪ Low utilization?!
      ▪ Namespace Resource Quota is suboptimal for resource sharing between multiple tenants
  • 22. Resource Quota Management: YuniKorn Queue Capacity
      YuniKorn Queue provides an optimal solution to manage resource quotas
      ▪ A queue can map to one (or more) namespaces automatically -> zero config queue mgmt
      ▪ Capacity is elastic from min to max -> better resource sharing, ensure guarantee, enforce max
      ▪ Honor resource fairness -> avoid starving jobs/users
      ▪ Quota is only counted for pods which actually consume resources -> accurate resource counting, improve utilization
      ▪ Enable job queueing -> jobs can be queued in the scheduler, keep client side logic simple
      (diagram: a Namespace mapped to a YuniKorn Queue holding pods of CPU: 1 / Memory: 1024Mi, CPU: 2 / Memory: 2048Mi, and CPU: 2 / Memory: 2048Mi; Queue Max is CPU: 5 / Memory: 5120Mi)
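To make the quota accounting on slide 22 concrete, here is a small sketch using the numbers from the diagram (three pods that add up to the queue max of CPU 5 / 5120Mi). The `QueueQuota` class and its methods are invented for this illustration and only model the max side of the capacity; the guaranteed (min) side and YuniKorn's real data structures are not represented. The fourth pod is an extra, hypothetical request added to show queueing.

```python
class QueueQuota:
    """Toy queue: only running pods count against the max; the rest stay queued."""

    def __init__(self, max_cpu, max_mem_mi):
        self.max_cpu = max_cpu
        self.max_mem_mi = max_mem_mi
        self.used_cpu = 0
        self.used_mem_mi = 0
        self.pending = []  # list of (pod_name, cpu, mem_mi)

    def submit(self, pod_name, cpu, mem_mi):
        # Submitting never fails and never consumes quota: the request just waits.
        self.pending.append((pod_name, cpu, mem_mi))

    def schedule(self):
        """Start every pending pod that still fits under the max; return their names."""
        started, waiting = [], []
        for pod_name, cpu, mem_mi in self.pending:
            fits = (self.used_cpu + cpu <= self.max_cpu
                    and self.used_mem_mi + mem_mi <= self.max_mem_mi)
            if fits:
                self.used_cpu += cpu
                self.used_mem_mi += mem_mi
                started.append(pod_name)
            else:
                waiting.append((pod_name, cpu, mem_mi))
        self.pending = waiting
        return started

    def release(self, cpu, mem_mi):
        # Called when a running pod finishes and frees its resources.
        self.used_cpu -= cpu
        self.used_mem_mi -= mem_mi


queue = QueueQuota(max_cpu=5, max_mem_mi=5120)
queue.submit("pod-1", 1, 1024)
queue.submit("pod-2", 2, 2048)
queue.submit("pod-3", 2, 2048)
queue.submit("pod-4", 1, 1024)  # hypothetical extra pod, beyond the slide's diagram
first_round = queue.schedule()
```

In this sketch the fourth pod is neither rejected nor counted against the quota while it waits; it starts on a later `schedule()` call once a running pod releases enough resources.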
  • 23. Resource Fairness in YuniKorn Queues
      Scheduling workloads with different requests in 4 queues with different guaranteed resources. Queue: Guaranteed Resource (Mem); Requests (NumOfPods * Mem)
      ▪ root.default: 500,000; 1000 * 10
      ▪ root.search: 400,000; 500 * 10
      ▪ root.test: 100,000; 200 * 10
      ▪ root.sandbox: 100,000; 200 * 50
      Result: the usage ratios of the queues increased with a similar trend
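The "similar trend" in usage ratios on slide 23 can be reproduced with a toy simulation, assuming a greedy policy that always serves the queue with the lowest used/guaranteed ratio. That policy, the `simulate` function, and the dictionary layout are assumptions for illustration; the queue names and numbers are taken from the slide, and the slide does not state the memory unit.

```python
def simulate(queues, steps):
    """Allocate one pod per step to the queue with the lowest usage ratio.

    queues maps a queue name to a dict with keys:
    guaranteed, pod_mem, pending (pods left), used (memory allocated so far).
    Returns the usage ratio (used / guaranteed) of every queue.
    """
    for _ in range(steps):
        candidates = [q for q in queues.values() if q["pending"] > 0]
        if not candidates:
            break
        # Fairness rule (assumed): serve the queue furthest below its guarantee.
        target = min(candidates, key=lambda q: q["used"] / q["guaranteed"])
        target["used"] += target["pod_mem"]
        target["pending"] -= 1
    return {name: q["used"] / q["guaranteed"] for name, q in queues.items()}


queues = {
    "root.default": {"guaranteed": 500_000, "pod_mem": 10, "pending": 1000, "used": 0},
    "root.search": {"guaranteed": 400_000, "pod_mem": 10, "pending": 500, "used": 0},
    "root.test": {"guaranteed": 100_000, "pod_mem": 10, "pending": 200, "used": 0},
    "root.sandbox": {"guaranteed": 100_000, "pod_mem": 50, "pending": 200, "used": 0},
}
ratios = simulate(queues, steps=500)
spread = max(ratios.values()) - min(ratios.values())
```

While every queue still has pending pods, the gap between the highest and lowest ratio can never exceed the largest single step (one 50-unit pod in `root.sandbox`, i.e. 50 / 100,000), so the four ratios rise together even though the queues have different guarantees and pod sizes.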
  • 24. Scheduler Throughput Benchmark
      Schedule 50,000 pods on 2,000/4,000 nodes. Compare scheduling throughput (pods per second allocated by the scheduler). Red line: YuniKorn; green line: Default Scheduler
      ▪ 50k pods on 2k nodes: 617 vs 263 (↑ 134%)
      ▪ 50k pods on 4k nodes: 373 vs 141 (↑ 164%)
      Detail report: https://github.com/apache/incubator-yunikorn-core/blob/master/docs/evaluate-perf-function-with-Kubemark.md
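The percentages on slide 24 follow from the throughput pairs as a relative increase over the default scheduler. A quick check, where `improvement_pct` is a helper name chosen for this example:

```python
def improvement_pct(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Pods per second, YuniKorn vs. default scheduler, as reported on the slide.
gain_2k_nodes = improvement_pct(617, 263)
gain_4k_nodes = improvement_pct(373, 141)
```

Both results land just above the slide's figures (roughly 134.6 and 164.5), so the slide appears to truncate rather than round.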
  • 25. Fully K8s Compatible
      ▪ Support K8s Predicates
          ▪ Node selector
          ▪ Pod affinity/anti-affinity
          ▪ Taints and tolerations
          ▪ …
      ▪ Support PersistentVolumeClaim and PersistentVolume
          ▪ Volume bindings
          ▪ Dynamic provisioning
      ▪ Publishes key scheduling events to the K8s event system
      ▪ Works with the cluster autoscaler
      ▪ Support management commands
          ▪ cordon nodes
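  • 27. Compare YuniKorn with other K8s schedulers (v = supported, x = not supported)
      ▪ Capabilities compared: Resource Sharing (hierarchy queues, queue elastic capacity); Resource Fairness (cross queue fairness, user level fairness, app level fairness); Preemption (basic preemption, with fairness); Gang Scheduling; Bin Packing; Throughput
      ▪ K8s default scheduler: hierarchy queues x; queue elastic capacity x; cross queue fairness x; user level fairness x; app level fairness x; basic preemption v; preemption with fairness x; gang scheduling x; bin packing v; throughput 260 allocs/s (2k nodes)
      ▪ Kube-batch: hierarchy queues x; queue elastic capacity x; cross queue fairness v; user level fairness x; app level fairness v; basic preemption v; preemption with fairness x; gang scheduling v; bin packing v; throughput ? (likely slower than kube-default, from [1])
      ▪ YuniKorn: hierarchy queues v; queue elastic capacity v; cross queue fairness v; user level fairness v; app level fairness v; basic preemption v; preemption with fairness v; gang scheduling v* (YUNIKORN-2); bin packing v; throughput 610 allocs/s (2k nodes)
      [1] https://github.com/kubernetes-sigs/kube-batch/issues/930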
  • 29. Current Status
      ▪ Open sourced on July 17, 2019, under the Apache 2.0 License
      ▪ Entered the Apache Incubator on Jan 21, 2020
      ▪ Latest stable version 0.8.0 released on May 4, 2020
      ▪ Diverse community with members from Alibaba, Cloudera, Microsoft, LinkedIn, Apple, Tencent, Nvidia and more…
  • 30. The Community
      ▪ Deployed in non-production K8s clusters
      ▪ Launched 100s of large jobs per day on some of the YuniKorn queues
      ▪ Reduced our large job scheduler latency by a factor of ~3x at peak time
      ▪ K8s cluster overall resource utilization efficiency (cost per compute) improved over the default kube-scheduler for mixed workloads
      ▪ FIFO and FAIR requests are more frequently met than before
      ▪ Shipping with Cloudera Public Cloud offerings
      ▪ Provide resource quota management and advanced job scheduling capabilities for Spark
      ▪ Responsible for both micro-service and batch job scheduling
      ▪ Running on Cloud with auto-scaling enabled
      ▪ Deployed on a pre-production on-prem cluster with ~100 nodes
      ▪ Plan to deploy on a 1000+ node production K8s cluster this year
      ▪ Leverage YuniKorn features such as hierarchy queues and resource fairness to run large scale Flink jobs on K8s
      ▪ Gained 4x scheduling performance improvements
  • 31. Roadmap
      ▪ Current (0.8.0)
          ● Hierarchy queues
          ● Cross queue fairness
          ● Fair/FIFO job ordering policies
          ● Fair/Bin-packing node sorting policies
          ● Self queue management
          ● Pluggable app discovery
          ● Metrics system and Prometheus integration
      ▪ Upcoming (0.9.0), 3rd quarter of 2020
          ● Gang Scheduling
          ● Job/task priority support (scheduling & preemption)
          ● Support Spark dynamic allocation
  • 32. Our Vision - Resource Mgmt for Big Data & ML: Unified Compute Platform for Big Data & ML
      ▪ Computes: Data Engineering, Realtime Streaming, Machine Learning
      ▪ Types: Micro services, batch jobs, long running workloads, interactive sessions, model serving
      ▪ Targets: Multi-tenancy, SLA, Resource Utilization, Cost Mgmt, Budget
  • 33. Join us in the YuniKorn Community!
      ▪ Project web site: http://yunikorn.apache.org/
      ▪ Github repo: apache/incubator-yunikorn-core
      ▪ Mailing list: dev@yunikorn.apache.org
      ▪ Slack channel:
      ▪ Bi-weekly/Monthly sync up meetings for different time zones