© Cloudera, Inc. All rights reserved.
YuniKorn: Next Generation Scheduler for YARN, K8s & the cloud
Vinod Kumar Vavilapalli & Sunil Govindan
SPEAKERS
SUNIL GOVINDAN
Engineering Manager @Cloudera
@sunilgovind
Apache Hadoop PMC & Committer
Apache Hadoop since 2012
VINOD KUMAR VAVILAPALLI
Director of Engineering @Cloudera
@tshooter
Apache Hadoop VP & PMC Chair
Apache Hadoop since 2007
AGENDA
Where are we today?
Introducing YuniKorn
YuniKorn Deep Dive
Future & Open Source Story
CUSTOMER JOURNEY - BIG DATA ECOSYSTEM
BIG DATA ECOSYSTEM - TODAY
[Diagram: batch workloads, deep learning apps, Hive on LLAP and services running on storage and compute (on-prem/on-cloud) and public cloud storage]
The New Trends
Moving to Cloud
Big Data & Containerization
Mixed Workloads
WHERE ARE WE TODAY?
RESOURCE ORCHESTRATOR PERSPECTIVE - STRENGTHS
APACHE YARN (Big Data Ecosystem)
• Big Data: Optimized to run Big Data workloads
• Batch Workloads: High-throughput scheduling for batch workloads
• SLA: Better SLA for Big Data workloads
• Multi-tenant: Quota management
KUBERNETES (Cloud Ecosystem)
• Services: Optimized for containerized microservices
• Networking: Strong network management support
• Cloud Aware: Better tuned for cloud use cases
• Storage: Persistent volumes
CLOUD NATIVE (Public Cloud)
• Cost: Budget-centric
• ∞: Infinite resources for infinite $$
APACHE YARN WORLD
What's happening in YARN today?
Strengthening Dynamic Environments
• Improve YARN to work well for cloud (public & private)
• Focus on autoscaling, smarter scheduling, etc.
Refer: YARN-9548
Improving capabilities for persistent volumes
• Added CSI (Container Storage Interface) support
• Enhancing the CSI implementation to support storage such as S3, Ozone, etc. as mounted volumes in YARN containers
Native Service enhancements
• Improving native services support in YARN
• Microservice upgrades
KUBERNETES WORLD
What's happening in Kubernetes today?
Demand for better support of batch workloads
• Early efforts to support batch scheduling are in progress in the K8s community.
Efforts on running Spark on K8s
• The Spark and Kubernetes communities are working towards Spark on K8s deployments.
• Gaps in running Spark, such as dynamic resource allocation and security, are still open.
CDP and Kubernetes
• Cloudera CDP Experiences will be running on Kubernetes
Enter the YuniKorn!
YuniKorn (/ˈyo͞onəˌkôrn/, Y for YARN, K for K8s, uni- for Unified)
• A common resource scheduler
• Platform independent
• Enhanced scheduling capabilities
WHAT YuniKorn IS (AND IS NOT)
YuniKorn is
• A better scheduler for the K8s world, for services and batch workloads
• A unified scheduler for the YARN world (FIFO, Fair and Capacity Scheduler)
• A unified resource scheduling experience across YARN and K8s (and beyond)
• Suitable for both the finite-resource (datacenter) and infinite-resource/dollar (cloud) worlds
YuniKorn is NOT
• A system to port YARN applications to run on K8s without modification, or vice versa
SCHEDULING: What do we mean anyways?
On isolation, capacity allocations, scheduling
Faster!
More! Best for my cluster
Throughput
Utilization
Elasticity
Service uptime
Security
ROI
Everything! Right now!
SLA!
WHAT IS HAPPENING NOW
CHALLENGES - APACHE YARN
APACHE YARN WORLD
• Managing two major YARN schedulers for different use cases is a challenge
• Need to deploy and manage containerized microservices in the same Hadoop cluster
• Need strong networking and persistent volume support for services
• Need more powerful autoscaling and budget control in the public cloud
WHAT IS HAPPENING NOW
CHALLENGES - KUBERNETES
KUBERNETES WORLD
• Running Big Data workloads alongside microservices is a challenge
• Need much better quota management and better SLAs
• No first-class application management concept for Big Data workloads
SO HOW DO WE SOLVE THIS ?
Assessment
Native Big Data Apps: Moving batch workloads (MR, Tez, etc.) from YARN to Kubernetes while keeping high throughput, low latency and the notion of a job is costly.
Adaptability: A platform optimized for services exposes hard-to-bridge gaps when running batch workloads, such as running some workloads with or without Docker.
Services on YARN: Services and web farms can run on YARN, but it is not as feature-rich as Kubernetes.
Multiple Schedulers (YARN): Different schedulers focus on specific use cases, so it is not easy to drive continuous feature enhancements.
One cannot replace another: Neither Kubernetes nor YARN can replace the other in the near future, given some fundamental architectural differences.
EXPENSIVE
Higher cost to achieve the goal
WHERE DOES YuniKorn PLAY?
Not Optimized: Not optimized to balance the scheduling needs of batch workloads against those of web farms or services.
Poor Resource Utilization: Not able to effectively utilize all cluster resources for services and workloads.
FRAGMENTED
Multiple YARN & Kubernetes schedulers are
UNIFIED RESOURCE SCHEDULER AND APPLICATION MANAGEMENT
What we need is an effort to improve both the YARN and Kubernetes scheduling worlds.
Multiple schedulers power YARN & Kubernetes for different use cases
YuniKorn - A UNIFIED RESOURCE SCHEDULER
Explore the feature set
Capacity Planning
Divide cluster resources into resource pools (queues) and define capacity ranges based on needs. Enforce resource quotas and limits.
Resource Scheduling
Resource fairness, preemption, high throughput, multi-tenancy, placement, etc.
Application Management
A central place to monitor application states.
Resource Monitoring
A unified view of cluster resources, with a dashboard to easily track resource usage by queue, user or organization.
...
CAPACITY PLANNING
Hierarchy of queues
Queues can be organized by user group or organization, with multiple levels.
Elastic Capacity
Each queue has a min-max capacity; usage is elastic within this range for multiple users.
Resource Quotas
Resource caps for queues or users: a limited amount of resources, number of applications, etc.
Partition
A set of instances (nodes) that are physically isolated
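The queue hierarchy, elastic min-max capacity and quota ideas on this slide can be illustrated with a small model. This is only a sketch under stated assumptions: the `Queue` class, its field names and the single integer resource (think vcores) are hypothetical and are not YuniKorn's actual data model or configuration format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Queue:
    """One node in a queue hierarchy (hypothetical model, not YuniKorn's API)."""
    name: str
    guaranteed: int   # min capacity the queue is entitled to
    maximum: int      # max capacity (quota cap) the queue may grow to
    used: int = 0
    parent: Optional["Queue"] = field(default=None, repr=False, compare=False)
    children: List["Queue"] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: "Queue") -> "Queue":
        child.parent = self
        self.children.append(child)
        return child

    @property
    def borrowed(self) -> int:
        """Usage above the guaranteed share, i.e. the elastic part."""
        return max(0, self.used - self.guaranteed)

    def can_allocate(self, amount: int) -> bool:
        """An allocation fits only if no queue on the path to the root exceeds its max."""
        q: Optional[Queue] = self
        while q is not None:
            if q.used + amount > q.maximum:
                return False
            q = q.parent
        return True

    def allocate(self, amount: int) -> bool:
        """Charge the allocation to this queue and every ancestor, or reject it."""
        if not self.can_allocate(amount):
            return False
        q: Optional[Queue] = self
        while q is not None:
            q.used += amount
            q = q.parent
        return True


root = Queue("root", guaranteed=100, maximum=100)
eng = root.add_child(Queue("root.engineering", guaranteed=40, maximum=80))
sales = root.add_child(Queue("root.sales", guaranteed=20, maximum=60))

ok_eng = eng.allocate(70)        # above the guarantee of 40 but below the max of 80
rejected = sales.allocate(40)    # within sales' own max, but root would exceed 100
ok_sales = sales.allocate(30)    # fills the cluster exactly
print(ok_eng, rejected, ok_sales, eng.borrowed)
```

Checking every ancestor is what makes the hierarchy meaningful: a leaf queue can be within its own limits and still be rejected because its parent's quota is exhausted.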
RESOURCE SCHEDULING
Resource Fairness
Queue/user/app-level fairness ensures each entity gets its fair share of resources.
Priorities
Queue priority + app priority
Preemption
Queues that demand more resources can preempt resources from other queues for high-priority apps.
Placement Constraints
Affinity/anti-affinity, node constraints, etc.
[Diagram: YuniKorn schedules both services (low latency, long running) and batch workloads (high throughput, short-lived)]
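To make fairness, priority and preemption concrete, here is a minimal sketch of one way they can interact. It assumes a single integer resource, a full cluster (no free capacity) and a simple "usage divided by guaranteed share" fairness ratio; `QueueState`, `scheduling_order` and `preemption_plan` are hypothetical names, and this is not the algorithm YuniKorn itself implements.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueueState:
    name: str
    guaranteed: int   # fair share of the queue; assumed to be > 0
    used: int         # resources currently allocated
    pending: int      # outstanding demand
    priority: int = 0


def scheduling_order(queues: List[QueueState]) -> List[QueueState]:
    """Higher priority first; within a priority, the queue furthest below its share."""
    demanding = [q for q in queues if q.pending > 0]
    return sorted(demanding, key=lambda q: (-q.priority, q.used / q.guaranteed))


def preemption_plan(queues: List[QueueState]) -> Dict[str, int]:
    """How much to reclaim from over-share queues so starved queues reach their share."""
    shortfall = sum(
        min(q.pending, q.guaranteed - q.used)
        for q in queues
        if q.pending > 0 and q.used < q.guaranteed
    )
    plan: Dict[str, int] = {}
    # Reclaim from the queues that are furthest above their guaranteed share first.
    for q in sorted(queues, key=lambda q: q.used - q.guaranteed, reverse=True):
        if shortfall <= 0:
            break
        surplus = q.used - q.guaranteed
        if surplus > 0:
            take = min(surplus, shortfall)
            plan[q.name] = take
            shortfall -= take
    return plan


queues = [
    QueueState("batch", guaranteed=40, used=70, pending=10),
    QueueState("services", guaranteed=40, used=20, pending=30),
    QueueState("adhoc", guaranteed=20, used=10, pending=0),
]
order = [q.name for q in scheduling_order(queues)]
plan = preemption_plan(queues)
print(order, plan)
```

Only usage above a queue's guaranteed share is eligible for reclaim here, so a queue is never preempted below what it was promised.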
RESOURCE MONITORING
Common dashboard to monitor resources
Hierarchy of queues
Cluster resources are divided into a hierarchy of queues; all queue state is visible.
Common View
A common view of resources, cross-platform.
Resource Centric
Focus on resources, total/available/used, and all
[Screenshot: Resource Dashboard]
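A resource-centric view over a queue hierarchy boils down to rolling leaf usage up into every parent queue and reporting total/used/available. The sketch below assumes dotted queue names such as `root.engineering.spark` and a single integer resource; the function names are hypothetical and do not correspond to any YuniKorn endpoint.

```python
from collections import defaultdict
from typing import Dict


def roll_up(leaf_usage: Dict[str, int]) -> Dict[str, int]:
    """Sum leaf-queue usage into every ancestor, based on dotted queue names."""
    totals: Dict[str, int] = defaultdict(int)
    for path, used in leaf_usage.items():
        parts = path.split(".")
        for depth in range(1, len(parts) + 1):
            totals[".".join(parts[:depth])] += used
    return dict(totals)


def summary(total: int, used: int) -> Dict[str, int]:
    """The total/used/available triple a dashboard would show for one queue."""
    return {"total": total, "used": used, "available": total - used}


usage = roll_up({
    "root.engineering.spark": 30,
    "root.engineering.web": 10,
    "root.sales": 20,
})
cluster = summary(total=100, used=usage["root"])
print(usage)
print(cluster)
```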
APPLICATION MANAGEMENT
Track applications in a consistent fashion
• Application-oriented GUI to manage workloads (instead of individual pods in K8s).
• Entire application lifecycle is visible and trackable.
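Tracking applications rather than individual pods implies grouping pods under an application identity and deriving one state per application. The following sketch assumes pods carry an application ID in a label (the key `applicationId` is an assumption made for this example) and uses the standard Kubernetes pod phases; the application state names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

APP_LABEL = "applicationId"  # assumed label key, chosen for illustration


def group_by_app(pods: List[dict]) -> Dict[str, List[dict]]:
    """Group pod records by application ID; pods without the label are skipped."""
    apps: Dict[str, List[dict]] = defaultdict(list)
    for pod in pods:
        app_id = pod.get("labels", {}).get(APP_LABEL)
        if app_id is not None:
            apps[app_id].append(pod)
    return dict(apps)


def app_state(pods: List[dict]) -> str:
    """Collapse pod phases into a single application-level state."""
    phases = {p["phase"] for p in pods}
    if "Failed" in phases:
        return "Failed"
    if phases == {"Succeeded"}:
        return "Completed"
    if "Running" in phases:
        return "Running"
    return "Pending"


pods = [
    {"name": "spark-driver", "labels": {APP_LABEL: "spark-001"}, "phase": "Running"},
    {"name": "spark-exec-1", "labels": {APP_LABEL: "spark-001"}, "phase": "Pending"},
    {"name": "etl-1", "labels": {APP_LABEL: "etl-007"}, "phase": "Succeeded"},
    {"name": "unlabelled", "labels": {}, "phase": "Running"},
]
states = {app: app_state(members) for app, members in group_by_app(pods).items()}
print(states)
```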
ARCHITECTURE
[Architecture diagram]
YuniKorn Core: Scheduler
Scheduler Interface
• gRPC/API: resource requests, new application, node updates
• gRPC/API: container allocation, preemption
Shim
• Kubernetes: Master (api-server, etcd), kubelets
• YARN: Resource Manager, Node Managers
Master node, slave nodes
Workloads: MR, Spark, Flink, Tez; MySQL, Spark, Web Server, Kafka
Client API: allocate, release container
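The split between a platform-neutral core and a platform-specific shim can be sketched as follows. This is an in-process illustration only: the real interface is described on the slide as gRPC/API, and every class, method and field name below (`SchedulerCore`, `KubernetesShim`, `AllocationAsk`, first-fit placement) is a hypothetical stand-in rather than YuniKorn's actual scheduler interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AllocationAsk:
    """Platform-neutral resource request sent from a shim to the core."""
    app_id: str
    ask_id: str
    vcores: int


@dataclass(frozen=True)
class Allocation:
    """Platform-neutral decision returned by the core."""
    ask_id: str
    node_id: str


class SchedulerCore:
    """Knows nothing about pods or YARN containers, only nodes and asks."""

    def __init__(self) -> None:
        self.free: Dict[str, int] = {}
        self.pending: List[AllocationAsk] = []

    def update_node(self, node_id: str, vcores: int) -> None:
        self.free[node_id] = vcores

    def submit(self, ask: AllocationAsk) -> None:
        self.pending.append(ask)

    def schedule(self) -> List[Allocation]:
        """First-fit placement; asks that do not fit stay pending."""
        allocations: List[Allocation] = []
        still_pending: List[AllocationAsk] = []
        for ask in self.pending:
            node = next((n for n, free in self.free.items() if free >= ask.vcores), None)
            if node is None:
                still_pending.append(ask)
            else:
                self.free[node] -= ask.vcores
                allocations.append(Allocation(ask.ask_id, node))
        self.pending = still_pending
        return allocations


class KubernetesShim:
    """Translates pod and node events into core messages and decisions into bindings."""

    def __init__(self, core: SchedulerCore) -> None:
        self.core = core

    def on_node(self, name: str, cpu: int) -> None:
        self.core.update_node(name, cpu)

    def on_pod(self, pod: dict) -> None:
        self.core.submit(AllocationAsk(pod["app"], pod["name"], pod["cpu"]))

    def bind_all(self) -> Dict[str, str]:
        """Return pod name -> node name for everything the core placed."""
        return {a.ask_id: a.node_id for a in self.core.schedule()}


core = SchedulerCore()
shim = KubernetesShim(core)
shim.on_node("node-1", 4)
shim.on_node("node-2", 8)
shim.on_pod({"name": "spark-driver", "app": "spark-001", "cpu": 2})
shim.on_pod({"name": "spark-exec-1", "app": "spark-001", "cpu": 6})
shim.on_pod({"name": "too-big", "app": "etl-007", "cpu": 16})
bindings = shim.bind_all()
print(bindings, len(core.pending))
```

Because the core only sees `AllocationAsk` and `Allocation`, a second shim for another platform could reuse it unchanged, which is the point of putting an interface between the two.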
YuniKorn SCHEDULER vs. OTHERS
Disclosure: this table is summarized based on speakers’ analysis
Capability                  Kube-default    Kube-batch    YARN CS/FS      YuniKorn
Resource Sharing
  Hierarchy queues          x               x             v               v
  Queue priority            x               x             v               x (wip)
  Queue elastic capacity    x               x             v               v
Resource Fairness
  Cross queue fairness      x               x             v               v
  User level fairness       x               x             v               v
  App level fairness        x               v             v               v
Preemption
  Basic preemption          v               v             v               v
  With fairness             x               x             v               v
  With priority             v               v             v               x (wip)
Throughput                  100+ allocs/s   ?             4k+ allocs/s    ?
Gang Scheduling             x               v             x               x (wip)
Key capabilities of a resource scheduler from our perspective
TAKE AWAY
Scenario 1
  BEFORE: K8s cluster on cloud/prem with existing K8s schedulers
  AFTER: K8s cluster on cloud/prem with YuniKorn scheduler
Scenario 2
  BEFORE: K8s cluster with existing K8s schedulers; YARN cluster with Capacity Scheduler and Fair Scheduler
  AFTER: K8s cluster and YARN cluster with YuniKorn Scheduler
Scenario 3
  BEFORE: YARN cluster with Capacity Scheduler and Fair Scheduler
  AFTER: YARN cluster with YuniKorn scheduler
OPEN SOURCE
Yes, YuniKorn is now open source: https://github.com/cloudera/yunikorn-core
Contributions welcome!
Join us on Slack!
ACKNOWLEDGMENTS
A big shout out to the folks who helped to design, develop and make this project
possible.
❏ Wangda Tan
❏ Weiwei Yang
❏ Wilfred Spiegelenburg
❏ Akhil PB
❏ Suma Shivaprasad
❏ and many others...
THANK YOU

More Related Content

What's hot

Getting Started with Apache Geode
Getting Started with Apache GeodeGetting Started with Apache Geode
Getting Started with Apache GeodeJohn Blum
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with KubernetesTed Dunning
 
Spark on Dataproc - Israel Spark Meetup at taboola
Spark on Dataproc - Israel Spark Meetup at taboolaSpark on Dataproc - Israel Spark Meetup at taboola
Spark on Dataproc - Israel Spark Meetup at taboolatsliwowicz
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in KubernetesTed Dunning
 
Using Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesUsing Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesVMware Tanzu
 
Java EE Modernization with Mesosphere DCOS
Java EE Modernization with Mesosphere DCOSJava EE Modernization with Mesosphere DCOS
Java EE Modernization with Mesosphere DCOSMesosphere Inc.
 
Lessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersLessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersBlueData, Inc.
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupRommel Garcia
 
Webinar: What's New in DC/OS 1.11
Webinar: What's New in DC/OS 1.11Webinar: What's New in DC/OS 1.11
Webinar: What's New in DC/OS 1.11Mesosphere Inc.
 
Cloud Native PostgreSQL
Cloud Native PostgreSQLCloud Native PostgreSQL
Cloud Native PostgreSQLEDB
 
Hadoop World 2011: Hadoop as a Service in Cloud
Hadoop World 2011: Hadoop as a Service in CloudHadoop World 2011: Hadoop as a Service in Cloud
Hadoop World 2011: Hadoop as a Service in CloudCloudera, Inc.
 
Serverless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseServerless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseArun Kejariwal
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambariHortonworks
 
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...Jeremy Eder
 
#VirtualDesignMaster 3 Challenge 2 - Dennis George
#VirtualDesignMaster 3 Challenge 2 - Dennis George#VirtualDesignMaster 3 Challenge 2 - Dennis George
#VirtualDesignMaster 3 Challenge 2 - Dennis Georgevdmchallenge
 
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XDScale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XDVMware Tanzu
 
Docker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDocker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDataWorks Summit
 
CNCF Live Webinar: Kubernetes 1.23
CNCF Live Webinar: Kubernetes 1.23CNCF Live Webinar: Kubernetes 1.23
CNCF Live Webinar: Kubernetes 1.23LibbySchulze
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Big Data Joe™ Rossi
 

What's hot (20)

Getting Started with Apache Geode
Getting Started with Apache GeodeGetting Started with Apache Geode
Getting Started with Apache Geode
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with Kubernetes
 
Spark on Dataproc - Israel Spark Meetup at taboola
Spark on Dataproc - Israel Spark Meetup at taboolaSpark on Dataproc - Israel Spark Meetup at taboola
Spark on Dataproc - Israel Spark Meetup at taboola
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in Kubernetes
 
Using Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesUsing Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest Airlines
 
Java EE Modernization with Mesosphere DCOS
Java EE Modernization with Mesosphere DCOSJava EE Modernization with Mesosphere DCOS
Java EE Modernization with Mesosphere DCOS
 
Lessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersLessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker Containers
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User Group
 
Webinar: What's New in DC/OS 1.11
Webinar: What's New in DC/OS 1.11Webinar: What's New in DC/OS 1.11
Webinar: What's New in DC/OS 1.11
 
Cloud Native PostgreSQL
Cloud Native PostgreSQLCloud Native PostgreSQL
Cloud Native PostgreSQL
 
Apache Slider
Apache SliderApache Slider
Apache Slider
 
Hadoop World 2011: Hadoop as a Service in Cloud
Hadoop World 2011: Hadoop as a Service in CloudHadoop World 2011: Hadoop as a Service in Cloud
Hadoop World 2011: Hadoop as a Service in Cloud
 
Serverless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseServerless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the Enterprise
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambari
 
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...
KubeCon 2017 - Kubernetes SIG Scheduling and Resource Management Working Grou...
 
#VirtualDesignMaster 3 Challenge 2 - Dennis George
#VirtualDesignMaster 3 Challenge 2 - Dennis George#VirtualDesignMaster 3 Challenge 2 - Dennis George
#VirtualDesignMaster 3 Challenge 2 - Dennis George
 
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XDScale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD
 
Docker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDocker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhere
 
CNCF Live Webinar: Kubernetes 1.23
CNCF Live Webinar: Kubernetes 1.23CNCF Live Webinar: Kubernetes 1.23
CNCF Live Webinar: Kubernetes 1.23
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0
 

Similar to Cloudera DataTalks 2019 Bangalore - YuniKorn A next generation scheduler for YARN, K8s

Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2DataWorks Summit
 
One Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupOne Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupAndrei Savu
 
One Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsOne Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsCloudera, Inc.
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudDataWorks Summit
 
Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Cloudera, Inc.
 
Docker for any type of workload and any IT Infrastructure
Docker for any type of workload and any IT InfrastructureDocker for any type of workload and any IT Infrastructure
Docker for any type of workload and any IT InfrastructureDocker, Inc.
 
Mesos and the Architecture of the New Datacenter
Mesos and the Architecture of the New DatacenterMesos and the Architecture of the New Datacenter
Mesos and the Architecture of the New DatacenterQAware GmbH
 
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera, Inc.
 
Running and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStackRunning and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStackVictor Palma
 
Episode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleEpisode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleMesosphere Inc.
 
Zero-to-Hero: Running Postgres in Kubernetes
Zero-to-Hero: Running Postgres in KubernetesZero-to-Hero: Running Postgres in Kubernetes
Zero-to-Hero: Running Postgres in KubernetesEDB
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesDoKC
 
Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015assafleb
 
Spark One Platform Webinar
Spark One Platform WebinarSpark One Platform Webinar
Spark One Platform WebinarCloudera, Inc.
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesDataWorks Summit
 
Running Kubernetes Workloads on Oracle Cloud Infrastructure
Running Kubernetes Workloads on Oracle Cloud InfrastructureRunning Kubernetes Workloads on Oracle Cloud Infrastructure
Running Kubernetes Workloads on Oracle Cloud InfrastructureOracle Developers
 
Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...NuoDB
 

Similar to Cloudera DataTalks 2019 Bangalore - YuniKorn A next generation scheduler for YARN, K8s (20)

Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2
 
One Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupOne Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data Meetup
 
One Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsOne Hadoop, Multiple Clouds
One Hadoop, Multiple Clouds
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
 
Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?
 
Docker for any type of workload and any IT Infrastructure
Docker for any type of workload and any IT InfrastructureDocker for any type of workload and any IT Infrastructure
Docker for any type of workload and any IT Infrastructure
 
Mesos and the Architecture of the New Datacenter
Mesos and the Architecture of the New DatacenterMesos and the Architecture of the New Datacenter
Mesos and the Architecture of the New Datacenter
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
 
Running and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStackRunning and Managing Kubernetes on OpenStack
Running and Managing Kubernetes on OpenStack
 
Episode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleEpisode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at Scale
 
Zero-to-Hero: Running Postgres in Kubernetes
Zero-to-Hero: Running Postgres in KubernetesZero-to-Hero: Running Postgres in Kubernetes
Zero-to-Hero: Running Postgres in Kubernetes
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on Kubernetes
 
Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015
 
Spark One Platform Webinar
Spark One Platform WebinarSpark One Platform Webinar
Spark One Platform Webinar
 
Apex day 1.0 fastest route to cloud sept 2015_julian lane
Apex day 1.0 fastest route to cloud sept 2015_julian laneApex day 1.0 fastest route to cloud sept 2015_julian lane
Apex day 1.0 fastest route to cloud sept 2015_julian lane
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond Kubernetes
 
Running Kubernetes Workloads on Oracle Cloud Infrastructure
Running Kubernetes Workloads on Oracle Cloud InfrastructureRunning Kubernetes Workloads on Oracle Cloud Infrastructure
Running Kubernetes Workloads on Oracle Cloud Infrastructure
 
The rise of microservices
The rise of microservicesThe rise of microservices
The rise of microservices
 
Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...
 

Recently uploaded

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 


Cloudera DataTalks 2019 Bangalore - YuniKorn A next generation scheduler for YARN, K8s

  • 1. © Cloudera, Inc. All rights reserved. YuniKorn : Next Generation Scheduler for YARN, K8s & the cloud Vinod Kumar Vavilapalli & Sunil Govindan
  • 2. SPEAKERS
      • SUNIL GOVINDAN (@sunilgovind): Engineering Manager @Cloudera; Apache Hadoop PMC & Committer; Apache Hadoop since 2012
      • VINOD KUMAR VAVILAPALLI (@tshooter): Director of Engineering @Cloudera; Apache Hadoop VP & PMC Chair; Apache Hadoop since 2007
  • 3. AGENDA: (1) Where are we today? (2) Introducing YuniKorn (3) YuniKorn Deep Dive (4) Future & Open Source Story
  • 4. CUSTOMER JOURNEY - BIG DATA ECOSYSTEM. BIG DATA ECOSYSTEM - TODAY [Diagram labels: Batch workloads; Deep learning apps; Hive on LLAP; Services; Compute (on-prem/on-cloud); Storage; Public cloud storage]
  • 5. The New Trends: Moving to Cloud; Big Data & Containerization; Mixed Workloads
  • 6. WHERE ARE WE TODAY? RESOURCE ORCHESTRATOR PERSPECTIVE - STRENGTHS
      Platforms compared: APACHE YARN (Big Data Ecosystem), KUBERNETES (Cloud Ecosystem), CLOUD NATIVE (Public Cloud)
      • Big Data: Optimized to run Big Data workloads
      • Batch Workloads: High-throughput scheduling for batch workloads
      • SLA: Better SLA for Big Data workloads
      • Multi tenant: Quota management
      • Services: Optimized for containerized microservices
      • Networking: Strong network management support
      • Cloud Aware: Better tuned for cloud use cases
      • Storage: Persistent volumes
      • Cost: Budget centric
      • ∞: Infinite resources for infinite $$
  • 7. APACHE YARN WORLD: What's happening in YARN today?
      • Strengthening Dynamic Environments: Improve YARN to work well for cloud (public & private). Focus on autoscaling, smarter scheduling, etc. Refer: YARN-9548
      • Improving capabilities for persistent volumes: Added CSI (Container Storage Interface) support. Enhancing the CSI implementation to support storage such as S3, Ozone, etc. as mounted volumes in YARN containers
      • Native Service enhancements: Improving native services support in YARN. Microservice upgrades
  • 8. KUBERNETES WORLD: What's happening in Kubernetes today?
      • Demand for better support of batch workloads: Early efforts to support batch scheduling are in progress in the K8s community.
      • Efforts on running Spark on K8s: The Spark and Kubernetes communities are working towards Spark on K8s deployments. Gaps in running Spark, such as dynamic resource allocation and security, are still open.
      • CDP and Kubernetes: Cloudera CDP Experiences will be running on Kubernetes.
  • 9. AGENDA: (1) Where are we today? (2) Introducing YuniKorn (3) YuniKorn Deep Dive (4) Future & Open Source Story
  • 10. Enter the YuniKorn! YuniKorn (/ˈyo͞onəˌkôrn/, Y for YARN, K for K8s, uni- for Unified)
      • A common resource scheduler
      • Platform independent
      • Enhanced scheduling capabilities
  • 11. WHAT YuniKorn IS (AND IS NOT)
      YuniKorn is:
      • A better scheduler for the K8s world, for services and batch workloads
      • A unified scheduler for the YARN world (FIFO, Fair and Capacity Scheduler)
      • A unified resource scheduling experience across YARN and K8s (and beyond)
      • Suitable for both the finite-resource (datacenter) and infinite-resources/dollars (cloud) worlds
      YuniKorn is NOT:
      • A system to port YARN applications to run on K8s without modification, or vice versa
  • 12. SCHEDULING: What do we mean anyways? On isolation, capacity allocations, scheduling. [Slide callouts: Faster! More! Best for my cluster; Throughput; Utilization; Elasticity; Service uptime; Security; ROI; Everything! Right now! SLA!]
  • 13. WHAT IS HAPPENING NOW: CHALLENGES - APACHE YARN WORLD
      • Challenges in managing TWO MAJOR YARN schedulers for different use cases
      • Should deploy and manage containerized microservices in the same Hadoop cluster
      • Need strong networking and persistent volume support for services
      • Need more powerful autoscaling and budget control in the public cloud
  • 14. WHAT IS HAPPENING NOW: CHALLENGES - KUBERNETES WORLD
      • Challenges in running Big Data workloads together with microservices
      • Need much better quota management and better SLA
      • No first-class application management concept for Big Data workloads
  • 15. SO HOW DO WE SOLVE THIS? Assessment: EXPENSIVE (higher cost to achieve the goal)
      • Native Big Data Apps: Moving batch workloads (MR, Tez, ...) from YARN to Kubernetes while keeping high throughput, low latency and the notion of a job is costly.
      • Adaptability: A platform optimized for services exposes hard-to-bridge gaps when running batch workloads, such as running a few workloads with or without Docker.
      • Services on YARN: Services and web farms can run on YARN; however, it is not as feature-rich as Kubernetes.
      • Multiple Schedulers (YARN): Different schedulers are focused on specific use cases, and it is not easy to drive continuous feature enhancements.
      • One cannot replace another: Neither Kubernetes nor YARN can replace the other in the near future, considering some of the fundamental architecture differences.
  • 16. WHERE DOES YuniKorn PLAY?
      • Not Optimized: Not optimized to balance use cases such as batch workloads against needs such as running web farms or services, with respect to scheduling challenges.
      • Poor Resource Utilization: Not able to effectively utilize the complete resources in the cluster for services and workloads.
      • FRAGMENTED: Multiple schedulers power YARN & Kubernetes for different use cases.
      • UNIFIED RESOURCE SCHEDULER AND APPLICATION MANAGEMENT: What we need is an effort to improve both YARN and Kubernetes scheduling worlds.
  • 17. AGENDA: (1) Where are we today? (2) Introducing YuniKorn (3) YuniKorn Deep Dive (4) Future & Open Source Story
  • 18. YuniKorn - A UNIFIED RESOURCE SCHEDULER: Explore the feature set
      • Capacity Planning: Divide cluster resources into resource pools (queues) and define capacity ranges based on needs. Enforce resource quotas and limits.
      • Resource Scheduling: Resource fairness, preemption, high throughput, multi-tenancy, placement, etc.
      • Application Management: A central place to monitor application states.
      • Resource Monitoring: A unified view of cluster resources; a dashboard to easily track resource usage by queue, user or organization.
  • 19. CAPACITY PLANNING
      • Hierarchy of queues: Queues can be organized according to user groups or organizations, with multiple levels.
      • Elastic Capacity: Each queue has a min-max capacity; usage is elastic within this range for multiple users.
      • Resource Quotas: Resource caps for queues or users. Limits on the amount of resources, number of applications, etc.
      • Partition: A set of instances (nodes) that are physically isolated.
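The queue hierarchy and min-max (elastic) capacity described on slide 19 can be illustrated with a small sketch. This is not YuniKorn's API or configuration format: the `Queue` class, its field names and the single integer "resource unit" are assumptions made for illustration only. The idea shown is that a queue may grow past its minimum, but an allocation is accepted only if the queue and every ancestor stay within their maximum.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Queue:
    """One node in a hypothetical queue hierarchy (illustrative only)."""

    name: str
    min_capacity: int  # guaranteed share, in arbitrary resource units
    max_capacity: int  # hard cap; usage may float between min and max
    used: int = 0
    parent: Queue | None = field(default=None, repr=False)
    children: list[Queue] = field(default_factory=list, repr=False)

    def add_child(self, child: Queue) -> Queue:
        """Attach a child queue and return it for convenient chaining."""
        child.parent = self
        self.children.append(child)
        return child

    def can_allocate(self, amount: int) -> bool:
        """True if this queue and all ancestors stay within max_capacity."""
        node: Queue | None = self
        while node is not None:
            if node.used + amount > node.max_capacity:
                return False
            node = node.parent
        return True

    def allocate(self, amount: int) -> bool:
        """Charge the allocation to this queue and every ancestor."""
        if not self.can_allocate(amount):
            return False
        node: Queue | None = self
        while node is not None:
            node.used += amount
            node = node.parent
        return True
```

For example, with a root queue of 100 units and a child `engineering` queue defined as min 40 / max 80, a request for 70 units succeeds (elastic growth above the minimum), while a further request for 20 units is rejected because it would exceed the child's maximum.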
  • 20. RESOURCE SCHEDULING
      • Resource Fairness: Queue/user/app level fairness ensures each entity gets its own fair share of resources.
      • Priorities: Queue priority + app priority.
      • Preemption: Queues that demand more resources can preempt resources from other queues for high-priority apps.
      • Placement Constraints: Affinity/anti-affinity, node constraints, etc.
      [Diagram: YuniKorn scheduling both Services (low latency, long running) and Batch (high throughput, short lived)]
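Queue-level fairness, as listed on slide 20, is often implemented by repeatedly serving the queue that is furthest below its share. The sketch below shows that general technique under stated assumptions; it is not taken from the YuniKorn code base. `QueueState`, `fair_share_allocate`, unit-sized requests and a strictly positive `guaranteed` value are all hypothetical simplifications.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class QueueState:
    name: str
    guaranteed: int   # fair share in resource units (assumed > 0)
    used: int = 0     # units currently allocated
    pending: int = 0  # outstanding unit-sized requests


def fair_share_allocate(queues: list[QueueState], free: int) -> dict[str, int]:
    """Hand out free units one at a time to the least-served queue.

    "Least served" means the lowest used/guaranteed ratio among queues
    that still have pending requests. Ties go to the earlier queue.
    """
    granted = {q.name: 0 for q in queues}
    while free > 0:
        candidates = [q for q in queues if q.pending > 0]
        if not candidates:
            break  # nothing left to schedule
        neediest = min(candidates, key=lambda q: q.used / q.guaranteed)
        neediest.used += 1
        neediest.pending -= 1
        granted[neediest.name] += 1
        free -= 1
    return granted
```

With a `batch` queue at 30 of 60 guaranteed units and a `services` queue at 30 of 40, most newly freed capacity goes to `batch` until both queues reach a similar usage ratio, after which the two alternate.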
  • 21. RESOURCE MONITORING: Common dashboard to monitor resources
      • Hierarchy of queues: Cluster resources are divided into a hierarchy of queues; all queue state is visible.
      • Common View: A common, cross-platform view of resources.
      • Resource Centric: Focus on resources (total/available/used, etc.).
      [Screenshot: Resource Dashboard]
  • 22. APPLICATION MANAGEMENT: Track applications in a consistent fashion. Application-oriented GUI to manage workloads (instead of individual pods in K8s). The entire application lifecycle is visible and trackable.
  • 23. ARCHITECTURE [Diagram labels: YuniKorn Core; Scheduler Interface; Scheduler Shim; Master; Master Node (Api-server, etcd, Resource Manager); Slave nodes (kubelets, Node Managers); gRPC/API: resource requests, new application, node updates; gRPC/API: container allocation, preemption; Client API: allocate, release container; workloads: MR, Spark, Flink, Tez, MySQL, Spark, Web Server, Kafka]
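The split between a platform-independent core and a platform-specific shim on slide 23 can be sketched as follows. This is a toy, in-process illustration only: the slide describes gRPC/API calls between components, and every name here (`SchedulerCore`, `FirstFitCore`, `KubernetesShim`, the `applicationId` label, the pod dictionary layout) is hypothetical rather than YuniKorn's actual interface. The shim translates platform objects into common resource requests; the core knows nothing about pods or YARN containers.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ResourceRequest:
    app_id: str
    units: int


@dataclass(frozen=True)
class Allocation:
    app_id: str
    node: str
    units: int


class SchedulerCore(Protocol):
    """Common interface a shim talks to (hypothetical)."""

    def update_node(self, node: str, capacity: int) -> None: ...

    def schedule(self, requests: list[ResourceRequest]) -> list[Allocation]: ...


class FirstFitCore:
    """Toy core: place each request on the first node with enough room."""

    def __init__(self) -> None:
        self.free: dict[str, int] = {}

    def update_node(self, node: str, capacity: int) -> None:
        self.free[node] = capacity

    def schedule(self, requests: list[ResourceRequest]) -> list[Allocation]:
        allocations = []
        for req in requests:
            for node, free in self.free.items():
                if free >= req.units:
                    self.free[node] = free - req.units
                    allocations.append(Allocation(req.app_id, node, req.units))
                    break  # request placed; unplaceable requests are skipped
        return allocations


class KubernetesShim:
    """Hypothetical shim: turns pod-like dicts into common requests."""

    def __init__(self, core: SchedulerCore) -> None:
        self.core = core

    def on_pending_pods(self, pods: list[dict]) -> list[Allocation]:
        requests = [
            ResourceRequest(p["labels"]["applicationId"], p["cpu"]) for p in pods
        ]
        return self.core.schedule(requests)
```

A YARN-side shim would implement the same translation for container requests, which is what lets one core serve both platforms in this sketch.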
  • 24. YuniKorn SCHEDULER vs. OTHERS: Key capabilities of a resource scheduler from our perspective
      Disclosure: this table is summarized based on the speakers' analysis. (v = yes, x = no, wip = work in progress)

      Capability                 Kube-default    Kube-batch   YARN CS/FS     YuniKorn
      Resource Sharing
        Hierarchy queues         x               x            v              v
        Queue priority           x               x            v              x (wip)
        Queue elastic capacity   x               x            v              v
      Resource Fairness
        Cross queue fairness     x               x            v              v
        User level fairness      x               x            v              v
        App level fairness       x               v            v              v
      Preemption
        Basic preemption         v               v            v              v
        With fairness            x               x            v              v
        With priority            v               v            v              x (wip)
      Throughput                 100+ allocs/s   ?            4k+ allocs/s   ?
      Gang Scheduling            x               v            x              x (wip)
  • 25. AGENDA: (1) Where are we today? (2) Introducing YuniKorn (3) YuniKorn Deep Dive (4) Future & Open Source Story
  • 26. TAKE AWAY [Diagram: BEFORE/AFTER comparison across Scenario 1, Scenario 2 and Scenario 3. Before: K8s clusters (on cloud/prem) use existing K8s schedulers, and YARN clusters use the Capacity Scheduler or Fair Scheduler. After: K8s and YARN clusters use the YuniKorn scheduler.]
  • 27. OPEN SOURCE: Yes, YuniKorn is now open source at https://github.com/cloudera/yunikorn-core (contributions welcome!). Join us on Slack!
  • 28. ACKNOWLEDGMENTS: A big shout-out to the folks who helped design, develop and make this project possible: Wangda Tan, Weiwei Yang, Wilfred Spiegelenburg, Akhil PB, Suma Shivaprasad, and many others.
  • 29. THANK YOU