Maintaining Low Latency while Maximizing
Throughput
Yuliya Feldman
February 19, 2015
Top-Ranked NoSQL
Top-Ranked Hadoop Distribution
Top-Ranked SQL-on-Hadoop Solution
What We Have – Cluster per Use Case
[Diagram: separate silos – a YARN cluster, a web-server farm, and another YARN cluster, one per use case.]
Too much isolation and poor resource utilization
Need Datacenter-wide Resource Manager
What choices do we have?
•  YARN (capacity/fair scheduler)
•  Omega
•  Mesos
•  Others (e.g., Quasar)
YARN
•  Motivated by Mesos, but is a Hadoop resource manager
•  Manages Hadoop resources well – “retail”
•  Pluggable schedulers for Hadoop
•  Started handling long-lived tasks
•  Can preempt tasks
•  YARN-1051 – YARN Admission Control/Planner: enhancing the resource allocation model with time
Mesos
•  Datacenter-wide resource manager – a negotiator between frameworks
•  Manages all resources for all frameworks well, rather than one particular framework (e.g., Hadoop) – “wholesale”
•  Uses two-level scheduling (sketched below)
•  Excellent Docker support
•  Schedules, allocates, and isolates CPU, memory, disk, network, and arbitrary custom resource types
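A minimal sketch of the two-level model, with all class and method names invented for illustration (the real Mesos API differs): the master makes resource offers, and each framework's own scheduler decides which offers to accept and which tasks to launch against them.

```python
# Illustrative sketch of Mesos-style two-level scheduling.
# All names here are hypothetical; the real Mesos API differs.

class Offer:
    def __init__(self, node, cpus, mem_gb):
        self.node, self.cpus, self.mem_gb = node, cpus, mem_gb

class FrameworkScheduler:
    """Level 2: the framework decides what to do with offers."""
    def __init__(self, name, task_cpus, task_mem_gb):
        self.name, self.task_cpus, self.task_mem_gb = name, task_cpus, task_mem_gb

    def resource_offers(self, offers):
        accepted = []
        for offer in offers:
            # Accept only offers big enough for one task; decline the rest.
            if offer.cpus >= self.task_cpus and offer.mem_gb >= self.task_mem_gb:
                accepted.append((offer, f"{self.name}-task-on-{offer.node}"))
        return accepted

class Master:
    """Level 1: the master offers resources to registered frameworks."""
    def __init__(self):
        self.frameworks = []

    def register(self, scheduler):
        self.frameworks.append(scheduler)

    def offer_round(self, offers):
        for fw in self.frameworks:
            for offer, task in fw.resource_offers(offers):
                print(f"launching {task} ({offer.cpus} cpu, {offer.mem_gb} GB)")

master = Master()
master.register(FrameworkScheduler("yarn", task_cpus=2, task_mem_gb=2))
master.offer_round([Offer("node1", 4, 8), Offer("node2", 1, 1)])
```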
Can we…
–  Continue leveraging YARN's resource scheduling capabilities for YARN-based applications?
–  Treat YARN as “yet another” framework within Mesos?
–  Keep YARN from having to worry about coexisting with non-YARN applications?
Introducing Myriad
Apache Myriad: True Multi-tenancy
•  Open-source project launched Oct '14
–  MapR, eBay, Mesosphere, and others participating
•  Allows Mesos and YARN to cooperate with each other
•  Mesos: datacenter-wide resource manager
–  Dockerized containers and/or cgroups used for isolation
•  Hadoop is launched inside cgroup containers
•  Myriad manages the conversation between the RM and the Mesos master, and between NMs and Mesos slaves
Why Myriad
•  Run many types of compute frameworks side-by-side
–  Hadoop family (YARN, Spark, Kafka, Storm)
–  Web-server farms
–  MPP databases (e.g., Vertica)
–  Other services: SOA web services, Jenkins/build farm, cron jobs, shell scripts, Kubernetes, Cassandra, Elasticsearch, etc.
–  Each compute framework is a cluster in itself
•  Need to break up a physical cluster into many virtual clusters
–  Using Docker (containers) for good isolation
–  But most schedulers can only manage individual nodes inside a cluster
•  Move resources between virtual clusters on demand
Utilize Excess Capacity for Analytics
[Chart: datacenter server-farm utilization vs. Hadoop analytics utilization over time, showing long-lived excess-capacity periods.]
•  “Scale up” Hadoop during long periods of low utilization
•  “Scale down” Hadoop ahead of anticipated high utilization
Myriad Again
•  Mesos creates virtual clusters
•  YARN uses resources provided by Mesos
•  Myriad can ask YARN to release some resources, or give it more
[Diagram: Mesos hosting two YARN virtual clusters alongside a web-server farm.]
Myriad Services Architecture
[Diagram: Myriad's Mesos scheduler runs inside the YARN ResourceManager and receives offers from the Mesos master; accepted offers launch tasks through a Myriad executor on each node, which starts the NodeManager; application containers are then launched via the NM's heartbeat, and the RM's YARN scheduler (fair share) tracks capacity as a Map<Node, Capacity>.]
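As a rough sketch of that control flow (class and method names below are illustrative, not Myriad's actual code):

```python
# Illustrative sketch of the Myriad control flow in the diagram above.
# Class and method names are hypothetical, not Myriad's actual API.
from collections import namedtuple

Offer = namedtuple("Offer", "node cpus mem_gb")

class YarnScheduler:
    """Stand-in for the RM's fair-share scheduler: tracks Map<Node, Capacity>."""
    def __init__(self):
        self.capacity = {}

    def add_node(self, node, cpus, mem_gb):
        self.capacity[node] = (cpus, mem_gb)

class MyriadScheduler:
    """Runs inside the ResourceManager and talks to the Mesos master."""
    def __init__(self, yarn_scheduler):
        self.yarn = yarn_scheduler

    def resource_offers(self, offers):
        for offer in offers:
            # Accept the offer: the Myriad executor on that slave starts a NM.
            print(f"launching NodeManager on {offer.node} "
                  f"({offer.cpus} cpu, {offer.mem_gb} GB)")
            # The NM registers and heartbeats back to the RM, which then
            # hands out application containers from that node's capacity.
            self.yarn.add_node(offer.node, offer.cpus, offer.mem_gb)

scheduler = MyriadScheduler(YarnScheduler())
scheduler.resource_offers([Offer("node1", 2.5, 2.5)])
```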
How it works
[Diagram: the Myriad framework registers with the Mesos master (controlled through a REST API); the Mesos scheduler accepts an offer and launches a NodeManager sized at 2.5 CPU / 2.5 GB on a Mesos slave; the NodeManager then advertises 2 CPU / 2 GB of resources to the YARN ResourceManager.]
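The gap between 2.5 and 2 is the NodeManager process's own footprint: Myriad asks Mesos for slightly more than the NM advertises to YARN. A minimal sketch of that sizing rule, assuming the 0.5 CPU / 0.5 GB overhead shown on the slide (in practice this would come from a configurable profile):

```python
# Myriad must size the Mesos task to cover both the YARN containers the
# NodeManager will host and the NM process itself. The 0.5 cpu / 0.5 GB
# overhead is taken from the slide; real deployments would configure it.

NM_OVERHEAD_CPUS = 0.5
NM_OVERHEAD_GB = 0.5

def mesos_task_size(advertised_cpus, advertised_gb):
    """Resources to request from Mesos for a NodeManager that will
    advertise (advertised_cpus, advertised_gb) to the ResourceManager."""
    return advertised_cpus + NM_OVERHEAD_CPUS, advertised_gb + NM_OVERHEAD_GB

print(mesos_task_size(2, 2))  # -> (2.5, 2.5), matching the diagram
```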
[Diagram: the YARN ResourceManager then launches application containers C1 and C2 on the NodeManager, inside the capacity it advertised.]
Use Case – Web Traffic Spikes
[Diagram: two Mesos slaves of 8 CPU / 8 GB each, shared between a YARN NodeManager and a WebService. When web traffic spikes, Myriad resizes the NodeManagers down (e.g., from 6 CPU / 6 GB to 2 CPU / 2 GB per node) so the WebService can take the freed resources.]
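The resize itself is driven through Myriad's REST API. Below is a hedged sketch using Python's requests library; the host, port, endpoint paths, and payload fields are assumptions modeled on Myriad's flex-up/flex-down API and may differ between versions.

```python
# Sketch of resizing the YARN virtual cluster through Myriad's REST API.
# Host, port, paths, and payload fields below are assumptions and may not
# match a given Myriad release.
import requests

MYRIAD_API = "http://myriad-host:8192/api/cluster"

def flex_down(instances):
    """Shrink YARN, returning NodeManager capacity to Mesos (e.g., when a
    web-traffic spike means the WebService needs the resources)."""
    requests.put(f"{MYRIAD_API}/flexdown", json={"instances": instances})

def flex_up(instances, profile="small"):
    """Grow YARN again once the spike is over."""
    requests.put(f"{MYRIAD_API}/flexup",
                 json={"instances": instances, "profile": profile})

flex_down(2)          # spike begins: free up both nodes for web traffic
flex_up(2, "medium")  # spike over: give the capacity back to YARN
```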
Use Case – Web Traffic Spike Over
[Diagram: once the spike is over, Myriad resizes the NodeManagers back up and the WebService returns to its smaller share.]
Myriad Demo
At MapR booth 1009
Maintaining Low Latency while Maximizing Throughput on a Single Cluster
Batch and Real-time Analytics Together
[Diagram: a compute cluster in which every node runs both a YARN NodeManager and a DrillBit, all managed by a cluster/datacenter-wide scheduler.]
Sharing Resources between Batch and Real-Time
•  Real-time services' resource usage patterns can be unpredictable
–  Analysts use the services during the day
–  Analysts on the other side of the globe work during the night
–  There are steady states, spikes, and dips in the workloads
•  Batch resource usage is more or less predictable
–  The same jobs run over and over, with occasional spikes and dips
Real-time Services Resource Utilization/Provisioning
•  Aggressive resource provisioning: < 10% utilization
•  Moderate resource provisioning: < 60% utilization
•  Conservative resource provisioning: > 80% utilization
What Can We Do To Provision Conservatively?
[Diagram: a compute cluster where each node runs a NodeManager and a DrillBit. A Drill Service Watcher monitors Drill performance; on a latency increase it acquires more resources for Drill (accepting Mesos offers, or requesting additional YARN containers held as “dummy” containers C1–C3, preempting if needed); on a latency decrease it releases resources back.]
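In pseudocode form, the watcher is a simple control loop. The sketch below is illustrative only, with stub classes standing in for the real Drill metrics and the Mesos/YARN resource calls; the thresholds are made up.

```python
# Illustrative control loop for the service-watcher pattern above; the
# stubs and thresholds are invented, not the actual Drill/YARN/Mesos code.
import random
import time

LATENCY_HIGH_MS = 500   # above this, Drill needs more resources
LATENCY_LOW_MS = 100    # below this, resources can go back to batch

class DrillStub:
    def p99_latency_ms(self):
        return random.uniform(50, 800)  # stand-in for real Drill metrics

class ResourceManagerStub:
    def acquire(self, cpus, mem_gb):
        # Real version: accept Mesos offers or request YARN containers,
        # preempting batch work if needed, and hold them as dummy
        # containers whose resources the DrillBits actually use.
        print(f"acquire {cpus} cpu / {mem_gb} GB for Drill")

    def release(self, cpus, mem_gb):
        print(f"release {cpus} cpu / {mem_gb} GB back to batch")

def watch(drill, rm, rounds=5):
    for _ in range(rounds):
        latency = drill.p99_latency_ms()
        if latency > LATENCY_HIGH_MS:
            rm.acquire(cpus=2, mem_gb=4)
        elif latency < LATENCY_LOW_MS:
            rm.release(cpus=2, mem_gb=4)
        time.sleep(1)

watch(DrillStub(), ResourceManagerStub())
```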
SHOWTIME
Q&A
yfeldman@mapr.com
Engage with us! @mapr · maprtech · MapR · mapr-technologies