Presentation at MesosCon 2016
https://mesosconna2016.sched.org/event/6jto/optimistic-offer-what-does-it-mean-to-apache-mesos-framework-yong-feng-ibm-canada-ltd
Mesos Con 2016 Optimistic Offer
1. Optimistic Offer -
What does it mean to Mesos Frameworks?
Yong Feng
MesosCon North America 2016, 6/01/2016
yongfeng@ca.ibm.com
2. Overview
• The Pessimistic Offer Programming Model
• Implementing frameworks to use Pessimistic Offer
• Observed limitations and proposed mitigations
• The Optimistic Offer Programming Model
• Benefits
• Design Plan
• Development status
• Optimistic vs Pessimistic Offer
• Discussion and Additional Resources
3. The Pessimistic Offer Programming Model
• Offer: allocates available resources on a host to a single framework. Allocated
resources are locked (cached) for a period of time.
• Resources are allocated or partitioned among frameworks without any knowledge of a
framework’s requirements
• Resources in an offer are either consumed by launching a task, or freed by rejecting or
rescinding the offer
• Frameworks cannot consume or reject part of an offer while still keeping the leftover
resources of that offer
• Resources are only available for re-allocation after the offer is rejected/rescinded or the
task/executor is finished
• Inverse Offer: deallocate (or reclaim) resources from a framework
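Figure: resource states (Free, Allocated, Used) and transitions:
• Free → Allocated: offer to framework
• Allocated → Used: framework launches task
• Allocated → Free: reject or rescind
• Used → Free: task or executor finished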
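The resource lifecycle on this slide can be read as a small state machine. The following is a minimal sketch of that reading, not Mesos code; the names `State`, `TRANSITIONS`, `step` and the event strings are hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    FREE = "free"
    ALLOCATED = "allocated"
    USED = "used"


# (current state, event) -> next state, following the slide's transitions.
# Event names are illustrative assumptions, not Mesos message names.
TRANSITIONS = {
    (State.FREE, "offer"): State.ALLOCATED,      # offer to framework
    (State.ALLOCATED, "launch"): State.USED,     # framework launches task
    (State.ALLOCATED, "reject"): State.FREE,     # framework rejects the offer
    (State.ALLOCATED, "rescind"): State.FREE,    # master rescinds the offer
    (State.USED, "finished"): State.FREE,        # task or executor finished
}


def step(state, event):
    """Return the next state, or raise ValueError for an invalid transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}")
```

Note that there is no transition from Used back to Allocated, and none that offers Allocated resources to a second framework: that is the "pessimistic" part of the model.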
4. Framework Implementation: Cache Offers
• What if a framework has no suitable demand when an offer is received?
• They cache it!
• Caching resources can lead to:
• Better performance for future demand
• Better scheduling decision with more candidate hosts, for example affinity …
• Stockpiled resources for future tasks that have higher demands
• Still want to be a good citizen?
• Allow for TTL (expiration) of cached offers
• Frameworks that support TTL of cached offers:
• Swarm
• Kubernetes
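A rough sketch of what "caching offers with a TTL" could look like inside a framework is shown below. It is an illustration under assumptions, not how Swarm or Kubernetes implement it: the class `OfferCache`, its methods, and the injectable `clock` parameter are all hypothetical.

```python
import time


class OfferCache:
    """Holds offers a framework cannot use yet, and expires them after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so the cache can be tested
        self._offers = {}             # offer_id -> (offer, time received)

    def add(self, offer_id, offer):
        self._offers[offer_id] = (offer, self.clock())

    def take(self, offer_id):
        """Remove and return a cached offer, or None if it is not cached."""
        entry = self._offers.pop(offer_id, None)
        return entry[0] if entry else None

    def expire(self):
        """Drop offers older than the TTL and return their ids.

        The caller is expected to decline these offers so that the
        resources become available to other frameworks again.
        """
        now = self.clock()
        expired = [oid for oid, (_, received) in self._offers.items()
                   if now - received >= self.ttl]
        for oid in expired:
            del self._offers[oid]
        return expired
```

A framework would call `expire()` periodically and decline every returned offer; a shorter TTL returns resources sooner at the cost of fewer candidate hosts per scheduling decision.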
5. Framework Implementation: Revive Offers
• What if there is no demand to consume the offers in a framework?
• Reject it! (or send a suppress request)
• Rejecting or suppressing unwanted resources allows for:
• Better resource utilization by allocating resources to frameworks that have a demand
• What if there is a demand in a framework, but no offer?
• Revive the offer
• Send a request to Mesos to reallocate the offer
• Frameworks that support Suppress/Revive offer:
• Marathon
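The reject, suppress and revive behaviour described above can be sketched as follows. This is a simplified illustration, not Marathon's implementation: `RecordingDriver` is a hypothetical stand-in that only records calls, and its method names (`decline`, `suppress`, `revive`) are assumptions rather than the real scheduler driver API.

```python
class RecordingDriver:
    """Hypothetical stand-in for a scheduler driver; it only records calls."""

    def __init__(self):
        self.calls = []

    def decline(self, offer_id):
        self.calls.append(("decline", offer_id))

    def suppress(self):
        self.calls.append(("suppress",))

    def revive(self):
        self.calls.append(("revive",))


class Framework:
    def __init__(self, driver):
        self.driver = driver
        self.pending = []        # tasks waiting for resources
        self.suppressed = False  # True while offers are suppressed

    def on_offer(self, offer_id):
        """Use the offer if there is demand; otherwise decline and suppress."""
        if self.pending:
            return self.pending.pop(0)   # a real framework would launch it
        self.driver.decline(offer_id)
        if not self.suppressed:
            self.driver.suppress()
            self.suppressed = True
        return None

    def submit(self, task):
        """New demand arrived: ask for offers again if they were suppressed."""
        self.pending.append(task)
        if self.suppressed:
            self.driver.revive()
            self.suppressed = False
```

The `suppressed` flag keeps the framework from sending redundant suppress or revive requests on every offer or submission.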
6. Pessimistic Offer: Limitations
Limitation: Inefficient resource allocation
Cached resources are not released to other frameworks in time.
Mitigation:
Over a period of time, each offer only includes small pieces of each resource in a host. To fix
this, offer TTL must be enforced by Mesos and Mesos frameworks.
Figure: average time to receive offers (ms) for multiple client frameworks, PO (Pessimistic Offer) vs OO (Optimistic Offer)
7. Pessimistic Offer: Limitations (Cont’d)
Limitation: Poor scheduling quality
Offered resources may not be the most suitable resources.
Mitigation:
Frameworks should delay the scheduling decision until they get suitable resources.
Figure: job duration time (ms) for multiple client frameworks, PO (Pessimistic Offer) vs OO (Optimistic Offer)
8. Pessimistic Offer: Limitations (Cont’d)
Limitation: Low QoS guarantee
Long tasks prevent resource re-allocation.
Hard to preempt resources among frameworks.
Mitigation:
Implement quotas, reservations, oversubscription …
https://www.youtube.com/watch?v=jC8lhGQN2Sc
9. Can We Squeeze More Out of Pessimistic Offers?
• Adjust the size of an offer (Mesos-3765)
• Smaller offers improve resource utilization and fairness
• Oversubscription for reservation (Mesos-4967)
• Lend reserved resources to other frameworks when they are not being used. This improves
resource utilization without impacting QoS
• Reuse the recovered resource without delay (Mesos-3078 or Mesos-4811)
• Accelerate scheduling so that resources can be used as soon as they become available
• More hints in the filter when rejecting an offer
• Avoid ping-pong of resources between Mesos and the framework to improve performance
and avoid starvation
10. The Optimistic Offer Model
Offer: displays all available resources to multiple frameworks.
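To make the contrast with the pessimistic model concrete, here is a minimal sketch in which every framework sees the same free resources and conflicts are resolved only when a task is launched. It illustrates the idea and is not the Mesos allocator; `OptimisticMaster` and its methods are hypothetical, and resources are reduced to a CPU count per host.

```python
class OptimisticMaster:
    """Toy model: offers are views of free resources, not reservations."""

    def __init__(self, cpus_by_host):
        self.free = dict(cpus_by_host)

    def offer(self):
        """Return a snapshot of free resources; nothing is locked."""
        return dict(self.free)

    def launch(self, host, cpus):
        """Commit a launch only if the resources are still free.

        Returns False on a conflict, i.e. when another framework has
        already consumed the resources this framework saw in its offer.
        """
        if self.free.get(host, 0) >= cpus:
            self.free[host] -= cpus
            return True
        return False
```

Two frameworks calling `offer()` get identical snapshots; if both then try to launch a 3-CPU task on a 4-CPU host, the first succeeds and the second gets a conflict and must retry with a fresh offer.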
11. Benefits of the Optimistic Offer Model
• Better resource utilization
• Available resources are always visible to all frameworks
• Improved scheduling performance
• Resources recovered from finished tasks are visible to all frameworks immediately
• Enhanced quality of scheduling decisions
• Increased availability of resources helps frameworks make better scheduling
decisions
• QoS guarantee
• QoS of workload is enforced by pre-emption
12. The Optimistic Offer - Design Plan
• Track both offered resources and consumed resources
• Offered resources are not allocated resources. Offered resources are viewed as available resources by all
frameworks
• Consumed resources are equivalent to allocated resources in the Pessimistic Offer model; however, they
might still be viewed as available resources by some frameworks under special conditions.
• Offered resources are based on resource plan or usage instead of allocation
• Available resources are visible to frameworks via Offer
• Available resources are decided by resource plan and resource usage
• Refresh resource and quota availability in a timely manner
• Update resource availability and status for a framework either when the framework rejects an offer or when
the master rescinds an offer
• Update resource plan availability with quota or restrictions to each framework
• Pre-empt tasks to enforce the QoS
• Pre-empt tasks from lower-priority or overusing frameworks to enforce QoS
• Export the resource status to frameworks to achieve “smart” pre-emption
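The pre-emption step in this plan can be illustrated with a small victim-selection routine. This is a sketch under assumptions rather than the proposed design: the function `pick_victims`, the tuple layout, and the convention that a larger number means higher priority are all hypothetical.

```python
def pick_victims(running, needed_cpus, requester_priority):
    """Choose tasks to pre-empt so that `needed_cpus` become free.

    running: list of (task_id, priority, cpus) tuples.
    Assumption: a larger priority number means a more important framework.
    Only tasks with strictly lower priority than the requester are eligible,
    and the lowest-priority tasks are pre-empted first.
    Returns a list of task ids, or [] if the demand cannot be met.
    """
    candidates = sorted(
        (task for task in running if task[1] < requester_priority),
        key=lambda task: task[1],
    )
    victims, freed = [], 0
    for task_id, _priority, cpus in candidates:
        if freed >= needed_cpus:
            break
        victims.append(task_id)
        freed += cpus
    return victims if freed >= needed_cpus else []
```

Returning an empty list when the demand cannot be met avoids killing tasks without actually satisfying the higher-priority request.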
13. The Optimistic Offer - Design Plan (Cont’d)
• Programming model:
• Allow frameworks to use offers without
the need for rejection of the offer from
a prior framework
• Refresh the status of a resource by
handling rescind messages or rejected
offers
• Use non-revocable resources first,
then revocable resources
• Handle the inverse offer smartly for
pre-emption
14. The Optimistic Offer - Development Status
• Mesos-1607 Phase 1 (renamed as “Oversubscription for reservation” in
Mesos-4967)
• Target: ~4Q 2016
• Status: Reviewable; ~30 patches are ready for review and can be tried out
once applied
Figure: resource categories shown: Total Resources, Reserved, Allocated, Used, Revocable
15. The Optimistic Offer - Development Status (Cont’d)
• Mesos-1607 Prototype
• Target: IBM Mesos Connector
• Design: adjust allocator API to track consumed resources and resolve resource races
• Status: In Bluemix Container Cloud Service, and planned in Mesos roadmap.
16. Optimistic Offer vs Pessimistic Offer Model
Figure: number of tasks per ms, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks.
17. Optimistic Offer vs Pessimistic Offer Model
Figure: number of tasks per ms, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks and small TTL.
18. Optimistic Offer vs Pessimistic Offer Model
Figure: number of tasks per ms, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks when a placement constraint is added.
19. Optimistic Offer vs Pessimistic Offer Model
Figure: job duration (ms) for multiple clients, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks.
20. Optimistic Offer vs Pessimistic Offer Model
Figure: job duration (ms) for multiple clients, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks and smaller TTL.
21. Optimistic Offer vs Pessimistic Offer Model
Figure: slave utilization, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks.
22. Optimistic Offer vs Pessimistic Offer Model
Figure: slave utilization, PO (Pessimistic Offer) vs OO (Optimistic Offer). Panels: idle frameworks; idle frameworks with smaller tasks and smaller TTL.
23. Evaluation – Conflicts
• 4 swarm frameworks
• 10 nodes, 1000 containers (tasks)
• Task request rejected after 10 placement retries
Figure (a): System load = 0.8 and varied request resource size. X-axis: task resource request size (small, medium, large, varied). Y-axis: percentage of tasks. Series: conflicted tasks (%), rejected tasks (%).
Figure (b): System load = 0.4, 0.8, 0.99 and random request resource size. X-axis: system load (40%, 80%, 99%). Y-axis: percentage of tasks. Series: conflicted tasks (%), rejected tasks (%).
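The distinction between a conflicted task and a rejected task in this evaluation can be sketched as a retry loop. This is an illustration only, not the evaluation harness: `place_with_retries` and `try_launch` are hypothetical names, hosts are tried round-robin for simplicity, and "10 placement retries" is assumed here to mean at most 10 attempts in total.

```python
def place_with_retries(try_launch, hosts, max_retries=10):
    """Try to place one task, retrying on conflicts.

    try_launch(host) -> bool is supplied by the caller and returns False
    when the launch conflicts with another framework's earlier launch.
    Returns (host, conflicts) on success, or (None, conflicts) when the
    task is rejected after `max_retries` failed attempts.
    """
    conflicts = 0
    for attempt in range(max_retries):
        host = hosts[attempt % len(hosts)]   # simple round-robin choice
        if try_launch(host):
            return host, conflicts
        conflicts += 1
    return None, conflicts
```

Under this reading, a task counts as conflicted if `conflicts` is greater than zero, and as rejected only if every attempt failed.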
24. Optimistic Offer in IBM Bluemix Container Cloud
Figure: architecture diagram with components Docker CLI/API, Swarm Scheduler (framework), Kubernetes CLI/API, Kubernetes Scheduler (framework), Mesos Master, Mesos Agents, and Network Agents; arrows labeled “Offers” and “Tasks to Mesos”.
25. Open Questions
• How to export more policy status and configuration to
frameworks so that frameworks can make better decisions?
• How to reduce scheduling overheads, such as resource
conflicts?
• …
26. IBM Booth
Welcome to S5
• Spark benchmark with IBM Session Scheduler
• Bluemix Container Service
• GPU in Power
• Mesos Connector
27. Mesos Community Activities
• Active development with Mesos community
– 11 IBM Developers.
• 100+ JIRAs delivered or in progress
• Leading or participating in several work
streams: POWER Support, Optimistic
Offers, Container Support, GPU Support,
Swarm and Kubernetes integration
• Relationship with Mesosphere – weekly
calls, on-site developer presence
• Attendance at MesosCon 2016 with
sponsorship and booth
• Aligning with IBM Container Service to
leverage common OSS technologies
• Technical Preview of Mesos with IBM
Value-Add (ASC) on Docker Hub – Both
x86 and POWER images