The document describes Huawei's approach to federating Mesos clusters across multiple data centers. It proposes a multi-master federation in which each data center runs its own Mesos master that coordinates with the other masters. A Gossiper module in each data center gossips with its peers to exchange framework and resource information. When one data center reaches a resource threshold, its Gossiper directs work to other data centers according to a simple policy engine. A demo visualization illustrates workload balancing during normal and failure scenarios.
3. HUAWEI TECHNOLOGIES CO. LTD. 3
What is Federation in General?
“A federation is a group of computing or network providers agreeing upon standards of operation in a collective fashion.” - wiki
Regional Authority: Autonomously working body.
Federal Layer: Helps the regional Authorities co-operate with each other.
So a cloud federation is a union of multiple cooperating data centers, spread across geographies, that serve a common purpose.
[Diagram: a Federal Layer on top of four Regional Authorities.]
Why Multi-Master?
It's really hard to control superheroes if you are not one. Ask this man!
Phew!!!
Big Day…
Benefits of Multi-Master
Each data center is a superhero that cooperates with the others.
• No single point of failure.
• DCs cooperate with each other using a gossip protocol.
• Frameworks get fast feedback because they are connected to all the masters directly. This requires the framework itself to be federated.
• Centralized data store layer.
• A simple policy engine to demonstrate cloud bursting.
Broad Overview
• HashiCorp's Consul stores all the policy information.
• Each Mesos master is accompanied by a 'Gossiper', which represents its Mesos-run data center in the federation.
• Gossipers talk to each other in the federation and learn the current policy.
• Gossipers negotiate with each other and inform their respective masters which framework deserves the offers.
[Diagram: Data Centers 1 to 4, each with a Master and a Gossiper, plus a Framework and Consul.]
Overview of Consul and Gossiper Interaction
• Gossipers talk to each other using HashiCorp's memberlist library.
• HashiCorp's Consul uses the same memberlist library.
[Diagram: Data Centers 1 to 4, each with a Gossiper and a Consul instance.]
Federated Master
FedAlloc: An allocation module derived from the master's default DRF allocation module.
FedComm: A Mesos module of type Anonymous that the Gossiper talks to.
[Diagram: the Master hosts FedAlloc (Allocation Module) and FedComm (Anonymous Module), next to the Gossiper.]
Internals of Federated Master
[Diagram: FedAlloc and FedComm are plug-ins inside the Mesos Master that share the table below. FedComm, which the Gossiper talks to, only writes the table; FedAlloc only reads it, in a conditional wait.]
F. Id     Suppress by FW   Suppress by Federation
1122001   True             True
1122007   True             False
1122005   False            True
1122004   False            False
Internals of Federated Master (Cont.)
[Diagram: the same plug-in layout and suppression table as on the previous slide, with the table guarded by a mutex.]
FedComm (TCP read on Gossiper):
1. Lock table
2. Write
3. Unlock table
4. Signal condition
FedAlloc (condition variable):
1. Lock table
2. Read
3. Call suppress()/revive()
4. Unlock
FedAlloc is invoked automatically once the condition variable is signalled.
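The hand-off between the two plug-ins can be sketched in a few lines. This is a minimal illustration in Python, not the actual Mesos modules (which would be C++): the class `SuppressionTable`, the function `fed_alloc_loop`, and the `dirty` flag are hypothetical names introduced here, and the suppress()/revive() calls are only recorded as tuples rather than made against a real master.

```python
import threading


class SuppressionTable:
    """Hypothetical shared table: framework id -> 'suppress by federation' flag."""

    def __init__(self):
        # A Condition bundles the mutex and the condition variable from the slide.
        self._cond = threading.Condition()
        self._rows = {}
        self._dirty = False  # assumption: a flag so a signal is never missed

    def write(self, framework_id, suppress):
        """FedComm side: lock table, write, signal condition, unlock."""
        with self._cond:
            self._rows[framework_id] = suppress
            self._dirty = True
            # Python requires the lock to be held while notifying, so the
            # signal happens just before the unlock rather than after it.
            self._cond.notify()

    def wait_and_read(self, timeout=None):
        """FedAlloc side: wait for a write, then return a copy of the table."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._dirty, timeout=timeout):
                return None  # timed out, nothing new was written
            self._dirty = False
            return dict(self._rows)


def fed_alloc_loop(table, actions, rounds):
    """Stand-in for FedAlloc: record the suppress/revive call for each row."""
    for _ in range(rounds):
        snapshot = table.wait_and_read(timeout=5)
        if snapshot is None:
            break
        for framework_id, suppress in sorted(snapshot.items()):
            actions.append(("suppress" if suppress else "revive", framework_id))


if __name__ == "__main__":
    table = SuppressionTable()
    actions = []
    worker = threading.Thread(target=fed_alloc_loop, args=(table, actions, 1))
    worker.start()
    table.write("1122005", True)  # as if the Gossiper had sent this over TCP
    worker.join()
    print(actions)
```

The `dirty` flag is a design choice of this sketch: it lets the reader proceed even when the write lands before the reader starts waiting, which a bare signal would not guarantee.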
Gossiper
Anon Client: Instructs the master when to start and when to stop sending offers.
MasterInfo: Periodically performs an HTTP GET on its Mesos master to refresh its statistics.
HTTP: HTTP server that exposes some REST APIs.
Member List (ML): Module that actually implements the gossip layer.
Consul Lib: Library that talks to Consul and replicates to other DCs. It also sets a watch for any update to the policy.
Policy Engine: Reads from Consul and interprets two policies:
1. Max Threshold
2. Next Max DC
[Diagram: the Gossiper comprises HTTP, Master Info, Anon Client, Policy Engine, Consul Lib and ML.]
Master-Gossiper Interaction
[Diagram: the Master (FedAlloc, FedComm) and its Gossiper (HTTP, Master Info, Anon Client, Policy Engine, Consul Lib, ML), together with Consul and Data Centers 2 to 5.]
Sequence Diagram
M1: Mesos master managing our DC1
M2: Mesos master managing our DC2
M3: Mesos master managing our DC3
Sample policy: If we run out of resources in our DC, burst into the next cloud.
[Sequence diagram between the Framework (through a Protocol layer) and M1, M2, M3. Messages shown: Register to Master 1, Register to Master 2, Register to Master 3; Offer 1 / Launch Task 1, Offer 2 / Launch Task 2, Offer 3 / Launch Task 3; and OOR signals.]
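The sample policy can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the protocol as implemented: the function `run_framework` is hypothetical, each master is reduced to a name and a number of free task slots, and "OOR" is modelled simply as a master having no slots left.

```python
def run_framework(masters, tasks):
    """Place tasks on masters in order, bursting to the next master on OOR.

    masters: ordered list of (name, free_slots) pairs, e.g. [("M1", 1), ...]
    tasks:   list of task names
    Returns (placements, pending): the (task, master) pairs that were
    launched and the tasks that could not be placed anywhere.
    """
    placements = []
    pending = list(tasks)
    for name, free_slots in masters:
        # While this master still has room, accept its offer and launch.
        while pending and free_slots > 0:
            placements.append((pending.pop(0), name))
            free_slots -= 1
        if not pending:
            break
        # Otherwise the master is out of resources (OOR): burst to the next.
    return placements, pending


if __name__ == "__main__":
    masters = [("M1", 1), ("M2", 1), ("M3", 1)]
    placed, left = run_framework(masters, ["Task 1", "Task 2", "Task 3"])
    print(placed, left)
```

With one free slot per master, each task lands on a different data center, which matches the Offer / Launch Task pairs named on the slide.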
Gossiper - Exchange Out Of Resource
[Diagram: Gossipers 1 to 4 exchanging Out of Resource (OOR) messages.]
Minimal Policy Engine Implemented for this Experiment
• We needed a minimal policy engine to demonstrate a cloud-bursting scenario.
• This policy engine is embedded in the Gossiper and can interpret only two simple rules.
• The content of the policy engine is an array of Policy objects.
• Each Policy object has a set of rules that need to be applied.
• We use HashiCorp's Consul to store the policy, which is replicated across data centers to avoid a single point of failure.
• Any update to the policy in one DC is propagated to the others. The Gossiper watches the Consul key-value store and keeps the latest copy of the policy.
{
  "Name": "Policy_One",
  "Rules": [{
    "Name": "MinMax",
    "Priority": 1,
    "Scope": "",
    "Content": {
      "MinOrMax": "MAX"
    }
  }, {
    "Name": "Threshold",
    "Priority": 4,
    "Scope": "",
    "Content": {
      "ResourceLimit": 90
    }
  }]
}
Simple policy with two rules
Rule 1:
• When cloud bursting, which DC should be chosen?
• The one with the maximum resources or the minimum resources?
Rule 2:
• When should cloud bursting be performed?
• At what resource percentage?
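How the two rules might be evaluated can be sketched as follows. This is an illustrative reading, not the Gossiper's actual engine: the function names are hypothetical, `ResourceLimit` is assumed to be a percentage of used resources that triggers bursting once reached, "MAX" is assumed to mean the DC with the most free resources, and the `Priority` and `Scope` fields are ignored.

```python
import json

# The policy from the slide, as it might be stored in Consul.
POLICY_JSON = """
{
  "Name": "Policy_One",
  "Rules": [
    {"Name": "MinMax", "Priority": 1, "Scope": "",
     "Content": {"MinOrMax": "MAX"}},
    {"Name": "Threshold", "Priority": 4, "Scope": "",
     "Content": {"ResourceLimit": 90}}
  ]
}
"""


def parse_policy(text):
    """Return (resource_limit, min_or_max) from a policy document."""
    policy = json.loads(text)
    rules = {rule["Name"]: rule["Content"] for rule in policy["Rules"]}
    return rules["Threshold"]["ResourceLimit"], rules["MinMax"]["MinOrMax"]


def should_burst(used_percent, resource_limit):
    """Rule 2 (assumed semantics): burst once usage reaches the limit."""
    return used_percent >= resource_limit


def pick_target(free_by_dc, min_or_max):
    """Rule 1 (assumed semantics): choose the DC with most or least free resources.

    free_by_dc maps a DC name to its free-resource percentage. Names are
    sorted first so that ties resolve to the alphabetically first DC.
    """
    chooser = max if min_or_max == "MAX" else min
    return chooser(sorted(free_by_dc), key=lambda dc: free_by_dc[dc])


if __name__ == "__main__":
    limit, min_or_max = parse_policy(POLICY_JSON)
    peers = {"DC2": 40, "DC3": 75, "DC4": 10}  # made-up figures for illustration
    if should_burst(92, limit):
        print("burst to", pick_target(peers, min_or_max))
```

In a real deployment the free-resource figures would come from what the Gossipers exchange with each other; here they are hard-coded purely for illustration.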
Challenges / Future Work Planned
Policy Engine with enhanced load balancing/Affinity
Optimize the Gossip protocol for data consistency across clusters.
Network throughput/Latency
Service Discovery (i.e. DNS, etc.)
Consolidated Monitoring, health, alerts, etc.
Security & compliance in the Federation
Work with the Mesos community for further refinement.