3. My Collaborators
▪ Abhishek Kumar
▪ Basheeruddin Ahmed
▪ Colin Dixon
▪ Harman Singh
▪ Kamal Rameshan
▪ Robert Varga
▪ Tony Tkacik
▪ Tom Pantelis
▪ Luis Gomez
▪ Phillip Shea
▪ Radhika Hirannaiah
▪ and many more…
26. RPC Registry Replication - Gossip
[Diagram] RPC registry replication via gossip:
▪ Local bucket updates change that bucket's version (e.g. buckets m1,v1 / m2,v5 / m3,v7)
▪ All buckets and their versions are known to all members
▪ Every 1 second members send all known bucket versions to any one peer
▪ If the receiver's local versions are higher, it sends an update
▪ If its local versions are lower, it sends its status back to the sender (the exchange is sketched below)
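The comparison rule from the diagram can be sketched in plain Java. The class and member names below (GossipSketch, onStatus, Peer) are hypothetical and not the actual BucketStore API; the sketch only shows the decision: push the buckets where the local version is higher, and reply with our status where the peer is ahead.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the gossip comparison rule, not the actual BucketStore API.
public class GossipSketch {

    /** Version of every bucket known to this member, keyed by bucket owner. */
    private final Map<String, Long> knownVersions = new HashMap<>();

    /** Minimal peer abstraction for the sketch. */
    interface Peer {
        void sendUpdate(Map<String, Long> newerBucketVersions);
        void sendStatus(Map<String, Long> knownVersions);
    }

    /** Handle a peer's version map (its "status"), received on a gossip tick. */
    public void onStatus(Map<String, Long> remoteVersions, Peer sender) {
        Map<String, Long> newerLocally = new HashMap<>();
        boolean remoteHasNewer = false;

        for (Map.Entry<String, Long> entry : knownVersions.entrySet()) {
            long remote = remoteVersions.getOrDefault(entry.getKey(), -1L);
            if (entry.getValue() > remote) {
                newerLocally.put(entry.getKey(), entry.getValue()); // local version higher
            } else if (entry.getValue() < remote) {
                remoteHasNewer = true;                              // local version lower
            }
        }
        // Buckets the peer knows about that we have never seen also count as newer remotely.
        for (String bucket : remoteVersions.keySet()) {
            if (!knownVersions.containsKey(bucket)) {
                remoteHasNewer = true;
            }
        }

        if (!newerLocally.isEmpty()) {
            sender.sendUpdate(newerLocally);   // push the buckets we are ahead on
        }
        if (remoteHasNewer) {
            sender.sendStatus(knownVersions);  // prompt the sender to push its newer buckets
        }
    }
}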
29. sal-clustering-commons
▪ Some common messages
▪ Actor base classes
▪ The Protobuf messages used in Helium
▪ The Protobuf NormalizedNode serialization code
▪ The NormalizedNode streaming code
▪ Other miscellaneous utility classes
30. sal-akka-raft
▪ Implementation of the Raft algorithm on top of akka
▪ Uses akka-persistence for durability
▪ Provides a base class called RaftActor which can be extended by anyone who wants to replicate state (see the sketch below)
▪ See sal-akka-raft-example, which provides a simple implementation of a replicated HashMap
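As a rough illustration of that pattern: the base class and hook names below are invented for this sketch and are not the real RaftActor contract (see sal-akka-raft-example for the actual one). The idea is that the base class replicates submitted commands via Raft, and the subclass applies each command to its local state only once it has been committed.

import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RaftActor: the real class integrates with akka and
// akka-persistence; only the extension pattern is shown here.
abstract class ReplicatedStateActor {
    // Called by the base class once a command has been committed by the Raft majority.
    protected abstract void applyCommand(Object command);
}

// A replicated map in the spirit of sal-akka-raft-example's replicated HashMap.
class ReplicatedMapActor extends ReplicatedStateActor {
    static final class Put {
        final String key;
        final String value;
        Put(String key, String value) { this.key = key; this.value = value; }
    }

    private final Map<String, String> state = new HashMap<>();

    @Override
    protected void applyCommand(Object command) {
        if (command instanceof Put) {
            Put put = (Put) command;
            state.put(put.key, put.value);  // applied in the same order on every member
        }
    }
}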
31. sal-distributed-datastore
▪ ConcurrentDOMDataBroker
▪ DistributedDataStore
▪ Implementation of the DOMStore SPI
▪ Shard built on top of RaftActor
▪ Creates Shards based on the sharding strategy
▪ Code for a client to interact with the Shard Leader
32. sal-remoterpc-connector
▪ RemoteRpcProvider
▪ Default RPC provider, invoked when an RPC is not found in the local MD-SAL registry
▪ Code for BucketStore, which provides a mechanism to replicate state based on Gossip
▪ Code for RpcBroker, which allows invoking a remote RPC
36. Waiting for Ready
▪ Recovery must be complete
▪ All Shard Leaders must be known
▪ Three messages are monitored by the ShardManager:
  ▪ Cluster.MemberStatusUp – used to figure out the address of a cluster member
  ▪ LeaderStateChanged – used to figure out if a Follower has a different Leader
  ▪ ShardRoleChanged – used to figure out any changes in a Shard's role
▪ Waiting is not infinite; by default it lasts only 90 seconds, but this is configurable (see the sketch below)
▪ Will block the config subsystem
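A minimal, self-contained sketch of the waiting behaviour (this is not the ShardManager code; the class below is invented for illustration): readiness is reached once a leader is known for every local shard, and the wait gives up after a configurable timeout.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical readiness helper: counts down once per shard as leaders become known,
// and blocks callers until every shard has a leader or the timeout expires.
public class ShardReadinessWaiter {
    private final ConcurrentHashMap<String, String> leaders = new ConcurrentHashMap<>();
    private final CountDownLatch allLeadersKnown;

    public ShardReadinessWaiter(int shardCount) {
        this.allLeadersKnown = new CountDownLatch(shardCount);
    }

    // Would be driven by LeaderStateChanged / ShardRoleChanged style notifications.
    public void onLeaderChanged(String shardName, String leaderId) {
        if (leaderId != null && leaders.put(shardName, leaderId) == null) {
            allLeadersKnown.countDown();
        }
    }

    // 90 seconds by default in the real implementation, but configurable.
    public boolean awaitReady(long timeoutSeconds) throws InterruptedException {
        return allLeadersKnown.await(timeoutSeconds, TimeUnit.SECONDS);
    }
}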
67. Suggested Next Steps…
▪ Deploy a cluster
▪ Run the clustering integration tests
▪ Write an application that works in the cluster
▪ File bugs to report features which you find missing
▪ Try running dsBenchMark on a cluster
▪ Test out replication using the dummy data store
▪ Check out the code
▪ Send email to controller-dev@lists.opendaylight.org with questions
There are two subsystems supported by the MD-SAL clustering implementation.
The distributed datastore ensures that data stored in the datastore is accessible to all members of the cluster, and it allows the MD-SAL data tree to be broken up into sub-trees so that the data can be distributed around the cluster.
The Remote RPC connector ensures that an RPC implemented by a remote RPC provider is accessible from anywhere in the cluster.
The MD-SAL clustering implementation is built on top of akka. Akka is a library which allows us to create autonomous components called actors, which you interact with by sending them asynchronous messages. Since akka ensures that only one message is processed by an actor at a time, it makes it easy to create thread-safe components.
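A minimal Akka classic actor in Java illustrates the model (the actor and message names here are made up for the example, not taken from the controller code): state is mutated only inside message handlers, and because messages are processed one at a time the mutable counter needs no locking.

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

// A tiny actor: akka delivers its messages one at a time, so no synchronization is needed.
public class CounterActor extends AbstractActor {
    private long count = 0;

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .matchEquals("increment", msg -> count++)
            .matchEquals("get", msg -> getSender().tell(count, getSelf()))
            .build();
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("example");
        ActorRef counter = system.actorOf(Props.create(CounterActor.class), "counter");
        counter.tell("increment", ActorRef.noSender());  // asynchronous message send
        counter.tell("get", ActorRef.noSender());        // reply goes to dead letters here, since no sender was given
    }
}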
Akka also has several modules which make it easy to build a clustered application or infrastructure on top of it.
The distributed datastore, for example, uses akka-persistence. This is a module which helps the datastore persist state to disk and recover from that persisted state. Every modification that is made to the datastore gets persisted to disk, and occasionally we create a snapshot of the state (tree) so that when we restart the data store we can recover the state faster.
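A small akka-persistence sketch of that pattern, assuming the classic Java API (the actor below is illustrative, not the datastore's own persistence code): each change is persisted as an event before it is applied, and a snapshot is saved now and then so recovery does not have to replay the whole journal.

import akka.persistence.AbstractPersistentActor;
import akka.persistence.SnapshotOffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative persistent actor: events are journaled, state is occasionally snapshotted,
// and recovery replays the latest snapshot plus the events written after it.
public class JournalingActor extends AbstractPersistentActor {
    private List<String> state = new ArrayList<>();

    @Override
    public String persistenceId() {
        return "journaling-actor-example";   // identifies this actor's journal
    }

    @Override
    public Receive createReceiveRecover() {
        return receiveBuilder()
            .match(String.class, event -> state.add(event))                        // replay journaled events
            .match(SnapshotOffer.class, s -> state = (List<String>) s.snapshot())  // restore from snapshot
            .build();
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(String.class, change -> persist(change, persisted -> {
                state.add(persisted);                      // apply only after it is durable
                if (state.size() % 100 == 0) {
                    saveSnapshot(new ArrayList<>(state));  // occasional snapshot speeds up recovery
                }
            }))
            .build();
    }
}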
The Distributed Data Store and Remote RPC Connector both use akka-remoting. This module lets one cluster member send a message to an actor on a remote cluster member and is an essential building block of our clustering implementation. Remoting in turn uses Netty.
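With akka-remoting enabled, an actor on another member can be addressed by its full path, for example as below. The system name, host, port, and actor path are placeholders for this sketch, not the controller's actual actor paths.

import akka.actor.ActorRef;
import akka.actor.ActorSelection;
import akka.actor.ActorSystem;

public class RemoteLookupExample {
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("example");
        // Look up an actor running inside a remote member's actor system by its path.
        ActorSelection remotePeer = system.actorSelection(
                "akka.tcp://example@10.0.0.2:2550/user/some-actor");
        // Messages to it are sent exactly like messages to a local actor.
        remotePeer.tell("hello from another member", ActorRef.noSender());
    }
}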
Finally both the Distributed Data Store and Remote RPC connector use akka-clustering to discover new members in an akka cluster so that they can then communicate with their peers on other members.
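Discovery with akka-clustering can be sketched like this (illustrative only, not the ShardManager or RpcRegistry code): a component subscribes to membership events and learns the address of each member that comes up, so it can later talk to its peers there.

import akka.actor.AbstractActor;
import akka.cluster.Cluster;
import akka.cluster.ClusterEvent;
import akka.cluster.ClusterEvent.MemberRemoved;
import akka.cluster.ClusterEvent.MemberUp;

// Subscribes to cluster membership events and logs the address of each member.
public class MemberWatcher extends AbstractActor {
    private final Cluster cluster = Cluster.get(getContext().getSystem());

    @Override
    public void preStart() {
        cluster.subscribe(getSelf(), ClusterEvent.initialStateAsEvents(),
                MemberUp.class, MemberRemoved.class);
    }

    @Override
    public void postStop() {
        cluster.unsubscribe(getSelf());
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(MemberUp.class, up ->
                getContext().getSystem().log().info("Member is up: {}", up.member().address()))
            .match(MemberRemoved.class, removed ->
                getContext().getSystem().log().info("Member removed: {}", removed.member().address()))
            .build();
    }
}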
The Distributed Data Store and Remote RPC Connector each use their own akka actor system.
An akka actor system consists of three things:
An actor hierarchy. It's called a hierarchy because actors can have supervisors and children. Failures flow up the supervision chain. An actor can also delegate some work to its children so that more work can be done in parallel.
Configuration.
Dispatchers, which schedule messages to be delivered to an actor.
The Distributed Data Store is a strongly consistent store. This strong consistency is ensured by the use of the Raft distributed consensus algorithm, which requires that all data modifications are initiated by a Leader which replicates data to its followers, thus ensuring that the order in which modifications are done on all cluster members is the same.
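A toy illustration of why a single leader yields a single order (this is not the sal-akka-raft code; the class and method names are invented): the leader assigns the next log index to every modification and marks an entry committed only once a majority has stored it, so every member applies the same entries in the same index order.

import java.util.ArrayList;
import java.util.List;

// Toy leader-side bookkeeping: entries get indices in arrival order and are committed
// only after a majority of members have acknowledged them.
public class LeaderOrderingSketch {
    private final int clusterSize;
    private final List<String> log = new ArrayList<>();        // entries in leader order
    private final List<Integer> ackCount = new ArrayList<>();  // replication acks per entry
    private int commitIndex = -1;                              // highest entry safe to apply

    public LeaderOrderingSketch(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    /** Leader assigns the next index; this fixes the global order of modifications. */
    public int append(String modification) {
        log.add(modification);
        ackCount.add(1);                    // the leader itself already has the entry
        return log.size() - 1;
    }

    /** Called when a follower confirms it has stored the entry at the given index. */
    public void acknowledge(int index) {
        ackCount.set(index, ackCount.get(index) + 1);
        int majority = clusterSize / 2 + 1;
        while (commitIndex + 1 < log.size() && ackCount.get(commitIndex + 1) >= majority) {
            commitIndex++;                  // committed entries are applied in index order everywhere
        }
    }

    public List<String> committedEntries() {
        return new ArrayList<>(log.subList(0, commitIndex + 1));
    }
}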
The Remote RPC connector also replicates data. It replicates all the registered RPCs on each node to all the other nodes in the cluster.
RpcBroker executes RPC requests received from remote nodes.
RpcRegistry manages the replication of the RPC registry; it extends BucketStore.
RemoteRpcImpl is the default delegate which is invoked when an RPC registration is not found in the MD-SAL core RpcBroker's registry.
RpcListener receives notifications whenever an RPC is registered by a Provider.