Increasingly, one wants a convenient toolkit at hand for solving the problems that arise when designing and developing distributed applications. In this talk I will cover the facilities the Atomix framework offers for making your applications fault-tolerant and for coordinating and replicating resources. First, I will introduce the Raft consensus algorithm and the important role it plays in Atomix.
About me
• Software Engineer @ EPAM
• HPC Engineer and Researcher @ ITMO University
Atomix
• Distributed data-structure/coordination toolkit written in Java
• Provides a collection of asynchronous APIs for state sharing and solving a variety of
common distributed systems problems
• Reactive: fully asynchronous and event-driven
• Provides strong consistency using its own Raft consensus algorithm implementation
What I will cover
• Raft algorithm
• Consensus
• State machines
• Log replication
• Atomix APIs
• Distributed data-structures API
• Coordination and messaging API
• Copycat
• Raft implementation
• State machine API
Consensus
• Agreement on shared state
• Autonomous recovery from server failures
– Minority of servers fail: no problem
– Majority fail: lose availability, retain consistency
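The majority rule above can be made concrete with a little arithmetic: a cluster of n servers needs a quorum of n/2 + 1 votes, so it can tolerate the failure of (n - 1)/2 servers. A minimal sketch (plain Java, not the Atomix API):

```java
public class Quorum {
    // Smallest number of servers that forms a majority of n.
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    // Largest number of failed servers the cluster can survive.
    static int tolerableFailures(int n) {
        return n - quorumSize(n); // equals (n - 1) / 2
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 7}) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                + ", tolerates " + tolerableFailures(n) + " failures");
        }
    }
}
```

Note that a 6-server cluster tolerates no more failures than a 5-server one, which is why consensus clusters are usually deployed with an odd number of nodes.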
What is the purpose of consensus?
• Key to building consistent storage systems
• Top-level system configuration
– Which server is the master?
– What shards exist in my storage system?
– Which servers store shard X?
• Sometimes used to replicate entire storage state
Raft
[Diagram: replicated state machines. Clients send commands (e.g. x←3, y←2, x←1, z←6) to the leader's consensus module; each server appends the commands to its log (x3, y2, x1, z6), and each state machine applies the log in order, producing identical state (x=1, y=2, z=6) on every server.]
Raft terms
1. Leader election
– Select one of the servers to act as leader
– Detect crashes, choose new leader
– Only elect leaders with all committed entries in their logs
2. Log replication (normal operation)
– Leader takes commands from clients, appends them to its log
– Leader replicates its log to other servers (overwriting inconsistencies)
[Diagram: time divided into terms 1–5. Each term begins with an election followed by normal operation under the elected leader; a term that ends in a split vote has no leader and triggers a new election.]
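The "overwriting inconsistencies" step can be shown in miniature: when the leader replicates entries starting after some agreed-upon index, the follower discards any conflicting suffix of its own log and appends the leader's entries. A toy sketch (plain Java; term checks and the consistency probe of real Raft are omitted):

```java
import java.util.ArrayList;
import java.util.List;

public class LogRepair {
    // Follower-side handling of entries replicated after prevIndex:
    // keep the log up to and including prevIndex (the last entry both
    // sides agree on), drop everything after it, append the leader's entries.
    static List<String> append(List<String> followerLog, int prevIndex,
                               List<String> leaderEntries) {
        List<String> log = new ArrayList<>(followerLog.subList(0, prevIndex + 1));
        log.addAll(leaderEntries);
        return log;
    }

    public static void main(String[] args) {
        // Follower diverged after index 1 with an uncommitted entry "zBAD".
        List<String> follower = List.of("x3", "y2", "zBAD");
        // Leader replicates its own suffix; the follower's conflict is erased.
        System.out.println(append(follower, 1, List.of("x1", "z6")));
    }
}
```

In real Raft the follower first verifies that the entry at prevIndex has the term the leader claims, and only truncates when an actual conflict is found; this sketch shows just the overwrite itself.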
Distributed data-structures
• Drop-in replacements for Java Collections
• Similar to Hazelcast distributed collections
• Strong consistency over availability
• Asynchronous with CompletableFuture
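Every operation in these APIs returns a CompletableFuture, so calls compose without blocking. A plain-Java sketch of the pattern (an in-memory map stands in for a distributed one; this is the shape of the API, not Atomix code):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncMapDemo {
    private final Map<String, String> map = new ConcurrentHashMap<>();

    // Each operation returns a future, mirroring the asynchronous Atomix APIs.
    CompletableFuture<String> put(String key, String value) {
        return CompletableFuture.supplyAsync(() -> map.put(key, value));
    }

    CompletableFuture<String> get(String key) {
        return CompletableFuture.supplyAsync(() -> map.get(key));
    }

    public static void main(String[] args) {
        AsyncMapDemo demo = new AsyncMapDemo();
        // Chain the put and the dependent get instead of blocking between them.
        String value = demo.put("foo", "Hello world!")
            .thenCompose(prev -> demo.get("foo"))
            .join();
        System.out.println(value); // prints "Hello world!"
    }
}
```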
Coordination and messaging API
• Locks
• Group membership management and leader election listening
• Direct messaging within groups
• Publish-subscribe
• Request-reply
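The messaging modes above differ mainly in how many group members see a message: direct messaging delivers to one member, publish-subscribe to all of them. A toy dispatcher contrasting the two (plain Java, not the Atomix implementation):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class GroupMessaging {
    private final List<Consumer<String>> members = new ArrayList<>();
    private int next = 0; // round-robin cursor for direct messaging

    void join(Consumer<String> member) {
        members.add(member);
    }

    // Direct messaging: exactly one member of the group handles the message.
    void send(String message) {
        members.get(next++ % members.size()).accept(message);
    }

    // Publish-subscribe: every member of the group receives the message.
    void publish(String message) {
        members.forEach(m -> m.accept(message));
    }
}
```

Request-reply adds a return channel on top of direct messaging, as the code on the next slide shows.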
Request-reply
// Producer side: request a reply for each message sent to member "foo"
DistributedGroup group = atomix.getGroup("group").join();
Member member = group.member("foo");
MessageProducer.Options options = new MessageProducer.Options()
    .withExecution(Execution.REQUEST_REPLY);
MessageProducer<String> producer = member.messaging().producer("hello", options);
producer.send("Hello world!").thenAccept(reply -> System.out.println(reply));

// Consumer side: join the group and answer incoming messages
DistributedGroup group = atomix.getGroup("group").join();
LocalMember localMember = group.join().join();
MessageConsumer<String> consumer = localMember.messaging().consumer("hello");
consumer.onMessage(message -> {
    if (message.body().equals("Hello world!")) {
        message.reply("Hello world back!");
    }
});
What can I do with that?
• Use it as a replacement for ZooKeeper or etcd, with a higher-level API
• Reliable messaging
• Simply make your Java collections distributed
• Resilient distributed caches
• Create any kind of fault-tolerant, strongly consistent distributed resource by
implementing a custom Atomix resource
Command pattern
public class PutCommand implements Command<String> {
    private String key;
    private String value;
    public PutCommand(String key, String value) {
        this.key = key;
        this.value = value;
    }
    public String key() { return key; }
    public String value() { return value; }
}

public class GetQuery implements Query<String> {
    private String key;
    public GetQuery(String key) { this.key = key; }
    public String key() { return key; }
}
State machine API
public class MapStateMachine extends StateMachine {
    private Map<String, String> map = new HashMap<>();

    public String put(Commit<PutCommand> commit) {
        try {
            return map.put(commit.operation().key(), commit.operation().value());
        } finally {
            commit.close();
        }
    }

    public String get(Commit<GetQuery> commit) {
        try {
            return map.get(commit.operation().key());
        } finally {
            commit.close();
        }
    }
}
Copycat Server API
Address address = new Address("localhost", 5000);
Collection<Address> cluster = Arrays.asList(
new Address("192.168.0.1", 8700),
new Address("192.168.0.2", 8700),
new Address("192.168.0.3", 8700)
);
CopycatServer server = CopycatServer.builder(address)
.withStateMachine(MyStateMachine::new)
.build();
server.bootstrap(cluster).join();
Copycat Client API
CopycatClient client = CopycatClient.builder()
.withTransport(new NettyTransport())
.withServerSelectionStrategy(ServerSelectionStrategies.FOLLOWERS)
.withConnectionStrategy(ConnectionStrategies.EXPONENTIAL_BACKOFF)
.build();
Collection<Address> cluster = Arrays.asList(
new Address("192.168.0.1", 8700),
new Address("192.168.0.2", 8700),
new Address("192.168.0.3", 8700)
);
client.connect(cluster).join();
client.submit(new PutCommand("foo", "Hello world!")).thenRun(() -> {
String value = client.submit(new GetQuery("foo")).join();
});
Why would I need that?
• If Atomix is too high-level or too heavyweight for your task
• If you have some rather sophisticated resource you want to make consistent
• If you just want to practice or implement your ideas
In the end
• Atomix provides a very simple distributed programming interface that lets you
solve common distributed-systems problems
• The Raft consensus algorithm gives you strong consistency (at the cost of availability)
• Copycat provides a low-level state machine API that can become the backbone
of your own distributed resource