Akka: Distributed by Design - Björn Antonsson (Typesafe)

Presented at JAX London 2013

Akka is a toolkit and runtime for building highly scalable, distributed, and fault-tolerant reactive applications on the JVM using actors. Akka supports scaling both UP (utilizing multi-core processors) and OUT (utilizing the grid/cloud); hence it is "Distributed by Design". This gives the Akka runtime the freedom to do adaptive automatic load-balancing and actor migration, cluster rebalancing, replication and partitioning. All this is made possible through Akka's new decentralized P2P, gossip-based cluster module.

  1. Akka - Distributed by Design
     Björn Antonsson, @bantonsson
     Wednesday, 30 October 2013
  2. Overview
     • Akka
     • Actors
     • Distributed by Design
     • Diving Into The Cluster
     • What The Future Brings
  3. Akka
  4. Akka
     • Toolkit and runtime for reactive applications
     • Write applications that are
       – Concurrent
       – Distributed
       – Fault tolerant
       – Event-driven
  5. Akka
     • Has multiple tools
       – Actors
       – Futures
       – Dataflow
       – Remoting
       – Clustering
  6. Actors
  7. Actors
     • Isolated lightweight event-based processes
     • Share nothing
     • Communicate through async messages
     • Each actor has a mailbox (message queue)
     • Each actor has a parent handling its failures
     • Location transparent (distributable)
  8. Actors
     • An island of sanity in a sea of concurrency
     • Everything inside the actor is sequential
       – Processes one message at a time
     • Very lightweight
       – Create millions
       – Create short lived
     • Inherently concurrent
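The mailbox and "one message at a time" points above can be illustrated without Akka at all. The following is a minimal sketch in plain Java, not Akka's implementation: `MiniActor` is a hypothetical name, and a single-threaded executor stands in for the mailbox so that the behavior can mutate its state without locks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical mini actor: illustrates the mailbox idea only, NOT Akka's implementation. */
final class MiniActor<M> {
    // A single-threaded executor doubles as the mailbox: its queue holds the
    // pending messages and its one thread processes them strictly in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private final Consumer<M> behavior;

    MiniActor(Consumer<M> behavior) {
        this.behavior = behavior;
    }

    /** Asynchronous send: enqueue the message and return immediately. */
    void tell(M message) {
        mailbox.execute(() -> behavior.accept(message));
    }

    /** Stop accepting messages and drain the mailbox; true if it drained in time. */
    boolean stop(long timeoutMillis) {
        mailbox.shutdown();
        try {
            return mailbox.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Many threads may call `tell` concurrently, but the behavior itself only ever runs on the mailbox thread, which is why its internal state needs no synchronization.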
  9. Actor code sample

         import java.io.Serializable;
         import akka.actor.UntypedActor;
         import akka.event.Logging;
         import akka.event.LoggingAdapter;

         // Define the message(s) the Actor should be able to respond to
         public class Greeting implements Serializable {
           public final String who;
           public Greeting(String who) { this.who = who; }
         }

         // Define the Actor class
         public class GreetingActor extends UntypedActor {
           LoggingAdapter log = Logging.getLogger(getContext().system(), this);
           int counter = 0;

           // Define the Actor's behavior
           public void onReceive(Object message) {
             if (message instanceof Greeting) {
               counter++;
               log.info("Hello #" + counter + " " + ((Greeting) message).who);
             } else unhandled(message);
           }
         }

  10. Creating and using Actors

         import akka.actor.ActorRef;
         import akka.actor.ActorSystem;
         import akka.actor.Props;

         ActorSystem system = ActorSystem.create("MySystem");
         ActorRef greeter = system.actorOf(
             Props.create(GreetingActor.class), "greeter");
         greeter.tell(new Greeting("Charlie Parker"), null);
  11. Actors compared to Objects
      • Think of an Actor as an Object
      • You can't peek inside it
      • You don't call methods
        – You send messages (asynchronously)
      • You don't get return values
        – You receive messages (asynchronously)
      • The internal state is thread safe
  12. Why should I care?
  13. The world is multicore!
  14. Amdahl's Law
  15. So what's the catch?
      • Really no catch
      • A different programming paradigm
      • All about tradeoffs
        – Some things are easier, some harder
      • Think different
  16. Distributed by Design
  17. Remote Actors
      • Sending messages decouples actors
      • Local or remote doesn't matter
  18. [Diagram: NODE 1, NODE 2]
  19. Remote Actors
      • Zero code change deployment decision
      • Add configuration to the Actor System

         akka {
           actor {
             # Configure a Remote Provider
             provider = "akka.remote.RemoteActorRefProvider"
             deployment {
               # The "greeter" actor
               /greeter {
                 # Define Remote Path: Protocol, Actor System, Hostname, Port
                 remote = "akka.tcp://MySystem@machine1:2552"
               }
             }
           }
         }

  20. Looking up Actors

         ActorSelection greeter = system.actorSelection(
             "akka.tcp://MySystem@machine1:2552/user/greeter");

  21. Can you see the problem?
  22. Fixed addresses

         akka {
           actor {
             provider = "akka.remote.RemoteActorRefProvider"
             deployment {
               /greeter {
                 remote = "akka.tcp://MySystem@machine1:2552"
               }
             }
           }
         }

         ActorSelection greeter = system.actorSelection(
             "akka.tcp://MySystem@machine1:2552/user/greeter");
  23. Diving Into The Cluster
  24. Akka Cluster 2.2
      • Gossip-Based Cluster Membership
      • Failure Detector
      • Cluster DeathWatch
      • Cluster-Aware Routers
  25. Cluster Membership
      • Node ring à la Riak / Dynamo
      • Gossip-protocol for state dissemination
      • Vector Clocks to resolve conflicts
      • Peer based failure detector
  26. Node ring with gossiping Members
      [Diagram: ring of Member Nodes exchanging Gossip]
  27. Gossip Protocol
      • Cluster Membership
      • Leader Determination
      • Targets Random Node
        – Partly biased towards nodes with older state
      • Push-Pull based
        – Sender only sends his version number
        – Receiver asks for newer information
  28. Vector Clocks
      • Partial ordering in a distributed system
      • Detects causality violations
      • Used to reconcile and merge cluster state
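To make the partial ordering concrete, here is a minimal vector clock sketch in plain Java. It is an illustration under assumptions, not the Akka cluster's data structure: clocks are modeled as maps from a node name to a counter, and `VectorClockSketch`, `Order`, `compare` and `merge` are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical vector clock helper; a clock is a map of node name to counter. */
final class VectorClockSketch {
    enum Order { BEFORE, AFTER, SAME, CONCURRENT }

    /** Compares clock a against clock b; a missing entry counts as 0. */
    static Order compare(Map<String, Integer> a, Map<String, Integer> b) {
        boolean aAhead = false;
        boolean bAhead = false;
        Set<String> nodes = new HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        for (String node : nodes) {
            int x = a.getOrDefault(node, 0);
            int y = b.getOrDefault(node, 0);
            if (x > y) aAhead = true;
            if (x < y) bAhead = true;
        }
        // Each side is ahead on some node: neither happened before the other.
        if (aAhead && bAhead) return Order.CONCURRENT;
        if (aAhead) return Order.AFTER;
        if (bAhead) return Order.BEFORE;
        return Order.SAME;
    }

    /** Merges two clocks by taking the per-node maximum. */
    static Map<String, Integer> merge(Map<String, Integer> a, Map<String, Integer> b) {
        Map<String, Integer> merged = new HashMap<>(a);
        b.forEach((node, counter) -> merged.merge(node, counter, Integer::max));
        return merged;
    }
}
```

A `CONCURRENT` result is the conflict case: neither state supersedes the other, so the two states have to be reconciled and their clocks merged.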
  29. Failure Detection
      • Uses The Phi Accrual Failure Detector
      • Peer Based with limited targets
        – B monitors A
        – A sends heartbeats to B
        – B samples inter-arrival time to expect next beat
        – B measures continuum of deadness of A
        – B marks A as unreachable if A is dead enough
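The "continuum of deadness" idea can be sketched in plain Java as follows. This is a simplified illustration, not Akka's failure detector: it assumes roughly normally distributed heartbeat intervals and approximates the normal tail with a logistic function. The class name, the sample window and the `minStdDevMillis` floor are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical phi accrual sketch: B runs one of these for each monitored node A. */
final class PhiAccrualSketch {
    private final Deque<Long> intervals = new ArrayDeque<>();
    private final int maxSamples;
    private final double minStdDevMillis;
    private long lastHeartbeatMillis = -1;

    PhiAccrualSketch(int maxSamples, double minStdDevMillis) {
        this.maxSamples = maxSamples;
        this.minStdDevMillis = minStdDevMillis;
    }

    /** Records a heartbeat from A and samples the inter-arrival time. */
    void heartbeat(long nowMillis) {
        if (lastHeartbeatMillis >= 0) {
            intervals.addLast(nowMillis - lastHeartbeatMillis);
            if (intervals.size() > maxSamples) intervals.removeFirst();
        }
        lastHeartbeatMillis = nowMillis;
    }

    /** Suspicion level: grows the longer the next heartbeat is overdue. */
    double phi(long nowMillis) {
        if (intervals.isEmpty()) return 0.0;
        double mean = intervals.stream().mapToLong(Long::longValue).average().orElse(0.0);
        double variance = intervals.stream()
                .mapToDouble(i -> (i - mean) * (i - mean)).average().orElse(0.0);
        double stdDev = Math.max(Math.sqrt(variance), minStdDevMillis);
        double y = (nowMillis - lastHeartbeatMillis - mean) / stdDev;
        // Logistic approximation of the normal tail: P(heartbeat arrives even later).
        double pLater = 1.0 / (1.0 + Math.exp(y * (1.5976 + 0.070566 * y * y)));
        return -Math.log10(pLater);
    }

    /** A is considered reachable while phi stays below the threshold. */
    boolean isAvailable(long nowMillis, double threshold) {
        return phi(nowMillis) < threshold;
    }
}
```

Instead of a fixed timeout, the threshold is expressed on the phi scale, so "dead enough" adapts to how regular the observed heartbeats have been.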
  30. [Diagram: Member Nodes, Heartbeat]
  31. [Diagram: Member Nodes, Heartbeat]
  32. Member States
      • Joining
      • Up
      • Leaving
      • Exiting
      • Down
      • Removed
      • Unreachable*
      [State diagram: the states joining, up, leaving, exiting, removed, unreachable* and down, connected by transitions labeled join, leave, (leader action) and (fd*)]
  33. Leader
      • No SPOF
      • Can be any node
      • No handover involved
      • Deterministically recognized by all nodes
        – always the first member in the sorted membership ring
  34. Leader Duties
      • Shift members from
        – Joining to Up
        – Exiting to Removed
        – Up to Down (auto-downing) to Removed
      • Can only be performed on convergence
  35. Cluster Convergence
      • A Node sees that all other Nodes have seen this version of the gossip
      • Is always local to that Node
      • Unreachable Nodes block this
      • Mark Unreachable Nodes as Down to proceed
        – Manual Ops intervention
        – Automatic action
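The leader and convergence rules above can be sketched together in plain Java. This is a simplified model under assumptions, not the Akka cluster's code: members are plain strings ordered by natural string order, and the "seen", "unreachable" and "down" sets are hypothetical stand-ins for the gossip state a node holds locally.

```java
import java.util.Set;
import java.util.SortedSet;

/** Hypothetical model of leader determination and local gossip convergence. */
final class ConvergenceSketch {
    /** Every node computes the same leader: the first member in the sorted ring. */
    static String leader(SortedSet<String> members) {
        if (members.isEmpty()) throw new IllegalArgumentException("empty cluster");
        return members.first();
    }

    /** True if, as far as this node knows, all relevant members have seen this gossip version. */
    static boolean converged(Set<String> members, Set<String> seenBy,
                             Set<String> unreachable, Set<String> down) {
        for (String member : members) {
            if (down.contains(member)) continue;            // downed nodes no longer count
            if (unreachable.contains(member)) return false; // unreachable nodes block convergence
            if (!seenBy.contains(member)) return false;     // someone has not seen this version yet
        }
        return true;
    }
}
```

Because the leader is derived from the sorted membership alone, no election or handover is needed; the leader simply holds back its duties until `converged` is true on its own view of the cluster.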
  36. Cluster Metrics
      • Gossip based
      • Metrics about the Nodes in the Cluster
        – Load
        – CPU Usage
        – Processors
        – Heap Memory
          • Used, Committed, Max
  37. Cluster Roles
      • Assign roles to Nodes (named tags)
      • Nodes can have multiple roles
      • Restrict work to certain roles
      • Deterministically recognized role leader
  38. Cluster Events
      • Subscribe to be notified
      • Membership Changes
        – Up, Exited, Removed
      • Leader Changed
      • Metrics Changed
      • Role Leader Changed
      • Member Unreachable
  39. Cluster DeathWatch
      • Triggered by marking node «A» Down
        – Tell parents of their lost children on «A»
        – Kill all children of actors on «A»
        – Send Terminated for actors on «A»
  40. Building on The Cluster
  41. Load Balancing
      • Cluster aware routers
        – Round Robin Router
        – Consistent Hashing Router
        – Adaptive Load Balancing Router
          • Use Cluster Metrics to select target
          • CPU/Memory/Load
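As an illustration of the consistent hashing idea behind one of the routers listed above, here is a minimal hash ring in plain Java. It is a sketch under assumptions, not Akka's router: the class name, the use of CRC32 as the hash function and the `virtualNodes` parameter are all hypothetical choices.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical consistent hash ring that maps message keys to routee names. */
final class ConsistentHashRingSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRingSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places several virtual points per node on the ring to even out the load. */
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    /** Removes every ring point owned by the node. */
    void removeNode(String node) {
        ring.values().removeIf(owner -> owner.equals(node));
    }

    /** Routes a key to the owner of the first ring point at or after its hash. */
    String routeFor(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no routees");
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

The property that matters for a cluster is that removing a node only remaps the keys that node owned; keys routed to the remaining nodes keep their target.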
  42. Cluster Contributions/Patterns
      • Distributed Pub/Sub Mediator
        – Publish and Subscribe to message flows
      • Cluster Singleton
        – HA singleton actor instance within the cluster
      • Cluster Client
        – Let other systems connect to the cluster
  43. [Diagram: DistributedPubSubMediator, with Frontend, Master and two Mediators]
  44. [Diagram: DistributedPubSubMediator, with Frontend, Master and two Mediators; Put]
  45. [Diagram: DistributedPubSubMediator, with Frontend, Master and two Mediators; Send]
  46. [Diagram: DistributedPubSubMediator, with Frontend, Master and two Mediators; Send]
  47. [Diagram: ClusterSingleton, with two ClusterSingleton Managers, Master and Master (Standby)]
  48. [Diagram: ClusterSingleton, with two ClusterSingleton Managers, Master and Master (Standby)]
  49. [Diagram: ClusterSingleton, with two ClusterSingleton Managers and two Masters]
  50. [Diagram: ClusterClient & ClusterSingleton, with Master, Master (Standby), two Receptionists, two Mediators, two Workers and two Cluster Clients]
  51. Typesafe Activator Distributed Workers Cluster Template
      • http://typesafe.com/platform/getstarted
  52. What The Future Brings
  53. Gossip Optimizations
      • Several times faster Vector Clock comparison
      • Fewer Vector Clock comparisons
      • Gossip message size cut in half
      • Gossip message scrubbing
      • Lazy deserialization of Gossip messages
  54. Return from Unreachable
      • Unreachable to Reachable
      • Cluster is more resilient to fluctuations
      [State diagram: the states joining, up, leaving, exiting, removed, unreachable* and down, connected by transitions labeled join, leave, (leader action) and (fd*)]
  55. Rebuilt Routers
      • Rebuilt from the ground
      • Routing logic usable in Actors
      • Actor Selection as routees
      • Improved Cluster behavior
  56. Persistence
      • New module akka-persistence
      • Command sourcing & event sourcing
      • Based on the proven Eventsourced library
      • Migrate actors by persisting their state
  57. Resources
  58. Survey and Resources
      • Help Akka get better. Fill out the survey!
        – http://tinyurl.com/akka-survey
      • Akka Cluster Documentation
        – http://tinyurl.com/akka-cluster
      • Akka Cluster in Production Blog Post
        – Ryan Tanner
          • http://tinyurl.com/akka-at-conspire
  59. Coursera Course
      • Principles of Reactive Programming by Martin Odersky, Erik Meijer and Roland Kuhn
        – Starts 4th of November 2013
        – 7 weeks
        – Workload: 5-7 hours a week
        – Free as in free beer
      • https://www.coursera.org/course/reactive
  60. Björn Antonsson
      @bantonsson
      @akkateam
      bjorn.antonsson@typesafe.com
