Developing distributed applications
with Akka and Akka Cluster
What is Akka?
• A Scala and Java framework for scalability, fault-tolerance,
concurrency and remoting through actors.
• Inspired by Erlang OTP.
• Developed by Typesafe (now Lightbend): https://www.typesafe.com/.
Concurrency paradigms
• Shared state and locks
• Software Transactional Memory (STM)
• Message-Passing Concurrency (Actors)
• Dataflow Concurrency
• and more…
STM
Dataflow Concurrency
Actors
• Originate in a 1973 paper by Carl Hewitt
• Implemented in Erlang
• Encapsulate state and behavior
• Closer to the definition of OO than classes
• Implement Message-Passing Concurrency
• Share nothing
• Isolated lightweight processes
• Communicate through messages
• Asynchronous and non-blocking
Concurrency model
• No shared state – no synchronization
• Each actor has a mailbox (message queue)
• Non-blocking send
• Receive processes one message at a time
• Messages are immutable
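import akka.actor.Actor
case class User(name: String)
// Send is fire-and-forget; userActor is an ActorRef obtained elsewhere
userActor ! User("John Doe")
class UserActor extends Actor {
  def receive = {
    // Reply to the sender with a greeting
    case User(name) => sender() ! s"Hi $name"
  }
}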
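The mailbox idea can be sketched without Akka. This is a toy model only, not how Akka implements actors: `ToyActor`, `drain` and `MailboxDemo` are hypothetical names, and the caller drives processing by hand where Akka would use a dispatcher thread.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.collection.mutable.ListBuffer

// Toy actor: a handler plus a mailbox (message queue).
final class ToyActor[A <: AnyRef](handle: A => Unit) {
  private val mailbox = new ConcurrentLinkedQueue[A]()

  // Non-blocking send: enqueue the message and return immediately.
  def !(msg: A): Unit = { mailbox.offer(msg); () }

  // Handle queued messages one at a time, in arrival order.
  // Returns the number of messages handled.
  def drain(): Int = {
    @annotation.tailrec
    def loop(handled: Int): Int = Option(mailbox.poll()) match {
      case Some(msg) => handle(msg); loop(handled + 1)
      case None      => handled
    }
    loop(0)
  }
}

final case class User(name: String)

object MailboxDemo {
  def run(): List[String] = {
    val greetings = ListBuffer.empty[String]
    val userActor = new ToyActor[User]({ u => greetings += s"Hi ${u.name}"; () })
    userActor ! User("John Doe") // returns at once, nothing handled yet
    userActor ! User("Jane Roe")
    userActor.drain()            // both messages handled in order
    greetings.toList
  }
}
```

`MailboxDemo.run()` returns `List("Hi John Doe", "Hi Jane Roe")`: sends only enqueue, and the handler sees messages strictly one after another.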
Dispatchers
sample-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 2.0
    parallelism-max = 10
  }
  throughput = 100
}
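How the three parallelism settings combine, as a small worked calculation. This assumes the documented meaning of the fork-join-executor settings (thread count is ceil(available processors * factor), bounded by min and max); `ForkJoinSizing` is a hypothetical helper, not an Akka API.

```scala
object ForkJoinSizing {
  // ceil(cores * factor), then clamped into [min, max].
  def parallelism(cores: Int, min: Int, factor: Double, max: Int): Int =
    math.min(math.max(math.ceil(cores * factor).toInt, min), max)
}
```

With the values above, a 1-core machine gets 2 threads, a 4-core machine gets 8, and an 8-core machine is capped at 10. `throughput = 100` is a separate knob: the maximum number of messages an actor processes before the thread moves on to the next actor.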
Supervision and hierarchy
Routers
• Local
• Remote
• Various routing algorithms (round robin, random, consistent hashing, etc.)
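Consistent hashing is the least obvious of these algorithms, so here is a minimal sketch of the idea. It is not Akka's router implementation: `HashRing`, `routeeFor`, `without` and the virtual-node count are assumptions made for illustration.

```scala
import scala.collection.immutable.TreeMap
import scala.util.hashing.MurmurHash3

// Routees are placed on a ring of hash values; a key goes to the first
// routee found clockwise from the key's own hash.
final class HashRing private (points: TreeMap[Int, String]) {
  def routeeFor(key: String): String = {
    require(points.nonEmpty, "ring has no routees")
    val it = points.iteratorFrom(MurmurHash3.stringHash(key))
    // Wrap around to the smallest point when nothing lies clockwise.
    if (it.hasNext) it.next()._2 else points.head._2
  }

  // Removing a routee only removes its own points from the ring.
  def without(routee: String): HashRing =
    new HashRing(points.filter { case (_, r) => r != routee })
}

object HashRing {
  // Each routee gets several virtual nodes to spread load more evenly.
  def apply(routees: Seq[String], virtualNodes: Int): HashRing = {
    val points = for {
      r <- routees
      i <- 0 until virtualNodes
    } yield MurmurHash3.stringHash(s"$r#$i") -> r
    new HashRing(TreeMap.empty[Int, String] ++ points)
  }
}
```

The same key always lands on the same routee, and when a routee leaves, only the keys it owned move elsewhere. That stability is what makes this router a good fit for work keyed by URL.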
Let’s build a web crawler!
1. Fetch a page
2. Parse the page to get links
3. If the maximum crawl depth has been reached, finish
4. Otherwise, go to 1 for each parsed link
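The steps above as a sequential sketch, before any actors are involved. This is not the demo code: `CrawlSketch` is a hypothetical name, and `fetchLinks` is an assumed stand-in for "download the page and parse its links".

```scala
object CrawlSketch {
  // Returns every URL fetched, in fetch order (duplicates included).
  def crawl(
      url: String,
      depth: Int,
      maxDepth: Int,
      fetchLinks: String => Seq[String]
  ): Vector[String] = {
    val links = fetchLinks(url)         // steps 1 and 2: fetch and parse
    if (depth >= maxDepth) Vector(url)  // step 3: depth limit reached
    else                                // step 4: repeat for each link
      url +: links.toVector.flatMap(l => crawl(l, depth + 1, maxDepth, fetchLinks))
  }
}
```

On a tiny fake web where `a` links to `b` and `c`, and `b` links back to `a`, `crawl("a", 0, 2, ...)` returns `Vector("a", "b", "a", "c")`. The depth limit guarantees termination, but `a` is downloaded twice.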
Demo time
Going remote
• Everything is built on asynchronous message passing, which suits remoting well
• Akka remoting lets you work with remote actors as if they were in the same JVM
• You still need to handle serialization and potential network failures
Akka Cluster
• Membership design based on Amazon's Dynamo system
• Completely decentralized, uses a gossip protocol for membership and
failure detection
• Cluster-aware routers can balance tasks across the cluster
Cluster aware routers
actor {
  deployment {
    /crawlerService/crawlWorkers {
      router = consistent-hashing-group
      nr-of-instances = 100
      routees.paths = ["/user/crawlWorker"]
      cluster {
        enabled = on
        allow-local-routees = on
        use-role = backend
      }
    }
  }
  provider = "akka.cluster.ClusterActorRefProvider"
}
Demo time
Detecting cycles
• Need to detect link cycles to avoid needless downloads
• Distributed shared state?
• Akka CRDTs can help!
CRDTs
• Good performance and scalability at the cost of eventual consistency
• Two main classes: operation-based (CmRDTs) and state-based (CvRDTs)
CmRDTs
• Based on messages (operations)
• Require each operation to be delivered and processed exactly once (complex!)
• No need to transfer the whole state
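A minimal sketch of why exactly-once delivery matters for an operation-based type. `OpCounter` is a hypothetical toy counter, not an Akka data type.

```scala
object OpCounter {
  sealed trait Op
  final case class Add(n: Int) extends Op

  // Each replica applies the operations it receives to its local state.
  def applyOp(state: Int, op: Op): Int = op match {
    case Add(n) => state + n
  }

  // State of a replica after receiving the given operations in order.
  def replay(ops: Seq[Op]): Int = ops.foldLeft(0)(applyOp)
}
```

Because additions commute, two replicas that receive `Add(1), Add(5), Add(-2)` in different orders both end at 4. If `Add(5)` is delivered twice to one replica, that replica ends at 9 and the replicas diverge.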
CvRDTs
• Based on the object’s state
• Need a merge function that is commutative, associative, and idempotent
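A grow-only set is probably the simplest state-based example, and it fits the crawler's set of visited URLs. This `GSet` is a hypothetical toy written for illustration, not the Akka data type.

```scala
// Grow-only set: elements can be added but never removed.
final case class GSet[A](elements: Set[A]) {
  def add(a: A): GSet[A] = GSet(elements + a)
  def contains(a: A): Boolean = elements.contains(a)

  // Merge is set union: commutative, associative and idempotent,
  // so replicas can exchange state in any order, any number of times.
  def merge(that: GSet[A]): GSet[A] = GSet(elements union that.elements)
}

object GSet {
  def empty[A]: GSet[A] = GSet(Set.empty[A])
}
```

Two crawler nodes can each record visited URLs locally and merge whenever they hear from each other. A URL may still be fetched twice before the merge arrives, which is the price of eventual consistency.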
Demo time
References
• Akka documentation: http://akka.io/
• Good presentation on CRDTs: https://vimeo.com/43903960
• On Dynamo clustering: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Questions?
