Real World Akka Actor Recipes JavaOne 2013
Real-World Akka Actor Recipes
Björn Antonsson
@bantonsson
co-authored with
Jamie Allen and Patrik Nordwall
Tuesday, 1 October 13
Overview
• What is Akka?
• What are actors?
• Getting into the flow
• Getting your message across
• Tying it all together
What is Akka?
• Toolkit and runtime for reactive applications
• Write applications that are
– Concurrent
– Distributed
– Fault tolerant
– Event-driven
What are actors?
• Isolated lightweight event-based processes
• Share nothing
• Communicate through async messages
• Each actor has a mailbox (message queue)
• Location transparent (distributable)
• Supervision-based failure management
What is an actor good for?
• An island of sanity in a sea of concurrency
• Everything inside the actor is sequential
– Processes one message at a time
• Very lightweight
– Create millions
– Create short lived
• Inherently concurrent
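A minimal sketch of "one message at a time", written in plain Java rather than with Akka's API. `CounterActor` and its methods are hypothetical names; a single-thread executor stands in for the mailbox, so the private state needs no locks even with several concurrent senders.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Akka's API: a counter "actor" whose state
// is only ever touched by its single mailbox thread.
class CounterActor {
    // The executor's internal queue plays the role of the mailbox.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int count = 0; // private state, mutated by one thread only

    // Asynchronous send: enqueue the message and return immediately.
    void tell(int increment) {
        mailbox.execute(() -> count += increment);
    }

    // Stop accepting messages, drain the mailbox, then report the final state.
    int stopAndGet() {
        mailbox.shutdown();
        try {
            if (!mailbox.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("mailbox did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count;
    }

    // Four sender threads each send 1000 messages concurrently.
    static int demo() {
        CounterActor actor = new CounterActor();
        Thread[] senders = new Thread[4];
        for (int i = 0; i < senders.length; i++) {
            senders[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) actor.tell(1);
            });
            senders[i].start();
        }
        try {
            for (Thread t : senders) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return actor.stopAndGet();
    }
}
```

Calling `CounterActor.demo()` exercises the sketch: every increment is applied sequentially by the mailbox thread, so no update is lost.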
Actors compared to Objects
• Think of an Actor as an Object
• You can't peek inside it
• You don't call methods
– You send messages (asynchronously)
• You don't get return values
– You receive messages (asynchronously)
• The internal state is thread safe
So what's the catch?
• Really no catch
• A different programming paradigm
• All about tradeoffs
– Some things are easier, some harder
• Think different
Getting into the flow
• Why do you need flow control?
• How do you control the flow?
• Do you really need all messages?
Why do you need flow control?
• Function calls are blocking
• Message sends are asynchronous
• Possible problems
– Produce jobs too fast
– Many jobs need much CPU and/or Memory
– External resources have limits
– Unpredictable job patterns
Why do you need flow control?
• Free flow of messages can lead to
– Blocking on an external resource
– Actor mailboxes backing up
– Slow system
– Out of Memory
How do you control the flow?
• Push with rate limiting
– A fixed number of jobs per time unit
• Push with acknowledgment
– A fixed number of jobs can be in progress
– New jobs are started after old jobs finish
• Pull
– New jobs are pulled as old ones are completed
Push with rate limiting
• A timer sends ticks at fixed intervals
• On every tick the master gets new tokens
• When there are no tokens, jobs get queued
• When there are tokens, start queued jobs
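The token scheme above can be sketched in plain Java as follows. This is an illustration under assumptions, not the talk's code: `RateLimitedMaster`, `onTick`, and `onJob` are hypothetical names, the timer is replaced by explicit `onTick()` calls, and unused tokens are assumed not to carry over between ticks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of a rate-limiting master (not Akka's API).
class RateLimitedMaster {
    private final int tokensPerTick;
    private int tokens = 0; // no tokens until the first tick
    private final Queue<String> queued = new ArrayDeque<>();
    private final List<String> started = new ArrayList<>();

    RateLimitedMaster(int tokensPerTick) {
        this.tokensPerTick = tokensPerTick;
    }

    // A timer would call this at fixed intervals.
    void onTick() {
        tokens = tokensPerTick; // assumption: leftover tokens do not accumulate
        while (tokens > 0 && !queued.isEmpty()) {
            start(queued.poll()); // tokens available: start queued jobs
        }
    }

    // A new job arrives: start it if a token is left, otherwise queue it.
    void onJob(String job) {
        if (tokens > 0) {
            start(job);
        } else {
            queued.add(job);
        }
    }

    private void start(String job) {
        tokens--;
        started.add(job); // stand-in for handing the job to a worker
    }

    List<String> started() { return started; }
    int queuedCount() { return queued.size(); }
}
```

With `tokensPerTick` set to 2, at most two jobs start per tick regardless of how fast jobs arrive; the rest wait in the queue.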
Push with acknowledgement
• A fixed number of jobs are started
• Wait for ACK before starting more jobs
• Jobs that can't be started are queued
• When an ACK arrives, start a queued job
• To keep workers busy, send more than one job per worker
– Use a high water mark to stop sending and a low water mark to start sending again
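A possible shape for the high/low water mark logic, as a plain-Java sketch. `AckingMaster` and its method names are hypothetical, and the sketch counts jobs in flight across all workers rather than per worker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of push with acknowledgement (not Akka's API).
class AckingMaster {
    private final int lowWaterMark;
    private final int highWaterMark;
    private int inFlight = 0;       // jobs sent but not yet ACKed
    private boolean sending = true; // false between high and low water mark
    private final Queue<String> queued = new ArrayDeque<>();
    private final List<String> sent = new ArrayList<>();

    AckingMaster(int lowWaterMark, int highWaterMark) {
        if (lowWaterMark >= highWaterMark) {
            throw new IllegalArgumentException("low must be below high");
        }
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    // Every job goes through the queue so ordering is preserved.
    void onJob(String job) {
        queued.add(job);
        drain();
    }

    // A worker reports that one job has finished.
    void onAck() {
        if (inFlight > 0) inFlight--;
        if (inFlight <= lowWaterMark) sending = true; // resume sending
        drain();
    }

    private void drain() {
        while (sending && !queued.isEmpty()) {
            sent.add(queued.poll()); // stand-in for sending to a worker
            inFlight++;
            if (inFlight >= highWaterMark) sending = false; // stop sending
        }
    }

    List<String> sent() { return sent; }
    int inFlight() { return inFlight; }
    int queuedCount() { return queued.size(); }
}
```

The gap between the two marks means the master sends jobs in bursts instead of reacting to every single ACK, while workers still have a backlog to chew on.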
Pull
• Incoming jobs are queued
• Workers ask for jobs
• Jobs are handed out when available
• Workers don't do active polling
• Can lead to lag if jobs are small compared to the cost of getting a new job
– Use batching to counteract lag
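The pull pattern could be sketched like this in plain Java. `PullMaster`, `onWorkRequest`, and the string-based worker identifiers are hypothetical; the point is that an idle worker is remembered by the master, so it never has to poll.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of a pull-based master (not Akka's API).
class PullMaster {
    private final Queue<String> jobs = new ArrayDeque<>();
    private final Queue<String> idleWorkers = new ArrayDeque<>();
    private final List<String> assignments = new ArrayList<>();

    // Incoming job: hand it to a waiting worker, otherwise queue it.
    void onJob(String job) {
        String worker = idleWorkers.poll();
        if (worker != null) {
            assign(worker, job);
        } else {
            jobs.add(job);
        }
    }

    // A worker asks for work once; if none is available the request is
    // remembered, so the worker does not need to ask again.
    void onWorkRequest(String worker) {
        String job = jobs.poll();
        if (job != null) {
            assign(worker, job);
        } else {
            idleWorkers.add(worker);
        }
    }

    private void assign(String worker, String job) {
        assignments.add(worker + "<-" + job); // stand-in for sending the job
    }

    List<String> assignments() { return assignments; }
    int pendingJobs() { return jobs.size(); }
}
```

At any moment at most one of the two queues is non-empty: either jobs are waiting for workers or workers are waiting for jobs.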
Do you really need all messages?
• Group messages together
– Batching
• Discard/Aggregate messages
– Scrubbing
Batching
• Collect a number of messages before sending/processing them
– Flush after a predefined number of messages or amount of time
• Useful for things like
– Write behind
– Database bulk insert/update
– Heavyweight operations e.g. GUI rendering
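A small plain-Java sketch of batching by count or time. `Batcher` is a hypothetical class; the time limit is modelled by an explicit `onTimeout()` call that a timer would make, and the `sink` stands in for something like a bulk database insert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a batching stage (not Akka's API).
class Batcher<T> {
    private final int maxSize;
    private final Consumer<List<T>> sink; // e.g. a bulk insert
    private List<T> buffer = new ArrayList<>();

    Batcher(int maxSize, Consumer<List<T>> sink) {
        this.maxSize = maxSize;
        this.sink = sink;
    }

    // Collect messages; flush as soon as the batch is full.
    void onMessage(T message) {
        buffer.add(message);
        if (buffer.size() >= maxSize) flush();
    }

    // A timer would call this so a partial batch is not held back forever.
    void onTimeout() {
        flush();
    }

    private void flush() {
        if (buffer.isEmpty()) return; // nothing to send
        List<T> batch = buffer;
        buffer = new ArrayList<>();
        sink.accept(batch);
    }
}
```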
Scrubbing
• Discard or aggregate some messages
– Per predefined number of messages or amount of time
• Useful for things like
– Financial market data
– Statistics
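For the market data case, scrubbing might look like the following plain-Java sketch: only the latest price per symbol survives until the next tick. `PriceScrubber` and its methods are hypothetical names, and the tick is an explicit call rather than a real timer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a scrubbing stage (not Akka's API).
class PriceScrubber {
    private final Map<String, Double> latest = new LinkedHashMap<>();
    private int received = 0;

    // Newer quotes overwrite older ones for the same symbol.
    void onQuote(String symbol, double price) {
        received++;
        latest.put(symbol, price);
    }

    // On each tick, emit one value per symbol and start over.
    Map<String, Double> onTick() {
        Map<String, Double> snapshot = new LinkedHashMap<>(latest);
        latest.clear();
        return snapshot;
    }

    int received() { return received; }
}
```

Downstream consumers see at most one message per symbol per tick, no matter how many quotes arrived in between.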
Getting your message across
• When is a message delivered?
• The fallacy of guaranteed delivery
• What Akka guarantees
• Reliable messaging
When is a message delivered?
• Function calls block until done
• Message sends return immediately
• Which is the right point?
– Sent/Received Network?
– Enqueued/Dequeued Mailbox?
– Processed by Actor?
• Do ACKing at the business level
Guaranteed delivery
• From Enterprise Integration Patterns
• Messaging system uses a built-in store to persist messages
• ACK everywhere
– Producer to sender
– Sender to receiver
– Receiver to consumer
Lots of ACKs. What if I just...
• Use Durable Mailboxes?
– When is the message in the mailbox?
– No guarantees that it ever got there
– Still have to ACK to be certain
Lots of ACKs. What if I just...
• Use an External Durable Message Queue
– A SPOF/Bottleneck?
– When is the message in the message queue?
– The queue does ACKing internally
– No guarantees that it ever gets out
– Still have to ACK to be certain
Guaranteed delivery doesn't exist
• Things break
– Persistent store crashes
– Network fails
– Server goes down
• Design for failure and resilience
• Do ACKing at the business level
What Akka guarantees
• At most once delivery
– Message is only delivered once, if at all
– The weakest guarantee
• Ordered per actor sender-receiver pair
– Actor A sends messages to actor B
– If the messages are received by actor B, they arrive in the order actor A sent them
Other delivery guarantees
• At least once
– Message will eventually be delivered
– Can happen multiple times
• Exactly once
– Message will eventually be delivered
– Will only happen once
• Have to add these yourself
– They involve ACKing ;)
Reliable Messaging: At least once
• Send with acknowledge
– Keep sending until you get an ACK
• Receive with re-request
– When a message is missing, request it again
– Needs unique sequence numbers
• Requires
– Message store at sender (available/redundant)
Reliable Messaging: At least once
• Use a pipe of actors before ACKing
• Keep pipe free of side effects
– Same message might come several times
• ACK should be done at the business level
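Send-with-acknowledge can be sketched in plain Java as a sender that keeps every message until it is ACKed, plus a receiver that tolerates duplicates. `ReliableSender` and `IdempotentReceiver` are hypothetical names, the in-memory map stands in for the sender's message store, and the network is left to the caller.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of at-least-once sending (not Akka's API).
class ReliableSender {
    // Stand-in for the message store; a real one would be durable.
    private final Map<Long, String> unacked = new LinkedHashMap<>();
    private long nextId = 0;

    // Store the message under a unique id before it is transmitted.
    long send(String payload) {
        long id = nextId++;
        unacked.put(id, payload);
        return id;
    }

    // Everything not yet ACKed; a resend timer would transmit this each tick.
    Map<Long, String> pending() {
        return new LinkedHashMap<>(unacked);
    }

    // Only an ACK removes a message from the store.
    void onAck(long id) {
        unacked.remove(id);
    }
}

// Receiver side: the same message may arrive several times.
class IdempotentReceiver {
    private final Set<Long> seen = new HashSet<>();
    private final List<String> processed = new ArrayList<>();

    // Process each id once; always return the id so the caller can ACK it.
    long onMessage(long id, String payload) {
        if (seen.add(id)) {
            processed.add(payload); // the business-level effect happens once
        }
        return id;
    }

    List<String> processed() { return processed; }
}
```

If an ACK is lost, the sender retransmits and the receiver simply ACKs again without repeating the side effect.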
Receive with re-request
• When a message is missing, request it again
• Requires
– Uniquely identifiable sequence
– Message store at sender (available/redundant)
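A receiver that detects gaps by sequence number might look like this plain-Java sketch. `ReRequestingReceiver` is a hypothetical name; sequence numbers are assumed to start at 0 and increase by 1, and the returned list stands in for the re-request message sent back to the sender's store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of receive with re-request (not Akka's API).
class ReRequestingReceiver {
    private long nextExpected = 0;
    private final TreeMap<Long, String> outOfOrder = new TreeMap<>();
    private final List<String> delivered = new ArrayList<>();

    // Returns the sequence numbers to re-request (empty when there is no gap).
    List<Long> onMessage(long seq, String payload) {
        if (seq >= nextExpected) {
            outOfOrder.put(seq, payload); // older numbers are duplicates
        }
        // Deliver the contiguous run starting at nextExpected.
        while (outOfOrder.containsKey(nextExpected)) {
            delivered.add(outOfOrder.remove(nextExpected));
            nextExpected++;
        }
        // Anything still buffered sits behind at least one missing message.
        List<Long> missing = new ArrayList<>();
        if (!outOfOrder.isEmpty()) {
            long last = outOfOrder.lastKey();
            for (long s = nextExpected; s < last; s++) {
                if (!outOfOrder.containsKey(s)) missing.add(s);
            }
        }
        return missing;
    }

    List<String> delivered() { return delivered; }
}
```

Messages are handed on strictly in sequence order; a message that arrives early is buffered until the gap before it has been filled.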
Tying it all together
• Distributed Workers Example
– Front Ends receive requests
– Master receives work from Front Ends
– Workers pull work from Master
• Available as a Typesafe Activator template
– Zero configuration setup
– Code and tutorial
Tying it all together
• Further Goals
– Elastic addition/removal of front end nodes
– Elastic addition/removal of workers
– Thousands of workers
– Jobs should not be lost
Cluster Technologies/Patterns
• Distributed Pub/Sub Mediator
– Publish and Subscribe to message flows
• Cluster Singleton
– HA singleton actor instance within the cluster
• Cluster Client
– Let other systems connect to the cluster
Resources: Coursera Course
• Principles of Reactive Programming by Martin Odersky, Erik Meijer and Roland Kuhn
– Starts 4th of November 2013
– 7 weeks
– Workload: 5-7 hours a week
– Free as in free beer
• https://www.coursera.org/course/reactive