Adopting actors
An epic tail of loss and learning
Iain Hull
iain.hull@workday.com
@IainHull
http://workday.github.io
Workday
Growth
2013 2014 2015 2016
Cloud Master
Launch tasks Assign to agents
Service Growth
[Chart: millions of tasks per month, axis 0 to 20; legend: Print, Large, Small, Batch]
Why Akka?
Initial Observations
[Diagram: Parent, Config Child; messages: Snapshots, Changes]
Message flow:
Ensure messages follow a consistent path
Creation:
Assume actor is recovering from failure
(state machine)
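
A minimal sketch of the "assume recovery" idea, written in plain Python rather than Akka so that it stays self-contained. The class `ConfigChild`, the `Phase` enum, and the tuple message shapes are hypothetical names chosen for illustration. The child always starts in a recovering phase, holds back any change that arrives early, and only switches to normal processing once a snapshot has arrived. It assumes that held-back changes are newer than the snapshot, which may not be true in every system.

```python
from enum import Enum, auto


class Phase(Enum):
    RECOVERING = auto()
    RUNNING = auto()


class ConfigChild:
    """Hypothetical child actor: starts in RECOVERING whether it is new or restarted."""

    def __init__(self):
        self.phase = Phase.RECOVERING
        self.config = None
        self.stashed = []  # changes that arrived before the first snapshot

    def receive(self, message):
        kind, payload = message
        if self.phase is Phase.RECOVERING:
            if kind == "snapshot":
                self.config = dict(payload)
                self.phase = Phase.RUNNING
                # Replay held-back changes (assumed newer than the snapshot).
                pending, self.stashed = self.stashed, []
                for change in pending:
                    self.receive(change)
            else:
                self.stashed.append(message)
        elif kind == "change":
            key, value = payload
            self.config[key] = value
        elif kind == "snapshot":
            self.config = dict(payload)  # a later snapshot replaces the state


child = ConfigChild()
child.receive(("change", ("pool_size", 8)))  # arrives too early, is held back
child.receive(("snapshot", {"pool_size": 4, "region": "dub"}))
child.receive(("change", ("region", "ams")))
print(child.phase, child.config)
```

Because creation and restart share one code path, there is no separate "first start" behaviour to get wrong.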
Anti-patterns
God Class
Movie Star
[Diagram: Pool, Agent State, four Agents, Queue]
Movie Star
Too much state
• Hard to reason about
• Too many messages in flight
• Hard to recover
• Bad concurrency
Split Brain
[Diagram: Pool, Agent State, four Agents]
Duplicate state
Single source of truth
• Synchronizing state is hard
• Failure causes
– State out of sync
– Causes more failure
Split Brain
[Diagram: Pool, Agent State, four Agents, Task]
Passing responsibility
Seems simple at first
• Do not always know who is in control
• Both actors updating the same row
• Creates race conditions
Can you let it crash?
[Diagram: Pool, Agent State, four Agents]
Lessons
Test for resilience
• Chaos Marmoset
• Unit test recovery
• Destructive system test
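
A rough sketch of the Chaos Marmoset idea, based only on the speaker note that a base actor overrides the unhandled method so that special messages cause failures or delays. This is plain Python, not the team's Akka code. `ChaosBase`, `ChaosFail`, `ChaosDelay`, `InjectedFailure`, and the toy `supervise` loop are all hypothetical names.

```python
import time


class ChaosFail:
    """Hypothetical chaos message: make the receiving actor crash."""


class ChaosDelay:
    """Hypothetical chaos message: make the receiving actor stall."""

    def __init__(self, seconds):
        self.seconds = seconds


class InjectedFailure(RuntimeError):
    pass


class ChaosBase:
    def receive(self, message):
        if not self.handle(message):
            self.unhandled(message)

    def handle(self, message):
        return False  # subclasses return True for messages they understand

    def unhandled(self, message):
        # Chaos messages are never handled by real actors, so they land here.
        if isinstance(message, ChaosFail):
            raise InjectedFailure("chaos: injected failure")
        if isinstance(message, ChaosDelay):
            time.sleep(message.seconds)


class Counter(ChaosBase):
    def __init__(self):
        self.count = 0

    def handle(self, message):
        if message == "increment":
            self.count += 1
            return True
        return False


def supervise(factory, messages):
    """Toy supervisor: replace the actor with a fresh one when it crashes."""
    actor, restarts = factory(), 0
    for message in messages:
        try:
            actor.receive(message)
        except InjectedFailure:
            actor, restarts = factory(), restarts + 1
    return actor, restarts


actor, restarts = supervise(
    Counter,
    ["increment", "increment", ChaosFail(), "increment", ChaosDelay(0.01)],
)
print(restarts, actor.count)
```

The two increments before the crash are lost on restart, which is exactly the kind of gap a unit test of recovery is meant to expose.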
Stateless
Enterprise idioms do not apply
Sovereignty
One actor
• One row
• One shard
• One table
Otherwise failure is hard to handle
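
One way to picture sovereignty, as a sketch in plain Python under the assumption that each agent's row has exactly one owning actor. `AgentRow` and `Registry` are hypothetical names. Every command for a row is routed through its single owner, so a conflicting assignment is rejected in one place instead of racing between two writers.

```python
class AgentRow:
    """Hypothetical actor that is the sole owner of one agent's row."""

    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.task = None

    def receive(self, message):
        kind, task = message
        if kind == "assign":
            if self.task is not None:
                return "rejected"  # the owner sees the conflict directly
            self.task = task
            return "assigned"
        if kind == "release":
            self.task = None
            return "released"
        return "unhandled"


class Registry:
    """Hands out exactly one owner per agent id."""

    def __init__(self):
        self._owners = {}

    def owner_of(self, agent_id):
        if agent_id not in self._owners:
            self._owners[agent_id] = AgentRow(agent_id)
        return self._owners[agent_id]


registry = Registry()
first = registry.owner_of("agent-7").receive(("assign", "print-1"))
second = registry.owner_of("agent-7").receive(("assign", "print-2"))
other = registry.owner_of("agent-8").receive(("assign", "print-2"))
print(first, second, other)
```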
Atomicity
Actors
• Atomic receive method
• State not shared
• Comms via async messages
• Not nestable
Mutex
• Atomic scope
• State is shared
• Comms via mutable state
• Nestable (ACID)
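
The "atomic receive method" can be sketched without any actor library. In the plain Python below, the hypothetical `CounterActor` owns a mailbox and a single worker thread. Four sender threads post messages concurrently, yet `receive` needs no lock because only the actor's own thread ever touches `count`.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: one mailbox, one thread, private state."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self.count = 0  # written only by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def tell(self, message):
        # Asynchronous send: enqueue and return immediately.
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message == "stop":
                return
            self.receive(message)

    def receive(self, message):
        # Runs for one message at a time, so this read-modify-write is atomic.
        if message == "increment":
            current = self.count
            self.count = current + 1

    def stop(self):
        self.tell("stop")
        self._thread.join()


def send_many(actor, n):
    for _ in range(n):
        actor.tell("increment")


actor = CounterActor()
senders = [threading.Thread(target=send_many, args=(actor, 1000)) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()
actor.stop()
print(actor.count)
```

The atomic unit is a single `receive` call. Anything that spans two messages, or two actors, is not atomic, which is the "not nestable" point in the comparison above.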
Atomicity
Anything!!! Nothing
Actors Mutex
[Diagram: Pool, Agent State, four Agents]
Atomicity
Eventual consistency
Lessons
- Atomicity and Consistency
- Actor modeling ≠ Object modeling
- Test for Resilience not robustness
- Refactor Early

Editor's Notes

  1. Principal Engineer – Workday’s Grid Cloud Master team. – Who is Workday
  2. Finance and Human Capital Management – ERP Vendor – 100% in the cloud – all customers on a single version
  3. Fiscal 2016 total revenue of $1.16 billion, up 48% year over year. Over 5000 employees, over 500 employees in Dublin. 2016: Best Workplaces in Ireland, Great Place to Work Institute (#2 for large companies). 2016: 10 Best Large Workplaces in Tech, Fortune (#2).
  4. Provide elastic grid – other services. Reliable execution of background tasks or jobs – PDF printing to payroll. Cloud Master – Agents – Schedule and assign to Agents
  5. 5 pools of agents Different types of task, memory size, execution speed
  6. 5 data centers Secure Reliable Safe Isolated – fairness Scalable - Efficient
  7. This talk is about the lessons I learned migrating a multithreaded Java server application to Akka. To support this growth we need to move to stateful services – Why
  8. Actor model of concurrency: safer (no deadlocks), easier to reason about, easier to test, better distribution, easier scalability. Then Scala because of Akka – key selling point
  9. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  10. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  11. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  12. Everyone knows about the God class – threading and mutexes make this worse
  13. Some are big - Marlon Brando – some are small Robert Downey Junior - me Even when small - entourage
  14. AgentPoolActor - Responsible for – Agent actors – Queue of tasks – and their assignments Decomposed into separate classes and traits - Still one actor with an entourage
  15. Also drives more bad decisions
  16. AgentPoolActor and AgentStateActor External DB changes – sending notifications – message loss – recovery Caused by movie star – Thought problem was stream of events were inconsistent – fix that State Inconsistent – failure – production outage
  17. … Beauty of split brains
  18. AgentPoolActor takes job from the Queue Assigns it to an Agent Agent might fail and put it back Pool or Agent might own the job - Cannot reliably find the job EG Cancel Job
  19. Who - When
  20. PoolActor has decided to assign task to an agent. Async message to StateActor – PoolActor must ensure agent not reused – before reply. What if reply times out? Crash – Can I guarantee consistency – what happens to the job?
  21. Chaos Marmoset base actor overrides the unhandled method Messages can cause failures or delays
  22. Horizontal scalability by pushing all state into the database Actors are about data – Actors are Stateful – Impedance Stateless services cannot update the same data as actor
  23. Autonomy – single responsibility If your actors write to the database
  24. We want agent assignments to be consistent
  25. Banking Transactions ACID? No - Suspense Account – Reconciliation – Compensating transactions Must handle failure cases
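
Notes 20 and 25 together suggest a pattern that can be sketched in plain Python: hold an assignment in a pending area (the analogue of a suspense account) until it is confirmed, and apply a compensating step if the confirmation never arrives. The `Pool` class and its method names are hypothetical, and the timeout itself is simulated by calling `compensate` directly.

```python
from collections import deque


class Pool:
    """Hypothetical pool that tracks unconfirmed assignments explicitly."""

    def __init__(self, tasks):
        self.queue = deque(tasks)
        self.pending = {}   # task -> agent, awaiting confirmation
        self.assigned = {}  # task -> agent, confirmed

    def assign(self, agent):
        # Step 1: move the task into the pending area, not straight to assigned.
        task = self.queue.popleft()
        self.pending[task] = agent
        return task

    def confirm(self, task):
        # Step 2a: the reply arrived, so the assignment becomes final.
        self.assigned[task] = self.pending.pop(task)

    def compensate(self, task):
        # Step 2b: the reply timed out, so undo by returning the task to the queue.
        self.pending.pop(task)
        self.queue.appendleft(task)


pool = Pool(["print-1", "print-2"])

done = pool.assign("agent-a")
pool.confirm(done)

lost = pool.assign("agent-b")
pool.compensate(lost)  # simulated timeout on the reply

print(list(pool.queue), pool.pending, pool.assigned)
```

At every point a task is in exactly one of the three places, so a question such as "where is the job I need to cancel?" always has an answer.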