This document provides an overview of Cadence, an open source workflow platform. It discusses how Cadence can be used to build scalable and reliable distributed applications by providing common building blocks like activities, workflows, queues, and storage. Several case studies are presented that demonstrate how real-world problems from domains like ridesharing and customer loyalty programs can be modeled as workflows in Cadence. Key advantages of Cadence include its ability to handle large volumes of work, ensure high availability even during failures, and unify different workflow definition languages under one platform.
24. Cadence Activities
● Any application specific code
● Potentially long lived (heartbeating)
● Can be implemented asynchronously
● Automatically retried according to a specified retry policy
● Routable to specific hosts or processes
● Dispatched through queues
● Per worker rate and parallelism limit
● Per queue rate limit
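The retry behaviour listed above can be illustrated with a minimal sketch. It is plain Java rather than the Cadence client API: the `RetryPolicy` fields (`maxAttempts`, `initialIntervalMs`, `backoffCoefficient`) and the `executeWithRetry` helper are hypothetical names chosen for illustration, and exponential backoff is assumed as the policy.

```java
import java.util.function.Supplier;

// Hypothetical names throughout; this is NOT the Cadence client API.
final class RetrySketch {

    /** Assumed policy shape: bounded attempts with exponential backoff. */
    static final class RetryPolicy {
        final int maxAttempts;
        final long initialIntervalMs;
        final double backoffCoefficient;

        RetryPolicy(int maxAttempts, long initialIntervalMs, double backoffCoefficient) {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be at least 1");
            }
            this.maxAttempts = maxAttempts;
            this.initialIntervalMs = initialIntervalMs;
            this.backoffCoefficient = backoffCoefficient;
        }
    }

    /** Runs the activity until it succeeds or the attempts are used up. */
    static <T> T executeWithRetry(Supplier<T> activity, RetryPolicy policy) {
        long delayMs = policy.initialIntervalMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= policy.maxAttempts; attempt++) {
            try {
                return activity.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < policy.maxAttempts) {
                    sleep(delayMs);
                    // Grow the wait before the next attempt.
                    delayMs = (long) (delayMs * policy.backoffCoefficient);
                }
            }
        }
        throw last; // every attempt failed; surface the final error
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during backoff", ie);
        }
    }
}
```

In this sketch the caller owns the loop. The point of the slide is that the platform applies the policy, so application code contains only the activity body.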
26. Cadence Workflows
● Virtual Objects in Java or Go
● Transactional
● Orchestrate Activities
● React to external events
● Stateful including local variables and stack
● Queryable
● Potentially Long Lived
● Durable Timers
33. Case Study: Driver Rewards
● Driver signs up if they qualify
● Eligibility is checked every 30 days
● Participation is lost if the driver doesn’t meet the rewards requirements when checked
● Service listens on trip completion events to calculate average rating
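34. Components: PartnerService, Queue, Database
// Queue consumer: fold each completed trip into the driver's stored state.
void onMessage(Trip t) {
    State s = loadFromDb(t.driverId);
    s.addTrip(t);
    saveToDb(s);
}

// Timer handler: apply the eligibility result, reset the period, schedule the next check.
void onTimer(String driverId) {
    State s = loadFromDb(driverId);
    if (s.eligible) {
        activate(driverId);
    } else {
        deactivate(driverId);
    }
    s.reset();
    saveToDb(s);
    scheduleTimer();
}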
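For contrast with the queue-and-database handlers above, here is a rough sketch of the same logic written as a single stateful object, which is how the deck characterises workflows ("virtual objects" with local state). It is plain Java, not Cadence SDK code: the class name, the method names, and the `minAverageRating` threshold are assumptions made for illustration, and the persistence and 30-day timer that the platform would provide are simply left out.

```java
// Hypothetical sketch; names and the rating threshold are illustrative assumptions.
final class DriverRewardsSketch {
    private final double minAverageRating; // assumed eligibility rule
    private int tripCount;
    private double ratingSum;
    private boolean active;

    DriverRewardsSketch(double minAverageRating) {
        this.minAverageRating = minAverageRating;
    }

    /** Called for each trip completion event; state lives in fields, not a database row. */
    void onTripCompleted(double rating) {
        tripCount++;
        ratingSum += rating;
    }

    /** Called when the periodic eligibility timer fires; returns the new participation status. */
    boolean onEligibilityTimer() {
        double average = tripCount == 0 ? 0.0 : ratingSum / tripCount;
        active = average >= minAverageRating;
        // Start a fresh evaluation period.
        tripCount = 0;
        ratingSum = 0.0;
        return active;
    }

    boolean isActive() {
        return active;
    }
}
```

The explicit `loadFromDb` and `saveToDb` calls disappear because the object's fields are the state. In the sketch they live only in memory; durability is the part a workflow platform is meant to supply.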
43. Use Case: Uber Flow
● UI based workflows
● Graph execution engine
● Each edge has conditions attached
● Some state nodes are associated with actions
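The graph model in these bullets can be sketched in a few lines. This is an illustrative reading of the slide, not Uber Flow's actual data model: `Node`, `Edge`, `Execution`, and the use of plain strings as events are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical types illustrating "edges carry conditions, some nodes carry actions".
final class FlowGraphSketch {

    static final class Edge {
        final Predicate<String> condition;
        final Node target;

        Edge(Predicate<String> condition, Node target) {
            this.condition = condition;
            this.target = target;
        }
    }

    static final class Node {
        final String name;
        final Runnable action; // null when the node has no action
        final List<Edge> edges = new ArrayList<>();

        Node(String name, Runnable action) {
            this.name = name;
            this.action = action;
        }

        void addEdge(Predicate<String> condition, Node target) {
            edges.add(new Edge(condition, target));
        }
    }

    /** One running instance of a graph: tracks the current node. */
    static final class Execution {
        Node current;

        Execution(Node start) {
            this.current = start;
        }

        /** Follows the first outgoing edge whose condition matches; returns true if the state moved. */
        boolean onEvent(String event) {
            for (Edge edge : current.edges) {
                if (edge.condition.test(event)) {
                    current = edge.target;
                    if (current.action != null) {
                        current.action.run();
                    }
                    return true;
                }
            }
            return false;
        }
    }
}
```

Taking the first matching edge is a design choice made for the sketch; the slides do not say how competing edges are resolved.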
52. Flow Scalability Requirements
● Potentially billions of entities
● ~ 50 workflows per entity
● ~ 10K external events per second
○ not counting duplicates
● Each event should be checked against all workflows for an entity
○ 50 * 10K = 500K condition evaluations per second
● On average, one workflow per event generates an action
○ ~ 10K actions per second
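53. Flow Original Implementation
Components: Services (several), Queue, Database
// For every incoming message, load each matching workflow's state from the database,
// evaluate the current node's conditions, and persist the state if an action ran.
void onMessage(Message t) {
    List<String> workflows = getWorkflowsFor(t);
    for (String workflow : workflows) {
        State s = loadFromDb(workflow, t.entityId);
        if (s.currentNode.evaluateConditions(t)) {
            s.currentNode.executeAction();
            saveToDb(s);
        }
    }
}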
55. Flow on Cadence
Components: Services (several), Queue
// Workflow states are kept in a field; this version has no explicit database loads or saves.
List<State> workflows;

void onMessage(Message t) {
    for (State workflow : workflows) {
        if (workflow.currentNode.evaluateConditions(t)) {
            workflow.currentNode.executeAction();
        }
    }
}
56. Flow on Cadence Advantages
● Reliable retries of condition evaluations
● Reliable retries of actions
● Database load is proportional to number of events per second.
○ Not (number of events) x (number of workflows per entity) which is 500K
● Cross datacenter failover
● Unified workflow engine
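The database-load comparison above is simple arithmetic, sketched here with the figures from the scalability requirements (about 10K events per second and about 50 workflows per entity). The method names are hypothetical, and the Cadence-side figure assumes a proportionality constant of 1 purely for illustration; the slides only state that load is proportional to the event rate.

```java
// Back-of-the-envelope comparison; names are illustrative, not from any API.
final class FlowLoadSketch {

    /** Original design: each event triggers one state load per workflow of the entity. */
    static long stateLoadsPerSecondOriginal(long eventsPerSecond, long workflowsPerEntity) {
        return eventsPerSecond * workflowsPerEntity;
    }

    /** Cadence-based design as described: load scales with the event rate alone (constant of 1 assumed). */
    static long stateLoadsPerSecondOnCadence(long eventsPerSecond) {
        return eventsPerSecond;
    }

    public static void main(String[] args) {
        long events = 10_000;
        long workflows = 50;
        System.out.println("original:   " + stateLoadsPerSecondOriginal(events, workflows));
        System.out.println("on Cadence: " + stateLoadsPerSecondOnCadence(events));
    }
}
```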
58. Flow Load Test
● 218,000 condition evaluations per second
● 4,400 actions per second
59. Cadence as a Platform
[Diagram: the Cadence Service with the Java SDK and Go SDK, plus workflow definition languages hosted on the platform: BPMN, AWS States Language, Airflow DAG, Uber Flow, and custom DSLs (DSL1, DSL2, DSL3). Each SDK and language runs its own applications, e.g. JavaApp1, GoApp1, BPMNApp1, SLApp1, AirflowApp1, FlowApp1, DSL1App1.]
60. More Uber Cadence Use Cases
● Freight load workflow
● Driver loyalty program
● Customer support workflows
● CI/CD/Deployment infrastructure
● End of month statement generation for each u4b customer
● Recalculate every hexagon on the city map every minute for every city
● Tip processing in microservices architecture
● Managing Flink and Spark Jobs in Mesos or Yarn
● Customer loyalty program
● Marketing email campaign management
● New datacenter provisioning
● Numerous other periodic jobs
63. Cadence Summary
● Higher level way of building distributed applications
○ Focus on business logic not plumbing
● Large scale
○ Billions of workflow instances
○ Tens of thousands of events per second
● High availability
○ Oblivious to node failures
○ Cross Datacenter Replication
● Unify all workflow solutions
○ Can support any existing workflow definition language
○ Perfect for DSL
● Open Source
○ http://cadenceworkflow.io
○ Apache 2.0 License
Editor's Notes
Other Issues to Consider
Timeouts
What if debit service lost the transaction?
What if settlement has a time limit?
Compensations
What if credit is impossible?
Changes to already running transaction
Tip amount updated
Cancellation
What if long running operation requires polling for the result?
Upgrading the sequence of steps
Operations
Many moving parts like DB, queue, etc.
Datacenter failures
Debugging
Change to business exception
Copy from https://engdocs.uberinternal.com/autobots/overview.html#product-details