Agenda
• Motivation for StateManagementics
• Expectations
• Scenario-1: VM Boot Abort & Resumption
• Scenario-2: Transaction Example
• Here Comes the TaskFlow
• Foundation Concepts
• Patterns
• Engines
• Persistence
• TaskFlow Demo
• Reference
• Questions
Motivation for StateManagementics
• Workflows without state management in place are hard
to follow, alter and recover
• Unreliable workflow and resource state
It is hard to maintain correctness and consistency of distributed
systems
• RPC boundaries are a constant balancing act between improving
scalability and decreasing consistency
• Race conditions occur more often than desired
• Manager Driver API boundary
• Application and state recovery is typically patched on after the fact
instead of built in from the ground up (e.g. periodic tasks)
• The capability to stop a service or application cleanly without manual
(or periodic) cleanup is crucial for features like live upgrades
Expectations
• Customers expect stability and consistency
API and service reliability
• Resource and/or state corruption (or people to manually fix these
problems) costs $$$
• Workflows and states that are easy to understand make it practical to
develop new workflows and alter existing ones
• Upgrades (not even live), just upgrades
Just say no to destroying the cloud in order to upgrade
Scenario-1: VM Boot Abort & Resumption
We can resume the run-instance process from the point
where the flow failed.
Scenario-2: Transaction Example
A flow resembles an SQL transaction.
START TRANSACTION
task1: call API-1 to do op-1 || ROLLBACK
task2: task1 finish, call API-2 to do op-2 || ROLLBACK
...perform other tasks...
COMMIT
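The transaction analogy above can be sketched in plain Python. This is not TaskFlow's API: the `Step` class and the `run_transaction` function are hypothetical names made up for illustration, and the failing "API-2" call is simulated. The sketch only shows the idea of running steps in order and undoing the completed ones, newest first, when a later step fails.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One unit of work plus the action that undoes it (hypothetical type)."""
    name: str
    execute: Callable[[], None]
    revert: Callable[[], None]


def run_transaction(steps: List[Step]) -> bool:
    """Run steps in order; on failure, revert completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.execute()
        except Exception:
            # ROLLBACK: undo everything that finished, newest first.
            for finished in reversed(done):
                finished.revert()
            return False
        done.append(step)
    return True  # COMMIT: every step succeeded


log: List[str] = []


def fail_op2() -> None:
    raise RuntimeError("API-2 unavailable")  # simulated failure


steps = [
    Step("task1", lambda: log.append("op-1 done"), lambda: log.append("op-1 undone")),
    Step("task2", fail_op2, lambda: log.append("op-2 undone")),
]
committed = run_transaction(steps)
```

In this run `task2` fails, so `task1` is reverted and nothing is committed. A real library has more to decide here, for example whether the failing step itself is also reverted; this sketch reverts only the steps that completed.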
Here Comes the TaskFlow
• A library that makes task execution easy, consistent, and reliable
• Provides a framework for executing flows of operations such that
those flows can be paused, resumed, or revoked at any time
• A paradigm and lightweight framework
• Community driven & well documented
• Library and code available through PyPI & GitHub
https://pypi.python.org/pypi/taskflow
https://github.com/openstack/taskflow
Foundation Concepts
• Code structure (your application's frame)
• Controlled execution
Who & what manages the overall execution
• Persistence (how you know what was executed)
• Work recovery
How you recover from failure/partial progress
• Tasks
Executes and reverts one action
Receives inputs and declares outputs
• Flows
Composes tasks (or subflows) into useful structures
Imposes a defined order (also called a pattern) on
the running of your tasks or subflows
• Jobs
The initial (and any derivative) set of tasks & flows required to
fulfil an action
Can be transferred to a worker for completion
Can be re-associated on worker failure (or timeout) for
resumption or undo/reversion
• Job Board
A system where jobs can be atomically posted, reposted,
claimed, and marked as completed
Backed by varying implementations
Message queue, ZooKeeper, database…
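The job board idea above (atomic post, claim, and completion, plus re-association after a worker fails) can be illustrated with a toy in-memory version. This is a sketch under stated assumptions, not TaskFlow's job board API: `InMemoryJobBoard` and its method names are hypothetical, and a `threading.Lock` stands in for the atomicity that a real backend such as a message queue, ZooKeeper, or a database would provide.

```python
import threading
from typing import Dict, Optional


class InMemoryJobBoard:
    """Toy job board: a lock makes post/claim/complete atomic (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owners: Dict[str, Optional[str]] = {}  # job name -> owner or None

    def post(self, job: str) -> None:
        with self._lock:
            self._owners.setdefault(job, None)

    def claim(self, job: str, worker: str) -> bool:
        """Return True only for the single worker that wins the claim."""
        with self._lock:
            if job in self._owners and self._owners[job] is None:
                self._owners[job] = worker
                return True
            return False

    def abandon(self, job: str, worker: str) -> None:
        """Release a claim (e.g. after worker failure) so it can be re-claimed."""
        with self._lock:
            if self._owners.get(job) == worker:
                self._owners[job] = None

    def complete(self, job: str, worker: str) -> bool:
        """Remove the job, but only if the caller currently owns it."""
        with self._lock:
            if self._owners.get(job) == worker:
                del self._owners[job]
                return True
            return False


board = InMemoryJobBoard()
board.post("boot-vm")
first = board.claim("boot-vm", "worker-a")
second = board.claim("boot-vm", "worker-b")   # loses: already claimed
board.abandon("boot-vm", "worker-a")          # simulate worker-a failing
third = board.claim("boot-vm", "worker-b")    # job re-associated
finished = board.complete("boot-vm", "worker-b")
```

The key design point is that checking ownership and changing it happen under one lock, so two workers can never both win the same claim.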
Patterns
Attributes by pattern (Linear, Unordered, Graph)
• Summary
Linear: Runs a list of tasks/flows, one after the other in a serial
manner
Unordered: Runs a set of tasks/flows, in any order
Graph: Runs a graph composed of tasks/flows in dependency-driven
ordering
• Constraints
Linear: Predecessor tasks' outputs must satisfy successor tasks'
inputs
Unordered: Disallows inter-task dependencies
Graph: Dependency driven, no cycles. A task's dependencies are
guaranteed to be satisfied before the task will run
• Use cases
Linear: Useful for structuring tasks/flows that are fairly simple,
where a task/flow follows a previous one
Unordered: Useful for tasks/flows that are fairly simple and are
typically embarrassingly parallel
Graph: Allows for a very high level of potential concurrency and is
useful for tasks which can be arranged in a directed acyclic graph
• Benefits
Linear: Simple
Unordered: Simple. Inherently concurrent
Graph: Allows for complex task ordering. Can be automatically made
concurrent by running disjoint tasks/flows in parallel
• Drawbacks
Linear: Serial, no potential for concurrency
Unordered: No dependencies allowed. Tracking and debugging are
harder due to the lack of reliable ordering
Graph: Complex. Cannot support cycles. Tracking and debugging are
harder due to graph dependency traversal
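The dependency-driven ordering of the graph pattern can be illustrated with Python's standard `graphlib` module (Python 3.9+). This is only a sketch of the ordering idea, not how TaskFlow implements its graph flow, and the task names are hypothetical. Each batch contains tasks whose dependencies are already satisfied, so the tasks within a batch could run in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names; each key maps to the tasks it depends on.
dependencies = {
    "create_volume": set(),
    "allocate_network": set(),
    "boot_vm": {"create_volume", "allocate_network"},
    "notify_user": {"boot_vm"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
    batches.append(ready)               # tasks in one batch could run in parallel
    sorter.done(*ready)
```

Seen this way, a linear pattern is the special case where every batch holds exactly one task, and an unordered pattern is the case where all tasks land in a single batch.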
Engines
• Runs flows (and their associated tasks) in a well-defined, reliable,
consistent and resumable manner
Follows well-defined state transitions
• Allows deployers/developers of a service that uses TaskFlow to
select the engine that suits their setup best
• Backed by varying implementations
Single-threaded
Multi-threaded via native or green threads
Distributed
• Support for other engines:
Message based (RPC), oslo.messaging
Celery
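The point that the same work can be handed to different engines can be sketched with the standard library. This is an illustration only, not TaskFlow's engine API: `run_serial`, `run_threaded`, and the `ENGINES` lookup table are hypothetical names, and the tasks are assumed to be independent of one another.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_serial(tasks: Dict[str, Callable[[], int]]) -> Dict[str, int]:
    """Single-threaded 'engine': run each task in turn."""
    return {name: fn() for name, fn in tasks.items()}


def run_threaded(tasks: Dict[str, Callable[[], int]], workers: int = 4) -> Dict[str, int]:
    """Multi-threaded 'engine': submit every task to a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: fut.result() for name, fut in futures.items()}


# The deployer picks an engine by name; the task definitions do not change.
ENGINES = {"serial": run_serial, "threaded": run_threaded}

# Hypothetical, mutually independent tasks (an "unordered" set of work).
tasks = {"a": lambda: 1 + 1, "b": lambda: 2 * 3, "c": lambda: 10 - 4}

serial_results = ENGINES["serial"](tasks)
threaded_results = ENGINES["threaded"](tasks)
```

Both engines produce the same results; only the execution strategy differs, which is what lets a deployment choose the engine that fits its setup.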
Persistence
• Saves task state/progress/results and flow state
• Allows for reconstruction and resumption of flows and associated
tasks
• Allows the user to view the play-by-play action history of flows
and associated tasks
Facilitates debugging of TaskFlow usage and integration
• Backed by varying implementations
File system, memory, database…
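How saved state enables resumption can be sketched with a JSON file as the persistence backend. This is a minimal illustration under stated assumptions, not TaskFlow's persistence API: `run_flow`, the step names, and the simulated boot failure are all hypothetical. Each step's result is saved as soon as it finishes, so a second run skips completed steps and continues from the point of failure.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Tuple


def run_flow(steps: List[Tuple[str, Callable[[], str]]], state_path: str) -> List[str]:
    """Run steps in order, saving each result; skip steps already recorded."""
    state: Dict[str, str] = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)
    executed = []
    for name, fn in steps:
        if name in state:
            continue  # finished in an earlier run
        state[name] = fn()
        executed.append(name)
        with open(state_path, "w") as fh:
            json.dump(state, fh)  # persist after every step
    return executed


attempts = {"boot": 0}


def boot() -> str:
    attempts["boot"] += 1
    if attempts["boot"] == 1:
        raise RuntimeError("hypervisor unreachable")  # simulated failure
    return "vm-1"


steps = [("allocate_ip", lambda: "10.0.0.5"), ("boot", boot), ("notify", lambda: "sent")]

state_path = os.path.join(tempfile.mkdtemp(), "flow_state.json")
try:
    run_flow(steps, state_path)
except RuntimeError:
    pass  # first run aborts at "boot"; "allocate_ip" is already saved

resumed = run_flow(steps, state_path)  # picks up at "boot"
```

The saved file also doubles as a simple history of what ran and what each step produced, which is the debugging benefit the bullets above describe.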