Moneytree - Data Aggregation with SWF

Ross Sharrott
Founder / CTO
rsharrott@moneytree.jp
@moneytreejp

Who Am I?

Ross Sharrott
Founder & CTO of Moneytree
American
10 Years in Japan (Feb 24!)
Previously Senior IT Manager

Love distributed architectures in the
cloud

What is Moneytree?

Internet banking is fragmented; not simple

Email is Simple

Gmail

Yahoo!

Work, etc.

For mail we use just ONE app!

Radically simplify your relationship with money

Data Aggregator

Our Goals:
Download accounts for 1M people every
day
Deliver new data in < 1 minute
2-3 developers
Sleep at night

First Idea

I know…I’ll use a queue!

Original Queue Based Process

Download
Data

Process
Statements

Store Data

1 Account / Many Statements
But we had a problem…

To determine a CC balance, we need
information from multiple statements
We needed a post statement process
Download
Data

Process
Statements

Post Process
Statements

Store Data +
Additional
Information

What We Needed

Download
Data

Process
Statements

• Statement 1
• Statement 2
• Many More

Post
Process

Queue Falls Down

I know…I’ll use a queue!
Queues are linear
Where are we in the process?
Logged in yet? Processing data?

What do you do when a job fails?
How do you relate jobs to one workflow?

Enter SWF

AWS Managed Service
Coordinates Workflows / Maintains
history
Provides multiple queues called Task
Lists

Handle decision points with Deciders
Perform tasks with Activity Workers

SWF World – A Restaurant

Decider – does nothing, makes decisions
Workflow Starter – takes orders

Activity Worker – makes food
Activity Worker – distributes food
SWF – maintains history, distributes
tasks

Activity Worker

Very similar to any queue worker
Handles a specific task

Polls a Task List to get new info
Reports activity success or failure
Puts results in a DB or on S3, etc.

Workflow Decider

Uses workflow history to make decisions
Schedules tasks

Handles rescheduling failures & timeouts
Reacts to external events (Signals)
Reacts to completion events

Moneytree’s Workflow

Statement
Download
Data

Post
Process
Statement

Moneytree’s SWF Architecture

1 Day of Work

Yesterday:
70,000 Workflows

Average Completion Time: 1 Minute
575,000 Decision Tasks
146,000 Statements Processed
70,000 Aggregation Tasks
70,000 Post Process Tasks

Data Aggregator

Our Goals:


1M people every day



Deliver new data in < 1 minute



2-3 developers



Sleep at night

How To Sleep At Night

Make Workers Scalable
Avoid SWF API Throttling
Expect Failures
Measure Everything


Separate concerns into individual
workers

Scale each worker process individually
Automate scaling your workers
Make workers idempotent
You can always try again

Avoid API Throttling
Don’t call GetWorkflowHistory
Stress test your implementation
Limits are by Region, not domain!
Get your limits raised
We hit limits on day 1

Use exponential retry
Have a circuit breaker

Expect Failures
Cloud = Failures
Dyno / EC2 instance restarts
Network & Service outages

Don’t wait for failed processes
Use aggressive timeouts
Use heartbeats for long processes

Monitor Everything
Use Performance Monitoring
10x increase in performance = 10x workers
New Relic & Cloudwatch

Centralize Logging
Cloud resources disappear w/their logs
Papertrail / Logentries

Log Everything & Setup Alerts
If you don’t log it, you can’t fix it

Sleep At Night

Avoid SWF API Throttling
Expect Failures
Measure Everything

Thank You!
Moneytree is hiring!
iOS Developers
API Developers / AWS Dev Ops
Technology Ninjas
Ross Sharrott Founder / CTO
rsharrott@moneytree.jp
@moneytreejp

Moneytree - Data Aggregation with SWF

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Moneytree - Data Aggregation with SWF

Similar to Moneytree - Data Aggregation with SWF (20)

Recently uploaded

Recently uploaded (20)

Moneytree - Data Aggregation with SWF

Editor's Notes