This document summarizes a presentation about implementing cross-region replication in Amazon DynamoDB. It includes:
1. An introduction to DynamoDB and replication patterns using DynamoDB Streams and the AWS Lambda service.
2. Details about Under Armour's use of DynamoDB cross-region replication to distribute user data across regions while complying with data residency requirements.
3. Their experience so far with the current solution and next steps to improve latency, reliability, and support for concurrent writes across regions.
2. What to expect from the session
DynamoDB introduction
1. SQL vs NoSQL refresher
2. Amazon DynamoDB recap
3. DynamoDB replication patterns
Implementing cross-region replication at Under Armour
1. What does single sign-on mean?
2. Background and problem context
3. Decision process that led to our current solution
4. Our experience so far
5. Next steps
6. Starting over
3. Amazon DynamoDB
Fast and consistent
Scales to any workload
Document or key-value
Fully managed NoSQL
Event-driven programming
Access control
5. Partitions are three-way replicated
[Diagram: Partition 1 … Partition N, each stored as three identical replicas (Replica 1, Replica 2, Replica 3). Every replica holds the same items, e.g. {Id=1, Name=Jim}, {Id=2, Name=Andy, Dept=Engg}, {Id=3, Name=Kim, Dept=Ops}.]
7. Replication use cases
• Globally distributed applications
• Lower-latency data access
• Traffic distribution
• Disaster recovery
• In-region and cross-region
8. Stream of updates to a table
Asynchronous
Exactly once
Strictly ordered
• Per item
Highly durable
• Scale with table
24-hour lifetime
Sub-second latency
DynamoDB Streams
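The "strictly ordered, per item" guarantee shapes how a stream consumer must work: records for the same key have to be applied in sequence, while distinct keys are independent. A minimal sketch of that discipline (the record shape and names here are illustrative, not from the talk):

```scala
// Illustrative stream record: a key plus a sequence number within that key.
case class StreamRecord(key: String, seq: Long, payload: String)

// Apply each key's records in sequence order; distinct keys are
// independent and could safely be processed in parallel.
def applyPerItemInOrder(records: Seq[StreamRecord])(apply: StreamRecord => Unit): Unit =
  records.groupBy(_.key).foreach { case (_, recs) =>
    recs.sortBy(_.seq).foreach(apply)
  }
```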
9. In-region replication
• Automatic replication across AZs within region (natively provided)
• Writes replicated continuously across 3 AZs, persisted to disk (SSD)
• Reads—strong or eventually consistent
• For data redundancy and protection
• DynamoDB Streams and AWS Lambda
• Streams of updates to a table
• DynamoDB triggers invoke a Lambda function to run your code
10. Open Source Cross-Region Replication Library
• Solution uses the Amazon DynamoDB Cross-Region Replication Library
• Leverages DynamoDB Streams to keep tables in sync across multiple regions in near real time
• Leverage the cross-region replication library in your applications
• Available in the GitHub repository at https://github.com/awslabs/dynamodb-cross-region-library
26. Background and problem context
• 1 manager/developer/tech lead
• 1 developer
• 1 site reliability engineer (me!)
• Fast startup
• Fast iteration
• Low overhead
• Reliable
188 million users. Sign on once. That’s it.
27. Background and problem context
STOP
Personally identifiable information (PII)…as used in US privacy law…is
information that can be used…to identify, contact, or locate a single person, or
to identify an individual in context.
https://en.wikipedia.org/wiki/Personally_identifiable_information
28. Background and problem context
*not to scale
• Store data where it belongs
• Don’t store data where it doesn’t belong
• Get data where and when it’s needed
1. Replicate PII-free pointers across regions
2. Follow pointers to locate user data
userId  homeRegion
42      US
[Diagram: the pointer table is replicated across regions serving US users, German users, and other EU users.]
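The two-step lookup described above can be sketched in Scala. The pointer table contents and the region-to-endpoint mapping here are illustrative assumptions, not Under Armour's actual schema:

```scala
// A PII-free pointer, replicated to every region: maps a user to the
// region that owns that user's PII.
case class Pointer(userId: Long, homeRegion: String)

// Illustrative homeRegion -> DynamoDB endpoint mapping (hypothetical values).
val endpoints: Map[String, String] = Map(
  "US" -> "dynamodb.us-east-1.amazonaws.com",
  "DE" -> "dynamodb.eu-central-1.amazonaws.com",
  "EU" -> "dynamodb.eu-west-1.amazonaws.com"
)

// Step 1: read the local replica of the pointer table (stubbed here).
def lookupPointer(userId: Long): Option[Pointer] =
  Map(42L -> Pointer(42L, "US")).get(userId)

// Step 2: follow the pointer to find where the user's PII must be read.
def piiEndpoint(userId: Long): Option[String] =
  for {
    p  <- lookupPointer(userId)
    ep <- endpoints.get(p.homeRegion)
  } yield ep
```

With the stubbed data, `piiEndpoint(42L)` resolves user 42 to the us-east-1 endpoint; an unknown user yields `None`.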
31. Decision process—AWS CloudFormation
*This solution has now been deprecated.
• CloudFormation
• Amazon EC2 Container Service
• Tuning containers based on throughput
• Possible to wedge the whole thing if you go full chaos monkey
• No custom replication logic
Struggles
32. Decision process
Google: “dynamodb cross region replication.” Click first result:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.CrossRegionRepl.html
Check out the Amazon Kinesis Client Library, plus the DynamoDB Streams adapter.
Profit. …well, sort of.
33. Decision process—Amazon Kinesis Client Library
• Requires running a process somewhere
• Troubleshooting, startup, rebalancing, and failovers
• State tracking DynamoDB table in your account
• Scaling processes for throughput
• Less is more
Struggles
34. Decision process
Google: “dynamodb cross region replication.” Click first result:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.CrossRegionRepl.html
Check out the Amazon Kinesis Client Library, plus the DynamoDB Streams adapter.
DynamoDB Streams + Lambda.
Profit. …yep!
35. Decision process—Lambda
• 24 hours to respond to problems
• Parallelizable with 1,024 threads
• Almost zero operational overhead
• Automatically scales with throughput
Strengths
38. Experience—reads
• Public DynamoDB endpoints + TLS
• Read anonymous data locally
• Read PII from user’s home region
[Diagram: OpenID servers in us-east-1 and eu-west-1; anonymous data is read locally, PII is read from the user’s home region.]
39. Experience—writes
• Write anonymous data to us-east-1
• Replicate anonymous data
• Write PII to user’s home region
• Public DynamoDB endpoints + TLS
[Diagram: OpenID servers in us-east-1 and eu-west-1; anonymous data is written to us-east-1 and replicated out, PII is written to the user’s home region.]
40. Experience—replication
import scala.collection.JavaConverters._
import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent
import com.typesafe.scalalogging.StrictLogging

class Main extends StrictLogging {
  // Lambda entry point: invoked with a batch of DynamoDB Streams records.
  def handler(event: DynamodbEvent, context: Context): Unit = {
    val conf = Main.loadConfFromContext(context)
    logger.info("Replicating to regions: %s".format(Main.readConfRegions(conf)))
    val clients = Main.buildClientsFromConf(conf)
    // Partition the batch: records to replicate vs. records to skip
    // (e.g. writes that were themselves produced by replication).
    val (records, skipped) = event.getRecords.asScala.toList.partition(Main.filterReplicatedUpdate)
    logger.info("Skipping %s records: %s".format(
      skipped.length, for (r <- skipped) yield (r.getEventSourceARN, r.getDynamodb.getKeys)))
    logger.info("Replicating %s records: %s".format(
      records.length, for (r <- records) yield (r.getEventSourceARN, r.getDynamodb.getKeys)))
    // Replicate in parallel to all destination regions.
    records.par.foreach(Main.replicate(_, clients))
  }
}
41. Experience—latency
Latency, slowest to fastest:
• Outside us-east-1, outside home region
• Outside us-east-1, inside home region
• Inside us-east-1, outside home region
• Inside us-east-1, inside home region
From us-east-1: ~50 ms to eu-west-1, ~150 ms to ap-northeast-1.
45. Multimaster—latency
Latency, slowest to fastest:
• Outside us-east-1, outside home region
• Outside us-east-1, inside home region
• Inside us-east-1, outside home region
• Inside us-east-1, inside home region
Better non-PII data locality squishes the slow cases toward the fast ones.
From us-east-1: ~50 ms to eu-west-1, ~150 ms to ap-northeast-1.
46. Multimaster—write ordering
Extra fields:
1. Timestamp
2. Write ID
3. Replication flag
userId       42
email, etc.  e@mail.com
timestamp    1476106431728
writeId      5c0fb0d3-c1fe-4526-a2cf-0678880952f9
replicateMe  true
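A hedged sketch of how those three extra fields could support write ordering: compare timestamps, break ties deterministically with the write ID so every region picks the same winner, and use the replication flag to stop an already-replicated write from being forwarded again. This is an illustration of the idea, not the production code:

```scala
// Hypothetical record carrying the extra write-ordering fields from the slide.
case class VersionedItem(
  userId: Long,
  email: String,
  timestamp: Long,     // wall-clock millis of the originating write
  writeId: String,     // random UUID, used as a deterministic tiebreaker
  replicateMe: Boolean // false once the replicator has applied this write
)

// Last-writer-wins: the later timestamp wins; equal timestamps fall back
// to comparing write IDs so all regions converge on the same value.
def resolve(a: VersionedItem, b: VersionedItem): VersionedItem =
  if (a.timestamp != b.timestamp) {
    if (a.timestamp > b.timestamp) a else b
  } else if (a.writeId > b.writeId) a else b

// Only locally originated writes should be forwarded to other regions.
def shouldReplicate(item: VersionedItem): Boolean = item.replicateMe
```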
53. Concurrent writes will happen!
The question is not how to work around or avoid them.
The question is how to recognize and resolve them.
54. Document schema
Concurrent writes require storage for multiple versions
of your data.
Either formally as a CRDT data structure or ad hoc for
eventual conflict resolution by a person or process.
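One ad hoc way to keep multiple versions for later resolution, as the slide suggests, is to store concurrent writes as siblings and merge them when a person or process can decide. A minimal multi-value-register sketch under that assumption, not a full CRDT implementation:

```scala
// Each write is tagged with its writer and a per-writer counter (a tiny
// version-vector entry), so a writer's newer writes replace its older ones.
case class Tagged[A](writer: String, counter: Long, value: A)

// A multi-value register: keeps every concurrent sibling until resolved.
case class MvReg[A](siblings: List[Tagged[A]]) {
  // A new write supersedes the same writer's earlier siblings but is
  // concurrent with (kept alongside) everyone else's.
  def write(w: Tagged[A]): MvReg[A] =
    MvReg(w :: siblings.filterNot(s => s.writer == w.writer && s.counter <= w.counter))

  // A conflict exists when more than one sibling survives.
  def hasConflict: Boolean = siblings.size > 1

  // Eventual resolution by an external policy (a person or a process).
  def resolve(pick: List[Tagged[A]] => Tagged[A]): MvReg[A] =
    MvReg(List(pick(siblings)))
}
```

For example, concurrent writes from two regions leave two siblings; a resolution policy then collapses them to one.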
55. Dotted version vectors
Thank you:
basho http://basho.com
Russell Brown https://github.com/russelldb
Nuno Preguiça
Carlos Baquero
Paulo Almeida
Victor Fonte
Ricardo Gonçalves
Efficient Causality Tracking in Distributed Storage Systems With Dotted Version Vectors.