This document summarizes an AWS webinar on optimizing the data tier for serverless web applications. The webinar covered anatomy of serverless apps, data tier options on AWS including DynamoDB, RDS, and ElastiCache. It discussed NoSQL vs SQL considerations and best practices for using AWS Lambda with each data store. Additional best practices covered caching, retries, and event ordering for serverless architectures.
2. What to Expect from the Session
• Anatomy of Serverless Apps
• Web applications
• Mobile backends
• Hierarchy of choice for data tier options on AWS
• Data tier for Serverless architectures
• SQL vs. NoSQL considerations
• AWS Lambda with Amazon DynamoDB
• AWS Lambda with Amazon RDS database
• AWS Lambda with Amazon ElastiCache
• Additional Best Practices
• Caching and retries
7. Data tier options on AWS
• Amazon DynamoDB: document and key-value store
• Amazon RDS: SQL database engines
• Amazon ElastiCache: in-memory key-value store
• Amazon Redshift: data warehouse
8. NoSQL vs. SQL for a new app: how to choose?
SQL
• Strong schema, complex relationships, transactions and joins
• Single/cluster system scaling
• Focus on ACID consistency and availability
• SQL tables will have faster query performance when running complex queries
• Structured data sources, large ecosystem of SQL toolsets
NoSQL
• Partial schema, easy reads and writes, simple data model
• Focus on performance and availability at scale
• Varied data sources, dynamic
• High data volume, denormalized
• Horizontal scaling
9. Amazon DynamoDB use cases
• Ad Tech: ad serving, retargeting, ID lookup, user profile management, session tracking, RTB
• IoT: tracking state, metadata, and readings from millions of devices; real-time notifications
• Gaming: recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: storing user profiles, session details, personalization settings, entity-specific metadata
10. AWS Lambda with DynamoDB
• Configuration
• No VPC configuration required
• IAM roles for access and authentication
• Leverage FGAC (Fine Grained Access Control) for
granular access to DynamoDB tables
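The FGAC bullet above can be illustrated with an IAM policy document. The following is a minimal sketch, assuming users are identified through Amazon Cognito and that the table's partition key holds the Cognito identity ID; the table ARN and the chosen actions are hypothetical.

```python
import json


def build_fgac_policy(table_arn):
    """Build an IAM policy that limits a caller to items whose partition key
    equals their own Cognito identity ID (assumed identification scheme)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Hypothetical action list; trim it to what the function needs.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": [table_arn],
                "Condition": {
                    # dynamodb:LeadingKeys restricts access by partition key value.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [
                            "${cognito-identity.amazonaws.com:sub}"
                        ]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN, for illustration only.
policy = build_fgac_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/UserProfiles"
)
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON would be attached to the role that the caller assumes, so each user can only read and write their own items.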
11. AWS Lambda with DynamoDB
• Performance
• Simple API model
• Invoke concurrent connections at scale
• Query consistency with volume growth
• Simply dial-up read/write capacity units for scaling
• Use DynamoDB for storing persistent data,
complement with ElastiCache for better read
performance
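To make "dial-up read/write capacity units" concrete, here is a rough sizing sketch for provisioned capacity. It assumes the standard DynamoDB unit definitions (one read unit covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads; one write unit covers one write per second of an item up to 1 KB). The function names and sample workload are illustrative only.

```python
import math


def read_capacity_units(reads_per_sec, item_size_bytes, strongly_consistent=True):
    """Estimate read capacity units for a steady read rate."""
    # Each read consumes one unit per 4 KB of item size, rounded up.
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = reads_per_sec * units_per_read
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        total = total / 2
    return math.ceil(total)


def write_capacity_units(writes_per_sec, item_size_bytes):
    """Estimate write capacity units for a steady write rate."""
    # Each write consumes one unit per 1 KB of item size, rounded up.
    return writes_per_sec * max(1, math.ceil(item_size_bytes / 1024))


# Hypothetical workload: 100 reads/sec of 6000-byte items, 50 writes/sec of 1500-byte items.
strong_rcu = read_capacity_units(100, 6000)
eventual_rcu = read_capacity_units(100, 6000, strongly_consistent=False)
wcu = write_capacity_units(50, 1500)
```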
12. RDS use cases
Applicable wherever you need relational databases
• eCommerce, gaming, websites, IT solutions, apps, reporting
13. Amazon Aurora: fast, available, and MySQL-compatible
[Diagram labels: SQL, Transactions, Caching; AZ 1, AZ 2, AZ 3; Amazon S3]
• 5x faster than MySQL on same hardware
• Sysbench: 100K writes/sec and 500K reads/sec
• Designed for 99.99% availability
• 6-way replicated storage across 3 AZs
• Scale to 64 TB and 15 read replicas
14. AWS Lambda with RDS
• VPC Configuration
• Lambda functions by default have access to internet
• Grant Lambda functions access to resources (RDS, EC2, ElastiCache) in your own VPC by adding:
▪ VPC subnet IDs and security group IDs to Lambda configuration
▪ Lambda function execution role (AWSLambdaVPCAccessExecutionRole)
▪ Security group inbound rules on VPC resources should allow appropriate ports for the subnet
• Allows access to peered VPCs, VPN endpoints, and private S3 endpoints
• VPC configuration is optional; add it only when the function needs to access VPC resources
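The subnet and security group settings above map onto the `VpcConfig` parameter of the Lambda API. The sketch below only builds the request arguments; the function name and resource IDs are hypothetical, and the actual boto3 call is shown as a comment because it needs AWS credentials.

```python
def build_vpc_update(function_name, subnet_ids, security_group_ids):
    """Build keyword arguments for attaching a Lambda function to a VPC."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return {
        "FunctionName": function_name,
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


# Hypothetical identifiers, for illustration only.
kwargs = build_vpc_update(
    "orders-api",
    ["subnet-0aaa1111", "subnet-0bbb2222"],
    ["sg-0ccc3333"],
)

# With boto3 installed and credentials configured, the update would be:
#   import boto3
#   boto3.client("lambda").update_function_configuration(**kwargs)
```

Listing subnets in more than one Availability Zone is a common choice so the function keeps working if one zone is unavailable.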
15. AWS Lambda with RDS
• VPC Configuration
• Functions configured for VPC access lose internet access
• This holds even when the subnet has “Auto-assign Public IP” enabled, the VPC has an Internet gateway, and the security group allows all outbound traffic
• If functions need access to both Internet and VPC, attach to private subnet
with Internet access through a NAT instance or Amazon VPC NAT gateway
• Ensure subnets have enough IPs for ENIs
• Avoid DNS resolution of public hostnames for your VPC when accessing
through Lambda function
16. AWS Lambda with RDS
• Performance
• RDS instance type important for high Lambda concurrency
• Concurrency control using a “Kinesis sandwich” (Lambda -> Kinesis -> Lambda -> Storage tier). Lets the backend be throttled at a different rate than the frontend (may increase latency)
• Instantiate database connections outside the scope of the handler for connection re-use; other options are language frameworks (Node.js knex, sequelize) or open source libraries such as Hibernate
• Faster query performance for complex queries
• Fine tune max_connections based on DB instance type
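The connection re-use bullet above can be sketched as follows. This is an illustration only: `sqlite3` with an in-memory database stands in for an RDS connection so the example runs anywhere, and the table and handler names are hypothetical.

```python
import sqlite3

# Created once during initialization, outside the handler, so every
# invocation served by the same container shares this connection.
# sqlite3 is a stand-in here; with RDS this would be a MySQL/PostgreSQL client.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, path TEXT)")


def handler(event, context):
    """Record the request path and return the running total of hits."""
    # No connect() call here: the module-level connection is re-used.
    conn.execute("INSERT INTO hits (path) VALUES (?)", (event.get("path", "/"),))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
    return {"hits": count}


first = handler({"path": "/a"}, None)
second = handler({"path": "/b"}, None)
```

Because each concurrent container holds its own connection, the number of open connections grows with Lambda concurrency, which is why `max_connections` and the instance type matter.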
17. AWS Lambda with Aurora on Amazon RDS and KMS
[Diagram: database authentication flow between AWS Lambda, AWS KMS (master keys for encrypt/decrypt), VPC NAT gateway, and RDS database; steps 1-4]
1. Encrypt db password file with KMS
2. Package encrypted db password file along with Lambda deployment package and upload to Lambda
3. When function is invoked, Lambda will connect with KMS through NAT gateway to decrypt password file
4. Lambda connects with database using extracted credentials to read/write records
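Step 3 of the flow above can be sketched in Python. In a real function `kms_client` would be `boto3.client("kms")`, whose `decrypt(CiphertextBlob=...)` call returns the plaintext under the `"Plaintext"` key. Here a local stub replaces KMS so the sketch runs without AWS access; the stub, the file contents, and the caching of the decrypted value are assumptions made for illustration.

```python
import os
import tempfile

_cached_password = None  # kept while the container is re-used


def load_db_password(kms_client, encrypted_path):
    """Read the packaged ciphertext and decrypt it through KMS, once per container."""
    global _cached_password
    if _cached_password is None:
        with open(encrypted_path, "rb") as f:
            ciphertext = f.read()
        response = kms_client.decrypt(CiphertextBlob=ciphertext)
        _cached_password = response["Plaintext"].decode("utf-8")
    return _cached_password


class StubKms:
    """Local stand-in for a KMS client; it reverses bytes instead of decrypting."""

    def __init__(self):
        self.calls = 0

    def decrypt(self, CiphertextBlob):
        self.calls += 1
        return {"Plaintext": CiphertextBlob[::-1]}


# Hypothetical "encrypted" file, written to a temp path for the demo.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"terces")
    path = f.name

stub = StubKms()
first = load_db_password(stub, path)
second = load_db_password(stub, path)  # served from the cached value
os.remove(path)
```

Caching the decrypted value avoids a KMS round trip through the NAT gateway on every invocation.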
18. ElastiCache use cases
• Caching layer for performance or cost optimization of an underlying database
• Storage of ephemeral key-value data
• High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists
19. AWS Lambda with ElastiCache
• Configuration
• Lambda configuration to access ElastiCache resources inside VPC
• Use IAM roles for access and authentication
• Leverage additional libraries (pymemcache, node discovery) within
your function
20. AWS Lambda with ElastiCache
• Performance
• Invoke concurrent connections at scale
• Use Redis pipeline to maximize number of operations per second
• Handle high throughput by scaling instance types
• ElastiCache serves data from memory, so it offers the lowest latency of the data stores covered here
• Choose write-through or lazy load based on the application's access pattern
• Memcached for read-heavy workloads
• Instead of updating both the cache and the persistent database on a write, invalidate the cache entry and let readers repopulate it
• Redis for write-heavy workloads
• Move data structures outside of the web apps to the data stores
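The lazy-load, invalidate-on-write, and write-through strategies above can be compared in a small sketch. Plain dictionaries stand in for the persistent database and for ElastiCache; the class and method names are hypothetical.

```python
class CachedStore:
    """Toy data tier: `db` stands in for the persistent store, `cache` for ElastiCache."""

    def __init__(self):
        self.db = {}
        self.cache = {}
        self.db_reads = 0  # counts how often reads fall through to the database

    def read(self, key):
        """Lazy load: serve from cache, fall back to the database on a miss."""
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # the reader repopulates the cache
        return value

    def write_invalidate(self, key, value):
        """Write to the database and drop the cache entry; the next read reloads it."""
        self.db[key] = value
        self.cache.pop(key, None)

    def write_through(self, key, value):
        """Write to the database and the cache together."""
        self.db[key] = value
        self.cache[key] = value


store = CachedStore()
store.write_invalidate("user:1", "alice")
a = store.read("user:1")   # miss, loaded from db
b = store.read("user:1")   # hit
store.write_through("user:1", "bob")
c = store.read("user:1")   # hit, no extra db read
```

Invalidation keeps writes cheap and pushes the reload cost onto the next reader; write-through keeps the cache warm at the price of a second write per update.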
21. AWS Lambda with API Gateway and Amazon ElastiCache
[Diagram: request flow between Amazon Cognito, Amazon API Gateway, AWS Lambda, and Amazon ElastiCache; steps 1-4]
1. Users authenticate via social identity providers or using Cognito
2. Amazon API Gateway receives incoming request with query string parameters
3. Lambda function gets invoked, does a lookup on the Redis cache
4. Lambda returns data based on the supplied criteria
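Steps 2 to 4 above can be sketched as a Lambda handler. The sketch assumes API Gateway's Lambda proxy integration, which passes query string parameters in `event["queryStringParameters"]` and expects a `statusCode`/`body` response. A dictionary stands in for the Redis cache, and the `region` parameter and key layout are hypothetical.

```python
import json

# Stand-in for the Redis cache; a real function would use a Redis client
# created outside the handler.
CACHE = {"leaderboard:eu": ["ana", "li", "sam"]}


def handler(event, context):
    """Look up a leaderboard by the `region` query string parameter."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    region = params.get("region")
    if not region:
        return {"statusCode": 400, "body": json.dumps({"error": "region is required"})}

    players = CACHE.get(f"leaderboard:{region}")
    if players is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}

    return {
        "statusCode": 200,
        "body": json.dumps({"region": region, "players": players}),
    }


ok = handler({"queryStringParameters": {"region": "eu"}}, None)
missing = handler({"queryStringParameters": {"region": "us"}}, None)
bad = handler({"queryStringParameters": None}, None)
```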
23. Closing out – additional best practices
• Local Caching
• Instantiate AWS clients and database connections outside the event handler for connection re-use
• Initialization code is executed once per function container, before the handler is called for the first time
• Connection re-use on frequent invocations will reduce latency
• Files stored in /tmp space (512 MB) persist across invocations while the container is re-used
• Schedule a function to keep it warm
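The /tmp bullet above can be sketched as a file-backed cache. This is an illustration under assumptions: `fetch_reference_data` is a hypothetical stand-in for a slow download or query, and the system temp directory is used so the example also runs outside Lambda, where the path would be under /tmp.

```python
import json
import os
import tempfile

STATS = {"loads": 0}  # counts how often the slow path runs

# In Lambda this resolves to a path under /tmp.
DEFAULT_PATH = os.path.join(tempfile.gettempdir(), "reference_data.json")


def fetch_reference_data():
    """Hypothetical slow call, e.g. a download or a database query."""
    STATS["loads"] += 1
    return {"currencies": ["USD", "EUR"]}


def get_reference_data(cache_path=DEFAULT_PATH):
    """Return reference data, re-using the copy on local disk when present."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    data = fetch_reference_data()
    with open(cache_path, "w") as f:
        json.dump(data, f)
    return data


# Demo with a fresh directory so the first call always takes the slow path.
demo_path = os.path.join(tempfile.mkdtemp(), "reference_data.json")
first = get_reference_data(demo_path)
second = get_reference_data(demo_path)
```

The file is only a cache: a new container starts with an empty /tmp, so the function must always be able to rebuild it.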
24. Closing out – additional best practices
• Retries and Event Ordering
• Lambda function called synchronously
• Using the AWS SDK? Set retry logic there
• Direct RESTful call to Lambda? The client controls retries entirely
• Ordering is up to the caller
• Amazon S3 or SNS triggers Lambda function, or asynchronous calls
• 3 tries, total, then event is discarded
• Loosely ordered
• Dead Letter Queue (new feature): let the function fail; once retries are exhausted, Lambda drops the event and sends it to an SQS queue or SNS topic for later reprocessing
• Lambda polls Amazon Kinesis or Amazon DynamoDB update stream
• Attempts to process batch of records until data expires from source stream, ordering
preserved
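For the synchronous case above, where the client controls retries, a minimal sketch of client-side retry with exponential backoff and jitter follows. The attempt count, delays, and the `flaky_invoke` stand-in are assumptions for illustration; the AWS SDKs ship their own configurable retry logic.

```python
import random
import time


def call_with_retries(invoke, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `invoke()` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff with jitter: 0.1s, 0.2s, 0.4s, ... plus a random part.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


attempts = []


def flaky_invoke():
    """Hypothetical stand-in for a synchronous Lambda call that fails twice."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient error")
    return "ok"


# Sleeping is disabled here so the demo finishes instantly.
result = call_with_retries(flaky_invoke, sleep=lambda seconds: None)
```

Retried calls can run more than once, so handlers that are invoked this way are usually written to be idempotent.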