Riga dev day: Lambda architecture at AWS

Is Lambda Architecture really a new normal for cloud native apps?

:~ whoami:
Antons Kranga
Full stack developer, ~15 years
Cloud Architect
DevOps evangelist
Innovation Center of Accenture Cloud Platform
Speaker
Marathon runner

What is Streaming?
We often want to deploy data models based on new data that
continuously arrives from multiple sources

Challenges
Users expect data to appear immediately after it arrives
Fault tolerance
Distributed data consistency
Scalability (how not to lose data when scaling down)

What is “λ”
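[Diagram: new data feeds both the Speed Layer and the Batch Layer. The Batch Layer holds the master data and runs map-reduce to produce views; the Speed Layer produces realtime views; queries go to the views in the Serving Layer and to the realtime views.]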

What is “λ” architecture
Batch Layer: Master data sets and pre-computed aggregations
• Slow data ingestion: intervals of minutes to days
• Append-only data sets that eventually supersede the data
captured in the speed layer
Speed Layer: High-throughput, near-real-time data ingestion
• Fast data ingestion: intervals of seconds
• Concurrent information processing
• Retrieval of the most recent information
Serving Layer: Provides query capability over the Batch Layer
• Low-latency ad-hoc queries
• May also provide access to speed layer views
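How a serving-layer query might combine the two kinds of views can be sketched as follows. This is only a minimal illustration, assuming both views are plain key-to-count maps; the names ServingLayerSketch and query are hypothetical and do not come from the talk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a serving layer answers a count query by merging
// a precomputed batch view with the speed layer's realtime view.
final class ServingLayerSketch {
    private final Map<String, Long> batchView;    // recomputed periodically from master data
    private final Map<String, Long> realtimeView; // counts seen since the last batch run

    ServingLayerSketch(Map<String, Long> batchView, Map<String, Long> realtimeView) {
        this.batchView = batchView;
        this.realtimeView = realtimeView;
    }

    // Total for a key = batch result + what the speed layer has seen since.
    long query(String key) {
        return batchView.getOrDefault(key, 0L) + realtimeView.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> batch = new HashMap<>();
        batch.put("page-a", 100L);
        Map<String, Long> realtime = new HashMap<>();
        realtime.put("page-a", 7L);
        realtime.put("page-b", 2L);

        ServingLayerSketch serving = new ServingLayerSketch(batch, realtime);
        System.out.println(serving.query("page-a")); // 107
        System.out.println(serving.query("page-b")); // 2
    }
}
```

The sketch assumes the realtime view only holds data not yet covered by the batch view; otherwise events would be counted twice.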

Transparent integration with
other “Cloud Native” services

AWS Blueprint for Lambda Architectures
https://d0.awsstatic.com/whitepapers/lambda-architecure-on-for-batch-aws.pdf
Published in July 2015
[Diagram labels: Amazon Kinesis; Amazon Kinesis-enabled app; S3 buckets; Amazon EMR; speed layer; batch layer; EMR as serving and merging layer]

[Diagram: producers, Kinesis, and consumers inside an AWS region spanning az1, az2, az3; other components shown: Lambda, S3 storage, Redshift, EC2 instance, EMR]

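Producer:
AmazonKinesis kinesis = ...
...
PutRecordRequest putRecord = new PutRecordRequest();
putRecord.setStreamName(streamName);
// A partition key is required by PutRecord; it selects the target shard.
// partitionKey is a placeholder for any String chosen by the producer.
putRecord.setPartitionKey(partitionKey);
putRecord.setData(ByteBuffer.wrap(bytes));
// null: no strict ordering relative to a previously written record
putRecord.setSequenceNumberForOrdering(null);
...
kinesis.putRecord(putRecord);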

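Consumer:
AmazonKinesisClient kinesisClient = ...
GetShardIteratorRequest req = ...
req.setStreamName("my-kinesis");
req.setShardIteratorType("TRIM_HORIZON");
...
// Obtain a shard iterator first, then use it to read records.
String shardIterator = kinesisClient.getShardIterator(req).getShardIterator();
GetRecordsRequest getRecords = new GetRecordsRequest();
getRecords.setShardIterator(shardIterator);
GetRecordsResult result = kinesisClient.getRecords(getRecords);
List<Record> records = result.getRecords();
for (Record record : records) {
    ... = record.getData();
}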
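Each record written to Kinesis carries a partition key, and Kinesis maps that key to a shard by taking its MD5 hash as a 128-bit integer and finding the shard whose hash key range contains it. The sketch below imitates that routing using only the JDK. It assumes the shards split the hash space into equal ranges, which need not hold for a real stream after resharding; ShardRoutingSketch and its methods are hypothetical names, not part of the AWS SDK.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of partition-key-to-shard routing (not AWS SDK code).
final class ShardRoutingSketch {
    // Size of the 128-bit hash key space: 2^128.
    private static final BigInteger KEY_SPACE = BigInteger.ONE.shiftLeft(128);

    // MD5 of the partition key, read as an unsigned 128-bit integer.
    static BigInteger hashKey(String partitionKey) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }

    // Index of the shard owning the key, ASSUMING shardCount equal hash ranges.
    static int shardIndex(String partitionKey, int shardCount) {
        BigInteger rangeSize = KEY_SPACE.divide(BigInteger.valueOf(shardCount));
        BigInteger index = hashKey(partitionKey).divide(rangeSize);
        // The last shard absorbs any remainder of the key space.
        return index.min(BigInteger.valueOf(shardCount - 1L)).intValue();
    }

    public static void main(String[] args) {
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(key + " -> shard " + shardIndex(key, 3));
        }
    }
}
```

The practical consequence is that records sharing a partition key always land on the same shard, so a poorly distributed key can overload one shard while others stay idle.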

Kinesis streams
What: Enables you to build near-real-time data processing
applications
Use cases:
• Real-time analytics
• Log file processing
• Reporting
Durability: data streams are replicated across 3 AZs

Kinesis streams
Cost Model:
Shard Hour:
• 5 read transactions per second
• 2 MB data read per second
• 1,000 write transactions (records) per second
• 1 MB data write per second
approx. 12.5 USD/Mo
Extended data retention
• Up to 7 days

Kinesis streams
Not good when:
• Small-scale throughput (less than 200 KB/sec)
• Long-term data storage (more than 24 h by default)

Lambda
What: Lambda lets you run functions without managing an actual
server
Use cases:
• Real-time stream processing
• Tiny ETL
• In a few cases can replace EC2
• Processing IaaS events
Runtimes: Java 8, Node.js, Python
Backed by: provides /tmp for ephemeral storage
Durability: no maintenance windows, 3 retries before failure

Lambda
Cost Model:
Requests per function:
• Compute is billed in GB-seconds
• Duration is billed in steps of 100 milliseconds
• $0.20 per million requests; $0.00001667 per GB-second

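The cost model above can be turned into a small estimate. This is a sketch under stated assumptions: it uses the prices listed on the slide, applies one average duration to every request, rounds that duration up to the next 100 ms step, and ignores any free tier. LambdaCostSketch and its parameters are hypothetical names.

```java
// Hypothetical cost estimator using the prices quoted on the slide.
final class LambdaCostSketch {
    static final double USD_PER_MILLION_REQUESTS = 0.20;
    static final double USD_PER_GB_SECOND = 0.00001667;

    // Duration rounded up to the next 100 ms billing step.
    static long billedMillis(long durationMillis) {
        return ((durationMillis + 99) / 100) * 100;
    }

    // Request charge plus compute charge for one month of invocations.
    static double monthlyCostUsd(long requests, long avgDurationMillis, int memoryMb) {
        double requestCost = requests / 1_000_000.0 * USD_PER_MILLION_REQUESTS;
        double gbSeconds = requests
                * (billedMillis(avgDurationMillis) / 1000.0)
                * (memoryMb / 1024.0);
        return requestCost + gbSeconds * USD_PER_GB_SECOND;
    }

    public static void main(String[] args) {
        // 1 million requests, 100 ms each, 1024 MB memory
        System.out.printf("%.3f USD%n", monthlyCostUsd(1_000_000, 100, 1024));
    }
}
```

Because of the 100 ms step, a function that runs for 101 ms is billed as 200 ms, so shaving a few milliseconds near a step boundary can halve the compute charge.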
Lambda
Not good when:
• Work exceeds the 300 sec timeout (cannot be changed)
• Forces developers to think stateless
• Highly dynamic web sites
• Competes with t2.nano ($4.75/month)

[Diagram: Lambda shown with S3 storage, SNS, Kinesis, … and consumers; the function is packaged as myApp.ZIP; runtimes: Java8, Python, NodeJS]

EMR
What: Managed Apache Hadoop service
Use cases:
• MapReduce data processing
• Large data ETL jobs
• Data movement
• Log processing and analytics
Backed by: a single EC2 instance or a cluster of EC2 instances
Durability: provided at the storage level by S3
See more:
https://media.amazonwebservices.com/AWS_Amazon_EMR_Best_Practices.pdf

EMR
Cost Model:
• Charges apply per EC2 instance size
• S3 storage charges apply ($0.03 per GB/Mo)

EMR
Not good when:
• Small to Medium data sets
• ACID (atomicity, consistency, isolation, durability) is required
• Competes with managed databases: DynamoDB, RDS (Aurora)

S3
What: Fully managed, highly durable persistent storage
Use cases:
• Static content web sites
• File storage (primarily for reading)
• Archive storage
Backed by: covered by the AWS S3 SLA
Durability: 99.999999999%; availability: 99.99%

S3
Cost Model: per GB/Mo
• Standard Storage: $0.03 per GB/Mo
• Infrequent Access Storage: $0.0125 per GB/Mo
• Glacier Storage: $0.007 per GB/Mo

S3
Not good when:
• Fast writes are needed (S3 writes can be slow)
• Large restores are needed (Glacier can restore up to 5% of storage per month)

Redshift
What: Petabyte-scale data warehouse as a managed service
Use cases:
• Data warehouse (OLAP)
• BI and ETL
• Storing large historical data sets
Backed by: AWS provides automatic data backup
Durability: provided at the storage level by S3
Scaling: start with a 160 GB node and scale from there

Redshift
Cost Model:
• Charges apply per instance size (EC2-style model)
• S3 storage charges apply ($0.03 per GB/Mo)

Redshift
Not good when:
• OLTP (On-line transaction processing)
• Unstructured data
• Blob storage

[Diagram: a producer writes to Kinesis (three shards). Speed layer: EC2, "Process Stream". Batch layer: S3 Bucket, MapReduce. Serving layer: View in DynamoDB. A Primer Lambda runs every hour.]

[Diagram: the same pipeline, with Lambdas running every hour ("computation per hour") and a timeline t divided into hours h0, h1, h2, h3, split between the batch layer and the speed layer.]
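The hourly split between batch and speed layer can be sketched as follows. This is one possible reading of the timeline, assuming the speed layer keeps event counts per hour bucket and discards a bucket once the hourly batch computation has covered that hour; HourlySpeedLayerSketch and its methods are hypothetical names.

```java
import java.util.TreeMap;

// Hypothetical sketch: speed layer state bucketed by hour (h0, h1, h2, ...).
final class HourlySpeedLayerSketch {
    private static final long MILLIS_PER_HOUR = 3_600_000L;

    // hour index -> event count; sorted so whole hours can be dropped at once
    private final TreeMap<Long, Long> countsByHour = new TreeMap<>();

    // Hour bucket an event timestamp falls into.
    static long hourOf(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_HOUR);
    }

    // Speed layer ingestion: count the event in its hour bucket.
    void record(long epochMillis) {
        countsByHour.merge(hourOf(epochMillis), 1L, Long::sum);
    }

    // Called once the hourly batch run has produced views up to and
    // including lastBatchedHour; those buckets are now superseded.
    void discardThrough(long lastBatchedHour) {
        countsByHour.headMap(lastBatchedHour, true).clear();
    }

    // Events the batch layer has not yet covered.
    long pendingCount() {
        long total = 0;
        for (long count : countsByHour.values()) {
            total += count;
        }
        return total;
    }
}
```

With this shape, the speed layer never holds much more than the current hour, plus the previous one while the batch job is still running.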

[Diagram: the same pipeline, extended with a Presentation Layer: a JS app backed by Lambda.]
