AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC402)

As serverless architectures become more popular, AWS customers need a framework of patterns to help them deploy their workloads without managing servers or operating systems. This session introduces and describes four re-usable serverless patterns for web apps, stream processing, batch processing, and automation. For each, we provide a TCO analysis and comparison with its server-based counterpart. We also discuss the considerations and nuances associated with each pattern and have customers share similar experiences. The target audience is architects, system operators, and anyone looking for a better understanding of how serverless architectures can help them save money and improve their agility.

  1. 1. Serverless Architectural Patterns and Best Practices (ARC402). Drew Dennis, AWS Solution Architect; Maitreya Ranganath, AWS Solution Architect; Ajoy Kumar, BMC R&D Architect. November 30, 2016. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. Agenda: Serverless characteristics and practices; 3-tier web application; Batch processing; Stream processing; Operations automation; BMC serverless use case; Wrap-up/Q&A
  3. 3. Spectrum of AWS offerings (“On EC2”, Managed, Serverless). Services shown: AWS Lambda, Amazon Kinesis, Amazon S3, Amazon API Gateway, Amazon SQS, Amazon DynamoDB, AWS IoT, Amazon EMR, Amazon ElastiCache, Amazon RDS, Amazon Redshift, Amazon Elasticsearch Service, Amazon EC2, Amazon Cognito, Amazon CloudWatch
  4. 4. Serverless patterns built with functions
     • Functions are the unit of deployment and scale
     • Scales per request; users cannot over- or under-provision
     • Never pay for idle
     • Skip the boring parts; skip the hard parts
  5. 5. Lambda considerations and best practices: AWS Lambda is stateless, so architect accordingly
     • Assume no affinity with underlying compute infrastructure
     • Local filesystem access and child processes may not extend beyond the lifetime of the Lambda request
  6. 6. Lambda considerations and best practices: Can your Lambda functions survive the cold?
     • Instantiate AWS clients and database clients outside the scope of the handler to take advantage of connection re-use.
     • Schedule with CloudWatch Events for warmth
     • ENIs for VPC support are attached during cold start
     Code on the slide (truncated as shown):
         # Executes during cold start
         import sys
         import logging
         import rds_config
         import pymysql
         rds_host = "rds-instance"
         db_name = rds_config.db_name
         try:
             conn = pymysql.connect( …
         except:
             logger.error("ERROR: …
         # Executes with each invocation
         def handler(event, context):
             with conn.cursor() as cur: …
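The split between cold-start code and per-invocation code on this slide can be tried locally. The following is a minimal sketch, not the presenters' code: it assumes `sqlite3` as a stand-in for the slide's `pymysql` connection, and `INIT_COUNT`, `_connect`, the `hits` table, and the `request_id` field are hypothetical names used only to make the reuse visible.

```python
import sqlite3

# Module scope: runs once per container, during cold start.
# sqlite3 stands in for the slide's pymysql connection (assumption).
INIT_COUNT = 0


def _connect():
    """Open the connection and count how often that happens."""
    global INIT_COUNT
    INIT_COUNT += 1
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (request_id TEXT)")
    return conn


conn = _connect()


def handler(event, context):
    # Runs on every invocation and reuses the module-level connection.
    conn.execute("INSERT INTO hits VALUES (?)", (event["request_id"],))
    (rows,) = conn.execute("SELECT COUNT(*) FROM hits").fetchone()
    return {"rows": rows, "connections_opened": INIT_COUNT}
```

Calling `handler` twice in the same process opens the connection only once, which is the behavior a warm container would show.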
  7. 7. Lambda considerations and best practices: How about a file system?
     • Don’t forget about /tmp (512 MB scratch space)
     Code on the slide (truncated as shown):
         exports.ffmpeg = function(event, context) {
           new ffmpeg('./thumb.MP4', function (err, video) {
             if (!err) {
               video.fnExtractFrameToJPG('/tmp', function (error, files) { … }
               …
               if (!error)
                 console.log(files);
               context.done();
               ...
  8. 8. Lambda considerations and best practices: Custom CloudWatch metrics
     • 40 KB per POST
     • Default account limit of 150 TPS
     • Consider aggregating with Kinesis
     Code on the slide (truncated as shown):
         def put_cstate(iid, state):
             response = cwclient.put_metric_data(
                 Namespace='AWSx/DirectConnect',
                 MetricData=[
                     {
                         'MetricName': 'ConnectionState',
                         'Dimensions': [
                             {'Name': 'ConnectionId', 'Value': iid},
                         ],
                         'Value': state,
                         'Unit': 'None'
                         …
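One way to stay under the per-POST size limit is to group metric datums before sending them. This is a rough sketch under stated assumptions: it uses the JSON-encoded size as a proxy for the real request size (the actual wire encoding differs), and `batch_metric_data` is a hypothetical helper, not part of any AWS SDK. A single datum larger than the limit still ends up in a batch of its own.

```python
import json

MAX_POST_BYTES = 40 * 1024  # "40 KB per POST" from the slide


def batch_metric_data(datums, max_bytes=MAX_POST_BYTES):
    """Group metric datums into batches whose JSON size stays under max_bytes."""
    batches, current, current_size = [], [], 2  # 2 bytes for "[]"
    for datum in datums:
        # +2 accounts for the ", " separator json.dumps puts between items.
        size = len(json.dumps(datum).encode("utf-8")) + 2
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 2
        current.append(datum)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch could then be passed as the `MetricData` argument of one `put_metric_data` call.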
  9. 9. Pattern 1: 3-Tier Web Application
  10. 10. Web application Data stored in Amazon DynamoDB Dynamic content in AWS Lambda Amazon API Gateway Browser Amazon CloudFront Amazon S3
  11. 11. Amazon API Gateway AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront • Bucket Policies • ACLs • OAI • Geo-Restriction • Signed Cookies • Signed URLs • DDOS IAM AuthZ IAM Serverless web app security • Throttling • Caching • Usage Plans Browser
  12. 12. Amazon API Gateway AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront • Bucket Policies • ACLs • OAI • Geo-Restriction • Signed Cookies • Signed URLs • DDOS IAM AuthZ IAM Serverless web app security • Throttling • Caching • Usage Plans Browser Amazon CloudFront • HTTPS • Disable Host Header Forwarding AWS WAF
  13. 13. Serverless web app monitoring. Amazon API Gateway AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront AWS WAF Browser • Access Logs in S3 Bucket • Access Logs in S3 Bucket • CloudWatch Metrics: https://aws.amazon.com/cloudfront/reporting/ • WebACL Testing • Total Requests • Allowed/Blocked Requests by ACL • Logs • Logs • Invocations • Invocation Errors • Duration • Throttled Invocations • Latency • Throughput • Throttled Reqs • Returned Bytes • Documentation • Latency • Count • Cache Hit/Miss • 4XX/5XX Errors • Streams • AWS CloudTrail • Custom CloudWatch Metrics & Alarms
  14. 14. Serverless web app lifecycle management AWS SAM (Serverless Application Model) - blog AWS Lambda Amazon API Gateway AWS CloudFormation Amazon S3 Amazon DynamoDB Package & Deploy Code/Packages/ Swagger Serverless Template Serverless Template w/ CodeUri package deploy CI/CD Tools
  15. 15. Amazon API Gateway best practices
     • Use mock integrations
     • Signed URL from API Gateway for large or binary file uploads to S3
     • Use request/response mapping templates for legacy apps and HTTP response codes
     • Asynchronous calls for Lambda > 30s
  16. 16. Root/ /{proxy+} ANY Your Node.js Express app Greedy variable, ANY method, proxy integration Simple yet very powerful: • Automatically scale to meet demand • Only pay for the requests you receive
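To illustrate what a single function behind a greedy `/{proxy+}` resource with the ANY method has to do, here is a minimal sketch in Python (the slide itself uses a Node.js Express app). The `/hello` route and the `name` query parameter are hypothetical; the event fields and the response shape follow the Lambda proxy integration format.

```python
import json


def handler(event, context):
    """Handle every method and path that API Gateway proxies to this function."""
    method = event.get("httpMethod")
    path = event.get("path", "/")
    if method == "GET" and path == "/hello":
        params = event.get("queryStringParameters") or {}
        return _response(200, {"message": "hello " + params.get("name", "world")})
    return _response(404, {"error": "no route for %s %s" % (method, path)})


def _response(status, body):
    # Proxy integration expects statusCode, headers and a string body.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

Routing inside the function keeps the API Gateway configuration to one resource, at the cost of moving path handling into application code.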
  17. 17. Pattern 2: Batch Processing
  18. 18. Characteristics
     • Large data sets
     • Periodic or scheduled tasks
     • Extract Transform Load (ETL) jobs
     • Usually non-interactive and long running
     • Many problems fit MapReduce programming model
  19. 19. Serverless batch processing AWS Lambda: Splitter Amazon S3 Object Amazon DynamoDB: Mapper Results AWS Lambda: Mappers …. …. AWS Lambda: Reducer Amazon S3 Results
  20. 20. Considerations and best practices
     • Cascade mapper functions
     • Lambda languages vs. SQL
     • Speed is directly proportional to the concurrent Lambda function limit
     • Use DynamoDB/ElastiCache/S3 for intermediate state of mapper functions
     • Lambda MapReduce Reference Architecture
  21. 21. Cost of serverless batch processing: 200 GB normalized Google Ngram data set. Serverless: • 1000 concurrent Lambda invocations • Processing time: 9 minutes • Cost: $7.06
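The splitter, mapper, and reducer roles from the batch pattern can be sketched as plain functions. This is an illustrative local version only: in the architecture on the slides each mapper would be a separate Lambda invocation and partial results would be held in DynamoDB, whereas here everything runs in one process. The word-count task and all function names are assumptions chosen for the example.

```python
from collections import Counter


def split(lines, chunk_size):
    """Splitter: cut the input into chunks, one per mapper invocation."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def mapper(chunk):
    """Mapper: produce partial word counts for one chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def reducer(partials):
    """Reducer: merge the partial results produced by all mappers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(lines, chunk_size=2):
    # Local stand-in for fan-out: mappers run one after another here,
    # while on Lambda they would run concurrently.
    return reducer(mapper(chunk) for chunk in split(lines, chunk_size))
```

Because mappers are independent, wall-clock time shrinks as more of them run at once, which is why the concurrency limit matters for speed.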
  22. 22. Pattern 3: Stream Processing
  23. 23. Stream processing characteristics • High ingest rate • Near real-time processing (low latency from ingest to process) • Spiky traffic (lots of devices with intermittent network connections) • Message durability • Message ordering
  24. 24. Serverless stream processing architecture Sensors Amazon Kinesis: Stream Lambda: Stream Processor S3: Final Aggregated Output Lambda: Periodic Dump to S3 CloudWatch Events: Trigger every 5 minutes S3: Intermediate Aggregated Data Lambda: Scheduled Dispatcher KPL: Producer
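A stream-processor function in this architecture receives batches of Kinesis records. The sketch below assumes each record carries a JSON payload with `device_id` and `measurement` fields (the field names are borrowed from the Kinesis Analytics query later in the deck; the JSON encoding is an assumption). It only aggregates in memory and leaves out the write of intermediate data to S3.

```python
import base64
import json
from collections import defaultdict


def handler(event, context):
    """Sum measurements per device for one batch of Kinesis records."""
    totals = defaultdict(float)
    for record in event["Records"]:
        # Lambda delivers each Kinesis payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        reading = json.loads(payload)
        totals[reading["device_id"]] += reading["measurement"]
    return dict(totals)
```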
  25. 25. Fan-out pattern • Number of Amazon Kinesis Streams shards corresponds to concurrent Lambda invocations • Trade higher throughput & lower latency vs. strict message ordering Sensors Amazon Kinesis: Stream Lambda: Dispatcher KPL: Producer Lambda: Processors Increase throughput, reduce processing latency
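Fan-out gives up strict ordering across the stream, but a dispatcher can still keep ordering per key. The following is one possible approach, not something the slide prescribes: records are assigned to processors by a stable hash of a key, so all records for one device go to the same processor in arrival order. `partition` and the `device_id` key are hypothetical names.

```python
import hashlib


def partition(records, num_processors, key="device_id"):
    """Assign records to processors so each key always lands on the same one."""
    groups = [[] for _ in range(num_processors)]
    for record in records:
        # A cryptographic digest gives a hash that is stable across processes,
        # unlike Python's built-in hash() for strings.
        digest = hashlib.sha256(str(record[key]).encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % num_processors
        groups[index].append(record)  # arrival order is kept within a group
    return groups
```

Each group could then be handed to one processor invocation; keys that are unevenly distributed will produce uneven groups.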
  26. 26. More about fan-out pattern
     • Keep up with peak shard capacity
        • 1000 records / second, OR
        • 1 MB / second
     • Consider parallel synchronous Lambda invocations
     • Rcoil for JS (https://github.com/sapessi/rcoil) can help
     • Dead letter queue to retry failed Lambda invocations
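The two per-shard limits above translate into a simple sizing calculation. This sketch takes the limits from the slide; treating 1 MB as 1,000,000 bytes is an assumption, and `required_shards` is a hypothetical helper.

```python
import math

RECORDS_PER_SHARD = 1000       # records per second, from the slide
BYTES_PER_SHARD = 1000 * 1000  # "1 MB / second"; decimal megabyte is an assumption


def required_shards(records_per_sec, bytes_per_sec):
    """Return the shard count needed to satisfy both per-shard limits."""
    by_count = math.ceil(records_per_sec / RECORDS_PER_SHARD)
    by_bytes = math.ceil(bytes_per_sec / BYTES_PER_SHARD)
    return max(1, by_count, by_bytes)
```

As a hypothetical example, 50,000 messages per second of 100 bytes each need 50 shards if every message is its own record (`required_shards(50000, 5000000)`), but only 5 shards if a producer packs 100 messages into each record (`required_shards(500, 5000000)`). This is the effect of batching messages on the producer side.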
  27. 27. Best practices • Tune batch size when Lambda is triggered by Amazon Kinesis Streams – reduce number of Lambda invocations • Tune memory setting for your Lambda function – shorten execution time • Use KPL to batch messages and saturate Amazon Kinesis Stream capacity
  28. 28. Monitoring Amazon Kinesis Stream metric GetRecords.IteratorAgeMilliseconds maximum
  29. 29. Amazon Kinesis Analytics. Diagram: Sensors; Amazon Kinesis Streams Producer; Amazon Kinesis: Stream; Amazon Kinesis Analytics: Window Aggregation; S3: Aggregated Output. Callouts on the query: Aggregation, Time Window.
         CREATE OR REPLACE PUMP "STREAM_PUMP" AS
           INSERT INTO "DESTINATION_SQL_STREAM"
           SELECT STREAM "device_id",
                  FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE) AS "round_ts",
                  SUM("measurement") AS "sample_sum",
                  COUNT(*) AS "sample_count"
           FROM "SOURCE_SQL_STREAM_001"
           GROUP BY "device_id", FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE);
  30. 30. Cost comparison - assumptions
     • Variable message rate over 6 hours
     • Costs extrapolated over 30 days
     Chart (messages/sec by hour): hour 1: 20,000; hour 2: 10,000; hour 3: 20,000; hour 4: 50,000; hour 5: 20,000; hour 6: 10,000
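The query on this slide groups rows by device and by the minute in which they arrived. The same one-minute window aggregation can be sketched in Python for comparison. This is an offline approximation under stated assumptions: it takes epoch-second timestamps in the input tuples instead of the stream's `ROWTIME`, and it processes a finished list rather than a continuous stream.

```python
from collections import defaultdict


def aggregate_by_minute(samples):
    """samples: iterable of (epoch_seconds, device_id, measurement) tuples."""
    windows = defaultdict(lambda: [0.0, 0])
    for ts, device_id, measurement in samples:
        round_ts = int(ts) - int(ts) % 60  # like FLOOR(ROWTIME TO MINUTE)
        bucket = windows[(device_id, round_ts)]
        bucket[0] += measurement
        bucket[1] += 1
    return {
        key: {"sample_sum": total, "sample_count": count}
        for key, (total, count) in windows.items()
    }
```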
  31. 31. Cost comparison
     Serverless • Amazon Kinesis Stream with 5 shards
         Service                           Monthly Cost
         Amazon Kinesis Streams            $ 58.04
         AWS Lambda                        $259.85
         Amazon S3 (Intermediate Files)    $ 84.40
         Amazon CloudWatch                 $  4.72
         Total                             $407.01
     Server-based on EC2 • Kafka cluster (3 x m3.large) • Zookeeper cluster (3 x m3.large) • Consumer (1 x c4.xlarge)
         Service                           Monthly Cost
         EC2 Kafka Cluster                 $292.08
         EC2 Zookeeper Cluster             $292.08
         EC2 Consumer                      $152.99
         Total On-Demand                   $737.15
         1-year All Upfront RI             $452.42
  32. 32. Compare related services
         Attribute | Amazon Kinesis Streams | Amazon SQS | Amazon SNS
         Message Durability | Up to retention period | Up to retention period | Retry delivery (depends on destination type)
         Maximum Retention Period | 7 days | 14 days | Up to retry delivery limit
         Message Ordering | Strict within shard | Standard: Best effort; FIFO: Strict within Message Group | None
         Delivery semantics | Multiple consumers per shard | Multiple readers per queue (but one message is only handled by one reader at a time) | Multiple subscribers per topic
         Scaling | By throughput using Shards | Automatic | Automatic
         Iterate over messages | Shard iterators | No | No
         Delivery Destination Types | Kinesis Consumers | SQS Readers | HTTP/S, Mobile Push, SMS, Email, SQS, Lambda
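The totals on this slide can be checked, and the relative savings derived, with a few lines of arithmetic. The figures below are copied from the slide; the variable and function names are hypothetical.

```python
serverless = {
    "Amazon Kinesis Streams": 58.04,
    "AWS Lambda": 259.85,
    "Amazon S3 (Intermediate Files)": 84.40,
    "Amazon CloudWatch": 4.72,
}
server_based = {
    "EC2 Kafka Cluster": 292.08,
    "EC2 Zookeeper Cluster": 292.08,
    "EC2 Consumer": 152.99,
}
RESERVED_TOTAL = 452.42  # 1-year All Upfront RI, from the slide


def savings_percent(cheaper, baseline):
    """Percentage by which `cheaper` undercuts `baseline`."""
    return (1 - cheaper / baseline) * 100


serverless_total = round(sum(serverless.values()), 2)
on_demand_total = round(sum(server_based.values()), 2)
vs_on_demand = savings_percent(serverless_total, on_demand_total)
vs_reserved = savings_percent(serverless_total, RESERVED_TOTAL)
```

The line items add up to the slide's totals of $407.01 and $737.15. On these figures the serverless design costs roughly 45% less than On-Demand EC2 and roughly 10% less than the reserved-instance price.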
  33. 33. Lambda architecture Data Sources Serving Layer Speed Layer AWS Lambda: Splitter Amazon S3 Object Amazon DynamoDB: Mapper Results Amazon S3 AWS Lambda: Mappers …. …. AWS Lambda: Reducer Amazon S3 Results Batch Layer Sensors Amazon Kinesis: Stream Lambda: Stream Processor S3: Final Aggregated Output Lambda: Periodic Dump to S3 CloudWatch Events: Trigger every 5 minutes S3: Intermediate Aggregated Data Lambda: Scheduled Dispatcher KPL: Producer
  34. 34. Pattern 4: Automation
  35. 35. Automation characteristics • Respond to alarms or events • Periodic jobs • Auditing and Notification • Extend AWS functionality …All while being Highly Available and Scalable
  36. 36. Automation: dynamic DNS for EC2 instances AWS Lambda: Update Route53 Amazon CloudWatch Events: Rule Triggered Amazon EC2 Instance State Changes Amazon DynamoDB: EC2 Instance Properties Amazon Route53: Private Hosted Zone Tag: CNAME = ‘xyz.example.com’ xyz.example.com A 10.2.0.134
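The core of the dynamic DNS function is turning an EC2 state-change event into a Route 53 record change. The sketch below builds only the change request and makes no AWS calls. It assumes the instance's CNAME tag and private IP have already been looked up (the slide keeps instance properties in DynamoDB); the `instance` dictionary keys, the 60-second TTL, and the choice to delete the record on stop or terminate are illustrative assumptions.

```python
def build_change_batch(event, instance):
    """Return a Route 53 ChangeBatch for an EC2 state-change event, or None."""
    state = event["detail"]["state"]
    if state == "running":
        action = "UPSERT"
    elif state in ("stopped", "terminated"):
        action = "DELETE"
    else:
        return None  # ignore transitional states such as "pending"
    return {
        "Changes": [{
            "Action": action,
            "ResourceRecordSet": {
                "Name": instance["cname_tag"],
                "Type": "A",
                "TTL": 60,
                "ResourceRecords": [{"Value": instance["private_ip"]}],
            },
        }]
    }
```

The returned dictionary has the shape expected as the `ChangeBatch` argument of the Route 53 `change_resource_record_sets` call. A delete has to match the existing record exactly, which is one reason to store the instance properties at launch time.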
  37. 37. Automation: image thumbnail creation from S3 AWS Lambda: Resize Images Users upload photos S3: Source Bucket S3: Destination Bucket Triggered on PUTs
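The first step of the thumbnail function is reading the S3 event to find out what was uploaded and where the result should go. This sketch covers only that step and leaves out the image resizing itself. The `thumbnails/` key prefix and the function name are hypothetical; the event fields follow the S3 event notification format.

```python
from urllib.parse import unquote_plus


def thumbnail_targets(event, destination_bucket):
    """Map each uploaded object in an S3 event to a destination bucket and key."""
    targets = []
    for record in event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        targets.append({
            "source": (source_bucket, key),
            "destination": (destination_bucket, "thumbnails/" + key),
        })
    return targets
```

Writing results to a separate destination bucket, as on the slide, keeps the function's own output from triggering it again.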
  38. 38. CapitalOne Cloud Custodian AWS Lambda: Policy & Compliance Rules Amazon CloudWatch Events: Rules Triggered AWS CloudTrail: Events Amazon SNS: Alert Notifications Amazon CloudWatch Logs: Logs Read more here: http://www.capitalone.io/cloud-custodian/docs/index.html
  39. 39. Best practices
     • Document how to disable event triggers for your automation when troubleshooting
     • Gracefully handle API throttling by retrying with an exponential back-off algorithm (AWS SDKs do this for you)
     • Publish custom metrics from your Lambda function that are meaningful for operations (e.g. number of EBS volumes snapshotted)
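For calls that are not already retried by an SDK, exponential back-off can be written in a few lines. This is a generic sketch: `ThrottledError` is a hypothetical stand-in for whatever throttling exception the API raises, and the random jitter is an addition that the slide does not mention. The `sleep` parameter exists so the waiting can be replaced in tests.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an API throttling error."""


def call_with_backoff(func, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying on ThrottledError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to base_delay * 2**attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```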
  40. 40. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Ajoy Kumar Architect, BMC Software Security and DevOps Automation Use Cases
  41. 41. What to Expect from the Session Our journey to cloud Security and DevOps automation use cases Serverless architecture deep dive Learnings
  42. 42. Our journey: In early 2016, BMC Software wanted to build a new cloud service that incorporated compliance and security checks into DevOps pipelines
     • Support rapid innovation and iterations
     • Had to scale in terms of tenants, data, throughput, and availability
     • Sophisticated business logic
     • Economically scalable
  43. 43. Our Security and DevOps Automation Use Cases • CI/CD integration of Compliance APIs into Pipeline releases • Interrogation of Mode 2 artifacts for governance and control: Docker, CFN, Cloud • Ease of use and portability of data • Any data / any policy
  44. 44. We thought of building this from the ground up on AWS. Diagram: 1. Nginx; 2. Zookeeper; 3. Vault; 4. Kafka + Storm; 5. Cassandra; 6. Elasticsearch; App; Indexer Service; Data Blob collectors; Ops; Security; Scale; Monitor; Dev. But faced these challenges:
     • Complexity of 6 clusters
     • Lack of Infra dev & ops skills
     • Time to build Infra
     • TCO
  45. 45. But then…we got told to do it in a month Amazon API Gateway Amazon Elasticsearch Service Ingest App Stream Resource Write Service Amazon Kinesis Indexer Service Amazon API Gateway Data Blob Data Blob Data Blob Data Blob Data Blobcollectors Stream Write Service App Manage Clusters Amazon DynamoDB s3 Simple but powerful • Time to value • Scalable • No Infra Ops • Lower cost • Unit 1 economics API App Services
  46. 46. Architectural patterns we used for Lambda API Gateway REST API methods Real-time stream processing Real-time DB triggers Scaling Auto, Fan-out Deployment AWS CloudFormation Monitoring Logs App Lambda pattern API App Services
  47. 47. But wait, there is more to serverless than Lambda Amazon API Gateway Amazon Kinesis Streams Amazon DynamoDB Amazon Elasticsearch Service Amazon CloudWatch Amazon Route 53 Amazon SNS
  48. 48. You still need to do Ops, there is no such thing as “NoOps” No infra ops, but you do need to care about… Is my app up and running? Are there high API errors? Why is my latency high? Are my DB queries efficient?
  49. 49. Our learnings: We love serverless architecture
     • Move fast, innovate with new “apps”
     • Focus on application logic, easy to write and deploy
     • No “InfraOps,” only “AppOps”
     • Cost savings
     • Scale without worry
     BMC Software's security and DevOps automation SaaS service will go GA in December 2016. To see more about BMC, visit us in booth #2344.
  50. 50. Related Sessions
     • ARC402 Repeat: Fri 10:30AM
     • ALX302 Build a Serverless Back End for Alexa: Thu 5PM
     • DEV308 Chalice: Serverless Python Framework: Fri 10:30AM
     • DCS205 Workshop: Building Serverless Bots: Thu 11AM, 2PM
     • DEV301 Amazon CloudWatch Logs and Lambda: Thu 11AM
     • BDM303 JustGiving Serverless Data Pipelines: Tue 2:30PM
     • CMP211 Getting Started with Serverless: Fri 9:30AM
     • DEV205 Monitoring with Lambda, Datadog: Tue 2:30PM
  51. 51. Thank you!
  52. 52. Remember to complete your evaluations!
