© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Adrian Hornsby, Cloud Architecture Evangelist
Serverless Architectural Patterns
@adhorn
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Session objectives
1. Understand Serverless Key Concepts.
2. Understand Event Processing Architecture.
3. Understand Operation Automation Architecture.
4. Understand Web Application Architecture.
5. Understand Data Processing Architecture.
1. Kinesis-based apps.
2. IoT-based apps.
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Serverless Key Concepts
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
No servers to provision
or manage
Scales with usage
Never pay for idle Availability and fault
tolerance built in
Serverless means…
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Spectrum of AWS offerings
AWS
Lambda
Amazon
Kinesis
Amazon
S3
Amazon API
Gateway
Amazon
SQS
Amazon
DynamoDB
AWS IoT
Amazon
EMR
Amazon
ElastiCache
Amazon
RDS
Amazon
Redshift
Amazon ES
Managed Serverless
Amazon EC2
Microsoft SQL
Server
“On EC2”
Amazon
Cognito
Amazon
CloudWatch
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Regional services
AZ1 AZ2 AZ3
Service XYZ
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Anatomy of a Lambda function
Handler() function
Function to be executed
upon invocation
Event object
Data sent during
Lambda Function
Invocation
Context object
Methods available to
interact with runtime
information (request ID,
log group, etc.)
def handler(event, context):
return {
"message": ”Hello World!",
"event": event
}
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Lambda execution model
Synchronous (push) Asynchronous (event) Stream-based
Amazon
API Gateway
AWS Lambda
function
Amazon
DynamoDB
Amazon
SNS
/api/hello
AWS Lambda
function
Amazon
S3
reqs
Amazon
Kinesis
changes
AWS Lambda
service
function
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Lambda Best Practices
• Minimize package size to necessities
• Separate the Lambda handler from core logic
• Use Environment Variables to modify operational
behavior
• Self-contain dependencies in your function package
• Leverage “Max Memory Used” to right-size your
functions
• Delete large unused functions (75GB limit)
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS X-Ray Integration with Serverless
• Lambda instruments incoming requests for
all supported languages
• Lambda runs the X-Ray daemon on all
languages with an SDK
var AWSXRay = require(‘aws-xray-sdk-core‘);
AWSXRay.middleware.setSamplingRules(‘sampling-rules.json’);
var AWS = AWSXRay.captureAWS(require(‘aws-sdk’));
S3Client = AWS.S3();
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
X-Ray Trace Example
Chalice
awslabs/aws-serverless-express
awslabs/aws-serverless-java-container
Serverless Frameworks
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Event Processing Architecture
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Event driven
A B CEvent A on B triggers C
Invocation
Lambda functions
Action
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Event-driven platform
S3 event
notifications
DynamoDB
Streams
Kinesis
events
Cognito
events
SNS
events
Custom
events
CloudTrail
events
LambdaDynamoDB
Kinesis S3
Any custom
Invoked in response to events
- Changes in data
- Changes in state
Redshift
SNS
Access any service,
including your own
Such as…
Lambda functions
CloudWatch
events
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Event-driven actions
Lambda:
Resize Images
Users upload photos
S3:
Source Bucket
S3:
Destination Bucket
Triggered on
PUTs
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Source data
Source data
Source data
Source data
S3 Data Staging
Layer
/data/source-raw
S3 Data Staging
Layer
/data/source-validated
Lambda: Input
Validation and
Conversion layer
Lambda: Input
Tracking layer
State Management
Store
Event-driven controls
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Step Functions:
Orchestrate a Serverless processing
workflow using AWS Lambda
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Operation Automation Architecture
https://github.com/awslabs/lambda-refarch-imagerecognition
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Automation characteristics
• Periodic jobs
• Event triggered workflows
• Enforce security policies
• Audit and notification
• Respond to alarms
• Extend AWS functionality
… All while being Highly Available, Scalable and Auditable
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Auto tagging resources as they start
AWS Lambda:
Update Tag
Amazon CloudWatch Events:
Rule Triggered
Amazon EC2 Instance
State Changes
Amazon DynamoDB:
EC2 Instance Properties
Tag: N/A
Amazon EC2 Instance
State Changes
Tag:
Owner=userName
PrincipalID=aws:userid
• AMI
• Instances
• Snapshot
• Volume
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
CapitalOne Cloud Custodian
AWS Lambda:
Policy & Compliance Rules
Amazon CloudWatch Events:
Rules Triggered
AWS CloudTrail:
Events
Amazon SNS:
Alert Notifications
Amazon CloudWatch Logs:
Logs
Read more here: http://www.capitalone.io/cloud-custodian/docs/index.html
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Scheduled backup operation
AWS Lambda:
Backup Rules
Amazon CloudWatch Events:
Scheduled Trigger
Amazon Redshift Cluster XYZ Snapshot
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Scheduled AI jobs
Amazon
S3
AWS
Lambda
CloudWatch:
Time-based events
Amazon
Polly
https://github.com/adhorn/pollycast
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Ops Automator
CloudWatch:
Time-based events
Lambda:
Event handler
Lambda:
Task executors
SNS:
Error & warning notifications
Resources in multiple AWS
Regions and Accounts
EC2 Instances
Tags
OpsAutomatorTaskList CreateSnapshot
DynamoDB:
Task configuration & tracking
CloudWatch:
Logs
Redshift Clusters
https://aws.amazon.com/answers/infrastructure-management/ops-automator/
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Autodesk - Tailor
Serverless AWS Account Provisioning and Management Service:
• Automates AWS Account creation,
• Configures IAM, CloudTrail, AWS Config, Direct Connect, and VPC
• Enforces corporate standards
• Audit for compliance
Provisions new Accounts in 10 minutes vs 10 hours in earlier manual process
Open source and extensible: https://github.com/alanwill/aws-tailor
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Web Application Architecture
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Web application
Data stored in
Amazon
DynamoDB
Dynamic content
in AWS Lambda
Amazon API
Gateway
Browser
Amazon
CloudFront
Amazon
S3
Amazon
Cognito
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Bustle Achieves 84% Cost Savings with
AWS Lambda
Bustle is a news, entertainment, lifestyle, and fashion
website targeted towards women.
With AWS Lambda, we
eliminate the need to worry
about operations
Tyler Love
CTO, Bustle
”
“ • Bustle had trouble scaling and
maintaining high availability for its
website without heavy management
• Moved to serverless architecture using
AWS Lambda and Amazon API
Gateway
• Experienced approximately 84% in cost
savings
• Engineers are now focused on
innovation
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon API
Gateway AWS
Lambda
Amazon
DynamoDB
Amazon
S3
Amazon
CloudFront
• Bucket Policies
• ACLs
• OAI
• Geo-Restriction
• Signed Cookies
• Signed URLs
• DDOS Protection
IAM
AuthZ
IAM
• Throttling
• Caching
• Usage Plans
• ACM
Browser
Amazon Cognito
Serverless web app security
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Custom Authorizer
Lambda function
Client
Lambda
function
AmazonAPI
Gateway
Amazon
DynamoDB
AWS Identity &
Access Management
SAML
Two types:
• TOKEN - authorization token
passed in a header
• REQUEST – all headers, query
strings, paths, stage variables or
context variables.
Custom Authorizers
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
us-west-2
us-east-1
Client
Amazon
Route 53
Regional
API
Endpoint
Regional
API
Endpoint
Custom
Domain
Name
Custom
Domain
Name
API Gateway
API Gateway
Lambda
Lambda
Multi-Region with API Gateway
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data Processing Architecture
Kinesis-based apps
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Processing real-time, streaming data
• Durable
• Continuous
• Fast
• Correct
• Reactive
• Reliable
What are the key requirements?
Ingest Transform Analyze React Persist
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon S3
Amazon S3- Foundations for Data Processing
Ingest Transform Analyze React Persist
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Standard
Active data Archive dataInfrequently accessed data
Standard - Infrequent Access Amazon Glacier
Choice of storage classes on S3
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
S3 Standard
• Big data analysis
• Content distribution
• Static website
hosting
Standard - IA
• Backup & archive
• Disaster recovery
• File sync & share
• Long-retained data
Amazon Glacier
• Long term archives
• Digital preservation
• Magnetic tape
replacement
Active data Archive dataInfrequently accessed data
Choice of storage classes on S3
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Kinesis makes it easy to work with real-
time streaming data
Amazon Kinesis
Streams
• For Technical Developers
• Collect and stream data
for ordered, replay-able,
real-time processing
Amazon Kinesis
Firehose
• For all developers, data
scientists
• Easily load massive
volumes of streaming data
into Amazon S3, Redshift,
ElasticSearch
Amazon Kinesis
Analytics
• For all developers, data
scientists
• Easily analyze data
streams using standard
SQL queries
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Producers Consumers
Shard 1
Shard 2
Shard n
Shard 3
…
…
Write: 1MB Read: 2MB
** A shard is a group of data records in a stream
Amazon Kinesis
Amazon Kinesis under the hood
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Producers Amazon S3
Amazon ES
Amazon Redshift
Amazon Kinesis Firehose
Real-time analytics
Amazon
Kinesis
Stream
Amazon
Kinesis
Analytics
Amazon
Cognito
Amazon
Kinesis
Stream
Amazon
DynamoDB
Amazon
Lambda
Amazon S3
JavaScript
SDK
Real-time Analytics Demo
http://quad.adhorn.me
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data Input
Source JSON Data
• Once per second, using JavaScript SDK:
• Unique Cognito ID (anonymous user)
• OS
• Quadrant
• Data sent to Kinesis Stream
Amazon
Kinesis
Stream
Amazon
Cognito
Amazon
S3
JavaScript
SDK
{
"recordTime": 1486505943.204,
"cognitoId": "us-east-1:3626e211-d2a3-447b-8231-e1f4e0486f44",
"os": "Android",
"quadrant": "A”,
…
}
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How is raw data mapped to a schema?
Amazon Kinesis stream Amazon KinesisAnalytics
cognitoID os quadrant
<guid1> Android A
<guid2> iOS B
Source Data for Kinesis Analytics
{
"recordTime": 1486505943.204,
"cognitoId": "us-east-1:<guid>",
"os": "Android",
"quadrant": "A"
}
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How is streaming data accessed with SQL?
STREAM
• Analogous to a TABLE
• Represents continuous data flow
CREATE OR REPLACE STREAM DISTINCT_USER_STREAM(
COGNITO_ID VARCHAR(64),
DEVICE VARCHAR(32),
OS VARCHAR(32),
QUADRANT char(1),
DT TIMESTAMP);
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How is streaming data accessed with SQL?
PUMP
• Continuous INSERT query
• Inserts data from one in-application stream to another
CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS
INSERT INTO "DISTINCT_USER_STREAM"
SELECT STREAM DISTINCT
"cognitoId",
...
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How do we get distinct user records?
Use PUMP to insert distinct records into in-app STREAM
CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS
INSERT INTO "DISTINCT_USER_STREAM"
SELECT STREAM DISTINCT
"cognitoId",
"device",
"os",
"quadrant",
FLOOR(s.ROWTIME TO SECOND)
FROM "SOURCE_SQL_STREAM_001" s;
DISTINCT_USERS_STREAM
•COGNITO_ID
•OS
•QUADRANT
•DT
SOURCE_STREAM
•cognitoID
•os
•quadrant
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How do we aggregate per second?
• Tumbling window, group by time period
CREATE OR REPLACE PUMP "OUTPUT_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM
COUNT(dus.COGNITO_ID) AS UNIQUE_USER_COUNT,
COUNT((CASE WHEN dus.OS = 'Android' THEN COGNITO_ID ELSE null END)) AS ANDROID_COUNT,
COUNT((CASE WHEN dus.OS = 'iOS' THEN COGNITO_ID ELSE null END)) AS IOS_COUNT,
COUNT((CASE WHEN dus.OS = 'Windows Phone' THEN COGNITO_ID ELSE null END)) AS WINDOWS_PHONE_COUNT,
COUNT((CASE WHEN dus.OS = 'other' THEN COGNITO_ID ELSE null END)) AS OTHER_OS_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'A' THEN COGNITO_ID ELSE null END)) AS QUADRANT_A_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'B' THEN COGNITO_ID ELSE null END)) AS QUADRANT_B_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'C' THEN COGNITO_ID ELSE null END)) AS QUADRANT_C_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'D' THEN COGNITO_ID ELSE null END)) AS QUADRANT_D_COUNT,
ROWTIME
FROM "DISTINCT_USER_STREAM" dus
GROUP BY
FLOOR(dus.ROWTIME TO SECOND);
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Comparing Types of Windows
• Output created at the end of the window
• The output of the window will be single event based on the aggregate
function used
Tumbling window
Aggregate per time interval
Sliding window
Windows constantly re-evaluated
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Output to Kinesis Stream
MENTION_COUNT_STREAM
•UNIQUE_USER_COUNT
•ANDROID_COUNT
•… Amazon
Kinesis Stream
{
"unique_user_count": 96,
"android_count": 50,
"ios_count": 46,
"android_count": 50,
"quadrant_a_count": 80,
"quadrant_b_count ": 10,
"quadrant_c_count ": 3,
"quadrant_d_count ": 3
}
1 record, every second
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Persist aggregated data in DynamoDB
Kinesis Stream Lambda DynamoDB
event.Records.forEach((record) => {
const payload = new Buffer(record.kinesis.data, 'base64').toString('ascii');
var docClient = new AWS.DynamoDB.DocumentClient();
var table = "user-quadrant-data";
var data = JSON.parse(payload);
var params = {
TableName: table,
Item:{
"dataType": "quadrantRollup",
"windowtime": (new Date(data.WINDOW_TIME)).getTime(),
"userCount": data.UNIQUE_USER_COUNT,
"quadrantA": data.QUADRANT_A_COUNT,
"quadrantB": data.QUADRANT_B_COUNT, ...
}
};
docClient.put(params, function(err, data) { ...
Lambda event source mapping
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Processing a Kinesis Streams with AWS Lambda
Shard 1 Shard 2 Shard 3 Shard 4 Shard n
Kinesis Stream
. . .
. . .
• Single instance of Lambda function per shard
• Polls shard once per second
• Lambda function instances created and removed automatically as stream is scaled
Gets Records
1x per sec
10k records
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Kinesis Firehose Amazon S3 Amazon Athena AWS QuickSight
Users browse content
Logs Analytics
https://github.com/adhorn/logtoes
Amazon Athena
Amazon Athena
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data Processing Architecture
IoT-based apps
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Sensor data collection
IoT
rules
IoT
actions
MQTT
Amazon S3:
Raw records
Amazon Kinesis Firehose:
Delivery stream
Amazon S3:
Batched records
Amazon Kinesis Streams:
Real-time stream
AWS IoT:
Data collection
IoT Sensors
Real-time analytics
applications
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS
Lambda
Notifications:
Amazon SNS
DynamoDB
AWS IoTSensors Control System
Anomaly Detection Using AWS Lambda
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS IoT IoT
shadow
Amazon
Cognito
MQTT over WebSockets
AWS
LambdaAlexa
Amazon
S3
Interactive Serverless applications
http://bit.ly/adhornlightbulb
Interactive Serverless applications
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Further Reading
Optimizing Enterprise Economics with Serverless Architectures
https://d0.awsstatic.com/whitepapers/optimizing-enterprise-economics-serverless-architectures.pdf
Serverless Architectures with AWS Lambda
https://d1.awsstatic.com/whitepapers/serverless-architectures-with-aws-lambda.pdf
Serverless Applications Lens - AWS Well-Architected Framework
https://d1.awsstatic.com/whitepapers/architecture/AWS-Serverless-Applications-Lens.pdf
Streaming Data Solutions on AWS with Amazon Kinesis
https://d1.awsstatic.com/whitepapers/whitepaper-streaming-data-solutions-on-aws-with-amazon-kinesis.pdf
AWS Serverless Multi-Tier Architectures
https://d1.awsstatic.com/whitepapers/AWS_Serverless_Multi-Tier_Archiectures.pdf
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
More info:
https://aws.amazon.com/serverless/
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thanks you!

Serverless Architectural Patterns

  • 1.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Adrian Hornsby, Cloud Architecture Evangelist Serverless Architectural Patterns @adhorn
  • 2.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Session objectives 1. Understand Serverless Key Concepts. 2. Understand Event Processing Architecture. 3. Understand Operation Automation Architecture. 4. Understand Web Application Architecture. 5. Understand Data Processing Architecture. 1. Kinesis-based apps. 2. IoT-based apps.
  • 3.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Serverless Key Concepts
  • 4.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. No servers to provision or manage Scales with usage Never pay for idle Availability and fault tolerance built in Serverless means…
  • 5.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Spectrum of AWS offerings AWS Lambda Amazon Kinesis Amazon S3 Amazon API Gateway Amazon SQS Amazon DynamoDB AWS IoT Amazon EMR Amazon ElastiCache Amazon RDS Amazon Redshift Amazon ES Managed Serverless Amazon EC2 Microsoft SQL Server “On EC2” Amazon Cognito Amazon CloudWatch
  • 6.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Regional services AZ1 AZ2 AZ3 Service XYZ
  • 7.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Anatomy of a Lambda function Handler() function Function to be executed upon invocation Event object Data sent during Lambda Function Invocation Context object Methods available to interact with runtime information (request ID, log group, etc.) def handler(event, context): return { "message": ”Hello World!", "event": event }
  • 8.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Lambda execution model Synchronous (push) Asynchronous (event) Stream-based Amazon API Gateway AWS Lambda function Amazon DynamoDB Amazon SNS /api/hello AWS Lambda function Amazon S3 reqs Amazon Kinesis changes AWS Lambda service function
  • 9.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Lambda Best Practices • Minimize package size to necessities • Separate the Lambda handler from core logic • Use Environment Variables to modify operational behavior • Self-contain dependencies in your function package • Leverage “Max Memory Used” to right-size your functions • Delete large unused functions (75GB limit)
  • 10.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS X-Ray Integration with Serverless • Lambda instruments incoming requests for all supported languages • Lambda runs the X-Ray daemon on all languages with an SDK var AWSXRay = require(‘aws-xray-sdk-core‘); AWSXRay.middleware.setSamplingRules(‘sampling-rules.json’); var AWS = AWSXRay.captureAWS(require(‘aws-sdk’)); S3Client = AWS.S3();
  • 11.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. X-Ray Trace Example
  • 12.
  • 13.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Event Processing Architecture
  • 14.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Event driven A B CEvent A on B triggers C Invocation Lambda functions Action
  • 15.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Event-driven platform S3 event notifications DynamoDB Streams Kinesis events Cognito events SNS events Custom events CloudTrail events LambdaDynamoDB Kinesis S3 Any custom Invoked in response to events - Changes in data - Changes in state Redshift SNS Access any service, including your own Such as… Lambda functions CloudWatch events
  • 16.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Event-driven actions Lambda: Resize Images Users upload photos S3: Source Bucket S3: Destination Bucket Triggered on PUTs
  • 17.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Source data Source data Source data Source data S3 Data Staging Layer /data/source-raw S3 Data Staging Layer /data/source-validated Lambda: Input Validation and Conversion layer Lambda: Input Tracking layer State Management Store Event-driven controls
  • 18.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS Step Functions: Orchestrate a Serverless processing workflow using AWS Lambda
  • 19.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Operation Automation Architecture
  • 20.
  • 21.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Automation characteristics • Periodic jobs • Event triggered workflows • Enforce security policies • Audit and notification • Respond to alarms • Extend AWS functionality … All while being Highly Available, Scalable and Auditable
  • 22.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Auto tagging resources as they start AWS Lambda: Update Tag Amazon CloudWatch Events: Rule Triggered Amazon EC2 Instance State Changes Amazon DynamoDB: EC2 Instance Properties Tag: N/A Amazon EC2 Instance State Changes Tag: Owner=userName PrincipalID=aws:userid • AMI • Instances • Snapshot • Volume
  • 23.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. CapitalOne Cloud Custodian AWS Lambda: Policy & Compliance Rules Amazon CloudWatch Events: Rules Triggered AWS CloudTrail: Events Amazon SNS: Alert Notifications Amazon CloudWatch Logs: Logs Read more here: http://www.capitalone.io/cloud-custodian/docs/index.html
  • 24.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Scheduled backup operation AWS Lambda: Backup Rules Amazon CloudWatch Events: Scheduled Trigger Amazon Redshift Cluster XYZ Snapshot
  • 25.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Scheduled AI jobs Amazon S3 AWS Lambda CloudWatch: Time-based events Amazon Polly https://github.com/adhorn/pollycast
  • 26.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS Ops Automator CloudWatch: Time-based events Lambda: Event handler Lambda: Task executors SNS: Error & warning notifications Resources in multiple AWS Regions and Accounts EC2 Instances Tags OpsAutomatorTaskList CreateSnapshot DynamoDB: Task configuration & tracking CloudWatch: Logs Redshift Clusters https://aws.amazon.com/answers/infrastructure-management/ops-automator/
  • 27.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Autodesk - Tailor Serverless AWS Account Provisioning and Management Service: • Automates AWS Account creation, • Configures IAM, CloudTrail, AWS Config, Direct Connect, and VPC • Enforces corporate standards • Audit for compliance Provisions new Accounts in 10 minutes vs 10 hours in earlier manual process Open source and extensible: https://github.com/alanwill/aws-tailor
  • 28.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Web Application Architecture
  • 29.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Web application Data stored in Amazon DynamoDB Dynamic content in AWS Lambda Amazon API Gateway Browser Amazon CloudFront Amazon S3 Amazon Cognito
  • 30.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Bustle Achieves 84% Cost Savings with AWS Lambda Bustle is a news, entertainment, lifestyle, and fashion website targeted towards women. With AWS Lambda, we eliminate the need to worry about operations Tyler Love CTO, Bustle ” “ • Bustle had trouble scaling and maintaining high availability for its website without heavy management • Moved to serverless architecture using AWS Lambda and Amazon API Gateway • Experienced approximately 84% in cost savings • Engineers are now focused on innovation
  • 31.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon API Gateway AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront • Bucket Policies • ACLs • OAI • Geo-Restriction • Signed Cookies • Signed URLs • DDOS Protection IAM AuthZ IAM • Throttling • Caching • Usage Plans • ACM Browser Amazon Cognito Serverless web app security
  • 32.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Custom Authorizer Lambda function Client Lambda function AmazonAPI Gateway Amazon DynamoDB AWS Identity & Access Management SAML Two types: • TOKEN - authorization token passed in a header • REQUEST – all headers, query strings, paths, stage variables or context variables. Custom Authorizers
  • 33.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. us-west-2 us-east-1 Client Amazon Route 53 Regional API Endpoint Regional API Endpoint Custom Domain Name Custom Domain Name API Gateway API Gateway Lambda Lambda Multi-Region with API Gateway
  • 34.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Data Processing Architecture Kinesis-based apps
  • 35.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Processing real-time, streaming data • Durable • Continuous • Fast • Correct • Reactive • Reliable What are the key requirements? Ingest Transform Analyze React Persist
  • 36.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon S3 Amazon S3- Foundations for Data Processing Ingest Transform Analyze React Persist
  • 37.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Standard Active data Archive dataInfrequently accessed data Standard - Infrequent Access Amazon Glacier Choice of storage classes on S3
  • 38.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. S3 Standard • Big data analysis • Content distribution • Static website hosting Standard - IA • Backup & archive • Disaster recovery • File sync & share • Long-retained data Amazon Glacier • Long term archives • Digital preservation • Magnetic tape replacement Active data Archive dataInfrequently accessed data Choice of storage classes on S3
  • 39.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon Kinesis makes it easy to work with real- time streaming data Amazon Kinesis Streams • For Technical Developers • Collect and stream data for ordered, replay-able, real-time processing Amazon Kinesis Firehose • For all developers, data scientists • Easily load massive volumes of streaming data into Amazon S3, Redshift, ElasticSearch Amazon Kinesis Analytics • For all developers, data scientists • Easily analyze data streams using standard SQL queries
  • 40.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Producers Consumers Shard 1 Shard 2 Shard n Shard 3 … … Write: 1MB Read: 2MB ** A shard is a group of data records in a stream Amazon Kinesis
  • 41.
  • 42.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Producers Amazon S3 Amazon ES Amazon Redshift Amazon Kinesis Firehose
  • 43.
  • 44.
  • 45.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Data Input Source JSON Data • Once per second, using JavaScript SDK: • Unique Cognito ID (anonymous user) • OS • Quadrant • Data sent to Kinesis Stream Amazon Kinesis Stream Amazon Cognito Amazon S3 JavaScript SDK { "recordTime": 1486505943.204, "cognitoId": "us-east-1:3626e211-d2a3-447b-8231-e1f4e0486f44", "os": "Android", "quadrant": "A”, … }
  • 46.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. How is raw data mapped to a schema? Amazon Kinesis stream Amazon KinesisAnalytics cognitoID os quadrant <guid1> Android A <guid2> iOS B Source Data for Kinesis Analytics { "recordTime": 1486505943.204, "cognitoId": "us-east-1:<guid>", "os": "Android", "quadrant": "A" }
  • 47.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. How is streaming data accessed with SQL? STREAM • Analogous to a TABLE • Represents continuous data flow CREATE OR REPLACE STREAM DISTINCT_USER_STREAM( COGNITO_ID VARCHAR(64), DEVICE VARCHAR(32), OS VARCHAR(32), QUADRANT char(1), DT TIMESTAMP);
  • 48.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. How is streaming data accessed with SQL? PUMP • Continuous INSERT query • Inserts data from one in-application stream to another CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS INSERT INTO "DISTINCT_USER_STREAM" SELECT STREAM DISTINCT "cognitoId", ...
  • 49.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. How do we get distinct user records? Use PUMP to insert distinct records into in-app STREAM CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS INSERT INTO "DISTINCT_USER_STREAM" SELECT STREAM DISTINCT "cognitoId", "device", "os", "quadrant", FLOOR(s.ROWTIME TO SECOND) FROM "SOURCE_SQL_STREAM_001" s; DISTINCT_USERS_STREAM •COGNITO_ID •OS •QUADRANT •DT SOURCE_STREAM •cognitoID •os •quadrant
  • 50.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. How do we aggregate per second? • Tumbling window, group by time period CREATE OR REPLACE PUMP "OUTPUT_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM" SELECT STREAM COUNT(dus.COGNITO_ID) AS UNIQUE_USER_COUNT, COUNT((CASE WHEN dus.OS = 'Android' THEN COGNITO_ID ELSE null END)) AS ANDROID_COUNT, COUNT((CASE WHEN dus.OS = 'iOS' THEN COGNITO_ID ELSE null END)) AS IOS_COUNT, COUNT((CASE WHEN dus.OS = 'Windows Phone' THEN COGNITO_ID ELSE null END)) AS WINDOWS_PHONE_COUNT, COUNT((CASE WHEN dus.OS = 'other' THEN COGNITO_ID ELSE null END)) AS OTHER_OS_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'A' THEN COGNITO_ID ELSE null END)) AS QUADRANT_A_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'B' THEN COGNITO_ID ELSE null END)) AS QUADRANT_B_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'C' THEN COGNITO_ID ELSE null END)) AS QUADRANT_C_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'D' THEN COGNITO_ID ELSE null END)) AS QUADRANT_D_COUNT, ROWTIME FROM "DISTINCT_USER_STREAM" dus GROUP BY FLOOR(dus.ROWTIME TO SECOND);
  • 51.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Comparing Types of Windows • Output created at the end of the window • The output of the window will be single event based on the aggregate function used Tumbling window Aggregate per time interval Sliding window Windows constantly re-evaluated
  • 52.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Output to Kinesis Stream MENTION_COUNT_STREAM •UNIQUE_USER_COUNT •ANDROID_COUNT •… Amazon Kinesis Stream { "unique_user_count": 96, "android_count": 50, "ios_count": 46, "android_count": 50, "quadrant_a_count": 80, "quadrant_b_count ": 10, "quadrant_c_count ": 3, "quadrant_d_count ": 3 } 1 record, every second
  • 53.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Persist aggregated data in DynamoDB Kinesis Stream Lambda DynamoDB event.Records.forEach((record) => { const payload = new Buffer(record.kinesis.data, 'base64').toString('ascii'); var docClient = new AWS.DynamoDB.DocumentClient(); var table = "user-quadrant-data"; var data = JSON.parse(payload); var params = { TableName: table, Item:{ "dataType": "quadrantRollup", "windowtime": (new Date(data.WINDOW_TIME)).getTime(), "userCount": data.UNIQUE_USER_COUNT, "quadrantA": data.QUADRANT_A_COUNT, "quadrantB": data.QUADRANT_B_COUNT, ... } }; docClient.put(params, function(err, data) { ... Lambda event source mapping
  • 54.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Processing a Kinesis Streams with AWS Lambda Shard 1 Shard 2 Shard 3 Shard 4 Shard n Kinesis Stream . . . . . . • Single instance of Lambda function per shard • Polls shard once per second • Lambda function instances created and removed automatically as stream is scaled Gets Records 1x per sec 10k records
  • 55.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Kinesis Firehose Amazon S3 Amazon Athena AWS QuickSight Users browse content Logs Analytics https://github.com/adhorn/logtoes
  • 56.
  • 57.
  • 58.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Data Processing Architecture IoT-based apps
  • 59.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Sensor data collection IoT rules IoT actions MQTT Amazon S3: Raw records Amazon Kinesis Firehose: Delivery stream Amazon S3: Batched records Amazon Kinesis Streams: Real-time stream AWS IoT: Data collection IoT Sensors Real-time analytics applications
  • 60.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS Lambda Notifications: Amazon SNS DynamoDB AWS IoTSensors Control System Anomaly Detection Using AWS Lambda
  • 61.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS IoT IoT shadow Amazon Cognito MQTT over WebSockets AWS LambdaAlexa Amazon S3 Interactive Serverless applications
  • 62.
  • 63.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Further Reading Optimizing Enterprise Economics with Serverless Architectures https://d0.awsstatic.com/whitepapers/optimizing-enterprise-economics-serverless-architectures.pdf Serverless Architectures with AWS Lambda https://d1.awsstatic.com/whitepapers/serverless-architectures-with-aws-lambda.pdf Serverless Applications Lens - AWS Well-Architected Framework https://d1.awsstatic.com/whitepapers/architecture/AWS-Serverless-Applications-Lens.pdf Streaming Data Solutions on AWS with Amazon Kinesis https://d1.awsstatic.com/whitepapers/whitepaper-streaming-data-solutions-on-aws-with-amazon-kinesis.pdf AWS Serverless Multi-Tier Architectures https://d1.awsstatic.com/whitepapers/AWS_Serverless_Multi-Tier_Archiectures.pdf
  • 64.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. More info: https://aws.amazon.com/serverless/
  • 65.
    © 2017, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Thanks you!

Editor's Notes

  • #18 Also a classic in Data Lake designs. Amazon S3 is a simple key-based object store whose scalability and low cost make it ideal for storing large datasets. S3 to provide excellent performance for storing and retrieving objects based on a known key. Taking advantage of AWS Lambda event-driven triggers from S3. S3 storage tier – lifecycle policies Modern businesses typically collect data from internal and external sources at various frequencies throughout the day. These data sources could be franchise stores, subsidiaries, or new systems integrated as a result of merger and acquisitions. For example, a retail chain might collect point-of-sale (POS) data from all franchise stores three times a day to get insights into sales as well as to identify the right number of staff at a given time in any given store. As each franchise functions as an independent business, the format and structure of the data might not be consistent across the board. Depending on the geographical region, each franchise would provide data at a different frequency and the analysis of these datasets should wait until all the required data is provided (event-driven) from the individual franchises. In most cases, the individual data volumes received from each franchise are usually small but the velocity of the data being generated and the collective volume can be challenging to manage.
  • #36 Narrative: You need a different set of analytical tools to collect and analyze real-time streaming data than what you have traditionally used for data at rest. With traditional analytics, you gather the information, store it in a database, and analyze it hours, days, or weeks later. Analyzing real-time data requires a different approach. Instead of running database queries on stored data, streaming analytics platforms have to process the data continuously and before the data lands in a database. And streaming data comes in at an incredible rate that can vary up and down all the time. Streaming analytics platforms have to be able to process this data when it arrives, often at speeds of millions and even tens of millions of events per hour. Key requirements of stream processing Durable: Durable ingest so that processing can be repeatable; Continuous - Need to always be processing the latest data Fast: Frequency (micro batches, size of batches, true streaming), and speed (sub-second, minute, hour) Correct: at most once, at least once, and exactly once processing; event time, ingest time, processing time. Reactive: Ability to process and respond in near real-time; feedback mechanisms to send processed data to live applications Reliable: Highly available, fast failovers
  • #37 A foundation of highly durable data storage and streaming of any type of data A metadata index and workflow which helps us categorise and govern data stored in the data lake A search index and workflow which enables data discovery A robust set of security controls – governance through technology, not policy An API and user interface that expose these features to internal and external users
  • #38 All 3 storage classes are highly dur designed for 11 9s Standard, designed for the active data, hot workload. High performance Designed for 4 9s availability. General Purpose storage class, new data frequently access, start with standard. Starting at 2.3c/GB. As data age, access less. SIA, designed for colder or less frequently access data, Offers same high performance3 9s of availability., high throughput and low latency as S3 Standard. starting at 1.25c/GB, 45% lower in storage cost. There is a $0.01/gb retrieval cost. As the data age further, no one is actively interacting with the old data and you need for record keeping. Glacier designed for long term archival storage. 4/10th of a cent, 3 retrieval options ranging from minutes to hours retrieval, you choose depending on quickly you need the data. lifecycle For data that is less-frequently accessed, you can leverage Amazon S3 Standard-IA to save on cost while still benefiting from the great durability and performance as S3 Standard. In addition to transitioning your data to S-IA as its characteristics change, you can also leverage Amazon S3 Standard-IA for new data that fits the bill for Infrequently accessed data. For example you can leverage the S-IA storage class to stored detailed applications logs that you analyse in-frequently and save on storage cost.
  • #39 if your storage is for Big data analysis, content distribution, website, consider using S3 standard which is designed for the active, hot analytics workload. Netflix and finra for backup and archive, DR, you generally don’t access the data until the rare occasion when you need it, when you need it, you need it right away, SIA is designed for those use cases. consider directly putting into Standard-IA. Glacier long term archive. Sony moved over 1M hours of video from magnetic tape to Glacier for digital preservation As data age, you access the objects less
  • #40 Since Amazon Kinesis launch in 2013, the ecosystem evolved and we introduced Kinesis Firehose and Kinesis Analytics. Streams was launched in GA at re:Invent 2014, Firehose at re:Invent 2015, and Analytics was launched in August 2016 We have continuously iterated to make it easier for customers to use streaming data, as well as expand the functionality of real-time processing Together, these three products make up the Amazon Kinesis streaming data platform
  • #41 A shard is a group of data records in a stream. When you create a stream, you specify the number of shards for the stream. Each shard can support up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second and up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys). The total capacity of a stream is the sum of the capacities of its shards. You can increase or decrease the number of shards in a stream as needed. However, note that you are charged on a per-shard basis.
  • #52 Feedback: break into two, flow for introducing windowing needs revision
  • #53 Tumbling: Repeat Are non-overlapping An event can belong to only one tumbling window Sliding: Continuously moves forward in time Produces an output only during the occurrence of an event Every window will have at least one event Events can belong to more than one sliding window
  • #56 AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information. AWS Lambda starts running your code within milliseconds of an event such as an image upload, in-app activity, website click, or output from a connected device. You can also use AWS Lambda to create new back-end services where compute resources are automatically triggered based on custom requests. With AWS Lambda you pay only for the requests served and the compute time required to run your code. Billing is metered in increments of 100 milliseconds, making it cost-effective and easy to scale automatically from a few requests per day to thousands per second.
  • #62 Sometimes, though, you may want to analyze the anomalies or at least be notified of their presence. such algo could be : calculating an average and standard deviation of the time-series data Data is sent from our sensor to AWS IoT, where it is routed to AWS Lambda through the AWS IoT Rules Engine. Lambda executes the logic for anomaly detection and because the algorithm requires knowledge of previous measurements, uses Amazon DynamoDB as a key-value store. The Lambda function republishes the message along with parameters extracted from the PEWMA algorithm. The results can be viewed in your browser through a WebSocket connection to AWS IoT on your local machine
  • #63 To build that, no genius was involved .. but like Edison, we stitched innovations together to create a Wow experience for our customers. (security, scalable, reliable..) And this is exactly what the cloud is about! Just few years ago, it would have taken months or even years to build a scalable, reliable and secure application like that from scratch.