2. Agenda
• Why AWS Lambda
• How it works
• Use cases
• Sample architecture
• Best practices
3. Serverless means…
• Simple but usable primitives
• Scales with usage
• Never pay for idle
• Availability and fault tolerance built in
4. AWS compute offerings
Service: EC2 | ECS | Lambda
Unit of scale: VM | Task | Function
Level of abstraction: H/W | OS | Runtime
5. AWS compute offerings
• EC2: "I want to configure machines, storage, networking, and my OS."
• ECS: "I want to run servers, configure applications, and control scaling."
• Lambda: "Run my code when it's needed."
How do I choose?
7. Servers
How will the application
handle server hardware failure?
How can I control
access from my servers?
When should I decide to
scale out my servers?
When should I decide to
scale up my servers?
What size servers are
right for my budget?
How much remaining
capacity do my servers have?
(AAHHHHHHHHH!!)
8. Serverless compute: AWS Lambda
• COMPUTE SERVICE: Run arbitrary code without managing servers
• EVENT-DRIVEN: Code only runs when it needs to run
9. AWS Lambda: Run code in response to events
Lambda functions: Stateless, trigger-based code execution
Triggered by events:
• Direct sync and async API calls
• AWS service integrations
• Third-party triggers
• Many more …
Makes it easy to:
• Perform data-driven auditing, analysis, and notification
• Build back-end services that perform at scale
10. Benefits of AWS Lambda
Productivity-focused compute platform to build powerful, dynamic, modular applications in the cloud
1. Cost-effective and efficient: no infrastructure to manage; pay only for what you use
2. Bring your own code: run code in standard languages
3. Focus on business logic
12. Using AWS Lambda
Bring your own code
• Node.js, Java, Python
• Bring your own libraries
(even native ones)
Simple resource model
• Select power rating from
128 MB to 1.5 GB
• CPU and network
allocated proportionately
Flexible use
• Synchronous or
asynchronous
• Integrated with other
AWS services
Flexible authorization
• Securely grant access to
resources and VPCs
• Fine-grained control for
invoking your functions
13. Using AWS Lambda
Authoring functions
• WYSIWYG editor or
upload packaged .zip
• Third-party plugins
(Eclipse, Visual Studio)
Monitoring and logging
• Metrics for requests,
errors, and throttles
• Built-in logs to Amazon
CloudWatch Logs
Programming model
• Use processes, threads,
/tmp, sockets normally
• AWS SDK built in
(Python and Node.js)
Stateless
• Persist data using
external storage
• No affinity or access to
underlying infrastructure
14. Application components for serverless apps
EVENT SOURCE → FUNCTION → SERVICES (ANYTHING)
Event sources: changes in data state, requests to endpoints, changes in resource state
Function languages: Node, Python, Java … more coming soon
24. What to expect from the session
15–20 minutes of processing now takes seconds
Two orders of magnitude in cost savings
https://www.youtube.com/watch?v=TXmkj2a0fRE
Nordstrom Recommendations
29. Serverless → distributed by nature
Component graph
becomes call graph
Distributed systems
thinking is required from
the start
Event-based architecture
33. AWS Lambda best practices
Limit your function/code size
512 MB /tmp directory provided to each function
Don’t assume function will reuse underlying infrastructure
But take advantage of it when it does occur
You own the logs
Include details from service-provided context
Create custom metrics
Operations-centric vs. business-centric
34. Best practice: Use versions and aliases
Versions = immutable copies of code + properties
Aliases = mutable pointers to versions
Rollbacks
Staged
promotions
“Lock” behavior
for client
35. The function networking environment
Default - a default network environment within VPC is provided for you
Access to the Internet always permitted to your function
No access to VPC-deployed assets
Customer VPC - Your function executes within the context of your own VPC
Privately communicate with other resources within your VPC
Familiar configuration and behavior with:
Subnets
Elastic network interfaces (ENIs)
EC2 security groups
VPC route tables
NAT gateway
36. Additional best practices
Externalize authorization to IAM roles whenever possible
Least privilege and separate IAM roles
Externalize configuration
DynamoDB is great for this
Make sure your downstream setup “keeps up” with Lambda scaling
Limit concurrency when talking to relational databases
Be aware of service throttling
Engage AWS Support to increase your limits
Contact AWS Support before known large scaling events
37. Demo
Amazon Cognito
User Pools
Amazon API
Gateway
Custom Authorizer
Lambda Function
/blog Lambda
Function
/edit… Lambda
Function
Amazon
DynamoDB
Throttling
Cache
Logging
Monitoring
Auth
Mobile apps
38. Next steps
1. Go to console.aws.amazon.com/lambda and create your first
Lambda function. (The first 1M requests are on us!)
2. Stay up to date with AWS Lambda on the Compute blog and check
out aws.amazon.com/lambda for scenarios and customer stories.
3. Send us your questions, comments, and feedback on the AWS
Lambda Forums.
So this is where Lambda fits within the ecosystem.
FireEye – has built a completely serverless MapReduce
Nordstrom –
S3 use case: Seattle Times: https://aws.amazon.com/solutions/case-studies/the-seattle-times/
But first, let's make sure we are all on the same page as to what a serverless application is.
The serverless computing approach that Lambda brings isn't just about "not having to manage servers". Serverless means having a simple but usable primitive – your code as a Lambda function – with nothing that looks like a container or server. The programming model and APIs are all oriented around functions.
Serverless means you only pay for work done, not for provisioned capacity. You don't have to worry about utilization, because you never pay for idle. You pay only for compute time – the time your function takes to run – in units of 100 ms. Most customers get excited thinking about what paying 21 microcents for 100 ms of compute can do for their costs. For example, Nordstrom tells us switching to Lambda reduced the cost of their analytics pipeline by two orders of magnitude. A publishing company from Singapore tells us they save over $30,000 per month after switching from a proprietary image processing solution to one built on Lambda that processes millions of images a day.
Which brings me to the third aspect: serverless means scaling is built in – you can never overprovision or underprovision. Since your code runs in response to events, Lambda automatically spins up as many instances of your function as required to handle any incoming event rate. Let me repeat that: any event rate. We have customers running backends handling in excess of 100,000 TPS at peak, and others like AdRoll processing over 55 billion ad impressions a day through Lambda.
And last but not least, serverless means that high availability and, depending on the workload, fault tolerance come built in. Offloading these responsibilities can have a significant impact on the way you own and operate applications running in the cloud. For example, Vidroll tells us what used to take them ten engineers now takes two, while handling twice the scale.
Three levels of abstraction for consuming compute; the key differences between these three:
Unit of scale – what you scale up or out when you need to: a VM, a task, or a function.
Level of abstraction – what AWS takes care of versus what you take care of. Developers and infrastructure maintainers are two different audiences. With Lambda, everything up to and including the runtime is abstracted; if there's a problem with the JVM, we'll patch that too.
When do I use each?
There are entire industries whose value propositions are centered around just one of these questions. You, as a developer, operations admin, application owner, security architect, or CIO, need to have answers for all of these questions.
That's not to say there aren't tools and strategies for answering every single one of them. There are, and I think AWS gives you the most and best tools to answer them.
But…
Why care?
Provisioning:
- How do I know that I'm using the right number of EC2 instances? Could I have run fewer?
- How do I make sure it's up and running?
Availability and fault tolerance:
Owning servers means you own the patching and lifecycle of the instances. You own the OS patching, the language stack updates, the resource configuration, and the port details. You also need logic in place to decide how many of them are alive at any given time to match the scale your application needs. You are responsible for placing work onto these instances and deciding how busy they are kept. And as with all production-grade applications, you are responsible for architecting the application for high availability. Now, there are many who want these responsibilities as part of their application, and even some who differentiate their application by excelling at them. For those customers, there are many tools and solutions out there that greatly simplify these tasks. But for everyone else, we wanted to provide a solution where you are responsible for the application code, and leave the rest to AWS.
So we looked at these problems, and we asked how we can make it better?
6 And that, essentially, is the origin story for AWS Lambda. Lambda is a compute service that allows you to run code without any of the responsibilities we just talked about – just upload the code, and Lambda takes care of spinning up the required compute power to execute it in a scalable, highly available manner. Lambda is also an event-driven service, in that your code only executes when needed, in response to an event such as an incoming request or an update to data.
A serverless application usually starts with an event. That event can be a write to a DynamoDB table, a PutObject to an S3 bucket, an HTTP call, or a host of other Lambda-supported event sources.
That event then triggers your Lambda function, which can be written in Node.js, Python, Java, or C#. Now remember, this is your code, and you can program it to do whatever you’d like. You could do things like call other downstream services to continue processing, return a result, or write metadata to database.
Whenever you're programming, you can do almost everything you'd do on a server. You can fork processes and run multi-threaded code if you want. Make use of /tmp as scratch space. You can create a socket and send outbound messages – you just can't listen on the socket. You can also bring your own binary.
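The programming-model notes above can be made concrete with a short, hypothetical handler (the function name and event shape are illustrative, not from any real app): it writes to /tmp as scratch space and spawns a subprocess, both of which work normally inside Lambda.

```python
import os
import subprocess
import sys
import tempfile

def handler(event, context=None):
    # Inside Lambda, /tmp is the only writable path; fall back for local runs.
    tmp_dir = "/tmp" if os.path.isdir("/tmp") else tempfile.gettempdir()
    scratch = os.path.join(tmp_dir, "scratch.txt")
    with open(scratch, "w") as f:
        f.write(event.get("payload", ""))
    # Spawning processes works normally -- here we just shell out to Python,
    # but customers do the same thing with bundled binaries like ffmpeg.
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return {"bytes": os.path.getsize(scratch)}
```

The one thing you can't do is listen on an inbound socket or SSH in – the execution environment is stateless between invocations.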
Bring your own binary: We have customers who bring in the ffmpeg binary and run transcoding jobs
Stateless: you can't SSH into your function
Your Serverless application consists of three components. First is the entity that triggers your function, or the “event source”. Events that trigger your function can be changes to data, say, records written to a DynamoDB table. They can be requests made to endpoint services like API Gateway for HTTP requests or AWS IoT for MQTT requests from devices. They can also be events indicating a change in resource state, say, an EC2 instance being spun up or a change in a CloudFormation stack.
Second, and definitely the central component, is the Lambda function, written in one of the supported languages – Node, Python, or Java. A Lambda function is essentially the code you want executed along with some metadata to let Lambda know the execution parameters such as resource allocations and timeouts. Functions are expected to be modular. Instead of having one function that does compression and thumbnailing and indexing called “file processor”, consider having three different functions that each do its one thing.
The third building block consists of the services that your function interacts with, both AWS and otherwise. Barring the simplest of cases, you would probably be calling another service, or reading and writing from a database. Remember, this is your code – and you can control exactly which AWS services your function has access to using an IAM role for the function. If the resource you need to access happens to be in a private IP space, you can configure your function to access specific VPC subnets as well. If you need to store secrets to access external services, you can leverage AWS Key Management Service to store and retrieve the secrets within your Lambda function.
You also have granular controls on which functions can be invoked by which event source, using resource policies on the Lambda function.
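A resource policy like the one just described can be attached with the Lambda AddPermission API. The sketch below builds the arguments for a hypothetical case – letting one S3 bucket invoke a function named `thumbnailer` (all names and ARNs are made up for illustration):

```python
def s3_invoke_permission(function_name, bucket_arn, account_id):
    # Keyword arguments for the Lambda AddPermission API: a resource policy
    # statement letting exactly one S3 bucket (in one account) invoke us.
    return dict(
        FunctionName=function_name,
        StatementId="allow-s3-invoke",
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=bucket_arn,        # only events from this bucket
        SourceAccount=account_id,    # guards against bucket-name reuse
    )

if __name__ == "__main__":
    import boto3  # requires AWS credentials; not needed for the pure helper
    boto3.client("lambda").add_permission(
        **s3_invoke_permission(
            "thumbnailer", "arn:aws:s3:::my-upload-bucket", "123456789012"))
```

Pinning both SourceArn and SourceAccount is the usual belt-and-suspenders pattern, so a deleted-and-recreated bucket in another account can't trigger your function.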
11 Event sources come in a few flavors. First, data repositories: if it stores data that you want to track changes for, it's a potential event source. Remember, you can always bring your own event source – Lambda exposes an Invoke API that accepts arbitrary JSON payloads as events, so if you have a system that emits events, you can wire it up to Lambda.
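Bringing your own event source is just a call to that Invoke API. A minimal sketch, with a hypothetical function name and payload (the helper keeps the argument-building testable without AWS credentials):

```python
import json

def build_invoke_args(function_name, event):
    # Any JSON-serializable object can be a Lambda event.
    return dict(
        FunctionName=function_name,
        InvocationType="Event",  # async fire-and-forget; "RequestResponse" = sync
        Payload=json.dumps(event).encode(),
    )

if __name__ == "__main__":
    import boto3  # requires AWS credentials
    boto3.client("lambda").invoke(
        **build_invoke_args("audit-logger", {"user": "alice", "action": "login"}))
```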
Recap the common use cases for serverless
Web Applications: By combining AWS Lambda with other AWS services, developers can build powerful web applications that automatically scale up and down and run in a highly available configuration across multiple data centers – with zero administrative effort required for scalability, back-ups or multi-data center redundancy.
Mention Flask and Express
Backends: You can build serverless backends using AWS Lambda, Amazon API Gateway, and Amazon DynamoDB to handle web, mobile, Internet of Things (IoT) requests.
Data Processing: You can build a variety of real-time data processing systems using AWS Lambda, Amazon Kinesis, Amazon S3, and Amazon DynamoDB.
So what can you build with an event-driven compute service? We see two broad patterns: using Lambda to process data as it comes in and write to other data stores downstream, or using Lambda to build interactive backends, adding backend logic in front of databases or other services. Customers like Thomson Reuters use Lambda to process files loaded into Amazon S3 as soon as the data is available – from image transformation, to file format conversion, to building indexes of uploaded content. Customers like AdRoll and Localytics use Kinesis and Lambda to process large amounts of streaming data in real time for their clickstream analysis. You can also build NoSQL database triggers for DynamoDB, such as validating every row written or adding calculated columns.
In the interactive backend class, customers like EasyTen are building serverless mobile backends, where their Lambda function containing cross-platform app logic is invoked synchronously using the AWS Mobile SDK, or building standalone REST microservices using Lambda with Amazon API Gateway. You can also use Lambda to create new voice-driven "skills" for Alexa on Amazon Echo, allowing you to create voice-powered commands for a variety of operations, like ordering a pizza or posting a Slack update.
-- DEVICE MESSAGE COMING IN AS MQTT MESSAGE
-- ALEXA
-- SLACK CONTEST
There is a whole collection of AWS services with Lambda integration, like CloudFormation, Simple Notification Service, or Simple Workflow Service – the idea being, if you need to run arbitrary code and don't want to worry about servers, Lambda is a great starting point.
(Humor) We studied this in undergrad… this simple three-block architecture turns into this.
13 One of the popular uses for Lambda is backend data processing workflows, such as those for an e-commerce backend or an ingestion pipeline for media content. Here's a sample architecture for a real-time file processing application, similar to those built by Thomson Reuters, Seattle Times, FireEye, Periscope, and others. When a file gets uploaded to S3, it sends the event to Lambda, using Amazon SNS to fan out the requests among multiple Lambda functions. One function handles format conversion and writes to another S3 bucket, the second indexes the data into DynamoDB, and the third records the file size to track total data processed. Now, due to how Lambda retries asynchronous invocations, as is the case with SNS and S3, your code should be designed to handle duplicates.
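Handling those duplicate deliveries usually means making the function idempotent. A minimal sketch: claim each object's key + eTag before processing it. The in-memory stand-in below only dedupes within one container; in production you'd use a DynamoDB conditional put (`attribute_not_exists`) so the claim is shared across all invocations. The event shape follows S3 notifications; the class is hypothetical.

```python
class InMemoryDedupe:
    """Stand-in for a DynamoDB conditional put
    (ConditionExpression="attribute_not_exists(event_id)")."""
    def __init__(self):
        self.seen = set()

    def claim(self, event_id):
        if event_id in self.seen:
            return False
        self.seen.add(event_id)
        return True

DEDUPE = InMemoryDedupe()  # swap for DynamoDB so dedupe survives across containers

def handler(event, context=None):
    processed = []
    for record in event.get("Records", []):
        obj = record["s3"]["object"]
        # key + eTag uniquely identifies one version of one uploaded object
        event_id = obj["key"] + ":" + obj.get("eTag", "")
        if DEDUPE.claim(event_id):
            processed.append(obj["key"])  # real processing goes here
    return {"processed": processed}
```

A redelivered event simply finds its claim already taken and does nothing, so retries are harmless.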
The functions in these architectures invariably do some heavy-duty processing, like video transcoding, so it's important to remember that memory is your performance dial. When running within Lambda, you control how much CPU and memory are available to your function by configuring its memory. Lambda gives you 23 "power levels," so to speak, with settings from 128 MB to 1.5 GB; the highest setting gets you 12 times the CPU and memory of the lowest one. If your function is CPU-bound, higher settings equal faster runtimes!
This is an example of event source to Lambda being one to many – S3 and SNS as the event source, which fanout to multiple Lambda functions to do different operations, which each then write back to the same set of stores, in this case, S3 and DynamoDB.
14 When combined with Kinesis, Lambda also fits nicely into real-time processing workflows for machine data, operational logs, and similar data. Major League Baseball, Zillow, and Localytics have all published architectures highlighting this approach. The sample architecture here shows parallel processing on an incoming stream of data using multiple Lambda functions, each piping to a different destination. In this example, the incoming stream of operational data is ingested through Kinesis and then processed in parallel by two Lambda functions. The first one does aggregation and metrics calculation; the second chunks the stream into flat files and backs them up into S3, potentially for further batch processing by EMR or Redshift. In this architecture, the two Lambda functions share the Kinesis stream's read throughput, but get individual copies of, and checkpoints on, the data being read. Lambda automatically checkpoints each batch as it is successfully processed, but this also means it will retry the entire batch if the invoke hits a code error. You can customize the retry policy by piping failed records to another queue or Lambda function to process out of order from the others, and having the code return a success code so that the function moves on.
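That "catch failures, ship them aside, return success" pattern can be sketched like this. The event shape follows Kinesis records (base64-encoded data); `process` and `failed_sink` are hypothetical placeholders for your business logic and your dead-letter destination (an SQS queue, another stream, etc.):

```python
import base64
import json

def process(payload):
    # Placeholder business logic; raises on malformed records.
    return json.loads(payload)["value"] * 2

def handler(event, context=None, failed_sink=None):
    failures = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"]).decode()
        try:
            process(payload)
        except Exception:
            failures.append(payload)  # e.g. send to an SQS dead-letter queue
    if failures and failed_sink is not None:
        failed_sink(failures)
    # Returning normally (no exception) lets Lambda checkpoint this batch
    # and move on, instead of retrying the whole batch indefinitely.
    return {"failed": len(failures)}
```

The trade-off is explicit: one bad record no longer blocks the shard, but you now own reprocessing the dead-lettered records yourself.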
This is what “pattern 2” looks like - one event source fans out to multiple Lambda functions, each of which writes to its own downstream store.
The online equivalent of a stylist… to help you find what you love. So how do they do that? Whether you're browsing on your mobile or on Nordstrom.com. Their chief architect's quote: "Our operational requirement is: alert me when the Internet is down."
We're now able to focus on what Nordstrom does, and let Amazon do the heavy lifting. It lets them get out of Patch Tuesdays and Heartbleed. As of the June 2015 event, Nordstrom is "all-in".
Many autonomous devices in collaboration
→ distributed
Interacting with the real world
→ event-based
Going back to iRobot, the smart home is part of the Internet of Things
In IoT, you have unreliable communication links between devices, which means each component needs to be more robust or autonomous. It’s a distributed system
Devices are reacting to, or changing, the real-world environment they are in. This is fundamentally an event-based system.
So you can see how this is a good fit for using serverless architecture for the cloud-side systems supporting our apps and robots
We build microservices on top of AWS services
We’ve got zero unmanaged EC2 instances in these services
With functions as the unit of deployment, your component graph becomes your call graph
Diagram shows one flow in one service in our system. Looks super complex!
But really, if you draw a call diagram for most monoliths, they look this complicated
In a more consolidated, less serverless system, distributed systems thinking happens at the boundaries, and may be treated less systematically.
With a serverless architecture, you’re distributed by nature. You have to think about it from the start, which means you’re going to build a more robust system
A good strategy is to go event-based, e.g. CQRS
We use a lot of queues. Some events can be dropped, but for some it’s critical to ensure their resulting actions are completed
We could use an Elastic Beanstalk application for this. But the added complexity of VPCs, security groups, etc. and the uniformity of using Lambda everywhere means lower operational complexity for us
Don’t let your function uploads be the new “monolith”.
Lots of new users of Lambda are also new Node.js programmers. Learn the basics of asynchronous code processing.
Your Lambda container MIGHT be reused. This could be GREAT for your performance, but your code can't assume it. Your code SHOULD take advantage of it when your container is reused (loading configuration, keeping connections alive, in-memory caches, etc.).
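The standard way to exploit reuse is to do expensive setup at module scope. A hedged sketch (names and the cached lookup are illustrative): module-level code runs once per container start, so a warm container keeps the config and cache alive, while a cold one simply rebuilds them.

```python
import time

# Module scope runs once per container start ("cold start"), not once per
# invocation. A reused (warm) container keeps these objects alive.
CONFIG = {"loaded_at": time.time(), "log_level": "INFO"}  # imagine: fetched from DynamoDB
CACHE = {}  # in-memory cache: a bonus on reuse, never a correctness requirement

def expensive_lookup(key):
    return key.upper()  # stand-in for a slow downstream call

def handler(event, context=None):
    key = event.get("key", "")
    if key not in CACHE:                    # benefits from a warm container...
        CACHE[key] = expensive_lookup(key)  # ...but works fine on a cold one
    return {"value": CACHE[key]}
```

Note the asymmetry: correctness never depends on the cache being populated, only latency does.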
You’ve got scratch space available on disk. This could open up new use cases for you or reduce your cost. Not everything needs to live in memory all the time.
Integration with CloudWatch Logs is tight and requires no effort on your part to start capturing logs at scale. CloudWatch Logs is a powerful tool in its own right, and the price point is compelling. It also has third-party integration capabilities for platforms like Splunk.
Create custom metrics using the AWS SDK and CloudWatch: things like error scenarios, metrics your business would love to aggregate and report on, dependency response times, or intra-function response times. A simple statement pushing metrics to CloudWatch, split on the dimensions important to your application or business, is all you need to generate very valuable dashboards.
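That "simple statement" is a CloudWatch PutMetricData call. A sketch with a hypothetical namespace, metric name, and dimension (the pure helper builds one MetricData entry; the guarded call ships it):

```python
def business_metric(name, value, dimensions, unit="Count"):
    # One entry for the MetricData list of CloudWatch PutMetricData.
    return {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        "Dimensions": [{"Name": k, "Value": v}
                       for k, v in sorted(dimensions.items())],
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials
    boto3.client("cloudwatch").put_metric_data(
        Namespace="MyApp",  # hypothetical namespace
        MetricData=[business_metric("DownstreamLatencyMs", 42.0,
                                    {"Service": "payments"}, "Milliseconds")])
```

Dimensions are what let you slice the dashboard per function, per client type, or per downstream dependency.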
19 A recommended approach here is to leverage Lambda's versioning and aliasing capability. Versions are immutable copies of the function code and configuration as of when the version was published. You can create a version when a function is created or updated, or at any time by using the Lambda PublishVersion API. Aliases are pointers to a version, and can be remapped. Think of the alias as the context in which the code is being invoked, such as a stack or a particular client type.
Together, these capabilities let you set up a test stack to validate code as it gets deployed, and explicitly promote qualified code into your "prod" stack. Since each function also knows the alias used to invoke it, you can use this capability to reference configuration for downstream services. For example, imagine a function with a dev alias that points to the latest version. You can store downstream configuration keyed by alias in a separate DynamoDB table, specifying which downstream services the "dev" alias can access. Once your code passes muster and the numbers look good, you explicitly publish a version, say version 1, and point your beta alias to it. You can point your beta event source at this alias and specify the required beta configuration in your config store. When ready, you point your prod alias at this version, and only then does the code you uploaded affect your production stack.
This approach enables multiple capabilities: rolling back a version by simply remapping an alias, locking a client to a particular version by pointing them directly at the version ARN, and, of course, reusing the same deployment package across multiple stages by keying configuration off the alias.
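The promotion step described above is two API calls. A sketch using boto3 (the function and alias names are illustrative; the client is passed in so the logic is testable without AWS):

```python
def promote(client, function_name, alias):
    # Freeze the current code + configuration as an immutable version...
    version = client.publish_version(FunctionName=function_name)["Version"]
    # ...then repoint the mutable alias at it. Rolling back is just another
    # update_alias call pointing at the previous version.
    client.update_alias(FunctionName=function_name, Name=alias,
                        FunctionVersion=version)
    return version

if __name__ == "__main__":
    import boto3  # requires AWS credentials
    promote(boto3.client("lambda"), "file-processor", "prod")
```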
Note: running without a VPC does not mean non-secure.
Internet access from a VPC-configured function: private subnets → NAT gateway → IGW
Naming conventions – make your function names programmatically consumable (think of EC2 tags as example values you might include).
Use those naming conventions to drive automation (CI/CD, metrics/monitoring, reports, etc)
By externalizing your security posture to IAM, your code and application can be agnostic of it. Code flaws don’t impact security. You’ll get all of the audit and tracking capabilities that IAM provides through things like CloudTrail and Config.
Create separate IAM roles for everything. Least privilege can’t be least if multiple functions/APIs share the same role and you eventually need to make permission changes for one of your functions/APIs but not all. Make this easier by dynamically building your IAM roles by merging CloudFormation templates programmatically.
Externalize the configuration of your functions. Environment variables, log levels, etc. You can “deploy” these configuration changes on the fly. DynamoDB is great for this.
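Combined with the alias tip from earlier, externalized configuration is often keyed by the alias that invoked the function, which you can parse out of `context.invoked_function_arn`. A sketch (the DynamoDB lookup is shown as a comment since it needs AWS; the ARN parsing is the testable part):

```python
def alias_from_arn(invoked_function_arn):
    # arn:aws:lambda:us-east-1:123456789012:function:name[:alias]
    parts = invoked_function_arn.split(":")
    return parts[7] if len(parts) == 8 else "$LATEST"

def handler(event, context):
    alias = alias_from_arn(context.invoked_function_arn)
    # One DynamoDB config item per alias, e.g. {"alias": "prod", "log_level": "WARN"}:
    # config = CONFIG_TABLE.get_item(Key={"alias": alias}).get("Item", {})
    return {"alias": alias}
```

The same deployment package then behaves as "dev", "beta", or "prod" purely based on which alias invoked it.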
Be aware of scaling. These services will manage scaling events for you, but there are some initial levels where throttling could occur. If you plan on adopting Lambda and/or API Gateway with production scale, engage AWS Support or your Solutions Architect to make them aware.
Connection pooling as a Service
In front of the Lambda service, you put API Gateway or a Kinesis stream – you can limit things at the front end or the back end. Kinesis acts as a buffer.
The concurrent execution limit is a safety limit – by default, we limit it to 100.
Known large scale events – give Nike example
Only about 14% of our requests today are synchronous; 86% are asynchronous. About 6% are synchronous with Node/Python, where cold starts are 100–200 ms. The remaining 8% are synchronous with Java, where there's about a 0.1% chance of a cold start – roughly a 0.06% chance of cold starts overall.
So let’s take a look at how we can use Amazon Cognito to secure our API.
We can make sure that only authenticated users can interact with our API (That’s the first A in AAA; Authentication).
We can choose who can interact with which API resources by writing logic into our custom authorizer function, granting access depending on their JWT claims (there's the second A: Authorization).
We can also log and meter all API access via API Gateway (and that’s the final A in AAA; Accounting).
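A custom authorizer for this demo could look roughly like the sketch below. Everything here is illustrative: the role-based rule is invented, and `decode_claims` only decodes the JWT payload without verifying it – a real authorizer must validate the token's signature and expiry against the Cognito User Pool's JWKS. What API Gateway actually requires back is a principal ID plus an IAM policy document.

```python
import base64
import json

def decode_claims(token):
    # Placeholder only: decodes the JWT payload WITHOUT signature verification.
    try:
        payload = token.split(".")[1]
        payload += "=" * (-len(payload) % 4)  # restore base64 padding
        return json.loads(base64.urlsafe_b64decode(payload))
    except Exception:
        return {}

def authorizer_response(principal_id, effect, method_arn):
    # The response shape API Gateway expects from a custom authorizer.
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{"Action": "execute-api:Invoke",
                           "Effect": effect,
                           "Resource": method_arn}],
        },
    }

def handler(event, context=None):
    claims = decode_claims(event.get("authorizationToken", ""))
    # Hypothetical rule: only tokens carrying role=editor get access.
    effect = "Allow" if claims.get("role") == "editor" else "Deny"
    return authorizer_response(claims.get("sub", "anonymous"), effect,
                               event["methodArn"])
```

Because the decision comes back as an IAM policy, API Gateway can cache it per token, so the authorizer function doesn't run on every single request.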