An increasingly common form of computing is computation in response to recently occurring events. These might be newly arrived or changed data, such as an uploaded Amazon S3 image file or an update to an Amazon DynamoDB table, or they might be changes in the state of some system or service, such as termination of an EC2 instance. Support for this form of computing requires both a means of efficiently surfacing events as a sequence of change records and frameworks for processing such change logs. This session provides an overview of how AWS intends to facilitate event-driven computing through support for change logs and for various means of processing them.
3. Traditional Way of Doing Things
•My phone or my camera uploads an image file to my S3 bucket
–Maybe my phone is smart enough to index my photos by their GPS coordinates
•Now I buy an app that creates photo albums by city, region, and country
–Don’t want to run it on my phone!
–Could be a web service, but now I’m handing all my personal photos to a 3rd party
–Could be a cloud app that I periodically invoke, but now I have to remember to invoke it and remember which photos to run it on
–Could set up a recurring workflow to run it, but now I’m paying to run it all the time
•Now I buy another app that does face recognition and tags all my friends
•Now I buy another app that does ____
•…
4. Traditional Way of Doing Things
•Launch an EC2 instance as part of your application
•Your application is highly available
–Did you set up all the appropriate alarms on the new instance?
•Your application is secure
–Did you tell your intrusion detection service about the new instance?
•Your company is cost conscious
–Did you tag the instance with the appropriate cost allocation tags?
5. Event-Driven Way of Doing Things
•Your phone or camera uploads an image to your S3 bucket
•An event is generated telling all interested parties about the upload
•Your photo gets indexed by GPS location
•Your photo gets added to all relevant photo albums
•Your photo gets tagged with friend references
•Ideally: You didn’t have to do anything other than purchase all those cool apps
6. Event-Driven Way of Doing Things
•An EC2 instance is launched
•An event is generated telling all interested parties about the creation of the new instance
•Appropriate monitors and alarms are created
•Intrusion detection learns of the new instance
•The appropriate cost allocation tags are added
•Ideally:
–You didn’t have to do anything other than launch the EC2 instance
–Application developers don’t need to know about all the “auxiliary” activities that have to happen
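The flow on the two slides above can be sketched as a toy publish/subscribe dispatcher. This is only an in-process illustration, not an AWS API: `EventBus`, `subscribe`, `publish`, the event type string, and the instance id `i-example` are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventBus:
    """Toy in-process stand-in for an event source (hypothetical, not an AWS API)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register an interested party for one event type."""
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber; return how many were notified."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


actions: List[str] = []
bus = EventBus()

# Each "auxiliary" activity subscribes on its own; the launcher knows none of them.
bus.subscribe("instance-launched", lambda e: actions.append(f"alarms created for {e['id']}"))
bus.subscribe("instance-launched", lambda e: actions.append(f"intrusion detection notified of {e['id']}"))
bus.subscribe("instance-launched", lambda e: actions.append(f"cost tags added to {e['id']}"))

# Launching the instance only has to publish one event.
delivered = bus.publish({"type": "instance-launched", "id": "i-example"})
```

The launcher's code stays the same when a fourth subscriber is added, which is the point of the "Ideally" bullets.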
8. Anti-Pattern in Discovering New Events
Periodically scan entire dataset:
•List S3 buckets: {millions of objects}
•List S3 buckets again: {millions of objects + 3 objects}
•Diff (ListingA – ListingB): 3 objects
9. Anti-Pattern in Discovering New Events
Periodically polling all system state:
•ec2-describe-instances: {thousands of instances}
•ec2-describe-instances again: {thousands of instances + 3 instances}
•Diff (ListingA – ListingB): 3 instances
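The cost of the anti-pattern can be sketched by counting how many items each approach has to examine. This is a minimal illustration with in-memory data, not real S3 or EC2 calls. The function names and the 100,000-item listing (scaled down from the "millions" on the slide) are assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple


def new_items_by_full_scan(previous: Set[str], current: Iterable[str]) -> Tuple[Set[str], int]:
    """Anti-pattern: examine every item in the new listing to find the few new ones."""
    examined = 0
    found: Set[str] = set()
    for key in current:
        examined += 1
        if key not in previous:
            found.add(key)
    return found, examined


def new_items_from_log(log: List[str], cursor: int) -> Tuple[List[str], int]:
    """Event-log alternative: read only the records appended since the cursor."""
    return log[cursor:], len(log)


listing_a = {f"photo-{i}.jpg" for i in range(100_000)}
uploads = ["photo-new-1.jpg", "photo-new-2.jpg", "photo-new-3.jpg"]
listing_b = listing_a | set(uploads)

# Full scan touches every object to discover three new ones.
found, examined = new_items_by_full_scan(listing_a, listing_b)

# Reading the change log touches only the three change records.
records, cursor = new_items_from_log(uploads, cursor=0)
```

Both approaches find the same three items; the scan's work grows with the size of the dataset, while the log reader's work grows only with the number of changes.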
14. Event Driven Computing in AWS Tomorrow
Event logs for asynchronous service events
Event logs from other data storage services
Customer 1
Customer 2
Customer 3
15. Vision: Unified Event Log Approach
Kinesis
SQS
DynamoDB Streams
S3 Archive Objects
S3 Archive Objects
SNS
Challenge:
16. A Unified Event Log Approach
Kinesis
SQS
Plus: easy conversion to other standard forms: S3 archive format, SNS, …
(Unordered Events)
(Ordered Events)
19. Layers of Commonality -- Storage
File system:
-Everyone can have their own sequence of bytes
-Tools for managing and manipulating byte sequences
20. Layers of Commonality -- Storage
Typed files:
-Application-specific state
-Tools for managing and manipulating structured information across many files; e.g. search
22. Layers of Commonality – Event Logs
Event logs:
-Everyone can have their own sequence of records
-Tools for managing and manipulating sequences of records
23. Layers of Commonality – Event Logs
Typed event logs:
-DynamoDB update streams
-Tools for managing and manipulating structured information across many files; e.g. cross-region replication
29. Foundations
Lowest level of abstraction: un-interpreted sequence of records
A key characteristic: unordered (e.g. S3 image upload notifications) vs. ordered (e.g. multi-item transactions or “delta” updates)
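Why ordering matters for some logs and not others can be sketched as follows. The example is illustrative only. The delta format `("add", n)` / `("mul", n)` and both function names are hypothetical and not taken from any AWS service.

```python
from typing import List, Set, Tuple

Delta = Tuple[str, int]  # hypothetical "delta" update: ("add", n) or ("mul", n)


def apply_deltas(start: int, deltas: List[Delta]) -> int:
    """Apply delta updates one after another; the result depends on their order."""
    value = start
    for op, n in deltas:
        value = value + n if op == "add" else value * n
    return value


def index_uploads(keys: List[str]) -> Set[str]:
    """Record which objects were uploaded; arrival order does not change the result."""
    return set(keys)


deltas = [("add", 5), ("mul", 2)]
in_order = apply_deltas(10, deltas)         # (10 + 5) * 2
reordered = apply_deltas(10, deltas[::-1])  # (10 * 2) + 5

uploads = ["a.jpg", "b.jpg", "c.jpg"]
same_index = index_uploads(uploads) == index_uploads(uploads[::-1])
```

Here `in_order` and `reordered` differ, so a consumer of delta updates needs an ordered log, while the upload index is the same either way and can be fed from an unordered one.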
32. Ordered Log Processing Using the Kinesis Client Library
Kinesis
Shard mgmt table
User
State
Kinesis-enabled application
ASG
Use of the KCL
Mostly writing business logic
33. Kinesis vs. DynamoDB Update Streams
•The Kinesis API and the DynamoDB Update Streams API differ
–Different max record sizes
–DynamoDB controls all aspects of writing to streams
•this includes naming of streams, provisioning, sharding, and resharding
–ListStream and DescribeStream in DynamoDB include service-specific semantics (e.g. Describe returns the table schema)
•Kinesis Client Library (KCL) abstracts these differences away
–Best way to write applications for either Kinesis streams or DynamoDB update streams
–Applications that are agnostic to which type of stream is being processed can transparently target either type
34. Higher-Level Processing Frameworks
•SQS and Kinesis-enabled applications are low-level frameworks:
–You still need to create AMIs, launch EC2 instances, configure auto-scaling groups, etc.
–“All I want is X. Can’t someone just create that for me?”
•Lambda eliminates the operations/management tasks
•Opportunity: High-level capabilities (e.g. archive-to-S3, upload-to-Redshift, or publish-to-SNS) can be provided as predefined functions that can be attached to an event log
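A function attached to an event log might look like the sketch below. It assumes a Python Lambda runtime and the S3 event notification record layout documented by AWS (`Records[].s3.bucket.name` and `Records[].s3.object.key`), neither of which is shown on the slides. `index_photo`, the bucket name, and the object key are hypothetical.

```python
from typing import Any, Dict, List
from urllib.parse import unquote_plus


def index_photo(bucket: str, key: str) -> str:
    """Hypothetical business logic; a real function might index, tag, or archive."""
    return f"indexed s3://{bucket}/{key}"


def handler(event: Dict[str, Any], context: Any) -> List[str]:
    """Lambda-style entry point: process every S3 record in the incoming event."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append(index_photo(bucket, key))
    return results


# Hand-built sample event for local experimentation (not captured from AWS).
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-photos"},
                "object": {"key": "trips/paris+2014.jpg"},
            },
        }
    ]
}
output = handler(sample_event, None)
```

Only the business logic is written by hand here; there are no AMIs, instances, or auto-scaling groups to manage, which is the contrast the slide draws with SQS- and Kinesis-enabled applications.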
38. How Many Event Logs?
Good for customer understanding of a particular service:
-What just happened?
-List everything that happened recently
Not so easy to understand things across multiple services
Too expensive for “data plane” events; wrong granularity:
-Log per S3 bucket
-Log per DynamoDB table
Event log per customer per service
Cust.2593
Cust.2593
Cust.2593
Cust.7302
Cust.7302
Cust.3826
Cust.8941
Cust.2590
Cust.4198
Cust.8368
Cust.2505
Cust.7731
39. How Many Event Logs?
Per-customer event log of all control plane events
-Traffic volume small enough to simply merge all of it
-Makes it easy to understand the bigger picture
Cust.2593
Cust.7302
Cust.3826
Optionally generated per-“entity” event log for data storage services
-The right granularity
-Only incur traffic costs where necessary
Bucket Y
Bucket W
Table A
Table B
Table C
Bucket X
Bucket Z
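The split described on this slide can be sketched as a small routing rule: control plane events are always merged into one log per customer, while data plane events are logged per entity only where the customer has opted in. `LogRouter`, the event fields, and the log naming scheme are hypothetical; the action names are used purely as illustrative labels.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Optional, Set

Event = Dict[str, str]


class LogRouter:
    """Toy router for the two kinds of event log (hypothetical, not an AWS API)."""

    def __init__(self, enabled_entities: Set[str]) -> None:
        # Entities (buckets, tables) whose optional data plane log is switched on.
        self.enabled_entities = enabled_entities
        self.logs: DefaultDict[str, List[Event]] = defaultdict(list)

    def route(self, event: Event) -> Optional[str]:
        """Append the event to the right log; return its name, or None if dropped."""
        if event["plane"] == "control":
            log_name = f"customer/{event['customer']}"
        elif event["entity"] in self.enabled_entities:
            log_name = f"entity/{event['entity']}"
        else:
            return None  # data plane logging not enabled: no traffic cost incurred
        self.logs[log_name].append(event)
        return log_name


router = LogRouter(enabled_entities={"Bucket X"})
events = [
    {"plane": "control", "customer": "2593", "entity": "i-example", "action": "RunInstances"},
    {"plane": "data", "customer": "2593", "entity": "Bucket X", "action": "PutObject"},
    {"plane": "data", "customer": "2593", "entity": "Bucket Y", "action": "PutObject"},
]
destinations = [router.route(e) for e in events]
```

In this run the control plane event lands in the customer's unified log, the write to Bucket X lands in that bucket's own log, and the write to Bucket Y is not logged at all.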
40. Event Logs for Customers’ Services
Vision: customers’ services and applications leverage the AWS event log infrastructure
Cust.2593
Cust.7302
Cust.3826
Widget A
Widget B
Widget C
www.widget.com
Per-customer control plane events sent to per-customer unified control plane log
Create & manage optional per-entity data plane event logs (e.g. as Kinesis streams)
41. Summary: It’s All About Timeliness and Agility
•“Cycle time compression may be the most underestimated force in determining winners & losers in tech” –Marc Andreessen
•Real-time events let you create real-time applications
•Published events let 3rd parties independently innovate on top of each other
•Platform-wide event architecture lets independent parties start building composable tools and functions
•Low-friction processing frameworks (e.g. Lambda) compress the development & operations cycle time
42. Summary: Enablers for Pervasive Event-Driven Computing
•Efficient way of surfacing events: event logs
•Standards for discovery, access, semi-structured data formats, and processing of event logs
•Low- and high-level processing frameworks that enable various degrees of control vs. simplicity
44. Opportunity for an Ecosystem
Enablers:
–Enumerate/discover event logs
–Standard, semi-structured data formats
–Standard processing frameworks, e.g. SQS, KCL, Lambda
45. Opportunity for an Ecosystem
Marketplace for free / for-fee software & services
–KCL-enabled libraries, Lambda functions, etc.
–Services that consume/emit streams of records, e.g. SQS or Kinesis records