© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Eric Johnson
Senior Developer Advocate - Serverless
AWS
@edjgeek
Big “Serverless” Data
Powering Big Data with Serverless
Background Image by Эдуард Ризванов from Pixabay
Who am I?
• Sr. Developer Advocate – Serverless, AWS
• Serverless / Tooling / Automation Geek
• Software Architect / Solutions Architect
• Husband to Brigitte
• Father to Noah, Jake, Owen
Sophie Anne, & Gracie Mae
• Music lover
• Pizza / Diet Dr. Pepper fanatic
Why are we here?
Serverless in big data processing
• Amazon Kinesis Video Streams
• Amazon Kinesis Data Streams
• Amazon Kinesis Data Firehose
• Amazon Kinesis Data Analytics
• Amazon Athena
• AWS Lambda
• Amazon Simple Storage Service
• Amazon DynamoDB
Understanding the role Serverless plays in Big Data
Agenda
Ingestion
Real-time processing
Real-time analytics
Post processing
What is serverless?
• No infrastructure provisioning, no management
• Automatic scaling
• Pay for value
• Highly available and secure
Ingestion
Ingesting data at scale
• Video ingestion: Amazon Kinesis Video Streams
• Data ingestion: Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose
Video ingestion
Amazon Kinesis Video Streams:
• Fully managed infrastructure that scales to load
• Offers SDKs in C++ and Java
• Supports live and on-demand playback of streams
• Durable storage using Amazon S3
• Works with many forms of time-encoded data
• Supports multiple time-code-based formats
Data ingestion – Kinesis Data Streams
• Uses shards to scale
• Ingress: 1 MB/second or 1,000 records/second per shard
• Egress: 2 MB/second per shard
• Works with Kinesis Data Analytics
• Can support connected consumers for enhanced fan-out
• Can store data for up to 168 hours (7 days)
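The per-shard limits above can be turned into a rough sizing calculation. The sketch below is only illustrative: the function name and the example workload are assumptions, and the limits are the figures quoted on this slide, which may differ from current service quotas.

```python
import math

# Per-shard limits as quoted on the slide (assumed still current for this sketch).
INGRESS_MB_PER_SEC = 1.0
INGRESS_RECORDS_PER_SEC = 1000
EGRESS_MB_PER_SEC = 2.0


def shards_needed(write_mb_per_sec, write_records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three per-shard limits."""
    by_write_bytes = math.ceil(write_mb_per_sec / INGRESS_MB_PER_SEC)
    by_write_count = math.ceil(write_records_per_sec / INGRESS_RECORDS_PER_SEC)
    by_read_bytes = math.ceil(read_mb_per_sec / EGRESS_MB_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_write_bytes, by_write_count, by_read_bytes)


if __name__ == "__main__":
    # Hypothetical workload: 3.5 MB/s in, 2,500 records/s in, 7 MB/s out.
    print(shards_needed(3.5, 2500, 7.0))
```

Whichever limit is hit first decides the shard count, so a stream of many small records can need more shards than its byte volume alone suggests.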
Data ingestion – Kinesis Firehose
• Auto-scales to meet load
• Different regions have different capacity
  • US East: 5,000 records/second, 2,000 transactions/second, and 5 MiB/second
• Works with Kinesis Data Analytics
• Can transform data before delivery to target
• Retains data for up to 24 hours if delivery fails
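Transformation before delivery is typically done by a Lambda function that Firehose invokes with a batch of base64-encoded records. The following is a minimal sketch of such a handler; the `status` field and the uppercase transformation are assumptions made for illustration, not part of the talk.

```python
import base64
import json


def transform_handler(event, context):
    """Firehose transformation sketch: normalize a hypothetical 'status' field."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["status"] = str(payload.get("status", "")).upper()
            # Newline-delimit records so they stay separable at the target.
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, TypeError, AttributeError):
            # Return the original data untouched and flag the record as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every incoming `recordId` is returned exactly once, with a result of `Ok`, `Dropped`, or `ProcessingFailed`.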
Data ingestion – Kinesis Firehose
Data Sources
• Firehose PUT APIs
• Amazon Kinesis Agent
• AWS IoT
• CloudWatch Logs
• CloudWatch Events
Targets
• Amazon S3
• Amazon Redshift
• Amazon Elasticsearch Service
Kinesis Data Streams vs. Kinesis Firehose
[Diagram: Data Producers writing to an Amazon Kinesis Data Stream, and Data Producers writing to Amazon Kinesis Data Firehose]
Kinesis Firehose
[Diagram: Data Producers writing to Amazon Kinesis Data Firehose]
Use Kinesis Firehose when you need:
• Ability to transform data in the stream
• Auto scaling for unpredictable load
• Multiple targets for final data
Kinesis Data Streams
[Diagram: Data Producers writing to an Amazon Kinesis Data Stream]
Use Kinesis Data Streams when:
• You have semi-predictable traffic
• You need to perform real-time actions on data in the stream
Real-time processing
Kinesis Data Stream + Lambda
[Diagram: Data Producers, AWS IoT Core, Amazon Kinesis Data Stream, Lambda functions, Amazon DynamoDB]
• The Lambda service handles intermittent polling via the GetRecords API
• All applications share 2 MB/second/shard egress
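Because the Lambda service does the polling, the function itself only receives a batch of records. The sketch below shows one way such a handler might decode that batch. The `sensorId` and `status` fields are borrowed from the SQL example later in the deck and are assumptions about the payload. The DynamoDB write is left out so the example stays self-contained.

```python
import base64
import json


def stream_handler(event, context):
    """Decode each Kinesis record in the batch and collect failing sensor IDs."""
    failures = []
    for record in event["Records"]:
        # Kinesis delivers the record payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if "FAIL" in payload.get("status", ""):
            failures.append(payload["sensorId"])
    # A real function would typically persist results, for example to DynamoDB.
    return {"failedSensors": failures}
```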
Kinesis Data Stream + Enhanced Fan-Out + Lambda
[Diagram: Data Producers, AWS IoT Core, Amazon Kinesis Data Stream, Lambda functions, Amazon DynamoDB]
• Functions triggered by consumers
• Each consumer gets its own 2 MB/second/shard egress
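The difference between shared and dedicated egress can be worked out with simple arithmetic. This sketch assumes the 2 MB/second/shard figure from the slides and an even split between consumers in the shared case, which is an idealization. The function name is hypothetical.

```python
EGRESS_MB_PER_SEC_PER_SHARD = 2.0  # figure quoted on the slides


def per_consumer_egress(shards, consumers, enhanced_fan_out):
    """Approximate read throughput (MB/s) available to each consumer."""
    total = shards * EGRESS_MB_PER_SEC_PER_SHARD
    if enhanced_fan_out:
        # Each registered consumer gets its own per-shard throughput.
        return total
    # Without enhanced fan-out, all consumers share the same per-shard egress.
    return total / consumers


if __name__ == "__main__":
    print(per_consumer_egress(4, 3, enhanced_fan_out=False))
    print(per_consumer_egress(4, 3, enhanced_fan_out=True))
```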
Video Processing
[Diagram: Amazon Kinesis Video Streams, Amazon Rekognition Video, Amazon SageMaker, S3 Bucket]
• Real-time analysis and machine learning
• HLS-compatible live or on-demand playback
• Near real-time processing
Real-time analytics
Amazon Kinesis Data Analytics
• Built-in functions to filter, aggregate, and transform streaming data
• Processes streaming data with sub-second latencies
• Build SQL queries that perform joins, aggregations over time windows, and filters
• Includes open source libraries based on Apache Flink that enable you to build an application in hours instead of months
Real-time analytics
[Diagram: Amazon Kinesis Data Stream, Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics]
• Stream source can be a Kinesis Data Stream or Firehose
Inside Kinesis Data Analytics
Input: stream data
Outputs: FAIL_STREAM, WARN_STREAM
• Use SQL or Apache Flink to filter data

-- Create Fail Stream --
CREATE OR REPLACE STREAM "FAIL_STREAM" (
    sensorId INT,
    currentTemperature INT,
    status VARCHAR(10)
);
CREATE OR REPLACE PUMP "FAIL_STREAM_PUMP" AS INSERT INTO "FAIL_STREAM"
    SELECT "sensorId", "currentTemperature", "status"
    FROM "SOURCE_SQL_STREAM_001"
    WHERE "status" SIMILAR TO '%FAIL%';

-- Create Warn Stream --
CREATE OR REPLACE STREAM "WARN_STREAM" (
    sensorId INT,
    currentTemperature INT,
    status VARCHAR(10)
);
CREATE OR REPLACE PUMP "WARN_STREAM_PUMP" AS INSERT INTO "WARN_STREAM"
    SELECT "sensorId", "currentTemperature", "status"
    FROM "SOURCE_SQL_STREAM_001"
    WHERE "status" SIMILAR TO '%WARN%';
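To make the effect of the two pumps concrete, here is a rough plain-Python equivalent of the filtering logic. It is only an illustration of what the SQL selects, not how Kinesis Data Analytics executes it. The function name and sample records are hypothetical.

```python
COLUMNS = ("sensorId", "currentTemperature", "status")


def route(records):
    """Mirror the two pumps: copy matching rows into fail and warn lists."""
    fail_stream, warn_stream = [], []
    for record in records:
        # Each pump selects the same three columns from the source stream.
        row = {name: record[name] for name in COLUMNS}
        # SIMILAR TO '%FAIL%' matches any status containing FAIL.
        if "FAIL" in record["status"]:
            fail_stream.append(row)
        # The pumps run independently, so this is a separate check, not an else.
        if "WARN" in record["status"]:
            warn_stream.append(row)
    return fail_stream, warn_stream


if __name__ == "__main__":
    sample = [
        {"sensorId": 1, "currentTemperature": 70, "status": "OK"},
        {"sensorId": 2, "currentTemperature": 99, "status": "FAIL"},
        {"sensorId": 3, "currentTemperature": 85, "status": "WARN"},
    ]
    print(route(sample))
```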
Inside Kinesis Data Analytics
FAIL_STREAM → AWS Lambda
• Alert
• Diagnose
• Remediate
Inside Kinesis Data Analytics
WARN_STREAM → Amazon Kinesis Data Stream
• Dashboards
• Consumer response
Real-time analytics
[Diagram: Amazon Kinesis Data Stream, Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics, FAIL_STREAM, AWS Lambda, WARN_STREAM, Amazon Kinesis Data Stream]
What about the raw data?
Real-time analytics
[Diagram: Amazon Kinesis Data Stream, Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics, FAIL_STREAM, AWS Lambda, WARN_STREAM, Amazon Kinesis Data Stream, and a second Amazon Kinesis Data Firehose]
• Raw Data Archive
Post processing
Serverless data storage
Sources: Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Data Analytics
Storage:
• Amazon Simple Storage Service
• Amazon DynamoDB
• Amazon Timestream
• Amazon Quantum Ledger Database
• Amazon CloudWatch
How you need to process your data determines where to store it
Serverless storage options
• Amazon Simple Storage Service: object storage; unstructured data
• Amazon DynamoDB: NoSQL; key-value or document data
• Amazon Timestream: time series database
• Amazon Quantum Ledger Database: immutable and transparent; cryptographically verifiable
• Amazon CloudWatch: structured data; alerting built in
Post processing – Serverless Tools
• Amazon Athena: Query S3 data with standard SQL expressions
• Amazon S3 Select: Retrieve subsets of object data instead of the entire object
• AWS Glue: Extract, transform, and load (ETL) service that works across multiple services
AWS Glue
Critical data can be stored in many places:
• S3 buckets
• DynamoDB tables
• Other non-serverless services: MariaDB, Microsoft SQL Server, MySQL, Oracle, PostgreSQL
AWS Glue
Crawler → Data Catalog
Sources: S3 buckets, DynamoDB tables, and other non-serverless services (MariaDB, Microsoft SQL Server, MySQL, Oracle, PostgreSQL)
What the crawler is doing:
• Classifies data to determine the format, schema, and associated properties of the raw data
• Groups data into tables or partitions, based on crawler heuristics
• Writes metadata to the Data Catalog
AWS Glue
The Data Catalog contains metadata about the data stores. How do I get the data itself in a meaningful way?
Enter: Amazon Athena
Amazon Athena
[Diagram: S3 buckets, DynamoDB tables, and other non-serverless services (Amazon Aurora, MariaDB, Microsoft SQL Server, MySQL, Oracle, PostgreSQL), Crawler, Data Catalog, Amazon Athena]
• Athena queries the Glue Data Catalog
• Glue returns data from the data source
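Running a query against cataloged data from code might look like the sketch below. It takes an Athena client as a parameter, so the polling logic can be exercised without AWS access. In real use the client would come from `boto3.client("athena")`. The function name is an assumption, and any database, table, and output location passed in are placeholders.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Start a query, wait for it to finish, and return the first page of rows."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    result = athena.get_query_results(QueryExecutionId=query_id)
    # Each row is a list of cells; the first row normally holds column names.
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A call could look like `run_athena_query(boto3.client("athena"), 'SELECT * FROM "sensor_data" LIMIT 10', "my_catalog_db", "s3://my-results-bucket/athena/")`, where every name in that call is hypothetical.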
Question
I have HUGE compressed CSV files stored on Amazon S3.
How do I get small bits of data without reading the entire file?
Enter: Amazon S3 Select
import boto3

s3 = boto3.client('s3')

# Run a SQL expression against a single CSV object; only matching rows come back.
r = s3.select_object_content(
    Bucket='jbarr-us-west-2',
    Key='sample-data/airportCodes.csv',
    ExpressionType='SQL',
    Expression="select * from s3object s where s.\"Country (Name)\" like '%United States%'",
    InputSerialization={'CSV': {"FileHeaderInfo": "Use"}},
    OutputSerialization={'CSV': {}},
)

# The response payload is an event stream of Records and Stats events.
for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
Before S3 Select
[Diagram: Lambda function, Bucket]
• Entire file returned
After S3 Select
[Diagram: Lambda function, Bucket]
• Parsed value returned
• Up to 400% faster and 80% cheaper
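One way to see how much less data comes back is to compare the scanned and returned byte counts that S3 Select reports in its `Stats` event. The helper below is a sketch that works on any iterable of events shaped like the `select_object_content` payload. The function name is an assumption, and the sample events in the usage section are made up for illustration.

```python
def summarize_select(events):
    """Collect the selected rows and the scan/return byte counts from an event stream."""
    chunks = []
    stats = {}
    for event in events:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
        elif "Stats" in event:
            stats = event["Stats"]["Details"]
    # Join raw bytes first so a character split across two chunks still decodes.
    text = b"".join(chunks).decode("utf-8")
    return text, stats.get("BytesScanned"), stats.get("BytesReturned")


if __name__ == "__main__":
    # Made-up events standing in for r['Payload'] from select_object_content.
    fake_events = [
        {"Records": {"Payload": b"SEA,Seattle,United States\n"}},
        {"Stats": {"Details": {"BytesScanned": 6000000, "BytesReturned": 26}}},
    ]
    print(summarize_select(fake_events))
```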
Questions?
https://pixabay.com/illustrations/questions-font-who-what-how-why-2245264/
Eric Johnson
@edjgeek
Image Source: https://pixabay.com/illustrations/thank-you-polaroid-letters-2490552/

 
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
 
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
 
BDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWSBDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWS
 
SRV316 Serverless Data Processing at Scale: An Amazon.com Case Study
 SRV316 Serverless Data Processing at Scale: An Amazon.com Case Study SRV316 Serverless Data Processing at Scale: An Amazon.com Case Study
SRV316 Serverless Data Processing at Scale: An Amazon.com Case Study
 
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
 
Data_Analytics_and_AI_ML
Data_Analytics_and_AI_MLData_Analytics_and_AI_ML
Data_Analytics_and_AI_ML
 
Building Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWSBuilding Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWS
 
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWSKeynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS
 
Building a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWSBuilding a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWS
 
Serverless Architectural Patterns and Best Practices
Serverless Architectural Patterns and Best PracticesServerless Architectural Patterns and Best Practices
Serverless Architectural Patterns and Best Practices
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time data
 
Wild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
Wild Rydes with Big Data/Kinesis focus: AWS Serverless WorkshopWild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
Wild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
 
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdf
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdfCome scalare da zero ai tuoi primi 10 milioni di utenti.pdf
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdf
 
Amazon Kinesis - Building Serverless real-time solution - Tel Aviv Summit 2018
Amazon Kinesis - Building Serverless real-time solution - Tel Aviv Summit 2018Amazon Kinesis - Building Serverless real-time solution - Tel Aviv Summit 2018
Amazon Kinesis - Building Serverless real-time solution - Tel Aviv Summit 2018
 
Architetture per l'analisi di flussi di dati in tempo reale
Architetture per l'analisi di flussi di dati in tempo realeArchitetture per l'analisi di flussi di dati in tempo reale
Architetture per l'analisi di flussi di dati in tempo reale
 

Recently uploaded

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

Serverless Big Data Processing

  • 1. Big “Serverless” Data: Powering Big Data with Serverless. Eric Johnson, Senior Developer Advocate - Serverless, AWS, @edjgeek. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Background image by Эдуард Ризванов from Pixabay.
  • 2. Who am I?
      • Sr. Developer Advocate – Serverless, AWS
      • Serverless / Tooling / Automation Geek
      • Software Architect / Solutions Architect
      • Husband to Brigitte
      • Father to Noah, Jake, Owen, Sophie Anne, & Gracie Mae
      • Music lover
      • Pizza / Diet Dr. Pepper fanatic
  • 3. Why are we here?
  • 4. Serverless in big data processing: understanding the role Serverless plays in Big Data
      • Amazon Kinesis Video Streams
      • Amazon Kinesis Data Streams
      • Amazon Kinesis Data Firehose
      • Amazon Kinesis Data Analytics
      • Amazon Athena
      • AWS Lambda
      • Amazon Simple Storage Service
      • Amazon DynamoDB
  • 5. Agenda
      • Ingestion
      • Real-time processing
      • Real-time analytics
      • Post processing
  • 6. What is serverless?
      • No infrastructure provisioning, no management
      • Automatic scaling
      • Pay for value
      • Highly available and secure
  • 7. Ingestion
  • 8. Ingesting data at scale
      • Video ingestion: Amazon Kinesis Video Streams
      • Data ingestion: Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose
  • 9. Video ingestion (Amazon Kinesis Video Streams)
      • Fully managed infrastructure that scales to load
      • Offers SDK in C++ and Java
      • Supports live and on-demand playback of streams
      • Durable storage using Amazon S3
      • Works with many forms of time-encoded data
      • Supports multiple time-code-based formats
  • 10. Data ingestion – Kinesis Data Streams (Amazon Kinesis Data Streams)
      • Uses shards to scale
      • Ingress: 1 MB/second or 1,000 records/second per shard
      • Egress: 2 MB/second per shard
      • Works with Kinesis Data Analytics
      • Can support connected consumers for enhanced fan-out
      • Can store data up to 168 hours (7 days)
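The per-shard limits on this slide imply a simple sizing rule: provision enough shards to satisfy whichever limit is hit first. The following is a minimal sketch of that arithmetic, not an official sizing tool. The helper name `shards_needed` and the sample workload are hypothetical, and the constants are the figures quoted on the slide, which may differ from current service quotas.

```python
import math

# Per-shard limits as quoted on the slide (assumed, not checked against current quotas).
INGRESS_MB_PER_SEC = 1.0
INGRESS_RECORDS_PER_SEC = 1000
EGRESS_MB_PER_SEC = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three per-shard limits."""
    by_write_volume = math.ceil(write_mb_per_sec / INGRESS_MB_PER_SEC)
    by_write_records = math.ceil(records_per_sec / INGRESS_RECORDS_PER_SEC)
    by_read_volume = math.ceil(read_mb_per_sec / EGRESS_MB_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_write_volume, by_write_records, by_read_volume)


if __name__ == "__main__":
    # Hypothetical workload: 3.5 MB/s written, 6,000 records/s, 10 MB/s read.
    print(shards_needed(3.5, 6000, 10.0))  # 6: the record rate is the binding limit
```

In this example the write volume alone would need 4 shards and the read volume 5, but the record rate needs 6, so small, frequent records can drive the shard count more than raw bandwidth does.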
  • 11. Data ingestion – Kinesis Firehose (Amazon Kinesis Data Firehose)
      • Auto-scales to meet load
      • Different regions have different capacity
      • US East: 5,000 records/second, 2,000 transactions/second, and 5 MiB/second
      • Works with Kinesis Data Analytics
      • Can transform data before delivery to target
      • Stores data up to 24 hours on failed delivery
  • 12. Data ingestion – Kinesis Firehose
      • Data sources: Firehose PUT APIs, Amazon Kinesis Agent, AWS IoT, CloudWatch Logs, CloudWatch Events
      • Targets: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service
  • 13. Kinesis Data Streams vs. Kinesis Firehose [Diagram: data producers feeding an Amazon Kinesis Data Stream, and data producers feeding Amazon Kinesis Data Firehose]
  • 14. Kinesis Firehose [Diagram: data producers feeding Amazon Kinesis Data Firehose]
      Use Kinesis Firehose when you need:
      • Ability to transform data in the stream
      • Auto scaling for unpredictable load
      • Multiple targets for final data
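The "transform data in the stream" point is usually implemented with a Lambda function that Firehose invokes on each batch. The sketch below follows the record contract used for Firehose data transformation (each input record carries a `recordId` and base64 `data`; each output record returns the same `recordId`, a `result`, and base64 `data`). The transformation itself, upper-casing a `status` field, is a made-up example, and the payload shape is an assumption borrowed from the sensor data shown later in the deck.

```python
import base64
import json


def handler(event, context):
    """Firehose transformation sketch: normalize the (assumed) `status` field."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["status"] = str(payload.get("status", "")).upper()
            # Newline-delimit records so downstream readers can split them.
            line = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line).decode("utf-8"),
            })
        except (ValueError, AttributeError):
            # Not valid base64/JSON, or not a JSON object: hand it back unchanged.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Returning every `recordId` exactly once matters here: records marked `ProcessingFailed` are treated as failed by the delivery stream rather than silently lost.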
  • 15. Kinesis Data Streams [Diagram: data producers feeding an Amazon Kinesis Data Stream]
      Use Kinesis Data Streams when:
      • You have semi-predictable traffic
      • You need to perform real-time action on data in the stream
  • 16. Real-time processing
  • 17. Kinesis Data Stream + Lambda [Diagram: data producers and AWS IoT Core, an Amazon Kinesis Data Stream, three Lambda functions, Amazon DynamoDB, and a second Amazon Kinesis Data Stream]
      • The Lambda service handles intermittent polling via the GetRecords API
  • 18. Kinesis Data Stream + Lambda (same diagram as slide 17)
      • The Lambda service handles intermittent polling via the GetRecords API
      • All applications share 2 MB/second/shard egress
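Because the Lambda service does the polling, the function only has to unpack the batch it is handed. A minimal sketch of such a handler follows, assuming JSON payloads. The `put_item` parameter is a hypothetical stand-in for whatever sink the function writes to (the slide shows DynamoDB and a second stream); it defaults to `print` so the sketch runs without AWS access.

```python
import base64
import json


def decode_records(event):
    """Yield (sequence_number, payload) for each Kinesis record in a Lambda event."""
    for record in event.get("Records", []):
        kinesis = record["kinesis"]
        # Kinesis delivers the record body base64-encoded.
        payload = json.loads(base64.b64decode(kinesis["data"]))
        yield kinesis["sequenceNumber"], payload


def handler(event, context, put_item=print):
    """Process one batch; `put_item` is a hypothetical sink, e.g. a DynamoDB write."""
    processed = 0
    for sequence_number, payload in decode_records(event):
        put_item({"sequenceNumber": sequence_number, **payload})
        processed += 1
    return {"processed": processed}
```

Keeping the decoding separate from the sink makes the batch logic testable locally with a plain list in place of the database.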
  • 19. Kinesis Data Stream + Enhanced Fanout + Lambda [Diagram: data producers and AWS IoT Core, an Amazon Kinesis Data Stream, three Lambda functions, Amazon DynamoDB, and a second Amazon Kinesis Data Stream]
      • Functions triggered by consumers
  • 20. Kinesis Data Stream + Enhanced Fanout + Lambda (same diagram as slide 19)
      • Functions triggered by consumers
      • Each consumer provides an individual 2 MB/second/shard egress
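The difference between the shared and the per-consumer egress limit is easiest to see with numbers. This is a small illustrative calculation using the 2 MB/second/shard figure from the slides; the function name `per_consumer_egress` and the example stream size are hypothetical.

```python
EGRESS_MB_PER_SEC_PER_SHARD = 2.0  # figure quoted on the slides


def per_consumer_egress(shards, consumers, enhanced_fan_out):
    """Return the read throughput in MB/s available to each consumer."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must both be at least 1")
    total = shards * EGRESS_MB_PER_SEC_PER_SHARD
    if enhanced_fan_out:
        # Each registered consumer gets its own 2 MB/s per shard.
        return total
    # Standard consumers split the shared 2 MB/s per shard between them.
    return total / consumers


if __name__ == "__main__":
    # Hypothetical stream: 4 shards read by 4 applications.
    print(per_consumer_egress(4, 4, enhanced_fan_out=False))  # 2.0
    print(per_consumer_egress(4, 4, enhanced_fan_out=True))   # 8.0
```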
  • 21. Video Processing [Diagram: Amazon Kinesis Video Streams, Amazon Rekognition Video, Amazon SageMaker, S3 bucket]
  • 22. Video Processing (same diagram as slide 21)
      • Real-time analysis and machine learning
  • 23. Video Processing (same diagram as slide 21)
      • Real-time analysis and machine learning
      • HLS-compatible live or on-demand playback
  • 24. Video Processing (same diagram as slide 21)
      • Real-time analysis and machine learning
      • HLS-compatible live or on-demand playback
      • Near real-time processing
  • 25. Real-time analytics
  • 26. Amazon Kinesis Data Analytics
      • Built-in functions to filter, aggregate, and transform streaming data
      • Processes streaming data with sub-second latencies
      • Build SQL queries that perform joins, aggregations over time windows, and filters
      • Includes open source libraries based on Apache Flink that enable you to build an application in hours instead of months
  • 27. Real-time analytics [Diagram: Amazon Kinesis Data Stream and Amazon Kinesis Data Firehose feeding Amazon Kinesis Data Analytics]
      • Stream source can be Kinesis Data Stream or Firehose
  • 28. Inside Kinesis Data Analytics [Diagram: stream data split into FAIL_STREAM and WARN_STREAM]
      -- Create Fail Stream --
      CREATE OR REPLACE STREAM "FAIL_STREAM" (
          sensorId INT,
          currentTemperature INT,
          status VARCHAR(10)
      );
      CREATE OR REPLACE PUMP "FAIL_STREAM_PUMP" AS
          INSERT INTO "FAIL_STREAM"
          SELECT "sensorId", "currentTemperature", "status"
          FROM "SOURCE_SQL_STREAM_001"
          WHERE "status" SIMILAR TO '%FAIL%';

      -- Create Warn Stream --
      CREATE OR REPLACE STREAM "WARN_STREAM" (
          sensorId INT,
          currentTemperature INT,
          status VARCHAR(10)
      );
      CREATE OR REPLACE PUMP "WARN_STREAM_PUMP" AS
          INSERT INTO "WARN_STREAM"
          SELECT "sensorId", "currentTemperature", "status"
          FROM "SOURCE_SQL_STREAM_001"
          WHERE "status" SIMILAR TO '%WARN%';
  • 29. Inside Kinesis Data Analytics (same SQL and diagram as slide 28)
      • Use SQL or Apache Flink to filter data
  • 30. Inside Kinesis Data Analytics (same SQL as slide 28)
      • FAIL_STREAM feeds AWS Lambda: alert, diagnose, remediate
  • 31. Inside Kinesis Data Analytics (same SQL as slide 28)
      • WARN_STREAM feeds an Amazon Kinesis Data Stream: dashboards, consumer response
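For readers less familiar with streaming SQL, the two pumps behave like a pair of independent filters over the same source: `SIMILAR TO '%FAIL%'` matches any `status` containing `FAIL`, and likewise for `WARN`. The sketch below mirrors that routing in plain Python. It is an illustration of the logic only, not how Kinesis Data Analytics executes it, and the function name `route` is hypothetical.

```python
FIELDS = ("sensorId", "currentTemperature", "status")


def route(records):
    """Split records the way the two pumps do: substring match on `status`."""
    fail_stream, warn_stream = [], []
    for record in records:
        status = record.get("status", "")
        # Keep only the three columns the pumps select.
        row = {name: record.get(name) for name in FIELDS}
        # The pumps are independent, so each test is applied separately.
        if "FAIL" in status:
            fail_stream.append(row)
        if "WARN" in status:
            warn_stream.append(row)
    return fail_stream, warn_stream


if __name__ == "__main__":
    sample = [
        {"sensorId": 1, "currentTemperature": 70, "status": "OK"},
        {"sensorId": 2, "currentTemperature": 95, "status": "WARN"},
        {"sensorId": 3, "currentTemperature": 120, "status": "FAIL"},
    ]
    fails, warns = route(sample)
    print(len(fails), len(warns))  # 1 1
```

Records whose status matches neither pattern appear in neither output, which is why the deck goes on to ask what happens to the raw data.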
  • 32. Real-time analytics [Diagram: Amazon Kinesis Data Stream and Amazon Kinesis Data Firehose feeding Amazon Kinesis Data Analytics; FAIL_STREAM to AWS Lambda; WARN_STREAM to an Amazon Kinesis Data Stream]
      • What about the raw data?
  • 33. Real-time analytics (same diagram as slide 32)
      • Raw data archive via Amazon Kinesis Data Firehose
  • 34. Post processing
  • 35. Serverless data storage [Diagram: Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, and Amazon Kinesis Data Analytics alongside Amazon Simple Storage Service, Amazon DynamoDB, Amazon Timestream, Amazon Quantum Ledger Database, and Amazon CloudWatch]
  • 36. Serverless data storage (same diagram as slide 35)
      • How you need to process your data determines where to store it
  • 37. Serverless storage options
      • Amazon Simple Storage Service: object storage; unstructured data
      • Amazon DynamoDB: NoSQL; key-value or document data
      • Amazon Timestream: time series database
      • Amazon Quantum Ledger Database: immutable and transparent; cryptographically verifiable
      • Amazon CloudWatch: structured data; alerting built in
  • 38. Post processing – Serverless Tools
      • Amazon Athena: query S3 data with standard SQL expressions
      • Amazon S3 Select: retrieve subsets of object data instead of the entire object
      • AWS Glue: extract, transform, and load (ETL) service that works across multiple services
  • 39. AWS Glue [Diagram: S3 buckets, DynamoDB tables, and other non-serverless services: MariaDB, Microsoft SQL Server, MySQL, Oracle, PostgreSQL]
      • Critical data can be stored in many places
  • 40. AWS Glue (same diagram as slide 39, with a Crawler and the Data Catalog added)
  • 41. AWS Glue (same diagram as slide 40)
      What the crawler is doing:
      • Classifies data to determine the format, schema, and associated properties of the raw data
      • Groups data into tables or partitions, based on crawler heuristics
      • Writes metadata to the Data Catalog
  • 42. AWS Glue (same diagram as slide 40)
      • This catalog contains metadata about the data stores
      • How do I get the data itself in a meaningful way?
  • 43. Enter: Amazon Athena (same diagram as slide 40, with Amazon Athena added)
  • 44. Amazon Athena (same diagram as slide 43; the list of other non-serverless services now also includes Amazon Aurora)
      • Athena queries the Glue Data Catalog
      • Glue returns data from the data source
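Querying through Athena from code follows a start, poll, fetch pattern. The sketch below wraps that pattern around an Athena client passed in by the caller (for example `boto3.client("athena")`); the three client calls are standard boto3 Athena operations, while the helper name `run_query` and the database, table, and bucket names in the usage note are hypothetical. It reads only the first page of results, which is enough for small result sets.

```python
import time


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Run `sql` against a Glue Data Catalog database and return rows as lists of strings."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state is reached.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # First page only; for SELECT queries the first row holds the column names.
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    return [[column.get("VarCharValue") for column in row["Data"]] for row in rows]
```

A call might look like `run_query(boto3.client("athena"), 'SELECT * FROM sensor_data LIMIT 10', "iot_catalog", "s3://example-athena-results/")`, where every name is a placeholder for a crawled database, table, and results bucket.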
  • 45. Amazon Athena
  • 46. Question: I have HUGE compressed CSV files stored on Amazon S3. How do I get small bits of data without reading the entire file?
  • 47. Enter: Amazon S3 Select
      import boto3

      s3 = boto3.client('s3')

      # Ask S3 to run the SQL expression server-side and return only matching rows.
      r = s3.select_object_content(
          Bucket='jbarr-us-west-2',
          Key='sample-data/airportCodes.csv',
          ExpressionType='SQL',
          Expression="select * from s3object s where s.\"Country (Name)\" like '%United States%'",
          InputSerialization={'CSV': {"FileHeaderInfo": "Use"}},
          OutputSerialization={'CSV': {}},
      )

      # The response payload is an event stream: record chunks, then statistics.
      for event in r['Payload']:
          if 'Records' in event:
              records = event['Records']['Payload'].decode('utf-8')
              print(records)
          elif 'Stats' in event:
              statsDetails = event['Stats']['Details']
  • 48. Before S3 Select [Diagram: Lambda function and bucket]
      • Entire file returned
  • 49. After S3 Select [Diagram: Lambda function and bucket]
      • Parsed value returned
      • Up to 400% faster and 80% cheaper
  • 50. Questions? Image source: https://pixabay.com/illustrations/questions-font-who-what-how-why-2245264/
  • 51. Eric Johnson, @edjgeek. Image source: https://pixabay.com/illustrations/thank-you-polaroid-letters-2490552/