Alex Casalboni
Technical Evangelist, AWS
@alex_casalboni
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved
From Data Collection to Actionable Insights in 60 Seconds
About me
• Software Engineer & Web Developer
• Startupper for 4.5 years
• Serverless Lover & AI Enthusiast
• AWS Customer since 2013
Agenda
1. Data Challenges
2. Columnar Formats
3. Data Lakes vs. Data Warehouses
4. Serverless Analytics
5. Demo time
Data Challenges
Data variety and data volumes are increasing rapidly
Multiple Consumers and Applications
Ingest
Discover
Catalog
Understand
Curate
Find insights
Right tool for the job
Customer Needs Come First
Purpose-Built Analytics on AWS
Collect Store Analyze
Amazon Kinesis Firehose
AWS Direct Connect
Amazon Snowball
Amazon Kinesis Analytics
Amazon Kinesis Streams
Amazon S3
Amazon Glacier
Amazon CloudSearch
Amazon RDS, Amazon Aurora
Amazon DynamoDB
Amazon Elasticsearch
Amazon EMR
Amazon Redshift
Amazon QuickSight
AWS Database Migration Service
AWS Glue
Amazon Athena
Amazon AI
Open-source standards (Apache)
Parquet, ORC, etc.
Optimize Performance
Optimize Costs
Analytical queries
Columnar data
Under the hood
Why it matters
Big Data Analytics
Real-time Analytics
Data exploration
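A toy illustration of why columnar layouts help analytical queries: an aggregate over one column only needs that column's bytes, while a row layout forces a scan of every field. This is a minimal sketch in plain Python, not how Parquet or ORC actually encode data. The sample records, the field names, and the use of JSON length as a stand-in for bytes scanned are all assumptions made for illustration.

```python
import json

# Hypothetical sample records; field names are illustrative only.
ROWS = [
    {"user_id": i, "country": "PT" if i % 2 else "IT",
     "amount": i * 1.5, "note": "x" * 50}
    for i in range(1000)
]


def row_layout_bytes(rows):
    """Bytes touched when every record is stored (and read) as a whole row."""
    return sum(len(json.dumps(row)) for row in rows)


def columnar_layout_bytes(rows, columns):
    """Bytes touched when each column is stored separately and only
    the requested columns are read."""
    return sum(len(json.dumps([row[c] for row in rows])) for c in columns)


def total_amount(rows):
    """The analytical query: SUM(amount). It needs a single column."""
    return sum(row["amount"] for row in rows)


if __name__ == "__main__":
    full = row_layout_bytes(ROWS)
    pruned = columnar_layout_bytes(ROWS, ["amount"])
    print(f"row layout scans      {full} bytes")
    print(f"columnar layout scans {pruned} bytes")
    print(f"SUM(amount) = {total_amount(ROWS)}")
```

When a service bills per GB scanned, reading fewer bytes is what ties "Optimize Performance" to "Optimize Costs".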
Traditional Data Warehouse
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence
Relational data
Terabytes to Petabytes scale
Schema defined prior to data load
Operational reporting and ad-hoc analysis
Data Lakes extend traditional warehouses
Relational and non-relational data
Terabytes to Exabytes scale
Schema defined during analysis
(Schema on Read)
Diverse analytical engines to gain insights
Designed for low cost storage and analytics
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence
Data Lake
Devices Web Sensors Social
Data Catalog
Machine Learning
DW Queries
Big data processing
Interactive
Real-time
Snowball
Snowmobile
Kinesis Data Firehose
Kinesis Data Streams
Amazon S3
AWS Glue
Wide variety of ways to bring data in
Durability and availability at Exabyte scale
Security, compliance, and audit capabilities
Run any analytics on the same data without movement
Scale storage and compute independently
Store at $0.023 / GB-month
Query for $0.05 / GB scanned
Redshift
EMR
Athena
Kinesis
Elasticsearch Service
Data Lakes on AWS
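The two prices quoted on this slide can be turned into a rough monthly estimate. The sketch below uses the slide's figures as constants; the workload numbers (1 TB stored, 200 GB scanned) are hypothetical, and real prices vary by region, storage class, and service.

```python
# Prices as quoted on the slide; real prices vary by region and over time.
STORAGE_USD_PER_GB_MONTH = 0.023
QUERY_USD_PER_GB_SCANNED = 0.05


def monthly_cost(stored_gb, scanned_gb):
    """Estimated monthly bill: storage plus data scanned by queries."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    queries = scanned_gb * QUERY_USD_PER_GB_SCANNED
    return storage + queries


if __name__ == "__main__":
    # Hypothetical workload: 1 TB stored, 200 GB scanned per month.
    print(f"${monthly_cost(1000, 200):.2f}")
```

Because storage and query charges are separate terms, cutting the data scanned per query lowers the bill without touching what is stored.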
Data Lake Components
Catalog & Search
Access and search metadata
Access & User Interface
Give your users easy and secure access
DynamoDB Elasticsearch API Gateway Identity & Access
Management
Cognito
QuickSight Amazon AI EMR Redshift
Athena Kinesis RDS
Central Storage
Secure, cost-effective
Storage in Amazon S3
S3
Snowball Database Migration
Service
Kinesis Firehose Direct Connect
Data Ingestion
Get your data into S3
Quickly and securely
Protect and Secure
Use entitlements to ensure data is secure and users' identities are verified
Processing & Analytics
Use of predictive and prescriptive
analytics to gain better understanding
Security Token
Service
CloudWatch CloudTrail Key Management
Service
Glue ETL
Serverless Analytics
Deliver cost-effective analytic solutions faster
S3
Data Lake
Glue
(Data Catalog
and ETL)
Redshift
Spectrum
QuickSight
Serverless
Zero infrastructure
Zero administration
Pay only for what you use, not for idle resources
Availability and fault tolerance built in
Automatically scales resources with usage
Snowball
Snowmobile
Kinesis Data Firehose
many other sources
Other BI Tools
Amazon Athena
Amazon EMR
"Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing)."
Gartner
CRM, ERP, Data warehouse, Mainframe data
Web, Social, Log files, Machine data
Semi-structured
Unstructured
AWS Glue—Serverless Data Catalog & ETL
Data Catalog: discover data and extract schema
ETL: auto-generate customizable code in Python and Spark
Automatically discovers data and stores schema
Data is immediately searchable and available for ETL
Generates customizable code
Schedules and runs your ETL jobs
Serverless Model
Crawlers: Automatic Schema Inference
semi-structured per-file schema
semi-structured unified schema
identify file type and parse files
enumerate S3 objects
file 1
file 2
file N
…
custom classifiers
Grok based parser
built-in classifiers
JSON parser
CSV parser
Parquet parser
…
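The crawler flow on this slide (enumerate objects, identify the file type, derive a per-file schema, merge into a unified schema) can be sketched as a chain of classifiers where custom classifiers run before the built-in ones. This is a simplified illustration, not Glue's implementation: the function names, the two toy built-in classifiers, and the "first type seen wins" merge rule are all assumptions.

```python
import csv
import io
import json


def classify_json(text):
    """Toy built-in classifier: treat the first line as a JSON object."""
    try:
        record = json.loads(text.splitlines()[0])
    except (ValueError, IndexError):
        return None
    if not isinstance(record, dict):
        return None
    return {key: type(value).__name__ for key, value in record.items()}


def classify_csv(text):
    """Toy built-in classifier: comma-delimited text with a header row."""
    lines = text.splitlines()
    if len(lines) < 2 or "," not in lines[0]:
        return None
    header = next(csv.reader(io.StringIO(lines[0])))
    return {name: "str" for name in header}


BUILT_IN_CLASSIFIERS = [classify_json, classify_csv]


def infer_schema(text, custom_classifiers=()):
    """Per-file schema: custom classifiers are tried first, then built-ins."""
    for classifier in list(custom_classifiers) + BUILT_IN_CLASSIFIERS:
        schema = classifier(text)
        if schema is not None:
            return schema
    return None


def unify(schemas):
    """Unified schema: union of all fields (assumption: first type seen wins)."""
    unified = {}
    for schema in schemas:
        for name, type_name in schema.items():
            unified.setdefault(name, type_name)
    return unified


if __name__ == "__main__":
    files = ['{"id": 1, "name": "a"}', '{"id": 2, "tags": ["x"]}']
    print(unify(infer_schema(f) for f in files))
```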
IAM Role
Glue Crawler
Data Lakes
Data Warehouse
Databases
Amazon
RDS
Amazon
Redshift
Amazon S3
JDBC Connection
Object Connection
Built-In Classifiers
MySQL
MariaDB
PostgreSQL
Aurora
Redshift
Avro
Parquet
ORC
XML
JSON & BSON
Logs (Apache (Grok), Linux (Grok), MS (Grok), Ruby, Redis, and many others)
Delimited (comma, pipe, tab, semicolon)
Compressions (ZIP, BZIP, GZIP, LZ4, Snappy)
What can Crawlers Classify?
Detecting Schema Similarity
Schema A: root { name: str, id: num, addr { street: str, city: str, zip: num } }
Schema B: root { name: str, id: num, addr: str }
Schema similarity heuristic
• 1 point for matching name
• 1 point for matching data type
• Match when similarity index > 0.7
sim = intersection / min(A, B) = 7 / 8 = 0.875
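One reading of the heuristic that reproduces the slide's 7 / 8 figure: every node, including the root, is worth two points (name and type), the intersection counts the points two schemas share, and the denominator is the point total of the smaller schema. The sketch below implements that interpretation. The nested-dict representation, the function names, and the exact counting rule are assumptions; the slide does not spell out how Glue counts points.

```python
def flatten(schema, path="root"):
    """Map every node path to its type; nested dicts are 'struct' nodes."""
    if isinstance(schema, dict):
        nodes = {path: "struct"}
        for name, child in schema.items():
            nodes.update(flatten(child, f"{path}.{name}"))
        return nodes
    return {path: schema}


def similarity(schema_a, schema_b):
    """Shared points divided by the point total of the smaller schema."""
    nodes_a, nodes_b = flatten(schema_a), flatten(schema_b)
    points = 0
    for path, type_a in nodes_a.items():
        if path in nodes_b:
            points += 1                      # 1 point for matching name
            if nodes_b[path] == type_a:
                points += 1                  # 1 point for matching data type
    smaller_total = 2 * min(len(nodes_a), len(nodes_b))
    return points / smaller_total


SCHEMA_A = {"name": "str", "id": "num",
            "addr": {"street": "str", "city": "str", "zip": "num"}}
SCHEMA_B = {"name": "str", "id": "num", "addr": "str"}

if __name__ == "__main__":
    sim = similarity(SCHEMA_A, SCHEMA_B)
    print(sim, "match" if sim > 0.7 else "no match")
```

Here `addr` earns its name point but not its type point (struct in A, str in B), which is where the single missing point comes from.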
Available partitions
Automatically Detect Partitions
Automatically update table version as data evolves
Automatic Schema Versioning
Other Ways of Creating Tables
Call Glue’s CreateTable API
Create table manually Run Hive DDL statement
Apache Hive
Metastore
AWS GLUE ETL AWS GLUE
DATA CATALOG
Import from Apache Hive Metastore
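As an illustration of the CreateTable route, the sketch below builds a `TableInput` payload for a Parquet table on S3 and passes it to the boto3 Glue client's `create_table` call. The database, table, bucket, column, and partition names are hypothetical, and the call needs the third-party boto3 package plus AWS credentials with Glue permissions, so the example only prints the payload by default.

```python
def build_table_input(name, location, columns, partition_keys=()):
    """Build a Glue TableInput dict for an external Parquet table."""
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "PartitionKeys": [{"Name": n, "Type": t} for n, t in partition_keys],
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": location,
            "InputFormat":
                "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat":
                "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary":
                    "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
            },
        },
    }


def create_table(database, table_input):
    """Register the table in the Glue Data Catalog (requires boto3 + credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    glue = boto3.client("glue")
    return glue.create_table(DatabaseName=database, TableInput=table_input)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    table = build_table_input(
        name="events",
        location="s3://example-bucket/events/",
        columns=[("user_id", "bigint"), ("amount", "double")],
        partition_keys=[("year", "string"), ("month", "string")],
    )
    print(table)
    # create_table("example_db", table)  # uncomment with valid AWS credentials
```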
Amazon Redshift - Data Warehousing
Fast, powerful, simple, and fully managed data warehouse at 1/10 the cost
Massively parallel, scale from gigabytes to petabytes
Fast at any scale: Columnar storage technology to improve I/O efficiency and scale query performance
Inexpensive: As low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour
Open file formats: Analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3
Secure: Audit everything; encrypt data end-to-end; extensive certification and compliance
Amazon Redshift Spectrum
Extend the data warehouse to exabytes of data in an S3 data lake
Exabyte Redshift SQL queries against S3
Join data across Redshift and S3
Scale compute and storage separately
Stable query performance and unlimited concurrency
CSV, ORC, Grok, Avro, & Parquet data formats
Pay only for the amount of data scanned
S3 data lake
Redshift data
Redshift Spectrum query engine
Redshift Spectrum
Query your data lake
Amazon
Redshift
JDBC/ODBC
...
1 2 3 4 N
Amazon S3
Exabyte-scale object storage
AWS Glue
Data Catalog
Redshift Spectrum
Scale-out serverless compute
SELECT COUNT(*)
FROM S3.EXT_TABLE
GROUP BY …
Data Lake on Amazon S3 with AWS Glue
On premises data
Web app data
Amazon RDS
Other databases
Streaming data
Your data
AMAZON
QUICKSIGHT
AWS GLUE ETL
Demo Time
Uncompressed
Compressed (-94%)
Parquet (-70%)
Partitioned (-70%)
Overall 99.5% improvement!
Uncompressed
Compressed (-94%)
Parquet (-100%)
Partitioned (-100%)
Overall 100% improvement!
Uncompressed
Compressed (-94%)
Parquet (-72%)
Partitioned (-100%)
Overall 100% improvement!
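The "Overall" figures are consistent with each step's reduction being measured relative to the previous step, so the reductions compound rather than add. A small sketch of that arithmetic follows; the compounding interpretation is an assumption inferred from the numbers, and the slide text does not name the metric each chart reports.

```python
def overall_improvement(step_reductions):
    """Combine per-step reductions, each relative to the previous step."""
    remaining = 1.0
    for reduction in step_reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining


if __name__ == "__main__":
    # First chart: compressed (-94%), Parquet (-70%), partitioned (-70%).
    first = overall_improvement([0.94, 0.70, 0.70])
    print(f"{first * 100:.1f}% overall")
```

For the first chart, 0.06 × 0.30 × 0.30 = 0.0054 of the original remains, which rounds to the 99.5% quoted. In the other two charts a step is listed at -100%, so nothing remains and the overall figure is 100%.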
Amazon
QuickSight
About me
Kinesis Data Generator (KDG)
github.com/awslabs/amazon-kinesis-data-generator
Serverless Data Pipeline powered by AWS SAM
github.com/alexcasalboni/serverless-data-pipeline-sam
AWS Big Data Blog
aws.amazon.com/blogs/big-data
Additional Resources
AMAZON CONFIDENTIAL
Did We Scan Your Badge?
Remember to opt-in to AWS
communications and you will receive a
post-event email with a link to:
• AWS Developer Workshop Slides
• $200 in AWS Credits
Alex Casalboni
Technical Evangelist, AWS
Thank you!
@alex_casalboni

 

From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Workshop - Web Summit 2018

  • 1. From Data Collection to Actionable Insights in 60 Seconds. Alex Casalboni, Technical Evangelist, AWS (@alex_casalboni). © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. About me • Software Engineer & Web Developer • Startupper for 4.5 years • Serverless Lover & AI Enthusiast • AWS Customer since 2013
  • 3. Agenda 1. Data Challenges 2. Columnar Formats 3. Data Lakes vs. Data Warehouses 4. Serverless Analytics 5. Demo time
  • 4. Data Challenges Data variety and data volumes are increasing rapidly Multiple Consumers and Applications Ingest Discover Catalog Understand Curate Find insights
  • 5.
  • 6. Right tool for the job Customer Needs Come First
  • 7. Purpose-Built Analytics on AWS Collect Store Analyze Amazon Kinesis Firehose AWS Direct Connect Amazon Snowball Amazon Kinesis Analytics Amazon Kinesis Streams Amazon S3 Amazon Glacier Amazon CloudSearch Amazon RDS, Amazon Aurora Amazon DynamoDB Amazon Elasticsearch Amazon EMR Amazon Redshift Amazon QuickSight AWS Database Migration Service AWS Glue Amazon Athena Amazon AI
  • 8. Open-source standards (Apache) Parquet, ORC, etc. Optimize Performance Optimize Costs Analytical queries Columnar data
  • 10. Why it matters Big Data Analytics Real-time Analytics Data exploration
  • 11. Traditional Data Warehouse OLTP ERP CRM LOB Data Warehouse Business Intelligence Relational data Terabytes to Petabytes scale Schema defined prior to data load Operational reporting and ad-hoc analysis
  • 12. Data Lakes extend traditional warehouses Relational and non-relational data Terabytes to Exabytes scale Schema defined during analysis (Schema on Read) Diverse analytical engines to gain insights Designed for low cost storage and analytics OLTP ERP CRM LOB Data Warehouse Business Intelligence Data Lake 1001100001001010111 0010101011100101010 0001011111011010 0011110010110010110 0100011000010 Devices Web Sensors Social Data Catalog Machine Learning DW Queries Big data processing Interactive Real-time
  • 13. Data Lakes on AWS: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Amazon S3, AWS Glue. Wide variety of ways to bring data in. Durability and availability at Exabyte scale. Security, compliance, and audit capabilities. Run any analytics on the same data without movement. Scale storage and compute independently. Store at $0.023 / GB-month. Query for $0.05 / GB scanned. Redshift, EMR, Athena, Kinesis, Elasticsearch Service.
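
The two rates on slide 13 can be combined into a rough monthly estimate. The sketch below is only arithmetic on the figures quoted on the slide (2018 list prices, which may differ by region and over time); the function name and the example volumes are hypothetical.

```python
# Rates as quoted on slide 13; treat them as illustrative, not current pricing.
STORAGE_USD_PER_GB_MONTH = 0.023   # "Store at $0.023 / GB-month"
QUERY_USD_PER_GB_SCANNED = 0.05    # "Query for $0.05 / GB scanned"


def estimate_monthly_cost(stored_gb: float, scanned_gb: float) -> float:
    """Return storage cost plus query cost for one month, in USD."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    queries = scanned_gb * QUERY_USD_PER_GB_SCANNED
    return storage + queries


if __name__ == "__main__":
    # Hypothetical workload: 1,000 GB stored, 200 GB scanned by queries.
    print(f"{estimate_monthly_cost(1000, 200):.2f} USD")
```

Because the query charge depends on bytes scanned rather than bytes stored, a columnar format that lets a query read only the columns it needs lowers the second term without changing the first.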
  • 14. Data Lake Components. Catalog & Search: access and search metadata. Access & User Interface: give your users easy and secure access. DynamoDB, Elasticsearch, API Gateway, Identity & Access Management, Cognito, QuickSight, Amazon AI, EMR, Redshift, Athena, Kinesis, RDS. Central Storage: secure, cost-effective storage in Amazon S3. S3, Snowball, Database Migration Service, Kinesis Firehose, Direct Connect. Data Ingestion: get your data into S3 quickly and securely. Protect and Secure: use entitlements to ensure data is secure and users’ identities are verified. Processing & Analytics: use of predictive and prescriptive analytics to gain better understanding. Security Token Service, CloudWatch, CloudTrail, Key Management Service, Glue ETL.
  • 15. Serverless Analytics: deliver cost-effective analytic solutions faster. S3 Data Lake, Glue (Data Catalog and ETL), Redshift Spectrum, QuickSight. Serverless: zero infrastructure, zero administration. Pay only for what you use, not for idle resources. Availability and fault tolerance built in. Automatically scales resources with usage. Sources: Snowball, Snowmobile, Kinesis Data Firehose, many other sources. Other BI Tools, Amazon Athena, Amazon EMR.
  • 16. “Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).” Gartner. CRM, ERP, Data warehouse, Mainframe data, Web, Social, Log files, Machine data, Semi-structured, Unstructured.
  • 17. AWS Glue—Serverless Data Catalog & ETL Data Catalog ETL Discover data and extract schema Auto-generate customizable code in Python and Spark Automatically discovers data and stores schema Data is immediately searchable and available for ETL Generates customizable code Schedules and runs your ETL jobs Serverless Model
  • 18. Crawlers: Automatic Schema Inference. Enumerate S3 objects (file 1, file 2, …, file N). Identify file type and parse files: custom classifiers (Grok based parser); built-in classifiers (JSON parser, CSV parser, Parquet parser, …). Semi-structured per-file schema. Semi-structured unified schema.
  • 19. What can Crawlers Classify? IAM Role, Glue Crawler. Data Lakes: Amazon S3 (Object Connection). Data Warehouse: Amazon Redshift. Databases: Amazon RDS (JDBC Connection). Built-In Classifiers: MySQL, MariaDB, PostgreSQL, Aurora, Redshift, Avro, Parquet, ORC, XML, JSON & BSON, Logs (Apache (Grok), Linux (Grok), MS (Grok), Ruby, Redis, and many others), Delimited (comma, pipe, tab, semicolon), Compressions (ZIP, BZIP, GZIP, LZ4, Snappy).
  • 20. Detecting Schema Similarity. Schema A: root { name: str, id: num, addr { street: str, city: str, zip: num } }. Schema B: root { name: str, id: num, addr: str }. Schema similarity heuristic: 1 point for matching name; 1 point for matching data type; match when similarity index > 0.7. sim = intersection / min(A, B) = 7 / 8 = 0.875.
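
The heuristic on slide 20 can be sketched in a few lines. The slide gives only the scoring rule and the 7/8 result, so the details here are assumptions: schemas are modelled as nested dicts, nested nodes count as fields of type "struct", and fields are matched by their full path. This is not Glue's actual implementation, just one reading that reproduces the number on the slide.

```python
def flatten(schema: dict, prefix: str = "") -> dict:
    """Map every field path to its type; nested dicts become 'struct' nodes."""
    fields = {}
    for name, value in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            fields[path] = "struct"
            fields.update(flatten(value, path))
        else:
            fields[path] = value
    return fields


def similarity(schema_a: dict, schema_b: dict) -> float:
    """Score 1 point per matching name plus 1 per matching type,
    divided by the maximum score of the smaller schema.
    Assumes both schemas are non-empty."""
    a, b = flatten(schema_a), flatten(schema_b)
    points = 0
    for path, type_a in a.items():
        if path in b:
            points += 1              # matching name
            if b[path] == type_a:
                points += 1          # matching data type
    return points / min(2 * len(a), 2 * len(b))


SCHEMA_A = {"root": {"name": "str", "id": "num",
                     "addr": {"street": "str", "city": "str", "zip": "num"}}}
SCHEMA_B = {"root": {"name": "str", "id": "num", "addr": "str"}}

if __name__ == "__main__":
    sim = similarity(SCHEMA_A, SCHEMA_B)
    print(sim, "match" if sim > 0.7 else "no match")
```

Under this reading, root, name and id each score 2 points and addr scores 1 (same name, different type), giving 7 out of the 8 points available in Schema B.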
  • 22. Automatic Schema Versioning: automatically update table version as data evolves.
  • 23. Other Ways of Creating Tables: call Glue’s CreateTable API; create table manually; run Hive DDL statement; import from Apache Hive Metastore. Diagram: Apache Hive Metastore, AWS Glue ETL, AWS Glue Data Catalog.
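
As an illustration of the first option on slide 23, here is a minimal sketch of calling the Glue CreateTable API through boto3 (`create_table` on the `glue` client). The database name, table name, bucket path and columns are hypothetical placeholders, and the Parquet input/output format and SerDe class names are assumed to be the standard Hive ones. The helper that builds the request is separated from the call so it can be inspected without AWS credentials.

```python
def build_table_input(name: str, s3_location: str, columns: list) -> dict:
    """Build a TableInput structure for an external Parquet table.

    `columns` is a list of (column_name, hive_type) pairs.
    """
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": s3_location,
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }


def create_table(database: str, table_input: dict) -> dict:
    """Register the table in the Glue Data Catalog (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    return glue.create_table(DatabaseName=database, TableInput=table_input)


# Hypothetical example values, not taken from the talk.
EXAMPLE_TABLE = build_table_input(
    "events",
    "s3://example-bucket/events/",
    [("id", "bigint"), ("name", "string")],
)
```

Calling `create_table("example_db", EXAMPLE_TABLE)` would register the table, assuming the database already exists and the caller has the required Glue permissions.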
  • 24. Amazon Redshift - Data Warehousing. Fast, powerful, simple, and fully managed data warehouse at 1/10 the cost. Massively parallel, scale from gigabytes to petabytes. Fast at any scale: columnar storage technology to improve I/O efficiency and scale query performance. Inexpensive: as low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour. Open file formats: analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3. Secure: audit everything; encrypt data end-to-end; extensive certification and compliance.
  • 25. Amazon Redshift Spectrum: extend the data warehouse to exabytes of data in an S3 data lake. Exabyte Redshift SQL queries against S3. Join data across Redshift and S3. Scale compute and storage separately. Stable query performance and unlimited concurrency. CSV, ORC, Grok, Avro, & Parquet data formats. Pay only for the amount of data scanned. Diagram: S3 data lake, Redshift data, Redshift Spectrum query engine.
  • 26. Redshift Spectrum: query your data lake. Amazon Redshift (JDBC/ODBC). Redshift Spectrum: scale-out serverless compute (nodes 1, 2, 3, 4, ..., N). Amazon S3: exabyte-scale object storage. AWS Glue Data Catalog. SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY …
  • 27. Data Lake on Amazon S3 with AWS Glue On premises data Web app data Amazon RDS Other databases Streaming data Your data AMAZON QUICKSIGHT AWS GLUE ETL
  • 33. Additional Resources. Kinesis Data Generator (KDG): github.com/awslabs/amazon-kinesis-data-generator. Serverless Data Pipeline powered by AWS SAM: github.com/alexcasalboni/serverless-data-pipeline-sam. AWS Big Data Blog: aws.amazon.com/blogs/big-data.
  • 34. AMAZON CONFIDENTIAL Did We Scan Your Badge? Remember to opt-in to AWS communications and you will receive a post-event email with a link to: • AWS Developer Workshop Slides • $200 in AWS Credits
  • 35. Thank you! Alex Casalboni, Technical Evangelist, AWS (@alex_casalboni)