© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Building data lakes for analytics
on AWS
Gautam Srinivasan
AWS solutions architect
ADB201
Agenda
Trends in analytics
The AWS analytics portfolio
Services & customer results
More than 125 million players
Data provides a constant feedback loop for
game designers
Up-to-the-minute analysis of gamer
satisfaction to drive gamer engagement
Resulting in the most popular game played
in the world
Fortnite
Customers want more value from their data
Growing
exponentially
From new
sources
Increasingly
diverse
Used by
many people
Analyzed by
many applications
Companies want more value from their data
Complications
Siloed approaches don’t work anymore
It’s too expensive and limiting
to store data on-premises
Implication
A new approach is needed to
extract insights and value
Cloud data lakes are the future
Data Lake
Customers want:
To move to a single store; i.e., a data lake in the cloud
To store data securely in standard formats
To grow to any scale, with low costs
To analyze their data in a variety of ways
To democratize data access and analysis
Why choose AWS for data lakes and analytics?
Most
comprehensive
Most
secure
Easiest
to build
Most
cost effective
Most
customers
& partners
Most comprehensive
Broadest and deepest portfolio, purpose-built for builders
Visualization & Machine Learning: Dashboards | Predictive Analytics
Analytics: Data Warehousing | Big Data Processing | Interactive Query | Operational Analytics | Real-time Analytics | Serverless Data Processing
Data Lake Infrastructure & Management: Infrastructure | Data Catalog & ETL | Security & Management
Data Movement: Migration & Streaming Services
Most comprehensive
Broadest and deepest portfolio, purpose-built for builders
+ 10 more
Visualization & Machine Learning: QuickSight | SageMaker | Comprehend | Lex | Polly | Rekognition | Translate | Transcribe | Deep Learning AMIs
Analytics: Redshift | EMR (Spark & Hadoop) | Athena | Elasticsearch Service | Kinesis Data Analytics | AWS Glue (Spark & Python)
Data Lake Infrastructure & Management: S3/Glacier | AWS Glue | Lake Formation
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
Most secure
Services for security and governance
Compliance
AWS Artifact
Amazon Inspector
AWS CloudHSM
Amazon Cognito
AWS CloudTrail
Security
Amazon GuardDuty
AWS Shield
AWS WAF
Amazon Macie
Amazon Virtual Private Cloud
(Amazon VPC)
Encryption
AWS Certificate Manager
AWS Key Management Service
Encryption at rest
Encryption in transit
Bring your own keys, CloudHSM
support
Identity
AWS IAM
AWS SSO
Amazon Cloud Directory
AWS Directory Service
AWS Organizations
Customers need to have multiple levels of security, identity and access management, encryption,
and compliance to secure their data lake
Most secure – Certifications
Global
CSA: Cloud Security Alliance Controls
ISO 9001: Global Quality Standard
ISO 27001: Security Management Controls
ISO 27017: Cloud Specific Controls
ISO 27018: Personal Data Protection
PCI DSS Level 1: Payment Card Standards
SOC 1: Audit Controls Report
SOC 2: Security, Availability, & Confidentiality Report
SOC 3: General Controls Report
United States
CJIS: Criminal Justice Information Services
DoD SRG: DoD Data Processing
FedRAMP: Government Data Standards
FERPA: Educational Privacy Act
FIPS: Government Security Standards
FISMA: Federal Information Security Management
GxP: Quality Guidelines and Regulations
FFIEC: Financial Institutions Regulation
HIPAA: Protected Health Information
ITAR: International Arms Regulations
MPAA: Protected Media Content
NIST: National Institute of Standards and Technology
SEC Rule 17a-4(f): Financial Data Standards
VPAT/Section 508: Accountability Standards
Asia Pacific
FISC [Japan]: Financial Industry Information Systems
IRAP [Australia]: Australian Security Standards
K-ISMS [Korea]: Korean Information Security
MTCS Tier 3 [Singapore]: Multi-Tier Cloud Security Standard
My Number Act [Japan]: Personal Information Protection
Europe
C5 [Germany]: Operational Security Attestation
Cyber Essentials Plus [UK]: Cyber Threat Protection
G-Cloud [UK]: UK Government Standards
IT-Grundschutz [Germany]: Baseline Protection Methodology
Most cost effective
Decouple compute and storage, choice of pay-as-you-go (PAYG) analytics services
Storage
Amazon S3 tiers &
intelligent tiering
From $0.023 per
GB/mo to as low as
$0.004 per GB/mo
Compute
Spot & reserved
instances
Save up to 90% off
on-demand prices
EMR
Autoscaling
57% less than
on-premises
per IDC report
Redshift
less than a tenth
of the cost of
traditional solutions.
Athena &
QuickSight
Serverless, pay
only for what is used
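The storage figures above lend themselves to a quick back-of-the-envelope calculation. The following is a minimal sketch, assuming only the two per-GB prices quoted on the slide; the function name, the 100 TB data volume, and the archive fractions are hypothetical, and the sketch ignores request, retrieval, and transfer charges.

```python
# Per-GB monthly prices quoted on the slide (USD). Real prices vary by
# region and change over time, so treat these as illustrative constants.
STANDARD_PER_GB_MONTH = 0.023
ARCHIVE_PER_GB_MONTH = 0.004


def monthly_storage_cost(total_gb: float, archive_fraction: float) -> float:
    """Monthly storage cost when archive_fraction of the data sits in the cheapest tier."""
    if not 0.0 <= archive_fraction <= 1.0:
        raise ValueError("archive_fraction must be between 0 and 1")
    archived_gb = total_gb * archive_fraction
    hot_gb = total_gb - archived_gb
    return hot_gb * STANDARD_PER_GB_MONTH + archived_gb * ARCHIVE_PER_GB_MONTH


if __name__ == "__main__":
    total_gb = 100_000  # hypothetical 100 TB data lake (decimal units)
    for fraction in (0.0, 0.5, 0.8):
        cost = monthly_storage_cost(total_gb, fraction)
        print(f"{fraction:.0%} archived: ${cost:,.2f} per month")
```

Under these assumptions, moving a larger share of rarely read data to the cheapest tier lowers the bill roughly in proportion to the share moved.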
More data lakes and analytics than anywhere else
More than 10,000 data lakes on AWS
Most partners to complement AWS offerings
Data movement solutions
Migration & Streaming Services
Data Movement
Most ways to move data to the data lake
Data movement from
on-premises datacenters
Dedicated network connection
Secure appliances
Ruggedized shipping containers
Database migration
Gateway that lets applications write to the cloud
Data movement from real-time sources
Connect devices to AWS
Real-time data streams
Real-time video streams
Data movement from
real-time sources
Data movement from your
on-premises datacenters
Amazon S3
Amazon S3 Glacier
AWS Glue
Synchronizing data
across environments
Professional services and partners
to help migration
Data
movement
Data lake infrastructure
& management solutions
Infrastructure Data Catalog
& ETL
Security &
Management
Data lake infrastructure & management
Robust data lake infrastructure
Amazon S3 | Lake Formation & Glue
Snowball | Snowmobile | Kinesis Data Streams | Kinesis Data Firehose
Amazon Redshift | Amazon EMR | Athena | Kinesis | Amazon Elasticsearch Service
Amazon SageMaker | Amazon Comprehend | Amazon Rekognition
Durable and available; exabyte scale
Secure, compliant, auditable
Object-level controls for fine-grain access
Fast performance by retrieving subsets of data
Decoupling of compute and storage
On-demand resources, tiering, cost choices
Data lake infrastructure
& management
“Zestimates are more up-to-date
and accurate, because they’re
built with the absolute latest
data. That’s a huge benefit for
our users, who depend on this
information to influence their
buying or selling decisions.”
— Jasjeet Thind, vice president of data science and
engineering, Zillow Group
Data lake infrastructure
& management
Set up a catalog, ETL, and data prep
with AWS Glue
Serverless provisioning, configuration, and
scaling to run your ETL jobs on Apache Spark
Pay only for the resources used for jobs
Crawl your data sources, identify data
formats and suggest schemas and
transformations
Automates the effort in building, maintaining
and running ETL jobs
Data lake infrastructure
& management
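To make the idea of crawling a source and suggesting a schema concrete, here is a toy sketch in plain Python. It is not how AWS Glue crawlers are implemented; the function names, the sample CSV, and the three-type vocabulary (`bigint`, `double`, `string`) are assumptions chosen for illustration only.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of 'bigint', 'double', 'string' that fits every non-empty value."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "string"
    # Try the narrowest type first and widen only when a value fails to parse.
    for type_name, caster in (("bigint", int), ("double", float)):
        try:
            for value in non_empty:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


def suggest_schema(csv_text):
    """Read CSV text with a header row and suggest one type per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {column: infer_type([row[column] for row in rows]) for column in columns}


if __name__ == "__main__":
    sample = "player_id,score,region\n1,9.5,eu\n2,7,us\n"  # hypothetical data
    print(suggest_schema(sample))
```

A managed crawler handles many more formats, partitions, and nested types; the sketch only shows the basic widen-on-failure approach to type suggestion.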
“Beeswax uses Amazon S3 and AWS
Glue Data Catalog to build a highly
reliable data lake that is fully
managed by AWS. Our platform
leverages the AWS Glue Data Catalog
integration with Amazon EMR in
Hive and SparkSQL applications to
deliver reporting and optimization
features to our customers.”
— Ram Kumar Rengaswamy, CTO, Beeswax
Data lake infrastructure
& management
Challenges to making a secure data lake
Typical steps in building a data lake
1. Set up storage
2. Move data
3. Cleanse, prep, and catalog data
4. Configure and enforce security and compliance policies
5. Make data available for analytics
Data lake infrastructure
& management
Build a secure data lake in days
with AWS Lake Formation
Move, store, catalog, and clean your data faster with machine learning
Enforce security policies across multiple services
Empower analysts and data scientists to gain and manage new insights
Data lake infrastructure
& management
Data lake infrastructure
& management
“With an enterprise-ready option
like Lake Formation, we will be
able to spend more time deriving
value from our data rather than
doing the heavy lifting involved in
manually setting up and managing
our data lake.”
— Joshua Couch, VP engineering at Fender Digital
Analytics solutions
Data
Warehousing
Big Data
Processing
Interactive
Query
Operational
Analytics
Real time
Analytics
Serverless
Data Processing
Big data processing with Apache Spark & Hadoop
with Amazon EMR
Easy to use notebooks
Low cost versus on-premises
Elastic autoscaling
Reliable 99.9% SLA
Secure with encryption and keys
Flexible, open-source choice
Analytics
Enterprise grade Easy Lowest cost
Analytics
FINRA’s legacy system did not scale
to handle 75 billion events per day. It
needed to run complex surveillance
queries on more than 20 PB of data
FINRA moved its big data appliance
to an Amazon S3 data lake and uses
Amazon EMR for ingestion and
processing
The Forrester Wave
Cloud Hadoop/Spark Platforms
Q1 2019
“The 11 Providers That Matter Most
and How They Stack Up”
by Noel Yuhanna and Mike Gualtieri
February 13, 2019
The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are
trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of
Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores,
weightings, and comments. Forrester does not endorse any vendor, product, or service
depicted in the Forrester Wave. Information is based on best available resources. Opinions
reflect judgment at the time and are subject to change.
Data warehouse for business reporting
with Amazon Redshift
Fast: up to 10x faster than
traditional data warehouses
Easy to set up, deploy, and
manage
Cost effective
Scale on demand for large data
volume and high query
concurrency
Query data in open formats
directly from the data lake
Analytics
Analytics
“Twenty percent of our queries
now complete in less than one
second. Best of all, we didn’t have
to change anything to get this
speed-up with Redshift, which
supports our mission-critical
workloads.”
— Greg Rokita, executive director of technology, Edmunds
Real-time analytics for timely insights
with Amazon Kinesis
Make streaming data available to
multiple real-time analytics
applications
Run streaming applications without
managing any infrastructure
Durable to reduce the probability
of data loss
Scalable to process data from hundreds
of thousands of sources with low
latencies
Analytics
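One reason a stream can absorb records from many sources is that each record's partition key is hashed to pick a shard. The sketch below illustrates that idea, assuming an MD5 hash of the key mapped onto a 128-bit range split evenly across shards; the function name and the device IDs are hypothetical, and real shard ranges depend on how the stream was created and resharded.

```python
import hashlib
from collections import Counter


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index by hashing it into a 128-bit key space."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # integer in [0, 2**128)
    range_size = 2**128 // shard_count          # assumed equal-sized hash key ranges
    # The final shard absorbs any remainder left over by integer division.
    return min(hash_value // range_size, shard_count - 1)


if __name__ == "__main__":
    # Hypothetical device IDs standing in for many independent sources.
    keys = [f"device-{n}" for n in range(10_000)]
    counts = Counter(shard_for_key(key, 4) for key in keys)
    print(dict(sorted(counts.items())))
```

Because the hash spreads keys roughly uniformly, a large population of distinct sources tends to load the shards evenly, while a single very busy key always lands on one shard.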
Analytics
“Amazon Kinesis makes it simple to
scale our solution end to end,
including the capture, processing,
and delivery of actionable insights.
This empowers our customers to
better understand their user base.”
—Indu Narayan, director of data, Yieldmo
Operational analytics for logs and search
with Amazon Elasticsearch Service
Fully managed; deploy
production-ready cluster
in minutes
Direct access to Elasticsearch
open-source APIs, Logstash
and Kibana
Amazon VPC support; at-rest
and in-transit encryption
Easily scale up and down
Analytics
Interactive analysis
with Amazon Athena
Interactive query service to analyze data in
Amazon S3 using standard SQL
No infrastructure to set up or manage, and no
data to load
Ability to run SQL queries on data archived in
Amazon Glacier (coming soon)
Analytics
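As a rough illustration of querying data in Amazon S3 with standard SQL, the sketch below submits a query through the Athena API using the boto3 SDK. The database name, table name, and results bucket are hypothetical, and running `run_query` assumes boto3 is installed and AWS credentials are configured; the request builder itself has no dependencies.

```python
def build_athena_request(sql, database, output_location):
    """Assemble the keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql, database, output_location):
    """Submit the query and return its execution ID (needs boto3 and AWS credentials)."""
    import boto3  # third-party AWS SDK, imported lazily so the builder works without it

    client = boto3.client("athena")
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical database, table, and bucket names.
    request = build_athena_request(
        "SELECT region, COUNT(*) AS sessions FROM events GROUP BY region",
        "game_analytics",
        "s3://example-bucket/athena-results/",
    )
    print(request)
```

The query runs asynchronously: the returned execution ID is what a caller would poll for status before reading results from the output location.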
Analytics
“One of the big attractions of Amazon Athena
is that it’s serverless and purely
consumption based.”
“We only pay when we’re actually querying
the data, and we don’t have to keep a
cluster running all the time. Using Amazon
Athena, we’re able to query seven years’
worth of data—adding up to hundreds of
terabytes—get results at least 50% faster,
and save nearly $15,000 per month.”
— Matt Chesler, director of DevOps at Movable Ink
Serverless analytics
Deliver on-demand analytics on the data lake
Amazon S3
Data lake
AWS Glue
(ETL &
Data Catalog)
Athena
QuickSight
Serverless; zero infrastructure,
zero administration
Never pay for
idle resources
$
Availability and fault
tolerance
built in
Automatically scales
resources with usage
AWS IoT
AI/ML
Devices Web Sensors Social
Analytics
Visualization & machine learning
solutions
Dashboards Predictive Analytics
Visualization & Machine Learning
Visual insights for everyone
with Amazon QuickSight
Pay only for what you use
Scale to tens of thousands of users
Embedded analytics
Build end-to-end BI solutions
Visualization &
machine learning
Visual insights for everyone
with AWS ML & AI services
Frameworks and interfaces for
machine learning practitioners
Platform services that make it easy
for any developer to get started and
get deep with ML
Application services that enable
developers to plug in pre-built
AI functionality into their apps
Visualization &
machine learning
Raw data in Amazon S3
Initial training data is annotated by human labelers
Active learning model is trained from human-labeled data
Ambiguous data is sent to human labelers for annotation
Human-labeled data is then sent back to retrain and improve the machine learning model
Training data the model understands is labeled automatically
An accurate training data set is ready for use in Amazon SageMaker
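The labeling loop described above can be sketched as a simple routing rule: items the model is confident about are labeled automatically, and the rest go to human labelers. This is a minimal illustration, not the managed service's implementation; the function name, the 0.9 confidence threshold, and the sample predictions are all assumptions.

```python
def route_items(predictions, threshold=0.9):
    """Split model predictions into auto-labeled items and items needing human review.

    predictions: iterable of (item_id, predicted_label, confidence) tuples.
    """
    auto_labeled, needs_human = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            # The model "understands" this item: accept its label as-is.
            auto_labeled.append((item_id, label))
        else:
            # Ambiguous item: queue it for annotation by a human labeler.
            needs_human.append(item_id)
    return auto_labeled, needs_human


if __name__ == "__main__":
    # Hypothetical model output for three images.
    predictions = [
        ("img-001", "cat", 0.98),
        ("img-002", "dog", 0.55),
        ("img-003", "cat", 0.91),
    ]
    auto, human = route_items(predictions)
    print("auto-labeled:", auto)
    print("sent to humans:", human)
```

In a full loop, the human labels for the ambiguous items would be added to the training set and the model retrained before the next batch is routed.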
Visualization &
machine learning
Using Amazon Translate,
Lionbridge is able to scale
machine translation in order to
localize content faster and in
more languages.
Using Amazon Translate,
Lionbridge was able to reduce
translation costs by 20%.
Next steps…
Dive deeper into
specific AWS services
Set up a proof-of-concept
Talk about how professional
services can help
Sign up for an
AWS account
Instantly get access
to the AWS Free Tier
Learn with
10-minute tutorials
Explore and learn with
simple tutorials
Start building
with AWS
Begin building with a step-by-step
guide to help you launch
your AWS project.
Thank you!
Gautam Srinivasan
AWS solutions architect

More Related Content

What's hot

Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...AWS Summits
 
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAsk me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAmazon Web Services
 
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...Amazon Web Services
 
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Amazon Web Services
 
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...Grid computing in the cloud for Financial Services industry - CMP205-I - New ...
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...Amazon Web Services
 
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdf
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdfPerforming real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdf
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdfAmazon Web Services
 
Database su AWS scegliere lo strumento giusto per il giusto obiettivo
Database su AWS scegliere lo strumento giusto per il giusto obiettivoDatabase su AWS scegliere lo strumento giusto per il giusto obiettivo
Database su AWS scegliere lo strumento giusto per il giusto obiettivoAmazon Web Services
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Amazon Web Services
 
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAdd intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAmazon Web Services
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...Amazon Web Services
 
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...Boaz Ziniman
 
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0Amazon Web Services
 
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Amazon Web Services
 
Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...Amazon Web Services
 
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Amazon Web Services
 
Deriving Value with Next Gen Analytics and ML Architectures
Deriving Value with Next Gen Analytics and ML ArchitecturesDeriving Value with Next Gen Analytics and ML Architectures
Deriving Value with Next Gen Analytics and ML ArchitecturesAmazon Web Services
 
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲Amazon Web Services
 
Accelerating-ML-Adoption-with-Our-New-AI-Services
Accelerating-ML-Adoption-with-Our-New-AI-ServicesAccelerating-ML-Adoption-with-Our-New-AI-Services
Accelerating-ML-Adoption-with-Our-New-AI-ServicesAmazon Web Services
 
Threat Detection using artificial intelligence
Threat Detection using artificial intelligenceThreat Detection using artificial intelligence
Threat Detection using artificial intelligenceAmazon Web Services
 

What's hot (20)

Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
 
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAsk me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
 
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...
Migrating on-premises Apache Spark and Hive to Amazon EMR - ADB304 - New York...
 
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
 
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...Grid computing in the cloud for Financial Services industry - CMP205-I - New ...
Grid computing in the cloud for Financial Services industry - CMP205-I - New ...
 
HK-AWS-Quick-Start-Workshop
HK-AWS-Quick-Start-WorkshopHK-AWS-Quick-Start-Workshop
HK-AWS-Quick-Start-Workshop
 
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdf
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdfPerforming real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdf
Performing real-time ETL into data lakes - ADB202 - Santa Clara AWS Summit.pdf
 
Database su AWS scegliere lo strumento giusto per il giusto obiettivo
Database su AWS scegliere lo strumento giusto per il giusto obiettivoDatabase su AWS scegliere lo strumento giusto per il giusto obiettivo
Database su AWS scegliere lo strumento giusto per il giusto obiettivo
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
 
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAdd intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
 
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...
SKL208 - Turbocharge your Business with AI and Machine Learning - Tel Aviv Su...
 
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0
Move users to AWS with Amazon WorkSpaces and Amazon AppStream 2-0
 
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
 
Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...
 
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
 
Deriving Value with Next Gen Analytics and ML Architectures
Deriving Value with Next Gen Analytics and ML ArchitecturesDeriving Value with Next Gen Analytics and ML Architectures
Deriving Value with Next Gen Analytics and ML Architectures
 
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲
進化中的遊戲產業-以微服務架構-全球布局與現代化資料庫策略來打造高成長遊戲
 
Accelerating-ML-Adoption-with-Our-New-AI-Services
Accelerating-ML-Adoption-with-Our-New-AI-ServicesAccelerating-ML-Adoption-with-Our-New-AI-Services
Accelerating-ML-Adoption-with-Our-New-AI-Services
 
Threat Detection using artificial intelligence
Threat Detection using artificial intelligenceThreat Detection using artificial intelligence
Threat Detection using artificial intelligence
 

Similar to Building Data Lakes for Analytics on AWS

AWS Summit Singapore 2019 | Big Data Analytics Architectural Patterns and Bes...
AWS Summit Singapore 2019 | Big Data Analytics Architectural Patterns and Bes...AWS Summit Singapore 2019 | Big Data Analytics Architectural Patterns and Bes...
AWS Summit Singapore 2019 | Big Data Analytics Architectural Patterns and Bes...AWS Summits
 
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019Amazon Web Services
 
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019
AWS Analytics Services - When to use what? | AWS Summit Tel Aviv 2019AWS Summits
 
Building Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWSBuilding Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWSAmazon Web Services
 
Immersion Day - Como a AWS apoia a estratégia analítica de sua empresa
Building Data Lakes for Analytics on AWS

  • 1. Building data lakes for analytics on AWS. Gautam Srinivasan, AWS solutions architect. SUMMIT, ADB201. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. Agenda: Trends in analytics; The AWS analytics portfolio; Services & customer results.
  • 3. Fortnite: More than 125 million players. Data provides a constant feedback loop for game designers. Up-to-the-minute analysis of gamer satisfaction to drive gamer engagement. Resulting in the most popular game played in the world.
  • 4. Customers want more value from their data: Growing exponentially; From new sources; Increasingly diverse; Used by many people; Analyzed by many applications.
  • 5. Companies want more value from their data. Complications: Siloed approaches don’t work anymore; it’s too expensive and limiting to store data on-premises. Implication: A new approach is needed to extract insights and value.
  • 6. Cloud data lakes are the future. [Diagram: Data Lake] Customers want: To move to a single store, i.e., a data lake in the cloud; To store data securely in standard formats; To grow to any scale, with low costs; To analyze their data in a variety of ways; To democratize data access and analysis.
  • 7. Why choose AWS for data lakes and analytics? Most comprehensive; Most secure; Easiest to build; Most cost effective; Most customers & partners.
  • 8. Most comprehensive: Broadest and deepest portfolio, purpose-built for builders. Data Movement: Migration & Streaming Services. Data Lake Infrastructure & Management: Infrastructure; Data Catalog & ETL; Security & Management. Analytics: Data Warehousing; Big Data Processing; Interactive Query; Operational Analytics; Real time Analytics; Serverless Data Processing. Visualization & Machine Learning: Dashboards; Predictive Analytics.
  • 9. Most comprehensive: Broadest and deepest portfolio, purpose-built for builders. Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka. Data Lake Infrastructure & Management: S3/Glacier; AWS Glue; Lake Formation. Analytics: Redshift; EMR (Spark & Hadoop); Athena; Elasticsearch Service; Kinesis Data Analytics; AWS Glue (Spark & Python). Visualization & Machine Learning: QuickSight; SageMaker; Comprehend; Lex; Polly; Rekognition; Translate; Transcribe; Deep Learning AMIs. + 10 more.
  • 10. Most secure: Services for security and governance. Compliance: AWS Artifact; Amazon Inspector; AWS CloudHSM; Amazon Cognito; AWS CloudTrail. Security: Amazon GuardDuty; AWS Shield; AWS WAF; Amazon Macie; Amazon Virtual Private Cloud (Amazon VPC). Encryption: AWS Certificate Manager; AWS Key Management Service; Encryption at rest; Encryption in transit; Bring your own keys, CloudHSM support. Identity: AWS IAM; AWS SSO; Amazon Cloud Directory; AWS Directory Service; AWS Organizations. Customers need to have multiple levels of security, identity and access management, encryption, and compliance to secure their data lake.
  • 11. Most secure: Certifications. Global: CSA (Cloud Security Alliance Controls); ISO 9001 (Global Quality Standard); ISO 27001 (Security Management Controls); ISO 27017 (Cloud Specific Controls); ISO 27018 (Personal Data Protection); PCI DSS Level 1 (Payment Card Standards); SOC 1 (Audit Controls Report); SOC 2 (Security, Availability, & Confidentiality Report); SOC 3 (General Controls Report). United States: CJIS (Criminal Justice Information Services); DoD SRG (DoD Data Processing); FedRAMP (Government Data Standards); FERPA (Educational Privacy Act); FIPS (Government Security Standards); FISMA (Federal Information Security Management); GxP (Quality Guidelines and Regulations); FFIEC (Financial Institutions Regulation); HIPAA (Protected Health Information); ITAR (International Arms Regulations); MPAA (Protected Media Content); NIST (National Institute of Standards and Technology); SEC Rule 17a-4(f) (Financial Data Standards); VPAT/Section 508 (Accountability Standards). Asia Pacific: FISC [Japan] (Financial Industry Information Systems); IRAP [Australia] (Australian Security Standards); K-ISMS [Korea] (Korean Information Security); MTCS Tier 3 [Singapore] (Multi-Tier Cloud Security Standard); My Number Act [Japan] (Personal Information Protection). Europe: C5 [Germany] (Operational Security Attestation); Cyber Essentials Plus [UK] (Cyber Threat Protection); G-Cloud [UK] (UK Government Standards); IT-Grundschutz [Germany] (Baseline Protection Methodology).
  • 12. Most cost effective: Decouple compute and storage, choice of PAYG analytics services. Storage: Amazon S3 tiers & intelligent tiering, from $0.023 per GB/mo to as low as $0.004 per GB/mo. Compute: Spot & reserved instances, save up to 90% off on-demand prices. EMR: Autoscaling, 57% less than on-premises per IDC report. Redshift: less than a tenth of the cost of traditional solutions. Athena & QuickSight: Serverless, pay only for what is used.
  • 13. More data lakes and analytics than anywhere else: More than 10,000 data lakes on AWS.
  • 14. Most partners to complement AWS offerings.
  • 15. Data movement solutions. Data Movement: Migration & Streaming Services.
  • 16. Most ways to move data to the data lake. Data movement from on-premises datacenters: Dedicated network connection; Secure appliances; Ruggedized shipping containers; Database migration; Gateway that lets applications write to the cloud. Data movement from real-time sources: Connect devices to AWS; Real-time data streams; Real-time video streams. Amazon S3; Amazon S3 Glacier; AWS Glue. Synchronizing data across environments. Professional services and partners to help migration.
  • 17. Data lake infrastructure & management solutions: Infrastructure; Data Catalog & ETL; Security & Management.
  • 18. Robust data lake infrastructure. Diagram labels: Amazon S3; Lake Formation & Glue; Snowball; Kinesis Data Streams; Snowmobile; Kinesis Data Firehose; Amazon Redshift; Amazon EMR; Athena; Kinesis; Amazon Elasticsearch Service; Amazon SageMaker; Amazon Comprehend; Amazon Rekognition. Durable and available; exabyte scale. Secure, compliant, auditable. Object-level controls for fine-grain access. Fast performance by retrieving subsets of data. Decoupling of compute and storage. On-demand resources, tiering, cost choices.
  • 19. “Zestimates are more up-to-date and accurate, because they’re built with the absolute latest data. That’s a huge benefit for our users, who depend on this information to influence their buying or selling decisions.” (Jasjeet Thind, vice president of data science and engineering, Zillow Group)
  • 20. Set up a catalog, ETL, and data prep with AWS Glue. Serverless provisioning, configuration, and scaling to run your ETL jobs on Apache Spark. Pay only for the resources used for jobs. Crawl your data sources, identify data formats, and suggest schemas and transformations. Automates the effort in building, maintaining, and running ETL jobs.
  • 21. “Beeswax uses Amazon S3 and AWS Glue Data Catalog to build a highly reliable data lake that is fully managed by AWS. Our platform leverages the AWS Glue Data Catalog integration with Amazon EMR in Hive and SparkSQL applications to deliver reporting and optimization features to our customers.” (Ram Kumar Rengaswamy, CTO, Beeswax)
  • 22. Challenges to making a secure data lake. Typical steps in building a data lake: (1) Set up storage; (2) Move data; (3) Cleanse, prep, and catalog data; (4) Configure and enforce security and compliance policies; (5) Make data available for analytics.
  • 23. Build a secure data lake in days with AWS Lake Formation. Move, store, catalog, and clean your data faster with machine learning. Enforce security policies across multiple services. Empower analysts and data scientists to gain and manage new insights.
  • 24. “With an enterprise-ready option like Lake Formation, we will be able to spend more time deriving value from our data rather than doing the heavy lifting involved in manually setting up and managing our data lake.” (Joshua Couch, VP engineering at Fender Digital)
  • 25. Analytics solutions: Data Warehousing; Big Data Processing; Interactive Query; Operational Analytics; Real time Analytics; Serverless Data Processing.
  • 26. Big data processing with Apache Spark & Hadoop with Amazon EMR. Easy to use notebooks. Low cost versus on-premises. Elastic autoscaling. Reliable 99.9% SLA. Secure with encryption and keys. Flexible, open-source choice. Category labels: Enterprise grade; Easy; Lowest cost.
  • 27. FINRA’s legacy system did not scale to handle 75 billion events per day. It needed to run complex surveillance queries on more than 20 PB of data. FINRA moved its big data appliance to an Amazon S3 data lake and uses Amazon EMR for ingestion and processing.
  • 28. The Forrester Wave: Cloud Hadoop/Spark Platforms, Q1 2019. “The 11 Providers That Matter Most and How They Stack Up” by Noel Yuhanna and Mike Gualtieri, February 13, 2019. The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
  • 29. Data warehouse for business reporting with Amazon Redshift. Fast: up to 10x faster than traditional data warehouses. Easy to set up, deploy, and manage. Cost effective. Scale on demand for large data volume and high query concurrency. Query data in open formats directly from the data lake.
  • 30. “Twenty percent of our queries now complete in less than one second. Best of all, we didn’t have to change anything to get this speed-up with Redshift, which supports our mission-critical workloads.” (Greg Rokita, executive director of technology, Edmunds)
  • 31. Real-time analytics for timely insights with Amazon Kinesis. Make streaming data available to multiple real-time analytics applications. Run streaming applications without managing any infrastructure. Durable to reduce the probability of data loss. Scalable to process data from hundreds of thousands of sources with low latencies.
  • 32. “Amazon Kinesis makes it simple to scale our solution end to end, including the capture, processing, and delivery of actionable insights. This empowers our customers to better understand their user base.” (Indu Narayan, director of data, Yieldmo)
  • 33. Operational analytics for logs and search with Amazon Elasticsearch Service. Fully managed; deploy production-ready cluster in minutes. Direct access to Elasticsearch open-source APIs, Logstash and Kibana. Amazon VPC support; at-rest and in-transit encryption. Easily scale up and down.
  • 34. Interactive analysis with Amazon Athena. Interactive query service to analyze data in Amazon S3 using standard SQL. No infrastructure to set up or manage, and no data to load. Ability to run SQL queries on data archived in Amazon Glacier (coming soon).
  • 35. “One of the big attractions of Amazon Athena is that it’s serverless and purely consumption based.” “We only pay when we’re actually querying the data, and we don’t have to keep a cluster running all the time. Using Amazon Athena, we’re able to query seven years’ worth of data—adding up to hundreds of terabytes—get results at least 50% faster, and save nearly $15,000 per month.” (Matt Chesler, director of DevOps at Movable Ink)
  • 36. Serverless analytics: Deliver on-demand analytics on the data lake. Diagram labels: Amazon S3 data lake; AWS Glue (ETL & Data Catalog); Athena; QuickSight; AWS IoT; AI/ML; Devices; Web; Sensors; Social. Serverless; zero infrastructure, zero administration. Never pay for idle resources. Availability and fault tolerance built in. Automatically scales resources with usage.
  • 37. Visualization & machine learning solutions: Dashboards; Predictive Analytics.
  • 38. Visual insights for everyone with Amazon QuickSight. Pay only for what you use. Scale to tens of thousands of users. Embedded analytics. Build end-to-end BI solutions.
  • 39. Visual insights for everyone with AWS ML & AI services. Frameworks and interfaces for machine learning practitioners. Platform services that make it easy for any developer to get started and get deep with ML. Application services that enable developers to plug in pre-built AI functionality into their apps. Labeling workflow: raw data in Amazon S3; initial training data is annotated by human labelers; active learning model is trained from human labeled data; ambiguous data is sent to human labelers for annotation; human labeled data is then sent back to retrain and improve the machine learning model; training data the model understands is labeled automatically; an accurate training data set is ready for use in Amazon SageMaker.
  • 40. Using Amazon Translate, Lionbridge is able to scale machine translation in order to localize content faster and in more languages. Using Amazon Translate, Lionbridge was able to reduce translation costs by 20%.
  • 41. Next steps… Dive deeper into specific AWS services. Set up a proof-of-concept. Talk about how professional services can help. Sign up for an AWS account: Instantly get access to the AWS Free Tier. Learn with 10-minute tutorials: Explore and learn with simple tutorials. Start building with AWS: Begin building with step-by-step guide to help you launch your AWS project.
  • 42. Thank you! Gautam Srinivasan, AWS solutions architect.
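
To make the storage figures on slide 12 concrete, here is a minimal arithmetic sketch. It assumes the two quoted prices ($0.023 and $0.004 per GB per month) apply flat to the whole volume. Actual S3 pricing depends on region, storage class, and request and retrieval charges, and the function name and the 100 TB volume are illustrative assumptions, not part of the deck.

```python
from decimal import Decimal

# Per-GB monthly prices quoted on slide 12 (2019 figures; assumed flat here).
STANDARD_PER_GB = Decimal("0.023")
LOWEST_TIER_PER_GB = Decimal("0.004")


def monthly_storage_cost(gigabytes: int, price_per_gb: Decimal) -> Decimal:
    """Return the monthly storage cost in USD for a flat per-GB price."""
    return Decimal(gigabytes) * price_per_gb


if __name__ == "__main__":
    volume_gb = 100 * 1024  # hypothetical 100 TB data lake
    hot = monthly_storage_cost(volume_gb, STANDARD_PER_GB)
    cold = monthly_storage_cost(volume_gb, LOWEST_TIER_PER_GB)
    print(f"standard tier: ${hot}/month")
    print(f"lowest tier:   ${cold}/month")
    print(f"reduction:     {(1 - cold / hot):.0%}")
```

Using `Decimal` rather than floats keeps the currency arithmetic exact, which matters once the per-GB price is multiplied by large volumes.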
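
Slide 34 describes Athena as running standard SQL directly against data in Amazon S3. The sketch below shows roughly what submitting such a query could look like from Python using boto3's `start_query_execution` call. The table name `access_logs`, the database `example_lake_db`, and the results bucket are hypothetical placeholders, and `run_query` needs boto3 plus valid AWS credentials, so only the request builder runs offline.

```python
def build_athena_request(sql: str, database: str, output_s3_uri: str) -> dict:
    """Build keyword arguments for boto3's Athena start_query_execution call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(request: dict) -> str:
    """Submit the query and return its execution ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    request = build_athena_request(
        sql="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
        database="example_lake_db",
        output_s3_uri="s3://example-results-bucket/athena/",
    )
    print(request)
```

Separating the request builder from the network call keeps the query parameters easy to inspect and test before anything is billed.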
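
The labeling workflow on slide 39 routes data the model understands to automatic labeling and sends ambiguous data to human labelers. One common way to express that routing is a confidence threshold, sketched below. This is an illustration under stated assumptions, not the method AWS uses: the 0.9 threshold, the `route_items` function, and the `toy_predict` stand-in model are all hypothetical.

```python
from typing import Callable, List, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])


def route_items(
    items: List[str],
    predict: Callable[[str], Prediction],
    threshold: float = 0.9,  # assumed cut-off; tune per workload
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split items into auto-labeled pairs and items needing human review."""
    auto_labeled: List[Tuple[str, str]] = []
    needs_human: List[str] = []
    for item in items:
        label, confidence = predict(item)
        if confidence >= threshold:
            auto_labeled.append((item, label))
        else:
            needs_human.append(item)
    return auto_labeled, needs_human


def toy_predict(item: str) -> Prediction:
    """Stand-in model: confident only about items whose name contains 'cat'."""
    return ("cat", 0.95) if "cat" in item else ("unknown", 0.40)


if __name__ == "__main__":
    auto, manual = route_items(["cat_01.jpg", "blurry_07.jpg"], toy_predict)
    print("auto-labeled:", auto)
    print("sent to human labelers:", manual)
```

In a full loop, the human-labeled items would be added to the training set and the model retrained before the next routing pass, as the slide describes.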