© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Technology Trends: Data Lakes and Analytics
Anurag Gupta
VP, Analytics & Amazon RDS
AWS
ANT205
Data is a strategic asset for every organization
The world’s most valuable
resource is
*Copyright: The Economist, 2017, David Parkins
The move toward data-centric companies
Five largest companies by market cap*: 2001, 2006, 2011, 2016, 2018
[Chart values as extracted; company labels and year alignment not recoverable: $1.091T, $406B, $446B, $406B, $582B, $976B, $365B, $383B, $556B, $383B, $877B, $272B, $327B, $277B, $452B, $839B, $261B, $293B, $237B, $364B, $523B, $260B, $273B, $228B, $228B]
What is a data-centric company?
What do we sell?
How do we make money?
Thinking about data as an asset, not a cost
Stop throwing data away
Make it available to more users
Arm users with more data processing technologies
There is more data than people think
Data grows >10x every 5 years
Data platforms need to live for 15 years
Data platforms need to scale 1,000x
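The figures on this slide are consistent with simple compounding: 10x every 5 years, sustained over a 15-year platform lifetime, gives 10^3 = 1,000x. A minimal sketch of that arithmetic, assuming growth of exactly 10x per 5-year period (the slide says ">10x", so this is a lower bound). The function name and parameters are illustrative and do not come from the talk.

```python
def growth_factor(years: float, factor_per_period: float = 10.0,
                  period_years: float = 5.0) -> float:
    """Compound growth: factor_per_period ** (years / period_years)."""
    return factor_per_period ** (years / period_years)

# Scale a platform must absorb over an assumed 15-year lifetime.
lifetime_scale = growth_factor(15)
print(f"{lifetime_scale:,.0f}x over 15 years")

# The same assumption expressed as a yearly multiplier (about 1.58x per year).
print(f"{growth_factor(1):.2f}x per year")
```

Under this assumption, a platform sized only for today's volume would be roughly 58% too small after a single year.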
There are more ways to analyze data than ever before
Didn’t exist (years ago): 11, 8, 5, 4
Hadoop, Elasticsearch, Presto, Spark
There are more people working with data than ever before
Democratization of data
Governance & control
How do I provide democratized access to data to enable informed decisions while at the same time enforce data governance and prevent mismanagement of the data?
Data lakes help you cost-effectively scale
Store exabytes of data
Stage data from landing dock to transformed to curated, and make it available in each stage
Load, transform, and catalog once
Make data available to many tools
Open formats and interfaces support innovation
Services shown: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, Amazon S3, Amazon Redshift, Amazon EMR, Athena, Amazon Kinesis, Amazon Elasticsearch Service, AI Services, Amazon QuickSight
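One common way to realize the landing-to-curated staging described above is to give each stage its own key prefix in object storage, so that tools can read from whichever stage they need. A minimal sketch under that assumption: the stage names follow the slide, while the prefix layout, the dataset name, and both helper functions are hypothetical and not taken from the talk.

```python
from datetime import date

# Stage names taken from the slide; the prefix layout itself is an assumption.
STAGES = ("landing", "transformed", "curated")

def object_key(stage: str, dataset: str, day: date, filename: str) -> str:
    """Build a date-partitioned key, e.g. 'landing/clicks/dt=2018-11-28/a.json'."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return f"{stage}/{dataset}/dt={day.isoformat()}/{filename}"

def promote(key: str) -> str:
    """Return the key the same object would have in the next stage."""
    stage, rest = key.split("/", 1)
    idx = STAGES.index(stage)
    if idx == len(STAGES) - 1:
        raise ValueError("already in the final stage")
    return f"{STAGES[idx + 1]}/{rest}"

key = object_key("landing", "clicks", date(2018, 11, 28), "part-0000.json")
print(key)
print(promote(key))
```

Keeping the dataset and date partition identical across stages means a consumer can switch stages by changing only the leading prefix.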
AWS Lake Formation (sign up for the preview)
Build, secure, and manage a data lake in days
Build a data lake in days, not months
Build and deploy a fully managed data lake with a few clicks
Enforce security policies across multiple services
Centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications
Combine different analytics approaches
Empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog
How it works
Data Lakes and analytics on AWS
Data sources: OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social
Services shown: Kinesis, S3, IAM, KMS, Data Catalog
Build Data Lakes quickly
• Identify, crawl, and catalog sources
• Ingest and clean data
• Transform into optimal formats
Simplify security management
• Enforce encryption
• Define access policies
• Implement audit logging
Enable self-service and combined analytics
• Analysts discover all data available for analysis from a single data catalog
• Use multiple analytics tools over the same data
Analytics tools shown: Athena, Amazon Redshift, AI Services, Amazon EMR, Amazon QuickSight
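The ideas above (one catalog for discovery, centrally defined access policies, and an audit trail) can be illustrated with a toy in-memory model. This is only a sketch of the concept and not the Lake Formation API: the `Catalog` class, its methods, the principal names, and the `s3://example-bucket/...` location are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Catalog:
    """Toy stand-in for a central data catalog with access policies and an audit log."""
    tables: dict = field(default_factory=dict)     # table name -> storage location
    grants: set = field(default_factory=set)       # (principal, table) pairs
    audit_log: list = field(default_factory=list)  # (principal, table, allowed)

    def register(self, table: str, location: str) -> None:
        self.tables[table] = location

    def grant(self, principal: str, table: str) -> None:
        self.grants.add((principal, table))

    def discover(self) -> list:
        """Every registered table is discoverable from the one catalog."""
        return sorted(self.tables)

    def resolve(self, principal: str, table: str) -> str:
        """Return the table location if access is granted; record every attempt."""
        allowed = (principal, table) in self.grants
        self.audit_log.append((principal, table, allowed))
        if not allowed:
            raise PermissionError(f"{principal} may not read {table}")
        return self.tables[table]

catalog = Catalog()
catalog.register("sales", "s3://example-bucket/curated/sales/")
catalog.grant("analyst", "sales")
print(catalog.discover())
print(catalog.resolve("analyst", "sales"))
```

Because every analytics tool would go through the same `resolve` call in this model, a policy defined once applies to all of them, and denied attempts are logged as well as successful ones.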
AWS databases and analytics
Broad and deep portfolio, built for builders
Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens
Analytics: Amazon Redshift (Data warehousing), Amazon EMR (Hadoop + Spark), Athena (Interactive analytics), Kinesis Analytics (Real-time), Amazon Elasticsearch Service (Operational analytics)
Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Aurora (MySQL, PostgreSQL), DynamoDB (Key value, Document), ElastiCache (Redis, Memcached), Neptune (Graph), Timestream (Time Series), QLDB (Ledger Database), RDS on VMware
Blockchain: Managed Blockchain, Blockchain Templates
Data Lake: S3/Amazon Glacier, AWS Glue (ETL & Data Catalog), Lake Formation (Data Lakes)
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect
AWS Marketplace: 250+ solutions, 730+ Database solutions, 600+ Analytics solutions, 25+ Blockchain solutions, 20+ Data lake solutions, 30+ solutions
Recent announcements
AWS Lake Formation
Amazon Redshift Concurrency Scaling
Amazon Redshift Elastic Resize
Amazon QuickSight: embedded dashboards
Amazon QuickSight: ML-powered Insights
Amazon Managed Blockchain
Amazon DynamoDB Transactions
Amazon DynamoDB: read/write capacity on-demand
Amazon Timestream (managed time series database)
Amazon QLDB (managed ledger database)
Amazon Aurora Global Database
Amazon RDS on VMware
More places to learn about analytics services
Services: Amazon Athena, Amazon Redshift, Amazon Elasticsearch Service, AWS Lake Formation, Amazon EMR, Amazon Kinesis, Amazon QuickSight, AWS Glue
ANT401-R2: Deep Dive and Best Practices for Amazon Redshift | Fri 11:30
ANT401-R1: Deep Dive and Best Practices for Amazon Redshift | Thu 4:00
ANT202-R1: Modern Cloud Data Warehousing ft. Intuit | Thu 2:30
ANT350-R1: What's New with Amazon Redshift ft. McDonald's | Thu 3:15
Sessions that already occurred: ANT350-R
ANT323-R1: Build Your Own Log Analytics Solutions on AWS | Thu 11:30
Sessions that already occurred: ANT334-R, ANT334-R1, ANT323-R, ANT203
Introduction to AWS Lake Formation - Build a secure data lake in days | Wed 7:00pm
Sessions that already occurred: ANT205
ANT340-R1: A Deep Dive into What's New with Amazon EMR | Fri 3:00
ANT312: Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Security and Governance on AWS | Wed 7:00
Sessions that already occurred: ANT204, ANT312, ANT340-R
ANT322-R1: High Performance Data Streaming with Amazon Kinesis: Best Practices | Thu 1:00
ANT310: Architecting for Real-Time Insights with Amazon Kinesis | Thu 3:15
Sessions that already occurred: ANT208, ANT322-R
Introducing ML-powered insights with Amazon QuickSight | Wed 1:00pm | Aria East, Level 1, Joshua 9
ANT311: NFL and Forwood Safety Deploy Business Analytics at Scale with Amazon QuickSight | Fri 11:30
Sessions that already occurred: ANT309, ANT308
Sessions from prior days: ANT324
CHALLENGE
Need to create a constant feedback loop for designers
Gain an up-to-the-minute understanding of gamer satisfaction to keep gamers engaged, resulting in the most popular game played in the world
Fortnite | 125+ million players
Epic Games uses Data Lakes and analytics
Entire analytics platform running on AWS
S3 leveraged as a Data Lake
All telemetry data is collected with Kinesis
Real-time analytics done through Spark on EMR, with DynamoDB to create scoreboards and real-time queries
Use Amazon EMR for large batch data processing
Game designers use data to inform their decisions
Sources: Game clients, Game servers, Launcher, Game services, APIs, Databases, S3, Other sources
Ingestion: Kinesis
Near-real-time pipelines: Spark on EMR, DynamoDB, Grafana, Scoreboards API, Limited Raw Data (real-time ad-hoc SQL), User ETL (metric definition)
Batch pipelines: S3 (Data Lake), ETL using EMR, Tableau/BI, Ad-hoc SQL
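To make the near-real-time path more concrete, the sketch below folds small batches of telemetry events into a running scoreboard, which is the general shape of a streaming aggregation. It is a plain-Python illustration and not Epic Games' pipeline: the event fields (`type`, `player`), the "elimination" event name, and both functions are assumptions made for the example, and an in-memory `Counter` stands in for the stream processor and the key-value store.

```python
from collections import Counter

def update_scoreboard(scoreboard: Counter, events: list) -> Counter:
    """Fold one micro-batch of telemetry events into running per-player scores."""
    for event in events:
        # Only one (hypothetical) event type contributes to the score.
        if event.get("type") == "elimination":
            scoreboard[event["player"]] += 1
    return scoreboard

def top_n(scoreboard: Counter, n: int) -> list:
    """Return the n highest-scoring players as (player, score) pairs."""
    return scoreboard.most_common(n)

board = Counter()
batch_1 = [{"type": "elimination", "player": "ana"},
           {"type": "login", "player": "bo"}]
batch_2 = [{"type": "elimination", "player": "ana"},
           {"type": "elimination", "player": "bo"}]
for batch in (batch_1, batch_2):
    update_scoreboard(board, batch)
print(top_n(board, 2))
```

The batch path in the diagram would recompute the same metric from the full history in the data lake, which is why both pipelines can share one metric definition.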
Equinox Fitness Clubs is a company with integrated luxury and lifestyle offerings centered on movement, nutrition, and regeneration. Equinox built connected experiences using applications that connect to Apple Health and built data collection into its exercise equipment.
More than 200 locations within every major city across the U.S., London, and Canada
Many lines of business across 98 clubs & 200+ studios in total
Plus central supporting functions
Digital Products, CRM, Marketing, Creative, Development / Building, Finance, Member’s Services, Maintenance, Personal training, Pilates, Spa, Group Fitness, Membership/Sales, Retail, Food Services
Digital products
End user applications
Connections to Apple Health
Connected equipment
Pursuit (gamified cycling experience)
Cardio
Digital assessment
Location tracking
Connected tech
Data lake architecture
Diagram groups: Data & analytics apps, Equinox apps, Third-party apps
Diagram labels: Informatica, Maximilian, Amazon EMR, PT App, Pursuit, Engage, Exact Target, Adobe Social, MOSO, Fitness Agg., Amazon Redshift
The assembled pipeline
Diagram labels: Adobe Analytics, Amazon EMR, Athena, S3, Glue Data Catalog, Redshift Spectrum, S3
Results
Reduced time-to-benefit and increased end-user productivity
Re-platformed and productionalized 2 apps in 4 months
Finished re-platform in under a year
Dependability: very few operational issues
Faster time-to-benefit via automated regression
Huge cost savings over Teradata
We need to rethink what we mean by data and analytics
This is data
This is data
This is data
Skip the trip.
one-hour delivery
Exclusively for Amazon Prime Members
Data can be used to connect more deeply with your customer base
Reporting, analysis, modeling, and planning are not going away
Rethinking data: Example #1
Rethinking data: Example #2
How do we build new types of applications that can leverage this data?
Social media, Ride hailing, Media streaming, Dating
As application requirements change, data processing engines need to evolve as well
On Prime Day, DynamoDB requests from Alexa, the Amazon.com sites, and the Amazon fulfillment centers totaled 3.34 trillion, peaking at 12.9 million per second
Databases need to be able to provide reliable performance with highly variable demands and deliver consistent, single-digit millisecond response time at any scale.
Tracking change over time
Fast, scalable, fully managed time series database
Application logs
IoT sensor readings
Vehicle telematics
We see ledger use cases as an emerging need
Transparent, immutable, verifiable system of record
Easily store and track transactions
Trace the entire production and distribution journey
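One well-known way to make a system of record verifiable is to chain entries together with cryptographic hashes, so that changing any past entry invalidates everything after it. The sketch below shows that general technique using Python's standard `hashlib`. It is an illustration of the idea only and is not presented as how Amazon QLDB is implemented: the entry layout, the function names, and the sample payloads are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append(ledger: list, payload: dict) -> None:
    """Add an entry that commits to the whole history before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"payload": payload, "prev": prev_hash,
                   "hash": entry_hash(prev_hash, payload)})

def verify(ledger: list) -> bool:
    """Recompute every hash in order; any edit to history breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
append(ledger, {"item": "pallet-1", "event": "produced"})
append(ledger, {"item": "pallet-1", "event": "shipped"})
print(verify(ledger))                       # chain is intact
ledger[0]["payload"]["event"] = "recalled"  # tamper with history
print(verify(ledger))                       # tampering is detected
```

The two sample entries mirror the production-and-distribution tracing use case on the slide: each step in an item's journey is appended, never overwritten.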
Advantages of an AWS Managed Blockchain
Supports familiar open source frameworks
Fully managed with enterprise-grade security
Quantum Ledger DB provides scale and off-chain analytics
Data has power
Thank you!
Anurag Gupta
awgupta@amazon.com

Technology Trends: Data Lakes and Analytics (ANT205) - AWS re:Invent 2018
