SlideShare a Scribd company logo
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
This Is My Architecture:
Most I nnovative Storag e Solu tions
L I G H T N I N G R O U N D
N o v e m b e r 2 8 , 2 0 1 7
AWS re:INVENT
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What’s in store…
7- to 10 minute highly-technical presentations
• Alert Logic
• Viber
• iRobot
• Insitu
• Celgene
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Scalable Ingestion, Storage & Analytics
Paul Fisher
Technical Fellow
A l e r t L o g i c
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
About Alert Logic
Alert Logic provides fully managed
security monitoring and protection for
cloud-deployed applications and
workloads
• Intrusion detection
• Web attack detection and blocking
• Vulnerability scanning
• Asset analysis and configuration
assessment
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The problem we faced
Doing this produced a large
data-processing problem
• 4,100+ customers
• 1.2M messages/second
• 2 petabytes per month
• 3 months to 7 years retention
• Adding 110 percent data volume/year
0
5
10
15
20
25
30
35
40
Jan-12
Aug-12
Mar-13
Oct-13
May-14
Dec-14
Jul-15
Feb-16
Sep-16
Apr-17
Nov-17
Jun-18
Jan-19
Aug-19
Mar-20
Oct-20
May-21
Petabytes
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key architecture challenges
• Achieving performance, durability, and availability, all at
once, is hard
• Providing multi-region availability with Recovery Time
Objective (RTO) of 15 minutes
• Designing for scale up 150x to 10PB per day for 100k
customers
• Doing this all for less than $0.02 per GB, with a team of 20
engineers
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Alert Logic system architecture
Foundational Components
Partners EnvironmentsCustomer Environments
AWS
Cloud Collection
appliance
host
agent
host
agent
Azure
appliance
host
agent
Traditional
appliance
host
agent
host
agent
Security Subsystems
AWS
Partner Account
log
ids
Ingestion,
Storage &
Access
Assets &
Config
Correlation &
Analytics
Incident
Analysis &
Workflow
Vulnerability
Assessment
Support
Customers
Reporting
Analysts
Partners
External APIs
Internal UI/UX & APIs
Cloud Collection
Partner DC
Ticketing
Monitoring
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Alert Logic system architecture
Ingestion, Storage & Access
appliance
host
agent
host
agent
Ingestion
Data Access
Search
Collection
Control Flow
Data Flow
AWS
Partner Account
log
ids
Foundational Components
Partners EnvironmentsCustomer Environments
AWS
Cloud Collection
appliance
host
agent
host
agent
Azure
appliance
host
agent
Traditional
appliance
host
agent
host
agent
Security Subsystems
AWS
Partner Account
log
ids
Ingestion,
Storage &
Access
Assets &
Config
Correlation &
Analytics
Incident
Analysis &
Workflow
Vulnerability
Assessment
Support
Customers
Reporting
Analysts
Partners
External APIs
Internal UI/UX & APIs
Cloud Collection
Partner DC
Ticketing
Monitoring
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data access architecture
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Inventory for bundling optimization
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hear more from Paul…
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Evolving the Data Store Supporting IoT
Historical State Changes
Jim Teal
Senior Cloud Solution Architect – Product IT
i R o b o t C o r p o r a t i o n
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The problem…
• We needed a facility to support ad-hoc queries, through an API to
report historical states for AWS IoT Thing Shadow attributes
• Our initial design pattern was based on Amazon Elastic
MapReduce, Spark Streaming, Amazon DynamoDB, Amazon API
Gateway, and AWS Lambda
• DynamoDB became costly for this use case, and we needed to find
an alternative solution to meet cost and performance expectations
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key architecture considerations
Balance
performance & cost
Minimize
DevOps impact
Focused on AWS
Managed Services
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
iRobot analytics architecture
Accepted
Rule
Processed
Data
Robot
DataAPI
Amazon
Athena
1 7 9
5
6
AWS ElastiCache
Accepted
Firehose
Accepted
Lambda
Kinesis
Firehose
8
Last Hour of
Data
Data
Pipeline
Landing
Data
3 4
Raw
Data
Aspen 2.x
Platform
Aspen 2.x Analytics
RS Data
Pipeline
Redshift
HDH
Tables
Accepted
Bucket
2
13 14
10 11 12
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Learn more about IoT & analytics
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Linux Migration
Lance Smith
Associate Director
Celgene
C e l g e n e
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
2014
2015
2016
2017
2018+
AWS WAF
Amazon SQS
AWS Direct Connect
AWS CloudFront
AWS
IoT
AWS Storage Gateway
AWS Lambda
Amazon S3
Amazon EC2
Amazon RDS
Amazon Aurora
Amazon EFS
Amazon EMR
Amazon WorkSpaces
Linux migration
• Migrate COTS-HPC applications to AWS
• On-premise clusters no longer sized for data intensive projects
• Desire to move to elastic computational environment
• Data protections in place
First Foray First Migrations Cloud First
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key architecture challenges
• Application not
made for cloud
• No app/code
changes
• Need to refactor
for best practices
GENOMICS HPC
COMPUTATIONAL
CHEMISTRY
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
HPC architecture
VPC subnet
Auto
Scaling
Amazon
EFS
VPC subnet
Auto
Scaling
Auto
Scaling
Amazon
EC2
Clients
Amazon
EC2
Clients
Amazon
EC2
Clients
HPC Cluster
GPFS workstation
Celgene Sites
AWS File
Gateway
Amazon
S3
AWS
CloudWatch
Amazon
SQS
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hear more from Celgene…
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Viber—Data Lake Challenges
Amir Ish-Shalom
Chief Architect
R a k u t e n V i b e r
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
• Messaging (including group)
• Secure end-to-end encryption
• Rich media and chat extensions
• Full multiple-device support
• HD video and phone calls
• Viber Out and Viber In
• Public chats and accounts
• Chatbots
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Big data @ Viber
• Close to 1 billion users worldwide
• Globally used in 230 countries
• 10-15 billion events daily (2TB)
• 300,000 events per second (peak hours)
• 5PB of data stored on Amazon S3
• NoSQL DB (Couchbase) performing
2 million TPS on 20TB of data
with 35 billion keys
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Viber—big data architecture
RT data pipeline
Kinesis
Viber backend
servers
RT data processor
Apache Storm
Data storage
S3
Raw data backup
Kinesis Firehose
ETL jobs
Spark, Presto, Pig,
Lambda functions
Query engines
Presto, Athena,
Spark SQL, Pig
Reporting tools
Tableau, Redash,
Zeppelin, others
Databases &
data warehouses
Redshift, Aurora,
MySQL
Events
NoSQL profile database
Couchbase
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Challenge
• Over 300 different event types with large throughput variance
• Storm data processor created many small files, especially for lower
throughput events
• Events were stored in Hive-partitioned folders, which are not
optimal for Amazon S3
• Running a query over these events using Presto could generate up
to 15K tps on a single S3 bucket, resulting in 5xx errors and
throttling the whole bucket for other processes
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Current solution
• Concatenate small files into large files, optimally 100MB+
• Convert files into columnar file format such as Parquet or ORC
S3DistCp Spark
Concatenate
files
Convert to
Parquet
S3 S3 S3
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Future solution
• Concatenate and convert files in a single process (Glue?)
• Use better partitioned Hive directory format (H/D/M/Y instead of
Y/M/D/H)
• Use even larger files for high throughput events
Spark/Glue
Concatenate files &
convert to Parquet
S3 S3
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Learn more about data lakes
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
INEXA Cloud—Answers at the Edge
Rahul C. Thakkar, Director, Commercial Cloud
Steven Hoffert, Lead, Special Projects
I N S I T U I n c . , a B o e i n g C o m p a n y
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Yes Problemo!
• We are in the middle of nowhere – no Internet, think Australian Outback, an open-pit mine, a collection of
gas well heads, linear infrastructure like pipelines
• We want to build an accurate (under 10 cm/pixel) 3D (or what we also call 2.5D) model - Why? Because our
clients want our spatio-temporal algorithms to find what, if anything is broken
• We get 30+ TB of data for that location, we may have many many such locations operational globally
Problem
• We want to verify that the data we collected on that day was decent on the
ground at the “edge”
• We want that data to reach our cloud for further and continuous future
processing
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key architecture challenges
1
2
• Middle of nowhere
• Risk for people
• Less area covered
• Lot of data (TB)
• Time to result in days
• $$$$$
• Use unmanned
• Minimize risk to people
• 10x or more area
• Data to cloud
• Time to result in hours
• $$
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key architecture challenges
A boat load
of images
TB++
3
Days
Finally, we can
start analysis
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The solution
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Q&A
Don’t forget about…
• Speaker Meet & Greet in the Park (behind the Expo, by
the LINQ) —1 to 2 p.m. Thursday
• STG312 & STG313
• Demo in the booth
• Stop anyone with this logo to talk about Amazon S3
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thank you!

More Related Content

What's hot

FSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
FSV306_Getting to Yes—Minimal Viable Cloud with Maximum SecurityFSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
FSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
Amazon Web Services
 
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage ManagementSTG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
Amazon Web Services
 
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
Amazon Web Services
 
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
Amazon Web Services
 
STG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWSSTG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWS
Amazon Web Services
 
MCL207_Amazon Lex Integration with IVR
MCL207_Amazon Lex Integration with IVRMCL207_Amazon Lex Integration with IVR
MCL207_Amazon Lex Integration with IVR
Amazon Web Services
 
ENT212-An Overview of Best Practices for Large-Scale Migrations
ENT212-An Overview of Best Practices for Large-Scale MigrationsENT212-An Overview of Best Practices for Large-Scale Migrations
ENT212-An Overview of Best Practices for Large-Scale Migrations
Amazon Web Services
 
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
Amazon Web Services
 
ARC330_How the BBC Built a Massive Media Pipeline Using Microservices
ARC330_How the BBC Built a Massive Media Pipeline Using MicroservicesARC330_How the BBC Built a Massive Media Pipeline Using Microservices
ARC330_How the BBC Built a Massive Media Pipeline Using Microservices
Amazon Web Services
 
Application Performance Management on AWS - ARC317 - re:Invent 2017
Application Performance Management on AWS - ARC317 - re:Invent 2017Application Performance Management on AWS - ARC317 - re:Invent 2017
Application Performance Management on AWS - ARC317 - re:Invent 2017
Amazon Web Services
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
Amazon Web Services
 
IOT207_Panasonic—Building the Road of the Future on AWS
IOT207_Panasonic—Building the Road of the Future on AWSIOT207_Panasonic—Building the Road of the Future on AWS
IOT207_Panasonic—Building the Road of the Future on AWS
Amazon Web Services
 
Improving Microservice and Serverless Observability with Monitoring Data - SR...
Improving Microservice and Serverless Observability with Monitoring Data - SR...Improving Microservice and Serverless Observability with Monitoring Data - SR...
Improving Microservice and Serverless Observability with Monitoring Data - SR...
Amazon Web Services
 
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
Amazon Web Services
 
ARC207_Monitoring Performance of Enterprise Applications on AWS
ARC207_Monitoring Performance of Enterprise Applications on AWSARC207_Monitoring Performance of Enterprise Applications on AWS
ARC207_Monitoring Performance of Enterprise Applications on AWS
Amazon Web Services
 
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
Amazon Web Services
 
ABD310 big data aws and security no notes
ABD310 big data aws and security no notesABD310 big data aws and security no notes
ABD310 big data aws and security no notes
Amazon Web Services
 
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
Amazon Web Services
 
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
Amazon Web Services
 
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdfGAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
Amazon Web Services
 

What's hot (20)

FSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
FSV306_Getting to Yes—Minimal Viable Cloud with Maximum SecurityFSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
FSV306_Getting to Yes—Minimal Viable Cloud with Maximum Security
 
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage ManagementSTG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
STG311_Deep Dive on Amazon S3 & Amazon Glacier Storage Management
 
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
 
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
ABD208_Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores...
 
STG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWSSTG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWS
 
MCL207_Amazon Lex Integration with IVR
MCL207_Amazon Lex Integration with IVRMCL207_Amazon Lex Integration with IVR
MCL207_Amazon Lex Integration with IVR
 
ENT212-An Overview of Best Practices for Large-Scale Migrations
ENT212-An Overview of Best Practices for Large-Scale MigrationsENT212-An Overview of Best Practices for Large-Scale Migrations
ENT212-An Overview of Best Practices for Large-Scale Migrations
 
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
ATC303-Cache Me If You Can Minimizing Latency While Optimizing Cost Through A...
 
ARC330_How the BBC Built a Massive Media Pipeline Using Microservices
ARC330_How the BBC Built a Massive Media Pipeline Using MicroservicesARC330_How the BBC Built a Massive Media Pipeline Using Microservices
ARC330_How the BBC Built a Massive Media Pipeline Using Microservices
 
Application Performance Management on AWS - ARC317 - re:Invent 2017
Application Performance Management on AWS - ARC317 - re:Invent 2017Application Performance Management on AWS - ARC317 - re:Invent 2017
Application Performance Management on AWS - ARC317 - re:Invent 2017
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
 
IOT207_Panasonic—Building the Road of the Future on AWS
IOT207_Panasonic—Building the Road of the Future on AWSIOT207_Panasonic—Building the Road of the Future on AWS
IOT207_Panasonic—Building the Road of the Future on AWS
 
Improving Microservice and Serverless Observability with Monitoring Data - SR...
Improving Microservice and Serverless Observability with Monitoring Data - SR...Improving Microservice and Serverless Observability with Monitoring Data - SR...
Improving Microservice and Serverless Observability with Monitoring Data - SR...
 
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
What's New for AWS Purpose Built, Non-relational Databases - DAT204 - re:Inve...
 
ARC207_Monitoring Performance of Enterprise Applications on AWS
ARC207_Monitoring Performance of Enterprise Applications on AWSARC207_Monitoring Performance of Enterprise Applications on AWS
ARC207_Monitoring Performance of Enterprise Applications on AWS
 
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
Data Security in the Cloud Demystified – Policies, Protection, and Tools for ...
 
ABD310 big data aws and security no notes
ABD310 big data aws and security no notesABD310 big data aws and security no notes
ABD310 big data aws and security no notes
 
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
 
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
NEW LAUNCH! AWS PrivateLink Deep Dive - NET310 - re:Invent 2017
 
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdfGAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
GAM301-Migrating the League of Legends Platform into AWS Cloud.pdf
 

Similar to STG401_This Is My Architecture

ABD201-Big Data Architectural Patterns and Best Practices on AWS
ABD201-Big Data Architectural Patterns and Best Practices on AWSABD201-Big Data Architectural Patterns and Best Practices on AWS
ABD201-Big Data Architectural Patterns and Best Practices on AWS
Amazon Web Services
 
21st Century Analytics with Zopa
21st Century Analytics with Zopa21st Century Analytics with Zopa
21st Century Analytics with Zopa
Amazon Web Services
 
ARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million Users
Amazon Web Services
 
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
Amazon Web Services
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data Oceans
Amazon Web Services
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the Cloud
Adrian Hornsby
 
AWS Storage State of the Union
AWS Storage State of the UnionAWS Storage State of the Union
AWS Storage State of the Union
Amazon Web Services
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
Amazon Web Services
 
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Amazon Web Services
 
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
Amazon Web Services
 
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of ManufacturingGPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
Amazon Web Services
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Amazon Web Services
 
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
Amazon Web Services
 
100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_Singapore100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_Singapore
Amazon Web Services
 
ABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSightABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSight
Amazon Web Services
 
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTHow TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
Amazon Web Services
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
Amazon Web Services
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Amazon Web Services
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Amazon Web Services
 
Deep Dive on Big Data
Deep Dive on Big Data Deep Dive on Big Data
Deep Dive on Big Data
Amazon Web Services
 

Similar to STG401_This Is My Architecture (20)

ABD201-Big Data Architectural Patterns and Best Practices on AWS
ABD201-Big Data Architectural Patterns and Best Practices on AWSABD201-Big Data Architectural Patterns and Best Practices on AWS
ABD201-Big Data Architectural Patterns and Best Practices on AWS
 
21st Century Analytics with Zopa
21st Century Analytics with Zopa21st Century Analytics with Zopa
21st Century Analytics with Zopa
 
ARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million Users
 
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
I Want to Analyze and Visualize Website Access Logs, but Why Do I Need Server...
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data Oceans
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the Cloud
 
AWS Storage State of the Union
AWS Storage State of the UnionAWS Storage State of the Union
AWS Storage State of the Union
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
Serverless Stream Processing Tips & Tricks (ANT358) - AWS re:Invent 2018
 
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
 
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of ManufacturingGPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
 
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
透過最新的 AWS 服務在 2019 年為您的業務轉型 (Level 200)
 
100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_Singapore100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_Singapore
 
ABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSightABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSight
 
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTHow TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
 
Deep Dive on Big Data
Deep Dive on Big Data Deep Dive on Big Data
Deep Dive on Big Data
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
Amazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
Amazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
Amazon Web Services
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Amazon Web Services
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
Amazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
Amazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Amazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
Amazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Amazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

STG401_This Is My Architecture

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. This Is My Architecture: Most I nnovative Storag e Solu tions L I G H T N I N G R O U N D N o v e m b e r 2 8 , 2 0 1 7 AWS re:INVENT
  • 2. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. What’s in store… 7- to 10 minute highly-technical presentations • Alert Logic • Viber • iRobot • Insitu • Celgene
  • 3. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Scalable Ingestion, Storage & Analytics Paul Fisher Technical Fellow A l e r t L o g i c
  • 4. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. About Alert Logic Alert Logic provides fully managed security monitoring and protection for cloud-deployed applications and workloads • Intrusion detection • Web attack detection and blocking • Vulnerability scanning • Asset analysis and configuration assessment
  • 5. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The problem we faced Doing this produced a large data-processing problem • 4,100+ customers • 1.2M messages/second • 2 petabytes per month • 3 months to 7 years retention • Adding 110 percent data volume/year 0 5 10 15 20 25 30 35 40 Jan-12 Aug-12 Mar-13 Oct-13 May-14 Dec-14 Jul-15 Feb-16 Sep-16 Apr-17 Nov-17 Jun-18 Jan-19 Aug-19 Mar-20 Oct-20 May-21 Petabytes
  • 6. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Key architecture challenges • Achieving performance, durability, and availability, all at once, is hard • Providing multi-region availability with Recovery Time Objective (RTO) of 15 minutes • Designing for scale up 150x to 10PB per day for 100k customers • Doing this all for less than $0.02 per GB, with a team of 20 engineers
  • 7. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Alert Logic system architecture Foundational Components Partners EnvironmentsCustomer Environments AWS Cloud Collection appliance host agent host agent Azure appliance host agent Traditional appliance host agent host agent Security Subsystems AWS Partner Account log ids Ingestion, Storage & Access Assets & Config Correlation & Analytics Incident Analysis & Workflow Vulnerability Assessment Support Customers Reporting Analysts Partners External APIs Internal UI/UX & APIs Cloud Collection Partner DC Ticketing Monitoring
  • 8. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Alert Logic system architecture Ingestion, Storage & Access appliance host agent host agent Ingestion Data Access Search Collection Control Flow Data Flow AWS Partner Account log ids Foundational Components Partners EnvironmentsCustomer Environments AWS Cloud Collection appliance host agent host agent Azure appliance host agent Traditional appliance host agent host agent Security Subsystems AWS Partner Account log ids Ingestion, Storage & Access Assets & Config Correlation & Analytics Incident Analysis & Workflow Vulnerability Assessment Support Customers Reporting Analysts Partners External APIs Internal UI/UX & APIs Cloud Collection Partner DC Ticketing Monitoring
  • 9. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Data access architecture
  • 10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Inventory for bundling optimization
  • 11. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hear more from Paul…
  • 12. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Evolving the Data Store Supporting IoT Historical State Changes Jim Teal Senior Cloud Solution Architect – Product IT i R o b o t C o r p o r a t i o n
  • 13. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The problem… • We needed a facility to support ad-hoc queries, through an API to report historical states for AWS IoT Thing Shadow attributes • Our initial design pattern was based on Amazon Elastic MapReduce, Spark Streaming, Amazon DynamoDB, Amazon API Gateway, and AWS Lambda • DynamoDB became costly for this use case, and we needed to find an alternative solution to meet cost and performance expectations
  • 14. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Key architecture considerations Balance performance & cost Minimize DevOps impact Focused on AWS Managed Services
  • 15. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. iRobot analytics architecture Accepted Rule Processed Data Robot DataAPI Amazon Athena 1 7 9 5 6 AWS ElastiCache Accepted Firehose Accepted Lambda Kinesis Firehose 8 Last Hour of Data Data Pipeline Landing Data 3 4 Raw Data Aspen 2.x Platform Aspen 2.x Analytics RS Data Pipeline Redshift HDH Tables Accepted Bucket 2 13 14 10 11 12
  • 16. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Learn more about IoT & analytics
  • 17. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Linux Migration Lance Smith Associate Director Celgene C e l g e n e
  • 18. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 2014 2015 2016 2017 2018+ AWS WAF Amazon SQS AWS Direct Connect AWS CloudFront AWS IoT AWS Storage Gateway AWS Lambda Amazon S3 Amazon EC2 Amazon RDS Amazon Aurora Amazon EFS Amazon EMR Amazon WorkSpaces Linux migration • Migrate COTS-HPC applications to AWS • On-premise clusters no longer sized for data intensive projects • Desire to move to elastic computational environment • Data protections in place First Foray First Migrations Cloud First
  • 19. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Key architecture challenges • Application not made for cloud • No app/code changes • Need to refactor for best practices GENOMICS HPC COMPUTATIONAL CHEMISTRY
  • 20. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. HPC architecture VPC subnet Auto Scaling Amazon EFS VPC subnet Auto Scaling Auto Scaling Amazon EC2 Clients Amazon EC2 Clients Amazon EC2 Clients HPC Cluster GPFS workstation Celgene Sites AWS File Gateway Amazon S3 AWS CloudWatch Amazon SQS
  • 21. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hear more from Celgene…
  • 22. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Viber—Data Lake Challenges Amir Ish-Shalom Chief Architect R a k u t e n V i b e r
  • 23. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Messaging (including group) • Secure end-to-end encryption • Rich media and chat extensions • Full multiple-device support • HD video and phone calls • Viber Out and Viber In • Public chats and accounts • Chatbots
  • 24. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big data @ Viber • Close to 1 billion users worldwide • Globally used in 230 countries • 10-15 billion events daily (2TB) • 300,000 events per second (peak hours) • 5PB of data stored on Amazon S3 • NoSQL DB (Couchbase) performing 2 million TPS on 20TB of data with 35 billion keys
  • 25. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Viber—big data architecture RT data pipeline Kinesis Viber backend servers RT data processor Apache Storm Data storage S3 Raw data backup Kinesis Firehose ETL jobs Spark, Presto, Pig, Lambda functions Query engines Presto, Athena, Spark SQL, Pig Reporting tools Tableau, Redash, Zeppelin, others Databases & data warehouses Redshift, Aurora, MySQL Events NoSQL profile database Couchbase
  • 26. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Challenge • Over 300 different event types with large throughput variance • Storm data processor created many small files, especially for lower throughput events • Events were stored in Hive-partitioned folders, which are not optimal for Amazon S3 • Running a query over these events using Presto could generate up to 15K tps on a single S3 bucket, resulting in 5xx errors and throttling the whole bucket for other processes
  • 27. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Current solution • Concatenate small files into large files, optimally 100MB+ • Convert files into columnar file format such as Parquet or ORC S3DistCp Spark Concatenate files Convert to Parquet S3 S3 S3
  • 28. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Future solution • Concatenate and convert files in a single process (Glue?) • Use better partitioned Hive directory format (H/D/M/Y instead of Y/M/D/H) • Use even larger files for high throughput events Spark/Glue Concatenate files & convert to Parquet S3 S3
  • 29. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Learn more about data lakes
  • 30. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. INEXA Cloud—Answers at the Edge Rahul C. Thakkar, Director, Commercial Cloud Steven Hoffert, Lead, Special Projects I N S I T U I n c . , a B o e i n g C o m p a n y
  • 31. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Yes Problemo! • We are in the middle of nowhere – no Internet, think Australian Outback, an open-pit mine, a collection of gas well heads, linear infrastructure like pipelines • We want to build an accurate (under 10 cm/pixel) 3D (or what we also call 2.5D) model - Why? Because our clients want our spatio-temporal algorithms to find what, if anything is broken • We get 30+ TB of data for that location, we may have many many such locations operational globally Problem • We want to verify that the data we collected on that day was decent on the ground at the “edge” • We want that data to reach our cloud for further and continuous future processing
  • 32. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Key architecture challenges 1 2 • Middle of nowhere • Risk for people • Less area covered • Lot of data (TB) • Time to result in days • $$$$$ • Use unmanned • Minimize risk to people • 10x or more area • Data to cloud • Time to result in hours • $$
  • 33. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Key architecture challenges A boat load of images TB++ 3 Days Finally, we can start analysis
  • 34. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The solution
  • 35. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Q&A Don’t forget about… • Speaker Meet & Greet in the Park (behind the Expo, by the LINQ) —1 to 2 p.m. Thursday • STG312 & STG313 • Demo in the booth • Stop anyone with this logo to talk about Amazon S3
  • 36. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank you!