© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Puneet Suri, Thermo Fisher Scientific
Shakila Pothini, Thermo Fisher Scientific
October 2015
Decoding the Genetic Blueprint of Life
on a Cloud-Connected Ecosystem
ARC311
About Me
Puneet Suri
Senior Director, Software Engineering
Life Sciences Group, Thermo Fisher Scientific
follow at: @psuri  connect at: puneet.suri@thermofisher.com
Envisioned and developed the life sciences cloud platform for Thermo Fisher Scientific
4.5 hours of TV
11 hours of online reading
5 hours of radio
2 hours of computer
5 hours of handheld
1.13 hours of phone
DNA from all cells: Earth to Moon 6000 times, Earth to Sun 30 times
Cells in our body: 37.2 trillion
DNA in single cell: 6 feet long
Basepairs in DNA: 3.2 billion (ATGC), 2 bytes/basepair
Human genome: 200 gigabytes
Population of 7 billion: 1.4 zettabytes
Infectious Disease
Top Threats
Environmental Degradation Bioterrorism
Timelines of infectious threats
[Timeline figure. Threats labeled: Spanish flu, AIDS, swine flu, SARS, mumps, E. coli, Ebola, Hong Kong flu, avian flu, West Nile virus, monkeypox, H1N1, MERS. Years marked: 1918, 1958, 1961, 1968, 1976, 1999, 2002, 2003, 2006, 2009, 2011, 2013, 2014.]
Our vision: We enable our customers to make the world…
Having an Impact…
A person was set free
after 35 years in prison
because of a DNA test
Freeing the innocent
H1N1 pandemic
Ebola
CDC: Swine flu viruses in U.S. and Mexico match
7500 Instrument
What to Expect from the Session
• Refresh / overview of our architecture
• Evolution of the architecture
• Connected ecosystem
• Future possibilities
A Day with the Scientist
Get Insights
a project
Insights
• What is causing cancer?
• Is the disease genetic or hereditary?
• What drugs will work?
• Is therapy working?
Data Model: Project
[Diagram: instrument output is parsed into a project. Entities: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions). Data sizes shown range from MBs to GBs.]
A Few Years Back
desktop apps; no backup, archive, or security; limited analysis; limited collaboration
Our offerings
multiple tools
Pain points
slow analysis; unsupported workflows; Excel nightmare
Customer needs…
• Real-time analysis
• Highly interactive visualizations
• 2-3 second response time
• Easy to share data & collaborate
• Store & manage large scientific data sets
A Better Way Is to Provide…
• STORAGE
• COMPUTE
• SCALABILITY
• MEMORY
Our vision
Our Journey
Enabling Complex Customer Workflows
Dimensions of Complexity
• Millions of records
• 1000s of users, projects
• Real-time analysis of large data sets
• 2-3 second response time
project: storage, compute, performance, scalability
Custom Storage Engine
To meet the challenges of storage, compute,
performance & scalability, a custom storage and query
engine had to be designed and implemented
Design Goals for Our Custom Storage Engine
• High availability (HA) out of the box: no configuration needed to turn on HA
• Completely decentralized, so no single point of failure (SPOF)
• Managed scalability, so that no admin input is required to scale the storage engine as data volume and concurrent requests go up
• Manageable total cost of ownership (TCO)
Custom Storage Engine Design
• Amazon S3: redundant storage for big data sets
• DynamoDB: storing indexes and small objects
• ElastiCache: in-memory caching for faster access using index-based retrieval
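The three-tier layout above can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: plain dictionaries stand in for DynamoDB, Amazon S3, and ElastiCache, and the class name `TieredStore`, its methods, and the size cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed cutoff between "small" and "big" objects, in bytes (hypothetical).
SMALL_OBJECT_LIMIT = 400 * 1024


@dataclass
class TieredStore:
    """Toy model of the storage engine; dicts stand in for the AWS services."""
    index: dict = field(default_factory=dict)  # stand-in for DynamoDB
    blobs: dict = field(default_factory=dict)  # stand-in for Amazon S3
    cache: dict = field(default_factory=dict)  # stand-in for ElastiCache

    def put(self, key: str, payload: bytes) -> str:
        """Store small payloads with the index entry, big ones in the blob tier."""
        if len(payload) <= SMALL_OBJECT_LIMIT:
            self.index[key] = {"tier": "index", "data": payload}
            tier = "index"
        else:
            self.blobs[key] = payload
            self.index[key] = {"tier": "blob", "data": None}
            tier = "blob"
        self.cache.pop(key, None)  # drop any stale cached copy
        return tier

    def get(self, key: str) -> bytes:
        """Serve from the cache when possible, otherwise follow the index."""
        if key in self.cache:
            return self.cache[key]
        entry = self.index[key]
        payload = entry["data"] if entry["tier"] == "index" else self.blobs[key]
        self.cache[key] = payload
        return payload
```

A caller would `put` parsed objects once and let repeated `get` calls be answered from the cache tier.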
Our Iterative Journey & Challenges
0 Start with reference architecture
1 Identify scalable storage solution
2 Identify scalable storage solution for large data items
3 Identify solutions for real-time response & queries
4 Identify solutions for real-time analysis of data
NoSQL (DynamoDB)
• Managed scalability
• Near zero administration overhead
• Query performance not impacted by table size;
can add billions of rows
• Key value store allows for flexibility, so new
domains can be supported
• Read/write performance on the order of tens of ms
Architecture with DynamoDB
[Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, Amazon DynamoDB.]
Iteration 1
[Diagram: project data model: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions); parse and get insights. Data sizes shown range from MBs to GBs.]
             Storage  Query
Performance  ✔        ✔
Cost         ✔        ✔
What Were the Gaps?
• Our item attributes (e.g., Instrument Run) can exceed 400 KB, which is above the DynamoDB item size limit of 400 KB.
• Hot hash key: adding thousands of related records (e.g., Raw Signals) with a common hash key (e.g., Instrument Run) can be slow (tens of seconds).
• A large project needed high read/write capacity (1000s), which increased cost.
• Batch size is limited to 25: a large project can have ~1 million records (e.g., Raw Signals) that need to be read and written at the same time.
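To make the batch-size gap concrete, the arithmetic can be sketched as below. The helper names `chunked` and `batches_needed` are illustrative only; the limit of 25 and the ~1 million record figure come from the slide.

```python
from typing import Iterator, Sequence

BATCH_LIMIT = 25  # batch size limit quoted on the slide


def chunked(records: Sequence, size: int = BATCH_LIMIT) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def batches_needed(record_count: int, size: int = BATCH_LIMIT) -> int:
    """Number of batch calls required to cover `record_count` records."""
    return -(-record_count // size)  # ceiling division


# A project with ~1 million raw-signal records needs this many batch calls:
calls_for_large_project = batches_needed(1_000_000)
```

At 25 records per call, a single large project translates into tens of thousands of round trips, which is why a store without a batch-size ceiling was wanted.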
What We Needed…
…was a solution that
• can store a huge number of related objects
• is cost effective for reading/writing large data sets
• has no limitations on batch size or item size
• has ability to query into the large number of records
Our Iterative Journey & Challenges
0 Start with reference architecture
1 Identify scalable storage solution: DynamoDB
2 Identify scalable storage solution for large data items
3 Identify solutions for real-time response & queries
4 Identify solutions for real time analysis of data
Architecture: DynamoDB & Amazon S3
[Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3.]
Iteration 2
[Diagram: project data model: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions); get insights. Data sizes shown range from MBs to GBs.]
             Storage  Query
Performance  ✔        ✔
Cost         ✔        ✔
Real-time Queries for Complex Visualizations
Our Iterative Journey & Challenges
0 Start with reference architecture
1 Identify scalable storage solution: DynamoDB
2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3
3 Identify solutions for fast, real-time response & queries
4 Identify solutions for real time analysis of data
Distributed In-Memory Storage: ElastiCache
• Queries have to complete on the order of milliseconds, so reading from and writing to S3 was not feasible for interactive use cases.
• ElastiCache was used as in-memory storage on top of DynamoDB & Amazon S3.
• All related serialized objects in Amazon S3 that are part of a query access pattern are maintained in ElastiCache as individual records.
• Non-clustered indexes were created in DynamoDB based on the query pattern so that data could be efficiently retrieved from ElastiCache.
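One way to picture index-based retrieval from the cache is the sketch below. It is an assumption-laden toy: dictionaries stand in for S3, the DynamoDB index, and ElastiCache, and `IndexedCache`, `add_index`, `query`, and the sample keys are hypothetical names chosen for the example.

```python
class IndexedCache:
    """Toy model: an index maps a query pattern to individually cached records."""

    def __init__(self, blob_store):
        self.blob_store = blob_store  # stand-in for S3: blob_key -> {record_id: record}
        self.index = {}               # stand-in for a DynamoDB index: query_key -> (blob_key, record_ids)
        self.cache = {}               # stand-in for ElastiCache: record_id -> record
        self.blob_reads = 0           # counts the slow reads from the blob store

    def add_index(self, query_key, blob_key, record_ids):
        """Register which records answer a given query pattern."""
        self.index[query_key] = (blob_key, list(record_ids))

    def query(self, query_key):
        """Resolve record ids through the index, then read them from the cache."""
        blob_key, record_ids = self.index[query_key]
        missing = [r for r in record_ids if r not in self.cache]
        if missing:
            # Cache miss: one read of the serialized object, then cache each record.
            self.blob_reads += 1
            blob = self.blob_store[blob_key]
            for record_id in missing:
                self.cache[record_id] = blob[record_id]
        return [self.cache[r] for r in record_ids]
```

Only the first query for a pattern touches the blob store; later queries over the same records are served from memory.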
Architecture: DynamoDB, Amazon S3 & ElastiCache
[Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3, ElastiCache.]
Iteration 3
[Diagram: project data model: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions), with non-clustered indexes; get insights. Data sizes shown range from MBs to GBs.]
             Storage  Query
Performance  ✔        ✔
Cost         ✔        ✔
Need for Real-time Data Analysis
• Analyze huge projects containing thousands of patient samples in
minutes instead of days
• A scalable solution is required to support analysis requests from
thousands of users
• Existing desktop algorithms used for this analysis are not optimized
for extracting parallelism in data
Analysis Solutions on Desktop
# of records   minutes
90,000         8
180,000        20
270,000        40
360,000        80
450,000        120
675,000        200
900,000        320
Chart annotation: Desktop crash
Our Iterative Journey & Challenges
0 Start with reference architecture
1 Identify scalable storage solution: DynamoDB
2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3
3 Identify solutions for fast, real-time response & queries: DynamoDB + Amazon S3 + ElastiCache
4 Identify solutions for real-time analysis of data
Amazon EMR
• Amazon EMR was used to perform real-time analysis of huge data sets (analysis results) in minutes instead of days.
• Small jobs are analyzed in memory, while big ones are sent to Amazon EMR.
• Existing algorithms were overhauled to derive massive parallelism using the Hadoop MapReduce framework.
• Because the large data sets were already in Amazon S3, Amazon S3 was used for input and output instead of HDFS; only intermediate MapReduce data is kept in HDFS.
• The Amazon EMR cluster is created on demand and shut down when done.
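The small-job versus big-job split and the map-reduce restructuring can be sketched as follows. Everything here is illustrative: the threshold value, the function names, and the per-sample mean used as the "analysis" are assumptions, and the code runs in a single process rather than on Hadoop.

```python
from collections import defaultdict

# Hypothetical cutoff; the slides do not say what counts as a "huge" job.
HUGE_JOB_RECORDS = 100_000


def choose_backend(record_count, threshold=HUGE_JOB_RECORDS):
    """Route small jobs to in-memory analysis and big ones to the cluster."""
    return "emr" if record_count > threshold else "in_memory"


def map_phase(records):
    """Emit (sample_id, (signal, 1)) pairs; each record is handled independently."""
    for sample_id, signal in records:
        yield sample_id, (signal, 1)


def reduce_phase(pairs):
    """Combine the pairs per sample into a mean signal."""
    totals = defaultdict(lambda: [0.0, 0])
    for sample_id, (signal, count) in pairs:
        totals[sample_id][0] += signal
        totals[sample_id][1] += count
    return {sample: total / n for sample, (total, n) in totals.items()}


records = [("s1", 1.0), ("s1", 3.0), ("s2", 5.0)]
means = reduce_phase(map_phase(records))
```

Because the map step treats every record independently, the same shape of computation can be spread across many workers, which is the property the overhauled algorithms rely on.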
Architecture: Amazon EMR for Real-time Analysis
[Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3, ElastiCache, Amazon EMR.]
Iteration 4
[Diagram: project data model: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Raw Signals (millions), Analysis Results (millions), with non-clustered indexes; parse and get insights. Data sizes shown range from MBs to GBs.]
Is it a huge analysis job?
• yes: Amazon EMR
• no: analysis servers
             Storage  Query  Analysis
Performance  ✔        ✔      ✔
Cost         ✔        ✔      ✔
Performance for a Project
[Chart: analysis time in minutes vs. # of records, cloud vs. desktop. Annotations: >10x; Desktop crash.]
# of records   cloud (minutes)
90,000         2
180,000        4
270,000        7
360,000        11
450,000        13
675,000        20
900,000        30
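As a rough check on the ">10x" annotation, the speedup can be computed from the data labels on the two charts. This assumes the labels 8 to 320 are the desktop series and 2 to 30 the cloud series, each paired in order with the record counts on the x-axis; that pairing is read off the extracted slide text and is not stated explicitly.

```python
# Data labels as they appear on the slides (pairing is an assumption).
records = [90_000, 180_000, 270_000, 360_000, 450_000, 675_000, 900_000]
desktop_minutes = [8, 20, 40, 80, 120, 200, 320]
cloud_minutes = [2, 4, 7, 11, 13, 20, 30]


def speedups(desktop, cloud):
    """Ratio of desktop time to cloud time for each project size."""
    return [d / c for d, c in zip(desktop, cloud)]


ratios = speedups(desktop_minutes, cloud_minutes)
for n, ratio in zip(records, ratios):
    print(f"{n:>7} records: {ratio:.1f}x faster in the cloud")
```

Under that reading the gap widens with project size, from about 4x at the smallest project to a little over 10x at the largest.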
Journey…
✓ 0 Start with reference architecture
✓ 1 Identify scalable storage solution: DynamoDB
✓ 2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3
✓ 3 Identify solutions for fast, real-time response & queries: DynamoDB + Amazon S3 + ElastiCache
✓ 4 Identify solutions for real-time analysis of data: Amazon EMR
Thermo Fisher Cloud
apps.thermofisher.com
Evolution of Our Architecture
More Reliable & Scalable
Road Blocks
• Unbalanced routing of analysis requests (asynchronous calls) by Elastic Load Balancing (ELB)
  - Routing algorithms supported by ELB
    o Round robin routing
    o Least outstanding requests routing
  - Routing algorithm options are not memory or utilization based
• ElastiCache cluster
  - Unreliable data eviction
  - Connection pool overhead
  - Need for robust fail-over mechanism
Deployment Overview
[Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), Apache (static content), Application 1 servers, Application 2 servers, analysis servers. Routes shown: request to Application 1; request to Application 2; Application 1 /analysis/*.]
Reliable Analysis Services
Reliable Analysis Services: Initial Design
Analysis Requests
ELB
Analysis Services: Evolved Design with Queue
Analysis Requests
ELB
ISSUES
• If an analysis server crashes, all queued jobs are lost.
• Control over jobs is difficult
  - Managing them in a dead queue
  - Redirecting the jobs to different servers can’t be done
Analysis Services: Evolved Design ― Amazon SQS
[Diagram: ELB and analysis servers; each analysis server queries Amazon SQS.]
• Analysis servers query the depth of SQS
• Accept jobs based on available capacity and execute them
• Scaling up & down based on jobs in SQS
• Can start with fewer analysis servers due to scaling
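The pull-based design can be sketched as below, with a `collections.deque` standing in for the Amazon SQS queue. The class `AnalysisServer`, the function `desired_servers`, and the slot-based notion of capacity are assumptions for illustration; the slides do not describe how capacity is measured.

```python
from collections import deque


class AnalysisServer:
    """Toy worker that pulls jobs only while it has free capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumed: max concurrent jobs on this server
        self.running = []

    def free_slots(self):
        return self.capacity - len(self.running)

    def poll(self, queue):
        """Take jobs from the shared queue until this server is full."""
        taken = 0
        while self.free_slots() > 0 and queue:
            self.running.append(queue.popleft())
            taken += 1
        return taken


def desired_servers(queue_depth, jobs_per_server, min_servers=1):
    """Scale the fleet with queue depth, never dropping below a minimum."""
    return max(min_servers, -(-queue_depth // jobs_per_server))


queue = deque(f"job-{i}" for i in range(5))  # stand-in for the SQS queue
server_a = AnalysisServer("a", capacity=2)
server_b = AnalysisServer("b", capacity=2)
server_a.poll(queue)
server_b.poll(queue)
```

Jobs that no server has room for stay in the queue rather than on a server, so a crashed server loses only what it was running, and the remaining depth is a natural scaling signal.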
Amazon SQS Design
Termination can happen while analysis is in progress
Scaling down termination policies
• Oldest instance, newest instance, oldest launch configuration, closest to next instance hour.
• Default: oldest launch configuration or closest to billing hour
Working with Amazon to enable
• Scaling down by memory usage in the servers using lifecycle hooks
• This will prevent termination of instances while analysis is happening
Reliable Cache Cluster
Cache Cluster Design
[Diagram: Gene Expression App, Genotyping App, … App N share a cache cluster/pool alongside DynamoDB and Amazon S3, across Availability Zone 1 and Availability Zone 2.]
Reliable Cache Eviction: Initial Design
Auto eviction: OFF
Time to live (TTL): 60 minutes
• Some expired objects were not evicted
• Objects were building up
• Led to out-of-memory exceptions
Reliable Cache: With CRON Jobs
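1. Collect all keys older than 7 days from each slab
# Helper commands (mc_get_slab_list, mc_get_keys_for_slab, mc_get_keys, mc_gets)
# are defined elsewhere and are not shown on the slide.
getKeys() {
    local OLDER_THAN_DATE="${1}"
    ##echo "getKeys(${OLDER_THAN_DATE})" >&2
    # List every slab, then print the keys in each slab older than the cutoff.
    mc_get_slab_list "$@" | while read -r i
    do
        mc_get_keys_for_slab -d "${OLDER_THAN_DATE}" -s "${i}"
    done
}
echo "Begin deleting all old keys from all slabs"
# Keep fetching the old keys until no values come back. Presumably this relies
# on memcached freeing an expired item when it is next accessed.
while [ "$(mc_get_keys | cut -d " " -f1 | xargs mc_gets | grep -v END | wc -l)" -gt 0 ]
do
    mc_get_keys | cut -d " " -f1 | xargs mc_gets | grep -v END | wc -l
done
echo "Done with deleting all old keys from all slabs"
2. Delete the objects that are expired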
Thanks to AWS Support Team
Connection Pooling: Initial Design
[Diagram: many clients each open connections to the cache cluster/pool.]
• Multiple connections are open
• The cluster has to manage the connection overhead & memory
mcrouter Connection Pooling…
[Diagram: clients connect through an mcrouter proxy.]
• Multiple clients can connect to an mcrouter and share the outgoing connections.
• Reduction in the number of open connections to cache instances.
Reliable Cache Cluster
Auto-Failover for Critical Data
Classification of Data in Cache with mcrouter
Prefix routing of data in cache: route keys according to common key prefixes
mcrouter proxy
Keys with “CRITICAL” prefix: Pool “CRITICAL” (sharded)
Contains very critical data that is not recoverable
• Interim instrument run data while uploading
• Interim analysis results data while saving
• Project lock information
Keys with “NON CRITICAL” prefix: Pool “NON-CRITICAL” (sharded)
Contains non-critical data that is recoverable from DB
• Thumbnail images of results
• Project permissions
• Default settings for analysis
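Prefix routing to sharded pools can be illustrated with the sketch below. It does not use mcrouter or its configuration format; the node names, the `PREFIX:rest-of-key` key layout, and the CRC32-modulo sharding are all assumptions chosen to keep the example self-contained.

```python
import zlib

# Hypothetical pools and node names; real pools would be cache endpoints.
POOLS = {
    "CRITICAL": ["critical-0", "critical-1"],
    "NON_CRITICAL": ["noncritical-0", "noncritical-1"],
}


def route(key):
    """Pick a pool from the key prefix, then a shard within that pool."""
    prefix, _, _ = key.partition(":")
    # Keys without a known prefix fall back to the recoverable pool.
    pool_name = prefix if prefix in POOLS else "NON_CRITICAL"
    nodes = POOLS[pool_name]
    shard = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return pool_name, nodes[shard]


lock_pool, lock_node = route("CRITICAL:project-lock:42")
thumb_pool, thumb_node = route("NON_CRITICAL:thumbnail:7")
```

Keeping the classification in the key itself means clients need no lookup table: the prefix alone decides which pool, and therefore which durability guarantees, a value gets.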
Failover for CRITICAL Data Pool with mcrouter
[Diagram: the mcrouter proxy in front of a PRIMARY CACHE that is REPLICATED to a FAILOVER CACHE; Availability Zone 1 and Availability Zone 2 are shown for each cache.]
RE-ROUTE to “failover cache” upon data miss for that request
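The failover behaviour for the critical pool can be sketched as a replicated write plus a read that falls back on a miss. This is a toy model with dictionaries standing in for the two caches; `FailoverCache` and its methods are hypothetical names, and the real routing is done by mcrouter rather than application code.

```python
class FailoverCache:
    """Toy model of a replicated pool with re-routing on a miss."""

    def __init__(self):
        self.primary = {}   # stand-in for the primary cache
        self.failover = {}  # stand-in for the failover cache

    def set(self, key, value):
        """Write to both caches so the failover copy stays current."""
        self.primary[key] = value
        self.failover[key] = value

    def get(self, key):
        """Read from the primary; on a miss, re-route the request to the failover."""
        if key in self.primary:
            return self.primary[key]
        return self.failover.get(key)


cache = FailoverCache()
cache.set("CRITICAL:project-lock:42", "user-a")
cache.primary.clear()  # simulate losing the primary cache node
recovered = cache.get("CRITICAL:project-lock:42")
```

The extra write on every `set` is the price paid for data that cannot be rebuilt from the database, which is why only the critical pool is treated this way.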
Robust, Reliable, Scalable
ARCHITECTURE
BEYOND ANALYSIS: CLOUD-CONNECTED ECOSYSTEM
FOR GENETIC DISCOVERY
IoT in Life Sciences
Connected things : Instruments
Connected things : Products (RFID,NFC etc.)
Insightful analytics for faster discovery
Operational efficiency
Enhanced customer experience
Source: Cisco, http://blogs.cisco.com/diversity/the-internet-of-things-infographic
QuantStudio 3 & 5: Cloud-Connected Platform
[Diagram: connected laptop with QuantStudio® 3 & 5 Data Collection Software; LAN; Wi-Fi; QuantStudio 3 & 5 Monitoring App.]
• Connect to the genetic analysis cloud software using any device
• Download instrument run files
• Download protocols
• Upload run files
Deep Dive: IoT Architecture
[Diagram components: instruments within an organization, proxy (gateway), end points for all instruments (Instrument Connect, Instrument Details), Data Manager, Amazon S3, Amazon RDS. Connections use https.]
Operations
• REGISTER INSTRUMENTS
• INSTRUMENT DETAILS
• PULL DATA (protocols)
• PUSH DATA (run)
Data
• Instrument type data, e.g., online manuals
• Instrument data: Protocol, Sample, Gene, Results
Press Coverage
Chris Linthwaite,
President, Genetic Sciences
Thermo Fisher Scientific.
businesswire.com
American Association of Cancer Research
QuantStudio 3 & 5 Platform
one of top 7 new technologies
for cancer research @AACR
CE instruments connect to CDC
genomeweb.com
NEXT STEPS
GENOMIC ANALYSIS
Enhanced workflows & results through predictive analysis & machine learning
Example: Improve diversity of training data set to identify mutations, active genes,
etc.
Trending on cumulative data for a user
Example: Use for better quality control, operator error, etc.
Internet of Things with connected instruments
Example: Predictive analysis of calibration issues, machine health, etc.
Predictive Analysis/Deep Learning …
Examples:
• Connect and collaborate with scientists to share experiment info
• Share or crowd-source troubleshooting among users
• Make your research & publications visible
• Donate your genomic data to public genetic databases
Empower Scientist
Social Connection
Open Ecosystem
SaaS Application Platform
• Open our ecosystem for third-party application developers
• Support analysis of data by different industry platforms and
standardized file format
IMPACTING HUMAN LIVES
Kala-azar, a.k.a. Visceral Leishmaniasis or Black Fever
• Infectious disease caused by protozoan parasites of the Leishmania genus (two strains: L. donovani, L. infantum)
• Spread by sand flies
• Symptoms: skin ulcers and swelling of the liver and spleen
• Second-largest parasitic killer in the world (after malaria)
• Annual occurrence rate (WHO): 300,000 new cases/year in Sudan, Bangladesh, India, and Nepal
• Fatality: 90% if untreated
• 30,000 deaths occur annually
Customer Success Story: Connected Ecosystem
Dr. Eisei Noiri
Associate professor
University of Tokyo Hospital
Collaborators
Bangladesh
Developed assays to identify different species of Leishmania for diagnosis
Collaborates in Bangladesh to run patient samples
Defined experiment run protocols & shared
protocols for running instrument
Collaborators can share
analysis results with Dr. Noiri
QuantStudio 5
Real-time PCR
Connected Ecosystem: Driving Adoption
[Diagram labels: outbreak info; all instruments around the world; upload.]
Thank you!
Remember to complete
your evaluations!

More Related Content

What's hot

Building Big Data Applications with Serverless Architectures - June 2017 AWS...
Building Big Data Applications with Serverless Architectures -  June 2017 AWS...Building Big Data Applications with Serverless Architectures -  June 2017 AWS...
Building Big Data Applications with Serverless Architectures - June 2017 AWS...Amazon Web Services
 
Amazon Relational Database Service Deep Dive
Amazon Relational Database Service Deep DiveAmazon Relational Database Service Deep Dive
Amazon Relational Database Service Deep DiveAmazon Web Services
 
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)Amazon Web Services
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsAmazon Web Services
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...Amazon Web Services
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)Amazon Web Services
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Amazon Web Services
 
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...Amazon Web Services
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivAmazon Web Services
 
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)Amazon Web Services
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...Amazon Web Services
 
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...Amazon Web Services
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierAmazon Web Services
 
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...Amazon Web Services
 

What's hot (20)

Building Big Data Applications with Serverless Architectures - June 2017 AWS...
Building Big Data Applications with Serverless Architectures -  June 2017 AWS...Building Big Data Applications with Serverless Architectures -  June 2017 AWS...
Building Big Data Applications with Serverless Architectures - June 2017 AWS...
 
Amazon Relational Database Service Deep Dive
Amazon Relational Database Service Deep DiveAmazon Relational Database Service Deep Dive
Amazon Relational Database Service Deep Dive
 
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
 
Accelerating DynamoDB with DAX
Accelerating DynamoDB with DAX Accelerating DynamoDB with DAX
Accelerating DynamoDB with DAX
 
Create cloud service on AWS
Create cloud service on AWSCreate cloud service on AWS
Create cloud service on AWS
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWS
 
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
 
Amazon Kinesis
Amazon KinesisAmazon Kinesis
Amazon Kinesis
 
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...
AWS re:Invent 2016: Case Study: How Startups Like Smartsheet and Quantcast Ac...
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
 
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
 
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
 
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...
Migrate from Oracle to Amazon Aurora using AWS Schema Conversion Tool & AWS D...
 

Viewers also liked

(HLS304) Building a Secure and Scalable Healthcare Platform | AWS re:Invent 2014
(HLS304) Building a Secure and Scalable Healthcare Platform | AWS re:Invent 2014(HLS304) Building a Secure and Scalable Healthcare Platform | AWS re:Invent 2014
(HLS304) Building a Secure and Scalable Healthcare Platform | AWS re:Invent 2014Amazon Web Services
 
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...Amazon Web Services
 
London is the Capital of the Fintech Industry: Top 7 Reasons
London is the Capital of the Fintech Industry: Top 7 ReasonsLondon is the Capital of the Fintech Industry: Top 7 Reasons
London is the Capital of the Fintech Industry: Top 7 ReasonsThe Pathway Group
 
Funding for New Product Development by TCI Pathway Ltd
Funding for New Product Development by TCI Pathway LtdFunding for New Product Development by TCI Pathway Ltd
Funding for New Product Development by TCI Pathway LtdThe Pathway Group
 
Հին Հայաստանի բանակը
Հին Հայաստանի բանակըՀին Հայաստանի բանակը
Հին Հայաստանի բանակըgexarvest
 
Countable and uncountable nouns
Countable and uncountable nounsCountable and uncountable nouns
Countable and uncountable nounsAziz Al Kalim
 
Editing Session #1
Editing Session #1Editing Session #1
Editing Session #1AJV2000
 
Glosarium Card Teks Debat, Riska -yanis x mm3 vocsten Malang
Glosarium Card Teks Debat, Riska -yanis x mm3 vocsten MalangGlosarium Card Teks Debat, Riska -yanis x mm3 vocsten Malang
Glosarium Card Teks Debat, Riska -yanis x mm3 vocsten MalangNuril anwar
 
王道ダイエットで痩せる話 #デブナイト
王道ダイエットで痩せる話 #デブナイト王道ダイエットで痩せる話 #デブナイト
王道ダイエットで痩せる話 #デブナイトTakashi Abe
 
Fizika
FizikaFizika
Fizikaganyan
 
Presentación Proyecto: Mapfre
Presentación Proyecto: MapfrePresentación Proyecto: Mapfre
Presentación Proyecto: MapfreJorge Herrero
 
Cupa pizarra
Cupa pizarraCupa pizarra
Cupa pizarraLuisaceo
 
බුද්ධිමත් බවේ ලක්ෂණ
බුද්ධිමත් බවේ  ලක්ෂණබුද්ධිමත් බවේ  ලක්ෂණ
බුද්ධිමත් බවේ ලක්ෂණAurora Computer Studies
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Puneet Suri, Thermo Fisher Scientific Shakila Pothini, Thermo Fisher Scientific October 2015 Decoding the Genetic Blueprint of Life on a Cloud-Connected Ecosystem ARC311
  • 2. About Me: Puneet Suri, Senior Director, Software Engineering, Life Sciences Group, Thermo Fisher Scientific. Follow at: @psuri; connect at: puneet.suri@thermofisher.com. Envisioned and developed the life sciences cloud platform for Thermo Fisher Scientific.
  • 3. 4.5 hours of TV 11 hours of online reading 5 hours of radio 2 hours of computer 5 hours of handheld 1.13 hours of phone
  • 4. Earth to Moon: 6000 times; Earth to Sun: 30 times; DNA from all cells; 37.2 trillion cells in our body; DNA in a single cell: 6 feet long; Human genome: 200 gigabytes; Population of 7 billion: 1.4 zettabytes; Basepairs in DNA: 3.2 billion; 2 bytes/basepair; ATGC
  • 7. Our vision: We enable our customers to make the world…
  • 8. Having an Impact… Freeing the innocent: a person was set free after 35 years in prison because of a DNA test. H1N1 pandemic; Ebola. CDC: Swine flu viruses in U.S. and Mexico match. 7500 Instrument
  • 9. What to Expect from the Session • Refresh / overview of our architecture • Evolution of the architecture • Connected ecosystem • Future possibilities
  • 10. A Day with the Scientist Get Insights  a project * * * *
  • 11. Insights • What is causing cancer? • Is the disease hereditary? • What drugs will work? • Is therapy working?
  • 12. Data Model: Project. Entities: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions). Data sizes shown: MBs to GBs. Step shown: Parse.
  • 13. A Few Years Back
  • 14. Our offerings: desktop apps; multiple tools. Pain points: no backup, archive, or security; limited analysis; limited collaboration; slow analysis; unsupported workflows; Excel nightmare
  • 15. Customer needs… Real-time analysis; highly interactive visualizations; 2-3 second response time; easy to share data & collaborate; store & manage large scientific data sets
  • 16. A Better Way Is to Provide…  STORAGE  COMPUTE  SCALABILITY  MEMORY
  • 18. Our Journey Enabling Complex Customer Workflows
  • 19. Dimensions of Complexity for a project: storage, compute, performance, scalability. Millions of records; 1000s of users and projects; real-time analysis of large data sets; 2-3 second response time
  • 20. Custom Storage Engine To meet the challenges of storage, compute, performance & scalability, a custom storage and query engine had to be designed and implemented
  • 21. Design Goals for Our Custom Storage Engine • High availability (HA) out of the box: no configuration needed to turn on HA • Completely decentralized, so no single point of failure (SPOF) • Managed scalability, such that no admin input is required to scale the storage engine as data volume and concurrent requests go up • Manageable total cost of ownership (TCO)
  • 22. Custom Storage Engine Design • Amazon S3: redundant storage for big data sets • DynamoDB: storing indexes and small objects • ElastiCache: in-memory caching for faster access using index-based retrieval
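
The three-tier split on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: plain dictionaries stand in for Amazon S3, DynamoDB, and ElastiCache, and the `TieredStore` class, its method names, and the `SMALL_OBJECT_LIMIT` threshold are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed cut-off between "small objects" and "big data sets" (hypothetical).
SMALL_OBJECT_LIMIT = 400 * 1024  # bytes


@dataclass
class TieredStore:
    """Toy storage engine: dicts stand in for the three AWS services."""
    blob_store: dict = field(default_factory=dict)   # stand-in for Amazon S3
    index_store: dict = field(default_factory=dict)  # stand-in for DynamoDB
    cache: dict = field(default_factory=dict)        # stand-in for ElastiCache

    def put(self, key: str, payload: bytes) -> str:
        """Store small payloads inline in the index tier, large ones as blobs."""
        if len(payload) <= SMALL_OBJECT_LIMIT:
            self.index_store[key] = {"tier": "index", "data": payload}
            return "index"
        blob_key = f"blobs/{key}"
        self.blob_store[blob_key] = payload
        # The index tier keeps only a pointer to the large object.
        self.index_store[key] = {"tier": "blob", "pointer": blob_key}
        return "blob"

    def get(self, key: str) -> bytes:
        """Serve from the cache when possible, otherwise resolve via the index."""
        if key in self.cache:
            return self.cache[key]
        entry = self.index_store[key]
        if entry["tier"] == "index":
            data = entry["data"]
        else:
            data = self.blob_store[entry["pointer"]]
        self.cache[key] = data  # populate the in-memory tier for later reads
        return data
```

The point of the sketch is only the routing decision: every object is reachable through the index tier, while the bulky payloads live in the blob tier and hot data is served from memory.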
  • 23. Our Iterative Journey & Challenges: 0. Start with reference architecture; 1. Identify scalable storage solution; 2. Identify scalable storage solution for large data items; 3. Identify solutions for real-time response & queries; 4. Identify solutions for real-time analysis of data
  • 24. NoSQL (DynamoDB) • Managed scalability • Near-zero administration overhead • Query performance not impacted by table size; can add billions of rows • Key-value store allows for flexibility, so new domains can be supported • Read/write performance on the order of tens of milliseconds
  • 25. Architecture with DynamoDB (diagram): user client, Internet, DNS routing (Amazon Route 53), CDN (Amazon CloudFront), load balancers, auto-scaling web servers and app servers (A, B), Amazon DynamoDB
  • 26. Iteration 1 (diagram). Entities: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Analysis Results (millions), Raw Signals (millions); data sizes from MBs to GBs. Labels: Parse, Get Insights. Scorecard: Storage Query Performance ✔ ✔ Cost ✔ ✔
  • 27. What Were the Gaps? • Item size: our items (e.g., Instrument Run) can exceed 400 KB, which is over the 400 KB size limit. • Hot hash key: adding thousands of related records (e.g., Raw Signals) with a common hash key (e.g., Instrument Run) can be slow (tens of seconds). • Cost: a large project needed high read/write capacity (1000s), which increased cost. • Batch size limit of 25: a large project can have ~1 million records (e.g., Raw Signals) that need to be read and written at the same time.
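
To make the batch-size gap concrete, the arithmetic can be sketched as below. The limit of 25 and the ~1 million record figure come from the slide; the helper names `batches` and `batch_calls_needed` are hypothetical and no AWS API is called.

```python
import math
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

MAX_BATCH_ITEMS = 25  # batch size limit quoted on the slide


def batches(records: Iterable[T], size: int = MAX_BATCH_ITEMS) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def batch_calls_needed(record_count: int, size: int = MAX_BATCH_ITEMS) -> int:
    """Number of batch requests required to move `record_count` records."""
    return math.ceil(record_count / size)


# A project with ~1 million Raw Signals records needs 40,000 batch requests.
print(batch_calls_needed(1_000_000))  # 40000
```

At that request count, even a few milliseconds per call adds up to minutes of sequential work, which is one way to read why the slide lists the batch limit as a gap for large projects.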
  • 28. What We Needed… …was a solution that • can store a huge number of related objects • is cost effective for reading/writing large data sets • has no limitations on batch size or item size • has ability to query into the large number of records
  • 29. Our Iterative Journey & Challenges 0 Start with reference architecture 1 Identify scalable storage solution: DynamoDB 2 Identify scalable storage solution for large data items 3 Identify solutions for real-time response & queries 4 Identify solutions for real-time analysis of data
  • 30. Architecture: DynamoDB & Amazon S3 [Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3]
  • 31. Iteration 2 [Data-model diagram: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Raw Signals (millions), Analysis Results (millions); sizes from MBs to GBs; get insights] Scorecard: Performance: storage ✔, query ✔; Cost: storage ✔, query ✔
  • 32. Real-time Queries for Complex Visualizations
  • 33. Our Iterative Journey & Challenges 0 Start with reference architecture 1 Identify scalable storage solution: DynamoDB 2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3 3 Identify solutions for fast, real-time response & queries 4 Identify solutions for real-time analysis of data
  • 34. Distributed In-Memory Storage: ElastiCache • Queries have to complete on the order of milliseconds, so reading from & writing to Amazon S3 was not feasible for interactive use cases. • ElastiCache was used as in-memory storage on top of DynamoDB & Amazon S3. • All related serialized objects in Amazon S3 that are part of a query access pattern are maintained in ElastiCache as individual records. • Non-clustered indexes were created in DynamoDB based on the query pattern so that data could be efficiently retrieved from ElastiCache.
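The layering described on this slide can be sketched as a read-through cache in front of an object store, with a separate index that maps a parent record to its object keys. The sketch below uses plain dictionaries as stand-ins for Amazon S3, the DynamoDB index, and ElastiCache; the class and method names are hypothetical and only illustrate the access pattern, not the actual storage engine.

```python
from typing import Dict, List


class TieredStore:
    """In-memory stand-ins for the three tiers named on the slide."""

    def __init__(self) -> None:
        self.s3: Dict[str, bytes] = {}          # stand-in for Amazon S3 objects
        self.index: Dict[str, List[str]] = {}   # stand-in for a DynamoDB index
        self.cache: Dict[str, bytes] = {}       # stand-in for ElastiCache
        self.s3_reads = 0                       # counts slow-path reads

    def put(self, parent_id: str, object_key: str, payload: bytes) -> None:
        """Store the payload and register it under its parent in the index."""
        self.s3[object_key] = payload
        self.index.setdefault(parent_id, []).append(object_key)

    def get_object(self, object_key: str) -> bytes:
        """Serve from the cache; fall back to the object store on a miss."""
        cached = self.cache.get(object_key)
        if cached is not None:
            return cached
        self.s3_reads += 1
        payload = self.s3[object_key]
        self.cache[object_key] = payload
        return payload

    def query(self, parent_id: str) -> List[bytes]:
        """Resolve keys through the index, then fetch each object."""
        return [self.get_object(key) for key in self.index.get(parent_id, [])]
```

In this toy model the second identical query touches only the cache, which is the property the slide is after for interactive visualizations.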
  • 35. Architecture: DynamoDB, Amazon S3 & ElastiCache [Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3, ElastiCache]
  • 36. Iteration 3 [Data-model diagram: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Raw Signals (millions), Analysis Results (millions); sizes from MBs to GBs; non-clustered indexes; get insights] Scorecard: Performance: storage ✔, query ✔; Cost: storage ✔, query ✔
  • 37. Need for Real-time Data Analysis • Analyze huge projects containing thousands of patient samples in minutes instead of days • A scalable solution is required to support analysis requests from thousands of users • Existing desktop algorithms used for this analysis are not optimized for extracting parallelism in data
  • 38. Analysis Solutions on Desktop [Chart: analysis time in minutes (y-axis, 0 to 350) vs. # of records (x-axis: 90,000; 180,000; 270,000; 360,000; 450,000; 675,000; 900,000); data labels: 8, 20, 40, 80, 120, 200, 320; annotation: "Desktop Crash"]
  • 39. Our Iterative Journey & Challenges 0 Start with reference architecture 1 Identify scalable storage solution: DynamoDB 2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3 3 Identify solutions for fast, real-time response & queries: DynamoDB + Amazon S3 + ElastiCache 4 Identify solutions for real-time analysis of data
  • 40. Amazon EMR • Amazon EMR was used to perform real-time analysis of huge data sets (analysis results) in minutes instead of days. • Small jobs are analyzed in memory, while big ones are sent to Amazon EMR. • Existing algorithms were overhauled to exploit massive parallelism using the Hadoop MapReduce framework. • As the large data sets were already in Amazon S3, Amazon S3 was used for input and output instead of HDFS; only intermediate MapReduce data is kept in HDFS. • The Amazon EMR cluster is created on demand and shut down when done.
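The two ideas on this slide, routing by job size and restructuring an algorithm into map and reduce steps, can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the threshold value, the `(gene, value)` record shape, and the per-gene mean are not from the talk, and the code runs in a single process rather than on Hadoop.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, float]      # hypothetical shape: (gene, measured value)
HUGE_JOB_THRESHOLD = 100_000    # hypothetical cut-off, not from the talk


def choose_backend(record_count: int) -> str:
    """Small jobs stay in memory; huge ones go to the cluster."""
    return "emr" if record_count >= HUGE_JOB_THRESHOLD else "in-memory"


def map_phase(records: Iterable[Record]) -> Iterator[Record]:
    """Emit one (key, value) pair per record; each record is independent."""
    for gene, value in records:
        yield gene, value


def shuffle(pairs: Iterable[Record]) -> Dict[str, List[float]]:
    """Group values by key, as the framework would between map and reduce."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for gene, value in pairs:
        groups[gene].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """Collapse each group to one result; groups can be reduced in parallel."""
    return {gene: mean(values) for gene, values in groups.items()}
```

Because the map step handles records independently and the reduce step handles keys independently, both steps can be spread across many nodes, which is the parallelism the slide refers to.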
  • 41. Architecture: Amazon EMR for Real-time Analysis [Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), load balancers, auto-scaling web servers, auto-scaling app servers, DynamoDB, Amazon S3, ElastiCache, Amazon EMR]
  • 42. Iteration 4 [Data-model diagram: Instrument Run (1000s), Patient Samples (1000s), Genes (1000s), Raw Signals (millions), Analysis Results (millions); sizes from MBs to GBs; parse; non-clustered indexes; decision "Is it a huge analysis job?": yes: Amazon EMR, no: analysis servers; get insights] Scorecard: Performance: storage ✔, query ✔, analysis ✔; Cost: storage ✔, query ✔, analysis ✔
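  • 43. Performance for a Project [Chart: analysis time in minutes (y-axis, 0 to 350) vs. # of records (x-axis: 90,000; 180,000; 270,000; 360,000; 450,000; 675,000; 900,000); series: cloud and desktop; cloud data labels: 2, 4, 7, 11, 13, 20, 30; annotations: ">10x", "Desktop Crash"]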
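The ">10x" annotation can be checked against the chart labels. The sketch below assumes that the seven data labels on slide 38 (8 to 320 minutes) are the desktop series, that the seven labels on slide 43 (2 to 30 minutes) are the cloud series, and that both line up with the seven x-axis record counts in order. That pairing is a reading of the flattened chart text, not something the slides state outright.

```python
# Chart labels as transcribed from slides 38 and 43; pairing them by
# position with the x-axis ticks is an assumption.
RECORDS = [90_000, 180_000, 270_000, 360_000, 450_000, 675_000, 900_000]
DESKTOP_MINUTES = [8, 20, 40, 80, 120, 200, 320]
CLOUD_MINUTES = [2, 4, 7, 11, 13, 20, 30]


def speedups(desktop: list, cloud: list) -> list:
    """Ratio of desktop time to cloud time at each record count."""
    return [d / c for d, c in zip(desktop, cloud)]


if __name__ == "__main__":
    for count, ratio in zip(RECORDS, speedups(DESKTOP_MINUTES, CLOUD_MINUTES)):
        print(f"{count:>7} records: {ratio:.1f}x")
```

Under that reading the ratio grows with project size, from 4x at 90,000 records to 10x or more at the two largest sizes.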
  • 44. Journey… 0 Start with reference architecture ✓ 1 Identify scalable storage solution: DynamoDB ✓ 2 Identify scalable storage solution for large data items: DynamoDB + Amazon S3 ✓ 3 Identify solutions for fast, real-time response & queries: DynamoDB + Amazon S3 + ElastiCache ✓ 4 Identify solutions for real-time analysis of data: Amazon EMR ✓
  • 46. Evolution of Our Architecture More Reliable & Scalable
  • 47. Road Blocks • Unbalanced routing of analysis requests (asynchronous calls) by Elastic Load Balancing (ELB): the routing algorithms supported by ELB are round robin and least outstanding requests; neither option is memory- or utilization-based • ElastiCache cluster: unreliable data eviction; connection pool overhead; need for a robust failover mechanism
  • 48. Deployment Overview [Diagram components: user client, Internet, Amazon Route 53 (DNS routing), Amazon CloudFront (CDN), Apache (static content), Application 1 servers, Application 2 servers, Application 2 analysis servers; labels: "Request to Application 1", "Request to Application 2", "Application 1 /analysis/*"]
  • 50. Reliable Analysis Services: Initial Design Analysis Requests ELB
  • 51. Analysis Services: Evolved Design with Queue [Diagram: analysis requests, ELB] Issues: • If an analysis server crashes, all queued jobs are lost. • Control over jobs is difficult: managing them in a dead queue is hard, and jobs can't be redirected to different servers.
  • 52. Analysis Services: Evolved Design with Amazon SQS [Diagram: ELB; analysis servers query the queue] • Analysis servers query the depth of SQS • Accept jobs based on available capacity and execute them • Scaling up & down based on jobs in SQS • Can start with fewer analysis servers due to scaling
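The pull model on this slide can be sketched with an in-process queue standing in for Amazon SQS: each server takes only as many jobs as it has free slots, and the fleet size follows the queue depth. The `AnalysisServer` class, the slot-based capacity model, and the `desired_servers` rule are hypothetical simplifications, not the team's implementation.

```python
import math
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class AnalysisServer:
    name: str
    capacity: int                                   # max concurrent jobs (assumed model)
    running: List[str] = field(default_factory=list)

    def free_slots(self) -> int:
        return self.capacity - len(self.running)

    def poll(self, jobs: Deque[str]) -> int:
        """Pull jobs while there is spare capacity; return how many were taken."""
        accepted = 0
        while jobs and self.free_slots() > 0:
            self.running.append(jobs.popleft())
            accepted += 1
        return accepted


def desired_servers(queue_depth: int, jobs_per_server: int, minimum: int = 1) -> int:
    """Scale the fleet with the backlog, never below `minimum`."""
    return max(minimum, math.ceil(queue_depth / jobs_per_server))
```

Jobs that no server has room for simply stay in the queue, so a crashed or saturated server no longer takes a private backlog down with it, which is the contrast with the earlier per-server queue design.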
  • 53. Amazon SQS Design [Diagram: ELB; analysis servers query the queue] • Problem: termination can happen while analysis is in progress • Scale-down termination policies: oldest instance, newest instance, oldest launch configuration, closest to next instance hour (default: oldest launch configuration or closest to the billing hour) • Working with Amazon to enable scaling down by memory usage in the servers using lifecycle hooks; this will prevent termination of instances while analysis is happening
  • 55. Cache Cluster Design [Diagram components: Gene Expression App, Genotyping App, … App N; cache cluster/pool; DynamoDB; Amazon S3; Availability Zone 1 and Availability Zone 2]
  • 56. Reliable Cache Eviction: Initial Design • Settings: auto eviction OFF; time to live (TTL) 60 minutes • Some expired objects were not evicted • Objects were building up • Led to out-of-memory exceptions
  • 57. Reliable Cache: With CRON Jobs (thanks to the AWS Support Team) 1. Collect all keys older than 7 days from each slab 2. Delete the objects that are expired. Script from the slide (the mc_* helpers are not defined on the slide):
        # Print the keys older than the given date, for every slab.
        getKeys() {
          local OLDER_THAN_DATE="${1}"
          ##echo "getKeys(${OLDER_THAN_DATE})" >&2
          mc_get_slab_list "$@" | while read -r i
          do
            mc_get_keys_for_slab -d "${OLDER_THAN_DATE}" -s "${i}"
          done
        }
        echo "Begin deleting all old keys from all slabs"
        # Repeat until no old keys are left; each pass prints the remaining count.
        while (( $(mc_get_keys | cut -d " " -f1 | xargs mc_gets | grep -v END | wc -l) > 0 ))
        do
          mc_get_keys | cut -d " " -f1 | xargs mc_gets | grep -v END | wc -l
        done
        echo "Done with deleting all old keys from all slabs"
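The build-up described on slide 56 and the scheduled clean-up on slide 57 can be modelled in Python with a toy cache in which expired entries are reclaimed only when they are read again. This is a simplified model written for illustration, under the assumption that expiry was being applied lazily; it is not a description of how ElastiCache is implemented, and `TtlCache` and `sweep` are hypothetical names. Time is passed in explicitly so the behaviour is deterministic.

```python
from typing import Dict, List, Optional, Tuple


class TtlCache:
    """Toy cache: entries expire lazily, so unread entries keep using memory."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, float]] = {}

    def set(self, key: str, value: bytes, ttl_seconds: float, now: float) -> None:
        self._items[key] = (value, now + ttl_seconds)

    def get(self, key: str, now: float) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._items[key]    # reclaimed only because it was accessed
            return None
        return value

    def sweep(self, now: float) -> List[str]:
        """Scheduled clean-up: delete every expired entry, read or not."""
        expired = [k for k, (_, expires_at) in self._items.items() if now >= expires_at]
        for key in expired:
            del self._items[key]
        return expired

    def __len__(self) -> int:
        return len(self._items)
```

Running `sweep` from a scheduler plays the role of the cron job: it bounds memory use even for keys that are never requested again.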
  • 58. Connection Pooling: Initial Design [Diagram: many clients connect directly to the cache cluster/pool] • Multiple connections are open • The cluster has to manage the connection overhead & memory
  • 59. mcrouter Connection Pooling… [Diagram: clients connect to the mcrouter proxy] • Multiple clients can connect to one mcrouter and share its outgoing connections. • This reduces the number of open connections to the cache instances.
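A back-of-the-envelope calculation shows why a shared proxy reduces the load on the cache nodes. The sketch assumes a simple model in which every client (or every proxy) keeps a fixed number of connections to every cache node; the function names and the example numbers are illustrative only.

```python
def direct_connections(clients: int, cache_nodes: int, per_pair: int = 1) -> int:
    """Connections seen by the cache tier when every client connects directly."""
    return clients * cache_nodes * per_pair


def proxied_connections(proxies: int, cache_nodes: int, per_pair: int = 1) -> int:
    """Connections seen by the cache tier when clients share a few proxies."""
    return proxies * cache_nodes * per_pair
```

For example, 200 clients talking to 10 cache nodes hold 2,000 connections in this model, while the same clients behind 4 proxies leave the cache tier with 40.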
  • 61. Classification of Data in Cache with mcrouter • Prefix routing of data in cache: the mcrouter proxy routes keys according to common key prefixes • Keys with “CRITICAL” prefix go to pool “CRITICAL” (sharded), which contains very critical data that is not recoverable: interim instrument run data while uploading; interim analysis results data while saving; project lock information • Keys with “NON-CRITICAL” prefix go to pool “NON-CRITICAL” (sharded), which contains non-critical data that is recoverable from the DB: thumbnail images of results; project permissions; default settings for analysis
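Prefix routing into sharded pools can be sketched as two steps: pick the pool from the key prefix, then pick a node inside the pool from a stable hash of the key. The prefixes, node names, and the MD5-modulo sharding below are assumptions chosen for illustration; mcrouter has its own configuration format and hashing, which this sketch does not reproduce.

```python
import hashlib
from typing import Dict, List

# Hypothetical pools and key prefixes.
POOLS: Dict[str, List[str]] = {
    "critical:": ["critical-0", "critical-1", "critical-2"],
    "noncritical:": ["noncritical-0", "noncritical-1"],
}


def pick_pool(key: str) -> List[str]:
    """Select the pool whose prefix the key starts with."""
    for prefix, nodes in POOLS.items():
        if key.startswith(prefix):
            return nodes
    raise ValueError(f"no pool configured for key {key!r}")


def pick_shard(key: str, nodes: List[str]) -> str:
    """Map the key to one node with a hash that is stable across processes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def route(key: str) -> str:
    return pick_shard(key, pick_pool(key))
```

A call such as `route("critical:project-lock:42")` always lands on the same node of the critical pool, so recoverable and unrecoverable data never share a node and can be given different failover treatment.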
  • 62. Failover for CRITICAL Data Pool with mcrouter [Diagram: mcrouter proxy; primary cache and failover cache, each spanning Availability Zone 1 and Availability Zone 2; data is replicated] • Upon a data miss, the request is re-routed to the “failover cache”
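The miss-triggered failover on this slide can be sketched with two dictionaries standing in for the primary and failover caches. Writing to both caches on every `set` is one simple way to model "replicated"; the slide does not say how replication is actually done, so treat that and the `FailoverCache` name as assumptions.

```python
from typing import Dict, Optional


class FailoverCache:
    """Two in-memory stand-ins: a primary cache and a replicated failover cache."""

    def __init__(self) -> None:
        self.primary: Dict[str, bytes] = {}
        self.failover: Dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        # Replicate every write so the failover copy can answer later misses.
        self.primary[key] = value
        self.failover[key] = value

    def get(self, key: str) -> Optional[bytes]:
        value = self.primary.get(key)
        if value is None:
            # Miss on the primary: re-route this request to the failover cache.
            value = self.failover.get(key)
        return value
```

If the primary loses its contents, for example after a node restart, reads of critical keys still succeed from the failover copy.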
  • 64. BEYOND ANALYSIS: CLOUD-CONNECTED ECOSYSTEM FOR GENETIC DISCOVERY
  • 65. IoT in Life Sciences • Connected things: instruments • Connected things: products (RFID, NFC, etc.) • Insightful analytics for faster discovery • Operational efficiency • Enhanced customer experience • Source: Cisco, http://blogs.cisco.com/diversity/the-internet-of-things-infographic
  • 66. QuantStudio 3 & 5: Cloud-Connected Platform • Connected laptop with QuantStudio® 3 & 5 Data Collection Software (LAN, Wi-Fi) • Connect to the genetic analysis cloud software using any device • Download instrument run files • Download protocols • Upload run files
  • 68. Deep Dive: IoT Architecture [Diagram components: instruments within an organization; proxy (gateway); HTTPS endpoints for all instruments: Instrument Connect, Instrument Details, Data Manager; Amazon S3; Amazon RDS. Data labels: instrument type data (e.g., online manuals); instrument data (protocol, sample, gene, results). Operations: register instruments, instrument details, pull data (protocols), push data (run)]
  • 69. Press Coverage • Chris Linthwaite, President, Genetic Sciences, Thermo Fisher Scientific (businesswire.com) • American Association for Cancer Research: QuantStudio 3 & 5 Platform one of top 7 new technologies for cancer research @AACR • CE instruments connect to CDC (genomeweb.com)
  • 71. Predictive Analysis/Deep Learning … • Enhanced workflows & results through predictive analysis & machine learning. Example: improve diversity of training data set to identify mutations, active genes, etc. • Trending on cumulative data for a user. Example: use for better quality control, operator error, etc. • Internet of Things with connected instruments. Example: predictive analysis of calibration issues, machine health, etc.
  • 72. Empower Scientist Social Connection Examples: • Connect and collaborate with scientists to share experiment info • Share or crowd-source troubleshooting among users • Make your research & publications visible • Donate your genomic data to public genetic databases
  • 73. Open Ecosystem SaaS Application Platform • Open our ecosystem to third-party application developers • Support analysis of data by different industry platforms and standardized file formats
  • 75. Kala-azar, a.k.a. Visceral Leishmaniasis or Black Fever • Infectious disease caused by protozoan parasites of the Leishmania genus (two species: L. donovani, L. infantum) • Symptoms: skin ulcers and swelling of the liver and spleen • Second-largest parasitic killer in the world (after malaria); spread by sand flies • Annual occurrence rate (WHO): 300,000 new cases/year in Sudan, Bangladesh, India, and Nepal • Fatality: 90% if untreated; 30,000 deaths occur annually
  • 76. Customer Success Story: Connected Ecosystem • Dr. Eisei Noiri, Associate Professor, University of Tokyo Hospital; collaborators in Bangladesh • Developed assays to identify different species of Leishmania for diagnosis • Collaborates in Bangladesh to run patient samples • Defined experiment run protocols & shared protocols for running the instrument • Collaborators can share analysis results with Dr. Noiri • Instrument: QuantStudio 5 Real-Time PCR
  • 77. Connected Ecosystem: Driving Adoption [Diagram: all instruments around the world upload outbreak info]