© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Petabytes of Data & No Servers:
Corteva Scales DNA Analysis to Meet
Increasing Business Demand
Ryan Smith
Software Development Leader –
Bioinformatics
Corteva Agrisciences
ENT218-S
Scott Warren
Cloud Architect
Sogeti USA
Where We Started
• Every 6 hours Corteva produces as much genetic data as
existed in the entire public sphere in 2008
• On-premises compute and storage demands were
becoming unsustainable
• 35-node Hadoop cluster with 2 PB of storage
• Significant increase in demand expected in the future
Data Uses
• Assembly level variation
screening
• Transformation assay
• Quality control
• Whole genome assembly
Our Applications
SNPFinder
• Whole genome alignment of short reads
• Looking for single nucleotide differences
• Surrounding area contains certain properties
Vector Quality Control (VQC)
• Synthesize a DNA fragment to create a
transgenic event
• Synthesis needs to be quality controlled
• Regulatory requirements
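As a toy illustration of what "looking for single nucleotide differences" means, the sketch below compares one short read against a reference at a known alignment position. It is not the SNPFinder pipeline: the function name `find_snps`, the `offset` parameter, and the assumption of a gap-free alignment are all hypothetical simplifications.

```python
from typing import List, Tuple


def find_snps(reference: str, read: str, offset: int = 0) -> List[Tuple[int, str, str]]:
    """Return (position, reference_base, read_base) for each mismatch.

    Assumes `read` is already aligned to `reference` starting at `offset`
    with no insertions or deletions (a deliberate simplification).
    """
    if offset < 0 or offset + len(read) > len(reference):
        raise ValueError("read does not fit inside the reference at this offset")
    snps = []
    for i, base in enumerate(read):
        ref_base = reference[offset + i]
        # "N" marks an uncalled base, so it is not counted as a difference.
        if base != ref_base and base != "N":
            snps.append((offset + i, ref_base, base))
    return snps


reference = "ACGTACGTAC"
read = "GTTCGT"  # hypothetical short read aligned at position 2
print(find_snps(reference, read, offset=2))
```

A real short-read workflow would use a dedicated aligner and variant caller; the point here is only the per-position comparison.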
Project Theseus
Guiding Principles
User Patterns
• Time Sensitive Workloads
• Small User Base
Technical
• Serverless
• Immutable Infrastructure
• Automate Everything
Short Read Sequencing
Difference in Design
• Both applications use the same input
data
• Type of processing, outputs and
technical requirements are very
different
User Interaction
SNPFinder:
• Data is transformed into a queryable state
• Analysis is done ad hoc through a user interface
or API layer
VQC:
• All processing is completed when data
enters the application
• Users view these results to inform
decision making
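The two interaction patterns above can be contrasted in a small in-memory sketch. Everything in it is hypothetical: the class names, the comma-separated record layout, and the pass/fail check are illustrative assumptions, not the applications' actual data models.

```python
from typing import Callable, Dict, List


class QueryableStore:
    """SNPFinder-style: ingest only transforms data; analysis runs at query time."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def ingest(self, raw: str) -> None:
        # Hypothetical record layout: "sample,position,base"
        sample, pos, base = raw.split(",")
        self.rows.append({"sample": sample, "pos": int(pos), "base": base})

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        return [row for row in self.rows if predicate(row)]


class PrecomputedStore:
    """VQC-style: all processing runs at ingest; users only read stored results."""

    def __init__(self, expected: str) -> None:
        self.expected = expected
        self.results: Dict[str, str] = {}

    def ingest(self, construct_id: str, observed: str) -> None:
        # Hypothetical check: does the observed sequence match the design?
        self.results[construct_id] = "pass" if observed == self.expected else "fail"

    def result(self, construct_id: str) -> str:
        return self.results[construct_id]


snps = QueryableStore()
snps.ingest("S1,4,T")
snps.ingest("S2,9,G")
print(snps.query(lambda row: row["pos"] < 5))

vqc = PrecomputedStore(expected="ACGT")
vqc.ingest("construct-1", "ACGT")
print(vqc.result("construct-1"))
```

The first pattern pays its compute cost per query, the second once per input, which is one way to read why the two applications ended up with different technical requirements.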
SNPFinder Architecture
[Architecture diagram. Services shown: Amazon API Gateway, AWS Lambda, Amazon SQS, Amazon RDS, Amazon EFS, Amazon SNS, Amazon CloudWatch (event every 2 minutes), Amazon EMR (pipeline cluster and query cluster), and Amazon S3 buckets Ngsa-data (sequencing data), Snpfinder-intermediate (intermediate data), and Snpfinder-data. Other systems shown: SNPFinder-API, Lexicon (get ID), PSR (get sequencing metadata), GAIA (reference files), GAIA Importer, PubSub, and a power user. Labeled steps: write job parameters, get job ID, start cluster, get job parameters, get reference files, write intermediate files, write output Parquet files, job tracking (Complete|Failed).]
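The job labels in the SNPFinder architecture diagram (write job parameters, get job ID, start cluster, Complete|Failed, and an event every 2 minutes) suggest a queue that is polled on a schedule. The sketch below models that flow in memory only. The `JobTracker` class, the `Queued` and `Running` states, and `fake_pipeline` are hypothetical stand-ins; no AWS service is called, and the real wiring between services is not recoverable from the slide.

```python
import itertools
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Iterator


@dataclass
class JobTracker:
    """In-memory stand-in for the job queue and the job-tracking store."""

    queue: Deque[int] = field(default_factory=deque)
    params: Dict[int, dict] = field(default_factory=dict)
    status: Dict[int, str] = field(default_factory=dict)
    _ids: Iterator[int] = field(default_factory=lambda: itertools.count(1))

    def submit(self, job_params: dict) -> int:
        """Write job parameters, enqueue the job, and return its job ID."""
        job_id = next(self._ids)
        self.params[job_id] = job_params
        self.status[job_id] = "Queued"  # assumed state, not on the slide
        self.queue.append(job_id)
        return job_id

    def poll(self, run_job: Callable[[dict], None]) -> None:
        """Handler for the periodic event: take each queued job ID and run it."""
        while self.queue:
            job_id = self.queue.popleft()
            self.status[job_id] = "Running"  # assumed state, not on the slide
            try:
                run_job(self.params[job_id])  # stands in for starting a cluster
                self.status[job_id] = "Complete"
            except Exception:
                self.status[job_id] = "Failed"


def fake_pipeline(job_params: dict) -> None:
    """Hypothetical pipeline step that fails when no sample is given."""
    if not job_params.get("sample"):
        raise ValueError("missing sample")


tracker = JobTracker()
tracker.submit({"sample": "S1"})
tracker.submit({})
tracker.poll(fake_pipeline)  # in the diagram this would fire every 2 minutes
print(tracker.status)
```

Keeping submission separate from execution means the API call can return a job ID immediately while the long-running work is picked up on the next scheduled poll.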
User Interface
VQC Architecture
Side by Side: Differences Between the Two
(Diagram in progress.) The beginning and end of the two processes are very
similar; the compute stage in the middle is very different.
EMR vs. Batch
• Both applications use the same input data
• SNPFinder has a small number of larger jobs
• EMR gave the ability to parallelize these jobs
• VQC runs a large number of much smaller workloads
• AWS Batch allowed for better scheduling, containerization and a more
natural programming paradigm
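The difference in workload shape described above can be sketched locally. A thread pool stands in for both services here, so this shows only the shape of the work (one large input split and merged, versus many small independent inputs), not how Amazon EMR or AWS Batch is actually invoked. The GC-count task and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def count_gc(chunk: str) -> int:
    """Toy unit of work: count G and C bases in a sequence."""
    return sum(base in "GC" for base in chunk)


def one_large_job(sequence: str, partitions: int) -> int:
    """Few-large-jobs shape: split one input, process parts in parallel, merge."""
    size = max(1, -(-len(sequence) // partitions))  # ceiling division
    chunks = [sequence[i:i + size] for i in range(0, len(sequence), size)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        return sum(pool.map(count_gc, chunks))


def many_small_jobs(sequences: List[str], workers: int) -> List[int]:
    """Many-small-jobs shape: each input is its own independent job."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_gc, sequences))


print(one_large_job("ACGT" * 1000, partitions=4))
print(many_small_jobs(["GG", "AT", "GCA"], workers=2))
```

In the first shape the framework has to partition the data and merge partial results; in the second, each job is self-contained, so the main concerns become scheduling and packaging, which matches the reasons the slide gives for choosing AWS Batch.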
AWS Glue
• Introduced later in the project
• Used for data cleanup
• Moves data without having to fully
reprocess it
Results
Thank you!
Ryan Smith
ryand.smith@pioneer.com
Corteva Agrisciences
Scott Warren
scott.warren@us.sogeti.com
Sogeti USA
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlkumarajju5765
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 

Recently uploaded (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

Petabytes of Data and No Servers: Corteva Scales DNA Analysis to Meet Increasing Business Demand

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Petabytes of Data & No Servers: Corteva Scales DNA Analysis to Meet Increasing Business Demand. Ryan Smith, Software Development Leader – Bioinformatics, Corteva Agrisciences. ENT218-S. Scott Warren, Cloud Architect, Sogeti USA.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. Where We Started • Every 6 hours Corteva produces as much genetic data as existed in the entire public sphere in 2008 • On-premises compute and storage demands were becoming unsustainable • 35-node Hadoop cluster with 2 PB of storage • Significant increase in future demand
  • 8. Data Uses • Assembly-level variation screening • Transformation assay • Quality control • Whole genome assembly
  • 9. Our Applications. SNPFinder • Whole genome alignment of short reads • Looking for single nucleotide differences • Surrounding area contains certain properties. Vector Quality Control (VQC) • Synthesize a DNA fragment to create a transgenic event • Synthesis needs to be quality controlled • Regulatory requirements
  • 10. Project Theseus
  • 11. Guiding Principles. User Patterns • Time-sensitive workloads • Small user base. Technical • Serverless • Immutable infrastructure • Automate everything
  • 12. Short Read Sequencing
  • 13. Difference in Design • Both applications use the same input data • Type of processing, outputs and technical requirements are very different
  • 14. User Interaction. SNPFinder • Data is transformed into a queryable state • Analysis is done ad hoc through a user interface or API layer. VQC • All processing is completed when data enters the application • Users view these results to inform decision making
  • 15. SNPFinder Architecture [diagram]. Components shown: Power user, Amazon API Gateway, AWS Lambda (SNPFinder-API), AWS Lambda, Amazon SQS, Amazon RDS, Amazon EMR (Pipeline), Amazon EMR (Query), Amazon EFS, Amazon SNS, Amazon CloudWatch (event every 2 minutes), Amazon S3 (Snpfinder-data, Ngsa-data, Snpfinder-intermediate), Lexicon, PSR, GAIA Reference Files, GAIA Importer, PubSub. Flow labels: Write job parameters; Job ID; Get Job ID; Start Cluster; Job Tracking; Job Tracking Complete|Failed; Get Job Parameters; Get ID; Get Sequencing Metadata; Get Sequencing Data; Get Reference Files; Write Intermediate Files; Write intermediate data; Write Output Parquet Files.
  • 16. User Interface
  • 17.
  • 18.
  • 19. VQC Architecture
  • 20.
  • 21. Side by side: difference between the two. Working on producing this picture: will show that the beginning and end of the processes are very similar; the middle (compute) is very different.
  • 22. EMR vs. Batch • Both applications use the same input data • SNPFinder has a small number of larger jobs • EMR gave the ability to parallelize these jobs • VQC runs a large number of much smaller workloads • AWS Batch allowed for better scheduling, containerization and a more natural programming paradigm
  • 23. AWS Glue • Introduced later in the project • Used for data cleanup • Moves data without having to fully reprocess it
  • 24. Results
  • 25. Results
  • 26. Results
  • 27. Thank you! Ryan Smith, ryand.smith@pioneer.com, Corteva Agrisciences. Scott Warren, scott.warren@us.sogeti.com, Sogeti USA.
  • 28.
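
The labels on the SNPFinder architecture slide (write job parameters, job ID, queue, an event every 2 minutes, start cluster, job tracking Complete|Failed) suggest a submit-then-poll job lifecycle. The sketch below is one possible reading of that flow, not the presenters' implementation. It runs entirely in memory and makes no AWS calls; `JobStore`, `submit_job`, `scheduled_tick` and `run_pipeline` are hypothetical names, and the assignment of steps to components is an assumption, since the slide text does not say which service performs which step.

```python
"""In-memory sketch of the job flow suggested by the slide 15 labels.
Nothing here calls AWS; every class and function name is hypothetical."""
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
import itertools


class JobState(Enum):
    QUEUED = "Queued"
    RUNNING = "Running"
    COMPLETE = "Complete"
    FAILED = "Failed"


@dataclass
class JobStore:
    """Stands in for the database holding job parameters and tracking state."""
    params: dict = field(default_factory=dict)
    state: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def write_job_parameters(self, parameters: dict) -> int:
        # Persist the parameters and hand back a fresh job ID.
        job_id = next(self._ids)
        self.params[job_id] = parameters
        self.state[job_id] = JobState.QUEUED
        return job_id


def submit_job(store: JobStore, queue: deque, parameters: dict) -> int:
    """API handler: persist the parameters, enqueue the job ID, return it."""
    job_id = store.write_job_parameters(parameters)
    queue.append(job_id)
    return job_id


def scheduled_tick(store: JobStore, queue: deque, run_pipeline):
    """Scheduled handler (the slide shows an event every 2 minutes):
    take one job ID off the queue, run it, and record the outcome."""
    if not queue:
        return None
    job_id = queue.popleft()
    store.state[job_id] = JobState.RUNNING
    try:
        run_pipeline(store.params[job_id])  # stands in for the cluster run
        store.state[job_id] = JobState.COMPLETE
    except Exception:
        store.state[job_id] = JobState.FAILED
    return job_id
```

Separating submission from the scheduled tick means the API call returns a job ID immediately, while the long-running work is picked up later and its outcome is recorded as Complete or Failed.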

Editor's Notes

  1. Corteva Introduction Video
  2. Brian Hammond
  3. Ryan: Talk in more detail on who/what Corteva is. Cover anything the video didn’t.
  4. Ryan: With advances in gene sequencing technology, on-premises compute could not keep pace with how fast genetic data is produced. A large hardware/software refresh was looming, so a change in direction was needed.
  5. Ryan: With advances in gene sequencing technology, on-premises compute could not keep pace with how fast genetic data is produced. A large hardware/software refresh was looming, so a change in direction was needed.
  6. Scott: Ship of Theseus. Suppose that the famous ship sailed by the hero Theseus in a great battle has been kept in a harbor as a museum piece. As the years go by, some of the wooden parts begin to rot and are replaced by new ones. After a century or so, all of the parts have been replaced. Is the "restored" ship still the same object as the original?
  7. Ryan
  8. Ryan – lab examples
  9. Scott
  10. Scott
  11. Scott
  12. Scott
  13. Scott
  14. Scott
  15. Ryan
  16. Ryan
  17. Ryan
  18. Ryan
  19. Ryan
  20. Ryan