GENALICE MAP
THE ANSWER TO THE NGS DATA ANALYSIS
CHALLENGES IN AGRICULTURAL BIOTECHNOLOGY
1
Today’s journey
▲  Brief company introduction
▲  Specific challenges in plant genomics
▲  Product characteristics
▲  The quality of speed
▲  Introduction to the Population Calling Module
▲  Deployment options
2
COMPANY INTRODUCTION
3
GENALICE organization
4
Belfast, NI
Harderwijk, NL
5
The mission
World food challenge
Further under pressure if complex DNA diseases become chronic
6
2012
7 billion people
20 Gigacalories
2050
9 billion people
40 Gigacalories
GENALICE product pipeline
GENALICE LINK
GENALICE MAP
7
STORAGE
NEXT-GENERATION SEQUENCING
SECONDARY ANALYSIS
DOWNSTREAM ANALYSIS
HiSeq 2000
Standard data processing and analysis approach
to increase speed and shorten analysis time
8
Key ingredients of our high-speed data analysis capabilities
▲  Smart new algorithms – optimally making use of the modern hardware architecture
▲  Hidden resource – using the full potential of the hardware
▲  Bare essence – footprint and data stream reduction
9
GENALICE MAP: an NGS data analysis suite
Faster – smaller – better – at lower cost
10
Secondary Analysis
Key components and description:
▲  Alignment: mapping individual reads on the right position
▲  Variant calling: statistical correction of the identified variants
Additional functionality:
▲  Population Calling
▲  Structural Variants Analysis
▲  RNA-Seq mapping & quantification
SPECIFIC CHALLENGES IN PLANT GENOMICS
11
Plant Genomics
Specific challenges
A very special breed
QUALITY
COMPLEXITY
▲  Polyploidy
▲  Repetitive areas
▲  Species diversity
REFERENCE
▲  None available
▲  Poor quality
▲  Coverage issues
SIZE (150 GB)
▲  Storage costs
▲  Resource requirements
▲  Time consuming
12
PRODUCT CHARACTERISTICS
13
Product characteristics
▲  Quality
▲  Speed
▲  Footprint reduction
▲  Ease-of-use
▲  Observational
▲  Full-control
14
Extensive quality validations
▲  Gold Standard (Array data)
▲  Silver Standard (GCAT – Genome in a Bottle data)
▲  True standard (Customer data)
15
Erasmus University Medical Centre
Gold standard
16
GIAB – Head to head against widely used workflows
Silver standard
[Figure: bar charts of Sensitivity and Precision Rate (axes 0% to 100%) in two panels, GIAB 150x exome data and GIAB 30x exome data]
Workflows compared: BWA-MEM - GATK HC v3, ISAAC - ISAAC v01_13_06_20, GENALICE MAP v2.3
NovoAlign - GATK HC, BWA-MEM - GATK HC v3, GENALICE MAP v2.3
Bar labels: 97.269%, 97.2561%, 99.021%, 95.431%, 98.343%, 92.595%, 95.852%, 44.851%, 85.409%, 94.990%, 95.435%, 62.769%
17
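For reference, the two metrics plotted on this slide are conventionally computed by comparing a call set against a truth set. The following is a minimal sketch of that computation, not the evaluation pipeline used for the GIAB comparison; the helper `sensitivity_and_precision`, the tuple layout, and the toy variants are all hypothetical.

```python
# Hypothetical toy example: each variant is a (chromosome, position, ref, alt) tuple.
def sensitivity_and_precision(truth, calls):
    """Return (sensitivity, precision) of a call set against a truth set.

    Assumes both sets are non-empty.
    """
    true_positives = len(truth & calls)    # called and present in the truth set
    false_negatives = len(truth - calls)   # in the truth set but not called
    false_positives = len(calls - truth)   # called but absent from the truth set
    sensitivity = true_positives / (true_positives + false_negatives)
    precision = true_positives / (true_positives + false_positives)
    return sensitivity, precision


truth_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr2", 900, "T", "C"),
}
call_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr3", 10, "A", "T"),  # a call that is not in the truth set
}

sens, prec = sensitivity_and_precision(truth_set, call_set)
print(f"sensitivity={sens:.3f} precision={prec:.3f}")
```

With these toy sets, three of the four true variants are found and three of the four calls are correct, so both metrics come out at 0.75.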
Customer validation projects in progress
True Standard
19
KeyGene
Rijk Zwaan
Many others in Agbio and Human Genomics
Product characteristics
▲  Quality
▲  Speed
▲  Footprint reduction
▲  Ease-of-use
▲  Observational
▲  Full-control
20
[Figure: Speed gain (fold changes), axis 0 to 160, versus BWA-MEM/GATK, BWA-MEM/Platypus and BWA-MEM/VarScan; bar labels 125x, 51x, 162x]
[Figure: Total runtime (minutes), logarithmic axis 1 to 10,000, BWA-MEM/GATK versus GENALICE MAP; runtimes 16:06:49 versus 00:15:42 and 16:06:49 versus 00:08:28]
Over 100x faster - from FASTQ to VCF in 8½ minutes*
using a simple general purpose server with a dual Intel Xeon E5 processor
21
*NGS data preprocessing for one full tomato genome (40x coverage)
*Whole cultivated tomato genome (40x)
Over 100x faster - from FASTQ to VCF in 8 minutes*
using a simple general purpose server with a dual Intel Xeon E5 processor
22
*Whole cultivated tomato genome (40x)
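The "over 100x" figure can be checked against the runtimes printed on the chart. This is a small sketch of that arithmetic, assuming the values are in hh:mm:ss and that 16:06:49 belongs to BWA-MEM/GATK while 00:15:42 and 00:08:28 belong to GENALICE MAP; the helper names are illustrative only.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def fold_speedup(baseline_hms, new_hms):
    """Return how many times faster the new runtime is than the baseline."""
    return to_seconds(baseline_hms) / to_seconds(new_hms)


# Runtimes as printed on the slide (pairing with the tools is an assumption).
print(round(fold_speedup("16:06:49", "00:08:28"), 1))  # 114.2
print(round(fold_speedup("16:06:49", "00:15:42"), 1))  # 61.6
```

Under these assumptions the 8½-minute run corresponds to roughly a 114-fold speed gain.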
Product characteristics
▲  Quality
▲  Speed
▲  Footprint reduction
▲  Ease-of-use
▲  Observational
▲  Full-control
23
CRAM
Product characteristics
▲  Quality
▲  Speed
▲  Footprint reduction
▲  Ease-of-use
▲  Observational
▲  Full-control
24
Whole cultivated tomato genome (40x)
Additional features GAR file
25
▲  Mate preservation (paired-end reads)
▲  Real-time re-alignment option
▲  Fast conversion to: SAM, BAM or FASTQ
▲  Direct access 3rd party software through API/plugins
All-in-one file
with real-time metrics and meta data
26
Product characteristics
▲  Quality
▲  Speed
▲  Footprint reduction
▲  Ease-of-use
▲  Observational
▲  Full-control
27
THE QUALITY OF SPEED
30
Mappability depends on the reference genome used
in four different tomato groups
▲  Lycopersicon
▲  Neolycopersicon
▲  Arcanum
▲  Eriopersicon
Reference: S. lycopersicum cv Heinz v2.40
Best
31
Sequential mapping opportunity
In order to increase the total number of mappable reads
Effect of 114-fold speed increase on tomato genome mapping:
32
[Figure: timeline in hours (0 to 7, >13 hr) with a bar for BWA/GATK]
From >13 hours to only half an hour for four different sequential references
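The idea of sequential mapping can be sketched as follows: reads that fail to map to the first reference are passed on to the next one, so each extra reference can only increase the number of mapped reads. This is a toy illustration, not GENALICE MAP's algorithm; exact substring matching stands in for a real aligner, and the reference names and sequences are made up.

```python
def sequential_mapping(reads, references):
    """Map reads against references in order; only unmapped reads move on.

    `references` is a list of (name, sequence) pairs. A read counts as mapped
    if it occurs as an exact substring, a stand-in for a real aligner.
    """
    assigned = []            # (read, reference name) pairs, in mapping order
    remaining = list(reads)  # reads not yet mapped to any reference
    for name, sequence in references:
        still_unmapped = []
        for read in remaining:
            if read in sequence:
                assigned.append((read, name))
            else:
                still_unmapped.append(read)
        remaining = still_unmapped
    return assigned, remaining


# Hypothetical toy data.
toy_references = [("ref_a", "ACGTACGT"), ("ref_b", "TTTTGGGG")]
toy_reads = ["ACGT", "TTGG", "CCCC"]

mapped, unmapped = sequential_mapping(toy_reads, toy_references)
print(mapped)
print(unmapped)
```

Because every additional reference costs a full mapping pass, this approach only becomes practical when a single pass is fast.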
Large scale projects
80 tomatoes project at KeyGene
SPEED BENEFITS
▲  Major time savings
▲  Major hardware cost savings
▲  Less hassle in planning
33
GENALICE MAP used for tomato genome mapping at KeyGene
[Figure: timeline in weeks (0 to 7) with a bar for BWA / GATK – 750 cores]
From 1.5 months running on 750 cores down to less than one day on 12 cores
ADDITIONAL BENEFITS
▲  Simplified workflow
▲  Opportunity for iterations
▲  Significant storage cost savings
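The hardware saving on this slide can be expressed in core-hours. The sketch below is a rough estimate only: it assumes 1.5 months is about 45 days and treats "less than one day" as an upper bound of exactly 24 hours, so the real reduction would be at least the value computed here.

```python
HOURS_PER_DAY = 24


def core_hours(days, cores):
    """Total core-hours consumed when `cores` cores run for `days` days."""
    return days * HOURS_PER_DAY * cores


# Assumption: 1.5 months is taken as 45 days.
baseline = core_hours(days=45, cores=750)
# Assumption: "less than one day" is bounded above by one full day.
upper_bound = core_hours(days=1, cores=12)

print(baseline)                # 810000
print(upper_bound)             # 288
print(baseline / upper_bound)  # 2812.5
```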
The Quality of Speed
Blog post – genalice.com
34
Cost-effectiveness
Number of nodes needed to process each sequencing run within 8 hours
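How such a node count can be estimated is sketched below. The chart's own inputs are not recoverable from the slide text, so this is an illustration under loud assumptions: the per-sample runtimes are taken from the tomato benchmark earlier in the deck, the run size of 8 samples is hypothetical, and the work is assumed to divide perfectly across nodes.

```python
import math


def nodes_needed(samples_per_run, seconds_per_sample, window_hours=8):
    """Nodes required to finish a run within the window.

    Assumes the total work can be spread perfectly evenly over the nodes.
    """
    total_seconds = samples_per_run * seconds_per_sample
    return math.ceil(total_seconds / (window_hours * 3600))


SAMPLES_PER_RUN = 8  # hypothetical run size, not from the slides

# Per-sample runtimes from the tomato benchmark: 16:06:49 and 00:08:28.
print(nodes_needed(SAMPLES_PER_RUN, 58009))  # 17
print(nodes_needed(SAMPLES_PER_RUN, 508))    # 1
```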
35
Cost-effectiveness
Annual electricity cost reduction (in US dollars)
36
MAP vs. BWA-MEM/GATK
‘The Green Speed’
Annual CO2 reduction (in tons)
37
MAP vs. BWA-MEM/GATK
‘The Green Speed’
Annual number of ‘saved trees’
38
MAP vs. BWA-MEM/GATK
POPULATION CALLING MODULE
39
Extraordinarily fast and linearly scalable variant detection
A speed increase of more than two orders of magnitude
47
340x
DEPLOYMENT OPTIONS
48
VAULT onsite
▲  1 or 3-Year license based on speed & functionality
▲  Annual maintenance fee of 20%
(Self) service
▲  Offline (service only) and online
▲  Payment per sample analyzed or project
VAULT on cloud (AWS)
▲  Flexible license based on speed & functionality
▲  Annual maintenance fee of 20%
Application via 3rd party platforms
▲  Payment per sample analyzed
49
GENALICE MAP deployment options
Now and in the future
Now:
Future:
Low cost
High speed
High quality

The Answer to the NGS Data Analysis Challenges in Agricultural Biotechnology - Jos Lunenberg (GENALICE)
