What are the key challenges in plant genomics, and which efficient bioinformatics software technologies can be used to tackle them? The presentation introduces the product characteristics of GENALICE MAP, its speed and quality, the revolutionary Population Calling Module, and the deployment options. For more information, please visit http://www.genalice.com/product/genalice-map/
6. World food challenge
Further under pressure if complex DNA diseases become chronic
2012: 7 billion people, 20 Gigacalories
2050: 9 billion people, 40 Gigacalories
7. GENALICE product pipeline
GENALICE LINK
GENALICE MAP
STORAGE
NEXT-GENERATION SEQUENCING (HiSeq 2000)
SECONDARY ANALYSIS
DOWNSTREAM ANALYSIS
9. Key ingredients of our high-speed data analysis capabilities
▲ Smart new algorithms – optimally making use of the modern hardware architecture
▲ Hidden resource – using the full potential of the hardware
▲ Bare essence – footprint and data stream reduction
10. GENALICE MAP: an NGS data analysis suite
Faster – smaller – better – at lower cost
Key components (Secondary Analysis):
▲ Alignment: mapping individual reads to the right position
▲ Variant calling: statistical correction of the identified variants
Additional functionality:
▲ Population Calling
▲ Structural Variants Analysis
▲ RNA-Seq mapping & quantification
21. Over 100x faster - from FASTQ to VCF in 8½ minutes*
using a simple general purpose server with a dual Intel Xeon E5 processor
Chart: total runtime in minutes (log scale, 1 to 10,000), BWA-MEM/GATK vs. GENALICE MAP
BWA-MEM/GATK: 16:06:49, GENALICE MAP: 00:15:42
BWA-MEM/GATK: 16:06:49, GENALICE MAP: 00:08:28
*NGS data preprocessing for one full tomato genome (40x coverage)
*Whole cultivated tomato genome (40x)
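The speed-up factors behind the runtimes on this slide can be checked with a few lines of arithmetic. The sketch below assumes the figures are formatted as hours:minutes:seconds; the helper names `to_seconds` and `speedup` are illustrative and not part of any GENALICE tool.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'HH:MM:SS' runtime (assumed format) to seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline: str, candidate: str) -> float:
    """Return how many times faster the candidate runtime is than the baseline."""
    return to_seconds(baseline) / to_seconds(candidate)


# Runtimes as shown on the slide.
bwa_gatk = "16:06:49"
genalice_runs = ["00:15:42", "00:08:28"]

for run in genalice_runs:
    print(f"{bwa_gatk} vs {run}: {speedup(bwa_gatk, run):.1f}x faster")
```

The 8½ minute run works out to roughly 114 times faster than the BWA-MEM/GATK runtime, and the 15¾ minute run to roughly 62 times faster.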
22. Over 100x faster - from FASTQ to VCF in 8 minutes*
using a simple general purpose server with a dual Intel Xeon E5 processor
*Whole cultivated tomato genome (40x)
25. Additional features GAR file
▲ Mate preservation (paired-end reads)
▲ Real-time re-alignment option
▲ Fast conversion to: SAM, BAM or FASTQ
▲ Direct access 3rd party software through API/plugins
26. All-in-one file
with real-time metrics and meta data
31. Mappability depends on the reference genome used
in four different Tomato groups
Groups: Lycopersicon, Neolycopersicon, Arcanum, Eriopersicon
S. lycopersicum cv Heinz v2.40
Best
32. Sequential mapping opportunity
In order to increase the total number of mappable reads
Effect of 114-fold speed increase on tomato genome mapping:
Chart: BWA/GATK runtime, time axis in hours (0 to 7, then >13 hr)
From >13 hours to only ½ hour for four different sequential references
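The idea of sequential mapping can be illustrated with a toy sketch: reads that fail to map to the first reference are passed on to the next one, so each additional reference can only raise the number of mapped reads. This is not GENALICE MAP's algorithm; the function `map_sequentially`, the stand-in aligner `exact_match` (plain substring search), and the reference names and sequences are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def map_sequentially(
    reads: List[str],
    references: List[Tuple[str, str]],
    aligns: Callable[[str, str], bool],
) -> Tuple[Dict[int, str], List[int]]:
    """Try each reference in order; only still-unmapped reads move on.

    Returns (read index -> reference name, indices of reads left unmapped).
    """
    assigned: Dict[int, str] = {}
    remaining = list(range(len(reads)))
    for ref_name, ref_seq in references:
        still_unmapped = []
        for idx in remaining:
            if aligns(reads[idx], ref_seq):
                assigned[idx] = ref_name
            else:
                still_unmapped.append(idx)
        remaining = still_unmapped
    return assigned, remaining


def exact_match(read: str, ref_seq: str) -> bool:
    """Toy stand-in for a real aligner: exact substring search."""
    return read in ref_seq


# Hypothetical miniature references and reads.
references = [("ref_A", "ACGTACGTTTGA"), ("ref_B", "GGCATCCATAGG")]
reads = ["ACGT", "CATCC", "TTTTTT"]

assigned, unmapped = map_sequentially(reads, references, exact_match)
print(assigned)   # which reference each mapped read ended up on
print(unmapped)   # reads that matched none of the references
```

In this toy run the second read maps only because a second reference is tried. Running several references in sequence multiplies the runtime, which is why the speed of a single mapping pass matters here.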
33. Large scale projects
80 tomatoes project at KeyGene
SPEED BENEFITS
▲ Major time savings
▲ Major hardware cost savings
▲ Less hassle in planning
GENALICE MAP used for tomato genome mapping at KeyGene
Chart: BWA / GATK on 750 cores, time axis in weeks (0 to 7)
From 1.5 months running on 750 cores down to less than one day on 12 cores
ADDITIONAL BENEFITS
▲ Simplified workflow
▲ Opportunity for iterations
▲ Significant storage cost savings
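The hardware saving in the KeyGene comparison can be expressed in core-days. The sketch below assumes a 30-day month and counts "less than one day" as one full day, so the resulting ratios are lower bounds; the constant and function names are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: the slide only says "1.5 months"


def core_days(cores: int, days: float) -> float:
    """Total compute consumed, in core-days."""
    return cores * days


# BWA / GATK: 1.5 months on 750 cores (figures from the slide).
baseline_days = 1.5 * DAYS_PER_MONTH
baseline = core_days(750, baseline_days)

# GENALICE MAP: "less than one day" on 12 cores, counted as one full day.
genalice_days = 1.0
genalice = core_days(12, genalice_days)

print(f"Baseline:     {baseline:,.0f} core-days")
print(f"GENALICE MAP: at most {genalice:,.0f} core-days")
print(f"Compute reduction:    at least {baseline / genalice:,.0f}x")
print(f"Wall-clock reduction: at least {baseline_days / genalice_days:,.0f}x")
```

Under these assumptions the project drops from 33,750 core-days to at most 12, which is where the time and hardware cost savings listed above come from.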
49. GENALICE MAP deployment options
Now and in the future
VAULT onsite
▲ 1- or 3-year license based on speed & functionality
▲ Annual maintenance fee of 20%
(Self) service
▲ Offline (service only) and online
▲ Payment per sample analyzed or project
VAULT on cloud (AWS)
▲ Flexible license based on speed & functionality
▲ Annual maintenance fee of 20%
Application via 3rd party platforms
▲ Payment per sample analyzed
Labels on slide: Now / Future