Good quality control procedures are essential for sequencing facilities. The SciLifeLab National Genomics Infrastructure is an accredited facility that processes thousands of samples every month, driving us to develop high-throughput QC procedures. We use a LIMS, a bespoke web system and most recently MultiQC - a tool that I have written to summarise analysis log files and produce reports that visualise key sample metrics.
In this talk I describe how our different systems integrate and how we use MultiQC results for both project-level reporting and long-term monitoring.
Developing Reliable QC at the Swedish National Genomics Infrastructure
1. High Throughput QC
Quality Control at the Swedish National Genomics Infrastructure
Phil Ewels
@ewels
@tallphil
Bioinfo-Core Workshop
2017-07-24
ISMB 2017, Prague
3. ISO accredited facility
Library preparation
Sequencing
Bioinfo analysis
“This means that our services are subject to highly stringent quality control procedures, so that you can be sure that your data is of excellent quality.”
NGI stockholm
4. 2 x MiSeq, 5 x HiSeq 2500, 5 x HiSeq X10, NovaSeq
[Bar chart: Number of samples in 2016, by application type. Categories: RNA-Seq, WG Re-Seq, Targeted Re-Seq, Metagenomics, Others. Bar labels: 1,265; 2,580; 3,214; 8,934; 12,017. Axis range: 0 to 14,000 samples.]
1141 Gbp/day
1X Human Genome
every 4 minutes
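The "every 4 minutes" figure can be checked with a short back-of-envelope calculation. This is only a sketch: the genome size of roughly 3.1 Gbp is an assumption that does not appear on the slide.

```python
# Back-of-envelope check of the throughput figures on this slide.
GBP_PER_DAY = 1141        # sequencing output, taken from the slide
HUMAN_GENOME_GBP = 3.1    # ASSUMPTION: approximate human genome size in Gbp
MINUTES_PER_DAY = 24 * 60

# How many genomes' worth of bases (1X coverage) are produced per day
genomes_per_day = GBP_PER_DAY / HUMAN_GENOME_GBP

# Time needed to produce one genome's worth of bases
minutes_per_genome = MINUTES_PER_DAY / genomes_per_day

print(f"{genomes_per_day:.0f} genomes at 1X per day")
print(f"one 1X genome every {minutes_per_genome:.1f} minutes")
```

With this assumed genome size, the result comes out close to four minutes, in line with the slide.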
5. QC with lots of samples
Scale: Always to Occasionally
Automation
Visualisation (spot outliers)
Manual checks
Validations
Looking for trends
Good quality control at scale is essential
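As an illustration of automated outlier spotting, here is a minimal sketch that flags samples whose metric sits far from the rest of the batch, using a modified z-score based on the median absolute deviation. The talk does not say which method NGI uses; the function name, threshold and sample values below are all hypothetical.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return names of samples whose modified z-score exceeds the threshold.

    `values` maps sample name -> metric value. The modified z-score uses the
    median and the median absolute deviation (MAD), so a single extreme
    sample does not distort the baseline.
    """
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        # All samples (or most of them) are identical: nothing to flag
        return []
    return [
        name for name, v in values.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]

# HYPOTHETICAL duplication rates (%) for six samples in one project
pct_dups = {"S1": 8.1, "S2": 7.9, "S3": 8.4, "S4": 8.0, "S5": 31.2, "S6": 8.3}
outliers = flag_outliers(pct_dups)
print(outliers)
```

A check like this can run on every project automatically, leaving manual inspection for the samples it flags.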
17. Long term trends
Data available from 2253 samples, 738 reports
1. Choose a plot type: Data Histogram, Report plot, Compare Data, Data Trend
2. Select data
18. Long term trends
Data available from 2253 samples, 738 reports
1. Choose a plot type: Data Trend
2. Select samples
+ Add Filter
Metadata field: [ select value ]
Minimum / Maximum
Date: 2017.01.01 - 2017.03.01
Application Type: WGS
[ 837 samples ]
19. Long term trends
Data available from 2253 samples, 738 reports
1. Choose a plot type: Data Trend
2. Select samples: [ 837 samples ]
3. Plot data: Picard: % Dups, Qualimap: > 30X [ add data ]
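The three steps in these mock-ups (choose a plot type, select samples by metadata, plot a metric) could be sketched roughly as follows. This is not the actual implementation behind the interface shown; the record layout, field names and values are hypothetical and only mirror the date and application-type filters on the slides.

```python
from datetime import date
from statistics import mean

# HYPOTHETICAL sample records; field names are illustrative only
samples = [
    {"name": "A", "date": date(2016, 12, 20), "application": "WGS", "pct_dups": 9.5},
    {"name": "B", "date": date(2017, 1, 10), "application": "WGS", "pct_dups": 8.0},
    {"name": "C", "date": date(2017, 1, 25), "application": "RNA-Seq", "pct_dups": 40.0},
    {"name": "D", "date": date(2017, 2, 14), "application": "WGS", "pct_dups": 10.0},
    {"name": "E", "date": date(2017, 2, 20), "application": "WGS", "pct_dups": 12.0},
]

def select_samples(records, start, end, application):
    """Keep records within [start, end] that match the application type."""
    return [
        r for r in records
        if start <= r["date"] <= end and r["application"] == application
    ]

def monthly_trend(records, field):
    """Average a metric per calendar month, returned in chronological order."""
    by_month = {}
    for r in records:
        by_month.setdefault(r["date"].strftime("%Y-%m"), []).append(r[field])
    return {month: mean(vals) for month, vals in sorted(by_month.items())}

# Same filters as on the slide: date range plus application type
selected = select_samples(samples, date(2017, 1, 1), date(2017, 3, 1), "WGS")
trend = monthly_trend(selected, "pct_dups")
print(len(selected), "samples selected")
print(trend)
```

The per-month averages are the kind of series a "Data Trend" plot would draw over time.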
20. Validations
In-depth reproducibility checks
• Each lab technique has a protocol
• Changes require validation
• Internal audits
  • Horizontal (one step)
  • Vertical (one sample)
stockholm
uppsala
21. Analysis Validation
• Versioned pipeline releases
• New versions require a validation test
• Runs logged and reported
• Continuous integration tests
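A validation test for a new pipeline version might compare key metrics against those from the previous release on the same reference data. The sketch below assumes a simple relative tolerance; the function name, metric names, values and tolerance are hypothetical and not taken from the NGI pipelines.

```python
def validate_release(reference, candidate, rel_tol=0.01):
    """Compare candidate metrics against reference metrics.

    Returns a list of (metric, reference_value, candidate_value) tuples for
    every metric that is missing or differs by more than `rel_tol` (relative).
    An empty list means the candidate release passes.
    """
    failures = []
    for metric, ref_value in reference.items():
        new_value = candidate.get(metric)
        if new_value is None or abs(new_value - ref_value) > rel_tol * abs(ref_value):
            failures.append((metric, ref_value, new_value))
    return failures

# HYPOTHETICAL metrics from the same test dataset run through two versions
reference_v1 = {"pct_mapped": 98.2, "pct_dups": 8.1, "median_coverage": 31.0}
candidate_v2 = {"pct_mapped": 98.3, "pct_dups": 9.4, "median_coverage": 31.1}

failures = validate_release(reference_v1, candidate_v2)
for metric, old, new in failures:
    print(f"FAIL {metric}: reference={old}, candidate={new}")
```

Run as part of continuous integration, a check like this gives a logged pass or fail record for each release.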