SlideShare a Scribd company logo
NGI stockholm
NGI Sweden
Next Generation Sequencing at the
National Genomics Infrastructure
Phil Ewels
phil.ewels@scilifelab.se
Introduction to Bioinformatics Using NGS Data
Linköping, 2018-05-23
NGI stockholm
Overview
National Genomics Infrastructure
Sequencing Technologies
Sequencing Applications
Bioinformatics at the NGI
The National Genomics
Infrastructure
NGI stockholm
National Genomics Infrastructure
Proteomics
Metabolomics
Single-Cell Biology
Cellular & Molecular Imaging
Molecular Structure
Chemical Biology
Genome Engineering
Diagnostic Development
Drug Discovery & Development
NationalBioinformaticsInfrastructure
DataOffice
SciLifeLab NGI
Technology Platforms
Research Programs
National Genomics Infrastructure
NGI stockholm
SciLifeLab NGI
Stockholm Uppsala
Genomics Production SNP&Seq
Uppsala Genome CenterGenomics Applications DevelopmentGenomics Applications Development
National Genomics Infrastructure
NGI stockholm
SciLifeLab NGI
Our mission is to offer a

state-of-the-art infrastructure

for massively parallel DNA sequencing
and SNP genotyping, available to
researchers all over Sweden
NGI stockholm
SciLifeLab NGI
National resource
State-of-the-art
infrastructure
Guidelines and
support
We provide 

guidelines and support

for sample collection, study
design, protocol selection and
bioinformatics analysis
NGI stockholm
NGI Organisation
NGI Stockholm NGI Uppsala
NGI stockholm
NGI Organisation
Funding
Staff salaries
Premises and service
contracts
Capital equipment
Host universities
SciLifeLab
VR
KAW
User fees
Reagent costs
NGI Stockholm NGI Uppsala
NGI stockholm
Project timeline
Sample QC
Library
preparation,
Sequencing,
Genotyping
Data processing
and primary
analysis
Scientific support
and project
consultation
Data delivery
NGI stockholm
Project timeline
Sample QC
Library
preparation,
Sequencing,
Genotyping
Data processing
and primary
analysis
Scientific support
and project
consultation
Data delivery
NGI stockholm
Just

Sequencing
Methods offered at NGI
FFPE

Sequencing
10X

Genomics
Nanoporesequencing
ATAC-seq
UserQC 

(cheap preps)
Hi-C(ox)Bisulphite

sequencing
RAD-seq
RNA
de novoDNA
Data analysis
pipelines included
NGI stockholm
NGI Stockholm
NGI Stockholm Projects in 2017
RNA-Seq
WG Re-Seq
De-Novo
Metagenomics
Targeted Re-Seq
ChIP-Seq
Epigenetics
RAD Seq
0 45 90 135 180
1
13
15
31
58
61
159
177
• RNA-seq is the most common project type
NGI stockholm
NGI Stockholm
• RNA-seq is the most common project type

• In total, NGI Sweden processed 1068 NGS projects with
almost 50 000 samples in 2017
NGI Stockholm Samples in 2017
RNA-Seq
WG Re-Seq
De-Novo
Metagenomics
Targeted Re-Seq
ChIP-Seq
Epigenetics
RAD Seq
0 4000 8000 12000 16000
192
180
397
5,909
4,496
211
4,551
15,022
NGI stockholm
NGI Stockholm
• Median turn around times from QC passed to data
delivered for 2017

• Sequencing only: 11.5 days
• RNA: 6.5 weeks
• WGS: 8 weeks
https://ngisweden.scilifelab.se/
file/stockholm_dashboard
Sequencing
Technologies
NGI stockholm
Sequencing Types
Illumina
PacBio
Oxford Nanopore
Ion Torrent
NGI stockholm
Illumina Sequencing
• Largest provider of sequencing technology

• NGS machines use "Sequencing-by-synthesis"

• Developed at the University of Cambridge in 1990s
• Spun into a company called Solexa in 1998
• Solexa acquired by illumina in 2007
• Responsible for vast majority of DNA sequencing
experiments worldwide
NGI stockholm
Illumina Sequencing
https://youtu.be/fCd6B5HRaZ8
NGI stockholm
Illumina iSeq 100
NGI stockholm
Illumina MiniSeq 100
NGI stockholm
Illumina MiSeq
NGI stockholm
Illumina NextSeq
NGI stockholm
Illumina HiSeq 2500
NGI stockholm
Illumina HiSeq 3000
NGI stockholm
Illumina HiSeq 4000
NGI stockholm
Illumina HiSeq X
NGI stockholm
Illumina NovaSeq 6000
NGI stockholm
Illumina at NGI
iSeq 100
Coming soon to NGI Uppsala
Small cheap runs
MiSeq Small runs, long reads (2x300bp)
HiSeq 2500 Primary machine for most of NGI's history
HiSeq X
Cheap, high throughput
Only allowed to run WGS with > 15X coverage
NovaSeq 6000
Newest machine, both Stockholm & Uppsala
Will eventually replace HiSeq 2500
NGI stockholm
How to choose
• Number of reads required

• How many samples, how deeply sequenced?
• Type of reads required

• Single End / Paired End, length?
• Urgency and cost

• Sharing flow cells with other users
• Best price for the project
NGI stockholm
Patterned flow cells
• New type of flow cell

• HiSeq 4000, HiSeq X, NovaSeq
• Single sequence per well

• Higher density, more data
• What's index-hopping?

• ExAmp can mix up index pairs in
tiny fraction of reads
• Avoided with dual unique indexes
Patterned flow cells
• Patterned flow cells can give "optical duplicates"

• https://sequencing.qcfail.com/articles/illumina-patterned-flow-
cells-generate-duplicated-sequences/
• Can be treated like regular PCR duplicates
HiSeq 2500 HiSeq 4000
NGI stockholm
Two colour chemistry
• Older SBS used four different fluorophores

• One for each nucleotide
• New machines use two

• Faster and cheaper
• NextSeq, NovaSeq, iSeq
• No signal = G

• Can get poly-G if something

goes wrong
https://sequencing.qcfail.com/articles/illumina-2-colour-
chemistry-can-overcall-high-confidence-g-bases/
NGI stockholm
PacBio
• Pacific Biosciences - specialists in long reads

• Also uses fluorescent nucleotides
• Polymerases immobilised at the bottom of tiny wells give
off pulses as the nucleotides are incorporated
• Each well is independent, doesn't use sequencing
rounds like illumina

• Can work with much longer DNA fragments

• 250 bp – 60 kb (max ~160 kb)
NGI stockholm
PacBio
https://youtu.be/NHCJ8PtYCFc
NGI stockholm
PacBio RS II
NGI stockholm
PacBio Sequel
NGI stockholm
PacBio Sequencing
• Long reads are excellent for de-novo genome assembly
and isoform detection

• Output is expensive compared to illumina, but getting
better

• Small genomes are no problem. Larger genomes are
now becoming more feasible.
• New amplification-free enrichment using CRISPR-Cas9
NGI stockholm
Oxford Nanopore
• Newest contender in the sequencing world

• Lots of hype and taken several years to become a reality
• Still developing very fast

• Quality, yield and cost changing almost monthly
• High error rates (but better than they used to be)

• Now 2-13% depending on sequencing type
NGI stockholm
Oxford Nanopore
NGI stockholm
MinION
NGI stockholm
MinION
NGI stockholm
GridION
NGI stockholm
PromethION
NGI stockholm
SmidgION
(not yet released)
NGI stockholm
Oxford Nanopore
• The best technology available for ultra long reads

• Twitter users report getting reads over 1 Mbp long
• "Whale spotting" - finding the longest reads on the end
of the distribution curve
• Price dropping rapidly, but still expensive compared to
illumina

• NGI has 2x MinIONs, hoping for PromethION soon
NGI stockholm
Ion Torrent
• Main application

• Microbial and metagenomic sequencing
• Targeted re-sequencing (gene panels)
• Clinical sequencing
• Short, single-end reads

• Fast run times
NGI stockholm
Ion Torrent PGM
• Yield

• 0.1 - 1 Gbp
• Run time

• 3 hrs
• Read length

• 200 - 400 bp
NGI stockholm
Ion Torrent Proton
• Yield

• 10 Gbp
• Run time

• 4 hrs
• Read length

• 200 bp
NGI stockholm
Ion Torrent S5 XL
• Yield

• 1-13 Gbp
• Run time

• 3 hrs
• Read length

• 200 - 600 bp
NGI stockholm
Sequencing Type
• No need to remember all of this

• Many considerations, changing all the time
• We are experts - come and speak to us!
support@ngisweden.se
https://ngisweden.scilifelab.se/
Sequencing
Applications
NGI stockholm
Library Preparation
• All high throughput sequencing requires some kind of
library preparation

• Add adapters for sequencing chemistry
• Adjust DNA fragment lengths
• Incorporate biological signal into sequence
• Add required enzymes
• Different library preps enable different applications
NGI stockholm
RNA Sequencing
• Choose a type of RNA

• Protein coding mRNA (poly-A)
• All RNA (rRNA depletion)
• Small RNA
• Choose your question

• Differential gene expression
• Differential isoform detection & quantification
• Fusion gene detection
• Define your limitations

• Low-input material
• Low quality material (eg. FFPE)
NGI stockholm
RNA Sequencing
• Illumina sequencing RNA library prep kits

• Illumina TruSeq RNA
• Illumina RiboZero
• Illumina TruSeq RNA Exome
• Clontech SMARTER Pico
• Illumina TruSeq Small RNA
• Oxford Nanopore, PacBio, IonTorrent
Protein-coding poly-A
rRNA depletion
FFPE / low quality
low input
small RNA
NGI stockholm
DNA Sequencing
• Choose your question

• SNP, SNV, indel calling
• Structural variant detection
• De-novo genome assembly
• Choose your priorities

• Sequencing accuracy
• Sequencing depth
• Ultra-long reads
• Define your requirements

• Low-input material
• Low quality material (eg. FFPE)
NGI stockholm
DNA Sequencing
• Illumina sequencing DNA library prep kits

• Illumina TruSeq DNA PCR Free
• Rubicon ThruPLEX
• Illumina Nextera XT
• Illumina Nextera Flex
• 10X Genomics
• Oxford Nanopore, PacBio, IonTorrent
Best quality
Low input
Cheap (plate format)
Fast and simple
Linked reads
NGI stockholm
10X Genomics
• Chromium instrument uses droplet emulsion technology
for nanoliter reaction volumes

• Linked-read sequencing

• Large molecules fragmented in droplets and barcoded
• Normal short-read illumina sequencing used
• Long fragments (20-100+ Kbp) reassembled from barcodes
• Regular illumina sequencing libraries produced
NGI stockholm
10X Genomics
NGI stockholm
10X Genomics
• Single cell RNA sequencing

• Thousands of cells captured in droplets
• Each RNA molecule tagged with droplet
barcode
NGI stockholm
• Now testing Hi-C in NGI Stockholm

• Proximity ligation assay to detect physical colocation of
DNA fragments within cell nuclei
• Multiple applications for data

• Epigenetics
• De-novo genome assembly
• Structural variation detection
Hi-C
Chr 14
Chr14
NGI stockholm
Methylation Sequencing
• Bisulphite sequencing detects Cytosine methylation in
genomic DNA

• Unmethylated Cs converted to Uracil by bisulfites and sequenced as T
• Methylated Cs are protected and sequenced as C
• Oxidative bisulphite informs about hydroxy-methylation

• Current under development at NGI Stockholm
• PacBio and Oxford Nanopore able to detect some
native base modifications
NGI stockholm
RAD Sequencing
• Restriction-site Associated DNA sequencing, also
known as GBS (Genotyping By Sequencing)

• Genome fragmented using a restriction enzyme
• Narrow size range purified - same regions of genome for
all individuals
• Allows cheap high-depth variant calling for large
numbers of samples, without a reference genome

• Excellent for population genomics and ecology
Bioinformatics

at the NGI
NGI stockholm
Bioinformatics at NGI
• Raw sequencing data management

• Demultiplexing, data transfers, backups, delivery
• Quality control

• Every project is checked against quality criteria
• Automated analysis pipelines

• Standardised pipelines give reproducible results
• Software development
NGI stockholm
NGI Data Handling
Sequencer
Network
storage
Preprocessing
Backup
UPPMAX

(Irma)
UPPMAX

(Grus)
SNIC Supr

Authentication
Your
computer
Your analysis
server
NGI stockholm
Grus Deliveries
• UPPMAX tool for NGI data deliveries

• NGI creates a SNIC Supr "delivery project" for each NGI
sequencing project
• Project PI and contact person given access, according
to what was put on the order form
• Email sent with project ID and instructions
• Grus is for secure short term storage only

• Requires two-factor authentication
NGI stockholm
Analysis Pipelines
• Initial data analysis for major protocols

• Internal QC and standardised starting point for users

• All software open source and on GitHub

• http://opensource.scilifelab.se/
• http://github.com/SciLifeLab/
• Accredited facility
NGI stockholm
Analysis Requirements
Automated

Reliable

Easy for others to run

Reproducible results
NGI stockholm
Analysis Pipelines
NouGAT (de-novo)
Sarek Somatic
• SNPs, SNVs and indels
• Structural variants
• Heterogeneity, ploidy and CNVs
• Germline and/or Somatic analysis

• Formerly called Cancer Analysis Workflow
https://github.com/SciLifeLab/Sarek
MuTect2
Strelka
FreeBayes
GATK
HaplotypeCaller
MuTect1
ASCAT
Manta
• Tumour/Normal pair WGS analysis
based on GATK best practices
Sarek
• Tool split into sub-workflows

• Bash wrapper script runs whole
workflow

• Manuscript submitted this week,
preprint available on bioRxiv

• https://www.biorxiv.org/content/early/
2018/05/09/316976
NGI-RNAseq
NGI, April – December 2017

10,227samples processed
     131user projects
https://github.com/SciLifeLab/NGI-RNAseq
MIT Licence
Read alignment
Gene counts
Quality Control
Reporting
Raw data
NGI-RNAseq
Quantitative Biology Center
Tübingen, Germany
https://github.com/SciLifeLab/NGI-RNAseq
Now running using
NGI stockholm
nf-core
• A community effort to collect a curated set of Nextflow
analysis pipelines

• GitHub organisation to collect pipelines in one place
• No institute-specific branding
• Strict set of guideline requirements
• Automated testing for code style and function
https://nf-co.re
NGI stockholm
nf-core
https://nf-co.re
• Easy to run pipelines

• Helpful community

• Super reproducible
results
NGI stockholm
Quality Control
• Every project has some level of quality control checks

• Sequencing quality
• FastQC, FastQ Screen
• Analysis pipelines give application-specific QC

• Qualimap, RSeQC
• Reporting is done using MultiQC
MultiQC
• Reporting tool, parses logs from completed analysis

• Creates single HTML report for all samples & steps in a
project

• Interactive plots for data exploration

• Current version now has 61 supported tools

• Works with anything from tens → thousands of samples

• Highly customisable
Getting MultiQC
PyPI
Conclusions
NGI stockholm
If you have a project
• Visit our order portal

• Create projects
• Request meetings
• Send us an email
https://ngisweden.scilifelab.se
support@ngisweden.se
NGI stockholm
Find our tools
• View our open-source
software

• All code available on
GitHub
http://opensource.scilifelab.se
Acknowledgements
NGI stockholm
Thanks to:
Max Käller

Olga Vinnere Pettersson

NGI Sweden
Phil Ewels

phil.ewels@scilifelab.se

ewels

tallphil
support@ngisweden.se
http://ngisweden.scilifelab.se
http://opensource.scilifelab.se

More Related Content

Similar to Lecture: NGS at the National Genomics Infrastructure

ngs.pptx
ngs.pptxngs.pptx
ngs.pptx
aaaa bbb
 
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIWhole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Phil Ewels
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
Alagar Suresh
 
NBIS RNA-seq course
NBIS RNA-seq courseNBIS RNA-seq course
NBIS RNA-seq course
Phil Ewels
 
Presentation dan sequencing.pptx
Presentation dan sequencing.pptxPresentation dan sequencing.pptx
Presentation dan sequencing.pptx
quaidiantalha
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
QIAGEN
 
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.pptAdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
RuthMWinnie
 
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.pptAdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
EdizonJambormias2
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) Technology
QIAGEN
 
NBIS ChIP-seq course
NBIS ChIP-seq courseNBIS ChIP-seq course
NBIS ChIP-seq course
Phil Ewels
 
Ion torrent
Ion torrentIon torrent
Ion torrent
Aishwarya Babu
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
Owali Shawon
 
Gena tp ny
Gena tp nyGena tp ny
Gena tp ny
prekubatortto
 
EpiChrom 2019 - Updates in Epigenomics at the NGI
EpiChrom 2019 - Updates in Epigenomics at the NGIEpiChrom 2019 - Updates in Epigenomics at the NGI
EpiChrom 2019 - Updates in Epigenomics at the NGI
Phil Ewels
 
NANOPORE SEQUENCING
NANOPORE SEQUENCINGNANOPORE SEQUENCING
NANOPORE SEQUENCING
Kuldeep Gauliya
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs
Din Apellidos
 
Thoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore TechnologiesThoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore Technologies
Keith Bradnam
 
Making powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyondMaking powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyond
AdamCribbs1
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can Know
Brian Krueger
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
Atifa Ambreen
 

Similar to Lecture: NGS at the National Genomics Infrastructure (20)

ngs.pptx
ngs.pptxngs.pptx
ngs.pptx
 
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIWhole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
 
NBIS RNA-seq course
NBIS RNA-seq courseNBIS RNA-seq course
NBIS RNA-seq course
 
Presentation dan sequencing.pptx
Presentation dan sequencing.pptxPresentation dan sequencing.pptx
Presentation dan sequencing.pptx
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
 
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.pptAdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
 
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.pptAdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) Technology
 
NBIS ChIP-seq course
NBIS ChIP-seq courseNBIS ChIP-seq course
NBIS ChIP-seq course
 
Ion torrent
Ion torrentIon torrent
Ion torrent
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Gena tp ny
Gena tp nyGena tp ny
Gena tp ny
 
EpiChrom 2019 - Updates in Epigenomics at the NGI
EpiChrom 2019 - Updates in Epigenomics at the NGIEpiChrom 2019 - Updates in Epigenomics at the NGI
EpiChrom 2019 - Updates in Epigenomics at the NGI
 
NANOPORE SEQUENCING
NANOPORE SEQUENCINGNANOPORE SEQUENCING
NANOPORE SEQUENCING
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs
 
Thoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore TechnologiesThoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore Technologies
 
Making powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyondMaking powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyond
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can Know
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 

More from Phil Ewels

Reproducible bioinformatics for everyone: Nextflow & nf-core
Reproducible bioinformatics for everyone: Nextflow & nf-coreReproducible bioinformatics for everyone: Nextflow & nf-core
Reproducible bioinformatics for everyone: Nextflow & nf-core
Phil Ewels
 
Reproducible bioinformatics workflows with Nextflow and nf-core
Reproducible bioinformatics workflows with Nextflow and nf-coreReproducible bioinformatics workflows with Nextflow and nf-core
Reproducible bioinformatics workflows with Nextflow and nf-core
Phil Ewels
 
ELIXIR Proteomics Community - Connection with nf-core
ELIXIR Proteomics Community - Connection with nf-coreELIXIR Proteomics Community - Connection with nf-core
ELIXIR Proteomics Community - Connection with nf-core
Phil Ewels
 
Coffee 'n code: Regexes
Coffee 'n code: RegexesCoffee 'n code: Regexes
Coffee 'n code: Regexes
Phil Ewels
 
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
Phil Ewels
 
Nextflow Camp 2019: nf-core tutorial
Nextflow Camp 2019: nf-core tutorialNextflow Camp 2019: nf-core tutorial
Nextflow Camp 2019: nf-core tutorial
Phil Ewels
 
The future of genomics in the cloud
The future of genomics in the cloudThe future of genomics in the cloud
The future of genomics in the cloud
Phil Ewels
 
SciLifeLab NGI NovaSeq seminar
SciLifeLab NGI NovaSeq seminarSciLifeLab NGI NovaSeq seminar
SciLifeLab NGI NovaSeq seminar
Phil Ewels
 
SBW 2016: MultiQC Workshop
SBW 2016: MultiQC WorkshopSBW 2016: MultiQC Workshop
SBW 2016: MultiQC Workshop
Phil Ewels
 
Developing Reliable QC at the Swedish National Genomics Infrastructure
Developing Reliable QC at the Swedish National Genomics InfrastructureDeveloping Reliable QC at the Swedish National Genomics Infrastructure
Developing Reliable QC at the Swedish National Genomics Infrastructure
Phil Ewels
 
Standardising Swedish genomics analyses using nextflow
Standardising Swedish genomics analyses using nextflowStandardising Swedish genomics analyses using nextflow
Standardising Swedish genomics analyses using nextflow
Phil Ewels
 
Using visual aids effectively
Using visual aids effectivelyUsing visual aids effectively
Using visual aids effectively
Phil Ewels
 
Analysis of ChIP-Seq Data
Analysis of ChIP-Seq DataAnalysis of ChIP-Seq Data
Analysis of ChIP-Seq Data
Phil Ewels
 
Internet McMenemy
Internet McMenemyInternet McMenemy
Internet McMenemy
Phil Ewels
 

More from Phil Ewels (14)

Reproducible bioinformatics for everyone: Nextflow & nf-core
Reproducible bioinformatics for everyone: Nextflow & nf-coreReproducible bioinformatics for everyone: Nextflow & nf-core
Reproducible bioinformatics for everyone: Nextflow & nf-core
 
Reproducible bioinformatics workflows with Nextflow and nf-core
Reproducible bioinformatics workflows with Nextflow and nf-coreReproducible bioinformatics workflows with Nextflow and nf-core
Reproducible bioinformatics workflows with Nextflow and nf-core
 
ELIXIR Proteomics Community - Connection with nf-core
ELIXIR Proteomics Community - Connection with nf-coreELIXIR Proteomics Community - Connection with nf-core
ELIXIR Proteomics Community - Connection with nf-core
 
Coffee 'n code: Regexes
Coffee 'n code: RegexesCoffee 'n code: Regexes
Coffee 'n code: Regexes
 
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
Nextflow Camp 2019: nf-core tutorial (Updated Feb 2020)
 
Nextflow Camp 2019: nf-core tutorial
Nextflow Camp 2019: nf-core tutorialNextflow Camp 2019: nf-core tutorial
Nextflow Camp 2019: nf-core tutorial
 
The future of genomics in the cloud
The future of genomics in the cloudThe future of genomics in the cloud
The future of genomics in the cloud
 
SciLifeLab NGI NovaSeq seminar
SciLifeLab NGI NovaSeq seminarSciLifeLab NGI NovaSeq seminar
SciLifeLab NGI NovaSeq seminar
 
SBW 2016: MultiQC Workshop
SBW 2016: MultiQC WorkshopSBW 2016: MultiQC Workshop
SBW 2016: MultiQC Workshop
 
Developing Reliable QC at the Swedish National Genomics Infrastructure
Developing Reliable QC at the Swedish National Genomics InfrastructureDeveloping Reliable QC at the Swedish National Genomics Infrastructure
Developing Reliable QC at the Swedish National Genomics Infrastructure
 
Standardising Swedish genomics analyses using nextflow
Standardising Swedish genomics analyses using nextflowStandardising Swedish genomics analyses using nextflow
Standardising Swedish genomics analyses using nextflow
 
Using visual aids effectively
Using visual aids effectivelyUsing visual aids effectively
Using visual aids effectively
 
Analysis of ChIP-Seq Data
Analysis of ChIP-Seq DataAnalysis of ChIP-Seq Data
Analysis of ChIP-Seq Data
 
Internet McMenemy
Internet McMenemyInternet McMenemy
Internet McMenemy
 

Recently uploaded

Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
HongcNguyn6
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 

Recently uploaded (20)

Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 

Lecture: NGS at the National Genomics Infrastructure

  • 1. NGI stockholm NGI Sweden Next Generation Sequencing at the National Genomics Infrastructure Phil Ewels phil.ewels@scilifelab.se Introduction to Bioinformatics Using NGS Data Linköping, 2018-05-23
  • 2. NGI stockholm Overview National Genomics Infrastructure Sequencing Technologies Sequencing Applications Bioinformatics at the NGI
  • 4. NGI stockholm National Genomics Infrastructure Proteomics Metabolomics Single-Cell Biology Cellular & Molecular Imaging Molecular Structure Chemical Biology Genome Engineering Diagnostic Development Drug Discovery & Development NationalBioinformaticsInfrastructure DataOffice SciLifeLab NGI Technology Platforms Research Programs National Genomics Infrastructure
  • 5. NGI stockholm SciLifeLab NGI Stockholm Uppsala Genomics Production SNP&Seq Uppsala Genome CenterGenomics Applications DevelopmentGenomics Applications Development National Genomics Infrastructure
  • 6. NGI stockholm SciLifeLab NGI Our mission is to offer a
 state-of-the-art infrastructure
 for massively parallel DNA sequencing and SNP genotyping, available to researchers all over Sweden
  • 7. NGI stockholm SciLifeLab NGI National resource State-of-the-art infrastructure Guidelines and support We provide 
 guidelines and support
 for sample collection, study design, protocol selection and bioinformatics analysis
  • 8. NGI stockholm NGI Organisation NGI Stockholm NGI Uppsala
  • 9. NGI stockholm NGI Organisation Funding Staff salaries Premises and service contracts Capital equipment Host universities SciLifeLab VR KAW User fees Reagent costs NGI Stockholm NGI Uppsala
  • 10. NGI stockholm Project timeline Sample QC Library preparation, Sequencing, Genotyping Data processing and primary analysis Scientific support and project consultation Data delivery
  • 11. NGI stockholm Project timeline Sample QC Library preparation, Sequencing, Genotyping Data processing and primary analysis Scientific support and project consultation Data delivery
  • 12. NGI stockholm Just
 Sequencing Methods offered at NGI FFPE Sequencing 10X Genomics Nanoporesequencing ATAC-seq UserQC (cheap preps) Hi-C(ox)Bisulphite
 sequencing RAD-seq RNA de novoDNA Data analysis pipelines included
  • 13. NGI stockholm NGI Stockholm NGI Stockholm Projects in 2017 RNA-Seq WG Re-Seq De-Novo Metagenomics Targeted Re-Seq ChIP-Seq Epigenetics RAD Seq 0 45 90 135 180 1 13 15 31 58 61 159 177 • RNA-seq is the most common project type
  • 14. NGI stockholm NGI Stockholm • RNA-seq is the most common project type • In total, NGI Sweden processed 1068 NGS projects with almost 50 000 samples in 2017 NGI Stockholm Samples in 2017 RNA-Seq WG Re-Seq De-Novo Metagenomics Targeted Re-Seq ChIP-Seq Epigenetics RAD Seq 0 4000 8000 12000 16000 192 180 397 5,909 4,496 211 4,551 15,022
  • 15. NGI stockholm NGI Stockholm • Median turn around times from QC passed to data delivered for 2017 • Sequencing only: 11.5 days • RNA: 6.5 weeks • WGS: 8 weeks https://ngisweden.scilifelab.se/ file/stockholm_dashboard
  • 18.
  • 19. NGI stockholm Illumina Sequencing • Largest provider of sequencing technology • NGS machines use "Sequencing-by-synthesis" • Developed at the University of Cambridge in 1990s • Spun into a company called Solexa in 1998 • Solexa acquired by illumina in 2007 • Responsible for vast majority of DNA sequencing experiments worldwide
  • 30. NGI stockholm Illumina at NGI iSeq 100 Coming soon to NGI Uppsala Small cheap runs MiSeq Small runs, long reads (2x300bp) HiSeq 2500 Primary machine for most of NGI's history HiSeq X Cheap, high throughput Only allowed to run WGS with > 15X coverage NovaSeq 6000 Newest machine, both Stockholm & Uppsala Will eventually replace HiSeq 2500
  • 31. NGI stockholm How to choose • Number of reads required • How many samples, how deeply sequenced? • Type of reads required • Single End / Paired End, length? • Urgency and cost • Sharing flow cells with other users • Best price for the project
  • 32. NGI stockholm Patterned flow cells • New type of flow cell • HiSeq 4000, HiSeq X, NovaSeq • Single sequence per well • Higher density, more data • What's index-hopping? • ExAmp can mix up index pairs in tiny fraction of reads • Avoided with dual unique indexes
  • 33. Patterned flow cells • Patterned flow cells can give "optical duplicates" • https://sequencing.qcfail.com/articles/illumina-patterned-flow- cells-generate-duplicated-sequences/ • Can be treated like regular PCR duplicates HiSeq 2500 HiSeq 4000
  • 34. NGI stockholm Two colour chemistry • Older SBS used four different fluorophores • One for each nucleotide • New machines use two • Faster and cheaper • NextSeq, NovaSeq, iSeq • No signal = G • Can get poly-G if something
 goes wrong https://sequencing.qcfail.com/articles/illumina-2-colour- chemistry-can-overcall-high-confidence-g-bases/
  • 35.
  • 36. NGI stockholm PacBio • Pacific Biosciences - specialists in long reads • Also uses fluorescent nucleotides • Polymerases immobilised at the bottom of tiny wells give off pulses as the nucleotides are incorporated • Each well is independent, doesn't use sequencing rounds like illumina • Can work with much longer DNA fragments • 250 bp – 60 kb (max ~160 kb)
  • 40. NGI stockholm PacBio Sequencing • Long reads are excellent for de-novo genome assembly and isoform detection • Output is expensive compared to illumina, but getting better • Small genomes are no problem. Larger genomes are now becoming more feasible. • New amplification-free enrichment using CRISPR-Cas9
  • 41.
  • 42. NGI stockholm Oxford Nanopore • Newest contender in the sequencing world • Lots of hype and taken several years to become a reality • Still developing very fast • Quality, yield and cost changing almost monthly • High error rates (but better than they used to be) • Now 2-13% depending on sequencing type
  • 49. NGI stockholm Oxford Nanopore • The best technology available for ultra long reads • Twitter users report getting reads over 1 Mbp long • "Whale spotting" - finding the longest reads on the end of the distribution curve • Price dropping rapidly, but still expensive compared to illumina • NGI has 2x MinIONs, hoping for PromethION soon
  • 50.
  • 51. NGI stockholm Ion Torrent • Main application • Microbial and metagenomic sequencing • Targeted re-sequencing (gene panels) • Clinical sequencing • Short, single-end reads • Fast run times
  • 52. NGI stockholm Ion Torrent PGM • Yield • 0.1 - 1 Gbp • Run time • 3 hrs • Read length • 200 - 400 bp
  • 53. NGI stockholm Ion Torrent Proton • Yield • 10 Gbp • Run time • 4 hrs • Read length • 200 bp
  • 54. NGI stockholm Ion Torrent S5 XL • Yield • 1-13 Gbp • Run time • 3 hrs • Read length • 200 - 600 bp
  • 55. NGI stockholm Sequencing Type • No need to remember all of this • Many considerations, changing all the time • We are experts - come and speak to us! support@ngisweden.se https://ngisweden.scilifelab.se/
  • 57. NGI stockholm Library Preparation • All high throughput sequencing requires some kind of library preparation • Add adapters for sequencing chemistry • Adjust DNA fragment lengths • Incorporate biological signal into sequence • Add required enzymes • Different library preps enable different applications
  • 58. NGI stockholm RNA Sequencing • Choose a type of RNA • Protein coding mRNA (poly-A) • All RNA (rRNA depletion) • Small RNA • Choose your question • Differential gene expression • Differential isoform detection & quantification • Fusion gene detection • Define your limitations • Low-input material • Low quality material (eg. FFPE)
  • 59. NGI stockholm RNA Sequencing • Illumina sequencing RNA library prep kits • Illumina TruSeq RNA • Illumina RiboZero • Illumina TruSeq RNA Exome • Clontech SMARTER Pico • Illumina TruSeq Small RNA • Oxford Nanopore, PacBio, IonTorrent Protein-coding poly-A rRNA depletion FFPE / low quality low input small RNA
  • 60. NGI stockholm DNA Sequencing • Choose your question • SNP, SNV, indel calling • Structural variant detection • De-novo genome assembly • Choose your priorities • Sequencing accuracy • Sequencing depth • Ultra-long reads • Define your requirements • Low-input material • Low quality material (eg. FFPE)
  • 61. NGI stockholm DNA Sequencing • Illumina sequencing DNA library prep kits • Illumina TruSeq DNA PCR Free • Rubicon ThruPLEX • Illumina Nextera XT • Illumina Nextera Flex • 10X Genomics • Oxford Nanopore, PacBio, IonTorrent Best quality Low input Cheap (plate format) Fast and simple Linked reads
  • 62. NGI stockholm 10X Genomics • Chromium instrument uses droplet emulsion technology for nanoliter reaction volumes • Linked-read sequencing • Large molecules fragmented in droplets and barcoded • Normal short-read illumina sequencing used • Long fragments (20-100+ Kbp) reassembled from barcodes • Regular illumina sequencing libraries produced
  • 64. NGI stockholm 10X Genomics • Single cell RNA sequencing • Thousands of cells captured in droplets • Each RNA molecule tagged with droplet barcode
  • 65. NGI stockholm • Now testing Hi-C in NGI Stockholm • Proximity ligation assay to detect physical colocation of DNA fragments within cell nuclei • Multiple applications for data • Epigenetics • De-novo genome assembly • Structural variation detection Hi-C Chr 14 Chr14
  • 66. NGI stockholm Methylation Sequencing • Bisulphite sequencing detects Cytosine methylation in genomic DNA • Unmethylated Cs converted to Uracil by bisulfites and sequenced as T • Methylated Cs are protected and sequenced as C • Oxidative bisulphite informs about hydroxy-methylation • Current under development at NGI Stockholm • PacBio and Oxford Nanopore able to detect some native base modifications
  • 67. NGI stockholm RAD Sequencing • Restriction-site Associated DNA sequencing, also known as GBS (Genotyping By Sequencing) • Genome fragmented using a restriction enzyme • Narrow size range purified - same regions of genome for all individuals • Allows cheap high-depth variant calling for large numbers of samples, without a reference genome • Excellent for population genomics and ecology
  • 69. NGI stockholm Bioinformatics at NGI • Raw sequencing data management • Demultiplexing, data transfers, backups, delivery • Quality control • Every project is checked against quality criteria • Automated analysis pipelines • Standardised pipelines give reproducible results • Software development
  • 70. NGI stockholm NGI Data Handling Sequencer Network storage Preprocessing Backup UPPMAX (Irma) UPPMAX (Grus) SNIC Supr Authentication Your computer Your analysis server
  • 71. NGI stockholm Grus Deliveries • UPPMAX tool for NGI data deliveries • NGI creates a SNIC Supr "delivery project" for each NGI sequencing project • Project PI and contact person given access, according to what was put on the order form • Email sent with project ID and instructions • Grus is for secure short term storage only • Requires two-factor authentication
  • 72. NGI stockholm Analysis Pipelines • Initial data analysis for major protocols • Internal QC and standardised starting point for users • All software open source and on GitHub • http://opensource.scilifelab.se/ • http://github.com/SciLifeLab/ • Accredited facility
  • 73. NGI stockholm Analysis Requirements Automated Reliable Easy for others to run Reproducible results
  • 75. Sarek Somatic • SNPs, SNVs and indels • Structural variants • Heterogeneity, ploidy and CNVs • Germline and/or Somatic analysis • Formerly called Cancer Analysis Workflow https://github.com/SciLifeLab/Sarek MuTect2 Strelka FreeBayes GATK HaplotypeCaller MuTect1 ASCAT Manta • Tumour/Normal pair WGS analysis based on GATK best practices
  • 76. Sarek • Tool split into sub-workflows • Bash wrapper script runs whole workflow • Manuscript submitted this week, preprint available on bioRxiv • https://www.biorxiv.org/content/early/ 2018/05/09/316976
  • 77. NGI-RNAseq NGI, April – December 2017 10,227samples processed      131user projects https://github.com/SciLifeLab/NGI-RNAseq MIT Licence Read alignment Gene counts Quality Control Reporting Raw data
  • 78. NGI-RNAseq Quantitative Biology Center Tübingen, Germany https://github.com/SciLifeLab/NGI-RNAseq Now running using
  • 79. NGI stockholm nf-core • A community effort to collect a curated set of Nextflow analysis pipelines • GitHub organisation to collect pipelines in one place • No institute-specific branding • Strict set of guideline requirements • Automated testing for code style and function https://nf-co.re
  • 80. NGI stockholm nf-core https://nf-co.re • Easy to run pipelines • Helpful community • Super reproducible results
  • 81. NGI stockholm Quality Control • Every project has some level of quality control checks • Sequencing quality • FastQC, FastQ Screen • Analysis pipelines give application-specific QC • Qualimap, RSeQC • Reporting is done using MultiQC
  • 82. MultiQC • Reporting tool, parses logs from completed analysis • Creates single HTML report for all samples & steps in a project • Interactive plots for data exploration • Current version now has 61 supported tools • Works with anything from tens → thousands of samples • Highly customisable
  • 83.
  • 86. NGI stockholm If you have a project • Visit our order portal • Create projects • Request meetings • Send us an email https://ngisweden.scilifelab.se support@ngisweden.se
  • 87. NGI stockholm Find our tools • View our open-source software • All code available on GitHub http://opensource.scilifelab.se
  • 88. Acknowledgements NGI stockholm Thanks to: Max Käller Olga Vinnere Pettersson NGI Sweden Phil Ewels phil.ewels@scilifelab.se ewels tallphil support@ngisweden.se http://ngisweden.scilifelab.se http://opensource.scilifelab.se