Best practices for data analysis when using
UMI adapters to improve variant detection
Wendy Lee, PhD
Staff Scientist
Outline
• Overview of NGS workflow that includes sample multiplexing
• Overview of workflow with xGen® Dual Index UMI Adapters—Tech
Access
• Discussion of data analysis steps:
– Extracting UMIs from sequencing reads
– Constructing consensus reads within UMI families
• Improving variant calling accuracy using consensus reads
UMI: unique molecular identifier
NGS workflow with xGen Dual Index UMI Adapters
[Workflow figure; labeled components include xGen Universal Blockers.]
xGen Dual Index UMI Adapters—Tech Access
3-in-1 design
• Designed for Illumina sequencers
• Compatible with standard end-repair and A-tailing library
construction, including PCR-free library methods
• Dual unique sample indices reduce sample cross-talk
• Degenerate 9-base UMI is incorporated for error correction and/or
counting applications
Consensus calling reduces artifacts in sequencing data
[Figure panels: total reads; reads deduplicated by start/stop positions; consensus reads (Min3) built within a UMI family. TP marks true-positive calls.]
Extracting UMIs within sample index reads during
demultiplexing
Assumptions and requirements
• Sequencing data are generated from the Illumina platform
• The following tools are installed in a Linux environment:
– Picard, version 2.9.0
– Burrows-Wheeler Aligner (BWA), version 0.7.15-r1140
– Fgbio, version 0.5.0
– VarDict Java
• Access to the raw basecall data output from the sequencer
Data analysis guidelines on IDT website
www.idtdna.com/UMI-techaccess
Overall workflow
Inputs: Illumina basecalls and the sample sheet
Steps D1–6: Convert base calls to short reads with UMI information while demultiplexing the NGS run
Output: short-read files with UMI information
Steps C1–4: Call consensus reads using UMIs
Steps P1–4: Post-consensus calling analysis
Output: variant calls
Extract UMIs from sample index reads
through Illumina demultiplexing workflow
Step D1: Create the sample barcode input file (barcode_file.txt)
Step D2: Create an output directory for storing the barcodes extracted from the sample index reads
Step D3: Determine the read structure (100T8B9M8B100T)
Step D4: Run ExtractIlluminaBarcodes (Picard). Output: extracted barcode files
Step D5: Create an input file that specifies the output BAM file associated with each sample (library_param.txt)
Step D6: Run IlluminaBasecallsToSam (Picard). Output: unmapped BAM files
Sample sheet example
[Sample sheet shown as an image on the slides; not reproduced in this transcript.]
Steps D1,2: Create a barcode file containing the sample barcode
information for each sample.
Steps 1 and 2 of 6 in demultiplexing
• UMI bases are written as Ns in the barcode sequence
• The barcode file is tab-delimited
• In this example, the barcode file is saved as /mnt/demodata/barcode_file.txt
• In this example, the output directory (Step D2) is /mnt/demodata/barcodes
barcode_name library_name barcode_sequence_1 barcode_sequence_2
20180326-BN573-S1 Mix1_Rep1 CTGATCGTNNNNNNNNN GCGCATAT
20180326-BN573-S2 Mix1_Rep2 ACTCTCGANNNNNNNNN CTGTACCA
20180326-BN573-S3 Mix1_Rep3 TGAGCTAGNNNNNNNNN GAACGGTT
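The barcode file can also be generated programmatically rather than typed by hand. The following is a minimal sketch, assuming the three samples from the example table and a 9-base UMI appended to the first index; the function name `barcode_file_text` is hypothetical and not part of Picard.

```python
import csv
import io

UMI_LENGTH = 9  # number of degenerate UMI bases, written as N after the first sample index

# (barcode_name, library_name, first index, second index), copied from the example table.
samples = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGT", "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGA", "CTGTACCA"),
    ("20180326-BN573-S3", "Mix1_Rep3", "TGAGCTAG", "GAACGGTT"),
]

def barcode_file_text(samples, umi_length=UMI_LENGTH):
    """Return the tab-delimited barcode file content as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["barcode_name", "library_name",
                     "barcode_sequence_1", "barcode_sequence_2"])
    for name, library, index1, index2 in samples:
        # The UMI positions are masked with N so that only the sample index is matched.
        writer.writerow([name, library, index1 + "N" * umi_length, index2])
    return buffer.getvalue()

print(barcode_file_text(samples), end="")
```

Writing the returned text to /mnt/demodata/barcode_file.txt would reproduce the file used in this example.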
Step D3: Determine the read structure for running
ExtractIlluminaBarcodes.
Step 3 of 6 in demultiplexing
For xGen Dual Index UMI Adapters—Tech Access with DNA insert of 100 bp,
use the following corresponding read structure:
100T8B9M8B100T
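T – template (insert)
B – sample barcode
M – molecular index (UMI)
The segments follow sequencing order: read 1 (100T), index read 1 (8B9M: an 8-base sample barcode followed by the 9-base UMI), index read 2 (8B), and read 2 (100T).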
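To check a read structure before passing it to Picard, it can help to split it into its segments and total the cycles. This is a minimal sketch that handles only the three segment types listed here (T, B, M); the function name `parse_read_structure` is hypothetical.

```python
import re

# Segment types from the legend for the read structure.
SEGMENT_NAMES = {"T": "template", "B": "sample barcode", "M": "molecular index (UMI)"}

def parse_read_structure(read_structure):
    """Split a read structure such as '100T8B9M8B100T' into (length, type) pairs."""
    segments = re.findall(r"(\d+)([TBM])", read_structure)
    # Reject input containing anything the pattern skipped over.
    if "".join(number + kind for number, kind in segments) != read_structure:
        raise ValueError(f"unrecognized read structure: {read_structure!r}")
    return [(int(number), kind) for number, kind in segments]

segments = parse_read_structure("100T8B9M8B100T")
for length, kind in segments:
    print(f"{length:>3} cycles  {SEGMENT_NAMES[kind]}")
print("total cycles:", sum(length for length, _ in segments))
```

The total should match the number of cycles in the sequencing run; a mismatch suggests that the read structure does not describe the run.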
Step 4 of 6 in demultiplexing
Step D4: Run Picard ExtractIlluminaBarcodes to extract sample barcodes.
Input:
BARCODE_FILE: barcode file created in Step D1
BASECALLS_DIR: directory with the sequencing basecall files
READ_STRUCTURE: 100T8B9M8B100T, from Step D3
LANE: lane number; ExtractIlluminaBarcodes processes one lane at a time
Output:
1. A metrics file with the barcode extraction summary
2. Extracted barcodes in the output directory created in Step D2
java -Xmx4g -jar picard-2.9.0.jar ExtractIlluminaBarcodes \
  BARCODE_FILE=/mnt/demodata/barcode_file.txt \
  BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
  READ_STRUCTURE=100T8B9M8B100T \
  LANE=1 \
  OUTPUT_DIR=/mnt/demodata/barcodes \
  METRICS_FILE=/mnt/demodata/barcode_metrics.txt
Step D5: Create a tab-delimited file to specify the BAM file for each sample in
the sequencing run with the corresponding barcode sequence(s).
Step 5 of 6 in demultiplexing
In this example, we saved this file in
/mnt/demodata/library_param.txt.
Be sure to create the output directory for the BAM file.
In this example, the output directory is /mnt/bam/L001
OUTPUT SAMPLE_ALIAS LIBRARY_NAME BARCODE_1 BARCODE_2
/mnt/bam/L001/BN573-S1_unmapped.bam 20180326-BN573-S1 Mix1_Rep1 CTGATCGTNNNNNNNNN GCGCATAT
/mnt/bam/L001/BN573-S2_unmapped.bam 20180326-BN573-S2 Mix1_Rep2 ACTCTCGANNNNNNNNN CTGTACCA
/mnt/bam/L001/BN573-S3_unmapped.bam 20180326-BN573-S3 Mix1_Rep3 TGAGCTAGNNNNNNNNN GAACGGTT
/mnt/bam/L001/Unmatched.bam Unmatched Unmatched N N
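The library parameter file repeats the barcode information from Step D1 and adds an output path per sample, so it can be derived from the same rows. The sketch below assumes that output files are named after the part of the sample alias following the date prefix, and that the unmatched row carries one N per barcode column; `library_params_text` is a hypothetical helper, not a Picard function.

```python
import csv
import io

# (SAMPLE_ALIAS, LIBRARY_NAME, BARCODE_1, BARCODE_2), copied from the example table.
rows = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGTNNNNNNNNN", "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGANNNNNNNNN", "CTGTACCA"),
    ("20180326-BN573-S3", "Mix1_Rep3", "TGAGCTAGNNNNNNNNN", "GAACGGTT"),
]

def library_params_text(rows, bam_dir):
    """Return the tab-delimited library parameter file content as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["OUTPUT", "SAMPLE_ALIAS", "LIBRARY_NAME", "BARCODE_1", "BARCODE_2"])
    for alias, library, barcode1, barcode2 in rows:
        # Assumption: drop the date prefix, so 20180326-BN573-S1 becomes BN573-S1.
        stem = alias.split("-", 1)[1]
        writer.writerow([f"{bam_dir}/{stem}_unmapped.bam", alias, library, barcode1, barcode2])
    # Reads that match no sample barcode are collected in a separate BAM file.
    writer.writerow([f"{bam_dir}/Unmatched.bam", "Unmatched", "Unmatched", "N", "N"])
    return buffer.getvalue()

print(library_params_text(rows, "/mnt/bam/L001"), end="")
```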
Step D6: Run IlluminaBasecallsToSam to convert sequencing
base-calls to short reads in the BAM files.
Step 6 of 6 in demultiplexing
# BARCODES_DIR is the output directory from Step D4.
# LANE: the run is processed one lane at a time.
# READ_STRUCTURE comes from Step D3.
# RUN_BARCODE is prefixed to the read names in the output.
# LIBRARY_PARAMS is the file created in Step D5.
# MOLECULAR_INDEX_TAG names the BAM tag that stores the UMI sequence.
java -Xmx4g -jar picard-2.9.0.jar IlluminaBasecallsToSam \
  BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
  BARCODES_DIR=/mnt/demodata/barcodes \
  LANE=1 \
  READ_STRUCTURE=100T8B9M8B100T \
  RUN_BARCODE=180326_BN573 \
  LIBRARY_PARAMS=/mnt/demodata/library_param.txt \
  TMP_DIR=/mnt/tmp \
  MOLECULAR_INDEX_TAG=RX \
  ADAPTERS_TO_CHECK=INDEXED \
  READ_GROUP_ID=BN573-S1 \
  NUM_PROCESSORS=8
BAM file created by IlluminaBasecallsToSam
• The reads in the BAM file generated by IlluminaBasecallsToSam are
not yet aligned to the reference genome.
• UMI sequence is in the RX tag.
• UMI sequence quality is in the QX tag.
• Sequencing adapter location is in the XT tag. Adapter sequence can
be trimmed using SamToFastq in Picard tools.
An example record from the BAM file:
180326_BN573:1:1101:10008:4281 77 * 0 0 * * 0 0
ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG
CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG
RG:Z:BN573-S1 XT:i:114 QX:Z:FFFFGGGG RX:Z:GGTAAAATG
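To confirm that the UMI was carried into the BAM file, the optional tags of a record can be inspected. The sketch below parses the example record as a tab-separated SAM text line (the form printed by a tool such as samtools view) using only the standard library; `sam_tags` is a hypothetical helper.

```python
# The example record, written as one tab-separated SAM line.
record = "\t".join([
    "180326_BN573:1:1101:10008:4281", "77", "*", "0", "0", "*", "*", "0", "0",
    "ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG",
    "CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG",
    "RG:Z:BN573-S1", "XT:i:114", "QX:Z:FFFFGGGG", "RX:Z:GGTAAAATG",
])

def sam_tags(line):
    """Return the optional TAG:TYPE:VALUE fields of a SAM line as a dict."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 SAM columns are mandatory
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags

tags = sam_tags(record)
flag = int(record.split("\t")[1])
print("UMI (RX):", tags["RX"])
print("adapter start (XT):", tags["XT"])
print("read unmapped:", bool(flag & 0x4))  # 0x4 is the SAM 'segment unmapped' bit
```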
Calling consensus using UMIs
Workflow for consensus
calling
Input: unmapped BAM with UMI tags (UMIs extracted from the sample index reads during demultiplexing)
Step C1: Align reads to reference genome. Output: mapped BAM without UMI tags
Step C2: Include UMI tags from unmapped BAM in the mapped BAM. Output: mapped BAM with UMI tags
Step C3: Group reads by UMIs. Output: mapped BAM with UMI family tags
Step C4: Call consensus. Output: unmapped BAM with consensus reads
Step C1,2: Aligning reads from unmapped BAM files to reference
genome, and including the UMI tags
Steps 1 and 2 of 4 in consensus calling
The following command consists of three steps:
1. Convert BAM to FASTQ
2. Align reads using BWA-MEM
3. Include UMI tags from the unmapped BAM in the mapped BAM
java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
  I=BN573-S1_unmapped.bam \
  F=/dev/stdout INTERLEAVE=true \
  | bwa mem -p -t 8 hg38.fa /dev/stdin \
  | java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
  UNMAPPED=BN573-S1_unmapped.bam ALIGNED=/dev/stdin \
  O=BN573-S1_mapped.bam R=hg38.fa \
  SORT_ORDER=coordinate MAX_GAPS=-1 \
  ORIENTATIONS=FR
Step C3: Grouping reads by UMIs
Step 3 of 4 in consensus calling
The reads are grouped into families that share the same UMI
java -Xmx4g -jar fgbio.jar GroupReadsByUmi \
  --input=BN573-S1_mapped.bam --output=BN573-S1_grouped.bam \
  --strategy=adjacency --edits=1 --min-map-q=20 \
  --assign-tag=MI
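The idea behind grouping with one allowed edit can be illustrated with a simplified sketch: reads that map to the same position are placed in one family when their UMIs differ by at most one base. This is not fgbio's implementation; its adjacency strategy also takes UMI abundance into account, which is ignored here. All names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))

def find(parent, umi):
    """Union-find lookup: follow parent links to the cluster representative."""
    while parent[umi] != umi:
        umi = parent[umi]
    return umi

def group_reads_by_umi(reads, max_edits=1):
    """reads: iterable of (read_name, position, umi). Returns {read_name: family_id}."""
    by_position = defaultdict(list)
    for name, position, umi in reads:
        by_position[position].append((name, umi))

    family_of = {}
    next_family = 0
    for position in sorted(by_position):
        members = by_position[position]
        umis = sorted({umi for _, umi in members})
        parent = {umi: umi for umi in umis}
        # Link every pair of UMIs within the allowed number of mismatches.
        for a, b in combinations(umis, 2):
            if hamming(a, b) <= max_edits:
                parent[find(parent, b)] = find(parent, a)
        ids = {}
        for name, umi in members:
            root = find(parent, umi)
            if root not in ids:
                ids[root] = next_family
                next_family += 1
            family_of[name] = ids[root]
    return family_of

reads = [
    ("r1", ("chr1", 100), "GGTAAAATG"),
    ("r2", ("chr1", 100), "GGTAAAATC"),  # one mismatch from r1: same family
    ("r3", ("chr1", 100), "CCCCCCCCC"),  # same position, unrelated UMI
    ("r4", ("chr1", 200), "GGTAAAATG"),  # same UMI, different position
]
print(group_reads_by_umi(reads))
```

In the grouped BAM, the family identifier plays the role of the MI tag assigned by GroupReadsByUmi.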
Step C4: Calling consensus
Step 4 of 4 in consensus calling
Consensus reads will be generated using fgbio’s CallMolecularConsensusReads
java -Xmx4g -jar fgbio.jar CallMolecularConsensusReads \
  --input=BN573-S1_grouped.bam \
  --output=BN573-S1_ssConsensus_unmapped.bam \
  --min-reads=1 \
  --rejects=BN573-S1_ssConsensus_rejected.bam \
  --min-input-base-quality=30 \
  --read-group-id=BN573-S1
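As an illustration of what consensus calling does within one UMI family, the sketch below takes a per-position majority vote over bases that meet a minimum input quality. It is a deliberately simplified stand-in: fgbio derives consensus bases and qualities from a statistical model, which is not reproduced here. The function name and the tie-handling rule are assumptions.

```python
from collections import Counter

def call_consensus(reads, min_reads=1, min_input_base_quality=30):
    """reads: list of (sequence, qualities) of equal length, qualities as Phred integers.

    Returns the consensus sequence, or None when the family has too few reads.
    """
    if len(reads) < min_reads:
        return None
    length = len(reads[0][0])
    consensus = []
    for i in range(length):
        # Only bases at or above the quality threshold get a vote.
        votes = Counter(seq[i] for seq, quals in reads
                        if quals[i] >= min_input_base_quality)
        ranked = votes.most_common(2)
        if not ranked:
            consensus.append("N")  # no usable base at this position
        elif len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append("N")  # tie between the top two bases
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)

family = [
    ("ACGT", [40, 40, 40, 40]),
    ("ACGT", [40, 40, 40, 40]),
    ("ACTT", [40, 40, 40, 40]),  # third base is a PCR or sequencing error
]
print(call_consensus(family))
```

Here the erroneous T at the third position is outvoted by the two reads carrying G, which is how a consensus read suppresses artifacts that appear in only a minority of a family.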
Workflow for post consensus-calling analysis
Input: unmapped BAM with single-strand consensus reads
Step P1: Align reads to reference genome. Output: mapped BAM without UMI tags
Step P2: Include UMI tags from unmapped BAM in the mapped BAM. Output: mapped BAM with UMI tags
Step P3: Filter consensus reads. Output: filtered consensus BAM
Step P4: Variant calling. Output: VCF
Step P1,2: Aligning reads from unmapped BAM files to reference genome and merging the UMI tags
Steps 1 and 2 of 4 in post-consensus calling analysis
The following command consists of three steps:
1. Converting BAM to FASTQ
2. Aligning reads using bwa mem
3. Including UMI tags from the unmapped BAM in the mapped BAM
java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
  I=BN573-S1_ssConsensus_unmapped.bam \
  F=/dev/stdout INTERLEAVE=true \
  | bwa mem -p -t 8 hg38.fa /dev/stdin \
  | java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
  UNMAPPED=BN573-S1_ssConsensus_unmapped.bam \
  ALIGNED=/dev/stdin \
  O=BN573-S1_ssConsensus_mapped.bam R=hg38.fa \
  SORT_ORDER=coordinate MAX_GAPS=-1 \
  ORIENTATIONS=FR
Step P3: Filtering consensus reads
There are two kinds of filtering of consensus reads:
1. Masking or filtering individual bases in reads
2. Filtering reads (i.e., not writing them to the output BAM file)
Step 3 of 4 in post-consensus calling analysis
java -Xmx4g -jar fgbio.jar FilterConsensusReads \
  --input=BN573-S1_ssConsensus_mapped.bam \
  --output=BN573-S1_ssConsensus_mapped_filtered.bam \
  --min-reads=3 \
  --min-base-quality=50 \
  --max-no-call-fraction=0.05
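The two kinds of filtering can be sketched on a single consensus read: low-quality bases are masked to N, and the read is dropped when it was built from too few raw reads or ends up with too many no-calls. This is a simplified illustration using the three thresholds from the command, not fgbio's implementation, which works from per-read and per-base tags in the BAM file. The function name and the single `depth` value per read are assumptions.

```python
def filter_consensus_read(bases, quals, depth,
                          min_reads=3, min_base_quality=50,
                          max_no_call_fraction=0.05):
    """Return the masked consensus sequence, or None if the read is filtered out.

    bases: consensus sequence; quals: Phred integers per base;
    depth: number of raw reads that contributed to the consensus.
    """
    if not bases or depth < min_reads:
        return None  # read-level filter: too little supporting evidence
    # Base-level filter: mask bases below the quality threshold.
    masked = "".join(base if qual >= min_base_quality else "N"
                     for base, qual in zip(bases, quals))
    # Read-level filter: too many no-calls after masking.
    if masked.count("N") / len(masked) > max_no_call_fraction:
        return None
    return masked

sequence = "ACGT" * 10            # 40 bases
high_quality = [60] * 40
one_low = [60] * 39 + [20]        # 1 of 40 masked: 2.5% no-calls, kept
three_low = [60] * 37 + [20] * 3  # 3 of 40 masked: 7.5% no-calls, dropped

print(filter_consensus_read(sequence, high_quality, depth=3))
print(filter_consensus_read(sequence, one_low, depth=3))
print(filter_consensus_read(sequence, three_low, depth=3))
print(filter_consensus_read(sequence, high_quality, depth=2))
```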
Step P4: Variant calling
Step 4 of 4 in post-consensus calling analysis
• Variant calling can be accomplished with the variant caller of your choice
• The following example shows how to use VarDictJava to generate a VCF file
VarDictJava/bin/VarDict \
  -G hg38.fa \
  -N tumor \
  -f 0.01 \
  -b BN573-S1_ssConsensus_mapped_filtered.bam \
  -z -c 1 -S 2 -E 3 -g 4 -th 4 target_regions.bed \
  | VarDictJava/VarDict/teststrandbias.R \
  | VarDictJava/VarDict/var2vcf_valid.pl -N tumor -E -f 0.01 \
  | awk '{if ($1 ~ /^#/) print; else if ($4 != $5) print}' \
  > BN573-S1.ssConsensus.VarDict.vcf
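The final awk stage keeps header lines and drops any record whose fourth and fifth columns (REF and ALT in a VCF file) are identical. For readers who prefer to do this step outside the shell, a minimal equivalent sketch follows; it assumes tab-delimited VCF lines, and `keep_vcf_line` is a hypothetical name.

```python
def keep_vcf_line(line):
    """Mirror the awk filter: keep headers and records where REF differs from ALT."""
    if line.startswith("#"):
        return True
    fields = line.rstrip("\n").split("\t")
    return fields[3] != fields[4]  # column 4 is REF, column 5 is ALT

# Made-up records for illustration only.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tC\tC\t50\tPASS\t.",  # REF equals ALT: not a variant
]
for line in vcf_lines:
    if keep_vcf_line(line):
        print(line)
```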
Tumor model system for benchmarking
• 25 ng of a 1% mixture (0.5% minimum allelic frequency) was used to
assess sensitivity and positive predictive value (PPV)
• Libraries were captured with a set of custom xGen Lockdown Probes
covering a total target area of ~35 kb
• Variant calling was performed with VarDict
Consensus analysis increases variant calling accuracy
[Figure panels: all expected variants at a 0.2% variant calling threshold; positive predictive value (PPV).]
THANK YOU
Take-home messages
• Building consensus sequences enables in silico error correction,
dramatically increasing variant calling specificity
• Due to the prevalence of artifacts arising from sample degradation,
PCR amplification and sequencing, consensus analysis is necessary
to accurately detect variants present below 1%
• xGen Dual Index UMI Adapters mitigate index switching and can
accurately assign rare variants in multiplexing studies
www.idtdna.com/UMI-techaccess
Sensitivity and specificity (PPV)
TP: True positive
FP: False positive
FN: False negative
PPV: Positive predictive value
Sensitivity = TP / (TP + FN)
Specificity (PPV) = TP / (TP + FP)
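Both metrics follow directly from the counts of true positives, false positives, and false negatives. A short sketch with made-up counts, chosen only to illustrate the arithmetic and not taken from the benchmarking data:

```python
def sensitivity(tp, fn):
    """Fraction of expected variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)

def ppv(tp, fp):
    """Fraction of called variants that are real: TP / (TP + FP)."""
    return tp / (tp + fp)

# Hypothetical counts for illustration only.
tp, fp, fn = 9, 3, 1
print(f"Sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV         = {ppv(tp, fp):.2f}")
```

Removing false positives raises PPV without changing sensitivity, as long as no true positives are lost in the process.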

Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Integrated DNA Technologies
 
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDTHigh efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDTIntegrated DNA Technologies
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsIntegrated DNA Technologies
 
Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...Integrated DNA Technologies
 
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...Integrated DNA Technologies
 

More from Integrated DNA Technologies (20)

Overcoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAsOvercoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAs
 
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymesIncreasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
 
The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...
 
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
 
Optimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editingOptimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editing
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
 
Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...
 
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotypingrhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
 
Unique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samplesUnique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samples
 
Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...
 
Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...
 
Cpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexesCpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexes
 
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
 
Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...
 
Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...
 
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDTHigh efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI tools
 
Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...
 
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
 
PrimeTime® qPCR products for gene expression
PrimeTime® qPCR products for gene expressionPrimeTime® qPCR products for gene expression
PrimeTime® qPCR products for gene expression
 

Recently uploaded

Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2John Carlo Rollon
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Masticationvidulajaib
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Module 4: Mendelian Genetics and Punnett Square
Module 4:  Mendelian Genetics and Punnett SquareModule 4:  Mendelian Genetics and Punnett Square
Module 4: Mendelian Genetics and Punnett SquareIsiahStephanRadaza
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaPraksha3
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxdharshini369nike
 
Cytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxCytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxVarshiniMK
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsCharlene Llagas
 

Recently uploaded (20)

Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Mastication
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Module 4: Mendelian Genetics and Punnett Square
Module 4:  Mendelian Genetics and Punnett SquareModule 4:  Mendelian Genetics and Punnett Square
Module 4: Mendelian Genetics and Punnett Square
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
Cytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxCytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 

Best practices for data analysis when using UMI adapters to improve variant detection

  • 1. Best practices for data analysis when using UMI adapters to improve variant detection. Wendy Lee, PhD, Staff Scientist
  • 2. Outline (UMI: unique molecular identifier)
      • Overview of NGS workflow that includes sample multiplexing
      • Overview of workflow with xGen® Dual Index UMI Adapters—Tech Access
      • Discussion of data analysis steps:
        – Extracting UMIs from sequencing reads
        – Constructing consensus reads within UMI families
      • Improving variant calling accuracy using consensus reads
  • 3. NGS workflow with xGen Dual Index UMI Adapters [figure: workflow diagram; labels include xGen Universal Blockers]
  • 4. xGen Dual Index UMI Adapters—Tech Access: 3-in-1 design
      • Designed for Illumina sequencers
      • Compatible with standard end-repair and A-tailing library construction, including PCR-free library methods
      • Dual unique sample indices reduce sample cross-talk
      • Degenerate 9-base UMI is incorporated for error correction and/or counting applications
  • 5. xGen Dual Index UMI Adapters—Tech Access: 3-in-1 design [figure]
  • 6. Consensus calling reduces artifacts in sequencing data [figure panels: Total reads; Dedup by start/stop positions; label: TP]
  • 7. Consensus calling reduces artifacts in sequencing data [figure panels: Total reads; Dedup by start/stop positions; Consensus reads (Min3); labels: TP, A UMI family]
  • 8. Consensus calling reduces artifacts in sequencing data [figure panels: Dedup by start/stop positions; Consensus reads (Min3); label: TP]
  • 9. Extracting UMIs within sample index reads during demultiplexing
  • 10. Assumptions and requirements
      • Sequencing data are generated on the Illumina platform
      • The following tools are installed in a Linux environment:
        – Picard, version 2.9.0
        – Burrows-Wheeler Aligner (BWA), version 0.7.15-r1140
        – Fgbio, version 0.5.0
        – VarDict Java
      • Access to the raw basecall data output from the sequencer
  • 11. Data analysis guidelines on IDT website: www.idtdna.com/UMI-techaccess
  • 12. Overall workflow
      Input: Illumina basecalls and sample sheet
      – Steps D1–6: Convert base calls to short reads with UMI information while demultiplexing NGS runs → short-read files with UMI info
      – Steps C1–4: Call consensus reads using UMIs
      – Steps P1–4: Post-consensus calling analysis → variant calls
  • 13. Extract UMIs from sample index reads through Illumina demultiplexing workflow
      Input: sample sheet
      – Step D1: Create the sample barcode input file → Barcode_file.txt
      – Step D2: Create output directory for storing the extracted barcodes from the sample index reads
      – Step D3: Determine the read structure → 100T8B9M8B100T
      – Step D4: Run ExtractIlluminaBarcodes (Picard) → extracted barcode files
      – Step D5: Create an input file to specify the output BAM file associated with each sample → Library_param.txt
      – Step D6: Run IlluminaBasecallsToSam (Picard) → unmapped BAM files
  • 16. Steps D1,2 (steps 1 and 2 of 6 in demultiplexing): Create a barcode file containing the sample barcode information for each sample.
  • 17. Steps D1,2: Create a barcode file containing the sample barcode information for each sample. (Steps 1 and 2 of 6 in demultiplexing)
      • UMI bases are represented as Ns in the barcode sequence
      • This is a tab-delimited file
      • In this example, we saved this file in /mnt/demodata/barcode_file.txt
      • In this example, we create an output directory in /mnt/demodata/barcodes
      Barcode file contents:
        barcode_name       library_name  barcode_sequence_1  barcode_sequence_2
        20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN   GCGCATAT
        20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN   CTGTACCA
        20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN   GAACGGTT
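The barcode file from Steps D1,2 could also be generated by a short script rather than typed by hand. The following is a minimal Python sketch, not part of the original guidelines: the function name `barcode_file_text` and the `samples` list are illustrative assumptions, while the sample values and the 9-base UMI length are taken from the slides.

```python
import csv
import io

UMI_LENGTH = 9  # degenerate 9-base UMI, as stated for the adapters

# (barcode_name, library_name, first index without UMI, second index)
samples = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGT", "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGA", "CTGTACCA"),
    ("20180326-BN573-S3", "Mix1_Rep3", "TGAGCTAG", "GAACGGTT"),
]


def barcode_file_text(rows, umi_length=UMI_LENGTH):
    """Return tab-delimited barcode file text, padding the first index with Ns for the UMI."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["barcode_name", "library_name",
                     "barcode_sequence_1", "barcode_sequence_2"])
    for name, library, index_1, index_2 in rows:
        writer.writerow([name, library, index_1 + "N" * umi_length, index_2])
    return buffer.getvalue()


text = barcode_file_text(samples)
print(text, end="")
```

Writing `text` to /mnt/demodata/barcode_file.txt would reproduce the example file shown on the slide.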
  • 18. Step D3 (step 3 of 6 in demultiplexing): Determine the read structure for running ExtractIlluminaBarcodes.
  • 19. Step D3: Determine the read structure for running ExtractIlluminaBarcodes. (Step 3 of 6 in demultiplexing)
      For xGen Dual Index UMI Adapters—Tech Access with a DNA insert of 100 bp, use the following read structure: 100T8B9M8B100T
        T – template (insert)
        B – sample barcode
        M – molecular index (UMI)
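To make the read structure notation concrete, the string can be split into its segments and summed to check that it accounts for every sequencing cycle. This is a small illustrative Python sketch; `parse_read_structure` is a hypothetical helper, not a Picard function, and it only handles the three segment types listed on the slide.

```python
import re

SEGMENT_NAMES = {
    "T": "template (insert)",
    "B": "sample barcode",
    "M": "molecular index (UMI)",
}


def parse_read_structure(read_structure):
    """Split a string such as 100T8B9M8B100T into (length, type) pairs."""
    segments = re.findall(r"(\d+)([TBM])", read_structure)
    # Reject input containing anything other than <number><T|B|M> groups.
    if "".join(length + kind for length, kind in segments) != read_structure:
        raise ValueError(f"unrecognized read structure: {read_structure}")
    return [(int(length), kind) for length, kind in segments]


segments = parse_read_structure("100T8B9M8B100T")
total_cycles = sum(length for length, _ in segments)

for length, kind in segments:
    print(f"{length:>4} cycles: {SEGMENT_NAMES[kind]}")
print("total cycles:", total_cycles)
```

Here the first index read contributes 8 barcode cycles plus 9 UMI cycles, which is why the barcode sequence in the barcode file is followed by nine Ns.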
  • 20. Step D4 (step 4 of 6 in demultiplexing): Run Picard ExtractIlluminaBarcodes to extract sample barcodes.
  • 21. Step D4: Run Picard ExtractIlluminaBarcodes to extract sample barcodes. (Step 4 of 6 in demultiplexing)
      Input:
      – BARCODE_FILE: barcode file created in Step D1
      – BASECALLS_DIR: directory with the sequencing basecall files
      – READ_STRUCTURE: 100T8B9M8B100T from Step D3
      – LANE: ExtractIlluminaBarcodes processes one lane at a time
      Output:
      1. A metrics file with the barcode extraction summary
      2. Extracted barcodes in the output directory created in Step D2
      Command:
        java -Xmx4g -jar picard-2.9.0.jar ExtractIlluminaBarcodes \
          BARCODE_FILE=/mnt/demodata/barcode_file.txt \
          BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
          READ_STRUCTURE=100T8B9M8B100T \
          LANE=1 \
          OUTPUT_DIR=/mnt/demodata/barcodes \
          METRICS_FILE=/mnt/demodata/barcode_metrics.txt
  • 22. Step D5 (step 5 of 6 in demultiplexing): Create a tab-delimited file to specify the BAM file for each sample in the sequencing run with the corresponding barcode sequence(s).
  • 23. Step D5: Create a tab-delimited file to specify the BAM file for each sample in the sequencing run with the corresponding barcode sequence(s). (Step 5 of 6 in demultiplexing)
      In this example, we saved this file in /mnt/demodata/library_param.txt. Be sure to create the output directory for the BAM files. In this example, the output directory is /mnt/bam/L001.
      Library parameter file contents:
        OUTPUT                               SAMPLE_ALIAS       LIBRARY_NAME  BARCODE_1          BARCODE_2
        /mnt/bam/L001/BN573-S1_unmapped.bam  20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN  GCGCATAT
        /mnt/bam/L001/BN573-S2_unmapped.bam  20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN  CTGTACCA
        /mnt/bam/L001/BN573-S3_unmapped.bam  20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN  GAACGGTT
        /mnt/bam/L001/Unmatched.bam          Unmatched          Unmatched     N
  • 24. Step D6 (step 6 of 6 in demultiplexing): Run IlluminaBasecallsToSam to convert sequencing base calls to short reads in the BAM files.
  • 25. Step D6: Run IlluminaBasecallsToSam to convert sequencing base calls to short reads in BAM files. (Step 6 of 6 in demultiplexing)
      Command:
        java -Xmx4g -jar picard-2.9.0.jar IlluminaBasecallsToSam \
          BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
          BARCODES_DIR=/mnt/demodata/barcodes \
          LANE=1 \
          READ_STRUCTURE=100T8B9M8B100T \
          RUN_BARCODE=180326_BN573 \
          LIBRARY_PARAMS=/mnt/demodata/library_param.txt \
          TMP_DIR=/mnt/tmp \
          MOLECULAR_INDEX_TAG=RX \
          ADAPTERS_TO_CHECK=INDEXED \
          READ_GROUP_ID=BN573-S1 \
          NUM_PROCESSORS=8
      Notes on the parameters:
      – BARCODES_DIR: extracted barcodes from Step D4
      – LANE: process by lane
      – READ_STRUCTURE: from Step D3
      – RUN_BARCODE: prefixed to the read names in the output
      – LIBRARY_PARAMS: file created in Step D5
      – MOLECULAR_INDEX_TAG: BAM tag that stores the UMI sequence
  • 26. BAM file created by IlluminaBasecallsToSam
      • The reads in the BAM file generated by IlluminaBasecallsToSam are not yet aligned to the reference genome.
      • UMI sequence is in the RX tag.
      • UMI sequence quality is in the QX tag.
      • Sequencing adapter location is in the XT tag. Adapter sequence can be trimmed using SamToFastq in Picard tools.
      An example record from the BAM file (fields shown on separate lines):
        180326_BN573:1:1101:10008:4281 77 * 0 0 * * 0 0
        ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG
        CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG
        RG:Z:BN573-S1 XT:i:114 QX:Z:FFFFGGGG RX:Z:GGTAAAATG
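As a quick check that the UMI landed in the expected tag, the optional fields of a SAM-formatted record can be read with a few lines of code. The sketch below is illustrative only: it rebuilds the example record from the slide as one tab-delimited line, and `optional_tags` is a hypothetical helper rather than part of Picard or fgbio. In practice the BAM file would first be converted to text (for example with `samtools view`) or read with a BAM library.

```python
# Example record from the slide, rebuilt as a single tab-delimited SAM line.
record = "\t".join([
    "180326_BN573:1:1101:10008:4281", "77", "*", "0", "0", "*", "*", "0", "0",
    "ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG",
    "CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG",
    "RG:Z:BN573-S1", "XT:i:114", "QX:Z:FFFFGGGG", "RX:Z:GGTAAAATG",
])


def optional_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of a SAM line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 SAM fields are mandatory
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags


tags = optional_tags(record)
print("UMI (RX):", tags["RX"])
print("UMI quality (QX):", tags["QX"])
print("Adapter start (XT):", tags["XT"])
```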
  • 28. Workflow for consensus calling
      Input: unmapped BAM with UMI tags (UMIs extracted from the sample index during demultiplexing)
      – Step C1: Align reads to reference genome → mapped BAM without UMI tags
      – Step C2: Include UMI tags from unmapped BAM in the mapped BAM → mapped BAM with UMI tags
      – Step C3: Group reads by UMIs → mapped BAM with UMI family tags
      – Step C4: Call consensus → unmapped BAM with consensus reads
  • 29. Steps C1,2 (steps 1 and 2 of 4 in consensus calling): Aligning reads from unmapped BAM files to reference genome, and including the UMI tags
  • 30. Steps C1,2: Aligning reads from unmapped BAM files to reference genome, and including the UMI tags (Steps 1 and 2 of 4 in consensus calling)
      The following command consists of three steps:
      1. Convert BAM to FASTQ
      2. Align reads using BWA-MEM
      3. Include UMI tags from the unmapped BAM in the mapped BAM
      Command:
        java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
          I=BN573-S1_unmapped.bam F=/dev/stdout INTERLEAVE=true \
        | bwa mem -p -t 8 hg38.fa /dev/stdin \
        | java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
          UNMAPPED=BN573-S1_unmapped.bam ALIGNED=/dev/stdin \
          O=BN573-S1_mapped.bam R=hg38.fa \
          SORT_ORDER=coordinate MAX_GAPS=-1 ORIENTATIONS=FR
  • 31. Step C3 (step 3 of 4 in consensus calling): Grouping reads by UMIs
  • 32. Step C3: Grouping reads by UMIs (Step 3 of 4 in consensus calling)
      The reads are grouped into families that share the same UMI.
      Command:
        java -Xmx4g -jar fgbio.jar GroupReadsByUmi \
          --input=BN573-S1_mapped.bam \
          --output=BN573-S1_grouped.bam \
          --strategy=adjacency --edits=1 --min-map-q=20 \
          --assign-tag=MI
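The idea behind grouping with `--strategy=adjacency --edits=1` can be illustrated with a simplified sketch. This is not fgbio's implementation: it assumes all reads shown already share the same mapping position, uses Hamming distance between equal-length UMIs, and applies a count-gradient rule (a more abundant UMI absorbs a neighbor only if its count is at least twice the neighbor's count minus one) in the style of directional adjacency methods. The names `hamming` and `group_umis` are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, max_edits=1):
    """Assign each distinct UMI to a family ID (simplified adjacency grouping)."""
    counts = Counter(umis)
    # Visit the most abundant UMIs first; break ties alphabetically.
    ordered = sorted(counts, key=lambda u: (-counts[u], u))
    family_of = {}
    next_id = 0
    for root in ordered:
        if root in family_of:
            continue
        family_of[root] = next_id
        frontier = [root]
        while frontier:
            parent = frontier.pop()
            for child in ordered:
                if child in family_of:
                    continue
                close = hamming(parent, child) <= max_edits
                gradient = counts[parent] >= 2 * counts[child] - 1
                if close and gradient:
                    family_of[child] = next_id
                    frontier.append(child)
        next_id += 1
    return family_of


# Hypothetical UMIs observed at one mapping position.
observed = ["GGTAAAATG"] * 5 + ["GGTAAAATC"] + ["CCCCCCCCC"] * 2
families = group_umis(observed)
for umi, family_id in sorted(families.items(), key=lambda kv: kv[1]):
    print(family_id, umi)
```

In this example the single read with UMI GGTAAAATC is treated as a sequencing error of GGTAAAATG and joins its family, while CCCCCCCCC forms a separate family.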
  • 33. Step C4 (step 4 of 4 in consensus calling): Calling consensus
  • 34. Step C4: Calling consensus (Step 4 of 4 in consensus calling)
      Consensus reads will be generated using fgbio's CallMolecularConsensusReads.
      Command:
        java -Xmx4g -jar fgbio.jar CallMolecularConsensusReads \
          --input=BN573-S1_grouped.bam \
          --output=BN573-S1_ssConsensus_unmapped.bam \
          --min-reads=1 \
          --rejects=BN573-S1_ssConsensus_rejected.bam \
          --min-input-base-quality=30 \
          --read-group-id=BN573-S1
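To show what calling a consensus within a UMI family means, here is a deliberately simplified Python sketch based on a per-position majority vote. It is an assumption-laden illustration, not the model used by CallMolecularConsensusReads, which takes base qualities into account. The function name `call_consensus` and the example reads are hypothetical.

```python
from collections import Counter


def call_consensus(reads, min_reads=1):
    """Majority-vote consensus of one UMI family; ties become N.

    Returns None when the family has fewer than min_reads reads.
    """
    if len(reads) < min_reads:
        return None
    length = min(len(read) for read in reads)
    consensus = []
    for position in range(length):
        column = Counter(read[position] for read in reads)
        ranked = column.most_common(2)
        top_base, top_count = ranked[0]
        # A tie between the two most frequent bases is reported as a no-call.
        if len(ranked) > 1 and ranked[1][1] == top_count:
            consensus.append("N")
        else:
            consensus.append(top_base)
    return "".join(consensus)


# Hypothetical family of three reads; the third carries an isolated error.
family = ["ACGTACGT", "ACGTACGT", "ACGTTCGT"]
consensus_read = call_consensus(family, min_reads=1)
print(consensus_read)
```

Because the error appears in only one of the three reads, it does not reach the consensus read, which is how consensus calling removes PCR and sequencing artifacts.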
  • 35. Workflow for post-consensus calling analysis
      Input: unmapped BAM with consensus reads
      – Step P1: Align reads to reference genome → mapped BAM without UMI tags
      – Step P2: Include UMI tags from unmapped BAM in the mapped BAM → mapped BAM with UMI tags
      – Step P3: Filter consensus reads → filtered consensus BAM
      – Step P4: Variant calling → VCF
  • 36. Steps P1,2 (steps 1 and 2 of 4 in post-consensus calling analysis): Aligning reads from unmapped BAM files with single-strand consensus reads to reference genome and merging the UMI tags
  • 37. Step P1,2: Aligning reads from unmapped BAM files to reference genome and merging the UMI tags (steps 1 and 2 of 4 in post-consensus calling analysis). The following piped command performs three operations: 1. Converting BAM to FASTQ 2. Aligning reads using bwa mem 3. Including UMI tags from the unmapped BAM in the mapped BAM. java -Xmx4g -jar picard-2.9.0.jar SamToFastq I=BN573-S1_ssConsensus_unmapped.bam F=/dev/stdout INTERLEAVE=true | bwa mem -p -t 8 hg38.fa /dev/stdin | java -Xmx4g -jar picard.jar MergeBamAlignment UNMAPPED=BN573-S1_ssConsensus_unmapped.bam ALIGNED=/dev/stdin O=BN573-S1_ssConsensus_mapped.bam R=hg38.fa SORT_ORDER=coordinate MAX_GAPS=-1 ORIENTATIONS=FR
  • 38. Step P3: Filtering consensus reads (step 3 of 4 in post-consensus calling analysis). [Workflow diagram repeated from slide 35, with the input labeled "Unmapped BAM with single strand consensus reads".]
  • 39. Step P3: Filtering consensus reads (step 3 of 4 in post-consensus calling analysis). There are two kinds of filtering of consensus reads: 1. Masking or filtering individual bases in reads 2. Filtering reads (i.e., not writing them to the output BAM file). java -Xmx4g -jar fgbio.jar FilterConsensusReads --input=BN573-S1_ssConsensus_mapped.bam --output=BN573-S1_ssConsensus_mapped_filtered.bam --min-reads=3 --min-base-quality=50 --max-no-call-fraction=0.05
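The two kinds of filtering can be sketched as follows. This is an assumed simplification, not fgbio's code: low-quality consensus bases are masked to N, and a read is dropped when it has too little raw-read support or too many no-calls. The function name and the `depth` argument are hypothetical; the default thresholds mirror the command above.

```python
def filter_consensus_read(bases, quals, depth,
                          min_reads=3, min_base_quality=50,
                          max_no_call_fraction=0.05):
    """Mask low-quality bases, then decide whether to keep the read.

    bases: consensus sequence; quals: per-base consensus qualities;
    depth: number of raw reads behind the consensus (hypothetical input).
    Returns the masked sequence, or None if the read is filtered out.
    """
    if depth < min_reads or not bases:
        return None  # read-level filter: too few supporting reads
    # Base-level filter: mask bases below the quality threshold.
    masked = "".join(
        b if q >= min_base_quality else "N" for b, q in zip(bases, quals)
    )
    # Read-level filter: too many no-calls after masking.
    if masked.count("N") / len(masked) > max_no_call_fraction:
        return None
    return masked

bases = "ACGT" * 10              # 40-base consensus read
quals = [60] * 40
quals[5] = 20                    # one low-quality base: 1/40 = 0.025 no-calls
kept = filter_consensus_read(bases, quals, depth=4)
print(kept)

low_support = filter_consensus_read(bases, quals, depth=2)
print(low_support)               # None: fewer than 3 supporting reads
```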
  • 40. Step P4: Variant calling (step 4 of 4 in post-consensus calling analysis). [Workflow diagram repeated from slide 35, with the input labeled "Unmapped BAM with single strand consensus reads".]
  • 41. Step P4: Variant calling (step 4 of 4 in post-consensus calling analysis). • Variant calling can be accomplished with the variant caller of your choice • The following example shows how to use VarDictJava to generate a VCF file: VarDictJava/bin/VarDict -G hg38.fa -N tumor -f 0.01 -b BN573-S1_ssConsensus_mapped_filtered.bam -z -c 1 -S 2 -E 3 -g 4 -th 4 target_regions.bed | VarDictJava/VarDict/teststrandbias.R | VarDictJava/VarDict/var2vcf_valid.pl -N tumor -E -f 0.01 | awk '{if ($1 ~/^#/) print; else if ($4 != $5) print}' > BN573-S1.ssConsensus.VarDict.vcf
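The final awk stage of the command keeps VCF header lines and drops any record whose REF (column 4) equals its ALT (column 5). A minimal Python sketch of that filter is below. The function name and sample records are hypothetical, and the sketch assumes well-formed, tab-separated records.

```python
def drop_ref_equals_alt(vcf_lines):
    """Keep header lines and records where REF differs from ALT."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header and metadata lines pass through
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 4 is REF and column 5 is ALT in the VCF layout.
        if len(fields) >= 5 and fields[3] != fields[4]:
            kept.append(line)
    return kept

# Made-up records for illustration only.
lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tC\t.\tPASS\t.",  # REF == ALT, dropped
]
filtered = drop_ref_equals_alt(lines)
print("\n".join(filtered))
```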
  • 42. Tumor model system for benchmarking • 25 ng of a 1% mixture (0.5% minimum allelic frequency) was used to assess sensitivity and positive predictive value (PPV) • Libraries were captured with a set of custom xGen Lockdown Probes covering a total target area of ~35 kb • Variant calling was performed with VarDict
  • 43. Consensus analysis increases variant calling accuracy. [Figure; recoverable labels: All expected variants; 0.2% variant calling threshold; Positive predictive value (PPV).]
  • 45. Take-home messages • Building consensus sequences enables in silico error correction, dramatically increasing variant calling specificity • Due to the prevalence of artifacts arising from sample degradation, PCR amplification, and sequencing, consensus analysis is necessary to accurately detect variants present below 1% • xGen Dual Index UMI Adapters mitigate index switching and can accurately assign rare variants in multiplexing studies • www.idtdna.com/UMI-techaccess
  • 46. Sensitivity and specificity (PPV). TP: True positive; FP: False positive; FN: False negative; PPV: Positive predictive value. Sensitivity = TP / (TP + FN). Specificity (PPV) = TP / (TP + FP).
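The two metrics can be computed directly from the counts. The sketch below uses made-up counts purely for illustration; they are not results from the benchmarking experiment described in the slides.

```python
def sensitivity(tp, fn):
    """Fraction of expected variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else float("nan")

def ppv(tp, fp):
    """Fraction of called variants that are real: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else float("nan")

# Hypothetical counts, for illustration only.
tp, fp, fn = 18, 6, 2
print(f"Sensitivity: {sensitivity(tp, fn):.2f}")  # 0.90
print(f"PPV:         {ppv(tp, fp):.2f}")          # 0.75
```

Removing false positives through consensus calling raises PPV without changing sensitivity, as long as the true positives are retained.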