This document describes Assign 2.0, a software program for quality control analysis of DNA sequencing data for high-throughput HLA typing. The software analyzes Phred quality values (PQV) from sequencing runs to provide quality scores for individual samples and entire runs. PQV are highly reproducible between samples at conserved positions but are lower for heterozygous than for homozygous calls. Assign 2.0 calculates the mean and standard deviation of PQV for samples and runs and compares them to target values determined from previous runs, so that the accuracy and precision of sequencing quality can be monitored over time. The software enables the automated quality control monitoring needed for high-throughput HLA sequencing-based typing.
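The mean/standard-deviation comparison described above can be illustrated with a minimal Python sketch. It is not the Assign 2.0 implementation: the function names, the example values, and the two-standard-deviation flagging rule are assumptions made for illustration. Only the Phred definition, Q = -10 * log10(P), is standard.

```python
from statistics import mean, stdev


def phred_to_error_prob(q):
    """Convert a Phred quality value to an error probability (Q = -10 * log10(P))."""
    return 10 ** (-q / 10)


def summarize_pqv(pqv):
    """Return the mean and sample standard deviation of a list of PQV."""
    return mean(pqv), stdev(pqv)


def flag_sample(pqv, target_mean, target_sd, n_sd=2.0):
    """Flag a sample whose mean PQV is more than n_sd target SDs below the target mean.

    The n_sd threshold is a hypothetical rule, not one taken from Assign 2.0.
    """
    sample_mean, _ = summarize_pqv(pqv)
    return sample_mean < target_mean - n_sd * target_sd


# Made-up PQV for two samples and made-up targets from "previous runs".
good_sample = [52, 55, 49, 58, 54, 51]
poor_sample = [31, 28, 35, 22, 30, 27]
target_mean, target_sd = 50.0, 5.0

print(flag_sample(good_sample, target_mean, target_sd))
print(flag_sample(poor_sample, target_mean, target_sd))
```

The same summary could be computed per run by pooling the PQV of all samples in that run before calling `summarize_pqv`.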
The document discusses curating sequence and literature data for RefSeq and Gene at the National Center for Biotechnology Information. It provides an overview of RefSeq, describing what RefSeq is, how it compares to GenBank, its advantages, and how the RefSeq dataset is built through curated data and sequence analysis. It then discusses the curation process in depth, including examples of curating genes, transcripts, proteins, and literature. It also describes the tools and quality assurance checks used in curation.
This document summarizes benchmarking of germline small variant calling using Genome in a Bottle (GIAB) reference materials. It highlights best practices for benchmarking, including using benchmarking tools like hap.py and stratified performance metrics. It demonstrates benchmarking an Illumina HiSeq dataset aligned and called against GRCh37 using hap.py and stratifications from the GA4GH benchmarking tool. The results show precision and recall metrics with confidence intervals to evaluate performance across variant classes and difficulty levels. Ongoing work includes developing GIAB resources for GRCh38 and structural variants.
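As a rough illustration of the metrics mentioned above, the following Python sketch computes precision and recall from true-positive, false-positive and false-negative counts and attaches a confidence interval. It does not call hap.py. The Wilson score interval is an assumed choice of interval method, and the counts are made up.

```python
import math


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if total == 0:
        return float("nan"), float("nan")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Made-up counts for one variant class in one stratification.
tp, fp, fn = 90, 10, 30
precision, recall = precision_recall(tp, fp, fn)
precision_ci = wilson_interval(tp, tp + fp)
recall_ci = wilson_interval(tp, tp + fn)
print(precision, precision_ci)
print(recall, recall_ci)
```

Running the same two functions once per stratum (for example per variant type or per difficult-region class) gives the kind of stratified table the summary refers to.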
Enabling CNV Studies from Single Cells Using Whole Genome Amplification and L... (QIAGEN)
DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer. While array comparative genomic hybridization (aCGH) has generally been used to identify CNVs in the whole genome, next-generation sequencing (NGS) provides an opportunity to characterize CNVs genome-wide with unprecedented resolution, even at the single cell level.
However, CNV detection in single cells is faced with various challenges, such as incomplete genome coverage, introduction of sequence errors, GC bias and false positives.
In this new poster, we show a method for capturing the entire genomic complexity of a single cell, overcoming these challenges and ensuring accurate detection of CNVs.
This document summarizes a study comparing RNA sequencing (RNA-Seq) results from challenging sample types amplified using NuGEN Technologies' Ovation RNA-Seq and Ovation RNA-Seq FFPE systems. The study found that both systems produced high-quality sequencing data from as little as 500 picograms or 100 nanograms of total RNA, respectively, without requiring rRNA reduction or polyA selection. Differential expression analysis of RNA from formalin-fixed paraffin-embedded (FFPE) samples showed high concordance with matched fresh frozen samples. The results demonstrate the ability to reliably study disease using archived FFPE samples.
This document discusses GeneRead DNAseq Targeted Exon Enrichment and the GeneRead Library Quantification System for next generation sequencing. It begins with an introduction and agenda, then discusses targeted enrichment including the workflow, principles, data analysis, pathway content, performance data, and an application example. It also discusses library quantification including the workflow and an application example. In summary, the document presents Qiagen's GeneRead DNAseq and Library Quant systems as targeted enrichment and library quantification solutions for next generation sequencing applications.
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research... (Golden Helix Inc)
This document summarizes Golden Helix's capabilities for handling big data in genomics. It discusses how Golden Helix's software tools like VarSeq and the Variant Storage Warehouse can scale to handle large volumes of genomic and clinical data from whole genome sequencing, exome sequencing, and large population studies. It provides examples of how these tools have been used for clinical and research applications like variant filtering, annotation, and analysis of rare variants. The presentation concludes with a demonstration of the software.
1) The document discusses a study analyzing the impact of gene length on detecting differentially expressed genes using RNA-seq technology.
2) The study will first test the reproducibility of RNA-seq and the effect of normalization. It will then compare different statistical tests for identifying differentially expressed genes.
3) Finally, the study will specifically test how gene length impacts the likelihood of a gene being identified as differentially expressed, as longer genes are easier to map with short reads.
2013-Blomquist-Targeted RNA-sequencing with competitive multiplex-PCR amplico... (Ji-Youn Yeo)
This document describes a new targeted RNA sequencing method using competitive multiplex PCR to generate amplicon libraries. It aims to address limitations of existing targeted RNA sequencing approaches by 1) controlling for inter-library variation in measurement of transcript expression, and 2) reducing the large number of sequencing reads required to quantify transcripts across a wide range of expression. The method involves amplifying native RNA targets alongside known quantities of competitive internal standard templates. This normalization approach causes amplification products to converge toward equimolar concentrations, improving reproducibility and allowing accurate quantification of transcripts using fewer total sequencing reads. Validation studies demonstrated excellent reproducibility, concordance with other methods, and ability to quantify over 100 transcripts across a 10^7-fold expression range using only 1.46 × 10^5 sequencing
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma... (GenomeInABottle)
The document discusses Genome in a Bottle (GIAB) and its efforts to characterize human genomes and provide reference materials and benchmarks to evaluate genome sequencing and variant calling. Specifically, it summarizes how GIAB has characterized 7 human genomes, provides extensive public sequencing data for benchmarking, and is now using linked and long reads to expand the small variant benchmark set, develop a structural variant benchmark, and perform diploid assembly of difficult regions. It also shows how new benchmarks that include more difficult regions have revealed errors in previous benchmarks and reduced performance metrics for variant calling tools.
The document summarizes the Genome in a Bottle (GIAB) project, which aims to develop reference materials and benchmarks for evaluating human genome sequencing. GIAB has characterized 7 human genomes to high accuracy using multiple sequencing technologies and bioinformatics analyses. The characterized genomes and variant calls are made publicly available to benchmark sequencing performance. Recently, GIAB has incorporated linked and long read sequencing to expand reference benchmarks to more difficult genomic regions and develop benchmarks for structural variants.
This document provides information about different sequencing platforms and their characteristics. It compares Illumina, PacBio and Oxford Nanopore platforms in terms of average read length, advantages, limitations and recommended material. It also provides a comparison table of long-range sequencing platforms including their throughput per run, number of human genomes per run and cost per genome.
This document summarizes the developmental validation of a quantitative approach to short tandem repeat (STR) sequencing using next-generation sequencing (NGS). The method aims to provide strict quantitative analysis (input equals output) and was tested for sensitivity, precision, and accuracy across a range of DNA inputs from 15-500 pg. Results showed high correlation between DNA input and NGS output. The non-normalized, automated workflow demonstrated high sensitivity, precision, and accuracy for applications like mixture interpretation and low-level DNA analysis.
Comparison of Different NGS Library Construction Methods for Single-Cell Sequ... (QIAGEN)
Recent advances in whole genome amplification (WGA), whole transcriptome amplification (WTA) technologies and next-generation sequencing (NGS) have enabled whole genome or transcriptome sequencing at the single-cell level. Single-cell sequencing studies have yielded new insights into the heterogeneity of the genome and transcriptome in individual cells. Such heterogeneity at the single-cell level has been shown to be closely related to cellular function, differentiation, development, and diseases. A critical element of the single-cell sequencing workflow is sequencing library construction following WGA or WTA. An efficient library construction method is required to convert a high percentage of the DNA fragments to an adaptor-ligated sequencing library and to ensure high sequence complexity of the library. Furthermore, uniform representation of all genomic regions in a sequencing library is essential for retaining all important sequence information.
Here we compared 2 library construction methods following a REPLI-g MDA-mediated WGA or WTA:
• A ligation-based library construction method using a GeneRead Library Prep Kit (QIAGEN)
• A ‘tagmentation’-based method using Nextera DNA Sample Prep Kit (Illumina), which simultaneously fragments and tags DNA.
Our results demonstrated that the Nextera library construction method can be directly used with the REPLI-g-amplified DNA following MDA reaction, without the need for DNA purification. This could be beneficial if working with a high number of samples or if the complete workflow of single WGA/WTA and library construction should be automated. However, compared with the tagmentation method, the ligation-based library construction method is more flexible with regard to the input DNA amount and delivers sequencing libraries with higher complexity and less bias. This is critical for sensitive applications, such as identification of genomic variants or comprehensive profiling of transcriptomes.
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511 (GenomeInABottle)
This document provides an overview of the Genome in a Bottle (GIAB) Consortium's efforts to develop human genome reference materials and benchmarks for evaluating genome sequencing and variant calling. It summarizes the characterization of 7 human genomes, including developing variant calls, regions, and reference values. It also describes new efforts using linked and long reads to characterize structural variants and difficult genomic regions. The goal is to provide reference materials and benchmarks to help evaluate sequencing performance and accuracy across different technologies and algorithms.
Next-generation DNA sequencing technologies have significantly impacted genetics research. Three major platforms - Roche/454, Illumina Genome Analyzer, and Applied Biosystems SOLiD - utilize massively parallel sequencing to generate large amounts of sequence data. Roche/454 uses emulsion PCR to amplify DNA fragments on beads and pyrosequencing to determine sequences. Illumina performs bridge amplification on a flow cell to generate DNA clusters then sequences by synthesis. Applied Biosystems SOLiD uses ligation-based sequencing. These new methods have enabled genome-wide studies and applications such as ancient DNA sequencing and metagenomics that were previously difficult or impossible.
Speaker: Benedict C. S. Cross, PhD, Team leader (Discovery Screening), Horizon Discovery
CRISPR–Cas9 mediated genome editing provides a highly efficient way to probe gene function. Using this technology, thousands of genes can be knocked out and their function assessed in a single experiment. We have conducted over 150 of these complex and powerful screens and will use our experience to guide you through the process of screen design, performance and analysis.
We'll be discussing:
• How to use CRISPR screening for target ID and validation, understanding drug MOA and patient stratification
• The screen design, quality control and how to evaluate success of your screening program
• Horizon’s latest developments to the platform
• Horizon’s novel approaches to target validation screening
This document summarizes the process used to benchmark large deletion calls from multiple sequencing technologies and bioinformatics pipelines. Researchers merged deletion calls from 14 datasets into regions and evaluated call size accuracy. Calls supported by two or more technologies were identified as draft benchmark calls. Sensitivity to these calls was calculated for each method. The results provide insight into strengths and weaknesses of different approaches to structural variant detection.
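The merge-and-support idea above can be sketched in Python as follows. This is a simplified illustration, not the pipeline used in the study: merging by simple coordinate overlap, the data layout, and the example calls are all assumptions, and the call-size accuracy evaluation is not covered.

```python
def merge_calls(calls):
    """Merge overlapping deletion calls into regions and record supporting technologies.

    calls: list of (chrom, start, end, technology) tuples.
    Calls on the same chromosome whose coordinates overlap or touch are merged.
    """
    regions = []
    for chrom, start, end, tech in sorted(calls):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            last["end"] = max(last["end"], end)
            last["techs"].add(tech)
        else:
            regions.append({"chrom": chrom, "start": start, "end": end, "techs": {tech}})
    return regions


def draft_benchmark(regions, min_techs=2):
    """Keep regions supported by at least min_techs distinct technologies."""
    return [r for r in regions if len(r["techs"]) >= min_techs]


def sensitivity(benchmark, tech):
    """Fraction of draft benchmark regions that the given technology detected."""
    if not benchmark:
        return float("nan")
    return sum(tech in r["techs"] for r in benchmark) / len(benchmark)


# Made-up deletion calls from three technologies.
calls = [
    ("chr1", 100, 500, "illumina"),
    ("chr1", 120, 480, "pacbio"),
    ("chr1", 1000, 1400, "illumina"),
    ("chr2", 50, 300, "pacbio"),
    ("chr2", 60, 310, "bionano"),
]
regions = merge_calls(calls)
benchmark = draft_benchmark(regions)
for tech in ("illumina", "pacbio", "bionano"):
    print(tech, sensitivity(benchmark, tech))
```

Note that a sensitivity computed this way is measured against a benchmark the method itself helped define, so it is best read as a relative comparison between methods.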
Genome in a Bottle is working to characterize difficult variants in human genomes to enable benchmarking of sequencing technologies and bioinformatics methods. They have extensively characterized five human genomes and are now focusing on large insertions, deletions, and structural variants over 20 base pairs. This work presents many challenges due to limitations in detection and representation of large variants. Genome in a Bottle is integrating calls from multiple technologies and approaches to refine sequence-resolved variants and provide benchmark variant call files.
Pooja Patel is seeking a laboratory technician position to further her experience in the molecular diagnostic field. She has a Bachelor of Science in Biochemistry from the University of Texas at Austin and a Bachelor of Science in Molecular Genetic Technology from the University of Texas at MD Anderson. Patel has over 3 years of laboratory experience in DNA extraction, PCR, sequencing, and bioinformatics. She is proficient in molecular biology techniques and is seeking to expand her skills in a full-time laboratory role.
The document discusses GeneRead DNAseq Targeted Exon Enrichment and GeneRead Library Quantification System for Next Generation Sequencing. It provides an overview of the targeted enrichment workflow and principles, pathway-focused analysis tools, library quantification workflow, and performance data. The targeted enrichment panels allow users to focus sequencing on genes of interest, improve detection of low prevalence mutations from poor quality samples. The library quantification system uses qPCR to accurately quantify sequencing libraries and assess sample quality before NGS runs.
Using VarSeq to Improve Variant Analysis Research Workflows (Delaina Hawkins)
Many questions must be answered when analyzing DNA sequence variants: How do I determine which variants are potentially deleterious? Is the sequencing quality sufficient? How do I prioritize the results? Which annotation sources may help answer my research question?
In this webinar presentation, we will review workflow strategies for quality control and analysis of DNA sequence variants using the VarSeq software package from Golden Helix. VarSeq is a powerful platform for analysis of DNA sequence variants in clinical and translational research settings. VarSeq provides researchers with easy access to curated public databases of variant annotation information, and also enables users to incorporate their own local databases or downloaded information about variants and genomic regions.
The presentation will include interactive demonstrations using VarSeq to analyze variants found by exome sequencing of an extended family with a complex disease. We will review strategies for assessing variant quality, applying genomic annotations, incorporating custom annotation sources, and creating variant filters in VarSeq. We will also demonstrate the PhoRank gene ranking algorithm and its application for prioritizing variants.
Using VarSeq to Improve Variant Analysis Research Workflows (Golden Helix Inc)
In this webinar presentation, we will review workflow strategies for quality control and analysis of DNA sequence variants using the VarSeq software package from Golden Helix. VarSeq is a powerful platform for analysis of DNA sequence variants in clinical and translational research settings. VarSeq provides researchers with easy access to curated public databases of variant annotation information, and also enables users to incorporate their own local databases or downloaded information about variants and genomic regions.
GIAB Integrating multiple technologies to form benchmark SVs 180517 (GenomeInABottle)
Genome in a Bottle aims to provide well-characterized human genomes as benchmarks to validate genome sequencing and variant calling. The presentation describes five genomes that have been analyzed to provide benchmark calls for simple and some complex variants, though many challenges remain, particularly for structural variants and difficult genomic regions. Integration of multiple data types and analyses from diverse technologies is key to improving benchmark calls over time in an open and transparent manner.
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923 (GenomeInABottle)
The Genome in a Bottle Consortium has used accurate long reads to characterize variants in difficult genomic regions for 7 human genomes. Long and linked reads improved the small variant benchmark by expanding reference coverage and the number of called variants. Accurate long reads were also essential for generating benchmarks for medically relevant genes and for improving benchmarks on chromosomes X and Y. Ongoing work includes developing RNA sequencing benchmarks from long reads and generating the first tumor/normal cell line benchmark.
Achieve Complete Coverage of the SARS-CoV-2 Genome (Camille Cappello)
Utilize multiple overlapping amplicons in a single tube, using a rapid, 2-hour workflow to prepare ready-to-sequence libraries. The PCR1+PCR2 workflow generates robust libraries even from low input quantities of DNA. The libraries may then be quantified and normalized with conventional methods such as Qubit® or Agilent Bioanalyzer, or optionally with the included Swift Normalase reagents.
Provides coverage of >99% of the SARS-CoV-2 genome from limited viral titers
This document describes the development and validation of a targeted next-generation sequencing (NGS) method that employs synthetic cDNA internal standards. The method uses a multiplex competitive PCR approach to amplify both native cDNA targets and known copies of synthetic internal standards, allowing measurement of each target relative to its standard. This controls for variation in library preparation and sequencing. The method was validated using ERCC RNA reference materials, demonstrating good accuracy, precision, reproducibility, and ability to detect fold-changes across platforms. Stochastic sampling effects were also evaluated, showing increased variability at low input molecules or reads. The method reduces over-sequencing needs and allows cost-effective targeted NGS analysis.
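The idea of measuring each target relative to its internal standard can be shown with a short Python sketch. It assumes the simplest form of competitive quantification, in which the native and internal-standard templates amplify with equal efficiency, so the read ratio equals the input copy ratio. The function name and numbers are hypothetical and are not taken from the described method.

```python
def native_copies(native_reads, standard_reads, standard_copies_input):
    """Estimate native target copies from the native-to-standard read ratio.

    Assumes equal amplification efficiency for native and internal-standard templates.
    """
    if standard_reads == 0:
        raise ValueError("no internal-standard reads; the ratio is undefined")
    return standard_copies_input * native_reads / standard_reads


# Made-up example: 1000 internal-standard copies were spiked into the reaction.
estimate = native_copies(native_reads=200, standard_reads=100, standard_copies_input=1000)
print(estimate)
```

Because only the ratio matters, the estimate does not depend on total read depth for that target, which is consistent with the summary's point that the approach reduces over-sequencing. At very low read counts the ratio becomes noisy, matching the stochastic sampling effects the summary mentions.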
The document describes QIAGEN's GeneRead DNAseq Targeted Exon Enrichment and GeneRead Library Quantification System for next generation sequencing. It discusses targeted enrichment workflow and principles, data analysis, pathway content of panels, performance data and application examples. It also covers the library quantification workflow, using qPCR to quantify sequencing libraries, and a DNAseq library quantification array to assess sample quality. The document is aimed at promoting these NGS sample preparation and analysis solutions to potential customers.
This document discusses the Genome in a Bottle Consortium's efforts to develop reference materials and standards to validate next generation sequencing assays. It provides an overview of the consortium's goals to generate reference genomes with highly confident variant calls and accompanying data to allow labs to compare results and assess false positives and false negatives. The document describes some examples of how labs are using the consortium's data on the NA12878 genome to benchmark sequencing platforms and bioinformatics workflows.
The document provides instructions for creating an account and submitting a paper writing request to the website HelpWriting.net. It outlines a 5-step process: 1) Create an account with an email and password. 2) Complete a 10-minute order form with instructions, sources, and deadline. 3) Review bids from writers and choose one. 4) Review the completed paper and authorize payment. 5) Request revisions until satisfied. The website promises original, high-quality content and refunds for plagiarized work.
Writing An Abstract For A Research Paper Guideline (Crystal Sanchez)
The document provides guidelines for using the writing service HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete a 10-minute order form providing instructions, sources, and deadline. 3) Review bids from writers and choose one based on qualifications. 4) Review the completed paper and authorize payment if pleased. 5) Request revisions to ensure satisfaction, and the service offers refunds for plagiarized work.
Similar to Assign 2.0 software for the analysis of Phred quality values for quality control of HLA sequencing-based typing.pdf
Pooja Patel is seeking a laboratory technician position to further her experience in the molecular diagnostic field. She has a Bachelor of Science in Biochemistry from the University of Texas at Austin and a Bachelor of Science in Molecular Genetic Technology from the University of Texas at MD Anderson. Patel has over 3 years of laboratory experience in DNA extraction, PCR, sequencing, and bioinformatics. She is proficient in molecular biology techniques and is seeking to expand her skills in a full-time laboratory role.
The document discusses GeneRead DNAseq Targeted Exon Enrichment and GeneRead Library Quantification System for Next Generation Sequencing. It provides an overview of the targeted enrichment workflow and principles, pathway-focused analysis tools, library quantification workflow, and performance data. The targeted enrichment panels allow users to focus sequencing on genes of interest, improve detection of low prevalence mutations from poor quality samples. The library quantification system uses qPCR to accurately quantify sequencing libraries and assess sample quality before NGS runs.
Using VarSeq to Improve Variant Analysis Research WorkflowsDelaina Hawkins
Many questions must be answered when analyzing DNA sequence variants: How do I determine which variants are potentially deleterious? Is the sequencing quality sufficient? How do I prioritize the results? Which annotation sources may help answer my research question?
In this webinar presentation, we will review workflow strategies for quality control and analysis of DNA sequence variants using the VarSeq software package from Golden Helix. VarSeq is a powerful platform for analysis of DNA sequence variants in clinical and translational research settings. VarSeq provides researchers with easy access to curated public databases of variant annotation information, and also enables users to incorporate their own local databases or downloaded information about variants and genomic regions.
The presentation will include interactive demonstrations using VarSeq to analyze variants found by exome sequencing of an extended family with a complex disease. We will review strategies for assessing variant quality, applying genomic annotations, incorporating custom annotation sources, and creating variant filters in VarSeq. We will also demonstrate the PhoRank gene ranking algorithm and its application for prioritizing variants.
Using VarSeq to Improve Variant Analysis Research WorkflowsGolden Helix Inc
In this webinar presentation, we will review workflow strategies for quality control and analysis of DNA sequence variants using the VarSeq software package from Golden Helix. VarSeq is a powerful platform for analysis of DNA sequence variants in clinical and translational research settings. VarSeq provides researchers with easy access to curated public databases of variant annotation information, and also enables users to incorporate their own local databases or downloaded information about variants and genomic regions.
GIAB Integrating multiple technologies to form benchmark SVs 180517GenomeInABottle
Genome in a Bottle aims to provide well-characterized human genomes as benchmarks to validate genome sequencing and variant calling. The summary characterizes five genomes that have been analyzed to provide benchmark calls for simple and some complex variants, though many challenges remain, particularly for structural variants and difficult genomic regions. Integration of multiple data types and analyses from diverse technologies is key to improving benchmark calls over time in an open and transparent manner.
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923GenomeInABottle
Using accurate long reads to improve Genome in a Bottle Benchmarks
The Genome in a Bottle Consortium has used accurate long reads to characterize variants in difficult genomic regions for 7 human genomes. Long and linked reads improved the small variant benchmark by expanding reference coverage and the number of called variants. Accurate long reads were also essential for generating benchmarks for medically relevant genes and for improving benchmarks on chromosomes X and Y. Ongoing work includes developing RNA sequencing benchmarks from long reads and generating the first tumor/normal cell line benchmark.
Achieve Complete Coverage of the SARS-CoV-2 GenomeCamille Cappello
Utilize multiple overlapping amplicons in a single tube, using a rapid, 2-hour workflow to prepare ready-to-sequence libraries. The PCR1+PCR2 workflow generates robust libraries even from low input quantities of DNA that may be subsequently quantified and normalized with conventional methods such as Qubit® or Agilent Bioanalyzer, or optionally using the included Swift Normalase reagents.
Provides coverage of >99% of the SARS-CoV-2 genome from limited viral titers
This document describes the development and validation of a targeted next-generation sequencing (NGS) method that employs synthetic cDNA internal standards. The method uses a multiplex competitive PCR approach to amplify both native cDNA targets and known copies of synthetic internal standards, allowing measurement of each target relative to its standard. This controls for variation in library preparation and sequencing. The method was validated using ERCC RNA reference materials, demonstrating good accuracy, precision, reproducibility, and ability to detect fold-changes across platforms. Stochastic sampling effects were also evaluated, showing increased variability at low input molecules or reads. The method reduces over-sequencing needs and allows cost-effective targeted NGS analysis.
The document describes QIAGEN's GeneRead DNAseq Targeted Exon Enrichment and GeneRead Library Quantification System for next generation sequencing. It discusses targeted enrichment workflow and principles, data analysis, pathway content of panels, performance data and application examples. It also covers the library quantification workflow, using qPCR to quantify sequencing libraries, and a DNAseq library quantification array to assess sample quality. The document is aimed at promoting these NGS sample preparation and analysis solutions to potential customers.
This document discusses the Genome in a Bottle Consortium's efforts to develop reference materials and standards to validate next generation sequencing assays. It provides an overview of the consortium's goals to generate reference genomes with highly confident variant calls and accompanying data to allow labs to compare results and assess false positives and false negatives. The document describes some examples of how labs are using the consortium's data on the NA12878 genome to benchmark sequencing platforms and bioinformatics workflows.
Similar to Assign 2.0 software for the analysis of Phred quality values for quality control of HLA sequencing-based typing.pdf (20)
Assign 2.0: software for the analysis of Phred quality values for quality control of HLA sequencing-based typing
D.C. Sayer¹,²,³, D.M. Goodridge¹,³, F.T. Christiansen¹,²
Authors' affiliations:
¹ Department of Clinical Immunology and Biochemical Genetics, Royal Perth Hospital, Wellington Street, Perth 6000, Western Australia, Australia
² School of Surgery and Pathology, Division of Pathology, University of Western Australia, Verdun Street, Nedlands, Western Australia, Australia
³ Conexio Genomics, PO Box 1670, Applecross, Western Australia, Australia
Correspondence to:
David C. Sayer
Department of Clinical Immunology and Biochemical Genetics
Royal Perth Hospital
Wellington Street
Perth 6000
Western Australia
Australia
Tel.: +61 8 92242899
Fax: +61 8 92242920
e-mail: david.sayer@health.wa.gov.au
Abstract: As improvements to DNA sequencing technology have resulted in
increasing the throughput of DNA sequencing, the bottleneck for high
throughput DNA sequencing-based typing (SBT) has shifted to sequence
analysis, genotyping and quality control (QC). Consistent high-quality DNA
sequence is required in order to reduce manual verification and editing of
sequence electropherograms. However, identifying systematic changes in
quality is difficult to achieve without the aid of sophisticated sequence
analysis programs dedicated to this purpose. We describe a computer
software program called Assign 2.0, which integrates sequence QC analysis
and genotyping in order to facilitate high-throughput SBT. Assign 2.0
performs an analysis of Phred quality values in order to produce quality
scores for a sample and a sequencing run. This enables sample-to-sample and
run-to-run QC monitoring and provides a mechanism for the comparison of
sequence quality between various genes, various reagents and various
protocols with the aim of improving the overall quality of DNA sequence data.
This, in turn, will result in reducing sequence analysis as a bottleneck for
high-throughput SBT.
Key words: assign; quality control; resequencing; sequencing
Received 14 January 2004, revised 6 April 2004, accepted for publication 19 April 2004
Copyright © Blackwell Munksgaard 2004
doi: 10.1111/j.1399-0039.2004.00283.x
Tissue Antigens 2004: 64: 556–565
Printed in Denmark. All rights reserved
Recent advances in DNA-sequencing technology, including the
introduction of capillary DNA sequencers and improvements in
dye-labelling technology (1, 2), have simplified DNA-sequencing protocols
and have improved the ability to detect heterozygous sequences.
As a result, an increasing number of clinical and research
laboratories are using DNA sequencing in order to study genetic diversity.
This is particularly true for laboratories performing human leucocyte
antigen (HLA) typing for the matching of donors and recipients for
bone marrow transplantation. Several HLA genes are required for
matching for transplantation and each is highly polymorphic
(http://www.ebi.ac.uk/imgt/hla). Current genotyping approaches are
hierarchical and employ low typing resolution molecular techniques
that are relatively inexpensive and suitable for high throughput,
followed by DNA sequencing to provide high-resolution typing
when required. DNA sequencing is regarded as the gold standard
for HLA typing and therefore the ideal would be for DNA sequencing
to be the sole method for HLA typing. State-of-the-art DNA
sequencers provide the throughput requirements for most HLA-typing
laboratories. However, data analysis, including manual verification
of automated sequence base calling, allele assignment and quality
control (QC), is a significant impediment to high-throughput
sequencing-based typing (SBT).
HLA SBT is a complex multi-step process, which requires the
specific polymerase chain reaction (PCR) amplification of the region
to be sequenced, sequencing up to four polymorphic exons in both
directions, splicing the intron sequence and creating a single
concatenated consensus sequence for analysis. The consensus sequence
is usually matched against a database of allele sequences in order to
identify the alleles that best match the test sequence.
Computer software programs, such as SeqScape® v2.0 Software
(SeqScape) from Applied Biosystems (Foster City, CA), perform base
calling, align forward and reverse complementary sequences, splice
intron sequence and produce a concatenated consensus sequence for
allele assignment. However, base calling may be unreliable,
especially for heterozygous sequence, because an arbitrary threshold
for heterozygosity is assigned based on the percentage of one peak
within another. If the threshold is too low, the presence of any
background may result in false calling of heterozygotes. If the
threshold is not set low enough, then some heterozygotes with low
dideoxynucleotide incorporation may be incorrectly called homozygotes.
Therefore, manual verification by viewing the sequence electropherograms
(EPG) is required.
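The threshold trade-off described above can be illustrated with a minimal sketch. It is not the SeqScape or Assign 2.0 algorithm: the function name `call_base`, the peak heights and the threshold values are all hypothetical, chosen only to show how a single cut-off on the ratio of the secondary to the primary peak can produce both kinds of error.

```python
def call_base(peaks, threshold):
    """Call one position from electropherogram peak heights.

    peaks:     dict mapping base ('A', 'C', 'G', 'T') to peak height
               (illustrative values, arbitrary units).
    threshold: hypothetical minimum ratio of secondary to primary peak
               height at which the position is called heterozygous.
    """
    # Rank bases from the tallest peak to the smallest.
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (base1, height1), (base2, height2) = ranked[0], ranked[1]
    if height1 <= 0:
        return "N"  # no usable signal at this position
    if height2 / height1 >= threshold:
        # Secondary peak is large enough: report both bases.
        return "/".join(sorted((base1, base2)))
    return base1


# Homozygous A with background signal under G (ratio 0.15).
background = {"A": 1000, "C": 20, "G": 150, "T": 10}
# True C/T heterozygote with weak incorporation of T (ratio 0.25).
weak_het = {"A": 5, "C": 1000, "G": 8, "T": 250}

for threshold in (0.10, 0.30):
    print(threshold, call_base(background, threshold), call_base(weak_het, threshold))
```

With the low threshold the background position is falsely called heterozygous, and with the higher threshold the weak heterozygote is missed. No single cut-off handles both cases, which is why manual review of the EPG remains necessary when sequence quality is poor.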
The requirement for manual sequence base-call verification and
sequence editing is highest when the quality of the sequence is poor.
The ability to obtain and maintain high-quality sequence is critical to
improving the throughput capabilities of SBT. High-quality sequence
results in improved accuracy of base calling and removes the time
required for manual verification. As sequencing is being increasingly
used in a clinical setting, guidelines for sequence quality have been
suggested by groups, such as the Clinical Molecular Genetics Society
(http://cmgs.org/BPG/Guidelines/2002/data%20quality). However,
these guidelines tend to be subjective.
Unique and objective approaches to SBT QC are required. We
suggest that various combinations of alleles in heterozygous sam-
ples, each with its own unique sequence, are amplified in PCR and
sequencing reactions with various efficiencies, largely as a result of
the different melting temperatures and GC content. Thus, every
sample should have its own QC. Furthermore, as the sequence for
every sample is usually derived from concatenated bi-directionally
sequenced units (BSU) or exons as is the case for most HLA class-I
SBT assays (3), the basic unit of QC should be the BSU. We have
developed a computer software program that enables such QC to be
performed. We have integrated this with our allele assignment
software in order to provide a comprehensive sequence-analysis
software program, called Assign 2.0. Assign 2.0 is suitable for high-
throughput HLA SBT or any resequencing application.
The Assign 2.0 QC tools enable the analysis of several indicators
of sequence quality. However, the primary function of Assign 2.0 is
the analysis of Phred quality values (PQV) (4, 5). Phred is a software
program, which provides a probability that a base call within a
sequence is correct by using the algorithm QV = -10 × log10(PE),
where PE is the probability that the base call is an error. Thus, a
PQV of 40 indicates that there is a one in 10,000 chance that the base
call is incorrect. However, this algorithm was developed for cloned
template and the same interpretations of base call accuracy may not
apply to heterozygous sequence from PCR products. Therefore, we
have investigated whether PQV can have a broader utility for the
assessment of SBT QC and provide a quality score for a sequenced
sample and a sequencing run or gel. We demonstrate a unique and
informative objective assessment of sequence quality following the
analysis of PQV that enables the setting of target specifications of
quality. As a result, we are able to monitor samples and sequencing
runs for deviations from target specifications (accuracy) and exces-
sive variability around target specifications (precision), thus meeting
the criteria for effective QC (6).
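The relationship between a Phred quality value and the base-call error probability can be illustrated with a short sketch. The function names below are illustrative only and are not part of Phred or Assign 2.0; the arithmetic follows the standard Phred definition quoted above.

```python
import math


def phred_to_error_probability(qv):
    """Return the base-call error probability implied by a Phred quality value."""
    return 10 ** (-qv / 10)


def error_probability_to_phred(p_error):
    """Return the Phred quality value for a given base-call error probability."""
    if not 0 < p_error <= 1:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10 * math.log10(p_error)


if __name__ == "__main__":
    # A PQV of 40 corresponds to a one in 10,000 chance of an incorrect call.
    for qv in (20, 30, 40):
        print(qv, phred_to_error_probability(qv))
```

Under this definition, each 10-unit increase in PQV corresponds to a tenfold decrease in the error probability.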
Methods
Sequencing reactions were performed by means of Applied Biosystems
Big Dye Terminator v3.0 sequencing chemistry. All sequencing was
performed on an Applied Biosystems ABI PRISM 3730 Genetic Analyzer
(AB 3730). The AB 3730 is a 48 capillary automated DNA sequencer.
HLA-A, HLA-B and HLA-C SBT protocols
were developed in house. Each locus was typed by means of DNA
sequencing following locus-specific amplification and bi-directional
sequencing of exons 2 and 3. HLA-A and HLA-C were amplified with
a single set of amplification primers and HLA-B was amplified in two
PCRs in order to amplify the HLA-B alleles in two groups character-
ized by the alternate ‘TA’ and ‘CG’ dimorphism located in intron 1 (7).
The locus names HLA-BTA and HLA-BCG have been used in order
to indicate the alternative PCR amplifications. The DNA sequences
were analysed in a two-step process. First, the sequences
were analysed with the help of ABI PRISM SeqScape Software
(SeqScape) in order to splice intron sequence, align forward and reverse
sequence strands and assign consensus sequence quality values. The
DNA sequence files in .xml format were then imported into Assign
2.0 for allele assignment and QC analysis. The data included in the
Sayer et al : Quality control of SBT
Tissue Antigens 2004: 64: 556–565 557
.xml files contain the consensus sequence base calls and the consensus
sequence PQV (CSPQV). The .xml files are named according to a
strict convention, which includes the sample name, the locus being
sequenced and the sequencing primer. In addition, the .xml file
storage system is organized by means of locus and sequencing
date in order to facilitate data retrieval and enable chronological
analysis of sequence QC data. Assign 2.0 QC tools perform inde-
pendent analysis of CSPQV of automated homozygous (CSPQV-hom)
and heterozygous (CSPQV-het) base calls for a single position, a
range of positions (e.g., exon 2 or exon 3) or a selected date range
for a selected locus. We present an analysis of data from HLA-A
SBT runs from 12 February 2003 to 7 July 2003. This included
1086 samples sequenced on 76 different sequencing runs.
The Assign 2.0 allele assignment component of the software matches
the consensus test sequence against an HLA allele sequence library
generated from the IMGT/HLA database (http://www.ebi.ac.uk/imgt/hla).
The matching algorithm has been developed in order to enable high-
speed matching on multiple samples to facilitate high-throughput SBT.
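The idea of matching a heterozygous consensus sequence against pairs of library alleles can be sketched as follows. This is a minimal illustration and not the Assign 2.0 matching algorithm, which is not described in detail here. The allele names and sequences are hypothetical, all sequences are assumed to be the same length and already aligned, and heterozygous positions are assumed to be written with IUPAC two-base codes.

```python
from itertools import combinations_with_replacement

# IUPAC codes for single bases and two-base (heterozygous) combinations.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

# Hypothetical toy library; real libraries are derived from IMGT/HLA.
TOY_LIBRARY = {
    "allele_1": "ACGTA",
    "allele_2": "ACATA",
    "allele_3": "GCGTA",
}


def expected_consensus(allele_a, allele_b):
    """Combine two allele sequences into the consensus expected from a heterozygote."""
    return "".join(IUPAC[frozenset((x, y))] for x, y in zip(allele_a, allele_b))


def rank_allele_pairs(test_seq, library):
    """Return (mismatch count, allele, allele, mismatched positions), best match first."""
    results = []
    for name_a, name_b in combinations_with_replacement(sorted(library), 2):
        expected = expected_consensus(library[name_a], library[name_b])
        mismatches = [i + 1 for i, (t, e) in enumerate(zip(test_seq, expected)) if t != e]
        results.append((len(mismatches), name_a, name_b, mismatches))
    return sorted(results)


if __name__ == "__main__":
    for row in rank_allele_pairs("ACRTA", TOY_LIBRARY):
        print(row)
```

An exhaustive pairwise comparison such as this grows quadratically with library size, so a production implementation would need a faster strategy.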
Results
Assign 2.0 QC tools: QC analysis of PQV
As PQV have previously been reported to be an indicator of base call
accuracy (5) and therefore sequence quality, we examined the possi-
bility that analysis of CSPQV could be extrapolated in order to
provide useful QC data for sample and/or a sequencing run. The
hypothesis is that the mean and standard deviation (SD) of CSPQV
for all nucleotide positions will reflect the sequence quality of the
sample sequenced. Furthermore, the mean and SD CSPQV of all
nucleotides for all samples on a sequencing run will reflect the
quality of the sequencing run. However, in order to determine the
feasibility of this approach, we needed to determine the degree of
variability of base call CSPQV at the same site between various
sequences, which appeared visually to be of good quality, between
various samples. It is important to demonstrate that CSPQV only
varies because of changes in sequence quality. For this purpose, we
analysed the CSPQV at 100 conserved (and therefore homozygous)
positions within exon 2 of HLA-A SBT from 20 samples within
the same sequencing run. The results have been presented in Fig. 1.
The mean CSPQV between positions may differ slightly, but more
importantly the CSPQV at each position are reproducible between
various samples. All but three positions have SD of less than 5
CSPQV units and a coefficient of variation (CV) of less than 5%, with a
mean CV value for all positions of 2.7%.
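The per-position reproducibility check described above can be sketched as follows. The input values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def position_stats(cspqv_by_sample):
    """Compute mean, SD and CV (%) of CSPQV at each position across samples.

    cspqv_by_sample is a list of per-sample lists, each holding one CSPQV
    per conserved position, in the same position order for every sample.
    """
    stats = []
    for values in zip(*cspqv_by_sample):
        m = mean(values)
        sd = stdev(values)  # sample SD; an assumption, see the lead-in
        stats.append({"mean": m, "sd": sd, "cv_percent": 100 * sd / m})
    return stats


if __name__ == "__main__":
    # Hypothetical CSPQV for three samples at three conserved positions.
    samples = [
        [40, 42, 38],
        [41, 43, 39],
        [42, 44, 37],
    ]
    for index, row in enumerate(position_stats(samples), start=1):
        print(index, row)
```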
While CSPQV are highly reproducible between samples, the
CSPQV of homozygous and heterozygous base calls are different.
This is demonstrated for two polymorphic positions (positions 165
and 170) within exon 2 of HLA-A in Fig. 2. Figure 2A shows the
frequency distribution of CSPQV-hom and CSPQV-het for position
165 of HLA-A. HLA-A alleles can be either A or G at this position.
The grey bars represent the frequency distribution of CSPQV-het
base calls (where both A and G are sequenced) and the black bars
represent the frequency distribution of homozygous base calls (this
includes both A and G base calls). Similarly, Fig. 2B is a frequency
histogram of the CSPQV at position 170. HLA-A alleles are also
either A or G at this position. For both positions, the CSPQV of
heterozygous and of homozygous base calls are each normally
distributed, but the CSPQV-het values are lower than the CSPQV-hom
values. At position 165, the mean CSPQV-het is 27.10 and SD is 1.14
and the mean CSPQV-hom is 40.84 and SD is 1.75. For position 170,
the mean CSPQV-het is 25.48 and SD is 1.14 and the mean CSPQV-hom is
40.86 and SD is 1.53.
Fig. 1. The mean and standard deviation (SD) of consensus sequence
PQV (CSPQV) at 100 conserved (therefore, homozygous) positions of
exon 2 of HLA-A are shown from 20 consecutive unrelated samples.
[Plot axes: mean CSPQV (left) and SD CSPQV (right) against conserved
sequence nucleotide positions within exon 2 of HLA-A.] The mean
CSPQV (the plot in the top half of the graph) varies between
positions within the same sequence, but the CSPQV at one position is
reproducible between samples as indicated by the low standard
deviations. This indicates that a mean value of all CSPQV-hom for a
BSU should provide an indication of sequence quality of the BSU. BSU,
bi-directionally sequenced units; CSPQV-hom, CSPQV of automated
homozygous base calls; PQV, Phred quality values.
As a result of the findings described above, we suggest that:
1. The mean and/or SD values of CSPQV-hom of a BSU (i.e., the
various exons for HLA class-I) will provide good indicators of
sequence quality of the BSU. Some samples may not have
heterozygous positions and so CSPQV-het should not
be used as an indicator of sequence quality of a BSU.
2. The mean and/or SD values of all CSPQV-hom for all samples on a
sequencing run will provide good indicators of sequence quality of
the sequencing run.
3. Sequence quality ‘target’ (or ‘expected’) values can be calculated
from multiple data points and the mean and SD values of CSPQV
for individual BSU and sequencing runs can be compared to
expected values according to Shewhart rules for analysing con-
trols (6).
In order to test these hypotheses, we performed a retrospective
analysis of SBT data for HLA obtained between 12 February 2003
and 7 July 2003.
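The comparison of a sample or run value against target values can be sketched as below. This is a simplified illustration of control limits at the target mean ± 2 SD and not a full implementation of Shewhart rules. The historical values are hypothetical and the function names are illustrative only.

```python
from statistics import mean, stdev


def target_limits(historical_values, n_sd=2.0):
    """Return (lower limit, target mean, upper limit) from previous results."""
    target = mean(historical_values)
    sd = stdev(historical_values)
    return target - n_sd * sd, target, target + n_sd * sd


def classify(value, limits):
    """Classify a new value against the control limits."""
    low, _target, high = limits
    if value < low:
        return "below lower limit"
    if value > high:
        return "above upper limit"
    return "within limits"


if __name__ == "__main__":
    # Hypothetical mean CSPQV-hom values from previous runs.
    history = [40.0, 41.0, 39.0, 40.5, 39.5]
    limits = target_limits(history)
    print(limits)
    print(classify(35.01, limits))
    print(classify(40.2, limits))
```

The same check can be applied to SD values, where a value above the upper limit indicates excessive variability.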
Within-run QC analysis
The graphs shown in Figs. 3 and 5 are examples of CSPQV analysis
that can be performed by the Assign 2.0 QC tools in just a few
seconds. Analyses of CSPQV-hom data for exons 2 and 3, respec-
tively, for each of 24 samples of the HLA-A SBT run 10–05–03 have
been presented in Fig. 3(A, B). In both graphs, the mean and SD data
are mirror images such that a sample with a high mean CSPQV
usually has a low SD. Grey bars with a horizontal line through the
middle have been used in order to indicate the mean ± 2 SD of
CSPQV data calculated from all runs between 12 February 2003 and
7 July 2003.
Fig. 2. The frequency histograms of consensus sequence PQV (CSPQV)
at homozygous (black bars) and heterozygous (grey bars) base calls
have been shown for two polymorphic positions (position 165, Fig. 2A,
and position 170, Fig. 2B) within exon 2 of HLA-A for all samples
(n = 1086 samples) sequenced between 12 February 2003 and 7 July
2003. [Panels A and B: frequency (%) against PQV scores, with
heterozygous and homozygous sequence distributions labelled.] The
distribution of the CSPQV is bimodal with CSPQV of heterozygous base
calls being less than the CSPQV of homozygous base calls. These
results indicate that homozygous and heterozygous CSPQV should be
considered independently if CSPQV is used as a measure of sequence
quality for a sample or a sequencing run, as the number of
heterozygous positions varies between samples.
The exon 2 graph (Fig. 3A) reveals considerable variability
between samples, compared to the graph for exon 3 (Fig. 3B). This
indicates variability in sequence quality between the exon 2
sequences of the samples and consistent high-quality sequence for
exon 3 for all samples. Analysis of the sequence EPG for the forward
and reverse sequencing primers for exon 2 revealed that the
sequences from the forward sequencing primer contained high back-
ground for some samples, whereas the reverse sequencing primers
resulted in consistent good quality sequence (data not shown). The
CSPQV is deduced from PQV from both strands and poor quality
sequence on one strand is sufficient to reduce the CSPQV. The EPG
from the forward sequencing primer for some of the samples with
and without background have been shown in Fig. 4. A comparison of
the EPG and CSPQV-hom for these samples reveals that when the
background is high, i.e., the quality of sequence is poor (e.g., samples
13 and 19), the mean CSPQV-hom is low (35.01 and 33.58, respec-
tively) and SD is high (7.61 and 8.42, respectively). In samples where
there is no background, i.e., good quality sequence (e.g., samples 02,
21 and 06), the mean CSPQV is high (41.41, 41.30 and 41.03, respec-
tively) and the SD is low (2.2, 2.1 and 1.8, respectively).
These data demonstrate that mean and SD of CSPQV-hom are
sensitive and quantitative measurements of sequence quality.
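The per-sample calculation for one BSU can be sketched as follows, with homozygous and heterozygous calls summarised separately. This sketch assumes that heterozygous consensus calls are written as IUPAC ambiguity codes and that a CSPQV is available for every position; the input data are hypothetical and the function name is illustrative only.

```python
from statistics import mean, stdev

HOMOZYGOUS_BASES = set("ACGT")


def bsu_quality(base_calls, cspqv):
    """Summarise CSPQV-hom and CSPQV-het for one bi-directionally sequenced unit."""
    hom = [q for b, q in zip(base_calls, cspqv) if b in HOMOZYGOUS_BASES]
    het = [q for b, q in zip(base_calls, cspqv) if b not in HOMOZYGOUS_BASES]

    def summarise(values):
        # An SD needs at least two values; otherwise no summary is reported.
        if len(values) < 2:
            return None
        return {"n": len(values), "mean": mean(values), "sd": stdev(values)}

    return {"hom": summarise(hom), "het": summarise(het)}


if __name__ == "__main__":
    # Hypothetical consensus calls (R and Y are heterozygous) and their CSPQV.
    print(bsu_quality("ACRGTY", [40, 42, 26, 41, 41, 24]))
```

Returning no summary when fewer than two heterozygous calls are present reflects the point that some samples may have no heterozygous positions.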
With the exception of sample 3, the QC data for exon 3 indicate that
all sequence is of similar quality. Furthermore, all CSPQV-hom means
are greater than the expected mean CSPQV (horizontal line through the
middle of the grey bar) and all but one of the sample SDs are below the
expected SD. This indicates that the sequence obtained for
exon 3 for all samples of this run is of higher quality than expected.
For sample 3, only two of the 276 bases of exon 3 were included by the
SeqScape algorithm for analysis for one of the sequencing primers. As
a result, much of the sequence is single-stranded. The high PQV is an
anomaly of the SeqScape/Phred algorithm where the CSPQV may be
higher for single-strand sequence than for those with bi-directional
coverage. As a result, a SD was not calculated for this sample.
[Figure 3 panels: (A) Exon 2, run 05_10_03, position: exon 2; (B) Exon 3, run 05_10_03, position: exon 3. Left axis: PQV-hom mean; right axis: PQV-hom SD; horizontal axis: sample (01 to 24).]
Fig. 3. The mean and standard deviation (SD) of consensus sequence
PQV (CSPQV) for homozygous base calls within exon 2 (Fig. 3A) and
exon 3 (Fig. 3B) have been shown for each of 24 samples within an
HLA-A SBT run (run ID = 05–10–03). The mean values of each sample are
plotted on the top part of each graph and are associated with the
Y-axis on the left hand side of the graph, and the SD values are
plotted on the lower half of each graph with the values on the Y-axis
on the right hand side of the graph. The grey bars represent the
mean ± 2 SD limits of the mean and SD values of all samples for all
runs (n = 76 runs) between 12 February 2003 and 7 July 2003. The mean
and SD plots are mirror images, such that when the mean is high, the
SD is low and vice versa. The plots demonstrate why individual BSU,
in this case each exon, are analysed separately. The exon 2 data are
variable, indicating sequence of variable quality, with the mean and
SD CSPQV for two samples (samples 13 and 19) outside the expected
limits. By contrast, the exon 3 data are much more consistent with
all values being on or greater than the expected mean of the mean
CSPQV and all but one sample being below the mean of the expected SD
CSPQV. These data indicate a potential problem of varying degree
affecting exon 2 sequences only. SBT, sequencing-based typing.
Between-run QC analysis
In contrast to Fig. 3, where sample-to-sample QC analysis within a
sequencing run is demonstrated, Fig. 5 demonstrates run-to-run
(between-run) QC analysis. Between-run analysis is performed by
plotting the mean and SD of CSPQV calculated from all positions
for all samples on a sequencing run. This has been demonstrated in
Fig. 5, where the CSPQV data for exons 2 and 3 are plotted for each
run between 12 February 2003 and 7 July 2003 (76 runs, 1086
samples). The grey bars represent the mean ± 2 SD of data from
all runs. The data from the sequencing run of 10–05–03 (as demon-
strated in Fig. 3) are indicated by the arrows and do not appear to
be significantly different from data from other runs. However, the
exon 2 mean and SD data from the 19 runs after the run of 10–05–03
indicate that there has been a change in sequence quality. For nine of
the last 19 runs, the mean CSPQV-hom is below the expected mean
and four of the nine are on the lower 2 SD limit. By contrast, only
one result of the previous 57 runs has been on the lower 2 SD limit.
Similarly for the SD data for CSPQV-hom, 14 of the last 19 runs have
SD greater than the expected SD value. This indicates a change of
sequence quality as a result of the variable sequence obtained with
the exon 2 forward sequencing primer shown in Fig. 4. It is of interest
to note that similar changes in sequence quality are not indicated by
the CSPQV-het data. It is not clear why this is the case, but it may be
because of the smaller number of heterozygous sequence positions,
some of which may be at positions where the background does not exist.
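The run-to-run comparison above (counting how many recent runs fall below the expected mean or on the lower limit) can be sketched as follows. The target mean (40.06) and SD (1.05) are the exon 2 CSPQV-hom values reported for all runs; the individual run means are hypothetical, and the function name is illustrative only.

```python
def summarise_recent_runs(run_means, target_mean, lower_limit, window):
    """Count recent runs below the target mean and at or below the lower limit."""
    recent = run_means[-window:]
    return {
        "runs": len(recent),
        "below_target": sum(1 for m in recent if m < target_mean),
        "at_or_below_lower_limit": sum(1 for m in recent if m <= lower_limit),
    }


if __name__ == "__main__":
    target_mean = 40.06
    target_sd = 1.05
    lower_limit = target_mean - 2 * target_sd

    # Hypothetical mean CSPQV-hom per run, in chronological order.
    run_means = [40.5, 40.2, 39.0, 37.9, 40.3, 38.0]
    print(summarise_recent_runs(run_means, target_mean, lower_limit, window=4))
```

A sustained excess of runs on one side of the target mean suggests a shift in the assay even when no single run falls outside the limits.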
It is of interest to note that, although unlikely to be statistically
significant, the mean CSPQV-hom for exon 2 for all runs is higher
than the mean CSPQV-hom for exon 3 for all runs (exon 2 = 40.06,
exon 3 = 38.93). In addition, the SD is lower (mean SD for exon 2 is
3.99 and for exon 3 the mean SD is 5.25). This indicates that the
sequence quality for exon 2 is consistently better than the sequence
quality for exon 3. It is possible that this difference is because of the
inherent sequence differences between exon 2 and exon 3. However,
this difference may suggest that the conditions are not optimal for
exon 3. Table 1 lists the mean and SD of CSPQV-hom for the BSU
(i.e., exons 2 and 3) of HLA-A, HLA-B (both the HLA-BTA and HLA-
BCG HLA-B protocols) and HLA-C sequenced during the same
period. The exon 2 BSU sequence of HLA-A has the highest mean
PQV-hom and lowest SD, compared to all the other BSU for the other
loci. This indicates that the sequence quality obtained for the HLA-A
exon 2 BSU is better than the quality of sequence for the exon 3 BSU
of HLA-A and better than the sequence for all other BSU for the other
loci. The challenge now is to understand why this is the case and
optimize the sequencing conditions for the other loci to improve the
sequence quality at least to the level of the HLA-A exon 2 BSU.
Fig. 4. The electropherogram (EPG) from a region of exon 2 for
selected samples from run 10–05–03 has been shown. The figure also
includes the mean and SD CSPQV-hom for the samples of the EPG:

Sample       Mean PQV   SD
Sample 02    41.41      2.2
Sample 21    41.30      2.1
Sample 06    41.03      1.8
Sample 01    39.56      5.3
Sample 04    38.56      6.0
Sample 13    35.01      7.6
Sample 19    33.58      8.4

When the sequence quality is good (no background noise), the CSPQV
means are high and SDs are low. As the background noise increases,
the mean CSPQV-hom decreases and the SD increases. CSPQV is an
indicator of sequence quality. The background noise appears as
non-specific peaks usually smaller than the specific sequence peak.
CSPQV, consensus sequence PQV; CSPQV-hom, CSPQV of automated
homozygous base calls; PQV, Phred quality values.
Allele assignment
An example of an HLA allele assignment result page has been shown
in Fig. 6. A unique feature of Assign 2.0 is that the result page
contains important QC information in addition to the HLA allele
assignment. The allele assignment is displayed as a list of allele
combinations within the library that are best matched with test
sequence. Mismatched positions include the sequence base call of
the test sample at this position and the expected base call for the
allele combination. Additional information, including the CSPQV of
the test sequence at the mismatched positions and whether there was
a discrepancy between forward and reverse strand base calls (FRD)
or whether the mismatched position was sequenced in a single
direction only (SS), is also shown. Base calls that have arisen from
sequencing one strand only are also indicated in the result table by
‘SS’ in the ‘Quality Values’ row (not present in the example in Fig. 6).
The QC information of the sample includes the number of bases
sequenced (e.g., n = 546 of the 546 bases which constitute exon
2 + exon 3 for HLA-A), the homozygous and heterozygous base call
CSPQV (CSPQV-hom and CSPQV-het) statistics (mean CSPQV-hom = 39.9
and SD = 4.3, mean CSPQV-het = 25.8 and SD = 2.1)
and the SS (0% for homozygous base calls, 0% for heterozygous
base calls) and FRD data (2% of homozygous and 0% of heterozygous
consensus base calls had FRD).
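The SS and FRD percentages reported for a sample can be sketched as a simple tally over consensus positions. The record layout used here (a dictionary per position with `zygosity`, `ss` and `frd` keys) is a hypothetical representation chosen for illustration and is not the format used by Assign 2.0 or SeqScape.

```python
def percent_flagged(positions, flag):
    """Return the percentage of hom and het consensus calls carrying a flag."""
    result = {}
    for zygosity in ("hom", "het"):
        calls = [p for p in positions if p["zygosity"] == zygosity]
        flagged = sum(1 for p in calls if p[flag])
        result[zygosity] = 100.0 * flagged / len(calls) if calls else 0.0
    return result


if __name__ == "__main__":
    # Hypothetical sample: 50 homozygous calls (one with an FRD), 4 heterozygous.
    positions = (
        [{"zygosity": "hom", "ss": False, "frd": False} for _ in range(49)]
        + [{"zygosity": "hom", "ss": False, "frd": True}]
        + [{"zygosity": "het", "ss": False, "frd": False} for _ in range(4)]
    )
    print("FRD %:", percent_flagged(positions, "frd"))
    print("SS %:", percent_flagged(positions, "ss"))
```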
Fig. 5. Between-run monitoring of sequence quality has been shown. The mean and SD CSPQV-hom and CSPQV-het for all samples of each
run (n = 76 runs) for the period 12 February 2003 to 7 July 2003 have been plotted for exons 2 and 3. [Panels: Homozygous base calls by SBT run,
HLA-A exon 2; Homozygous base calls by SBT run, HLA-A exon 3; Heterozygous base calls by SBT run, HLA-A exon 2; Heterozygous base calls by SBT
run, HLA-A exon 3. Axes: CSPQV mean and CSPQV SD against sequence run.] The grey bars represent the
mean ± 2 SD limits for all values on each graph. As for Fig. 3(A,B), the mean values have been shown in the top half of the graph and the SD values have been
shown in the bottom half of each graph. The arrows show the values for the run 05_10_03 (from Fig. 3A,B). Despite the poor quality sequence for the forward
sequencing primer in exon 2 for some samples in runs that follow 05_10_03, the run mean does not fall out of the 2 SD limits (see the top left hand graph). However,
it is of interest to note that of the 19 runs following the run of 05–10–03, nine of the runs have a mean value below the expected mean and four of the nine runs
have values on the lower limit. By contrast, only one run in the previous 57 runs has been on the lower limit. This indicates a shift (decrease) in the mean CSPQV
for this assay, as a result of the suboptimal sequence obtained from the forward sequencing primer. The situation is similar for the SD values. Fourteen of the
last 19 run SD values are greater than the mean SD value for all runs, indicating a shift in the mean SD for this assay. By contrast, the exon 3 data indicate that
the quality of sequence has increased. Sixteen of the last 19 runs are above the expected mean CSPQV and 12 of the last 19 are below the expected SD. This
indicates an overall improvement of SBT of exon 3 of HLA-A. However, a specific problem exists with the forward sequencing primer of exon 2. The changes in
sequence quality demonstrated in the CSPQV-hom data are not reflected in the CSPQV-het data. CSPQV, consensus sequence PQV; CSPQV-het, CSPQV of
heterozygous base calls; CSPQV-hom, CSPQV of automated homozygous base calls; PQV, Phred quality values.
In the example shown in Fig. 6, there are two mismatches between
the test sequence and the best-matched alleles. Both mismatches
(positions 282 and 448) are at positions where there was an FRD.
An FRD indicates a base call error when sequencing in one direction
and high potential for an incorrect consensus base call. Such a
position is a priority for manual review. In addition, the base calls
at these positions are mismatched against all of the alleles in the
result table, indicating that the test sequence contains unique
polymorphisms or they are incorrect base calls. By contrast, the base call
at position 258 is ‘C’ and the CSPQV at this position is 42. This
indicates that ‘C’ has been called on both strands and a CSPQV of 42
indicates sequence of high quality and very low probability of an
incorrect base call.
Confirmation of base calls at positions within the mismatch table
is performed by viewing the EPG in SeqScape. Any edits to the
sequence are then performed directly in Assign 2.0 and the result
table is updated without the need for re-analysing the sequence
against the allele sequence library (i.e., in real time). Following con-
firmation of all base calls, Assign 2.0 will produce a report listing the
alleles that are best matched to the test sequence. The operator can
then click to the next sample for analysis and the result table is
immediately updated with data from the next sample.
Discussion
We have described a sequence data analysis computer software
program called Assign 2.0 that combines allele assignment with a
comprehensive and effective quality control system. Thousands of
sequences can be analysed in seconds making Assign 2.0 suitable
for high throughput sequencing-based typing or any resequencing
project. We have used the sequence-based typing of the highly
polymorphic HLA-A locus to demonstrate the utility of Assign 2.0.
The unique feature of Assign 2.0 is the ability to analyse PQV in
order to provide a comprehensive QC analysis of SBT data. We have
demonstrated that the mean and SD of all CSPQV-hom within a BSU
are sensitive indicators of sequence quality for that sample. Similarly,
the CSPQV-hom data for all BSU for all samples within a sequencing
run provide QC data for that sequencing run. As a result, sample-
to-sample and run-to-run QC monitoring can be performed.
Furthermore, the normal distribution of mean PQV data indicates
that Shewhart control graphs can be used and changes in sequence
quality can be accurately monitored. These processes add very little
time to the SBT process and yet provide valuable QC data.
A retrospective analysis of all data from February 2003 to July
2003 generated in our laboratory revealed changes in sequence quality
associated with an intermittent increase in background with a single
sequencing primer in our HLA-A SBT assay. This resulted in a greater
than expected number of runs falling below the expected mean
CSPQV-hom. In addition, a comparison of CSPQV-hom data between
our HLA-A, HLA-B and HLA-C SBT assays revealed a difference in
sequence quality between the assays with HLA-A exon 2 providing
the best quality data. We are in the process of using Assign 2.0 in
order to re-optimize the HLA-B, HLA-C and HLA-A exon 3 assays so
optimal quality sequence data are obtained.
It is of interest to note that Phred was not designed to provide quality
values for heterozygous sequence (4, 5). However, the data shown in Fig. 2
demonstrate that CSPQV-het are normally distributed but with a much
lower mean than CSPQV-hom. Therefore, in theory, CSPQV-het can also
be used for monitoring sequence quality. In most cases, the mean and SD
values of CSPQV-hom were mirror images, indicating that either of these
values, or the coefficient of variation (CV (%) = SD × 100/mean), can be
used as an indicator of sequence quality. The data presented in this study
did not indicate that analysis of CSPQV-het provided as sensitive an
indicator of quality as CSPQV-hom. This is likely to be because of variable
and low numbers of heterozygous positions, compared to homozygous
positions within a sequence.
The analysis of CSPQV in the ways we have described provides the
ability to assess the effect of reagents and SBT protocols on sequence
data quality. By improving the data obtained from SBT protocols, the
data analysis component of SBT protocols will be significantly reduced
and SBT will become a high-throughput protocol for measuring diver-
sity. In addition, the Assign 2.0 QC tools can be used for between-
laboratory comparison of data and provide a means of standardizing
SBT assays through workshops and QA exchange programs.
The applications of DNA sequencing are moving from the ‘sequence
factories’, where cloned DNA from a single chromosome is sequenced,
to studies of genetic diversity that includes the sequencing of PCR
products of highly polymorphic genes from pairs of chromosomes.
This includes research studies of evolution and population migration
(8) or for clinical diagnostic purposes (9–11). In addition, DNA sequen-
cing is being used by some laboratories for low to medium throughput
SNP analysis and de novo mutation detection (Ivo Gut, CNG, Paris,
France, personal communication). Appropriate QC is critical. Obtain-
ing, maintaining and monitoring sequence quality is required for all of
these applications. This manuscript describes a means by which
appropriate sequencing QC can be performed.
Assign v3.0 has been developed and does not require a third party
software, such as SeqScape, thus further improving the efficiency of SBT.
Table 1. Mean and standard deviation CSPQV for homozygous base calls
(CSPQV-hom) of exon 2 and exon 3 of various HLA class-I SBT assays

             CSPQV-hom
             Exon 2            Exon 3
Locus        Mean     SD       Mean     SD
HLA-A        40.06    1.05     38.93    1.58
HLA-BCG      38.70    1.95     39.07    2.04
HLA-BTA      39.07    2.04     39.22    1.73
HLA-C        39.33    2.55     38.43    2.81

HLA-A exon 2 results in sequence quality with highest mean CSPQV and lowest SD, which may
reflect that the SBT conditions are better optimized for this BSU than the BSU of other loci. BSU,
bi-directionally sequenced units; CSPQV, consensus sequence PQV; PQV, Phred quality values;
SBT, sequencing-based typing.
A) Browse window for locating the .xml files for analysis.
B) Locus being typed. If the locus is indicated in the sample name, the locus selected in the ‘Locus’ pane is overridden.
C) Indicates the maximum tolerance at which results are listed. Assign will list the best matched alleles up to 31 mismatches within the library.
D) The sample quality control information for the homozygous and heterozygous base calls. Included are the mean and standard deviation Phred quality value
information, the amount of sequence which was from a single strand (SS) and the percentage of base calls which were made from forward/reverse strand base
call discrepancies.
E) Contains the ID of the sample for which the report is shown. The number of bases sequenced is also shown.
F) This is the results pane. It lists the alleles which are best matched with the test sequence, the number of sequence differences between the alleles and the test
sequence and the sequence base call information at positions that are discrepant between the test sequence and the best matched alleles. This includes the
observed base calls of the test sample and the Phred quality value, which is colour coded: green for base calls of high quality which do not require review;
yellow for base calls which require review but which are probably correct; and red for base calls which definitely require review because they are at a position with
single strand coverage, there is a forward/reverse strand base call discrepancy or the sequence quality is very poor.
G) This is the editor window, which allows confirmation of the base calls. Once confirmed, the final result can be determined and a report is generated.
H) This is the list of samples that have been analysed. Selecting a sample ID results in immediate viewing of the SBT details as described above. Above the
sample IDs is the date of the release of the IMGT/HLA database.
I) This is the control panel, which includes access to the QC tools.
Fig. 6. A typical allele assignment result page. A detailed description of the result page is given in the key. The result page contains
the list of alleles that are best matched to the test sequence, ranked in order of best match. Mismatched sequence
positions are listed across the result page in sequence number order. For each position, the page shows the consensus sequence of the test sample, the Phred
quality value of the consensus sequence (CSPQV) base call, whether there was a forward and reverse strand base call discrepancy (FRD), whether the position was
sequenced in both directions (SS if the sequence was from a single strand only) and the corresponding sequence of the alleles within the table. The result page
also includes the total number of bases sequenced, the mean and standard deviation of the CSPQV of the homozygous and of the heterozygous base
calls, the number of positions at which there were forward and reverse strand sequence base call discrepancies (FRD), expressed as a percentage of the
homozygous and heterozygous base calls, and the total amount of SS sequence.
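The per-sample quality figures and the three review colours described for Fig. 6 can be illustrated with a minimal Python sketch. It is not the Assign implementation: the `BaseCall` record, the function names and, in particular, the two PQV cut-offs are assumptions made for illustration, since the text does not state the thresholds that separate green, yellow and red base calls.

```python
from dataclasses import dataclass
from statistics import mean, stdev

# Hypothetical cut-offs: the source does not state the PQV thresholds used.
GREEN_FROM_PQV = 40   # at or above this, no review needed (assumed)
RED_BELOW_PQV = 20    # below this, quality is treated as very poor (assumed)


@dataclass
class BaseCall:
    cspqv: int                    # Phred quality value of the consensus base call
    heterozygous: bool            # True for a heterozygous base call
    frd: bool = False             # forward/reverse strand base call discrepancy
    single_strand: bool = False   # SS: position covered by one strand only


def review_colour(call):
    """Map a base call to the green/yellow/red review classes of the key."""
    if call.single_strand or call.frd or call.cspqv < RED_BELOW_PQV:
        return "red"      # definitely requires review
    if call.cspqv < GREEN_FROM_PQV:
        return "yellow"   # requires review but is probably correct
    return "green"        # high quality, no review required


def summarise(calls):
    """Per-sample QC summary, reported separately for homozygous and heterozygous calls."""
    summary = {
        "bases": len(calls),
        "ss_bases": sum(c.single_strand for c in calls),
    }
    for label, het in (("homozygous", False), ("heterozygous", True)):
        group = [c for c in calls if c.heterozygous == het]
        pqv = [c.cspqv for c in group]
        summary[label] = {
            "n": len(group),
            "mean_cspqv": mean(pqv) if pqv else None,
            # The sample standard deviation needs at least two values.
            "sd_cspqv": stdev(pqv) if len(pqv) > 1 else None,
            # FRD positions as a percentage of the calls in this group.
            "frd_percent": 100 * sum(c.frd for c in group) / len(group) if group else None,
        }
    return summary
```

Calling `summarise` on the list of base calls for one sample returns the counts, means, standard deviations and FRD percentages that the result page reports. The homozygous and heterozygous groups are kept separate because heterozygous calls are expected to have lower PQV.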
Sayer et al : Quality control of SBT
564 Tissue Antigens 2004: 64: 556–565
10. References
1. Rosenblum BB, Lee LG, Spurgeon SL et al.
New dye-labeled terminators for improved
DNA sequencing patterns. Nucleic Acids Res
1997: 25: 4500–4.
2. Lee LG, Spurgeon SL, Heiner CR et al. New
energy transfer dyes for DNA sequencing.
Nucleic Acids Res 1997: 25: 2816–22.
3. Sayer DC, Whidborne R, De Santis D,
Rozemuller E, Christiansen FT, Tilanus M. A
multi centre evaluation of single-tube
amplification protocols for SBT of HLA-DRB1
and HLA-DRB3, 4, 5 are reproducible and
robust. HLA 2002. 2003. Tissue Antigens
2004: 63(5): 412–23.
4. Ewing B, Green P. Base-calling of automated
sequencer traces using phred. II. Error
probabilities. Genome Res 1998: 8: 186–94.
5. Ewing B, Hillier L, Wendl MC, Green P. Base-
calling of automated sequencer traces using
phred. I. Accuracy assessment. Genome Res
1998: 8: 175–85.
6. Shewhart WA. Economic Control of Quality of
Manufactured Product, 1st edn. New York:
Van Nostrand, 1931.
7. Cereb N, Yang SY. Dimorphic primers derived
from intron 1 for use in the molecular typing
of HLA-B alleles. Tissue Antigens 1997: 50:
74–6.
8. Malhi RS, Mortensen HM, Eshleman JA et al.
Native American mtDNA prehistory in the
American Southwest. Am J Phys Anthropol 2003:
120: 108–24.
9. Sayer DC, Land S, Gizzarelli L et al. A quality
assessment program (QAP) for genotypic
antiretroviral testing (GART) results in an
improvement in the detection of drug
resistance mutations. J Clin Microbiol 2003:
41: 227–36.
10. Sayer D, Whidborne R, Brestovac B, Trimboli F,
Witt C, Christiansen F. HLA-DRB1 DNA
sequencing based typing: an approach
suitable for high throughput typing including
unrelated bone marrow registry donors.
Tissue Antigens 2001: 57: 46–54.
11. Pryce TM, Palladino S, Kay D, Coombs GW.
Rapid identification of fungi by sequencing
the ITS1 and ITS2 regions using an
automated capillary electrophoresis system.
Med Mycol 2003: 41: 369–81.