Whole Exome Sequencing .pptx

Exome sequencing:
A revolutionary technology in
life sciences

Whole Exome Sequencing
• Whole Exome Sequencing (WES) is a Next Generation Sequencing technology used for sequencing the
coding regions (exons) of the genome (Bamshad et al., 2011).
• Although conventional approaches to variant prediction can identify genetic variants
that correspond to a particular disease, the whole exome approach helps to
determine the involvement of rare variants and accelerates the process of variant
identification.
• To capture the exome, which constitutes about 2% of the genome, for sequencing,
a hybridization-based capture technique was proposed.
• This technique proves to be effective for:
1) Identifying variants of monogenic diseases
2) Identifying variants of Mendelian disorders (as these disorders are mainly due to
protein-coding variants)
3) Identifying high impact variants (missense and nonsense variants)

Why Whole Exome Sequencing?
• The candidate gene approach alone is not enough to explain the complexity of
diseases and disorders.
• Although genotyping allows the identification of variants occurring at a particular
locus, it is tedious and a single study can focus only on specific locations.
• Next Generation Sequencing technologies allow us to identify variants across
the whole ~3 billion base pairs (Whole Genome Sequencing) or only at targeted
regions of the genome (Whole Exome Sequencing and targeted amplicon
sequencing).
• Compared to Whole Genome Sequencing, targeted sequencing methods such as
exome sequencing provide high-coverage data for efficient detection
of rare variants at much lower cost and in less time.
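The coverage advantage in the last bullet follows from simple arithmetic: for a fixed amount of sequencing, mean depth is the number of sequenced bases divided by the size of the target. The sketch below uses the figures given in these slides (a genome of ~3 billion base pairs, of which the exome is about 2%). The sequencing yield is a hypothetical number chosen for illustration, and the calculation assumes that every read lands on target, which real exome capture does not achieve.

```python
GENOME_SIZE_BP = 3_000_000_000  # ~3 billion base pairs, as stated in the slides
EXOME_FRACTION = 0.02           # the exome is about 2% of the genome


def mean_depth(sequenced_bases: float, target_size_bp: float) -> float:
    """Average number of times each target base is read."""
    return sequenced_bases / target_size_bp


exome_size_bp = GENOME_SIZE_BP * EXOME_FRACTION

# Hypothetical yield: the same amount of sequencing spent on either target.
yield_bases = 6_000_000_000

wgs_depth = mean_depth(yield_bases, GENOME_SIZE_BP)
wes_depth = mean_depth(yield_bases, exome_size_bp)

print(f"Exome size: ~{exome_size_bp / 1e6:.0f} Mb")
print(f"Mean depth if spent on the whole genome: ~{wgs_depth:.0f}x")
print(f"Mean depth if spent on the exome only:   ~{wes_depth:.0f}x")
```

Under these assumptions the same yield gives roughly fifty times more depth on the exome than on the whole genome, which is why rare variants are easier to call confidently from exome data.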

Why WES, Why not WGS?
• Whole Exome Sequencing helps us in uncovering variants that happen in the
protein-coding regions of the genome and is therefore helpful in the
identification of causal variants of a disease.
• Whole Exome Sequencing produces a manageable amount of data (~3–5 Gb),
which is easy to handle and store, whereas WGS generates ~90 Gb of data.
• About 500 exomes per run can be sequenced in less than 45 hours on Illumina
HiSeq 2000 instruments, whereas only 48 whole genomes can be sequenced in
the same time.
• Clinical genetic evaluations include WES rather than WGS in many cases, and
its detection rate ranges from 24% to 68%.
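The storage and throughput figures in the bullets above can be turned into ratios. This is only a back-of-the-envelope sketch using the numbers quoted on this slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
wes_data_gb = (3, 5)   # WES output range, in Gb
wgs_data_gb = 90       # WGS output, in Gb
exomes_per_run = 500   # exomes sequenced per run
genomes_per_run = 48   # whole genomes sequenced in the same time

# How many times more data WGS produces than WES (low and high end).
storage_ratio_low = wgs_data_gb / wes_data_gb[1]
storage_ratio_high = wgs_data_gb / wes_data_gb[0]

# How many exomes can be sequenced for every whole genome.
throughput_ratio = exomes_per_run / genomes_per_run

print(f"WGS data is {storage_ratio_low:.0f}x to {storage_ratio_high:.0f}x larger than WES data")
print(f"About {throughput_ratio:.1f} exomes can be sequenced per whole genome")
```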

Limitations of WES
• A major limitation of WES technology is that coverage is not
uniform: many regions have low coverage. This lack of uniformity in
sequence coverage affects downstream analysis and variant
calling.
• Whole Genome Sequencing has good coverage and is more
comprehensive than WES. Whole Genome Sequencing also helps in the
identification of pathogenic variants outside the protein-coding regions.
• Whole Exome Sequencing has limited ability to detect Copy
Number Variations (CNVs).
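The coverage limitation in the first bullet can be made concrete with a few summary statistics. The sketch below is a minimal illustration, not a standard tool: the depth values are invented, the 20x threshold is an assumed working cutoff, and "uniformity" is defined here as the fraction of positions covered at 20% of the mean depth or more.

```python
from typing import Dict, List


def coverage_summary(depths: List[int], min_depth: int = 20,
                     uniformity_fraction: float = 0.2) -> Dict[str, float]:
    """Summarise per-position read depth over the target regions."""
    n = len(depths)
    avg = sum(depths) / n
    return {
        "mean_depth": avg,
        # Fraction of positions deep enough for confident variant calling.
        "fraction_at_min_depth": sum(d >= min_depth for d in depths) / n,
        # Fraction of positions covered at a set share of the mean or more.
        "uniformity": sum(d >= uniformity_fraction * avg for d in depths) / n,
    }


# Two invented targets with the same mean depth but different uniformity.
even_target = [48, 50, 52, 49, 51, 50, 50, 50]
uneven_target = [150, 120, 100, 20, 5, 3, 2, 0]

even = coverage_summary(even_target)
uneven = coverage_summary(uneven_target)
print(even)
print(uneven)
```

Both targets have the same mean depth, but only half of the positions in the uneven target reach the threshold, so variants in the remaining positions would be missed or called unreliably.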

Applications of Whole Exome Sequencing:
• Helps in dividing patient groups into categories for clinical management
• Identification of novel candidate genes
• Incidental pathological findings
• Clinical diagnosis of ambiguous cases and Mendelian diseases
• Discovering pleiotropic effects of known disease genes
Figure 1 – Applications of Whole Exome Sequencing

Clinical Application of WES
• Because WES is unbiased, analysis of the sequencing data can reveal disease-causing
variants corresponding to more than one disease state, even when the clinical presentation
shows no evidence of them.
• Several studies have estimated the clinical diagnostic yield of WES: 44.41% (in a sample
of 1360 patients), 28.8% (in a sample of 3040 patients), and 25% (in a sample of 169
children suspected to have monogenic disorders).
• Since 2011, WES has been recommended as a diagnostic tool by many diagnostic facilities
and clinicians.
• Zhang et al. (2021) state that the positive detection rate of WES depends largely on the
clinical presentation documented by the clinician and the analytical capability of the data analyst.
• WES has been successfully applied to identify genetic variants associated with Miller
syndrome, complex disorders such as autism spectrum disorder (ASD), and some Mendelian phenotypes.
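The diagnostic yields quoted above can be converted into approximate patient counts. The sketch below only illustrates the arithmetic: the counts are rounded values implied by the quoted percentages, and the pooled figure ignores differences between the cohorts, so it should not be read as a reported result.

```python
# (description, number of patients, reported diagnostic yield) as quoted above.
studies = [
    ("cohort of 1360 patients", 1360, 0.4441),
    ("cohort of 3040 patients", 3040, 0.288),
    ("169 children with suspected monogenic disorders", 169, 0.25),
]


def implied_diagnoses(patients: int, rate: float) -> int:
    """Approximate number of diagnosed patients implied by a yield."""
    return round(patients * rate)


total_patients = sum(n for _, n, _ in studies)
total_diagnosed = sum(implied_diagnoses(n, rate) for _, n, rate in studies)
pooled_yield = total_diagnosed / total_patients

for name, n, rate in studies:
    print(f"{name}: ~{implied_diagnoses(n, rate)} of {n} diagnosed")
print(f"Pooled (illustrative only): {pooled_yield:.1%}")
```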

Clinical Application of WES
• Efforts such as the “Grand Opportunity” Exome Sequencing Project (GO-ESP)
and the Exome Aggregation Consortium (ExAC), which help in
identifying population-dependent rare genetic variants, are proposed
to be the initial step towards personalized medicine.
• Whole exome sequencing followed by filtering of the variants in
candidate genes is suggested to be an unbiased approach for the
diagnosis of many diseases and disorders such as neuromuscular
disorders, epilepsy, autism spectrum disorder and intellectual
disability.

Figure 2 – Characteristics of WES data

DNA extraction
• The DNA is isolated from the samples
Quality and quantity analysis
• The quality of the isolated DNA is checked to identify contamination by RNA and other impurities
• The quantity is estimated to determine whether the minimum requirement for sequencing is met
DNA fragmentation
• The DNA is sheared into pieces of a particular length
Exome capture/enrichment
• Exonic regions are separated from the other sequences of the genomic DNA
Sequencing
• The bases in the DNA are identified and the sequence of the DNA is determined
Analysis
• Downstream analyses identify genetic variants associated with a particular disease/disorder
Figure 3 – Workflow for Whole Exome Sequencing

Exome enrichment
• After the DNA is isolated and fragmented by physical or enzymatic methods,
the next step is exome enrichment (exome capture).
• During this process, only the exonic regions of the DNA are captured for
sequencing.
• Exome capture can be done using microarrays or magnetic beads. In
microarray-based capture, exon-specific probes are bound to a microarray
and the exonic fragments hybridize to them. In the second method, the probes
hybridize to the exonic fragments, which are then pulled down by the magnetic beads.
• After exome capture, the captured DNA regions are sequenced.
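Exome capture is a wet-lab step, but its selection logic can be mimicked in code: a fragment is kept only if it overlaps an exon target. The sketch below is an analogy under stated assumptions, not a model of hybridization chemistry. The coordinates are invented, intervals are treated as half-open, and any overlap counts as a capture.

```python
from typing import List, Tuple

# (chromosome, start, end) with a half-open interval [start, end).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def capture(fragments: List[Interval], exon_targets: List[Interval]) -> List[Interval]:
    """Keep only the fragments that overlap at least one exon target."""
    return [f for f in fragments if any(overlaps(f, e) for e in exon_targets)]


# Invented exon targets and sheared fragments.
exon_targets = [("chr1", 100, 200), ("chr1", 500, 650)]
fragments = [
    ("chr1", 50, 120),   # overlaps the first exon
    ("chr1", 200, 300),  # starts where the first exon ends: no overlap
    ("chr1", 600, 700),  # overlaps the second exon
    ("chr2", 100, 200),  # same coordinates, different chromosome
]

captured = capture(fragments, exon_targets)
print(captured)
```

Fragments that only partly overlap an exon are still captured, which is one reason why sequence flanking the exons also appears in real exome data.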

1) Quality check & pre-processing
2) Mapping reads to the human reference
3) Base recalibration
4) Variant calling
5) Variant filtration & variant annotation
6) Finding disease-related variants
Figure 4 – The workflow of Whole Exome Sequence data analysis

Pre-processing and Quality Control
• The first step in WES data analysis is the quality check and pre-processing of the reads.
• The quality check ensures that the reads are of good quality. Most WES data
analysis pipelines use “FastQC” for this step. The tool visualises the quality
of the reads and reports on several important parameters.
• The parameters tested include GC content, per base quality scores, per sequence
quality scores, sequence duplication levels, per base N content, adapter content
and several others.
• After the quality checks, reads shorter than 40 base pairs and adapter
sequences are usually removed during pre-processing. Several tools
are available for pre-processing, including fastp, Cutadapt and Trimmomatic.
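The pre-processing described above can be sketched in a few lines of plain Python. This is a simplified illustration and not how fastp, Cutadapt or Trimmomatic are implemented: it assumes Phred+33 quality encoding, looks only for an exact match of one example adapter sequence, and all function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string)

MIN_LENGTH = 40            # reads shorter than 40 bp are discarded
ADAPTER = "AGATCGGAAGAGC"  # assumption: example adapter sequence to trim


def parse_fastq(text: str) -> Iterator[Read]:
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    lines = [line.strip() for line in text.strip().splitlines()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean base quality, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


def trim_adapter(read: Read, adapter: str = ADAPTER) -> Read:
    """Cut the read at the first exact occurrence of the adapter."""
    name, seq, qual = read
    cut = seq.find(adapter)
    if cut == -1:
        return read
    return name, seq[:cut], qual[:cut]


def preprocess(reads: Iterable[Read], min_length: int = MIN_LENGTH) -> List[Read]:
    """Trim adapters, then drop reads that are too short."""
    trimmed = (trim_adapter(read) for read in reads)
    return [read for read in trimmed if len(read[1]) >= min_length]


insert = "ACGT" * 12 + "AC"            # 50 bp of genomic sequence
long_read = insert + ADAPTER + "TTTT"  # read that runs into the adapter
short_read = "ACGT" * 5                # 20 bp, below the length cutoff
fastq = "\n".join([
    "@long_with_adapter", long_read, "+", "I" * len(long_read),
    "@too_short", short_read, "+", "I" * len(short_read),
])

cleaned = preprocess(parse_fastq(fastq))
for name, seq, qual in cleaned:
    print(name, len(seq), f"GC={gc_content(seq):.2f}", f"meanQ={mean_phred(qual):.0f}")
```

Real trimmers also handle partial adapter matches, mismatches, paired-end reads and quality-based trimming, all of which this sketch leaves out.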

Alignment/mapping the reads
• The trimmed and processed reads are used as input for the mapping step.
This step aligns (maps) the reads to the reference genome in order to
determine the genomic position of each read.
• To perform mapping, a reference genome is selected. For humans, GRCh37
or GRCh38 is currently used as the reference in many studies. The
corresponding UCSC builds hg19 and hg38 are also used by data analysts.
• There are several tools for alignment. The selection of an alignment tool
depends on the following:
1) The length of the reads
2) The size of the reference genome used
• The tools available for alignment include BWA and Bowtie2.
Variant Calling and annotation
• Variant calling is the process in which a tool identifies variants by
comparing the base at each position with the reference base.
• There are various tools for variant calling, including VarScan, FreeBayes,
HaplotypeCaller, Mutect2 and others. The choice of tool varies from one
workflow to another, depending on the type of variant to be called and on the
downstream analyses and the tools to be used for them.
• Variant annotation is the process that identifies the region of the genome in
which the variant occurs (for example intronic, exonic or downstream), the gene
affected or the nearest gene, the reference and alternate alleles, and
many other functional and structural properties.
• The variants are then filtered on properties such as quality, depth
and other parameters. This process helps in retaining true positive variants and
minimizing false positives (based on those parameters).
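The idea of comparing each position with the reference and then filtering on depth can be sketched as a naive pileup-based caller. This is a teaching example only and does not reflect how VarScan, FreeBayes, HaplotypeCaller or Mutect2 work internally. The thresholds are assumed values, positions are 0-based, and only single-base substitutions are considered.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    position: int        # 0-based position in the reference
    ref: str
    alt: str
    depth: int
    alt_fraction: float


def call_variants(reference: str, pileup: Dict[int, str], min_depth: int = 10,
                  min_alt_fraction: float = 0.2) -> List[Variant]:
    """Call substitutions from the bases observed at each position."""
    variants: List[Variant] = []
    for position in sorted(pileup):
        bases = pileup[position].upper()
        depth = len(bases)
        if depth < min_depth:
            continue  # too few reads to trust a call at this position
        ref_base = reference[position]
        alt_counts = Counter(b for b in bases if b != ref_base)
        if not alt_counts:
            continue  # every read agrees with the reference
        alt, count = alt_counts.most_common(1)[0]
        fraction = count / depth
        if fraction >= min_alt_fraction:
            variants.append(Variant(position, ref_base, alt, depth, fraction))
    return variants


reference = "ACGTACGT"  # invented reference
pileup = {
    2: "GGGGGGAAAAAA",  # half the reads show A: plausible heterozygous variant
    4: "AAAAAAAAAAAT",  # a single discordant read: likely a sequencing error
    6: "TTTT",          # all reads differ, but depth is too low
}

calls = call_variants(reference, pileup)
print(calls)
```

Only the first position passes both filters, which shows how depth and allele fraction thresholds trade missed variants against false positives.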

Finding disease related variants
• After variant annotation, we can search for variants in the
already known candidate genes of that particular disease.
• It is also possible to identify variants in genes that are possibly
involved in the disease. After filtering the variants, we can select and list
variants that have an allele frequency below 1% (rare variants, which are more
likely to be deleterious) and predicted deleterious effects (by SIFT or PolyPhen),
after which gene-pathway analysis can be done to support their involvement in
the disease.
• Variants annotated with dbSNP and ClinVar let us find out
whether already known disease-related variants are present in the
sample.
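The prioritization steps above can be sketched as a simple filter over annotated variants. This is a minimal illustration under stated assumptions: the gene names are placeholders, the 1% allele frequency cutoff is an assumed rare-variant threshold, and the SIFT and PolyPhen fields are assumed to hold categorical prediction labels rather than scores.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Assumption: PolyPhen prediction labels treated as damaging.
DAMAGING_POLYPHEN = {"probably_damaging", "possibly_damaging"}


@dataclass(frozen=True)
class AnnotatedVariant:
    gene: str
    allele_frequency: float  # population allele frequency, as a fraction
    sift: str                # assumed label: "deleterious" or "tolerated"
    polyphen: str            # assumed label, e.g. "benign"


def is_rare_and_damaging(variant: AnnotatedVariant, max_af: float) -> bool:
    """Rare in the population and predicted damaging by either tool."""
    rare = variant.allele_frequency < max_af
    damaging = (variant.sift == "deleterious"
                or variant.polyphen in DAMAGING_POLYPHEN)
    return rare and damaging


def prioritize(variants: Iterable[AnnotatedVariant], candidate_genes: Set[str],
               max_af: float = 0.01) -> List[AnnotatedVariant]:
    """Keep rare damaging variants, listing known candidate genes first."""
    kept = [v for v in variants if is_rare_and_damaging(v, max_af)]
    return sorted(kept, key=lambda v: v.gene not in candidate_genes)


# Placeholder genes and invented annotations.
variants = [
    AnnotatedVariant("GENE_B", 0.0005, "deleterious", "probably_damaging"),
    AnnotatedVariant("GENE_A", 0.002, "tolerated", "probably_damaging"),
    AnnotatedVariant("GENE_A", 0.15, "deleterious", "probably_damaging"),  # common
    AnnotatedVariant("GENE_C", 0.0001, "tolerated", "benign"),             # benign
]

shortlist = prioritize(variants, candidate_genes={"GENE_A"})
for v in shortlist:
    print(v.gene, v.allele_frequency, v.sift, v.polyphen)
```

Variants in genes outside the candidate list are kept at the end of the shortlist rather than discarded, which matches the point that WES can surface genes not previously linked to the disease.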

Conclusion
There are reports showing the advantages of Whole Exome Sequencing in the clinical
diagnosis of several diseases and disorders. Although its effectiveness has been stated, the
rate of diagnosis varies from one study to another and is highly dependent on the number of
individuals considered in the study, the heterogeneity of the disease, and many other
factors.
For disease-related variant identification, Whole Exome Sequencing is considered to be
faster and more cost-efficient than WGS, but less efficient at detecting
variants because of its non-uniform sequencing coverage. It is also well known that it cannot
detect variants in the intronic and other non-coding regions of the genome. Although
non-coding genetic variants were earlier not considered important in pathogenesis,
there are now studies relating them to diseases and disorders, so it is also important to analyze
the variants in those regions.
Many researchers and clinicians prefer WES for diagnosis because a
gene panel can only test a set of genes that were previously known, whereas
with WES we can sequence the exons, search the previously known candidate
genes, and also search for genes based on the clinical interpretation. This technology can
therefore help in the identification of genes that were not previously reported but that carry
variants in the patient and are involved in the pathogenesis.

References
• Bamshad, M. J., Ng, S. B., Bigham, A. W., Tabor, H. K., Emond, M. J., Nickerson, D. A., & Shendure, J. (2011). Exome sequencing as a tool for Mendelian disease gene discovery. Nature reviews.
Genetics, 12(11), 745–755. https://doi.org/10.1038/nrg3031
• Burdick, K. J., Cogan, J. D., Rives, L. C., Robertson, A. K., Koziura, M. E., Brokamp, E., Duncan, L., Hannig, V., Pfotenhauer, J., Vanzo, R., Paul, M. S., Bican, A., Morgan, T., Duis, J., Newman, J. H., Hamid, R.,
Phillips, J. A., 3rd, & Undiagnosed Diseases Network (2020). Limitations of exome sequencing in detecting rare and undiagnosed diseases. American journal of medical genetics. Part A, 182(6), 1400–
1406. https://doi.org/10.1002/ajmg.a.61558
• Suwinski, P., Ong, C., Ling, M., Poh, Y. M., Khan, A. M., & Ong, H. S. (2019). Advancing Personalized Medicine Through the Application of Whole Exome Sequencing and Big Data Analytics. Frontiers in
genetics, 10, 49. https://doi.org/10.3389/fgene.2019.00049
• Westra, D., Schouten, M. I., Stunnenberg, B. C., Kusters, B., Saris, C., Erasmus, C. E., van Engelen, B. G., Bulk, S., Verschuuren-Bemelmans, C. C., Gerkes, E. H., de Geus, C., van der Zwaag, P. A., Chan, S.,
Chung, B., Barge-Schaapveld, D., Kriek, M., Sznajer, Y., van Spaendonck-Zwarts, K., van der Kooi, A. J., Krause, A., … Voermans, N. C. (2019). Panel-Based Exome Sequencing for Neuromuscular Disorders
as a Diagnostic Service. Journal of neuromuscular diseases, 6(2), 241–258. https://doi.org/10.3233/JND-180376
• Rochtus, A., Olson, H. E., Smith, L., Keith, L. G., El Achkar, C., Taylor, A., Mahida, S., Park, M., Kelly, M., Shain, C., Rockowitz, S., Rosen Sheidley, B., & Poduri, A. (2020). Genetic diagnoses in epilepsy: The
impact of dynamic exome analysis in a pediatric cohort. Epilepsia, 61(2), 249–258. https://doi.org/10.1111/epi.16427
• Stefanski, A., Calle-López, Y., Leu, C., Pérez-Palma, E., Pestana-Knight, E., & Lal, D. (2021). Clinical sequencing yield in epilepsy, autism spectrum disorder, and intellectual disability: A systematic review
and meta-analysis. Epilepsia, 62(1), 143–151. https://doi.org/10.1111/epi.16755
• Bartha, Á., & Győrffy, B. (2019). Comprehensive Outline of Whole Exome Sequencing Data Analysis Tools Available in Clinical Oncology. Cancers, 11(11), 1725. https://doi.org/10.3390/cancers11111725
• Retterer, K., Juusola, J., Cho, M. et al. Clinical application of whole-exome sequencing across clinical indications. Genet Med 18, 696–704 (2016). https://doi.org/10.1038/gim.2015.148
• Zhang, Q., Qin, Z., Yi, S., Wei, H., Zhou, X.Z., & Su, J. (2021). Clinical application of whole-exome sequencing: A retrospective, single-center study. Experimental and Therapeutic Medicine, 22, 753.
https://doi.org/10.3892/etm.2021.10185
• O'Roak, B. J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B. P., Levy, R., Ko, A., Lee, C., Smith, J. D., Turner, E. H., Stanaway, I. B., Vernot, B., Malig, M., Baker, C., Reilly, B., Akey, J. M., Borenstein,
E., Rieder, M. J., Nickerson, D. A., … Eichler, E. E. (2012). Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, 485(7397), 246–250.
https://doi.org/10.1038/nature10989
• Jeste, S. S., & Geschwind, D. H. (2014). Disentangling the heterogeneity of autism spectrum disorder through genetic findings. Nature reviews. Neurology, 10(2), 74–81.
https://doi.org/10.1038/nrneurol.2013.278
