2. Exome sequencing in crop improvement
Shriram Ashru Belge
2012-11-105
Centre for Plant Biotechnology and Molecular Biology
College of Horticulture
3. Outline
Introduction
Exome sequencing and its significance
Strategies for crop improvement
Advantages of exome sequencing
Sequencing: Tools and Techniques
Application of exome sequencing
Limitations
Conclusions
4. Introduction: Human genome sequencing
• 2003: Human genome sequenced ($2.7 billion and 13 years)
• 2008: $1.5 million and 5 months
• At present: about $10,000 in a few days
(Jonatan et al., 2013)
7. Exome and its significance
• Exon: a sequence of DNA or RNA that codes for protein synthesis
• Exome: the entire protein-coding region of the haploid set of chromosomes in an organism
8. …Exome and its significance
• The whole human genome is about 3 billion bp in size and contains about 30,000 genes (www.genome.gov.in)
• The human exome consists of about 180,000 exons, constituting about 1% of the total genome, or about 30 Mb of DNA (Turner et al., 2009)
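The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the rounded values from the slide; the mean exon length it prints is simply what those rounded figures imply, not a measured value.

```python
# Rounded figures quoted on the slide.
GENOME_SIZE_BP = 3_000_000_000   # whole human genome, about 3 billion bp
EXOME_SIZE_BP = 30_000_000       # human exome, about 30 Mb
EXON_COUNT = 180_000             # about 180,000 exons


def exome_fraction(exome_bp: int, genome_bp: int) -> float:
    """Return the share of the genome occupied by the exome."""
    return exome_bp / genome_bp


def mean_exon_length(exome_bp: int, exon_count: int) -> float:
    """Return the average exon length implied by the totals."""
    return exome_bp / exon_count


fraction = exome_fraction(EXOME_SIZE_BP, GENOME_SIZE_BP)
mean_length = mean_exon_length(EXOME_SIZE_BP, EXON_COUNT)

print(f"Exome share of genome: {fraction:.1%}")
print(f"Implied mean exon length: {mean_length:.0f} bp")
```

The 30 Mb exome is 1% of a 3 billion bp genome, which agrees with the "about 1%" stated above.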
9. Strategies for crop improvement
• Conventional breeding: almost exhausted
• MAS (marker-assisted selection): lack of validated markers
11. Advantages of exome sequencing
• The majority of genetic disorders or disruptions are due to changes in protein-coding sequences
• Identifies the functional variation responsible for differential expression of desirable traits (Choi et al., 2009)
• Identifies coding-region sequence variation in closely related species
12. Whole genome vs. exome sequencing (Tang et al., 2010)
Feature | Whole genome sequencing | Exome sequencing
Sequence included | Whole genome | Only protein-coding sequence
Sequence size | Whole genome | Smaller than the whole genome
Time | Fairly long | Faster due to smaller size
Cost | Expensive | Relatively cheaper
Assembly success rate | Low due to highly repetitive sequences | High due to small size
Ease of analysis | Low due to large data size | High due to smaller data size
Information excluded | No sequence information is excluded | All non-coding regions excluded
14. Exome sequencing process
I. cDNA synthesis
• Isolation of mRNA at specific or different stages
• cDNA synthesis by reverse transcriptase
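The reverse transcription step can be pictured as taking the reverse complement of the mRNA, with uracil pairing to adenine. The following is a minimal sketch of that idea only; the toy mRNA string and the function name are hypothetical, and real first-strand synthesis also involves primers and enzyme kinetics that are not modelled here.

```python
# Base pairing used when copying RNA into DNA (U in RNA pairs with A in DNA).
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an mRNA written 5'->3'."""
    mrna = mrna.upper()
    unknown = set(mrna) - set(RNA_TO_CDNA)
    if unknown:
        raise ValueError(f"Unexpected RNA bases: {sorted(unknown)}")
    # The cDNA is antiparallel to the mRNA, so read the template backwards.
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna))


toy_mrna = "AUGGCCUAA"  # hypothetical transcript fragment
print(first_strand_cdna(toy_mrna))
```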
15. … exome sequencing process
II. DNA sequencing
A. Sanger sequencing:-
• Sanger dideoxy chain termination method (Sanger and Coulson, 1975)
• Chemical sequencing method (Maxam and Gilbert, 1976-77)
B. Next-generation sequencing:-
1. Pyrosequencing
2. Reversible terminator-based sequencing
3. Sequencing-by-ligation
4. Ion semiconductor-based non-optical sequencing
5. DNA nanoball sequencing
16. … DNA sequencing
C. Next-next generation sequencing (third generation):-
1. True single molecule sequencing (TSMS)
2. Single molecule, real time (SMRT) sequencing
3. Single-molecule RNAP motion-based real-time sequencing
4. Nanopore sequencing
17. … exome sequencing process
A. Sanger sequencing:-
• Oldest method, and still in practice
• The four ddNTPs are labelled with different fluorescent dyes during amplification
• Laser-based detection of the incorporated ddNTP in the amplicon
• Poor quality in the first 20-50 bases and poor size resolution at 600-1000 bases due to the large size of the DNA fragments
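The chain termination principle can be illustrated with a toy simulation: every fragment ends in one labelled ddNTP, and sorting the fragments by length recovers the sequence. This is a simplified sketch under stated assumptions (one fragment per position, no read-quality effects); the template string and function names are hypothetical.

```python
import random

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragments(template_3to5: str) -> list[tuple[int, str]]:
    """Return (length, terminal ddNTP) for a fragment stopping at each position.

    The template is written 3'->5', so the newly synthesized strand,
    read 5'->3', is its position-by-position complement.
    """
    new_strand = "".join(DNA_COMPLEMENT[b] for b in template_3to5.upper())
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Order fragments by size (as electrophoresis does) and read the end labels."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(base for _, base in ordered)


pool = terminated_fragments("TACCGGATT")  # hypothetical template
random.Random(0).shuffle(pool)            # fragments are mixed in the reaction tube
print(read_sequence(pool))
```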
20. Exome sequencing process
Isolation of mRNA
cDNA strand synthesis
Fragmentation of cDNA
Ligation of adapters
Amplification
DNA sequencing and data analysis
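The fragmentation and adapter ligation steps of this workflow can be sketched on a toy sequence. This is an illustration only: real fragmentation is random rather than fixed-size, and the adapter sequences, fragment size and cDNA string below are hypothetical placeholders, not those of any sequencing kit.

```python
# Hypothetical placeholder adapters, not real platform adapters.
ADAPTER_5 = "ACGT"
ADAPTER_3 = "TGCA"


def fragment(sequence: str, size: int) -> list[str]:
    """Cut a sequence into consecutive pieces of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def ligate_adapters(fragments: list[str]) -> list[str]:
    """Attach the 5' and 3' adapters to both ends of every fragment."""
    return [ADAPTER_5 + piece + ADAPTER_3 for piece in fragments]


toy_cdna = "ATGGCCTAAGGC"  # hypothetical cDNA
library = ligate_adapters(fragment(toy_cdna, 5))
for molecule in library:
    print(molecule)
```

Each library molecule now carries known sequence at both ends, which is what allows amplification and sequencing with common primers.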
21. Applications of Exome Sequencing
I. Exploring biodiversity:-
II. Investigating the natural evolution of crop:-
III. To study host-pathogen interaction:-
IV. Crop breeding:-
22. … application
I. Exploring biodiversity:-
• Morphological analysis and DNA fingerprinting are not enough to identify closely related species (Hebert et al., 2003)
• Exome sequencing overcomes this limitation
• The Physcomitrella patens exome was sequenced and annotated
• It was compared with some green algae and flowering plants (Daniel et al., 2008)
24. … application
II. Exome sequencing for investigating the natural evolution of crops:-
• Natural evolution is studied using morphological and molecular markers
• Exome sequencing technologies have been used for maize and rice domestication studies (Doebley et al., 2006)
• Exome sequencing is reported to be more efficient for studying crop evolution in African rice (Burk et al., 2007)
25. …evolution of cassava
• The evolutionary and geographical origins of cassava have remained unresolved and controversial
• Exome analysis of the nuclear gene glyceraldehyde 3-phosphate dehydrogenase (G3pdh), which acts on aldehydes
26. …evolution of cassava
• All genotypes from the different regions were sequenced and their exomes compared
• 28 haplotypes were identified among the 212 individuals (424 alleles) examined
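Counting haplotypes amounts to collecting the two allele sequences of each diploid individual and tallying the distinct ones, which is why 212 individuals give 424 alleles. The following is a minimal sketch with made-up toy sequences; it does not reproduce the cassava data.

```python
from collections import Counter

# Hypothetical toy data: each diploid individual carries two allele sequences.
individuals = [
    ("ACGT", "ACGT"),
    ("ACGT", "ACTT"),
    ("GCGT", "ACTT"),
]

# Flatten the pairs into one list of alleles (two per individual).
alleles = [allele for pair in individuals for allele in pair]

# Each distinct sequence is one haplotype; the counter holds its frequency.
haplotype_counts = Counter(alleles)

print(f"Individuals: {len(individuals)}")
print(f"Alleles examined: {len(alleles)}")
print(f"Distinct haplotypes: {len(haplotype_counts)}")
for haplotype, count in haplotype_counts.most_common():
    print(haplotype, count)
```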
27. …evolution of cassava
• Origin of cassava: confirmed as the Amazon basin and Brazil (Kenneth and Barbara, 1999)
[Map: Brazil, Amazon basin]
28. …applications
III. Exome sequencing to study host-pathogen interaction:-
• Virulence and susceptibility in a host-pathogen interaction can be altered by even a single amino acid change (Carroll et al., 2011)
• Mapping the entire host genome for every modification of the host-pathogen interaction is a challenging task
• Hu et al. (2010) identified the genes involved in plant-bacterial interaction through exome sequencing
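Because a single amino acid change can matter, coding variants are usually classified by translating the reference and variant sequences and comparing the proteins. The sketch below builds the standard genetic code and reports amino acid differences; the sequences are hypothetical toy examples, and it assumes in-frame substitutions of equal length (no insertions or deletions).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding DNA sequence codon by codon (trailing bases are ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def amino_acid_changes(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) for each difference."""
    pairs = zip(translate(reference), translate(variant))
    return [(i, ref, alt) for i, (ref, alt) in enumerate(pairs, start=1) if ref != alt]


reference_cds = "ATGGCCGAATAA"   # hypothetical: Met-Ala-Glu-stop
missense_cds = "ATGGCCAAATAA"    # GAA -> AAA changes Glu to Lys
synonymous_cds = "ATGGCTGAATAA"  # GCC -> GCT still encodes Ala

print(amino_acid_changes(reference_cds, missense_cds))
print(amino_acid_changes(reference_cds, synonymous_cds))
```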
29. … study host-pathogen interaction
• Evaluated Pseudomonas syringae (DC3000 strain) infection in Arabidopsis thaliana
• The infection causes bacterial speck disease, and some changes occurred
• Type III secreted effectors (T3SEs) and other virulence-associated genes were detected (Hu et al., 2010)
(http://www.mmg.mso.edu.com)
30. IV. Exome sequencing in crop breeding:-
• Traditional breeding is not enough to improve disease resistance in high-yielding varieties
• Exome sequencing helps the analysis of individual alleles and QTLs of genotypes
• A unique tool to test genetic markers in MAS
• TILLING and EcoTILLING become easier in germplasm collections for finding allelic variants in target genes
31. Limitations
• Some useful regions of the genome are not covered in exome sequencing
Examples:-
- miRNAs
- UTRs
- pseudogenes, etc.
32. … limitations
1. MicroRNAs (miRNAs) and other noncoding RNAs:-
• miR160, miR167 and miR171 could be responsible for the development of Arabidopsis thaliana root systems under N-starvation conditions
33. ...limitations
2. 3′- and 5′-UTR regions (Washida et al., 2009):-
• In rice, glutelin production requires cis-localization elements for the RNA transport process
• Two are located at the 5′ and 3′ ends of the coding sequence, and the third is within the 3′ untranslated region (not covered in the exome sequence)
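The coverage gap described here can be expressed as a simple interval check: an element is seen only if it lies inside a targeted coding interval. The sketch below uses invented coordinates purely for illustration; the names, positions and the single-target layout are hypothetical and do not describe the actual glutelin gene.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def is_covered(element: Interval, targets: list[Interval]) -> bool:
    """Return True if the element lies entirely inside any target interval."""
    return any(t.start <= element.start and element.end <= t.end for t in targets)


# Hypothetical layout: the exome target spans only the coding sequence.
targets = [Interval("coding sequence", 100, 1600)]
elements = [
    Interval("element at 5' end of coding sequence", 100, 160),
    Interval("element at 3' end of coding sequence", 1540, 1600),
    Interval("element in 3' UTR", 1650, 1700),
]

coverage = {element.name: is_covered(element, targets) for element in elements}
for name, covered in coverage.items():
    print(f"{name}: {'covered' if covered else 'missed'}")
```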
34. …limitations
3. Pseudogenes (Ebert and Sharp, 2010):-
• Non-coding regions regulate the RNA-coding regions
• miRNA is produced from non-coding regions
35. Conclusion
• Exome: the part of the genome that includes only the coding regions
• Knowing the exact coding regions can help manage the crop more efficiently
• More efficient with new-generation sequencing
• Less data is generated, and it is easy to assemble and analyse
36. …conclusion
• An excellent and practical (in cost and time) tool for crop improvement compared with whole genome sequencing
• Utilizing the omics and bioinformatics platforms and integrating their outcomes will serve crop improvement