 Introduction
 Genome
 Genome sequencing
 Methods of sequencing
 DNA sequencing
 Genome sequencing
 Genome sequencing of some crops
 Case studies
 Limitations & Future Prospects
 Conclusion
 A genome is an organism’s
complete set of DNA,
including all of its genes.
 Each genome contains all of
the information needed to
build and maintain that
organism.
 In humans, a copy of the
entire genome, more than 3
billion DNA base pairs, is
contained in all cells that have
a nucleus.
Source: https://www.ncbi.nlm.nih.gov/
 Genome sequencing is figuring
out the order of DNA nucleotides,
or bases, in a genome i.e. the
order of As, Cs, Gs, and Ts that
make up an organism's DNA. The
human genome is made up of
over 3 billion of these genetic
letters.
 Genome sequence is simply a
very long string of letters in a
mysterious language.
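As a minimal sketch of what "a very long string of letters" means in practice, the Python snippet below counts each base in a sequence and computes its GC content. The 12-letter sequence and the function name base_composition are made up for illustration and are not taken from any real genome.

```python
from collections import Counter

def base_composition(sequence):
    """Count each base and return (counts, GC fraction) for a DNA string."""
    seq = sequence.upper()
    counts = Counter(seq)
    total = sum(counts[base] for base in "ACGT")
    gc = counts["G"] + counts["C"]
    gc_fraction = gc / total if total else 0.0
    return counts, gc_fraction

# Hypothetical fragment used only as an example.
counts, gc_fraction = base_composition("ATGCGCATTAGC")
print(dict(counts), gc_fraction)
```

The same function would work unchanged on a sequence millions of letters long, which is the sense in which a genome can be handled as plain text.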
 Genome sequences represent a valuable shortcut, helping scientists find genes
much more easily and quickly.
 They help scientists understand how the genome as a whole works and how
genes work together to direct the growth, development and
maintenance of an entire organism.
 Genes account for less than 25 percent of the DNA in the
genome, and so knowing the entire genome sequence will
help scientists study the parts of the genome outside the
genes. This includes the regulatory regions that control how
genes are turned on and off, as well as long stretches of
"nonsense" or "junk" DNA.
 Allows mass sequencing of genomes and transcriptomes.
 Genome-wide expression studies provide breeders with an
understanding of the molecular basis of complex traits.
 Re-sequencing of genomes is very useful for the genome-wide
discovery of markers amenable for high-throughput genotyping
platforms, like SSRs and SNPs, or the construction of high
density genetic maps.
 They allow the identification of markers linked to genes and
QTLs.
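To illustrate the idea of marker discovery by re-sequencing, here is a toy Python sketch that lists single-base differences (candidate SNPs) between a reference and a re-sequenced sequence. It assumes the two sequences are already aligned and of equal length. The sequences and the function name find_snps are hypothetical. Real SNP calling works on many aligned reads with quality scores, which this sketch ignores.

```python
def find_snps(reference, resequenced):
    """Return (position, ref_base, alt_base) for each mismatch between two
    already-aligned sequences of equal length. Positions are 1-based."""
    if len(reference) != len(resequenced):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, resequenced))
        if ref_base != alt_base
    ]

# Hypothetical aligned sequences from a reference and a second variety.
reference_seq = "ATGGCATTCGA"
resequenced_seq = "ATGACATTCGT"
snps = find_snps(reference_seq, resequenced_seq)
print(snps)
```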
1. First generation sequencing
 Maxam-Gilbert sequencing.
 Sanger sequencing.
2. Next generation sequencing (NGS) methods
 Pyrosequencing.
 454 sequencing.
 Illumina (Solexa) sequencing.
 SOLiD sequencing.
 Ion Torrent sequencing.
3. Third generation sequencing
 Single molecule sequencing (SMS).
 Single molecule real time (SMRT) sequencing.
 Oxford Nanopore sequencing (ONS).
Sanger sequencing is known as the chain termination
method of sequencing.
 There are two approaches to the
task of cutting up the genome and
putting it back together again.
 Hierarchical shotgun
sequencing
 Whole genome shotgun
sequencing
 Hierarchical shotgun sequencing is also known as clone-by-clone sequencing.
 It is a map-based approach, i.e. map first, sequence later.
 During clone-by-clone sequencing, a map of each
chromosome of the genome is made before the DNA is split
up into fragments ready for sequencing.
 In clone-by-clone sequencing the genome is broken up into
large chunks about 150 kilobases long (150,000 base pairs).
 The chunks are then inserted into bacterial artificial
chromosomes (BACs) and put inside bacterial cells to grow.
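For a sense of scale, the short Python calculation below estimates how many 150,000 bp chunks are needed to cover a genome of about 3 billion base pairs once. The coverage parameter is an assumption added for illustration. Real BAC libraries contain several-fold more clones than this minimum so that neighbouring clones overlap.

```python
import math

GENOME_SIZE_BP = 3_000_000_000   # roughly the size of the human genome
CHUNK_SIZE_BP = 150_000          # one BAC-sized chunk

def chunks_needed(genome_bp, chunk_bp, coverage=1):
    """Number of chunks needed to cover the genome `coverage` times over."""
    return math.ceil(genome_bp * coverage / chunk_bp)

min_chunks = chunks_needed(GENOME_SIZE_BP, CHUNK_SIZE_BP)
print(min_chunks)
```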
 The chunks of DNA are copied each
time the bacteria divide to produce lots
of identical copies.
 The DNA in the individual
bacterial clones is then broken down
into even smaller, overlapping
fragments. Each fragment is about 500 base
pairs long, a more
manageable size for sequencing.
 These fragments are put into
a vector that has a known DNA
sequence.
 The DNA fragments are then
sequenced, starting with the known
Contd…
 This ‘assembly’ is carried out by computers
which spot areas of overlap and piece the
DNA sequence together. Then, by
following the map constructed at the
beginning, the large chunks can be
assembled back into the chromosomes as
part of the complete genome sequence.
 The clone-by-clone approach was used
during the 1980s and 1990s to sequence
the genomes of the nematode worm
(C. elegans) and the yeast (S. cerevisiae).
 Clone-by-clone sequencing was the
preferred method during the Human
Advantages
 Every fragment of DNA is taken
from a known region of the
genome, so it is relatively easy to
determine where there are any
gaps in the sequence.
 Assembly is more reliable because
a genome map is followed, so the
scientists know where the larger
fragments are in relation to each
other.
 As each fragment is distinct many
people can work on the genome at
Disadvantages
 Making clones and generating
genome maps takes a long time.
 Clone-by-clone sequencing is
generally more expensive than
other sequencing methods.
 Some parts of the
chromosomes, such as the
centromeres, are difficult to
clone. This is because they
contain long repetitive sections,
which make them difficult to cut
and clone into BACs.
 Shotgun sequencing involves randomly
breaking up DNA sequences into lots of
small pieces and then reassembling the
sequence by looking for regions of
overlap.
 Shotgun sequencing was originally used by
Fred Sanger and his colleagues to
sequence small genomes such as those
of viruses and bacteria.
 This method of genome sequencing
was used by Craig Venter, founder of
 In whole genome shotgun
sequencing the entire genome is
broken up into small fragments
of DNA for sequencing.
 The sequenced fragments are then
assembled together by computer
programs that find where fragments
overlap.
 We can imagine shotgun
sequencing as being a bit like
shredding multiple copies of a
book (which in this case is a
genome), mixing up all the
fragments and then reassembling
the original text (genome) by
finding fragments with text that
WGS Cont…
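The overlap-based reassembly described above can be sketched in a few lines of Python. This is a toy greedy assembler that assumes error-free fragments, no repeats and a minimum overlap of 3 bases. The fragments and the function names overlap and greedy_assemble are hypothetical. Real assemblers use far more sophisticated graph-based methods.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: keep the pieces as separate contigs
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags

# Hypothetical shuffled fragments of the sequence ATGGCGTACCTTAGA.
fragments = ["CCTTAGA", "ATGGCGTA", "CGTACCTT"]
contigs = greedy_assemble(fragments)
print(contigs)
```

As in the book-shredding analogy, the order in which the fragments are supplied does not matter here. Only the shared text at their ends is used to put them back together.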
Advantages
 By removing the mapping stages, whole
genome shotgun sequencing is a much
faster process than clone-by-clone
sequencing.
 Whole genome shotgun sequencing uses
a fraction of the DNA that clone-by-clone
sequencing needs.
 Whole genome shotgun sequencing is
particularly efficient if there is an existing
reference sequence. It is much easier to
assemble the genome sequence by
aligning it to an existing reference
genome.
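A minimal Python sketch of why a reference helps: each read is simply placed where it matches the reference, with no need to compare reads against each other. It assumes error-free reads and exact matching with str.find. The reference, the reads and the function names map_reads and coverage are hypothetical. Real aligners tolerate mismatches and use indexed search.

```python
def map_reads(reference, reads):
    """Place each read at its first exact match in the reference.

    Returns a dict of read -> 0-based start position; reads with no exact
    match are mapped to None.
    """
    positions = {}
    for read in reads:
        start = reference.find(read)
        positions[read] = start if start != -1 else None
    return positions

def coverage(reference, positions):
    """Per-base count of mapped reads covering each reference position."""
    depth = [0] * len(reference)
    for read, start in positions.items():
        if start is not None:
            for i in range(start, start + len(read)):
                depth[i] += 1
    return depth

reference = "ATGGCGTACCTTAGA"   # hypothetical reference sequence
reads = ["CGTACC", "ATGGCG", "CTTAGA", "GGGGGG"]
positions = map_reads(reference, reads)
depth = coverage(reference, positions)
print(positions)
print(depth)
```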
Disadvantages
 Vast amounts of computing power and
sophisticated software are required to
assemble shotgun sequences together.
 Whole genome shotgun sequencing is
most practical when a reference
genome is already available; without one,
assembly is very difficult.
 Whole genome shotgun sequencing can
also lead to errors which need to be
resolved by other, more labour-intensive
types of sequencing, such as clone-by-clone
sequencing.
 Done by: clone-by-clone
sequencing / hierarchical shotgun
sequencing / map-based
shotgun sequencing.
 It includes:
1. Map construction
2. Clone selection
3. Subclone library construction
4. Random shotgun phase
5. Directed finishing phase
6. Sequence authentication
Strategy
 Large-insert BAC and PAC libraries were prepared.
 Physical maps of the genome of Arabidopsis thaliana accession
Columbia were assembled.
 Clones were selected for shotgun sequencing.
Outcomes of the sequencing project:
 115,409,949 bp (~115.4 Mb) were sequenced.
 The unsequenced centromeric and
ribosomal DNA repeat regions measure
roughly 10 Mb.
 25,498 genes were predicted.
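The figures above can be combined in a short worked calculation. The Python sketch below treats the "roughly 10 Mb" of unsequenced repeats as exactly 10,000,000 bp, which is an assumption, so the derived values are approximate.

```python
SEQUENCED_BP = 115_409_949     # sequenced bases reported above
UNSEQUENCED_BP = 10_000_000    # rough size of the unsequenced repeat regions
PREDICTED_GENES = 25_498

total_bp = SEQUENCED_BP + UNSEQUENCED_BP
sequenced_fraction = SEQUENCED_BP / total_bp
bp_per_gene = SEQUENCED_BP / PREDICTED_GENES

print(f"Estimated genome size: {total_bp / 1e6:.1f} Mb")
print(f"Fraction sequenced:    {sequenced_fraction:.1%}")
print(f"Average spacing:       one gene per {bp_per_gene / 1000:.1f} kb")
```

Under these assumptions the genome is about 125 Mb, about 92% of it was sequenced, and the sequenced portion carries on average one predicted gene every 4.5 kb or so.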
 The International Rice Genome
Sequencing Project (IRGSP) began in
September 1997, at a workshop held in
conjunction with the International
Symposium on Plant Molecular Biology
in Singapore.
 Sequencing of the rice genome is being
performed mainly from genomic BAC
or PAC libraries created from the
Nipponbare variety.
 China, working on the sequencing of
chromosome 4, is the only IRGSP
Efficiency
 The members of the IRGSP have also agreed on
common standards, including an accuracy standard of less
than one base pair (bp) error in 10,000 bp.
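An error rate of one in 10,000 can also be expressed on the Phred quality scale, Q = -10 log10(P), which is widely used for sequencing data. The scale is not mentioned in the IRGSP standard quoted above and is brought in here only for illustration. The function name phred_quality is hypothetical.

```python
import math

def phred_quality(error_probability):
    """Phred-scaled quality: Q = -10 * log10(P_error)."""
    return -10 * math.log10(error_probability)

irgsp_error_rate = 1 / 10_000   # at most one error per 10,000 bp
q = phred_quality(irgsp_error_rate)
print(round(q))
```

On this scale the standard corresponds to a quality of about Q40, and each further 10 points means ten times fewer errors.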
Outcome of sequencing
 The genome size of rice was estimated at about 420 Mb; the IRGSP finished sequence puts it at 389 Mb.
 A total of 37,544 non-transposable-element-related
protein-coding sequences were
detected.
 A total of 2,859 genes seem to be unique to rice
and the other cereals, some of which might
differentiate monocot and dicot lineages.
 Between 0.38 and 0.43% of the nuclear
genome contains organellar DNA fragments,
representing repeated and ongoing transfer of
organellar DNA to the nuclear genome.
 The transposon content of rice is at least 35%.
Source: NCBI
INDIAN INITIATIVE FOR RICE GENOME SEQUENCING (Chromosome
No. 11)
 Pigeonpea variety ‘Asha’
(ICPL87119) was used for
genome sequencing.
 Whole genome shotgun
sequencing
 An Illumina next-generation
sequencing platform, along with
Sanger-based bacterial artificial
chromosome end sequences
and a genetic map, was used for
sequencing.
Source: https://www.ncbi.nlm.nih.gov
 A difference between the two sequencing projects is that
the ICRISAT-led team
assembled 605.78 Mb
out of the 833.07 Mb (about
72.7%) of the genome, while the
ICAR team captured 511
Mb (about 61%).
 Genome analysis predicted
59,515 genes for pigeon pea
and also showed the
potential role that certain
gene families, for example,
drought tolerance related
genes, have played
throughout the
domestication of pigeon
pea and the evolution of its
Man behind pigeon pea sequencing: Dr Rajeev K Varshney
Source: The Plant Journal (2012) 70, 177–190
More information: http://plabipd.de/portal/sequence-timeline
 A high percentage of repeats in many plant genomes
makes it difficult to assemble the short reads from the
NGS platforms.
 Failure to capture the information embedded in the
repetitive fraction of the genome is a major drawback,
as it may have key roles in regulatory aspects (Feuillet
et al. 2011).
 Heterozygosity and polyploidy also add to the
difficulties.
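Why repeats defeat short reads can be shown with a small Python experiment. Two different toy genomes that carry the same 7 bp repeat three times produce exactly the same set of 5 bp reads, so no assembler could tell them apart. Reads long enough to span a whole repeat plus its flanks do distinguish them. The sequences, the repeat and the function name read_spectrum are hypothetical, and reads are assumed to be error-free and to cover every position.

```python
from collections import Counter

def read_spectrum(genome, read_length):
    """Multiset of all error-free reads (substrings) of a given length."""
    return Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )

REPEAT = "GATTACA"  # hypothetical 7 bp repeat present in three copies
# The two genomes differ only in the order of the unique segments CC and TG.
genome_a = "A" + REPEAT + "CC" + REPEAT + "TG" + REPEAT + "T"
genome_b = "A" + REPEAT + "TG" + REPEAT + "CC" + REPEAT + "T"

short_reads_identical = read_spectrum(genome_a, 5) == read_spectrum(genome_b, 5)
long_reads_identical = read_spectrum(genome_a, 9) == read_spectrum(genome_b, 9)
print(short_reads_identical, long_reads_identical)
```

The same effect, at a much larger scale, is one reason longer reads and genetic or physical maps are valuable for repeat-rich plant genomes.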
 Extensive re-sequencing is needed for the detection of
SNPs. The cost of sequencing is the major hurdle here.
However, the cost has fallen considerably in
recent years and is expected to fall further in the near
future.
 More reliable and user-friendly software has to be
developed for more precise data analysis.
 Another challenge is that the functions of many genes
identified by genome sequencing remain unknown and
the genetic control of the majority of agronomic traits has
yet to be determined.
 The shift from manual DNA sequencing methods such as Maxam-
Gilbert sequencing and Sanger sequencing in the 1970s and 1980s to more rapid,
automated sequencing methods in the 1990s played a crucial role in giving
scientists the ability to sequence whole genomes.
 Almost any biological sample containing a full copy of the DNA, even a very small
amount, can provide the genetic material necessary for
full genome sequencing.
 Genome sequencing is like solving a puzzle, but once it is solved the result is
magnificent. We can derive much more information from it, starting from finding
genes and their functions to QTLs, and from epigenomics to phylogeny; it is all about
the genome.
 From genome sequencing we are moving towards exome sequencing. Ultimately,
all these sequencing approaches give us detailed information about an organism, which we
can use for human welfare.
DNA is not just a word of 3 letters.
It's beyond that…

CROP GENOME SEQUENCING
