This document summarizes the genome sequencing of Arabidopsis thaliana. Genome sequencing was discussed in the scientific community from 1984 onwards, and the Human Genome Project officially began in 1990. The Multinational Coordinated Arabidopsis thaliana Genome project was initiated in 1990, and whole-genome sequencing was completed in 2000, yielding approximately 115.4 Mb of sequence and 25,498 predicted genes. Outcomes of the sequencing project included characterization of the coding regions, comparative analysis between Arabidopsis accessions and with other plant genera, and integration of the three genomes in the plant cell.
Whole genome sequencing of Arabidopsis thaliana
2. “GENOME SEQUENCING”
• Idea discussed in the scientific community from 1984 onwards
• 1990: Human Genome Project officially began
• Genome sequencing approaches:
– Clone-by-clone sequencing
– Shotgun sequencing
“GENOMICS”
Determination of genetic information and the mechanism by which this information is used by the organism
“GENOME”
The complete set of genetic information of an organism
3. Genome sequencing projects
Model organisms: mostly used in genetic and scientific studies
• Yeast
• E. coli
• Caenorhabditis elegans
• Drosophila
• Arabidopsis thaliana
5. ARABIDOPSIS GENOME ANALYSIS: Initiation and
progress
• 1983 - first genetic map published
• 1988-89 - publication of RFLP maps
• 1990 - Multinational Coordinated Arabidopsis thaliana
Genome project initiated
• 1991 - first YAC libraries
• 1995-96 - standard BAC and P1 libraries constructed
• 1996 - Arabidopsis Genome Initiative organised and started
sequencing
• 1998 - Physical maps of all chromosomes completed
• 1999 - sequence and analysis of chromosomes 2 and 4
• 2000 - sequence and analysis of chromosomes 1, 3 and 5
• 2000 - completion of whole genome sequencing
6. This report includes:
– Completed Arabidopsis genome sequences
– Annotation of predicted genes
– Assignment of functional categories
– Chromosomal dynamics and architecture
– Distribution of transposable elements and other
repeats
– Extent of lateral gene transfer from organelles
– Comparison of the genome sequence and structure to
that of other Arabidopsis accessions and plant species
7. • “Clone-by-clone sequencing” = “hierarchical shotgun sequencing” = “map-based shotgun sequencing”
• It includes:
– Map construction
– Clone selection
– Subclone library construction
– Random shotgun phase
– Directed finishing phase
– Sequence authentication
8. • Primary substrates: large-insert BAC, P1 and TAC libraries
• Physical maps of the genome of accession Columbia were assembled by restriction fragment ‘fingerprint’ analysis of BAC clones, by hybridization or PCR of sequence-tagged sites (STS), and by hybridization and Southern blotting
• 47,788 BAC clones were end-sequenced to assemble the contigs
Steps
9. • 1,569 minimally overlapping BAC, TAC, cosmid and P1 clones (average insert size: 100 kb) were used to assemble 10 contigs; these clones represent the minimum tiling path
• These clones were selected for shotgun sequencing
• To link regions not covered by cloned DNA, or to optimize the minimum tiling path, 22 PCR products were amplified directly from genomic DNA
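Selecting a minimum tiling path from mapped clones can be illustrated as an interval-cover problem. The sketch below is only an illustration, not the procedure used by the sequencing project: it assumes each clone has already been placed on the physical map as a hypothetical `(name, start, end)` tuple, and it uses a standard greedy rule (among clones that overlap the region covered so far, pick the one that extends furthest).

```python
from typing import List, Tuple

# Hypothetical clone record: (name, start, end) in map coordinates, end exclusive.
Clone = Tuple[str, int, int]


def minimum_tiling_path(clones: List[Clone], region_start: int, region_end: int) -> List[Clone]:
    """Greedy interval cover: return a smallest set of clones spanning the region."""
    ordered = sorted(clones, key=lambda c: c[1])  # sort by start coordinate
    path: List[Clone] = []
    covered = region_start  # everything left of this position is already covered
    i = 0
    while covered < region_end:
        best = None
        # Among clones starting at or before the covered edge, keep the one reaching furthest.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            # No clone bridges this position; a gap like this is where a PCR product would be needed.
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best)
        covered = best[2]
    return path


# Toy example with made-up coordinates (in kb).
example_clones = [("A", 0, 100), ("B", 50, 120), ("C", 90, 200), ("D", 150, 260), ("E", 190, 300)]
tiling = minimum_tiling_path(example_clones, 0, 300)
print([name for name, _, _ in tiling])
```

In this toy input, clones B and D are redundant because A, C and E already overlap end to end, so they are left out of the path.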
10. • The DNA insert of each selected clone is purified and randomly fragmented by physical shearing
• The broken ends are enzymatically repaired
• Fragments of 2-5 kb are size-fractionated and eluted
• They are subcloned into plasmid or M13 vectors
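The random fragmentation and size selection steps can be mimicked in a few lines. This is a toy simulation under stated assumptions, not a model of the actual laboratory protocol: break points are drawn uniformly at random, the 100 kb insert length and the number of breaks are illustrative values, and the function name is hypothetical.

```python
import random
from typing import List, Tuple


def shear_and_size_select(insert_length: int, n_breaks: int,
                          min_size: int = 2000, max_size: int = 5000,
                          seed: int = 0) -> List[Tuple[int, int]]:
    """Break an insert at random positions and keep fragments in the size window."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    breaks = sorted(rng.sample(range(1, insert_length), n_breaks))
    edges = [0] + breaks + [insert_length]
    fragments = list(zip(edges, edges[1:]))  # consecutive (start, end) pairs
    # Size fractionation: keep only fragments of 2-5 kb.
    return [(a, b) for a, b in fragments if min_size <= b - a <= max_size]


selected = shear_and_size_select(insert_length=100_000, n_breaks=30)
print(len(selected), "fragments retained for subcloning")
```

Only a fraction of the sheared fragments falls inside the size window, which is one reason many copies of the insert are sheared in practice.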
11. • Sequence reads of the plasmids are derived from universal priming sites
• Sufficient redundant data are generated and the sequence reads are computationally assembled (>99.99% accuracy at 8- to 10-fold sequence coverage)
• All available sequenced genetic markers were integrated into the sequence assemblies to verify the sequenced contigs
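The relationship between read count and fold coverage can be worked through numerically. The sketch below assumes the simple Lander-Waterman (Poisson) model, in which coverage is c = N × L / G and the expected fraction of bases not covered by any read is e^(-c). The 500 bp read length is an assumed, illustrative value that is not stated in the slides, and this model describes gaps in coverage, not the base-level accuracy figure quoted above.

```python
import math


def fold_coverage(n_reads: int, read_length: int, target_length: int) -> float:
    """Average number of times each base is sequenced: c = N * L / G."""
    return n_reads * read_length / target_length


def reads_needed(target_length: int, read_length: int, coverage: float) -> int:
    """Number of reads required to reach a given fold coverage."""
    return math.ceil(coverage * target_length / read_length)


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson estimate of the fraction of bases hit by no read: exp(-c)."""
    return math.exp(-coverage)


# Illustrative numbers: a 100 kb clone insert and an assumed 500 bp read length.
for c in (8, 10):
    n = reads_needed(100_000, 500, c)
    print(c, n, expected_uncovered_fraction(c))
```

Under these assumptions, 8- to 10-fold coverage of a 100 kb insert corresponds to roughly 1,600 to 2,000 reads, with well under 0.1% of bases expected to be missed by the random phase.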
12. Outcomes of sequencing project
• 115,409,949 bp (~115.4 Mb) were sequenced
• The unsequenced centromeric and ribosomal DNA repeat regions measure roughly 10 Mb
• 25,498 genes were predicted
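The figures above imply a total genome size and an average gene density. The following is a small arithmetic sketch that uses only the numbers on this slide; the variable names are illustrative, and the 10 Mb figure is the rough estimate given above, so the derived total is approximate.

```python
SEQUENCED_BP = 115_409_949      # sequenced bases, from the slide
UNSEQUENCED_BP = 10_000_000     # rough estimate for centromeric and rDNA repeats
PREDICTED_GENES = 25_498        # predicted genes, from the slide

# Approximate total genome size in megabases.
total_mb = (SEQUENCED_BP + UNSEQUENCED_BP) / 1_000_000

# Average spacing of predicted genes across the sequenced portion.
bp_per_gene = SEQUENCED_BP / PREDICTED_GENES
genes_per_mb = PREDICTED_GENES / (SEQUENCED_BP / 1_000_000)

print(f"total genome size: ~{total_mb:.1f} Mb")
print(f"average spacing: ~{bp_per_gene / 1000:.1f} kb per gene")
print(f"gene density: ~{genes_per_mb:.0f} genes per Mb")
```

This works out to roughly 125 Mb in total and about one predicted gene per 4.5 kb of sequenced DNA.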
14. Outcomes of sequencing project
• Characterization of the coding regions
• Genome organization and duplication
• Comparative analysis of Arabidopsis
accessions
• Comparison of Arabidopsis and other plant
genera
• Integration of 3 genomes in the plant cell
• Transposable elements
• rDNA, telomeres and centromeres
15. • Membrane transport
• DNA repair and recombination
• Gene regulation
• Cellular organization
• Development
• Signal transduction
• Recognizing and responding to pathogens
• Photomorphogenesis and photosynthesis
• Metabolism
16. • The 1001 Genomes Project was launched at the beginning of 2008 to discover the whole-genome sequence variation in 1001 strains
• Each accession is an inbred line with seeds that are freely available from the stock centre
• Hierarchical approach to selecting the 1001 genomes:
– Sampling 10 individuals from each of 10 populations in each of 10 geographical regions throughout Eurasia, plus at least one North African accession (10 × 10 × 10 + 1)
• Sequence information can be used directly in association studies at biochemical, metabolic, physiological, morphological and whole-plant fitness levels
• The complete genome sequences of over 80 accessions were released in early 2010 by the Max Planck Institute
• Many more have been added by the Salk Institute, the Gregor Mendel Institute and Monsanto
• As of September 2014, over 1100 lines had been sequenced
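The hierarchical sampling design (10 × 10 × 10 + 1) can be enumerated explicitly. This is a minimal sketch with hypothetical placeholder indices standing in for regions, populations and individuals; it only reproduces the counting and does not reflect the actual accessions chosen.

```python
from itertools import product


def sampling_design(regions: int = 10, populations: int = 10,
                    individuals: int = 10, extra_accessions: int = 1) -> int:
    """Count the accessions in a region x population x individual design."""
    # Each slot is one (region, population, individual) combination.
    slots = list(product(range(regions), range(populations), range(individuals)))
    # The extra accession corresponds to the North African sample.
    return len(slots) + extra_accessions


print(sampling_design())
```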
18. References
• Multinational Arabidopsis Steering Committee (2010) The Multinational Coordinated Arabidopsis thaliana Functional Genomics Project: Annual Report 2010.
• Weigel D & Mott R (2009) The 1001 Genomes Project for Arabidopsis thaliana. Genome Biol 10(5): 107.
• http://1001genomes.org/
• Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815.
• Green E D (2001) Strategies for the systematic sequencing of complex genomes. Nature Reviews Genetics 2(8): 573-83.
• Singh B D (2009) Biotechnology: Expanding Horizons. Kalyani, India.