Determining the order of the billions of chemical units that build the genetic material.
◦ The secrets of life are locked up in the order of the 4 letters!
◦ 5-100 million living species?
Organism                    Year   Institute                                  Genome size
Bacteriophage MS2           1976   Walter Fiers at the University of Ghent    3569 bp
Phage Φ-X174                1977   Fred Sanger, Cambridge                     5386 bp
Haemophilus influenzae      1995   TIGR                                       1,830,138 bp
Saccharomyces cerevisiae    1996   European Effort                            12,495,682 bp (16 chromosomes)
Human Genome Project        2000   Multiple Organizations                     3.3 x 10^9 bp (3 billion letters)
Sanger dideoxy sequencing method (1977)
Maxam-Gilbert chemical degradation method (1977)
Two labs that owned automated sequencers:
1. Leroy Hood at Caltech, 1986 (commercialized by AB)
2. Wilhelm Ansorge at EMBL, 1986 (commercialized by Pharmacia-Amersham and GE Healthcare)
3. Hypoxanthine-guanine phosphoribosyltransferase (HGPRT), Alu sequences
4. Hitachi Laboratory developed a high-throughput capillary array sequencer, 1996.
1991: a patent filed by EMBL on media-less, solid-support-based sequencing.
454 sequencing method (2006)
◦ Principles of pyrophosphate detection (1985, 1988)
Illumina (Solexa) genome sequencing method (2007)
Applied Biosystems ABI SOLiD System (2007)
Helicos single-molecule sequencing (HeliScope, 2007)
Pacific Biosciences single-molecule real-time (SMRT) technology, 2010
Sequenom for nanotechnology-based sequencing
BioNanomatrix nanofluidics
RNAP technology
http://www.ncbi.nlm.nih.gov/books/NBK20261/
JGI – IMG [http://img.jgi.doe.gov/]
Broad
TIGR
WashU
VBI at Virginia Tech
[Timeline figure: from the Human Genome Project to ENCODE. Years shown: 1990, 2000, 2003, 2005, 2007, 2012. Recoverable labels: Human Genome Project started; finished paper; NHGRI solicited pilot proposals for ENCODE; first GWAS published; 90% lies outside coding; first report on ENCODE published; in October, RFAs were sought for full ENCODE; publication in ENCODE.]
What we knew
• 95% of the genome is “junk”.
  – 2.94% of the genome is coding
• cis regulatory elements occur within a limited genomic distance.
• Most of the genome consists of transposable elements of obscure origin that are dying out.
• Transcribed elements are more often translated than not.
Some useful links:
• http://www.nature.com/encode/
• http://www.encodeproject.org/ENCODE/
• http://www.factorbook.org/
• http://encodeproject.org/ENCODE/dataStandards.html
• http://1000genomes.org
• http://genome.ucsc.edu/ENCODE/
Key Findings:
• 80% of the human genome is active!
  – 70,000 promoters and 400,000 enhancers
• 75% of the genome is transcribed in some tissue or other during a lifetime.
• The environment plays a great role in switching many genes on or off. [Epigenetics]
• Most diseases lie not with the genes but with the switches!
• The “dark matter” controlling the genes is physically close to the genes it controls.
Key Findings:
• Genes and switches do not have a one-to-one relationship!
• 4 million switches control 21,000 genes!
• Identical twins are NOT identical: they are greatly influenced by their environments.
• Astronomy and genetic biology look similar (roughly 95% of the Universe is “dark”, that is, dark matter and dark energy that we do not understand).
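The "not one-to-one" point can be made concrete with a little arithmetic. The sketch below only divides the counts quoted on the Key Findings slides; it assumes the switches are spread evenly across genes, which is a simplification for illustration and not something the slides claim.

```python
# Counts quoted on the Key Findings slides (taken as given, not re-derived here).
SWITCHES = 4_000_000   # regulatory "switches"
GENES = 21_000         # genes
PROMOTERS = 70_000
ENHANCERS = 400_000


def average_per_unit(total: int, units: int) -> float:
    """Average count per unit, assuming an even spread (illustrative only)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


switches_per_gene = average_per_unit(SWITCHES, GENES)
enhancers_per_promoter = average_per_unit(ENHANCERS, PROMOTERS)

print(f"about {switches_per_gene:.0f} switches per gene on average")
print(f"about {enhancers_per_promoter:.1f} enhancers per promoter on average")
```

Even under this crude even-spread assumption, each gene would have well over a hundred switches, which is why a one-gene, one-switch picture does not hold.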
Copy Number Variation
SNPs
Indels
http://en.wikipedia.org/wiki/1000_Genomes_Project
Yoruba in Ibadan, Nigeria; Japanese in Tokyo; Chinese in Beijing; Utah residents with ancestry from northern and western Europe; Luhya in Webuye, Kenya; Maasai in Kinyawa, Kenya; Toscani in Italy; Peruvians in Perú; Gujarati Indians in Houston; Chinese in metropolitan Denver; people
To study the effect of the environment on diseases.
99.5% of DNA is similar.
269 individuals genotyped.
One million SNPs genotyped
◦ Rose to 10 million including polymorphic sites.
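To give a feel for these numbers, here is a minimal Python sketch. It takes the 99.5% similarity, the 269 individuals and the one million SNPs from this slide, and the 3.3 x 10^9 bp human genome size from the genome-size table earlier in the deck. The helper names and the toy sequences are hypothetical, and per-position comparison of equal-length sequences is a simplification that ignores indels and copy number variation.

```python
GENOME_SIZE_BP = 3_300_000_000  # human genome size from the genome-size table
SIMILARITY = 0.995              # "99.5% of DNA is similar"
INDIVIDUALS = 269
SNPS_GENOTYPED = 1_000_000


def differing_sites(genome_size: int, similarity: float) -> int:
    """Approximate number of positions that differ at a given similarity."""
    return round(genome_size * (1 - similarity))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Positions expected to differ at 99.5% similarity.
print(differing_sites(GENOME_SIZE_BP, SIMILARITY))

# Total genotype calls for 269 individuals at one million SNPs.
print(INDIVIDUALS * SNPS_GENOTYPED)

# Toy example: one mismatch in ten positions (hypothetical sequences).
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

On these figures, a 0.5% difference still corresponds to millions of positions, which is why genotyping a fixed panel of SNPs is a practical way to survey variation.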