[Timeline figure] Human Genome Project started in 1990 and finished in 2003. NHGRI solicited pilot proposals for ENCODE. First GWAS paper published in 2005; 90% lies outside coding regions. First report on ENCODE published in 2007. RFAs were sought for the full ENCODE project (October). ENCODE publication in 2012.
Treasure hunt? "It is like Google Maps," says Eric Lander: a map of the Earth from outer space.
95% of the genome is “junk”.
  ◦ 2.94% of the genome is coding.
Cis-regulatory elements occur within a limited genomic distance.
Most of the genome is transposable elements of obscure origin that are dying out.
Transcribed elements are more often translated than not.
80% of the human genome is active!
  ◦ 70,000 promoters and 400,000 enhancers.
75% of the genome is transcribed in some tissue or other during a lifetime.
Environment plays a great role in switching many genes on or off (epigenetics).
Most diseases lie not with the genes but with the switches!
The “dark matter” controlling the genes is physically close to the genes it controls.
Genes and switches do not have a one-to-one relationship: 4 million switches control 21,000 genes!
Identical twins are NOT identical; they are greatly influenced by their environments.
Astronomy and genetics look similar (about 95% of the universe is dark matter and dark energy, which we don't understand).
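The many-to-one relationship between switches and genes can be made concrete with a quick back-of-the-envelope calculation. This is only a sketch using the two figures quoted on the slide; it assumes switches are spread evenly across genes, which the slide does not claim.

```python
# Figures quoted on the slide.
switches = 4_000_000
genes = 21_000

# Average number of switches per gene, assuming (for illustration only)
# that switches are distributed evenly across genes.
switches_per_gene = switches / genes

print(f"About {switches_per_gene:.0f} switches per gene on average")
```

An average hides the real distribution: some genes may be controlled by far more switches than others.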
“This explains why 6.5 billion people on earth don't look alike.”
Intelligent Design (Creationism) believers are excited that it is the handiwork of God.
Natural selectionists (Darwinists) are excited that it is natural selection at its best.
  ◦ This has sparked a war between Democrats and Republicans, as usual.
“Junk DNA” is an oxymoron.
Some are still wondering about the remaining 20%.
“I hope this information stirs the mind of those researchers that have ignored ‘trace minerals’ in food as part of the nutritional package.”
The more we think we are close to finding an answer, the farther from it we find ourselves.
Reminds me of Aristotle, who once said: “The more you know, the more you know you don't know.”
Most of the DNA was considered “garbage” but later upgraded to “junk”.
Most people are actually happy because it is happening during their lifetime.
Switches are software and genes are hardware.
Ancient Egyptians believed the torso had a divine role and discarded the grey matter in the head as “junk”.
Sean Eddy:
“At least 40% of the human genome is composed of the decaying DNA remains of transposable elements (TEs), different species of which have replicated in great waves during the evolution of our genome.”
“I sure wish I'd gotten the memo, because this week a collaboration of labs led by myself, Arian Smit, and Jerzy Jurka just released a new data resource that annotates nearly 50% of the human genome as transposable element-derived, and transposon-derived repetitive sequence is the poster child for what we colloquially call “junk DNA”.”
http://cryptogenomicon.org/
The Cell Types

Cell Type         Tier  Description                                                      Source
GM12878           1     B-Lymphoblastoid cell line                                       Coriell GM12878
K562              1     Chronic Myelogenous/Erythroleukemia cell line                    ATCC CCL-243
H1-hESC           1     Human Embryonic Stem Cells, line H1                              Cellular Dynamics International
HepG2             2     Hepatoblastoma cell line                                         ATCC HB-8065
HeLa-S3           2     Cervical carcinoma cell line                                     ATCC CCL-2.2
HUVEC             2     Human Umbilical Vein Endothelial Cells                           Lonza CC-2517
Various (Tier 3)  3     Various cell lines, cultured primary cells, and primary tissues  Various

PLoS Biol. 2011 April; 9(4): e1001046.
DNase I -> transcription factor binding sites (2.9 million sites, one third found in only one cell type and the rest in others)
ChIP-seq -> sequencing of transcription factor and histone binding sites (HeLa and GM12878: qualified to be called new species)
5C technology -> finding proximity between regulatory and regulated regions
High-density 5 bp tiling DNA microarrays
Cap Analysis of Gene Expression (CAGE)
Paired-End diTag (PET)
Reduced Representation Bisulfite Sequencing (RRBS)
33.45% exon and 66.55% intron.
62% of the genome is transcribed reproducibly.
231 Mb of the genome has protein binding sites.
  ◦ 80% of which are low-affinity sites (http://www.factorbook.org/)
  ◦ Many are highly conserved and cell-type selective
96% of CpGs exhibited differential methylation patterns.
GWAS SNPs overlapped with ENCODE elements.
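To put the protein-binding figures in perspective, the arithmetic can be sketched as follows. The 231 Mb and 80% values come from the slide; the genome size of roughly 3,100 Mb is an assumption added here for illustration and is not stated on the slide.

```python
# Figures quoted on the slide.
binding_mb = 231.0            # Mb of genome with protein binding sites
low_affinity_fraction = 0.80  # share of those that are low-affinity sites

# ASSUMPTION (not from the slide): approximate human genome size in Mb.
ASSUMED_GENOME_MB = 3100.0


def genome_fraction(region_mb, genome_mb):
    """Return the fraction of the genome covered by a region of region_mb."""
    return region_mb / genome_mb


low_affinity_mb = binding_mb * low_affinity_fraction
high_affinity_mb = binding_mb - low_affinity_mb

print(f"Low-affinity sites:  {low_affinity_mb:.1f} Mb")
print(f"Remaining sites:     {high_affinity_mb:.1f} Mb")
print(f"Share of genome:     {genome_fraction(binding_mb, ASSUMED_GENOME_MB):.1%}")
```

The share of the genome depends directly on which genome size is used as the denominator, so that last number is only indicative.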
Chromosome conformation capture carbon copy (5C)
  ◦ 1% of the genome is distally regulated (>1000 bp)
  ◦ On average, 3.9 distal elements interacted with a TSS.
  ◦ Distances could range from several kb to Mb
Cis-regulatory elements: enhancers, promoters, insulators, silencers.
2.9 million DNase I hypersensitive sites (DHSs) across 125 diverse cell and tissue types.
DHS sequences of 20-50 bp length mapped uniquely to 86.9% of the genome
  ◦ 580,000 distal DHSs with target promoters
  ◦ 3% lie in a TSS
  ◦ 5% lie within 2.5 kb of a TSS
  ◦ 95% lie distally (introns and intergenic regions)
  ◦ Strongly enriched in LTRs
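The TSS / within 2.5 kb / distal breakdown above can be illustrated with a minimal sketch that labels each DHS by its distance to the nearest transcription start site. This is not the ENCODE pipeline: the function names, the toy coordinates, and the 50 bp window used to decide that a DHS lies "in" a TSS are all assumptions made for illustration. Only the 2.5 kb cutoff comes from the slide.

```python
from bisect import bisect_left

# ASSUMPTION: the slide does not define "in TSS"; 50 bp is a hypothetical window.
TSS_WINDOW_BP = 50
# From the slide: "within 2.5 kb of a TSS".
PROXIMAL_WINDOW_BP = 2_500


def distance_to_nearest_tss(position, sorted_tss):
    """Distance in bp from position to the closest TSS on one chromosome."""
    if not sorted_tss:
        raise ValueError("need at least one TSS coordinate")
    i = bisect_left(sorted_tss, position)
    distances = []
    if i < len(sorted_tss):
        distances.append(sorted_tss[i] - position)      # nearest TSS at or after
    if i > 0:
        distances.append(position - sorted_tss[i - 1])  # nearest TSS before
    return min(distances)


def classify_dhs(position, sorted_tss):
    """Label a DHS midpoint as 'tss', 'proximal', or 'distal'."""
    d = distance_to_nearest_tss(position, sorted_tss)
    if d <= TSS_WINDOW_BP:
        return "tss"
    if d <= PROXIMAL_WINDOW_BP:
        return "proximal"
    return "distal"


# Toy coordinates on a single chromosome (made up for the example).
toy_tss = [1_000, 50_000]
toy_dhs = [1_010, 2_500, 20_000, 49_000]
labels = [classify_dhs(p, toy_tss) for p in toy_dhs]
print(labels)
```

A real analysis would work per chromosome and per strand and would use interval overlaps rather than single midpoints; the sorted list with binary search is used here only to keep the example short.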
Three quarters of the genome is capable of transcription: redefine the concept of a gene?
  ◦ 62.1% and 74.7% of the genome are covered by processed and primary transcripts, respectively.
  ◦ 10-12 expressed isoforms per gene per cell.
  ◦ Coding and non-coding transcripts are localized in the cytoplasm and nucleus, respectively.
  ◦ 6% of the coding and non-coding transcripts overlap with small RNAs: precursors?
  ◦ Most of the novel transcripts lacked protein-coding ability.
The mapping job is only half done.
Characterizing everything a genome does is 10% done.
Finding the network of switches for genes.
A number of correlations...
Where does gene therapy go from here?
Is our fundamental understanding of genes as the functional units flawed?
Epigenetics becomes the key player...
Gives impetus to a holistic approach to treating disease.
Do we still believe that the human genome is the most efficient?