Molecular Biology


Published in: Education, Technology


  1. 311 404 Molecular Biology: Genomes
     Watanachai Lontom, Ph.D., Department of Biology, Faculty of Science, Khon Kaen University
     References
     • Brown, T. A. 2007. Genomes. 3rd ed. Garland Science Publishing, New York.
     • Weaver, R. F. 2008. Molecular Biology. 4th ed. The McGraw-Hill Companies, Inc., New York.
     E-learning
     • Khan Academy
     • YouTube Education
     Objectives
     After studying this chapter, you should be able to:
     1. Describe the differences between prokaryotic and eukaryotic genomes.
     2. Describe the organization of the genome.
     3. Describe the importance of some genome projects.
  2. Genome of Organisms
     • A genome is the complete collection of genetic information, including the genes and the extra DNA, that is passed down from generation to generation in a given organism.
     • A genome can be DNA or RNA.
     • Genome sizes vary among organisms.
     • RNA viruses have the smallest genomes, some consisting of only 3 genes.
     Prokaryote and Eukaryote
     Diversity of DNA-based genome organization (Allison et al., 2007)
     Genome                   Form                                Size (kb)
     Eukaryotes               ds linear                           10^4-10^6
     Bacteria                 ds circular                         10^3
     Plasmid                  ds circular (some ds linear)        2-15
     Mammalian DNA viruses    ss linear, ds linear, ds circular   3-280
     Bacteriophage            ss circular, ds linear              ~50
     Chloroplast DNA          ds circular                         120-160
     Mitochondrial DNA        ds circular (some ds linear)        Animals: 16.5; Plants: 100-2500
  3. Prokaryote and Eukaryote
     Prokaryotic Genome
     Structure of prokaryotic genome
     • Prokaryotes do not have a nucleus. However, they must still fit DNA that is about 1,000 times the length of the cell inside the cell membrane.
     • Most prokaryotes (for example, Escherichia coli) have one large chromosome, which is circular DNA.
     • The genome of E. coli is 4,700 kb in size and exists as one double-stranded circular DNA molecule with no free 5' or 3' ends.
     • The chromosomal DNA is organized into a condensed ovoid structure called a nucleoid.
     • The chromosomal DNA is packed with the help of DNA-binding proteins known as histone-like proteins or nucleoid-associated proteins.
     • HU (heat-unstable protein), IHF (integration host factor), H-NS (heat-stable nucleoid structuring protein), and SMC (structural maintenance of chromosomes) are histone-like proteins.
     [Figure: Chromosome of E. coli, labelled with HU protein and 40-50 loops]
  4. Prokaryotic Genome
     Structure of prokaryotic genome
     • A single E. coli cell measures 1 x 2 μm, but the E. coli chromosome has a circumference of 1.6 mm. How does this chromosome fit inside the nucleoid of an E. coli cell?
     • The chromosome of E. coli is supercoiled.
     • Supercoiling occurs when additional turns are introduced into the DNA double helix (positive supercoiling) or when turns are removed (negative supercoiling).
     • In E. coli, supercoiling is thought to be generated and controlled by two enzymes, DNA gyrase and DNA topoisomerase I.
     Supercoiled structure of bacterial DNA
     • In the current model, the E. coli DNA is attached to a protein core from which 40-50 supercoiled loops radiate out into the cell.
     • Each loop contains approximately 100 kb of supercoiled DNA.
     [Figure labels: HU protein; 40-50 loops]
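The packing question raised above can be checked with a little arithmetic. The sketch below is only an illustration: the rise of 0.34 nm per base pair is a standard textbook value for B-form DNA and is an assumption here, since the slides do not state it. The genome size (4,700 kb), cell size and loop figures are the ones quoted in the slides.

```python
# Assumption (not from the slides): B-form DNA rises about 0.34 nm per base pair.
RISE_PER_BP_NM = 0.34


def contour_length_mm(genome_kb: float) -> float:
    """Contour length in millimetres of a DNA molecule of genome_kb kilobases."""
    length_nm = genome_kb * 1_000 * RISE_PER_BP_NM
    return length_nm * 1e-6  # 1 mm = 1,000,000 nm


ECOLI_GENOME_KB = 4_700  # genome size quoted in the slides
CELL_LENGTH_UM = 2       # long axis of a 1 x 2 um cell

chromosome_mm = contour_length_mm(ECOLI_GENOME_KB)
fold_longer_than_cell = chromosome_mm * 1_000 / CELL_LENGTH_UM

# Loop model from the slides: 40-50 loops of roughly 100 kb each.
loop_dna_kb = (40 * 100, 50 * 100)

print(f"Contour length: {chromosome_mm:.2f} mm")
print(f"Roughly {fold_longer_than_cell:.0f} times the cell length")
print(f"DNA held in loops: {loop_dna_kb[0]}-{loop_dna_kb[1]} kb")
```

Under these assumptions the computed length agrees with the 1.6 mm circumference, and the 40-50 loops of about 100 kb bracket the 4,700 kb genome.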
  5. Prokaryotic Genome
     Structure of prokaryotic genome
     • Although the majority of bacterial and archaeal chromosomes are circular, an increasing number of linear ones are being found.
     • The first of these, for Borrelia burgdorferi, the organism that causes Lyme disease, was described in 1989, and similar discoveries were made for Streptomyces in the following years.
     • Plasmids are small, double-stranded circular or linear DNA molecules carried by bacteria (and by some fungi and some higher plants).
     • They range in size from 2 to 100 kb and are self-replicating.
     • Some types of plasmids are able to integrate into the main genome, but others are thought to be permanently independent.
     • Plasmids carry genes that are not usually present in the main chromosome and that code for characteristics such as antibiotic resistance.
     • Most prokaryotes have a single copy of each gene.
     • Their genes have no introns.
     • There is very little space between genes.
     • Repetitive sequences occur at very low frequency in the genome.
     • Groups of genes are located adjacent to one another in the genome (operons), such as the lactose operon in the E. coli genome.
  6. Prokaryotic Genome
     [Figure: Comparison of 50-kb segments of the genomes of humans, yeast, fruit flies, maize, and E. coli (Brown, 2007)]
     Eukaryotic Genome
     • Large and complex
     • Nuclear genome
     • Organelle genomes
     • More than one copy of genes
     • High frequency of introns and repetitive DNA
     Nuclear genome
     • Multiple linear DNA molecules
     • In ordinary cells, the linear DNA molecules are packed into chromatin (DNA with its associated proteins).
     • Chromatin is then folded into chromosomes in metaphase cells.
  7. Eukaryotic Genome
     • One set of the human genome contains DNA with a total length of about 100 cm. How can it be stored as 23 chromosomes when the largest chromosome measures only 0.5 x 10 μm at metaphase?
     Chemical composition of eukaryotic chromosome
     1. DNA
     2. Protein
        • Basic proteins have a positive charge at neutral pH: histone proteins (H1, H2A, H2B, H3 and H4). Histone molecules are rich in lysine and arginine, which give histones their positive charge. Histones associate tightly with DNA through ionic bonds.
        • Acidic proteins have a negative charge at neutral pH: non-histone proteins.
     Packaging of DNA into chromosomes
     • Nuclease protection experiments (1973-1974)
     • Olins and Olins (1974) published electron micrographs showing protein beads on a string of DNA. Each bead is called a nucleosome.
  8. Eukaryotic Genome
     Packaging of DNA into chromosomes
     • A nucleosome comprises 8 histone molecules (two each of H2A, H2B, H3 and H4), called the core octamer, with 140-150 bp of DNA wrapped around it about twice.
     • A single linker histone (H1) is attached to each nucleosome.
     • Adjacent nucleosomes are separated by 50-70 bp of linker DNA.
     • The 30 nm fiber: the beads-on-a-string structure forms a compact fiber approximately 30 nm in diameter.
     • Two models have been proposed: the solenoid model and the zig-zag ribbon structure.
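The nucleosome dimensions above allow a rough estimate of how many nucleosomes are needed to package one copy of the human genome. This is a back-of-the-envelope sketch: the genome size of 3038 Mb is the figure quoted later in the Human Genome Project slides, the 0.34 nm rise per base pair is an assumed textbook value for B-form DNA, and the function name is purely illustrative.

```python
# Assumption (not from the slides): B-form DNA rises about 0.34 nm per base pair.
RISE_PER_BP_NM = 0.34
# 3038 Mb, the human genome size quoted in the Human Genome Project slides.
HUMAN_GENOME_BP = 3_038_000_000


def nucleosome_count(genome_bp: int, core_bp: int, linker_bp: int) -> int:
    """Whole nucleosome repeats (core DNA plus linker DNA) that fit in genome_bp."""
    return genome_bp // (core_bp + linker_bp)


# Longest repeat (150 bp core + 70 bp linker) gives the fewest nucleosomes,
# shortest repeat (140 bp core + 50 bp linker) gives the most.
fewest = nucleosome_count(HUMAN_GENOME_BP, core_bp=150, linker_bp=70)
most = nucleosome_count(HUMAN_GENOME_BP, core_bp=140, linker_bp=50)

# Extended length of the DNA itself, for comparison with the "about 100 cm" figure.
dna_length_cm = HUMAN_GENOME_BP * RISE_PER_BP_NM * 1e-7  # 1 cm = 10,000,000 nm

print(f"Nucleosomes per genome copy: {fewest:,} to {most:,}")
print(f"Extended DNA length: {dna_length_cm:.0f} cm")
```

Under these assumptions one genome copy needs on the order of 14 to 16 million nucleosomes, and the extended DNA length comes out close to the 100 cm mentioned in the slides.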
  9. Eukaryotic Genome
     Packaging of DNA into chromosomes
     • Looped domains
       - The 30 nm fiber is compacted into loop domains.
       - The length of the loops is approximately 0.25 μm.
     • Metaphase chromosomes
       - Further condensation requires a number of ATP-hydrolyzing enzymes, including topoisomerase II and the condensin complex.
       - Condensin is a large protein complex composed of 5 subunits and is one of the most abundant structural components of metaphase chromosomes.
     Centromere
     • A specific position where the 2 sister chromatids are held together
     • Arabidopsis centromeres span 0.9-1.2 Mb of DNA, and each one is made up largely of 180-bp repeat sequences.
     • The 125-bp yeast centromere is divided into 3 regions:
       - Regions I and III have conserved sequences that are involved in the attachment of spindle fibers.
       - Region II lies in the middle and is an AT-rich stretch of about 90 bp.
  10. Eukaryotic Genome
      Telomere
      • The terminal region of a chromosome
      • Marks the end of the chromosome and enables the cell to distinguish a real end from an unnatural end
      • Made up of hundreds of copies of a repeated motif (5'-T(1-4)A(0-1)G(1-8)-3')
      • Has a short extension of the 3' terminus, which forms a T-loop through unusual hydrogen bonding
      • Telomerase regulates the length of the telomere.
      Organization of genes in genome
      • Genes are distributed randomly in the genome.
      • Gene density varies among chromosomes and species:
        - Arabidopsis: 1-38 genes/100 kb
        - Humans: 0-64 genes/100 kb
      • Genes in a genome can be categorized by their function or by their protein domains.
      [Figure: Comparison of the gene catalogs of Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, fruit fly and humans (Brown, 2002)]
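The telomere repeat motif given above, 5'-T(1-4)A(0-1)G(1-8)-3', reads naturally as a regular expression. The sketch below is one possible encoding, not a standard tool: the helper names are hypothetical, and the example units other than TTAGGG (the human repeat mentioned in these slides) are well-known telomere repeats from Tetrahymena and Arabidopsis that are added here for illustration.

```python
import re

# Regex form of the motif: 1-4 T, then 0-1 A, then 1-8 G.
TELOMERE_UNIT = re.compile(r"T{1,4}A{0,1}G{1,8}")


def is_telomere_unit(unit: str) -> bool:
    """True if the whole string fits the T(1-4)A(0-1)G(1-8) pattern."""
    return TELOMERE_UNIT.fullmatch(unit.upper()) is not None


def count_tandem_units(seq: str, unit: str) -> int:
    """Count back-to-back copies of unit at the start of seq."""
    if not unit:
        return 0
    count = 0
    while seq.startswith(unit, count * len(unit)):
        count += 1
    return count


# TTAGGG: human; TTGGGG: Tetrahymena; TTTAGGG: Arabidopsis; GATTACA: not a telomere unit.
for unit in ("TTAGGG", "TTGGGG", "TTTAGGG", "GATTACA"):
    print(unit, is_telomere_unit(unit))

# A toy telomere made of 100 copies of the human repeat.
toy_telomere = "TTAGGG" * 100
print(count_tandem_units(toy_telomere, "TTAGGG"))
```

The pattern only checks the composition of a single repeat unit; it says nothing about whether a sequence actually sits at a chromosome end.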
  11. Eukaryotic Genome
      Organization of genes in genome
      • Multigene families: groups of genes of identical or similar nucleotide sequence that are present in multiple copies in the genome.
      • These are typically genes whose products are in heavy demand for cellular metabolism.
      • rRNA genes in plant genomes consist of sequences coding for the 25S, 18S and 5.8S rRNAs, arranged as repeating units in the nucleolar organizer region (NOR).
      [Figure: rRNA genes]
      Organelle genomes
      • Both mitochondria and chloroplasts contain their own genetic information.
      • The genomes are usually, but not always, circular.
      • In circular form, the mitochondrial and chloroplast genomes look remarkably similar to bacterial genomes.
      • This similarity led to the endosymbiont hypothesis.
      • Organelle genomes are inherited independently of the nuclear genome, and they exhibit a uniparental mode of inheritance.
      • Some organelle genes function together with genes in the nucleus.
  12. Eukaryotic Genome
      Organelle genomes
      Mitochondrial DNA (mtDNA)
      • mtDNA is usually a circular, double-stranded DNA molecule that is not packaged with histones.
      • Encodes essential enzymes or proteins involved in ATP production (NADH dehydrogenase, cytochrome b, cytochrome c oxidase and ATP synthase)
      • Differs greatly in size among organisms:
        - 16-18 kb in animals
        - 100 kb - 2.5 Mb in plants
      • Multiple copies of mtDNA per organelle
      [Figure: The Saccharomyces cerevisiae mitochondrial genome (Brown, 2002)]
      Chloroplast DNA (cpDNA)
      • cpDNA is a circular, double-stranded DNA molecule.
      • 120-160 kb
      • 20-40 copies per organelle
      • Encodes enzymes involved in photosynthesis, rRNA and tRNA
      [Figure: The rice chloroplast genome (Brown, 2002)]
  13. Eukaryotic Genome
      Repetitive DNA in eukaryotic genome
      Repetitive DNA: repeating units of nucleotide sequence found in a DNA molecule
      • Tandemly repeated DNA
      • Interspersed genome-wide repeats
      1. Tandemly repeated DNA
      • Tandemly repeated DNA is a common feature of eukaryotic genomes.
      • This type of repeat is also called satellite DNA; its repeat units range from <5 to >200 bp.
      • Present in centromeres and telomeres
      • Minisatellites form clusters up to 20 kb in length with repeat units up to 25 bp. Telomeric DNA, with 100 copies of the repeat unit 5'-TTAGGG-3', is an example of a minisatellite.
      • Microsatellites form clusters <150 bp with repeat units of 13 bp or less.
      2. Interspersed genome-wide repeats
      • Arise by the transposition of transposons
      • A transposon or transposable element (TE) is a DNA fragment that can move from one location to another. TEs are divided into:
        2.1 DNA transposons
        2.2 Retrotransposons
      2.1 DNA transposon
      • A transposon that transposes in a DNA-to-DNA manner. The DNA transposon is either cut from its original location by transposase (conservative transposition) or copied (replicative transposition).
      • Ac/Ds elements in maize are an example of DNA transposons in eukaryotes.
      • Insertion sequences (IS1 and IS186) in the E. coli genome are an example of DNA transposons in prokaryotes.
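To make the idea of tandemly repeated DNA concrete, the sketch below scans a sequence for short units repeated back to back. It is a toy illustration with a hypothetical function name and made-up test sequence, not a published repeat-finding method; the default unit limit of 13 bp follows the microsatellite definition above, and the minimum of 3 copies is an arbitrary assumption.

```python
import re


def find_tandem_repeats(seq: str, max_unit: int = 13, min_copies: int = 3):
    """Return (start, unit, copies) for each run of a short unit repeated in tandem.

    The regex captures the shortest unit of 1..max_unit bases that is
    immediately followed by at least (min_copies - 1) further copies of itself.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up sequence: four telomere-like TTAGGG units and a (CA)5 microsatellite.
toy_sequence = "GGC" + "TTAGGG" * 4 + "ACGT" + "CA" * 5 + "T"
for start, unit, copies in find_tandem_repeats(toy_sequence):
    print(f"position {start}: ({unit}) x {copies}")
```

Because matching proceeds left to right and prefers the shortest unit, a run such as GGG inside a longer repeat is absorbed into the longer match once that match has started. Real repeat finders also tolerate mismatches between copies, which this sketch does not.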
  14. Eukaryotic Genome
      Repetitive DNA in eukaryotic genome
      [Figure: DNA transposon (Ac/Ds elements) in maize]
      2.2 Retrotransposon
      • A transposon that requires an RNA intermediate for transposition
      • Retrotransposons are similar to retroviruses.
      • LTR retrotransposons have long repeated sequences at both ends (long terminal repeats; LTRs).
      • Non-LTR retrotransposons:
        - LINEs (long interspersed nuclear elements) have a reverse-transcriptase-like gene.
        - SINEs (short interspersed nuclear elements) lack a reverse-transcriptase-like gene.
  15. Eukaryotic Genome
      [Figure: Retroelements (Brown, 2002)]
      Genome Projects of Some Organisms
      • Genome projects are scientific projects that aim to map and sequence the genomes of organisms.
      • There are 3 basic steps to complete a project:
        1. Genome sequencing
        2. Genome assembly
        3. Genome annotation
      [Figure: Genome projects of some organisms (Weaver, 2008)]
      The Human Genome Project (HGP)
      • The HGP is an international scientific research project whose primary goals are to determine the sequence of the chemical base pairs that make up DNA and to identify and map the approximately 20,000-25,000 genes of the human genome.
      • The project was begun in October 1990 by the U.S. Department of Energy and the National Institutes of Health and was completed in 2003.
      Source: U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
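The assembly step named above can be illustrated with a toy example: short overlapping reads are merged wherever the end of one read matches the start of another. This is a deliberately simplified greedy sketch with made-up reads and hypothetical function names; it ignores sequencing errors, repeats and reverse-complement strands, and it is not the method used by any particular genome project.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the largest overlap until none overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Made-up reads covering a 13-base toy "genome".
toy_reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(toy_reads))
```

Annotation, the third step, would then look for genes and other features in the assembled sequence; that step is not sketched here.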
  16. The Human Genome Project
      The objectives of this project were to:
      1. identify all the approximately 20,000-25,000 genes in human DNA,
      2. determine the sequences of the 3 billion chemical base pairs that make up human DNA,
      3. store this information in databases,
      4. improve tools for data analysis,
      5. transfer related technologies to the private sector, and
      6. address the ethical, legal, and social issues (ELSI) that may arise from the project.
      What does the sequence tell us?
      • The human genome size is 3038 Mb.
      • The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases.
      • The total number of genes is approximately 20,000-25,000.
      • Almost all (99.9%) nucleotide bases are exactly the same in all people.
      • The functions are unknown for over 50% of discovered genes.
      • Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).
      • Less than 2% of the genome codes for proteins.
      • Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome.
      • Repetitive sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics. Over time, these repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes.
      • The human genome has a much greater portion (50%) of repeat sequences than the mustard weed (11%), the worm (7%), and the fly (3%).
      Source: U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
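Several of the figures above can be combined into quick consistency checks. The sketch below uses only numbers quoted in the slides; the variable names are illustrative, and the results are rough estimates rather than published values.

```python
GENOME_BP = 3_038_000_000         # 3038 Mb
AVERAGE_GENE_BP = 3_000           # average gene size
GENE_COUNT_RANGE = (20_000, 25_000)
IDENTICAL_FRACTION = 0.999        # 99.9% of bases shared by all people
DYSTROPHIN_BP = 2_400_000         # largest known human gene

# Share of the genome taken up by genes of average size, for both ends of the gene-count range.
gene_share = [count * AVERAGE_GENE_BP / GENOME_BP for count in GENE_COUNT_RANGE]

# Number of positions at which two people may differ.
variable_sites = round(GENOME_BP * (1 - IDENTICAL_FRACTION))

# How much larger dystrophin is than the average gene.
dystrophin_fold = DYSTROPHIN_BP / AVERAGE_GENE_BP

print(f"Genes span about {gene_share[0]:.1%} to {gene_share[1]:.1%} of the genome")
print(f"About {variable_sites:,} variable positions between individuals")
print(f"Dystrophin is about {dystrophin_fold:.0f} times the average gene size")
```

The gene share of roughly 2% is in line with the statement that less than 2% of the genome codes for proteins, and a 0.1% difference still corresponds to about 3 million positions.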
  17. The Human Genome Project
      Anticipated benefits
      Molecular Medicine
      • improve diagnosis of disease
      • detect genetic predispositions to disease
      • create drugs based on molecular information
      • use gene therapy and control systems as drugs
      • design "custom drugs" (pharmacogenomics) based on individual genetic profiles
      Microbial Genomics
      • rapidly detect and treat pathogens (disease-causing microbes) in clinical practice
      • develop new energy sources (biofuels)
      • monitor environments to detect pollutants
      • protect citizenry from biological and chemical warfare
      • clean up toxic waste safely and efficiently
      DNA Identification (Forensics)
      • identify potential suspects whose DNA may match evidence left at crime scenes
      • exonerate persons wrongly accused of crimes
      • identify crime and catastrophe victims
      • establish paternity and other family relationships
      • identify endangered and protected species as an aid to wildlife officials (could be used for prosecuting poachers)
      • detect bacteria and other organisms that may pollute air, water, soil, and food
      • match organ donors with recipients in transplant programs
      • determine pedigree for seed or livestock breeds
      • authenticate consumables such as caviar and wine
      Agriculture, Livestock Breeding, and Bioprocessing
      • grow disease-, insect-, and drought-resistant crops
      • breed healthier, more productive, disease-resistant farm animals
      • grow more nutritious produce
      • develop biopesticides
      • incorporate edible vaccines into food products
      • develop new environmental cleanup uses for plants like tobacco
      Future Challenges: What We Still Don't Know
      • Gene number, exact locations, and functions
      • Gene regulation
      • DNA sequence organization
      • Chromosomal structure and organization
      • Noncoding DNA types, amount, distribution, information content, and functions
      • Coordination of gene expression, protein synthesis, and post-translational events
      • Interaction of proteins in complex molecular machines
      • Predicted vs experimentally determined gene function
      • Evolutionary conservation among organisms
      • Protein conservation (structure and function)
      • Proteomes (total protein content and function) in organisms
      • Correlation of SNPs (single-base DNA variations among individuals) with health and disease
      • Disease-susceptibility prediction based on gene sequence variation
      • Genes involved in complex traits and multigene diseases
      • Complex systems biology, including microbial consortia useful for environmental restoration
      • Developmental genetics, genomics
      Source: U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
  18. Rice Genome Project
      • Rice (Oryza sativa L.) is a staple food and an important biological model species for monocot plants, including major cereal crops such as maize, wheat, barley and sorghum.
      • Its immense economic value and relatively small genome size (12 chromosomes) make it a focal point for scientific investigations.
      • Rice was the first organism whose sequencing was pursued by four groups independently:
        - International Rice Genome Sequencing Project (IRGSP): japonica cultivar 'Nipponbare'
        - Monsanto: japonica cultivar 'Nipponbare'
        - Syngenta: japonica cultivar 'Nipponbare'
        - Beijing Genomics Institute (BGI): indica cultivar '93-11'
      • This project was started in 1998 and finished in 2004.
      • A total of 37,544 genes have been predicted for the complete sequence, with an average gene density of 1 gene/9.9 kb and an average gene length of 2,699 bp.
      • Chromosomes 1 and 3 have the highest gene density.
      • Chromosomes 11 and 12 have the lowest gene density.
      • The rice genome comprises ~35% repeat elements.
      • For more details, see Vij et al. (2006).
      Genome databases of organism genome projects
      • Websites
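The rice gene statistics above imply a sequence length and a genic fraction that can be worked out directly. This sketch uses only the three figures quoted in the slides; the derived values are estimates implied by those averages, not numbers reported by the project.

```python
GENE_COUNT = 37_544        # predicted genes
KB_PER_GENE = 9.9          # average gene density: 1 gene per 9.9 kb
AVERAGE_GENE_BP = 2_699    # average gene length

# Sequence length implied by the gene count and the gene density.
implied_sequence_mb = GENE_COUNT * KB_PER_GENE / 1_000

# Total length occupied by genes of average size.
genic_mb = GENE_COUNT * AVERAGE_GENE_BP / 1_000_000

genic_fraction = genic_mb / implied_sequence_mb

print(f"Implied sequence length: {implied_sequence_mb:.1f} Mb")
print(f"DNA within genes: {genic_mb:.1f} Mb ({genic_fraction:.0%} of the sequence)")
```

On these figures, genes account for roughly a quarter of the sequence, which leaves room for the ~35% of the genome made up of repeat elements.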
  19. Genome databases of organism genome projects