Bioinformatics Lecture 1


Introduction to Bioinformatics

  • An appreciation of the large amount of information about living things. Applications of bioinformatics (BI) to molecular biology, medicine, pharmacology, biotechnology, agriculture, forensic science, anthropology, etc. A useful knowledge of the techniques by which, through the WWW, we access the data and the methods for their analysis.
  • A sense of optimism that the data and methods of bioinformatics will create profound advances in our understanding of life, and improvements in the health of humans and other living things.
  • Molecular biology -Genetics & Protein Biochemistry
  • Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
  • Genomics existed before any genomes were completely sequenced, but in a very primitive state
  • Related Fields: Pharmacogenomics. The application of genomic methods to identify drug targets, for example by searching entire genomes for potential drug receptors or by studying gene expression patterns in tumors. Also the investigation of genetics and drug response: the study of the relationship between a specific person's genetic makeup and his or her response to drug treatment (Microsoft Encarta 2007).
  • Developed his theories of heredity through the study of pea plants. Studied them “for the fun of the thing”.
  • This concept was later fully developed into the concept of chromosomes
  • Studied plant and animal germ cells, distinguished between body cells and germ cells, and proposed the theory of the continuity of germ plasm from generation to generation (1885). Developed the concept of meiosis.
  • Frederick Griffith (1881-1941), British microbiologist. In 1928 he studied the effects of bacteria on mice and determined that some kind of “transforming factor” existed in the heredity of cells. He discovered a phenomenon called transformation, meaning an alteration of hereditary characteristics, in the Streptococcus pneumoniae bacterium. Griffith’s work paved the way for later experiments, which proved that deoxyribonucleic acid (DNA) is the material within cells that passes on genetic traits. He is considered by some to be the father of molecular biology. Avery's team purified the transforming substance and found it was pure DNA. Avery published the results of his research in 1944. Transformation is the process by which bacteria take up unpackaged DNA from the environment. Only cells that are “competent” can receive DNA in this fashion. Cells can be made competent, however, via a routine procedure in molecular labs, where introduction of foreign DNA into bacteria is fundamental to recombinant DNA technology. In transduction, by contrast, viruses move DNA from one cell to another.
  • Martha Chase: she participated in the famous “blender experiment,” also known as the Hershey-Chase experiment. This experiment, which earned Alfred Day Hershey a Nobel Prize, used a kitchen blender to separate the protein coats of simple viruses called bacteriophages from their DNA cores. Hershey and Chase showed that the isolated DNA was infectious, but that without the DNA, the protein coats were not (Microsoft Encarta 2007).
  • 1980: submission of the whole genome sequence of adenovirus to GenBank(R), the National Institutes of Health genetic sequence database. The submission marks the first time that a new method has been used to sequence a whole genome since Walter Gilbert and Frederick Sanger won the Nobel Prize in 1980 for the invention of DNA sequencing in 1977. The whole genome sequence (GenBank accession nos. AY370909, AY370910, and AY370911) was generated in less than one day using the first technology ever designed to sequence whole genomes, not one gene at a time. More importantly, this was accomplished by using a new platform that is scalable to larger genomes. The bacteriophages R17 and Qβ have RNA as their genetic material. The single-stranded RNA contains only three genes: one codes for the A protein, the second for the coat protein, and the third for one of four subunits of replicase (the other three subunits of replicase are host proteins). Polyoma or SV40 viruses have 5-10 genes and their chromosomes are only 1.7 microns in length. The single-stranded DNA virus ΦX174 has DNA which codes for 9 proteins. The bacterial virus lambda has about 40 genes and T4 has over a hundred genes. The number of genes in viruses ranges from only three in the simplest viruses to about 250 in the most complex ones.
  • Although there is some relationship between the number of genes and the complexity of an organism, there is no strict correlation between apparent genetic complexity and the DNA content per haploid nucleus. Thus some fishes and amphibians contain 10 to 20 times more DNA than humans. Moreover, the size of the genome varies over a 20-fold range within the species of a phylum. Units: kb = kilo base pairs = 1,000 bp; Mb = mega base pairs = 1,000,000 bp (1 million bp); Gb = giga base pairs = 1,000 million bp.
  • Comparative Genomics: the management and analysis of the millions of data points that result from genomics (sorting out the mess). Functional Genomics: identifying gene functions and associations. Structural Genomics: future plans of structural genomics efforts around the world, and the possible benefits of this research.
  • Recall the concept of differentiation from embryology.
  • The haploid human genome contains ca. 23,000 protein-coding genes, far fewer than had been expected before its sequencing. In fact, only about 1.5% of the genome codes for proteins, while the rest consists of non-coding RNA genes, regulatory sequences, introns, and noncoding DNA (once known as "junk DNA"). Surprisingly, the number of human genes seems to be less than a factor of two greater than that of many much simpler organisms, such as the roundworm and the fruit fly. However, human cells make extensive use of alternative splicing to produce several different proteins from a single gene, and the human proteome is thought to be much larger than those of the aforementioned organisms. Besides, most human genes have multiple exons, and human introns are frequently much longer than the flanking exons.
  • Prions and epigenetics: epigenetics is the study of heritable changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the underlying DNA sequence, hence the name epi- (Greek: επί, over, above) -genetics. These changes may remain through cell divisions for the remainder of the cell's life and may also last for multiple generations. However, there is no change in the underlying DNA sequence of the organism; instead, non-genetic factors cause the organism's genes to behave (or "express themselves") differently. One example of epigenetic changes in eukaryotic biology is the process of cellular differentiation. During morphogenesis, totipotent stem cells become the various pluripotent cell lines of the embryo, which in turn become fully differentiated cells. In other words, a single fertilized egg cell (the zygote) changes into the many cell types, including neurons, muscle cells, epithelium, blood vessels, etc., as it continues to divide. It does so by activating some genes while inhibiting others.
  • The normal role of prion protein is not known. It probably protects cells from injury. All known mammalian prion diseases are caused by the so-called prion protein, PrP. The endogenous, properly folded form is denoted PrPC (for common or cellular), while the disease-linked, misfolded form is denoted PrPSc (for scrapie, after one of the diseases first linked to prions and neurodegeneration). The precise structure of the prion is not known, though prions can be formed by combining PrPC, polyadenylic acid, and lipids in a Protein Misfolding Cyclic Amplification (PMCA) reaction. Proteins showing prion-type behavior are also found in some fungi, which has been useful in helping to understand mammalian prions. Interestingly, fungal prions do not appear to cause disease in their hosts and may even confer an evolutionary advantage through a form of protein-based inheritance.
  • Bioinformatics Lecture 1

    1. 1. Introduction To Bioinformatics
    2. 3. After the Completion of This Course <ul><li>An appreciation of the huge amount of data about living things. </li></ul><ul><li>Applications of BI to molecular biology, medicine, biotechnology, agriculture, forensic science, anthropology etc. </li></ul><ul><li>Using the WWW to access the data and the methods for their analysis. </li></ul>
    3. 4. After the Completion of This Course <ul><li>the role of computer science in the ___Analysis of the data </li></ul><ul><li>___ information retrieval, ____ and the ability to extend these skills by self-directed 'field work' on the Web. </li></ul><ul><li>A sense of optimism ____ Human Welfare </li></ul>
    4. 5. <ul><li>The Course Contents for Bioinformatics </li></ul><ul><li>An Insight into Bioinformatics </li></ul><ul><li>Biological Databases </li></ul><ul><li>Bioinformatics for Gene(s) - Pairwise alignment </li></ul><ul><li>Alignment of Multiple Sequences </li></ul><ul><li>Searching sequence databases using BLAST and FASTA </li></ul><ul><li>Phylogenetic Analysis </li></ul><ul><li>Hidden Markov models </li></ul><ul><li>Gene Prediction, Microarrays </li></ul><ul><li>Bioinformatics for Protein(s) - Structure Prediction </li></ul><ul><li>Dynamic programming applet </li></ul><ul><li>RasMol, Phylogeny, Multiple alignment </li></ul>
    5. 6. What Is Bioinformatics? <ul><li>Bioinformatics is the unified discipline formed from the combination of biology, computer science, and information technology. </li></ul><ul><li>&quot;The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.&quot; - Fredj Tekaia </li></ul>
    6. 8. A Molecular Alphabet <ul><li>Most large biological molecules are polymers, ordered chains of simple molecules called monomers </li></ul><ul><li>All monomers belong to the same general class, but there are several types </li></ul><ul><li>The ordering of monomers in the macromolecule encodes information, just like the letters of an alphabet </li></ul>
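The molecular-alphabet idea maps directly onto strings in a programming language. The following is a minimal sketch, assuming DNA as the polymer and the four-letter alphabet A, C, G, T for its monomers; the function name `composition` and the example sequence are made up for illustration and do not come from the lecture.

```python
from collections import Counter

# Assumption: DNA written with the conventional one-letter codes for its
# four monomers (nucleotides).
DNA_ALPHABET = set("ACGT")


def composition(sequence: str) -> Counter:
    """Count how often each monomer occurs in a DNA sequence."""
    seq = sequence.upper()
    unknown = set(seq) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not DNA letters: {sorted(unknown)}")
    return Counter(seq)


if __name__ == "__main__":
    example = "ATGGCGTACGCT"  # hypothetical sequence, for illustration only
    for base, count in sorted(composition(example).items()):
        print(base, count)
```

Two sequences with the same composition can still carry different information, because only the order of the letters distinguishes them; that is the sense in which the polymer behaves like text.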
    7. 9. Related Fields: Computational Biology <ul><li>The study and application of computing methods for classical biology </li></ul><ul><li>Primarily concerned with evolutionary, population and theoretical biology, rather than the cellular or molecular level </li></ul>
    8. 10. Related Fields: Medical Informatics <ul><li>The study and application of computing methods to improve communication, understanding, and management of medical data </li></ul><ul><li>Generally concerned with how the data is manipulated rather than the data itself </li></ul>
    9. 11. Related Fields: Cheminformatics <ul><li>The study and application of computing methods, along with chemical and biological technology, for drug design and development </li></ul>
    10. 12. Related Fields: Genomics <ul><li>Analysis and comparison of the entire genome of a single species or of multiple species </li></ul><ul><li>A genome: set of all genes possessed by an organism </li></ul>
    11. 13. Related Fields: Proteomics <ul><li>Study of how the genome is expressed in proteins, and of how these proteins function and interact </li></ul><ul><li>Concerned with the actual states of specific cells, rather than the potential states described by the genome </li></ul>
    12. 14. Related Fields: Pharmacogenetics <ul><li>The use of genomic methods to determine what causes variations in individual response to drug treatments </li></ul><ul><li>The goal is to identify drugs that may only be effective for subsets of patients, or to tailor drugs for specific individuals or groups </li></ul>
    13. 15. History of Bioinformatics <ul><li>Genetics </li></ul><ul><li>Computers and Computer Science </li></ul><ul><li>Bioinformatics </li></ul>
    14. 16. History of Genetics <ul><li>Gregor Mendel </li></ul><ul><li>Chromosomes </li></ul><ul><li>DNA </li></ul>
    15. 17. Gregor Mendel (1822-1884) <ul><li>Developed his theories of heredity through the study of pea plants </li></ul><ul><li>Studied them “for the fun of the thing” </li></ul>
    16. 18. Mendel’s Experiments <ul><li>Cross-bred two different types of pea seeds </li></ul><ul><ul><li>Spherical </li></ul></ul><ul><ul><li>Wrinkled </li></ul></ul><ul><li>After the 2nd generation of pea seeds were cross-bred, Mendel noticed that, although all of the 2nd generation seeds were spherical, about 1/4th of the 3rd generation seeds were wrinkled. </li></ul>
    17. 19. Mendel’s Experiments (cont.) <ul><li>Through this, Mendel developed the concept of “discrete units of inheritance,” and concluded that each individual pea plant had two versions, or alleles, of a trait-determining gene. </li></ul>
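The one-quarter figure in Mendel's result follows from simply enumerating the allele combinations. The sketch below assumes a single gene with a dominant allele written `S` (spherical) and a recessive allele written `s` (wrinkled); the letters and the function names are illustrative choices, not Mendel's notation.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Enumerate offspring genotypes when each parent passes on one allele."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that 'sS' and 'Ss' count as the same genotype.
        offspring["".join(sorted(allele1 + allele2))] += 1
    return offspring


def phenotype(genotype: str) -> str:
    """Assumption: 'S' (spherical) is dominant over 's' (wrinkled)."""
    return "spherical" if "S" in genotype else "wrinkled"


if __name__ == "__main__":
    hybrids = cross("SS", "ss")      # pure spherical x pure wrinkled
    print(dict(hybrids))             # every offspring is a hybrid
    next_gen = cross("Ss", "Ss")     # hybrid x hybrid
    total = sum(next_gen.values())
    wrinkled = sum(n for g, n in next_gen.items() if phenotype(g) == "wrinkled")
    print(f"wrinkled fraction: {wrinkled}/{total}")
```

Crossing the two pure lines gives only hybrids, all spherical, while crossing the hybrids gives one wrinkled combination out of four, which matches the proportions on the slide.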
    18. 20. History of Chromosomes <ul><li>W Flemming </li></ul><ul><li>A Weismann </li></ul><ul><li>T Boveri </li></ul><ul><li>W S. Sutton </li></ul><ul><li>T H Morgan </li></ul>
    19. 21. Walther Flemming (1843-1905) <ul><li>Studied the cells of salamanders and developed improved fixing and staining methods </li></ul><ul><li>Developed the concept of mitosis (1882). </li></ul>
    20. 22. August Weismann (1834-1914) <ul><li>Distinguished between somatic cells and germ cells </li></ul><ul><li>Theory of the continuity of germ plasm (1885) </li></ul><ul><li>Developed the concept of meiosis </li></ul>
    21. 23. Walter S. Sutton (1877-1916) <ul><li>Also studied germ cells, specifically those of the grasshopper Brachystola magna </li></ul><ul><li>Discovered that chromosomes carried the cell’s units of inheritance </li></ul>
    22. 24. Thomas Hunt Morgan (1866-1945) <ul><li>Studied the fruit fly Drosophila to determine whether heredity determined Darwinist evolution </li></ul><ul><li>Found that genes could be mapped in order along the length of a chromosome </li></ul>
    23. 25. History of DNA <ul><li>Griffith </li></ul><ul><li>Avery, MacLeod, and McCarty </li></ul><ul><li>Hershey and Chase </li></ul><ul><li>Watson and Crick </li></ul>
    24. 26. Frederick Griffith <ul><li>In 1928, Studied the effects of bacteria on mice </li></ul><ul><ul><li>Determined that some kind of “ transforming factor” existed in the heredity of cells </li></ul></ul>
    25. 27. Oswald Theodore Avery (1877-1955) Colin MacLeod <ul><li>1944 - Through their work in bacteria, showed that DNA was the transforming factor </li></ul><ul><li>DNA transferring genetic information </li></ul><ul><ul><li>Previously thought to be a protein </li></ul></ul>
    26. 28. Alfred Hershey (1908-1997) Martha Chase (1927-2003) <ul><li>1952 - Studied the bacteriophage T2 and its host bacterium, Escherichia coli </li></ul><ul><li>Found that DNA actually is the genetic material that is transferred </li></ul>
    27. 29. James Watson (1928-) Francis Crick (1916-2004) <ul><li>1951 – Collaborated to gather all available data about DNA in order to determine its structure </li></ul><ul><li>1953 Developed </li></ul><ul><ul><li>The double helix model for DNA structure </li></ul></ul><ul><ul><li>The A-T and C-G base pairing that holds the two strands of the helix together </li></ul></ul>
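Because A pairs only with T and C only with G, one strand of the double helix determines the other. The sketch below shows this under stated assumptions: the names `PAIRS`, `complement` and `reverse_complement` are illustrative, and the reversal step relies on the fact (not stated on the slide) that the two strands run in opposite directions.

```python
# Watson-Crick base pairing: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of the strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    Assumption: the strands are antiparallel, so the complement is reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    strand = "ATGC"  # hypothetical fragment, for illustration only
    print(complement(strand))
    print(reverse_complement(strand))
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on the pairing table.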
    28. 30. &quot; The structure was too pretty not to be true.&quot; -- JAMES D. WATSON
    29. 31. <ul><li>Programmable Mechanical Computer </li></ul>
    30. 32. Computer Timeline <ul><li>~1000 BC The abacus </li></ul><ul><li>1621 The slide rule invented </li></ul><ul><li>1625 Wilhelm Schickard's mechanical calculator </li></ul><ul><li>1822 Charles Babbage's Difference Engine </li></ul><ul><li>1926 First patent for a semiconductor transistor </li></ul><ul><li>1937 Alan Turing invents the Turing Machine </li></ul><ul><li>1939 Atanasoff-Berry Computer created at Iowa State </li></ul><ul><ul><li>the world's first electronic digital computer </li></ul></ul><ul><li>1939 to 1944 Howard Aiken's Harvard Mark I (the IBM ASCC) </li></ul><ul><li>1940 Konrad Zuse's Z2 uses telephone relays instead of mechanical logical circuits </li></ul><ul><li>1943 Colossus - British vacuum tube computer </li></ul><ul><li>1944 Grace Hopper, Mark I Programmer (Harvard Mark I) </li></ul><ul><li>1945 First Computer &quot;Bug&quot;, Vannevar Bush &quot;As We May Think&quot; </li></ul>History of Computers
    31. 33. Computer Timeline (cont.) <ul><li>1948 to 1951 The first commercial computer – UNIVAC </li></ul><ul><li>1952 G.W.A. Dummer conceives integrated circuits </li></ul><ul><li>1954 FORTRAN language developed by John Backus (IBM) </li></ul><ul><li>1955 First disk storage (IBM) </li></ul><ul><li>1958 First integrated circuit </li></ul><ul><li>1963 Mouse invented by Douglas Engelbart </li></ul><ul><li>1963 BASIC (standing for Beginner's All-purpose Symbolic Instruction Code) was written at Dartmouth College by mathematicians John George Kemeny and Thomas Kurtz as a teaching tool for undergraduates </li></ul><ul><li>1969 UNIX OS developed by Kenneth Thompson </li></ul><ul><li>1970 First static and dynamic RAMs </li></ul><ul><li>1971 First microprocessor: the 4004 </li></ul><ul><li>1972 C language created by Dennis Ritchie </li></ul><ul><li>1975 Microsoft founded by Bill Gates and Paul Allen </li></ul><ul><li>1976 Apple I and Apple II microcomputers released </li></ul><ul><li>1981 First IBM PC with DOS </li></ul><ul><li>1985 Microsoft Windows introduced </li></ul><ul><li>1985 C++ language introduced </li></ul><ul><li>1992 Pentium processor </li></ul><ul><li>1993 First PDA </li></ul><ul><li>1994 Java introduced by James Gosling </li></ul><ul><li>2000 C# language introduced </li></ul>
    32. 34. Genomics <ul><li>Classic Genomics </li></ul><ul><li>Post Genomic era </li></ul><ul><ul><li>Comparative Genomics </li></ul></ul><ul><ul><li>Functional Genomics </li></ul></ul><ul><ul><li>Structural Genomics </li></ul></ul>
    33. 35. Genomics <ul><li>Genome </li></ul><ul><ul><li>complete set of genetic instructions for making an organism </li></ul></ul><ul><li>Genomics </li></ul><ul><ul><li>any attempt to analyze or compare the entire genetic complement of a species </li></ul></ul><ul><ul><li>Early genomics was mostly recording genome sequences </li></ul></ul>
    34. 36. History of Genomics <ul><li>1980 </li></ul><ul><ul><li>First complete genome sequence for adenovirus is published </li></ul></ul><ul><ul><ul><li>ΦX174 - 5,386 base pairs coding 9 proteins. </li></ul></ul></ul><ul><ul><ul><li>~5 kb </li></ul></ul></ul><ul><li>1995 </li></ul><ul><ul><li>Haemophilus influenzae genome sequenced (flu bacteria, 1.8 Mb) </li></ul></ul><ul><li>1996 </li></ul><ul><ul><li>Saccharomyces cerevisiae (baker's yeast, 12.1 Mb) </li></ul></ul><ul><li>1997 </li></ul><ul><ul><li>E. coli (4.7 Mb) </li></ul></ul><ul><li>2000 </li></ul><ul><ul><li>Pseudomonas aeruginosa (6.3 Mb) </li></ul></ul><ul><ul><li>A. thaliana genome (100 Mb) </li></ul></ul><ul><ul><li>D. melanogaster genome (180 Mb) </li></ul></ul>
    35. 37. 2001 The Big One <ul><li>The Human Genome sequence is published </li></ul><ul><ul><li>3 Gb </li></ul></ul>
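The genome sizes quoted in the timeline use mixed units (bp, Mb, Gb), so comparing them is easier after converting everything to base pairs. The following is a small sketch using only the figures from the slides; the names `BP_PER_UNIT`, `to_base_pairs` and `size_table` are hypothetical helpers written for this illustration.

```python
# Multipliers for the usual genome-size units.
BP_PER_UNIT = {"bp": 1, "kb": 1_000, "Mb": 1_000_000, "Gb": 1_000_000_000}

# (organism, size, unit) as quoted in the timeline slides.
GENOME_SIZES = [
    ("phiX174", 5386, "bp"),
    ("Haemophilus influenzae", 1.8, "Mb"),
    ("Saccharomyces cerevisiae", 12.1, "Mb"),
    ("Escherichia coli", 4.7, "Mb"),
    ("Pseudomonas aeruginosa", 6.3, "Mb"),
    ("Arabidopsis thaliana", 100, "Mb"),
    ("Drosophila melanogaster", 180, "Mb"),
    ("Homo sapiens", 3, "Gb"),
]


def to_base_pairs(value: float, unit: str) -> int:
    """Convert a genome size to a whole number of base pairs."""
    return round(value * BP_PER_UNIT[unit])


def size_table(genomes):
    """Return (organism, base pairs) rows sorted from smallest to largest."""
    rows = [(name, to_base_pairs(value, unit)) for name, value, unit in genomes]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, base_pairs in size_table(GENOME_SIZES):
        print(f"{name:<26}{base_pairs:>15,} bp")
```

Sorted this way, the list spans roughly six orders of magnitude, from a few thousand base pairs for the phage to about three billion for the human genome.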
    36. 38. What next? <ul><li>Post Genomic era </li></ul><ul><ul><li>Comparative Genomics </li></ul></ul><ul><ul><li>Functional Genomics </li></ul></ul><ul><ul><li>Structural Genomics </li></ul></ul>
    37. 39. What Is Proteomics? <ul><li>Proteomics is the study of the proteome—the “PROTEin complement of the genOME” </li></ul><ul><li>More specifically, &quot;the qualitative and quantitative comparison of proteomes under different conditions to further unravel biological processes&quot; </li></ul>
    38. 40. What Makes Proteomics Important? <ul><li>A cell’s DNA (its genome) describes a blueprint for the cell’s potential, all the possible forms that it could conceivably take. </li></ul><ul><li>It does not describe the cell’s actual, current form, in the same way that the source code of a computer program does not tell us what input a particular user is currently giving his copy of that program. </li></ul>
    39. 41. What Makes Proteomics Important ? <ul><li>All cells in an organism contain the same DNA. </li></ul><ul><li>This DNA encodes every possible cell type in that organism—muscle, bone, nerve, skin, etc. </li></ul>
    40. 42. <ul><li>If we want to know about the type and state of a particular cell, the DNA does not help us, in the same way that knowing what language a computer program was written in tells us nothing about what the program does. </li></ul>
    41. 43. What Makes Proteomics Important? <ul><li>Out of the thousands of genes, only a handful actually determine that cell’s structure. </li></ul><ul><li>Many of the interesting things about a given cell’s current state can be deduced from the type and structure of the proteins it expresses. </li></ul><ul><li>Changes in, for example, tissue types, carbon sources, temperature, and stage in life of the cell can be observed in its proteins. </li></ul>
    42. 44. Proteomics In Disease Treatment <ul><li>A large number of diseases are caused by a particular pattern in a group of genes. </li></ul><ul><li>Isolating this group by comparing the tens of thousands of genes in each of many genomes would be very impractical. </li></ul><ul><li>Looking at the proteomes of the cells associated with the disease is much more efficient. </li></ul>
    43. 45. Proteomics In Disease Treatment <ul><li>Many human diseases are caused by a normal protein being modified improperly. This also can only be detected in the proteome, not the genome. </li></ul><ul><li>The targets of almost all medical drugs are proteins. By identifying these proteins, proteomics aids the progress of pharmacogenetics. </li></ul>
    44. 46. Examples <ul><li>What do these have in common? </li></ul><ul><ul><li>Alzheimer's disease </li></ul></ul><ul><ul><li>Cystic fibrosis </li></ul></ul><ul><ul><li>Mad Cow disease </li></ul></ul><ul><ul><li>An inherited form of emphysema </li></ul></ul><ul><ul><li>Even many cancers </li></ul></ul>
    45. 47. Stanley Prusiner <ul><li>Prion protein is a normal protein found in many living organisms. </li></ul><ul><li>When a prion protein changes its shape by misfolding, it becomes a prion, an infectious agent. </li></ul><ul><li>This is in contrast to all other known infectious agents, which must contain nucleic acids along with protein components. </li></ul><ul><li>Prions come in different strains, each with a slightly different structure, and most of the time, strains breed true. Prion replication is nevertheless subject to occasional epimutation and then natural selection, just like other forms of replication. However, the number of possible distinct prion strains is likely far smaller than the number of possible DNA sequences, so evolution takes place within a limited space. </li></ul>