Bioinformatics – Discovering the “Bio-Logic” of Nature Robert Cormia Foothill College

Bioinformatics will help us discover the hidden mathematics and bio-logic of nature.

Published in: Education, Technology

  1. 1. Bioinformatics – Discovering the “Bio-Logic” of Nature Robert Cormia Foothill College
  2. 2. Transducing the Genome
     - 50 years after Watson and Crick deduced the structure of DNA…
     - The information molecules of nature now reside as data bits inside computers
       - But what does it all mean?
     - We have ~15 GBytes of genomic data
       - And we are only just beginning to unravel it
  3. 4. ‘Energy Systems’ Before ‘Life’
     - “Life” arose on earth almost 4 billion years ago, 1 billion years before cells
     - Long chains of molecules harvesting energy, probably deep below the sea
       - Before DNA, RNA and the sophisticated proteins that we know today
     - There were plenty of sources of energy, but no “choreographed metabolism”
  4. 5. Energy Metabolism
  5. 6. “In the Beginning”
     - Rock, heat, and some water
     - Early molecules of life
     - Energy moved from rock into sea
     - Molecular networks played in the path
     - Capturing a memory of that process was probably the key to life today
  6. 7. Life on the Sea Floor?
  7. 8. RNA Busy Before Cellular Life
  8. 9. The RNA World
     - There is no way to know how the molecules of life really formed…
     - Amino acids and ribonucleotides have formed in “pre-biotic” experiments
     - RNA molecules, which appear to be both catalysts and templates, are thought to have formed energy networks
  9. 10. RNA Codons and Catalysts
  10. 11. RNA and DNA
     - A, T, C, G, and U
     - A = Adenine
     - T = Thymine
     - C = Cytosine
     - G = Guanine
     - U = Uracil
     - A-T and C-G in DNA
     - A-U and C-G in RNA
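The pairing rules on this slide can be written down as two lookup tables. The following is a minimal sketch, not part of the original deck; the function names `dna_complement` and `transcribe_template` are illustrative choices.

```python
# Pairing rules from the slide: A-T and C-G in DNA, A-U and C-G in RNA.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
# RNA base that pairs with each base of a DNA template strand.
DNA_TO_RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}


def dna_complement(strand: str) -> str:
    """Return the base-by-base DNA complement (position by position, not reversed)."""
    return "".join(DNA_PAIRS[base] for base in strand.upper())


def transcribe_template(template: str) -> str:
    """Return the RNA bases that pair with a DNA template, position by position."""
    return "".join(DNA_TO_RNA_PAIRS[base] for base in template.upper())


if __name__ == "__main__":
    print(dna_complement("ATCG"))       # TAGC
    print(transcribe_template("ATCG"))  # UAGC
```

Strand direction (5' to 3') is ignored here to keep the sketch focused on the pairing rules alone.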
  11. 12. Central Dogma of Life
  12. 13. The Genome
     - DNA – DeoxyriboNucleic Acid is the prominent molecule of the genome
     - Genes are formed of lengths of DNA polymers which code for proteins
     - Exons and introns exist in DNA
     - Regulatory regions control transcription and the formation of every protein and enzyme. This regulation is the key to metabolism.
  13. 15. DNA at Transcription
  14. 16. The Proteome
     - Proteins form cellular structure and enzymes, which function in metabolism
     - Over 100,000 proteins exist in humans
     - DNA is not enough to run metabolism
     - Proteins have a “run-time” knowledge
     - Proteins control the transcription of DNA and DNA controls formation of proteins
  15. 17. Rubisco Protein – Photosynthesis
  16. 18. RAD Protein Complex
  17. 19. Number of Genes vs. Time
  18. 20. What is Bioinformatics?
     - Molecular biology
       - Ability to sequence DNA
     - Internet databases
       - To store and transmit data
     - Mathematical algorithms
       - To model and solve biological problems
     - Analysis Using the I2I Technology Model
  19. 21. Internet Technologies: CPU, Networking, Data Storage, Data Mining, Grid Computing, Storage Area Networks
  20. 22. Bioinformatics Technologies: Informatics, IT / Networking, Molecular Biology, Data Modeling, Computational Biology, Genomic Databases
  21. 23. A Tool for Biotechnology
     - Bioinformatics creates a set of tools for understanding the mountain of new data
     - In biotechnology, these tools are used to discover how genes and proteins work
     - Computers are used to both analyze and “mine” new data for hidden relationships
     - Discovering the “bio-logic” of nature
  22. 24. From Data to Knowledge
  23. 25. DNA Sequencing
  24. 26. DNA Sequencing
     - Chemical sequencing
     - Molecular sequencing
     - Now about $0.01 per base
     - Human Genome took 10 years
       - Celera sequenced in 3 years
     - Moore’s law applies to biotechnology too
       - In 2010 a single human genome in ~7 days
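To put the per-base figure in perspective, here is a rough back-of-the-envelope sketch. It combines the $0.01 per base quoted on this slide with the 3 billion base pairs quoted on the "One Genome" slide, and it assumes a single read of every base (real projects read each base several times, which is not modeled here).

```python
# Figures taken from the slides; the single-pass assumption is ours.
COST_PER_BASE_CENTS = 1            # "about $0.01 per base"
GENOME_SIZE_BASES = 3_000_000_000  # "3 billion base pairs"


def single_pass_cost_usd(bases: int, cents_per_base: int) -> int:
    """Cost in whole dollars of reading every base exactly once."""
    return (bases * cents_per_base) // 100


if __name__ == "__main__":
    cost = single_pass_cost_usd(GENOME_SIZE_BASES, COST_PER_BASE_CENTS)
    print(f"${cost:,}")  # $30,000,000
```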
  25. 27. DNA Sequencing http://www.accessexcellence.org
  26. 28. Gel Enhanced Staining
  27. 29. DNA Micro Arrays
     - Used to monitor gene expression
       - Which genes are active?
       - What are the “co-expressed patterns”?
     - Compare healthy and diseased tissue
       - Extract “expressed” mRNA in cytoplasm
       - Convert mRNA to cDNA
     - Discover relationships of proteins to disease states, and function / location of genes
     - Becoming the first step in “drug discovery”
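One common way to compare healthy and diseased tissue is a log2 fold change per gene. The sketch below is illustrative only: the gene names and intensity values are made up, and the two-fold threshold (`threshold=1.0` in log2 units) is an assumption, not something stated in the deck.

```python
import math


def log2_fold_change(diseased: float, healthy: float) -> float:
    """Log2 ratio of diseased to healthy signal; +1 means doubled, -1 means halved."""
    return math.log2(diseased / healthy)


def flag_changed_genes(healthy: dict, diseased: dict, threshold: float = 1.0) -> dict:
    """Return genes whose absolute log2 fold change meets the threshold."""
    changed = {}
    for gene, healthy_signal in healthy.items():
        lfc = log2_fold_change(diseased[gene], healthy_signal)
        if abs(lfc) >= threshold:
            changed[gene] = lfc
    return changed


if __name__ == "__main__":
    # Hypothetical spot intensities for three genes.
    healthy = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
    diseased = {"geneA": 400.0, "geneB": 210.0, "geneC": 12.5}
    print(flag_changed_genes(healthy, diseased))  # {'geneA': 2.0, 'geneC': -2.0}
```

Real microarray analysis also needs background correction, normalization, and replicate statistics, none of which are shown here.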
  28. 30. Microarray Output Screen
  29. 31. Microarray Output
  30. 32. Partnering with Pharma
     - Bioinformatics is an industry of tools
       - Biotech is a consumer / user of these tools
     - Pharma needs more “innovation engines”
       - Less than 2 drugs per firm in the ‘pipeline’
       - Drug discovery creates a new value chain
     - bioinformatics > biotech > ‘big pharma’
     - Convergence is the modality of innovation
  31. 33. Pharma and Biotech
  32. 34. Drug Discovery
     - Target discovery
     - Target validation
     - Protein interactions
     - Rapid screening
     - The long haul…
       - $800 million / year is spent on drug discovery
       - Over 75% of drug compounds will never work
  33. 35. Drug Development Process
  34. 36. Drug Discovery
  35. 37. “Pharmaco Genomics”
     - Individualized medicine
     - Looking at SNPs along drug targets
       - What makes each of us – us?
       - 1 million SNPs, about one per intron
     - In the future, each of us will have our genome “in silico” (genome on a chip)
     - Data mining against 6 billion genomes!
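A SNP is a single-base difference between two otherwise matching sequences. As a minimal sketch, assuming the two sequences are already aligned and of equal length, the positions can be found with a simple scan; `find_snps` and the example sequences are hypothetical.

```python
def find_snps(reference: str, individual: str) -> list:
    """Return (position, reference_base, individual_base) for every mismatch.

    Positions are 0-based. Both sequences must already be aligned to equal length.
    """
    if len(reference) != len(individual):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, ref_base, ind_base)
        for i, (ref_base, ind_base) in enumerate(zip(reference, individual))
        if ref_base != ind_base
    ]


if __name__ == "__main__":
    reference = "ATGCCGTA"   # made-up reference fragment
    individual = "ATGTCGTC"  # made-up individual fragment
    print(find_snps(reference, individual))  # [(3, 'C', 'T'), (7, 'A', 'C')]
```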
  36. 38. Pharmaco Genomics
  37. 39. One Genome
     - There are three very different ways to look at genomic diversity – and all are equally valid!
     - A “collective” human genome
       - 3 billion base pairs – called the ‘golden path’
     - Each one of us is a unique genome
       - “I am a genome of one”, my SNPs make me - ‘me’
     - The Genome on planet earth
       - A collective metabolic evolution and speciation
  38. 40. Terra Genoma
  39. 41. Molecular Networks
     - Genome or Proteome?
     - Proteome or Genome?
     - Wait a minute…
     - What if it’s both?
     - Now what would that look like?
  40. 42. Gene Regulatory Networks
  41. 44. Pathway Kinetics
  42. 45. Gene Regulatory Network
  43. 46. Bioinformatics Tools
     - NCBI
       - BLAST, 12 million records, SNP databases
     - ExPASy
       - Swiss-Prot, EMBL, Swiss-Model
     - PIR – Protein Information Resource
     - PDB – Protein Data Bank
     - Pfam – Protein families
  44. 47. NCBI
     - National Center for Biotechnology Information, part of NIH and NLM
     - Funded by US – open to all
     - GenBank and GenPept
       - 13 million entries, 12 billion base pairs
       - Resources include oncology, retroviruses, SNP databases, and much more
     - Sequin submission of raw sequence data
  45. 49. NCBI Resources
  46. 50. Retroviruses
  47. 51. BLAST
     - Basic Local Alignment Search Tool
     - Used as a “genomic search engine”
     - Compare your target sequence to the “non-redundant” database of 13 billion base pairs
     - Can search the genomes of species
       - Human, mouse, fly, E. coli etc.
     - ‘Hits’ return links to GenBank and GenPept
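BLAST itself is a fast heuristic, and its internals are not described in this deck. To illustrate what "local alignment" means, the sketch below uses the classic Smith-Waterman dynamic program instead, which finds the best-scoring local match exactly but far more slowly. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are arbitrary assumptions.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))         # 8
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))  # 6 (shared "ACG")
```

Only the score is returned; recovering the aligned region would require keeping the full matrix and tracing back from the best cell.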
  48. 53. Swiss-Prot
     - Swiss - protein annotated database
     - Protein resource
       - Minimal redundancy, reasonably current
       - Protein annotated / integrated database
       - Links to protein structures and properties
     - Links back into GenBank, EMBL, DDBJ
     - Literature resources for submissions
  49. 54. ExPASy
     - The ExPASy (Expert Protein Analysis System)
     - Proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to analysis of protein sequences and structures
     - Swiss-Prot and PROSITE
     - Links to SWISS-MODEL
  50. 55. PROSITE - Database of Protein Families and Domains
  51. 56. Structure Analysis
  52. 57. Protein Data Bank
     - SWISS-MODEL
     - Protein Data Bank
     - Archive of .pdb files
     - Structures determined by X-ray, NMR
     - Theoretical Structure Search
     - Features a “Molecule of the Month”
     - http://www.rcsb.org/pdb/
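A .pdb file is plain text with fixed-width columns, so atom coordinates can be read by slicing each `ATOM` line. The sketch below is a minimal illustration based on the standard PDB column layout; the sample record is built in code with made-up coordinates rather than copied from a real entry, and `parse_atom_line` is a hypothetical helper name.

```python
def parse_atom_line(line: str) -> dict:
    """Extract a few fixed-width fields from one ATOM/HETATM record."""
    return {
        "serial": int(line[6:11]),       # columns 7-11
        "name": line[12:16].strip(),     # columns 13-16
        "res_name": line[17:20].strip(), # columns 18-20
        "chain": line[21],               # column 22
        "res_seq": int(line[22:26]),     # columns 23-26
        "x": float(line[30:38]),         # columns 31-38
        "y": float(line[38:46]),         # columns 39-46
        "z": float(line[46:54]),         # columns 47-54
    }


def parse_atoms(text: str) -> list:
    """Parse every ATOM or HETATM line in a block of PDB text."""
    return [
        parse_atom_line(line)
        for line in text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    ]


# Illustrative record assembled with a format string so the columns line up.
SAMPLE = "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, " N", "MET", "A", 1, 27.340, 24.430, 2.614
)

if __name__ == "__main__":
    print(parse_atoms("HEADER    EXAMPLE\n" + SAMPLE))
```

For real work a maintained parser is a safer choice than hand-slicing, since the format has many more record types than shown here.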
  53. 59. PIR
     - Protein Information Resource
     - iProClass and PIR-NREF
       - PIR-PSD, Swiss-Prot, TrEMBL, RefSeq, GenPept, and PDB
     - http://pir.georgetown.edu/
     - Integrated public resource of protein informatics
     - Supports genomic and proteomic research and scientific discovery - iProClass and PIR-NREF
  54. 61. Pfam
     - Protein family comparisons
       - Look at multiple alignments
       - View protein domain architectures
       - Examine species distribution
       - Follow links to other databases
       - View known protein structures
     - Follow ‘conserved domains’ from BLASTp searches of protein databases
  55. 63. The Grand Challenge
  56. 64. The Technology Roadmap
     - Genomics
       - 1995 to 2005
     - Proteomics
       - 2000 to 2010
     - Systems biology
       - 2005 to 2015
     - Genetic remodeling / re-engineering
       - 2010 to 2020
     - Generation Phi
       - Children born in 2025 may never know disease
  57. 65. Convergence of Biotech & Pharma
     - Genomics
     - Proteomics
     - Systems biology
     - Pharmaco genomics
     - Genetic engineering
  58. 66. Mouse Genome
  59. 68. Gene Therapy
     - Somatic Gene Therapy
     - Therapeutic Gene Therapy
       - Incorporate “missing genes”
       - Remove cells from host organism
       - Amplify target cells
       - Insert gene using (viral) vector
       - Return target cells into host organism
     - Insulin gene was one of the first trials
  60. 70. Labeling Active Genes Along Chromosomes
  61. 71. Transgenic Species
  62. 72. Designer Flies – Is Blue Cool?
  63. 73. Your Own Private Genome
  64. 74. Surfing the Genome
     - Internet technologies
       - Connecting users, tools, and data
     - Molecular biology
       - Racing forward atop Moore’s Law
     - Informatics
       - Mathematical interrogation of nature’s secrets
     - Surfing the Genome!
       - Discovering the “bio-logic” of Nature
     - http://www.SurfingTheGenome.us/ Spring 2003
  65. 75. Contact Information
     - Robert D. Cormia
     - Foothill College
     - [email_address]
     - http://www.informaticus.org/
     - 650 747 1588
     - Surfing the Genome – Spring 2003
