MoM2010: Bioinformatics

Published in: Technology

  1. 1. MOM 2010<br />Bioinformatics: Reality Check & Challenges<br />Dr. Souham Meshoul<br />Information Technology Dept., CCIS, KSU<br />smeshoul@ksu.edu.sa<br />&<br />Dr. Nikhat Siddiqi<br />Biochemistry Dept., CS, KSU<br />nikkat@ksu.edu.sa<br />MOM 2010, May 31st<br />
  2. 2. Outline<br /><ul><li>Introduction</li>
<li>Central Dogma of Molecular Biology</li>
<li>Biological Data Representation</li>
<li>How Computers Can Be Useful in Biology</li>
<li>Intelligent Bioinformatics</li>
<li>Challenges</li>
<li>Conclusion</li></ul>
  8. 8. Introduction<br />The human body is made up of an estimated 10<sup>12</sup> cells, each of which contains 23 pairs of chromosomes. Together these carry approximately 30,000 genes and some 3 billion pairs of DNA bases.<br />Biological data explosion<br />http://bip.weizmann.ac.il/education/course/introbioinfo/04/lect1/introbioinfo04/index.htm<br />
  9. 9. Introduction<br />
  10. 10. Introduction<br />What is Bioinformatics?<br />Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.<br />Bioinformatics is the electronic infrastructure of biology.<br />It is the science that encompasses the methods used to collect, store, retrieve, analyze, and correlate the mountain of complex biological information.<br />Source: http://ccb.wustl.edu/<br />
  11. 11. Introduction<br />Bioinformatics vs. Computational Biology<br />Source: http://ccb.wustl.edu/<br />
  12. 12. Introduction<br />The problem:<br />Only a basic understanding of how gene sequences code for specific proteins.<br />Lack of the information necessary to completely understand the role of DNA in specific diseases or the functions of the thousands of proteins that are produced.<br />The goals:<br />Provide scientists with a means to explain:<br />Biological processes.<br />Malfunctions in these processes that lead to diseases.<br />Drug discovery and drugs' modes of action.<br />
  13. 13. Central Dogma of Molecular Biology<br />
  14. 14. Central Dogma of Molecular Biology<br />
  15. 15. Central Dogma of Molecular Biology<br />DNA carries the hereditary information of an organism.<br />
  16. 16. Central Dogma of Molecular Biology<br />
  17. 17. Deoxyribonucleic Acid (DNA)<br />DNA is found inside a special area of the cell called the nucleus. Because the cell is very small, and because organisms have many DNA molecules per cell, each DNA molecule must be tightly packaged. This packaged form of the DNA is called a chromosome.<br />
  18. 18. Deoxyribonucleic Acid (DNA): Composition<br />What is DNA made of?<br />DNA is made of chemical building blocks called nucleotides.<br />The four types of nitrogen bases found in nucleotides are adenine (A), thymine (T), guanine (G) and cytosine (C). The order, or sequence, of these bases determines what biological instructions are contained in a strand of DNA. For example, the sequence ATCGTT might instruct for blue eyes, while ATCGCT might instruct for brown.<br />
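Because the instructions depend on the order of the bases, comparing two sequences position by position is a natural first operation. The following is a minimal Python sketch, assuming equal-length sequences; the function name `differing_positions` is hypothetical and only illustrates the idea with the two example sequences from the slide.

```python
def differing_positions(seq_a, seq_b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# The two illustrative sequences from the slide differ at a single position.
print(differing_positions("ATCGTT", "ATCGCT"))  # [4]
```

The number of differing positions is the Hamming distance between the two sequences; real comparisons also have to handle insertions and deletions, which this sketch does not.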
  19. 19. Deoxyribonucleic Acid (DNA): Genes<br />Genes are the coding sections of DNA that are ultimately transcribed into mRNA, which is then translated into proteins.<br />
  20. 20. Deoxyribonucleic Acid (DNA): Function<br />What does DNA do?<br />DNA contains the instructions needed for an organism to develop, survive and reproduce. To carry out these functions, DNA sequences must be converted into messages that can be used to produce proteins, which are the complex molecules that do most of the work in our bodies.<br />To form a strand of DNA, nucleotides are linked into chains, with the phosphate and sugar groups alternating.<br />
  21. 21. Deoxyribonucleic Acid (DNA): Replication<br />Chromosomes are located in the nucleus of a cell. DNA must be duplicated in a process called replication before a cell divides. The replication of DNA allows each daughter cell to contain a full complement of chromosomes.<br />
  22. 22. Deoxyribonucleic Acid (DNA): Transcription<br />The information in the DNA of chromosomes is decoded in a process called transcription, through the formation of another nucleic acid: ribonucleic acid (RNA).<br />
  23. 23. Deoxyribonucleic Acid (DNA): Translation<br />The information from the DNA, now in the form of a linear RNA sequence, is decoded in a process called translation to form a protein, another biological polymer.<br />
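Transcription and translation can be mimicked on strings. The sketch below is a simplification under stated assumptions: the input is the coding strand (so the mRNA is the same sequence with U in place of T), the standard genetic code is used, and translation starts at the first base and stops at the first stop codon. The names `transcribe`, `translate` and `CODON_TABLE` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Simplified transcription: the mRNA mirrors the coding strand with U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first base and stop at the first stop codon ('*')."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGGCAAATAA")
print(mrna)             # AUGGGCAAAUAA
print(translate(mrna))  # MGK
```

Real transcription reads the template strand and real translation involves start-codon selection and reading frames; the sketch only shows the string-to-string mapping.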
  24. 24. Biological Data Representation<br />
  25. 25. Biological Data Representation<br />
  26. 26. Biological Data Representation<br />Strings: to represent DNA, RNA and sequences of amino acids<br />DNA: {A,C,G,T}, RNA: {A,C,G,U}<br />Protein: {A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y}<br />e.g. 5’ GTAAAGTCCCGTTAGC 3’<br />Image source: www.ebi.ac.uk/microarray/biology_intro.htm<br />
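The three alphabets translate directly into sets of characters, and a sequence is then an ordinary string. The sketch below is illustrative only; `is_over` and `reverse_complement` are hypothetical helper names, and the example string is the one shown on the slide.

```python
DNA = set("ACGT")
RNA = set("ACGU")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWY")


def is_over(sequence, alphabet):
    """True if every character of the sequence belongs to the alphabet."""
    return set(sequence) <= alphabet


def reverse_complement(dna):
    """Return the opposite strand of a DNA string, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


example = "GTAAAGTCCCGTTAGC"
print(is_over(example, DNA))        # True
print(is_over(example, RNA))        # False, because T is not an RNA letter
print(is_over(example, PROTEIN))    # True: A, C, G and T are also amino-acid codes
print(reverse_complement(example))  # GCTAACGGGACTTTAC
```

The third check shows why the alphabet alone cannot tell a DNA string from a protein string: the sequence type has to be recorded alongside the data.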
  27. 27. Biological Data Representation<br />Trees: to represent the evolution of various organisms<br />
  28. 28. Biological Data Representation<br />Sets of 3D points: to represent the protein structure<br />
  29. 29. Biological Data Representation<br />Graphs: to represent metabolic and signaling pathways.<br />
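Trees, sets of 3D points and graphs each map onto a simple in-memory structure. The sketch below uses toy data throughout: the taxon names, coordinates and pathway nodes are made up for illustration, and the helper names are hypothetical.

```python
import math
from collections import deque

# Tree as nested tuples; leaves are taxon names (toy topology).
toy_tree = (("taxon_a", "taxon_b"), ("taxon_c", "taxon_d"))


def to_newick(node):
    """Serialize a nested-tuple tree to a Newick-style string without branch lengths."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# Structure as a list of (x, y, z) coordinates (made-up values).
toy_points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]


def consecutive_distances(points):
    """Euclidean distance between each point and the next one in the list."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]


# Pathway as a directed graph stored in an adjacency dictionary.
toy_pathway = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(to_newick(toy_tree) + ";")            # ((taxon_a,taxon_b),(taxon_c,taxon_d));
print(consecutive_distances(toy_points))    # [5.0, 12.0]
print(shortest_path(toy_pathway, "A", "E"))  # ['A', 'B', 'C', 'E']
```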
  30. 30. How can computers be useful for biology?<br />
  31. 31. How can computers be useful to biology?<br />Computing technology for storing DNA sequences and assembling them from fragments: data storage and access requirements.<br />Study of the organization and evolution of genomes through comparative genome analysis: visualization tools and techniques, sequence analysis requirements.<br />
  32. 32. How can computers be useful to biology?<br />Structuring and organizing large databases using a common ontology. Accessing data from different databases using the same query language (Gene Ontology Consortium).<br />Many areas of biology use images to communicate their results. Tools and techniques are needed for searching, describing, manipulating and analyzing features within these images.<br />
  33. 33. How can computers be useful to biology?<br />Database maintenance: databases must be checked for consistency so that their content is valid and error-free.<br />Storing protein sequences, their structure and their function. Tools and techniques for manipulating protein sequences and for predicting protein secondary and tertiary structure.<br />
  34. 34. How can computers be useful for biology?<br />Several databases: genome databases, protein sequence databases, metabolic databases, microarray databases<br />EMBL-EBI (Europe), GenBank-NCBI (USA), DDBJ (Japan)<br />Basic Local Alignment Search Tool (BLAST) program for quick DB searches.<br />
  35. 35. How can computers be useful for biology?<br />Several algorithms:<br />Sequence comparison algorithms: Needleman-Wunsch (global alignment, 1970), Smith-Waterman (local alignment, 1981).<br />BLAST, FASTA, CLUSTALW, MEME, etc.<br />http://en.wikipedia.org/wiki/Category:Bioinformatics_algorithms<br />
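As an illustration of the global alignment algorithm named above, here is a minimal Python sketch of Needleman-Wunsch dynamic programming. The scoring values (match +1, mismatch -1, gap -1) are an assumption chosen for simplicity rather than values given in the slides, and the function returns only one of possibly several optimal alignments.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two strings; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("AGGTC", "GTTCG")
print(best)
print(row_a)
print(row_b)
```

Smith-Waterman follows the same recurrence but clamps each cell at zero and traces back from the highest-scoring cell, which yields a local rather than a global alignment.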
  36. 36. Intelligent Bioinformatics<br /><ul><li>Data generation in biology/bioinformatics is outpacing methods of data analysis.</li>
<li>Data interpretation and generation of hypotheses require intelligence.</li>
<li>AI offers established methods for knowledge representation and “intelligent” data interpretation.</li>
<li>The use of AI in bioinformatics is expected to increase.</li></ul>
  39. 39. Intelligent Bioinformatics<br />Search problems: sequence alignment.<br />Learning problems: gene regulatory networks.<br />Clustering problems: gene expression data processing.<br />Prediction problems: secondary and tertiary structure.<br />Data mining: inferring knowledge from large biological databases.<br />
  40. 40. Intelligent Bioinformatics<br />AI and computational intelligence techniques and models:<br />Basic search techniques: A*, branch and bound<br />Genetic algorithms<br />Simulated annealing<br />Particle swarm optimization<br />Neural networks<br />Support vector machines<br />K-nearest neighbors<br />Hidden Markov models, ...<br />
  41. 41. Intelligent Bioinformatics: An example: Sequence Alignment<br />S1=AGGTC<br />S2=GTTCG<br />S3=TGAAC<br />Possible alignment:<br />AGGT-C-<br />-G-TTCG<br />TG-AAC-<br />Possible alignment:<br />AGGT-C-<br />GTT--CG<br />-TGAAAC<br />For lengthy sequences the problem is very hard to solve:<br /><ul><li>Optimization problem</li>
<li>Search strategy</li>
<li>Scoring function</li></ul>
  43. 43. Intelligent Bioinformatics: An example: Sequence Alignment<br />$S = \{s_1, \dots, s_n\}$, where each $s_i$ is a string defined over an alphabet $A$.<br />Aligning the sequences in $S$ means obtaining $S' = \{s'_1, \dots, s'_n\}$ such that:<br />1. Each sequence $s'_i$ is an extension of $s_i$, obtained by inserting gap symbols and defined over the alphabet $A \cup \{-\}$.<br />2. For all $i, j$: $\mathrm{length}(s'_i) = \mathrm{length}(s'_j)$.<br />
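The two conditions of this definition, together with a scoring function, can be checked mechanically. The sketch below uses the three example sequences and one candidate alignment of them; the sum-of-pairs score with match +1, mismatch -1, gap -1 and gap-against-gap 0 is an assumed scoring scheme, not one specified in the slides, and the function names are hypothetical.

```python
from itertools import combinations


def is_valid_alignment(sequences, aligned, gap="-"):
    """Check both conditions: rows extend the originals and all rows have equal length."""
    same_length = len({len(row) for row in aligned}) == 1
    extends = all(row.replace(gap, "") == seq for seq, row in zip(sequences, aligned))
    return len(sequences) == len(aligned) and same_length and extends


def sum_of_pairs(aligned, match=1, mismatch=-1, gap_penalty=-1, gap="-"):
    """Score an alignment column by column over every pair of rows."""
    total = 0
    for column in zip(*aligned):
        for x, y in combinations(column, 2):
            if x == gap and y == gap:
                continue  # two gaps facing each other contribute nothing
            if x == gap or y == gap:
                total += gap_penalty
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


sequences = ["AGGTC", "GTTCG", "TGAAC"]
candidate = ["AGGT-C-", "-G-TTCG", "TG-AAC-"]
print(is_valid_alignment(sequences, candidate))  # True
print(sum_of_pairs(candidate))                   # -5
```

A search strategy such as simulated annealing or a genetic algorithm would generate many candidate alignments and keep those with the best score; the number of candidates grows very quickly with sequence length, which is why the problem is treated as an optimization problem.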
  44. 44. Intelligent Bioinformatics: An example: the robot scientist<br />Source: BBC News<br /><ul><li>University of Wales</li>
<li>Designed for the study of functional genomics</li>
<li>Tested on yeast metabolic pathways</li>
<li>Uses knowledge representation schemes</li>
<li>Uses a Prolog database to store background biological information</li>
<li>Prolog can inspect biological information, infer knowledge, and make predictions</li>
<li>The optimal hypothesis is determined using machine learning, which weighs probabilities and associated costs</li></ul>
  50. 50. Intelligent Bioinformatics: Another example: the robot scientist<br />Ross D. King, et al., Nature, January 2004<br />
  51. 51. Intelligent Bioinformatics: Another example: the robot scientist<br />Performance similar to humans<br />Performance significantly better than “naïve” or “random” selection of experiments<br />Ross D. King, et al., Nature, January 2004<br />
  52. 52. Challenges<br />Data fusion or integration:<br />Integrating a wide variety of data sources, such as clinical and genomic data, will allow us to use disease symptoms to predict genetic mutations and vice versa.<br />Integrating GIS data, such as maps and weather systems, with crop health and genotype data will allow us to predict successful outcomes of agricultural experiments.<br />
  53. 53. Challenges<br />Large-scale comparative genomics:<br />Developing tools that can compare genomes will push forward the discovery rate in this field of bioinformatics.<br />Modeling and visualization of full networks of complex systems:<br />Predict how the system (or cell) reacts to a drug, for example.<br />
  54. 54. Challenges<br />Comparing complex biological observations, such as gene expression patterns and protein networks.<br />Converting biological observations to a model that a computer will understand.<br />Beyond that, we are more than the sum of our parts: COMPLEXITY<br />
  55. 55. Challenges<br /><ul><li>Important information still needs to be decoded: genome sequencing, microarrays</li>
<li>Exciting research potential: leads to important discoveries</li>
<li>Jobs available</li>
<li>SmartMoney ranks bioinformatics #1 among the next hot jobs</li>
<li>Saves time and money</li></ul>http://smartmoney.com/consumer/index.cfm?story=working-june02<br />
  60. 60. OUR CHALLENGE: START WORKING ON BIOINFORMATICS<br />J. Cohen: “Computer scientists should be encouraged to learn biology as biologists computer science to prepare themselves for an intellectually stimulating and financially rewarding future.”<br />D. Knuth: “…the number of radically new results in pure computer science is likely to decrease, while scientists continue working on biological challenges for the next 500 years….”<br />L. Adleman: “…biological life can be equated with computation…”<br />
  61. 61. Conclusion<br />Image Source: http://www.amazon.com/<br />
  62. 62. Conclusion<br />Image Source: http://www.amazon.com/<br />
  63. 63. Resources<br />ISCB: http://www.iscb.org/<br />NCBI: http://ncbi.nlm.nih.gov/<br />http://www.bioinformatics.org/<br />Journals: IEEE/ACM<br />Conferences (ISMB, RECOMB, PSB…)<br />http://kbrin.a-bldg.louisville.edu/CECS694/<br />
  64. 64. Conclusion<br />Bioinformatics is all about how computer science can enhance biology and how biology can stimulate computer science.<br />
  65. 65. MOM 2010<br />
