Slide 1

  1. The CMBI: Bioinformatics
     Content:
     • Bioinformatics
     • Bioinformatics @CMBI
     • Bioinformatics tools & databases
     Hanka Venselaar, CMBI, UMC Radboud, February 2009
     h.venselaar@cmbi.ru.nl
  2. What is bioinformatics?
     • Bioinformatics is the use of computers to solve information problems in the life sciences.
     • You are "doing bioinformatics" when you use computers to store, retrieve, analyze or predict the sequence, function and/or structure of biomolecules.
     Section: Bioinformatics
     ©CMBI 2009
  3. Human genome, great expectations
     Data ≠ knowledge, insight!
  4. Why do we need bioinformatics?
     Flood of biological data:
     – DNA sequences (genomes)
     – protein sequences and structures
     – gene expression profiles (transcriptomics)
     – cellular protein profiles (proteomics)
     – cellular metabolite profiles (metabolomics)
     We want to:
     – collect and store the data
     – integrate, analyze, compare and mine the data
     – predict genes, protein function and protein structure
     – predict physiology (models, mechanisms, pathways)
     – understand how a whole cell works
  5. A large fraction of human genes have an unknown function (Science, 2001)
  6. What is protein function?
     • Homology
     • Genomic context
  7. How can we predict the function of proteins?
     “New, unknown protein” -> compare with a database of proteins (BLAST) -> “similar sequence with known function, e.g. protein kinase” -> extrapolate the function
     The importance of sequence similarity and sequence alignment. Similar sequences usually have:
     – A similar evolutionary origin
     – A similar function
     – A similar 3D structure
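The workflow on this slide (compare an unknown protein with a database, take the most similar annotated sequence, extrapolate its function) can be sketched in a few lines of Python. This is a toy illustration and not the BLAST algorithm: it simply ranks entries by the fraction of shared 3-letter words. The database names and sequences are made up for the example.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_hit(query, database, k=3):
    """Return (name, score) of the entry sharing the most k-mers with query.

    The score is the Jaccard index of the two k-mer sets (0.0 to 1.0).
    Returns (None, 0.0) when nothing in the database shares a k-mer.
    """
    q = kmers(query, k)
    best_name, best_score = None, 0.0
    for name, seq in database.items():
        d = kmers(seq, k)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical annotated sequences, invented for this example.
DATABASE = {
    "protein kinase (toy)": "GSGAFGKVYLAREKATGKLYAIK",
    "lysozyme (toy)": "KVFGRCELAAAMKRHGLDNYRG",
}

if __name__ == "__main__":
    name, score = best_hit("GTGSFGKVYLARDKATGKLYAMK", DATABASE)
    print("Best hit:", name, "score:", round(score, 2))
```

A real search would use BLAST, which scores local alignments with a substitution matrix and reports statistical significance; the sketch only shows the "rank by similarity, then transfer the annotation" idea.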
  8. CMBI - Centre for Molecular and Biomolecular Informatics
     • Dutch national centre for computational molecular sciences research
     • Research groups
       – Comparative Genomics (Huynen)
       – Bacterial Genomics (Siezen)
       – Computational Drug Design (De Vlieg)
       – Bioinformatics of Macromolecular Structures (Vriend)
     • Training & Education
       – MSc, PhD and PostDoc programmes
       – International workshops
       – Hotel Bioinformatica
       – High school courses
     • Computational facilities, databases, and software packages via (inter-)national service platforms (NBIC, EBI, etc.)
     • NBIC: National BioInformatics Centre
     Section: Bioinformatics @CMBI
  9. Computational Drug Discovery (CDD) Group
     • Head: Prof. Jacob de Vlieg
     • Key goal: develop molecular modeling and computer-based simulation techniques for structure-based drug design, translational medicine and protein-family-based approaches to design and identify drug-like compounds
     • Key research fields
       – Structural bioinformatics for drug design
       – Bioinformatics for genomics (microarray analysis, text mining, etc.)
       – Translational medicine informatics
     CDD: bridging academic research and applied genomics
     – Academic research: new scientific approaches, training & education
     – Applications: exciting real-life problems, ‘wet’ validation
  10. Examples of CDD Projects
      • Exploiting structural genomics information to incorporate protein flexibility in drug design
      • Protein knowledge building through comparative genomics and data integration
      • In silico studies on p63 as a new drug-target protein
  11. International Computational Drug Discovery Course
      • Covers the entire research pipeline, from genomics and proteomics in target discovery to Structure Based Drug Design and QSAR in drug optimization
      • Lectures and practicals
      • 2-week course
      • June/July 2009
      • www.cmbi.ru.nl/ICDD2008
  12. Bacterial Genomics Group
      • Head: Prof. Roland Siezen
      • Research interest: biological questions in the interest of the Dutch food industry
      • How can we improve:
        – fermentation
        – safety
        – health
      • Micro-organisms studied: Gram-positive food bacteria
        – lactic acid bacteria (Lactococcus, Lactobacillus)
        – spoilage bacteria (Listeria, Clostridium, Bacillus cereus)
      (Images: Listeria, Lactococcus)
  13. Bacterial Genomics: from sequence to predicted function
      Key research fields:
      – Genome sequencing and interpretation
      – Network reconstruction and analysis
      – Systems biology, dynamic modelling
      Raw sequence data: 2 to 5 million nucleotides
        AAACACTTAGACAATCAATATAAAGATGAA
        GTGAACGCTCTTAAAGAGAAGTTGGAAAAC
        TTGCAGGAACAAATCAAAGATCAAAAAAGG
        ATAGAAGAACAAGAAAAACCACAAACACTT
        AGACAATCAATATAAAGATGAAGTGAACGC
        TCTTAAAGAGAAGTTGGAAAACTTGCAGGA
        ACAAATCAAAGATCAAAAAAGGATAGAAGA
        ACAAGAAAAACCACAAACACTTAGACAATC
        AATATAAAGATGAAGTGAACGCTCTTAAAG
        AGAAGTTGGAAAACTTGCAGGAACAAATCA
        AAGATCAAAAAAGGATAGAAGAACAAGAAA
        AACCACAAACACTTAGACAATCAATATAAA
        GATGAAGTGAACGCTCTTAAAGAGAAGTTG
        GAAAACTTGCAGGAACAAATCAAAGATCAA
        AAAAGGATAGAAGAACAAGAAAAACCACAA
        ACACTTAGACAATCAATATAAAGATGAAGT
        GAACGCTCTTAAAGAGAAGTTGGAAAACTT
        GCAGGAA
      A virtual cell: overview of predicted pathways
  14. Bacterial Genomics: Example
      “Differential NF-κB pathways induction by Lactobacillus plantarum in the duodenum of healthy humans correlating with immune tolerance”
      Peter van Baarlen et al., PNAS, Feb 3, 2009
  15. Comparative Genomics Group
      • Head: Prof. Martijn Huynen
      • Research focus:
        – How do the proteins encoded in genomes interact with each other to produce cells and phenotypes?
        – Predicting functional interactions between proteins, such as those in metabolic pathways, signalling pathways or protein complexes
      A genome is more than the sum of its genes -> use “genomic context” for function prediction
      Types of genomic context:
      – Gene fusion/fission
      – Chromosomal location
      – Gene order/neighbourhood
      – Co-evolution
      – Co-expression
  16. Turning data into knowledge
      Research topics:
      • Develop computational genomics techniques that exploit the information in sequenced genomes and functional genomics data
      • Make testable predictions about pathways and the functions of the proteins in them
      • Study the evolution of the eukaryotic cell and the origin and evolution of organelles such as mitochondria and peroxisomes
      Education:
      • Comparative Genomics Course, 3 EC, April 2009
      Comparative genomics -> prediction of protein function, pathways
  17. Frataxin Example
      • Frataxin is a well-known disease gene (Friedreich's ataxia) whose function has remained elusive despite more than six years of intensive experimental research.
      • Using computational genomics we have shown that frataxin has co-evolved with hscA and hscB and is likely involved in iron-sulfur cluster assembly in conjunction with the co-chaperone HscB/JAC1.
      Prediction -> Confirmation
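Co-evolution as a form of genomic context is often assessed by comparing phylogenetic profiles, that is, the pattern of genomes in which each gene is present or absent. The sketch below ranks candidate partners of a gene by profile similarity. The presence/absence values are invented for illustration and do not reproduce the published frataxin analysis; the function names are hypothetical.

```python
def profile_similarity(a, b):
    """Fraction of genomes in which two genes are both present or both absent."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def rank_partners(gene, profiles):
    """Return (other_gene, similarity) pairs, most similar profile first."""
    scores = [
        (other, profile_similarity(profiles[gene], prof))
        for other, prof in profiles.items()
        if other != gene
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# 1 = gene present in that genome, 0 = absent.
# Invented data for eight hypothetical genomes.
PROFILES = {
    "frataxin":  [1, 1, 0, 1, 0, 1, 1, 0],
    "hscA":      [1, 1, 0, 1, 0, 1, 1, 0],
    "hscB":      [1, 1, 0, 1, 0, 1, 0, 0],
    "unrelated": [0, 1, 1, 0, 1, 0, 1, 1],
}

if __name__ == "__main__":
    for other, score in rank_partners("frataxin", PROFILES):
        print(f"{other:10s} {score:.3f}")
```

Genes that are gained and lost together across many genomes get a high score, which is the signal used to propose a shared pathway or complex; real analyses also correct for the phylogenetic relatedness of the genomes.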
  18. Bioinformatics of macromolecular structures
      • Head: Prof. Gert Vriend
      • Research focus: understanding proteins (and their environment)
      • Proteins are the core of life: they do all the work, and they give you feelings, contact with the outside world, etc.
      • Proteins, therefore, are the most important molecules on earth.
      • We want to understand life: why are we what we are, why do we do what we do, how come you can think what you think?
  19. Bioinformatics of macromolecular structures
      Research topics of the Vriend group:
      • Homology modeling technology and applications
      • Application of bioinformatics in medical research (Hanka Venselaar)
      • Structure validation and structure determination improvement
      • Molecular class-specific information systems (e.g. GPCRDB & NucleaRDB)
      • Data mining
      • WHAT IF molecular modelling and visualization software
  20. DFNB63: Homology Modeling
      Hearing loss: unknown structure
        MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI
        ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI
        EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG
        PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP
        GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE
        DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL
        LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG
        LPDFPAIKDGIAQLTYAGPG
      Homology modeling: prediction of 3D structure based upon a highly similar structure
  21. Homology Modeling
      Prediction of 3D structure based upon a highly similar structure:
      – Alignment of model and template sequence:
          NSDSECPLSHDG   (unknown structure)
          ||   || | ||
          NSYPGCPSSYDG   (known structure)
      – Copy backbone and conserved residues from the known structure
      – Add side chains, Molecular Dynamics simulation on model
      – Model!
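The "copy backbone and conserved residues" step depends on knowing which aligned positions carry the same residue in both sequences. A minimal sketch, using the two 12-residue fragments from this slide, is given below. The function names are illustrative only, and real modelling software (such as WHAT IF or Yasara) works on gapped alignments and 3D coordinates rather than bare strings.

```python
def conserved_positions(model_seq, template_seq):
    """Return 0-based indices where two aligned sequences have the same residue.

    Both strings must already be aligned (equal length); gap characters
    ('-') never count as conserved.
    """
    if len(model_seq) != len(template_seq):
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i, (m, t) in enumerate(zip(model_seq, template_seq))
        if m == t and m != "-"
    ]


def percent_identity(model_seq, template_seq):
    """Percentage of aligned columns that are identical."""
    identical = len(conserved_positions(model_seq, template_seq))
    return 100.0 * identical / len(model_seq)


MODEL = "NSDSECPLSHDG"     # unknown structure (from the slide)
TEMPLATE = "NSYPGCPSSYDG"  # known structure (from the slide)

if __name__ == "__main__":
    keep = conserved_positions(MODEL, TEMPLATE)
    print("Copy side chains at positions:", keep)
    print("Build new side chains elsewhere.")
    print(f"Sequence identity: {percent_identity(MODEL, TEMPLATE):.1f}%")
```

For these fragments, 7 of the 12 aligned positions are identical, which matches the match line shown on the slide; at the remaining 5 positions only the backbone can be copied and the side chains have to be built.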
  22. DFNB63: Homology Modeling
      Hearing loss: structure!
        MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI
        ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI
        EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG
        PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP
        GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE
        DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL
        LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG
        LPDFPAIKDGIAQLTYAGPG
  23. Homology Modeling
      Mutations:
      • Arginine 81 -> Glutamic acid
      • Glutamic acid 110 -> Lysine
      The salt bridge between arginine and glutamic acid is lost in both cases.
  24. Homology Modeling
      Mutation:
      • Tryptophan 105 -> Arginine
      The hydrophobic contacts of the tryptophan are lost, and a hydrophilic, charged residue is introduced.
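The reasoning applied to these three mutations (does the side-chain charge change, is a hydrophobic residue lost?) can be written down as a small rule-based check. This is a sketch under simplifying assumptions: charges are taken at neutral pH with histidine treated as neutral, and the hydrophobic set is one common grouping among several. It says nothing about the 3D environment, which is what the homology model adds.

```python
# Side-chain charge at neutral pH (simplified; histidine treated as neutral).
CHARGE = {"R": +1, "K": +1, "D": -1, "E": -1}

# One commonly used hydrophobic grouping; classifications differ between sources.
HYDROPHOBIC = set("AVLIMFWC")


def describe_mutation(wild, mutant):
    """Return a list of property changes for a one-letter residue substitution."""
    notes = []
    charge_wild = CHARGE.get(wild, 0)
    charge_mutant = CHARGE.get(mutant, 0)
    if charge_wild != charge_mutant:
        notes.append(f"charge {charge_wild:+d} -> {charge_mutant:+d}")
    if wild in HYDROPHOBIC and mutant not in HYDROPHOBIC:
        notes.append("hydrophobic side chain lost")
    return notes


# The three mutations from the slides: (wild type, position, mutant).
MUTATIONS = [("R", 81, "E"), ("E", 110, "K"), ("W", 105, "R")]

if __name__ == "__main__":
    for wild, position, mutant in MUTATIONS:
        changes = describe_mutation(wild, mutant) or ["no change detected"]
        print(f"{wild}{position}{mutant}: " + "; ".join(changes))
```

Both salt-bridge mutations show up as a charge reversal, and the tryptophan mutation as a gained charge plus a lost hydrophobic side chain, in line with the interpretation on the slides.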
  25. Homology Modeling
      • The three mutated residues are all important for the correct positioning of Tyrosine 111.
      • Tyrosine 111 is important for substrate binding.
      Ahmed et al., Mutations of LRTOMT, a fusion gene with alternative reading frames, cause nonsyndromic deafness in humans. Nat Genet. 2008 Nov;40(11):1335-40.
      Interested? Contact Hanka Venselaar (h.venselaar@cmbi.ru.nl)
  26. Hotel Bioinformatica
      Hotel functions:
      • Temporary housing, teaching and supervision of experimentalists for data analysis at the CMBI
      • Centralization of UMC-wide bioinformaticians
      • Shared (weekly) seminars of CMBI with ‘in-house bioinformaticians’
      • Collaboration/advice in acquiring grants with a bioinformatics aspect
      Interested? Contact Martijn Huynen (m.huynen@cmbi.ru.nl)
  27. Bioinformatics data types
      Examples: mRNA expression profiles, MS data
      • Large amount of data
      • Growing very fast
      • Heterogeneous data types
      Section: Bioinformatics Tools & Databases
  28. Biological Databases
      • Information is the core of bioinformatics
      • Literally thousands of databases exist that are relevant for biology, medicine, and/or chemistry

      Content                          Database
      protein sequences                SwissProt, UniProt, trEMBL
      nucleotide sequences             EMBL, GenBank, DDBJ
      structures (protein, DNA, RNA)   Protein Data Bank (PDB)
      genomes                          Ensembl, UCSC
      mutations                        OMIM
      patterns, motifs                 PROSITE
      protein domains                  InterPro, SMART
      pathways                         KEGG
  29. Important records in SwissProt/UniProt (1)
  30. Important records in SwissProt/UniProt (2)
      Cross references (direct hyperlinks to):
      • EMBL
      • PDB
      • OMIM
      • InterPro
      • etc.
      Features:
      • post-translational modifications
      • signal peptides
      • binding sites
      • enzyme active sites
      • domains
      • disulfide bridges
      • etc.
  31. Protein Databank & Structure Visualization
      • PDB structures have a unique identifier, the PDB code: 4 characters (often 1 digit & 3 letters, e.g. 1CRN).
      • Download PDB structures and give them the correct file extension: 1CRN.pdb
      • Structures from the PDB can be visualized directly with:
        1. Yasara (www.yasara.org)
        2. SwissPDBViewer (http://spdbv.vital-it.ch/)
        3. Protein Explorer (http://www.umass.edu/microbio/rasmol/)
        4. Cn3D (http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml)
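The naming rule on this slide can be checked mechanically. The sketch below assumes the classic four-character PDB code format (a digit from 1 to 9 followed by three letters or digits) and builds the file name with the `.pdb` extension. The helper name `pdb_filename` is made up for this example, and the code does not download anything.

```python
import re

# Classic PDB code: one digit (1-9) followed by three letters or digits.
PDB_CODE = re.compile(r"[1-9][A-Za-z0-9]{3}")


def pdb_filename(code):
    """Return the file name for a PDB code, e.g. '1crn' -> '1CRN.pdb'.

    Raises ValueError if the code does not look like a classic PDB code.
    """
    if not PDB_CODE.fullmatch(code):
        raise ValueError(f"not a valid 4-character PDB code: {code!r}")
    return code.upper() + ".pdb"


if __name__ == "__main__":
    print(pdb_filename("1crn"))
```

Validating the code before saving avoids files with a wrong or missing extension, which some viewers will not recognise as structure files.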
  32. OMIM Database
      OMIM - Online Mendelian Inheritance in Man
      • a large, searchable, current database of human genes, genetic traits, and hereditary disorders
      • contains information on all known mendelian disorders and over 12,000 genes
      • focuses on the relationship between phenotype and genotype
  33. Browsing genomes
      • UCSC: http://genome.ucsc.edu/
      • NCBI
      • Ensembl: http://www.ensembl.org/
      Only eukaryotic genomes
  34. Sequence Retrieval with MRS (1)
      • Google = the best generic search and retrieval system
      • MRS = Maarten’s Retrieval System (http://mrs.cmbi.ru.nl)
      • MRS is the Google of the biological database world
      Search engine (like Google):
      • Input/query = word(s)
      • Output = entry/entries from the database
      Searching is very intuitive:
      – Select database(s) of choice
      – Formulate your query
      – Hit “Search”
      – The result is a “query set” or “hitlist”
      – Analyze the results
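The "words in, entries out" behaviour described on this slide is what a keyword index provides. The sketch below builds a tiny inverted index and returns the entries that contain every query word. It is a generic illustration and makes no claim about how MRS is implemented; the entry identifiers and descriptions are invented.

```python
from collections import defaultdict


def build_index(entries):
    """Map each lower-cased word to the set of entry IDs that contain it."""
    index = defaultdict(set)
    for entry_id, text in entries.items():
        for word in text.lower().split():
            index[word].add(entry_id)
    return index


def search(index, query):
    """Return the IDs of entries that contain all words of the query (the hitlist)."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


# Invented example entries: ID -> description line.
ENTRIES = {
    "E1": "Lysozyme C human",
    "E2": "Lysozyme C chicken",
    "E3": "Protein kinase human",
}

if __name__ == "__main__":
    index = build_index(ENTRIES)
    print(sorted(search(index, "lysozyme human")))
```

Adding a word to the query can only shrink the hitlist, which is why it pays to think about the query before searching.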
  35. Sequence Retrieval with MRS (2)
      • Select database
      • Formulate query, but think about your query first!
      • MRS hitlist
  36. BLAST and CLUSTAL with MRS
      • “Blast” brings you to the MRS page from which you can do Blast searches.
      • “Blast results” brings you to the page where MRS stores the Blast results of your current session.
      • “Clustal” brings you to the MRS page from which you can do Clustal sequence alignments.
  37. Your Exercise Today
      The practicum: FAMILIAL VISCERAL AMYLOIDOSIS
      • Today for PhD students
      • Friday (13:00) for MMD students
      • Location: CMBI, course room, ground floor NCMLS
      You will study lysozyme:
      • Protein
      • Gene
      • Mutations causing familial visceral amyloidosis
      • 3D structure
      HAVE FUN!!
  38. The Practicum
      You can find the practicum at http://swift.cmbi.ru.nl/teach/lyso/
      • Work with MRS
      • Work with Yasara
      • Read the text carefully
      • User login = c(your PC number), e.g. c07
      • User password = t0psp0rt (with zeros)
      • The program Yasara is on your desktop
