Integration of biomedical literature and databases

Nordic Conference for Scholarly Communication 2008, Scandic Star Hotel, Lund, Sweden, April 21-23, 2008

Published in: Technology, Health & Medicine

  1. 1. Integration of biomedical literature and databases Lars Juhl Jensen EMBL Heidelberg
  2. 2. why integration?
  3. 3. why biomedicine?
  4. 4. why literature?
  5. 5. why databases?
  6. 6. open access databases
  7. 7. a lot of them
  8. 8. Duncan Hull, nodalpoint.org
  9. 9. PubChem
  10. 11. 19.2 million compounds
  11. 12. GenBank
  12. 14. 85 million sequences
  13. 15. 89 billion nucleotides
  14. 16. UniProt
  15. 18. 5.6 million sequences
  16. 19. PDB
  17. 21. 50000 protein structures
  18. 22. BIND Biomolecular Interaction Network Database
  19. 23. DIP Database of Interacting Proteins
  20. 24. MINT Molecular Interactions Database
  21. 25. IntAct
  22. 26. BioGRID
  23. 28. 204000 interactions
  24. 29. too many
  25. 30. incomplete
  26. 31. literature mining
  27. 32. MEDLINE
  28. 33. 17.9 million citations
  29. 35. too much to read
  30. 36. information retrieval
  31. 37. finding the papers
  32. 40. user-specified query
  33. 41. “yeast AND cell cycle”
  34. 42. stemming
  35. 43. yeast / yeasts
  36. 44. dynamic query expansion
  37. 45. yeast / S. cerevisiae
  38. 46. ranking
  39. 49. Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
  40. 50. no tool will find it
  41. 51. entity recognition
  42. 52. identifying the substance(s)
  43. 53. Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
  44. 54. Cdc28  yeast
  45. 55. Cdc28  cell cycle
  46. 57. synonyms list
  47. 58. orthographic variation
  48. 59. CDC28
  49. 60. Cdc28p
  50. 61. disambiguation
  51. 62. Cdc2
  52. 63. SDS
  53. 65. still too much to read
  54. 66. information extraction
  55. 67. formalizing the facts
  56. 69. co-mentioning
  57. 70. statistical methods
  58. 71. NLP Natural Language Processing
  59. 72. Gene and protein names; cue words for entity recognition; verbs for relation extraction. Example: [nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
  60. 73. Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
  61. 75. yet another database
  62. 76. integration
  63. 77. augmented browsing
  64. 79. semantic tagging
  65. 81. association networks
  66. 84. curated knowledge
  67. 86. genomic context
  68. 87. phylogenetic profiles
  69. 89. gene neighborhood
  70. 91. experimental data
  71. 92. physical interactions
  72. 94. genetic interactions
  73. 96. literature mining
  74. 98. restricted access
  75. 99. Bayesian framework
  76. 101. summary
  77. 102. literature mining is good
  78. 103. data integration is better
  79. 104. open access
  80. 105. Acknowledgments
      • STRING & STITCH: Christian von Mering, Michael Kuhn, Manuel Stark, Samuel Chaffron, Philippe Julien, Tobias Doerks, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork
      • Reflect: Evangelos Pafilis, Michael Kuhn, Sean O’Donoghue, Reinhardt Schneider
      • Natural Language Processing: Jasmin Saric, Rossitza Ouzounova, Isabel Rojas, Peer Bork
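
The entity recognition and co-mentioning slides in the outline above (synonyms list, orthographic variation such as CDC28 / Cdc28p, co-mentioning) can be illustrated with a minimal sketch. This is not the method used in the talk or in any of the tools named in it. The synonyms table, the treatment of Cdk1 as an alias of Cdc28, and the function names `normalize`, `tag_entities` and `co_mentions` are all assumptions made up for illustration, and real systems also need the disambiguation step that the slides mention.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical synonyms list: canonical name -> aliases (illustrative only).
SYNONYMS = {
    "Cdc28": ["Cdc28", "Cdk1"],  # treating Cdk1 as an alias is an assumption
    "Clb2": ["Clb2"],
    "Swe1": ["Swe1"],
    "Cdc5": ["Cdc5"],
}

def normalize(token):
    """Collapse simple orthographic variants: letter case and a trailing
    'p' that directly follows a digit (Cdc28p -> cdc28)."""
    key = token.lower()
    if len(key) > 1 and key.endswith("p") and key[-2].isdigit():
        key = key[:-1]
    return key

# Reverse index from every normalized alias to its canonical name.
LOOKUP = {
    normalize(alias): name
    for name, aliases in SYNONYMS.items()
    for alias in aliases
}

def tag_entities(sentence):
    """Return the set of canonical names mentioned in a sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return {LOOKUP[normalize(t)] for t in tokens if normalize(t) in LOOKUP}

def co_mentions(sentences):
    """Count how often each pair of entities shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        for pair in combinations(sorted(tag_entities(sentence)), 2):
            counts[pair] += 1
    return counts
```

Applied to the example sentence from the slides, `tag_entities` picks out Clb2, Cdc28, Swe1 and Cdc5, and `co_mentions` records one count for each pair among them. Raw co-mention counts say nothing about the type or direction of a relation, which is where the statistical and NLP approaches listed in the outline come in.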