Network biology: Large-scale data and text mining

Published in: Science

  1. Network biology: Large-scale data and text mining (Lars Juhl Jensen)
  2. guilt by association
  3. protein networks
  4. STRING
  5. computational predictions
  6. gene fusion
  7. Korbel et al., Nature Biotechnology, 2004
  8. gene neighborhood
  9. Korbel et al., Nature Biotechnology, 2004
  10. phylogenetic profiles
  11. Korbel et al., Nature Biotechnology, 2004
  12. experimental data
  13. gene coexpression
  14. protein interactions
  15. Jensen & Bork, Science, 2008
  16. curated knowledge
  17. complexes
  18. pathways
  19. Letunic & Bork, Trends in Biochemical Sciences, 2008
  20. many databases
  21. different formats
  22. different identifiers
  23. variable quality
  24. not comparable
  25. not same species
  26. hard work
  27. quality scores
  28. von Mering et al., Nucleic Acids Research, 2005
  29. calibrate vs. gold standard
  30. von Mering et al., Nucleic Acids Research, 2005
  31. homology-based transfer
  32. Franceschini et al., Nucleic Acids Research, 2013
  33. missing most of the data
  34. text mining
  35. >10 km
  36. too much to read
  37. computer
  38. as smart as a dog
  39. teach it specific tricks
  40. named entity recognition
  41. comprehensive lexicon
  42. CDC2
  43. cyclin dependent kinase 1
  44. expansion rules
  45. hCdc2
  46. CDC2
  47. flexible matching
  48. cyclin-dependent kinase 1
  49. cyclin dependent kinase 1
  50. “black list”
  51. SDS
  52. augmented browsing
  53. Reflect
  54. browser add-on
  55. real-time text mining
  56. Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009; O’Donoghue et al., Journal of Web Semantics, 2010
  57. information extraction
  58. co-mentioning
  59. within documents
  60. within paragraphs
  61. within sentences
  62. text corpus
  63. ~22 million abstracts
  64. no access
  65. millions of full-text articles
  66. localization and disease
  67. general approach
  68. COMPARTMENTS
  69. TISSUES
  70. DISEASES
  71. curated knowledge
  72. experimental data
  73. text mining
  74. computational predictions
  75. common identifiers
  76. quality scores
  77. visualization
  78. compartments.jensenlab.org
  79. tissues.jensenlab.org
  80. dissemination
  81. web interfaces
  82. web services
  83. diseases.jensenlab.org
  84. bulk download
  85. Acknowledgments. STRING: Christian von Mering, Damian Szklarczyk, Michael Kuhn, Manuel Stark, Samuel Chaffron, Chris Creevey, Jean Muller, Tobias Doerks, Philippe Julien, Alexander Roth, Milan Simonovic, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork. Text mining: Sune Frankild, Evangelos Pafilis, Kalliopi Tsafou, Alberto Santos, Janos Binder, Heiko Horn, Michael Kuhn, Nigel Brown, Reinhardt Schneider, Sean O’Donoghue
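
The named entity recognition slides (comprehensive lexicon, expansion rules, flexible matching, "black list") can be read as a dictionary-based tagger. The following is a minimal sketch under that reading, not the tagger used in the talk. The identifiers (`GENE:cdc2`, `GENE:sds`), the single "h" prefix rule, and all function names are assumptions made for illustration; only the example names CDC2, hCdc2, cyclin-dependent kinase 1, and SDS come from the slides.

```python
import re

# Toy lexicon mapping names to identifiers. The identifiers are placeholders.
LEXICON = {
    "CDC2": "GENE:cdc2",
    "cyclin dependent kinase 1": "GENE:cdc2",
    "SDS": "GENE:sds",
}

# "Black list": names that are too ambiguous to tag (SDS is usually the detergent).
BLACKLIST = {"sds"}

# A token is a run of letters/digits, optionally joined by internal hyphens.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def normalize(name):
    """Flexible matching: ignore case and treat hyphens and whitespace alike."""
    return re.sub(r"[-\s]+", " ", name).strip().lower()


def expand(name):
    """Expansion rule (assumed): single-word symbols may carry an 'h' prefix."""
    variants = {name}
    if " " not in name:
        variants.add("h" + name)
    return variants


def build_index(lexicon, blacklist):
    """Map every normalized name variant to its identifier, skipping blacklisted names."""
    index = {}
    for name, ident in lexicon.items():
        if normalize(name) in blacklist:
            continue
        for variant in expand(name):
            index[normalize(variant)] = ident
    return index


def tag(text, index, max_words=4):
    """Greedy left-to-right tagging that prefers the longest matching span."""
    spans = [m.span() for m in TOKEN.finditer(text)]
    hits = []
    i = 0
    while i < len(spans):
        for n in range(min(max_words, len(spans) - i), 0, -1):
            start, end = spans[i][0], spans[i + n - 1][1]
            ident = index.get(normalize(text[start:end]))
            if ident is not None:
                hits.append((text[start:end], ident))
                i += n
                break
        else:
            i += 1
    return hits
```

Calling `tag("hCdc2, also called cyclin-dependent kinase 1, was run on an SDS gel.", build_index(LEXICON, BLACKLIST))` tags `hCdc2` through the prefix rule and `cyclin-dependent kinase 1` through hyphen-insensitive matching, and it leaves `SDS` untagged because of the black list.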

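
The information extraction slides list co-mentioning within documents, within paragraphs, and within sentences. A rough sketch of counting co-mentions at those three levels is shown below. It only counts pairs and does not reproduce any scoring scheme from the talk. The entity names (`GENEA` to `GENED`), the blank-line paragraph split, and the punctuation-based sentence split are simplifying assumptions.

```python
import re
from collections import Counter
from itertools import combinations

LEVELS = ("document", "paragraph", "sentence")


def mentioned(text, names):
    """Return the subset of names that occur as whole words in text."""
    return {
        name
        for name in names
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE)
    }


def split_levels(document):
    """Split a document into text units at each level (naive splitting)."""
    paragraphs = [p for p in re.split(r"\n\s*\n", document) if p.strip()]
    sentences = [
        s
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    return {"document": [document], "paragraph": paragraphs, "sentence": sentences}


def comention_counts(documents, names):
    """Count, per level, how many text units mention both members of each pair."""
    counts = {level: Counter() for level in LEVELS}
    for document in documents:
        for level, units in split_levels(document).items():
            for unit in units:
                for pair in combinations(sorted(mentioned(unit, names)), 2):
                    counts[level][pair] += 1
    return counts
```

For the toy document `"GENEA binds GENEB. GENEC is nearby.\n\nGENED was also measured."`, the pair (GENEA, GENED) is counted only at the document level, (GENEA, GENEC) at the document and paragraph levels, and (GENEA, GENEB) at all three, which illustrates why closer co-mentions can be treated as stronger evidence.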