Mining biomedical texts
Lars Juhl Jensen
>10 km
exponential growth
 
 
some things are constant
 
~45 seconds per paper
information retrieval
find the relevant texts
still too much to read
computer
as smart as a dog
teach it specific tricks
 
 
named entity recognition
identify the concepts
comprehensive lexicon
small molecules
proteins
cellular components
organisms
diseases
orthographic variation
“black list”
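The slides above (a comprehensive lexicon, orthographic variation, a black list) can be sketched as a small dictionary-based tagger. This is a minimal illustration under stated assumptions, not the system from the talk: the lexicon entries, the `ID:…` identifiers, the normalization rule (lowercase, drop punctuation and spaces) and the function names `normalize` and `tag` are all hypothetical.

```python
import re

# Hypothetical toy lexicon: name -> (entity type, made-up identifier).
LEXICON = {
    "IL2": ("protein", "ID:1"),
    "WAS": ("protein", "ID:2"),
    "nucleus": ("cellular component", "ID:3"),
}

# Hypothetical black list: names that clash with common English words.
BLACK_LIST = {"was"}


def normalize(name):
    """Collapse orthographic variants: 'IL-2', 'IL 2' and 'il2' all become 'il2'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def tag(text, lexicon, black_list, max_tokens=3):
    """Return (matched text, entity type, identifier) for each lexicon hit."""
    index = {normalize(name): entry for name, entry in lexicon.items()}
    blocked = {normalize(name) for name in black_list}
    tokens = list(re.finditer(r"[A-Za-z0-9]+", text))
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i].start(), tokens[i + n - 1].end()
            key = normalize(text[start:end])
            if key in index and key not in blocked:
                hits.append((text[start:end], *index[key]))
                i += n
                break
        else:
            i += 1  # no match starting at this token
    return hits


hits = tag("IL-2 and il2 were detected; it was in the nucleus.", LEXICON, BLACK_LIST)
```

Here both spellings of the protein name map to the same lexicon entry, while the blacklisted common word "was" is skipped even though it is in the lexicon.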
Reflect.ws
augmented browsing
browser add-on
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
O’Donoghue et al., Journal of Web Semantics, 2010
Firefox
Internet Explorer
Google Chrome
Safari
Utopia Documents
web services
~150 years of publishing
 
dead wood
 
dead e-wood
added value
collaboration
 
 
SciVerse application
 
 
 
 
 
STITCH
Kuhn et al., Nucleic Acids Research, 2010
curated knowledge
drug targets
pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
experimental data
physical interactions
Jensen & Bork, Science, 2008
text mining
co-mentioning
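Co-mentioning can be illustrated by counting how often two tagged entities appear in the same text. This is a minimal sketch with raw counts only; the slides do not say how co-mentions are scored, so the function name `comention_counts` and the placeholder entities `A`, `B`, `C` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def comention_counts(documents):
    """Count, for each unordered entity pair, the documents mentioning both."""
    counts = Counter()
    for entities in documents:
        # Each pair is counted at most once per document.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


# Hypothetical input: the entities already recognized in each abstract.
documents = [
    ["A", "B"],
    ["A", "B", "C", "B"],
    ["C"],
]
counts = comention_counts(documents)
```

In this toy input the pair (A, B) is co-mentioned in two documents, while (A, C) and (B, C) each occur once.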
 
NLP: Natural Language Processing
 
abstracts
full text
restricted access
 
collaboration
electronic patient journals
a hard problem
in Danish
no lexicon
by busy doctors
acronyms
typos
about psychiatric patients
delusions
domain-specific system
F20 F200 Negation Family
diagnoses
patient stratification
Roque et al., PLoS Computational Biology, 2011
disease comorbidity
Roque et al., PLoS Computational Biology, 2011
medication
adverse drug events
pharmacovigilance
phenotype
genotype
Thank you!
Reflect.ws: Sune Frankild, Heiko Horn, Evangelos Pafilis, Michael Kuhn, Reinhardt Schneider, Sean O’Donoghue
SciVerse app: Juan-Carlos Silla-Castro, Sean O’Donoghue
EPJ-mining: Francisco S Roque, Peter B Jensen, Robert Eriksson, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Thomas Werge, Søren Brunak
larsjuhljensen