Turning literature into databases

  1. Turning literature into databases >10 km Lars Juhl Jensen
  2. corpora
  3. 22M abstracts
  4. 1.9M freely available articles
  5. 1.9M Elsevier documents
  6. entity recognition
  7. identify the concepts
  8. comprehensive lexicon
  9. small molecules
  10. proteins
  11. cellular components
  12. tissues
  13. organisms
  14. phenotypes
  15. diseases
  16. orthographic variation
  17. singular vs. plural
  18. flexible matching
  19. spaces and hyphens
  20. “black list”
  21. information extraction
  22. count co-mentioning
  23. within documents
  24. within paragraphs
  25. within sentences
  26. new scoring scheme
  27. STRING v9.1
  28. ~2x better sensitivity
  29. web-centric databases
  30. suite of web interfaces
  31. common backend database
  32. diseases.jensenlab.org
  33. search for a protein
  34. ranked table of diseases
  35. search for a disease
  36. STRING network
  37. evidence viewer
  38. compartments.jensenlab.org
  39. text mining
  40. curated knowledge
  41. sequence-based predictions
  42. visualization
  43. tissues.jensenlab.org
  44. related projects
  45. importance of full text
  46. NIH grant abstracts
  47. electronic patient records
  48. patient stratification
  49. Roque et al., PLoS Computational Biology, 2011
  50. pharmacovigilance
  51. Eriksson et al., in preparation, 2012
  52. Thank you! Sune Frankild, Janos Binder, Kalliopi Tsafou, Peter Bjødstrup Jensen, Robert Eriksson
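
The entity recognition and information extraction slides (lexicon, flexible matching, "black list", co-mention counting within documents, paragraphs, and sentences) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the lexicon entries, the identifiers, the black-listed name, the naive sentence splitter, and the weights are hypothetical, and the weighted sum is only a stand-in, since the slides do not describe the actual scoring scheme used in STRING v9.1.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-lexicon: entity identifier -> names and synonyms.
LEXICON = {
    "PROT:cdk1": ["CDK1", "cyclin-dependent kinase 1"],
    "PROT:impact": ["IMPACT"],
    "DIS:breast_cancer": ["breast cancer", "breast tumor"],
}

# Hypothetical "black list": names that collide with common English words.
BLACK_LIST = {"impact"}


def name_to_pattern(name):
    """Compile a name into a regex that tolerates spaces vs. hyphens
    between tokens and an optional plural 's' (a simplification)."""
    tokens = re.split(r"[\s-]+", name.strip())
    body = r"[\s-]?".join(re.escape(token) for token in tokens)
    return re.compile(r"\b" + body + r"s?\b", re.IGNORECASE)


# One pattern per lexicon name, skipping black-listed names.
PATTERNS = [
    (entity, name_to_pattern(name))
    for entity, names in LEXICON.items()
    for name in names
    if name.lower() not in BLACK_LIST
]


def tag(text):
    """Return the set of entity identifiers mentioned in the text."""
    return {entity for entity, pattern in PATTERNS if pattern.search(text)}


def split_units(document):
    """Split into paragraphs (blank-line separated) and naive sentences."""
    paragraphs = [p for p in re.split(r"\n\s*\n", document) if p.strip()]
    sentences = [
        s
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    return paragraphs, sentences


# Hypothetical weights: closer co-mentions contribute more.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 3.0}


def comention_scores(documents):
    """Sum weighted co-mention counts for every pair of entities."""
    scores = Counter()
    for document in documents:
        paragraphs, sentences = split_units(document)
        levels = {
            "document": [document],
            "paragraph": paragraphs,
            "sentence": sentences,
        }
        for level, units in levels.items():
            # Count each pair at most once per document at each level.
            pairs = set()
            for unit in units:
                pairs.update(combinations(sorted(tag(unit)), 2))
            for pair in pairs:
                scores[pair] += WEIGHTS[level]
    return scores
```

Calling `comention_scores` on a list of document strings returns a `Counter` keyed by sorted pairs of entity identifiers; sorting the items by score gives a ranked table similar in spirit to the protein-disease rankings shown in the web interfaces.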
