Biological databases: Challenges in organization and usability

Published in: Science

  1. Biological databases: Challenges in organization and usability. Lars Juhl Jensen
  2. Ph.D.
  3. postdoc
  4. staff scientist
  5. group leader
  6. cofounder
  7. challenges
  8. buzzword du jour
  9. big data
  10. semantic web
  11. cognitive computing
  12. Underpants Gnomes
  13. elephant in the room
  14. heterogeneous data
  15. many databases
  16. different formats
  17. different identifiers
  18. variable quality
  19. difficult to interpret
  20. organization
  21. identifier mapping
  22. pick a reference
  23. map all else to that
  24. hard work
  25. database import
  26. automatic updating
  27. separate parsers
  28. error checking
  29. formats change
  30. unstructured data
  31. text mining
  32. dictionary-based methods
  33. co-occurrence statistics
  34. steep learning curve
  35. quality assessment
  36. high error rates
  37. don’t filter it
  38. score it
  39. von Mering et al., Nucleic Acids Research, 2005
  40. calibrate vs. gold standard
  41. von Mering et al., Nucleic Acids Research, 2005
  42. control error rate
  43. improves comparability
  44. helps interpretation
  45. usability
  46. for bioinformaticians
  47. common identifiers
  48. common format
  49. cannot ask for more
  50. for biologists
  51. web interfaces
  52. unified information portal
  53. nobody will use it
  54. focused resources
  55. STRING
  56. protein associations
  57. computational predictions
  58. Korbel et al., Nature Biotechnology, 2004
  59. experimental data
  60. Jensen & Bork, Science, 2008
  61. curated knowledge
  62. Letunic & Bork, Trends in Biochemical Sciences, 2008
  63. text mining
  64. >10 km
  65. general approach
  66. COMPARTMENTS
  67. TISSUES
  68. DISEASES
  69. visualization
  70. quick overview
  71. protein networks
  72. string-db.org
  73. subcellular localization
  74. compartments.jensenlab.org
  75. tissue expression
  76. tissues.jensenlab.org
  77. access to more details
  78. tables are boring
  79. summary
  80. common identifiers
  81. quality scores
  82. focused resources
  83. visualization
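
The "quality scores" point (slides 35 to 44: don't filter it, score it, calibrate vs. gold standard) can be illustrated with a minimal sketch. This is not the procedure from the cited papers; it is a simple binning approach chosen for illustration. The names `calibrate` and `confidence`, the protein labels, the raw scores, the bin edges, and the toy gold standard are all hypothetical. The idea is to keep every prediction and translate its raw score into the fraction of gold-standard pairs observed among predictions with similar raw scores.

```python
from bisect import bisect_right


def calibrate(scored_pairs, gold_positive, bin_edges):
    """Per raw-score bin, return the fraction of pairs found in the gold standard.

    Bins are (-inf, e0), [e0, e1), ..., [e_last, +inf) for sorted bin_edges.
    An empty bin yields None, since no estimate is possible there.
    """
    n_bins = len(bin_edges) + 1
    hits = [0] * n_bins
    totals = [0] * n_bins
    for pair, raw_score in scored_pairs:
        b = bisect_right(bin_edges, raw_score)
        totals[b] += 1
        if pair in gold_positive:
            hits[b] += 1
    return [h / t if t else None for h, t in zip(hits, totals)]


def confidence(raw_score, bin_edges, calibration):
    """Look up the calibrated confidence for a raw score."""
    return calibration[bisect_right(bin_edges, raw_score)]


# Hypothetical toy data: unordered protein pairs with raw scores
# from some evidence channel (all values are made up).
scored_pairs = [
    (frozenset({"A", "B"}), 0.9),
    (frozenset({"A", "C"}), 0.8),
    (frozenset({"B", "C"}), 0.4),
    (frozenset({"C", "D"}), 0.3),
    (frozenset({"A", "D"}), 0.1),
    (frozenset({"B", "D"}), 0.2),
]
# Hypothetical gold standard of pairs assumed to be true associations.
gold_positive = {
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
}
bin_edges = [0.25, 0.5]

calibration = calibrate(scored_pairs, gold_positive, bin_edges)
print(calibration)  # [0.0, 0.5, 1.0]
print(confidence(0.45, bin_edges, calibration))  # 0.5
```

Because every evidence type ends up on the same calibrated scale, scores from different sources become comparable, and users can choose their own cutoff instead of inheriting a hard filter.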
