Networks of proteins and diseases
Lars Juhl Jensen
sequence analysis
protein networks
de Lichtenberg, Jensen et al., Science, 2005
adverse drug reactions
Campillos, Kuhn et al., Science, 2008
group leader
cofounder
data mining
proteomics
text mining
biomedical literature
electronic health records
protein networks
guilt by association
STRING
computational predictions
gene fusion
Korbel et al., Nature Biotechnology, 2004
gene neighborhood
Korbel et al., Nature Biotechnology, 2004
phylogenetic profiles
Korbel et al., Nature Biotechnology, 2004
experimental data
gene coexpression
protein interactions
Jensen & Bork, Science, 2008
curated knowledge
complexes
pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
many databases
different formats
different identifiers
variable quality
not comparable
hard work
quality scores
von Mering et al., Nucleic Acids Research, 2005
calibrate vs. gold standard
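Calibrating against a gold standard can be sketched as follows. This is a minimal illustration, not the published STRING procedure: the function name `calibrate`, the bin edges, and the toy protein pairs are all assumptions. Raw scores from one evidence channel are binned, and each bin is assigned the fraction of benchmarkable pairs that the gold standard marks as true.

```python
from bisect import bisect_right


def calibrate(raw_scores, gold_standard, bin_edges):
    """Return, per score bin, the fraction of gold-standard pairs that are true.

    raw_scores:    dict mapping a pair to its raw score (hypothetical input format)
    gold_standard: dict mapping a pair to True/False
    bin_edges:     sorted list of thresholds separating the bins
    """
    hits = [0] * (len(bin_edges) + 1)
    totals = [0] * (len(bin_edges) + 1)
    for pair, score in raw_scores.items():
        if pair not in gold_standard:
            continue  # pairs outside the gold standard cannot be benchmarked
        b = bisect_right(bin_edges, score)
        totals[b] += 1
        hits[b] += gold_standard[pair]
    # Bins without any benchmarkable pair get None instead of a made-up value.
    return [h / t if t else None for h, t in zip(hits, totals)]


# Toy data, invented for illustration only.
raw = {("A", "B"): 0.1, ("A", "C"): 0.2, ("B", "C"): 0.9,
       ("C", "D"): 0.8, ("D", "E"): 0.7, ("X", "Y"): 0.95}
gold = {("A", "B"): False, ("A", "C"): False, ("B", "C"): True,
        ("C", "D"): True, ("D", "E"): False}
calibrated = calibrate(raw, gold, bin_edges=[0.5])
```

Once every evidence type is expressed on the same probability-like scale, scores from different sources become comparable.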
missing most of the data
text mining
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
CDC2
cyclin dependent kinase 1
expansion rules
hCdc2
CDC2
flexible matching
cyclin-dependent kinase 1
cyclin dependent kinase 1
“black list”
SDS
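The named entity recognition steps above (lexicon, expansion rules, flexible matching, black list) can be sketched as a small dictionary tagger. This is an illustrative sketch under stated assumptions, not the actual tagger: the identifier `PROT:0001`, the "h" prefix rule, case-insensitive matching, and all function names are hypothetical.

```python
import re

# Hypothetical lexicon: name -> identifier. The identifiers are made up.
LEXICON = {
    "CDC2": "PROT:0001",
    "cyclin dependent kinase 1": "PROT:0001",
    "SDS": "PROT:0002",
}
# Names that are too ambiguous to tag (SDS is also a common lab reagent).
BLACK_LIST = {"sds"}


def normalize(name):
    """Flexible matching: ignore case and treat hyphens like spaces."""
    return re.sub(r"[-\s]+", " ", name).strip().lower()


def build_index(lexicon, black_list):
    index = {}
    for name, identifier in lexicon.items():
        key = normalize(name)
        if key in black_list:
            continue
        index[key] = identifier
        # Assumed expansion rule: single-word symbols may carry an "h"
        # (human) prefix, so CDC2 also matches hCdc2.
        if " " not in key:
            index["h" + key] = identifier
    return index


def tag(text, index, max_words=4):
    """Return (matched name, identifier) pairs, preferring the longest match."""
    words = normalize(re.sub(r"[^\w\s-]", " ", text)).split()
    hits, i = [], 0
    while i < len(words):
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in index:
                hits.append((candidate, index[candidate]))
                i += n
                break
        else:
            i += 1  # no name starts at this word
    return hits


index = build_index(LEXICON, BLACK_LIST)
found = tag("hCdc2 (cyclin-dependent kinase 1) was run on an SDS gel.", index)
```

In this toy sentence both spellings of the kinase resolve to the same identifier, while the black-listed "SDS" is skipped.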
augmented browsing
Reflect
browser add-on
real-time text mining
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
O’Donoghue et al., Journal of Web Semantics, 2010
information extraction
co-mentioning
within documents
within paragraphs
within sentences
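Co-mentioning at the three levels above can be sketched as a weighted count. The weights, the nested-list corpus format, and the function names are assumptions made for illustration; the sketch only shows the idea that a shared sentence is stronger evidence than a shared paragraph or document.

```python
from collections import Counter
from itertools import combinations

# Hypothetical weights: closer co-mentions count more.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}


def pairs(entities):
    """All unordered entity pairs, in a canonical (sorted) order."""
    return combinations(sorted(entities), 2)


def comention_scores(corpus):
    """corpus: list of documents; document: list of paragraphs;
    paragraph: list of sentences; sentence: set of tagged entity identifiers."""
    scores = Counter()
    for document in corpus:
        doc_entities = set()
        for paragraph in document:
            par_entities = set()
            for sentence in paragraph:
                for pair in pairs(sentence):
                    scores[pair] += WEIGHTS["sentence"]
                par_entities |= sentence
            for pair in pairs(par_entities):
                scores[pair] += WEIGHTS["paragraph"]
            doc_entities |= par_entities
        for pair in pairs(doc_entities):
            scores[pair] += WEIGHTS["document"]
    return scores


# Toy corpus of two documents, invented for illustration.
corpus = [
    [[{"A", "B"}, {"C"}], [{"A"}]],
    [[{"A"}, {"C"}]],
]
scores = comention_scores(corpus)
```

Each sentence, paragraph, and document in which a pair co-occurs adds its weight, so a pair mentioned in the same sentence also collects the paragraph and document contributions.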
text corpus
~22 million abstracts
no access
~4 million full-text articles
localization and disease
general approach
COMPARTMENTS
TISSUES
DISEASES
curated knowledge
experimental data
text mining
computational predictions
common identifiers
quality scores
visualization
compartments.jensenlab.org
tissues.jensenlab.org
dissemination
web interfaces
web services
diseases.jensenlab.org
bulk download
disease networks
medical data
electronic health records
central registries
individual hospitals
Jensen et al., Nature Reviews Genetics, 2012
structured data
Jensen et al., Nature Reviews Genetics, 2012
unstructured data
in Danish
by busy doctors
confounding factors
age and gender
reporting bias
custom dictionaries
typo rules
age/gender matching
comorbidity
Jensen et al., Nature Reviews Genetics, 2012
Roque et al., PLOS Computational Biology, 2011
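One generic way to quantify comorbidity while controlling for age and gender is to compare the observed number of patients with both diagnoses to the number expected if the diagnoses were independent within each age/gender stratum. The sketch below illustrates that idea only; it is not claimed to be the method of the cited papers, and the function name, the stratum labels, and the toy patients are assumptions.

```python
from collections import defaultdict


def stratified_relative_risk(patients, disease_a, disease_b):
    """patients: iterable of (stratum, set_of_diagnosis_codes).

    Returns observed / expected co-occurrence, where the expectation
    assumes independence of the two diagnoses within each stratum.
    """
    n = defaultdict(int)  # patients per stratum
    a = defaultdict(int)  # patients with disease_a per stratum
    b = defaultdict(int)  # patients with disease_b per stratum
    observed = 0
    for stratum, diagnoses in patients:
        has_a = disease_a in diagnoses
        has_b = disease_b in diagnoses
        n[stratum] += 1
        a[stratum] += has_a
        b[stratum] += has_b
        observed += has_a and has_b
    expected = sum(a[s] * b[s] / n[s] for s in n)
    return observed / expected if expected else float("nan")


# Toy cohort, invented for illustration; codes are used as plain labels.
cohort = [
    (("60-69", "F"), {"E11", "I10"}),
    (("60-69", "F"), {"E11", "I10"}),
    (("60-69", "F"), {"E11"}),
    (("60-69", "F"), set()),
    (("20-29", "M"), set()),
    (("20-29", "M"), set()),
    (("20-29", "M"), {"I10"}),
    (("20-29", "M"), set()),
]
rr = stratified_relative_risk(cohort, "E11", "I10")
```

A value above 1 means the two diagnoses co-occur more often than their per-stratum frequencies alone would predict.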
temporal correlation
diagnosis trajectories
Jensen et al., in preparation, 2013
pharmacovigilance
adverse drug reactions
Eriksson et al., submitted, 2013
ADR profiles
Eriksson et al., submitted, 2013
ADR frequencies
Eriksson et al., submitted, 2013
molecular basis
protein networks
Acknowledgments
STRING: Christian von Mering, Damian Szklarczyk, Michael Kuhn, Manuel Stark, Samuel Chaffron, Chris Creevey, Jean Muller, Tobias Doerks, Philippe Julien, Alexander Roth, Milan Simonovic, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork
Text mining: Sune Frankild, Evangelos Pafilis, Kalliopi Tsafou, Alberto Santos, Janos Binder, Heiko Horn, Michael Kuhn, Nigel Brown, Reinhardt Schneider, Sean O’Donoghue
EHR mining: Anders Boeck Jensen, Peter Bjødstrup Jensen, Francisco S. Roque, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Tudor Oprea, Pope Moseley, Thomas Werge, Søren Brunak
Thank you!