Citizen Science and Rare
Disease Research
Andrew Su, Ph.D.
@andrewsu
asu@scripps.edu
http://sulab.org
September 22, 2016
Talk given at "Personalized Health in the Digital Age" September 22, 2016 at Campus Biotech in Geneva, Switzerland https://www.personalizedhealth2016.ch/

Published in: Science

  1. Citizen Science and Rare Disease Research. Andrew Su, Ph.D. @andrewsu, asu@scripps.edu, http://sulab.org. September 22, 2016, Personalized Health in the Digital Age Symposium. Slides: slideshare.net/andrewsu
  2. Credit: http://www.slideshare.net/PhRMA/rare-disease-infographics
  3. Credit: http://www.slideshare.net/PhRMA/rare-disease-infographics
  4. Rare disease case study #1. Photo: Retta Beery
  5. Bainbridge et al., STM, 2011
  6. Photo: Retta Beery
  7. Rare disease case study #2
  8. … but no obvious treatments
  9. Bainbridge et al., STM, 2011. SPR
  10. What differentiates SPR and NGLY1? SPR
  11. Sarah Olmstead, https://flic.kr/p/364dZW. NGLY1
  12. NGLY1 (11 PubMed articles); Congenital disorders of glycosylation (822); PNGase (686); ERAD (1330); glycosylation (48,862); alacrima (164); Genetic interactors (3016); symptoms (109,928). 25 million articles in PubMed
  13. The biomedical literature is massive… [Chart: number of new PubMed-indexed articles, 1983 to 2013; axis from 0 to 1,200,000]
  14. … but it is very hard to query and compute
  15. … but it is very hard to query and compute. Imatinib, Crizotinib, Erlotinib, Gefitinib, Sorafenib, Lapatinib, Dasatinib, … AND Acute myeloid leukemia, Acute lymphoblastic leukemia, Chronic myelogenous leukemia, Chronic lymphocytic leukemia, Hodgkin lymphoma, Non-Hodgkin lymphoma, Myeloma, …
  16. Personalized medicine relies on effective KNOWLEDGE MANAGEMENT. PietroBellini, https://flic.kr/p/k5jmja
  17. Information extraction from biomedical text. 1. Identify biomedical concepts in text: “… We report a case of familial systemic mastocytosis with the rare KIT K509I germ line mutation. In vitro treatment with imatinib, dasatinib and PKC412 reduced cell viability of primary mast cells harboring KIT K509I mutation. Both patients with familial systemic mastocytosis had remarkable hematological and skin improvement after three months of imatinib treatment.” Leuk Res. 2014 Oct;38(10):1245-51. doi: 10.1016/j.leukres. Concept types: GENES, DISEASES, DRUGS, VARIANTS
  18. Information extraction from biomedical text. 1. Identify biomedical concepts in text: imatinib, dasatinib, PKC412, Familial systemic mastocytosis, KIT, K509I. 2. Identify relationships between concepts: Mutation of Mutation causes causes treats inhibits
  19. Goal: Assemble a network of biomedical knowledge that is comprehensive, current, computable and traceable.
  20. http://www.navy.mil/management/photodb/photos/101104-N-6383T-508.jpg
  21. Crowdsourcing
  22. is to data; is to text; biomedical. “Provide a database of the world’s knowledge that anyone can edit” - Denny Vrandečić
  23. (no text)
  24. Properties: Subclass of (Property:P279), Regulates (Property:P128), Physically interacts with (Property:P129), Decreased expression in (Property:P1910). Items: Protein (Q8054), Neural development (Q1345738), VLDL receptor (Q1979313), Amyloid beta A4 (Q423510), Schizophrenia (Q41112), Bipolar disorder (Q131755), Q13561329 (http://www.wikidata.org/wiki/Q13561329)
  25. Property:P279, Property:P128, Property:P129, Property:P1910. Q8054, Q1345738, Q1979313, Q423510, Q41112, Q131755, Q13561329. https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q13561329&format=json
  26. We are seeding it with biomedical data: • All human and mouse genes and proteins • All Gene Ontology terms • All FDA approved drugs • 9,000+ human diseases • 120 reference microbial genomes. Burgstaller et al (2016) Database (preprint in BioRxiv); Mitraka et al (2015) Semantic Web Applications for the Life Sciences (best paper) (preprint in BioRxiv); Putman et al (2016) Database (preprint in BioRxiv)
  27. Inter-item links form a giant knowledge graph. Everything is connected: Reelin, Heart disease, Barack Obama, everything. https://query.wikidata.org (SPARQL endpoint for Wikidata)
  28. Crowdsourcing
  29. Question: Can a group of non-scientists collectively perform concept recognition in biomedical texts?
  30. Experts versus crowd for concept identification. 593 PubMed abstracts; 6,900 mentions of “disease concepts”. F = 0.87; F = 0.78. $$$
  31. Experts versus crowd for concept identification. 593 PubMed abstracts; 6,900 mentions of “disease concepts”. F = 0.87; F = 0.87. $$$ • 9 days • 145 workers • Total: $630.96
  32. http://mark2cure.org
  33. Paid crowdsourcing ($$$): • F = 0.87 • 9 days • 145 workers • Total: $630.96. Citizen Science (“Help science, please”): • F = 0.84 • 28 days • 212 workers • Total cost: $0
  34. Mapping the biomedical network around NGLY1. NGLY1
  35. http://mark2cure.org
  36. A preliminary view of the NGLY1-focused biological network. 1,200 contributors; 3,200 documents; 787,400 annotations
  37. Personalized medicine relies on effective KNOWLEDGE MANAGEMENT. PietroBellini, https://flic.kr/p/k5jmja
  38. “If I have seen further than others, it is by standing on the shoulders of giants.” - Sir Isaac Newton
  39. Other group members: Jake Bruggeman, Karthik G, Ramya Gamini, Louis Gioia, Toby Li, Greg Stupp. Gene Wiki: Ben Good, Sebastian Burgstaller, Tim Putman, Núria Queralt Rosinach, Julia Turner, Andra Waagmeester. BioThings API: Chunlei Wu, Julee Adesara, Cyrus Afrasiabi, Sebastien Lelong, Mike Mayers, Kevin Xin. Mark2Cure: Jennifer Fouquier, Max Nanis, Ginger Tsueng, AMT volunteers and Mark2Curators! Matt and Cristina Might, NGLY1 community. Funding and Support: BioGPS: GM83924; Gene Wiki: GM089820; BD2K COE: GM114833. Contact: http://sulab.org, asu@scripps.edu, @andrewsu. Slides: slideshare.net/andrewsu. Icon credits (Noun Project, Wikimedia Commons): Zach VanDeHey, hunotika, Viktorvoigt, Alberto Rojas, Lloyd Humphreys
  40. Why do I Mark2Cure? “I am retired, have a doctorate in medical humanities, and have two children with Gaucher disease. I am just looking for some way to put my education to use.” “My 4 year old daughter Phoebe is living with and battling rare disease.” “I have Ehlers Danlos Syndrome. I hope to help people learn about this painful and debilitating disorder, so that others like me can receive more effective medical care.” “Take part in something that helps humanity.” “I Mark2Cure in memory of my son Mike who had type 1 diabetes.” “Studied biology in college and I really miss it!” “In memory of my daughter who had Cystic Fibrosis” “Give back”
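
Slides 24 and 25 describe Wikidata items and properties by identifier and show a `wbgetentities` API URL. The Python sketch below rebuilds that URL from its parameters and stores a few statements as (subject, property, object) triples. The identifiers and labels come from the slides; the specific edges in `TRIPLES` are an assumed reading of the slide 24 diagram, and the helper names `entity_url` and `objects_of` are hypothetical.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# Labels as paired with identifiers on slide 24.
LABELS = {
    "P279": "subclass of",
    "P128": "regulates",
    "P129": "physically interacts with",
    "P1910": "decreased expression in",
    "Q8054": "protein",
    "Q1345738": "neural development",
    "Q1979313": "VLDL receptor",
    "Q423510": "amyloid beta A4",
    "Q41112": "schizophrenia",
    "Q131755": "bipolar disorder",
}

# Assumed edges: the slide text does not state which items each arrow connects.
TRIPLES = [
    ("Q13561329", "P279", "Q8054"),
    ("Q13561329", "P128", "Q1345738"),
    ("Q13561329", "P129", "Q1979313"),
    ("Q13561329", "P129", "Q423510"),
    ("Q13561329", "P1910", "Q41112"),
    ("Q13561329", "P1910", "Q131755"),
]


def entity_url(qid):
    """Build the wbgetentities URL shown on slide 25 for one item."""
    params = {"action": "wbgetentities", "ids": qid, "format": "json"}
    return f"{WIKIDATA_API}?{urlencode(params)}"


def objects_of(subject, prop):
    """Return labels of all objects linked from `subject` by `prop`."""
    return [LABELS.get(o, o) for s, p, o in TRIPLES if s == subject and p == prop]


print(entity_url("Q13561329"))
print(objects_of("Q13561329", "P129"))
```

Keeping statements as plain triples mirrors the item-property-item structure the slides describe, so the same list could later be filled from the JSON the URL returns rather than typed by hand.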

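
Slides 30 to 33 compare annotators by an F score. Assuming F there is the usual balanced F-measure (the harmonic mean of precision and recall), the Python sketch below scores a set of predicted concept mentions against a reference set. The mention spans are made-up illustrations, not data from the talk, and the function name `f_measure` is hypothetical.

```python
def f_measure(predicted, reference):
    """Balanced F-measure of predicted mentions against reference mentions."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # Also covers empty inputs, avoiding a division by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical mentions as (start, end, text) character spans.
reference = {
    (20, 50, "familial systemic mastocytosis"),
    (65, 68, "KIT"),
    (69, 74, "K509I"),
}
crowd = {
    (20, 50, "familial systemic mastocytosis"),
    (65, 68, "KIT"),
    (100, 108, "imatinib"),
}

print(round(f_measure(crowd, reference), 2))
```

Exact span matching is the strictest choice; a scorer that credits partially overlapping spans would give higher values on the same annotations.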