ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation

Note, several slides use animation, so for best display please download and view in Powerpoint.

Published in: Education, Technology
Slide notes:
  • Relying on the entire community of scientists to digest the biomedical literature: identification, filtering, extraction, and summarization.
  • Structured annotations enable pathway analysis, statistical analyses, and cross-species comparisons.
  • Tried on 773 GO categories; significant in 356 cases (46%).
  • We extended this analysis to all 773 GO terms used in human gene annotations and found a consistent improvement in the enrichment scores.
  • I also want to convince you that the Long Tail of bioinformatics developers is valuable too, but first I have to convince you that there is a bottleneck in tool development.
  • Transcript of "ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation"

    1. The Gene Wiki: Crowdsourcing human gene annotation. Andrew Su, Ph.D., The Scripps Research Institute. ISMB Special Session: Harnessing community intelligence for bioinformatics. #ISMB #SS7. July 17, 2012
    2. The Long Tail is a prolific source of content. [Figure: content produced vs. contributors (sorted), from Short Head to Long Tail.] Short Head vs. Long Tail examples: News: newspapers vs. blogs; Video: TV/Hollywood vs. YouTube; Product reviews: Consumer Reports vs. Amazon reviews; Food reviews: food critics vs. Yelp; Talent judging: Olympics vs. American Idol; Gene annotation: manual curation vs. Gene Wiki
    3. We can harness the Long Tail of scientists to directly participate in the gene annotation process.
    4. Wikipedia is reasonably accurate
    5. Wikipedia has breadth and depth. [Figure: articles and words (millions), Wikipedia vs. Britannica Online.] Source: http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
    6. Filtering, extracting, and summarizing PubMed. Labels: Documents; Concepts
    7. Wiki success depends on a positive feedback. [Figure labels: Gene wiki page utility; number of contributors; number of users; values shown: 1, 100, 2, 200.]
    8. 10,000 gene “stubs” within Wikipedia. [Cycle: Utility, Users, Contributors.] Stub contents: protein structure; gene summary; symbols and identifiers; Gene Ontology annotations; protein interactions; tissue expression pattern; linked references; links to structured databases. (Huss, PLoS Biol, 2008)
    9. Gene Wiki has a critical mass of readers. [Cycle: Utility, Users, Contributors.] Total: ~4.3 million views / month. (Huss, PLoS Biol, 2008; Good, NAR, 2011)
    10. Gene Wiki has a critical mass of editors. [Cycle: Utility, Users, Contributors.] ~10,000 words added / month; total 1.42 million words ≈ 230 full-length articles; 4.3 million views / month; 1000 edits / month. [Figure labels: cumulative edits; productive edits; vandalism.] (Good, NAR, 2011)
    11. A review article for every gene is powerful. Reelin: 98 editors, 703 edits since July 2002; Heparin: 358 editors, 654 edits since June 2003; AMPK: 109 editors, 203 edits since March 2004; RNAi: 394 editors, 994 edits since October 2002. Callouts: hyperlinks to related concepts; references to the literature
    12. Making the Gene Wiki more computable. Labels: free text; structured annotations
    13. Filling the gaps in gene annotation (Good, BMC Genomics 2011, 12:603). NCBI Entrez Gene: 3362; Gene Wiki mapping; wikilink; candidate assertion: GO:0004993; GO exact synonym; annotator
    14. Filling the gaps in gene annotation (Good, BMC Genomics 2011, 12:603). NCBI Entrez Gene: 334; Gene Wiki mapping; wikilink; candidate assertion: GO:0006897; GO exact match; annotator
    15. Novel GO annotations – so what? (Good, BMC Genomics 2011, 12:603). 11,022 annotations mined from Gene Wiki; ~100,000 annotations from GO consortium; 4703 (43%) match known annotations; 6319 “novel” annotations @ 48-64% specificity
    16. Gene Wiki content improves enrichment analysis. GO term: axon guidance (GO:0007411). Steps shown: gene list (264 genes); PubMed abstracts (811 articles); concept recognition; enrichment analysis. Contingency table (linked genes through PubMed vs. GO:0007411 annotation): linked Yes / GO Yes: 13; linked Yes / GO No: 2; linked No / GO Yes: 251; linked No / GO No: 12033. P = 1.55E-20
    17. Gene Wiki content improves enrichment analysis. GO term: muscle contraction (GO:0006936). Steps shown: gene list (87 genes); PubMed abstracts (251 articles) + Gene Wiki (87 articles); concept recognition; enrichment analysis. GO:0006936, linked genes through PubMed: P = 1.0; GO:0006936, linked genes through PubMed + Gene Wiki: P = 1.22E-09
    18. Gene Wiki content improves enrichment analysis. [Scatter plot: p-value (PubMed + GW) vs. p-value (PubMed only); regions labeled "More significant with PubMed only" and "More significant with PubMed + GW"; highlighted point: muscle contraction.]
    19. Gene Wiki+ for integrative queries. mwsync. http://genewikiplus.org
    20. Dynamic queries across genes, diseases, SNPs
    21. (no text on slide)
    22. TOP 100 GENES
    23. Gene Wiki+ for integrative queries. mwsync; OMIM; PharmGKB. Query: {{#ask: [[Category:Human_proteins]] [[is_associated_with:: <q>[[Category:Breast_cancer]]</q>]] [[HasSNP:: … <q>[[is_associated_with:: http://genewikiplus.org
    24. Gene Wiki+ for integrative queries. mwsync; OMIM; PharmGKB. http://genewikiplus.org
    25. The Long Tail of scientists is a valuable source of information on gene function
    26. Crowdsourcing a gene annotation portal
    27. Collaborators: Doug Howe, ZFIN; John Hogenesch, U Penn; Jon Huss, GNF; Luca de Alfaro, UCSC; Angel Pizzaro, U Penn; Faramarz Valafar, SDSU; Pierre Lindenbaum, Fondation Jean Dausset; Michael Martone, Rush; Konrad Koehler, Karo Bio; Warren Kibbe, Simon Lim, Northwestern; many Wikipedia editors; WP:MCB Project. Group members: Erik Clarke; Ian Macleod; Ben Good; Max Nanis; Salvatore Loguercio; Chunlei Wu. ISMB travel support. Contact: http://sulab.org, asu@scripps.edu, @andrewsu, +Andrew Su. Funding and Support (BioGPS: GM83924, Gene Wiki: GM089820)
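
Slides 13 and 14 outline how candidate GO annotations are mined: a wikilink in a gene's Gene Wiki article is matched against a GO term name ("GO exact match") or synonym ("GO exact synonym"), producing a candidate assertion. The following is a minimal sketch of that idea, not the pipeline from Good, BMC Genomics 2011. The GO labels, the synonym "5-HT receptor", the wikilink lists, and all function names are illustrative assumptions; only the Entrez Gene IDs and GO IDs come from the slides.

```python
from dataclasses import dataclass

# Toy GO vocabulary. The names and synonyms are illustrative assumptions;
# only the GO IDs appear on the slides.
GO_TERMS = {
    "GO:0006897": {"name": "endocytosis", "exact_synonyms": []},
    "GO:0004993": {"name": "serotonin receptor activity",
                   "exact_synonyms": ["5-HT receptor"]},
}


@dataclass(frozen=True)
class CandidateAssertion:
    entrez_gene_id: int
    go_id: str
    match_type: str  # "exact match" or "exact synonym"


def normalize(label):
    """Lowercase and collapse whitespace so labels compare reliably."""
    return " ".join(label.lower().split())


def build_index(go_terms):
    """Map each normalized GO name or synonym to (GO ID, match type)."""
    index = {}
    for go_id, term in go_terms.items():
        # Term names take priority over synonyms on a collision.
        index[normalize(term["name"])] = (go_id, "exact match")
        for synonym in term["exact_synonyms"]:
            index.setdefault(normalize(synonym), (go_id, "exact synonym"))
    return index


def mine_candidates(entrez_gene_id, wikilinks, index):
    """Return one candidate assertion per wikilink that hits the GO index."""
    found = []
    for link in wikilinks:
        hit = index.get(normalize(link))
        if hit is not None:
            found.append(CandidateAssertion(entrez_gene_id, hit[0], hit[1]))
    return found


INDEX = build_index(GO_TERMS)

# Hypothetical wikilinks for the two genes shown on the slides.
WIKILINKS = {
    334: ["Endocytosis", "Amyloid"],
    3362: ["5-HT receptor", "Brain"],
}

CANDIDATES = [
    assertion
    for gene_id, links in WIKILINKS.items()
    for assertion in mine_candidates(gene_id, links, INDEX)
]

for assertion in CANDIDATES:
    print(assertion)
```

Each candidate would still go to an annotator for review, as the "Annotator" label on both slides suggests.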
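
Slide 16 reports a 2x2 contingency table (13, 2, 251, 12033) and a p-value, but does not name the statistical test. A common choice for this kind of enrichment analysis is a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The sketch below assumes that test; the function name and argument names are hypothetical.

```python
from math import comb


def enrichment_p_value(linked_annotated, linked_only, annotated_only, neither):
    """One-sided Fisher's exact test for over-representation.

    Arguments are the four cells of a 2x2 table:
      linked_annotated: genes linked to the term via text AND annotated in GO
      linked_only:      genes linked via text but not annotated
      annotated_only:   genes annotated but not linked via text
      neither:          all remaining genes
    Returns P(overlap >= linked_annotated) under the hypergeometric null.
    """
    total = linked_annotated + linked_only + annotated_only + neither
    annotated = linked_annotated + annotated_only
    linked = linked_annotated + linked_only

    # Sum the hypergeometric probabilities for every overlap at least as
    # large as the observed one. Exact integer arithmetic avoids underflow.
    largest_overlap = min(annotated, linked)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, linked - k)
        for k in range(linked_annotated, largest_overlap + 1)
    )
    return tail / comb(total, linked)


# Cell counts from slide 16 (axon guidance, GO:0007411).
P_AXON_GUIDANCE = enrichment_p_value(13, 2, 251, 12033)
print(f"{P_AXON_GUIDANCE:.3g}")
```

Whether this reproduces the slide's value exactly depends on the test the authors used; the table itself shows why the result is extreme, since 13 of the 15 text-linked genes fall among the 264 annotated genes out of 12,299 in total.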