ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation

Note: several slides use animation, so for best display please download and view in PowerPoint.

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • Relying on the entire community of scientists to digest the biomedical literature: identification, filtering, extraction, summarization
  • Structured annotations enable pathway analysis, statistical analyses, cross-species comparisons
  • Tried on 773 GO categories; the result was significant in 356 cases (46%)
  • We extended this analysis to all 773 GO terms used in human gene annotations and found a consistent improvement in the enrichment scores
  • I also want to convince you that the Long Tail of bioinformatics developers is valuable, but first I have to convince you that there is a bottleneck in tool development.
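
The per-term enrichment test mentioned in these notes can be illustrated with the 2×2 table from slide 16 of the transcript (axon guidance, GO:0007411: counts 13, 2, 251, 12033). The sketch below computes a one-sided Fisher's exact test as a hypergeometric upper tail using only the Python standard library. Whether the talk used exactly this test is an assumption, and the function name `hypergeom_upper_tail` is made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(both: int, linked_only: int, go_only: int, neither: int) -> float:
    """One-sided Fisher's exact test on a 2x2 table (assumed test, see lead-in).

    both        -- genes linked through the literature AND annotated with the GO term
    linked_only -- genes linked through the literature but not annotated
    go_only     -- genes annotated but not linked
    neither     -- all remaining genes
    Returns P(overlap >= both) under the hypergeometric null.
    """
    linked = both + linked_only   # row total: linked genes
    in_go = both + go_only        # column total: genes with the GO term
    total = both + linked_only + go_only + neither

    # Sum the probability mass of every table at least as extreme as observed.
    favourable = sum(
        comb(in_go, k) * comb(total - in_go, linked - k)
        for k in range(both, min(linked, in_go) + 1)
    )
    return favourable / comb(total, linked)


# Counts taken from the slide 16 table.
p_value = hypergeom_upper_tail(13, 2, 251, 12033)
print(f"{p_value:.2e}")
```

With the slide 16 counts the result is on the order of 1e-20, the same order of magnitude as the P value reported on that slide. Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product could hit for such small tails.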

ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation Presentation Transcript

  • The Gene Wiki: Crowdsourcing human gene annotation. Andrew Su, Ph.D., The Scripps Research Institute. ISMB Special Session: Harnessing community intelligence for bioinformatics. #ISMB #SS7. July 17, 2012
  • 2 The Long Tail is a prolific source of content. [Figure: content produced vs. contributors (sorted), Short Head and Long Tail.] Short Head vs. Long Tail examples: News: newspapers vs. blogs; Video: TV/Hollywood vs. YouTube; Product reviews: Consumer Reports vs. Amazon reviews; Food reviews: food critics vs. Yelp; Talent judging: Olympics vs. American Idol; Gene annotation: manual curation vs. Gene Wiki
  • 3 We can harness the Long Tail of scientists to directly participate in the gene annotation process.
  • 4 Wikipedia is reasonably accurate
  • 5 Wikipedia has breadth and depth. [Figure: articles and words (millions), Wikipedia vs. Britannica Online.] Source: http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
  • Filtering, extracting, and summarizing PubMed. [Figure labels: Documents, Concepts]
  • 7 Wiki success depends on a positive feedback. [Figure: Gene Wiki page utility, number of users, number of contributors; values shown: 1, 100, 2, 200]
  • 8 10,000 gene “stubs” within Wikipedia. [Figure labels: Utility, Users, Contributors.] Stub contents: protein structure; gene summary; symbols and identifiers; Gene Ontology annotations; protein interactions; tissue expression pattern; linked references; links to structured databases. Huss, PLoS Biol, 2008
  • 9 Gene Wiki has a critical mass of readers. [Figure labels: Utility, Users, Contributors.] Total: ~4.3 million views / month. Huss, PLoS Biol, 2008; Good, NAR, 2011
  • 10 Gene Wiki has a critical mass of editors. [Figure labels: Utility, Users, Contributors; plot of cumulative edits, productive edits, and vandalism.] ~10,000 words added / month; total 1.42 million words ≈ 230 full-length articles; 4.3 million views / month; 1000 edits / month. Good, NAR, 2011
  • 11 A review article for every gene is powerful. Reelin: 98 editors, 703 edits since July 2002; Heparin: 358 editors, 654 edits since June 2003; AMPK: 109 editors, 203 edits since March 2004; RNAi: 394 editors, 994 edits since October 2002. [Figure callouts: hyperlinks to related concepts; references to the literature]
  • 12 Making the Gene Wiki more computable. [Figure labels: Free text, Structured annotations]
  • 13 Filling the gaps in gene annotation. Good, BMC Genomics 2011, 12:603. [Figure labels: NCBI Entrez Gene: 3362; Gene Wiki mapping; Wikilink; Candidate assertion; GO:0004993; GO exact synonym; Annotator]
  • 14 Filling the gaps in gene annotation. Good, BMC Genomics 2011, 12:603. [Figure labels: NCBI Entrez Gene: 334; Gene Wiki mapping; Wikilink; Candidate assertion; GO:0006897; GO exact match; Annotator]
  • 15 Novel GO annotations – so what? Good, BMC Genomics 2011, 12:603. 11,022 annotations mined from Gene Wiki; ~100,000 annotations from GO consortium; 4703 (43%) match known annotations; 6319 “novel” annotations; @ 48-64% specificity
  • 16 Gene Wiki content improves enrichment analysis. [Workflow labels: GO term: axon guidance (GO:0007411); 811 articles; 264 genes; PubMed abstracts; concept recognition; gene list; enrichment analysis.] Contingency table (rows: linked genes through PubMed; columns: annotated with GO:0007411): Yes/Yes 13, Yes/No 2, No/Yes 251, No/No 12033. P = 1.55E-20
  • 17 Gene Wiki content improves enrichment analysis. [Workflow labels: GO term: muscle contraction (GO:0006936); 251 articles; 87 genes; PubMed abstracts; concept recognition; gene list; + Gene Wiki: 87 articles; enrichment analysis.] Linked genes through PubMed: P = 1.0. Linked genes through PubMed + Gene Wiki: P = 1.22E-09
  • 18 Gene Wiki content improves enrichment analysis. [Scatter plot: p-value (PubMed + GW) vs. p-value (PubMed only); regions labeled "More significant with PubMed only" and "More significant with PubMed + GW"; highlighted point: Muscle contraction]
  • 19 Gene Wiki+ for integrative queries. mwsync. http://genewikiplus.org
  • 20 Dynamic queries across genes, diseases, SNPs
  • 21
  • 22 TOP 100 GENES
  • 23 Gene Wiki+ for integrative queries. mwsync; OMIM; PharmGKB. Query shown: {{#ask: [[Category:Human_proteins]] [[is_associated_with:: <q>[[Category:Breast_cancer]]</q>]] [[HasSNP:: … <q>[[is_associated_with:: http://genewikiplus.org
  • 24 Gene Wiki+ for integrative queries. mwsync; OMIM; PharmGKB. http://genewikiplus.org
  • 25 The Long Tail of scientists is a valuable source of information on gene function
  • 26 Crowdsourcing a gene annotation portal
  • 27 Collaborators: Doug Howe, ZFIN; John Hogenesch, U Penn; Jon Huss, GNF; Luca de Alfaro, UCSC; Angel Pizzaro, U Penn; Faramarz Valafar, SDSU; Pierre Lindenbaum, Fondation Jean Dausset; Michael Martone, Rush; Konrad Koehler, Karo Bio; Warren Kibbe, Simon Lim, Northwestern; many Wikipedia editors; WP:MCB Project. Group members: Erik Clarke; Ian Macleod; Ben Good; Max Nanis; Salvatore Loguercio; Chunlei Wu. ISMB travel support. Contact: http://sulab.org, asu@scripps.edu, @andrewsu, +Andrew Su. Funding and Support (BioGPS: GM83924, Gene Wiki: GM089820)
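
The `{{#ask: …}}` text on slide 23 is Semantic MediaWiki inline-query syntax. As a rough sketch, a query of that kind could also be issued from a script through Semantic MediaWiki's `ask` API module. The code below only builds the request URL and does not contact any server. Assumptions: the `/api.php` path on genewikiplus.org, that the site exposes this API at all, and the helper name `build_ask_url`, none of which come from the slides. Only the first complete condition from slide 23 is used, because the rest of the query is truncated there.

```python
from urllib.parse import urlencode

# Conditions copied from slide 23; category and property names are as shown there.
BREAST_CANCER_PROTEINS = (
    "[[Category:Human_proteins]]"
    "[[is_associated_with::<q>[[Category:Breast_cancer]]</q>]]"
)


def build_ask_url(api_endpoint: str, conditions: str, limit: int = 10) -> str:
    """Build a URL for the Semantic MediaWiki `ask` API module.

    api_endpoint is an assumed location of the wiki's api.php script.
    The `query` parameter carries conditions and options separated by '|'.
    """
    query = f"{conditions}|limit={limit}"
    params = {"action": "ask", "query": query, "format": "json"}
    return f"{api_endpoint}?{urlencode(params)}"


# Hypothetical endpoint: the api.php path is an assumption, not from the talk.
url = build_ask_url("http://genewikiplus.org/api.php", BREAST_CANCER_PROTEINS)
print(url)
```

Building the URL separately from sending it keeps the query string easy to inspect and test before pointing it at a live wiki.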