Games for Human Gene Annotation

Structured gene annotations are a foundation on which many bioinformatics and statistical analyses are built. However, they remain sparse in comparison to the total knowledge that could be captured. As centralized biocuration efforts struggle to keep up with the rate of biomedical data generation, new models for gene annotation need to be explored.
Recently, online games have emerged as an effective way to recruit, engage and organize contributors to help address difficult challenges like online image tagging (ESP Game), protein folding (Foldit), or multiple sequence alignment (Phylo).
We present here two online games, Dizeez and GenESP, aimed at identifying novel gene-disease annotations, i.e. gene-disease links that are well established in the literature but not yet reflected as structured annotations. Preliminary results are provided from game play online and at scientific conferences. These data suggest that even after limited game play, novel gene-disease annotations can be mined from game playing logs.
Both games are available at http://genegames.org.
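
As a rough illustration of how annotations might be mined from game logs, the sketch below follows the "quality through replication" idea from the slides: count how often each gene-disease pair was collected, keep pairs collected more than once, and drop pairs already present in a reference set such as OMIM or PharmGKB. The function name, the log format (a list of gene/disease tuples), the case normalization, and the placeholder entries `GENE_A` and `GENE_B` are assumptions made for illustration; they are not taken from the Dizeez code base.

```python
from collections import Counter


def mine_candidate_annotations(game_log, known_pairs, min_occurrences=2):
    """Return ((gene, disease), count) for replicated pairs not in known_pairs.

    game_log and known_pairs are iterables of (gene, disease) string tuples.
    This is a hypothetical helper, not the authors' implementation.
    """
    # Normalize case so "ABCB5" and "abcb5" count as the same gene symbol.
    counts = Counter((gene.upper(), disease.lower()) for gene, disease in game_log)
    known = {(gene.upper(), disease.lower()) for gene, disease in known_pairs}
    # Keep pairs seen often enough that are missing from the reference set.
    candidates = [
        (pair, n)
        for pair, n in counts.items()
        if n >= min_occurrences and pair not in known
    ]
    # Most frequently collected pairs first, ties broken alphabetically.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Toy log: the ABCB5 pair is the example given in the slides; the
# GENE_A and GENE_B entries are made-up placeholders.
toy_log = [
    ("ABCB5", "acute myeloid leukemia"),
    ("abcb5", "Acute myeloid leukemia"),
    ("GENE_A", "disease x"),
    ("GENE_A", "disease x"),
    ("GENE_B", "disease y"),
]
toy_known = [("GENE_A", "disease x")]

result = mine_candidate_annotations(toy_log, toy_known)
print(result)
```

On this toy input only the ABCB5 pair survives: the `GENE_A` pair is already in the reference set and the `GENE_B` pair was collected only once. Whether replication should be counted per click or per distinct player is not stated in the source; the sketch counts every logged pair.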

Published in: Technology

Games for Human Gene Annotation

  1. Games for Human Gene Annotation. Salvatore Loguercio*, Benjamin Good, Andrew Su. Department of Molecular and Experimental Medicine, The Scripps Research Institute. ISMB Bio-Ontologies SIG, July 13, 2012.
  2. Growth of potential annotations. PubMed in 2012: > 21 million articles. Approaching 1 million new articles per year (>1/minute). [Chart: number of articles added to PubMed, axis from 500,000 to 1,000,000]
  3. Average capacity of human scientist. [Chart: number of articles read by typical scientist, 1979 to 2009]
  4. 1.5% of PubMed* cited by GO annotations. *311,696 articles (2011). [Diagram labels: GO, PubMed]
  5. Sooner or later, the research community will need to be involved in the annotation effort to scale up to the rate of data generation.
  6. How to involve the community in gene annotation?
  7. Crowdsourcing Biology
  8. Gene Wiki: Comprehensively organize knowledge of all human genes
  9. Gene annotation portal for aggregating gene-centric online content. http://biogps.org
  10. Biological games: Build scientific knowledge through game play
  11. Why games?
  12. It is estimated that 9 billion hours are spent playing Solitaire every year
  13. Seven million human hours. http://www.flickr.com/photos/archana3k1/4124330493/
  14. Twenty million human hours. http://www.flickr.com/photos/ableman/2171326385/
  15. 150 billion human hours per year. http://www.flickr.com/photos/rvp-cw/6243289302/
  16. Can we harness some of this time and energy?
  17. Games with a purpose
  18. Label all images on the Web. Devise protein folding algorithms. Fix multiple sequence alignments. Design RNA molecules.
  19. Annotate all human genes
  20. Record the relevant properties of each gene in a manner that facilitates computation:
      • biological process
      • molecular function
      • cellular localization
      • interaction partners
      • disease relevance
      • genomic location
      • genetic variations
      • post translational modifications
      • related drugs
      • related publications
      • ...
  21. Dizeez: gene-disease association quiz
  22. DIZEEZ: gene-disease association quiz. Click the related disease. If it's 'right', you get points, then on to the next question. Hurry! http://genegames.org
  23. Gameplay
      • Advertised with a blog post, a few tweets and conference poster
      • Results since Dec. 2011:
        - 180 people have played it
        - 713 one minute game rounds have been completed
        - 5,282 distinct gene-disease associations collected
  24. Quality through replication
      • Distinct gene-disease pairs collected: 5,282
      • Collected more than once: 482
      • Potential new annotations (do not appear in OMIM, PharmGKB): 223
      • Example: ABCB5, acute myeloid leukemia
  25. Novel annotations - I. Columns: # Occurrences, Gene, Disease, Pubmed, OMIM, PharmGKB, Gene Wiki. Legend: 2010 or later.
      • 7, GAST, gastrinoma
      • 7, RBP3, retinoblastoma
      • 7, SSX1, synovial sarcoma
      • 6, TG, Graves disease
      • 6, CRYGC, cataract
      • 6, SOX8, mental retardation
      • 6, WRN, Werner syndrome
      • 6, ABL1, leukemia
      • 6, MLL3, leukemia
      • 6, SNAI2, breast carcinoma
  26. Novel annotations - II. Columns: # Occurrences, Gene, Disease, Pubmed, OMIM, PharmGKB, Gene Wiki. Legend: 2009 or later.
      • 2, ABCB5, acute myeloid leukemia
      • 2, HOXB7, leukemia
      • 2, SULF1, carcinoma
      • 2, ALPP, retinoblastoma
      • 2, FOXM1, melanoma
  27. Current limitations
      • Dizeez actually punishes desired behavior (adding new, unknown associations) by not awarding points
      • Does not allow player to enter associations other than those in the provided list
      • GenESP fixes both problems
  28. GenESP: gene-concept association with a partner
  29. Gene-concept association with a partner (modeled after the ESP Game). See: Ahn and Dabbish (2004) Labeling images with a computer game, SIGCHI. http://genegames.org
  30. A re-usable pattern: Gene - Disease; Gene - Function; Gene - Gene relationship. The Gene Wiki Hairball!
  31. Getting players
  32. Getting players:
      • Multiplayer Online - "Farmville for gene annotation"
      • Social gaming
      • Educating players
      • Arena mode
      • Building a community
      • labs vs. labs
      • Dizeez: TP53! SOX2!
  33. Epilogue: Crowdsourcing for knowledge acquisition
  34. Traditional model: Data, small expert group, Knowledge. Crowdsourced model: Data and contributors, Knowledge.
  35. Traditional model. Crowdsourced model. Computation.
  36. Annotate all human genes
  37. Special thanks to:
      • Su Lab @ TSRI: Erik Clarke, Ian Macleod, Ben Good, Max Nanis, Chunlei Wu, Andrew Su
      • Crowdsourcing Biology @ GSoC 2012! Students: Clarence Leung, Carolina Lidstrom
      • Interwebs: http://sulab.org, loguerci@scripps.edu, @sal999, +Salvatore Loguercio
      • Funding and Support (BioGPS: GM83924, Gene Wiki: GM089820)
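
The partner mechanic of GenESP (slides 28 and 29) is described only as "modeled after the ESP Game". A minimal sketch of that idea, assuming GenESP uses the ESP Game's agreement rule, in which a label is accepted only when both partners enter it independently for the same item: the function name, the input format, and the normalization by trimming and lowercasing are assumptions for illustration, not details from the slides.

```python
def agreed_concepts(player_a_guesses, player_b_guesses):
    """Return concepts that both partners entered for the same gene.

    Each argument is a list of free-text concepts typed by one player.
    Hypothetical helper: the real GenESP matching rules are not described
    in the source.
    """
    # Normalize player B's entries so trivial differences do not block a match.
    seen_b = {concept.strip().lower() for concept in player_b_guesses}
    agreed = []
    for concept in player_a_guesses:
        key = concept.strip().lower()
        # Keep each agreed concept once, in the order player A typed it.
        if key in seen_b and key not in agreed:
            agreed.append(key)
    return agreed


# Toy round for one gene (the inputs are illustrative, not real game data).
guesses_a = ["Li-Fraumeni syndrome", "Breast cancer"]
guesses_b = ["breast cancer ", "apoptosis"]
matches = agreed_concepts(guesses_a, guesses_b)
print(matches)
```

Because players type their own concepts and score on agreement rather than on a fixed answer list, a rule of this kind would address both Dizeez limitations listed in slide 27.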
