Games for improving human phenotype prediction
The brain mosaic image came from Tim Brady at MIT via Google image search. Tim, if you want me to take it down, just let me know.
1. Games for improving human phenotype prediction

Benjamin M Good, Salvatore Loguercio, Andrew I Su
The Scripps Research Institute, La Jolla, California, USA

ABSTRACT

An important goal for biomedical research is to produce genetic and genomic predictors for human phenotypes such as disease prognosis or drug response. To this end, we can now quantify an extremely large number of potential biomarkers for any biological sample. In fact, a single sample could reasonably be described by millions of molecular variations in DNA, RNA, proteins, and metabolites. However, the actual number of samples processed typically remains small in comparison. As a result, attempts to use this data to build predictors often face problems of overfitting. (While a predictive pattern may describe training data very well, it may not reproduce well on other datasets.)

It has recently been shown that biological knowledge in the form of gene annotations and pathway databases can be used to guide the process of inferring phenotype predictors [1-3]. While promising, such methods are limited by the amount, quality and problem-specific applicability of the structured knowledge that is available.

Following in the line of games that have recently demonstrated success as a means of 'crowdsourcing' difficult biological problems [4,5], we are developing games with the purpose of improving human phenotype predictions. Our games work on two levels: (1) games such as Dizeez and GenESP collect novel gene annotations and (2) games like Combo engage players directly in the process of predictor inference.

Play game prototypes at:

Game Objectives

  • Capture general community knowledge in a useful structure
  • Concentrate community knowledge and reasoning around predicting a particular phenotype

[Figure: community knowledge linking genes and pathways to a phenotype; community effort concentrated on Phenotype 1 and Phenotype 2]

Dizeez: gene – disease annotation quiz

Select the disease related to the clue gene. Guess as many as you can in one minute. Every guess adds weight to a link between a gene and a disease.

Preliminary Results

  • 713 games, 180 players
  • Overall: 4,585 unique gene-disease assertions
  • 224 assertions provided more than once and not found in OMIM/PharmGKB

[Table: top associations provided four or more times and not found in OMIM/PharmGKB]

Even after limited game playing, the Dizeez game resulted in the identification of several novel gene-disease annotations.

GenESP: gene – concept association with a partner

Guess what genes your partner is thinking about when they see 'neuroblastoma'.

Improvements compared to Dizeez:

  • Reward new, useful annotations with points
  • Add social interaction
  • Enable gene-gene, gene-disease, gene-function games on the same platform
  • Increase scalability of annotation collection (does not depend on a database of 'right' answers)

Combo: feature selection with community intelligence

  • Goal: pick the best set of genes
  • Best: the gene set that produces the best decision tree classifier
  • Classifier: created using training data and selected genes, used to predict phenotype (e.g. breast cancer prognosis)

[Figure: a game board and a hand, with the inferred decision tree. Score: 78 (percent correct)]

Game Score: determined by estimating performance of trees constructed using the selected features on training data.

Feature sets from many individual games are used to create a Decision Tree Forest classifier. (Each tree votes once.)

Human Guided Forest

Ensemble classifier where components are decision trees constructed using manually selected subsets of features. Adaptation of Network Guided and Random Forests [1,2].

REFERENCES

1. Dutkowski and Ideker (2011) Protein Networks as Logic Functions in Development and Cancer. PLoS Computational Biology
2. Winter et al (2012) Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes. PLoS Computational Biology
3. Liu et al (2012) Identifying dysregulated pathways in cancers from pathway interaction networks. BMC Bioinformatics
4. Good and Su (2011) Games with a Scientific Purpose. Genome Biology
5. Kawrykow et al (2012) Phylo: A Citizen Science Approach for Improving Multiple Sequence Alignment. PLoS One

CONTACT

Benjamin Good, Salvatore Loguercio, Andrew Su

FUNDING

We acknowledge support from the National Institute of General Medical Sciences (GM089820 and GM083924) and the NIH through the FaceBase Consortium for a particular emphasis on craniofacial genes (DE-20057).
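The Dizeez description says that every guess adds weight to a gene-disease link, and the preliminary results count assertions given more than once that are absent from OMIM/PharmGKB. A minimal sketch of that bookkeeping follows. It is an illustration only, not the authors' implementation: the function names, the gene and disease labels, and the toy `known_pairs` set standing in for a reference database are all hypothetical.

```python
from collections import Counter


def aggregate_guesses(guesses):
    """Count how many times each (gene, disease) pair was guessed."""
    return Counter(guesses)


def novel_candidates(weights, known_pairs, min_count=2):
    """Return (pair, weight) for pairs guessed at least `min_count` times
    and absent from `known_pairs`, heaviest first (ties sorted by pair)."""
    hits = [
        (pair, n)
        for pair, n in weights.items()
        if n >= min_count and pair not in known_pairs
    ]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: one tuple per player guess.
guesses = [
    ("GENE_A", "disease X"),
    ("GENE_A", "disease X"),
    ("GENE_B", "disease Y"),
    ("GENE_B", "disease Y"),
    ("GENE_B", "disease Y"),
    ("GENE_C", "disease Z"),
]
# Hypothetical stand-in for pairs already present in a reference database.
known_pairs = {("GENE_A", "disease X")}

weights = aggregate_guesses(guesses)
candidates = novel_candidates(weights, known_pairs, min_count=2)
print(candidates)
```

Raising `min_count` to 4 would correspond to the "provided four or more times" cut-off used for the top associations.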
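The Combo and Human Guided Forest descriptions can be sketched as follows: each game contributes a gene set, a tree is fit on training data using only those genes, the game score is that tree's training accuracy, and the ensemble predicts by letting each tree vote once. This sketch makes loud simplifying assumptions that are not from the poster: every "tree" is a depth-1 decision stump rather than a full decision tree, labels are binary (0/1), and the gene names, expression values and function names are hypothetical.

```python
from collections import Counter


def fit_stump(rows, labels, features):
    """Fit a depth-1 tree restricted to `features`.

    Tries every observed value of every allowed feature as a threshold and
    keeps the split with the highest training accuracy. Returns
    (accuracy, feature, threshold, label_if_above, label_otherwise).
    """
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            for above, otherwise in ((1, 0), (0, 1)):
                preds = [above if r[f] > t else otherwise for r in rows]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if best is None or acc > best[0]:
                    best = (acc, f, t, above, otherwise)
    return best


def predict_stump(stump, row):
    """Apply one fitted stump to one sample."""
    _, f, t, above, otherwise = stump
    return above if row[f] > t else otherwise


def forest_predict(stumps, row):
    """Majority vote: each tree votes once."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical training data: expression values per gene, binary phenotype.
rows = [
    {"g1": 0.1, "g2": 0.9, "g3": 0.2},
    {"g1": 0.2, "g2": 0.8, "g3": 0.7},
    {"g1": 0.8, "g2": 0.3, "g3": 0.3},
    {"g1": 0.9, "g2": 0.1, "g3": 0.8},
]
labels = [0, 0, 1, 1]

# Hypothetical gene sets chosen by players in three separate games.
player_gene_sets = [["g1"], ["g2", "g3"], ["g1", "g3"]]

stumps = [fit_stump(rows, labels, genes) for genes in player_gene_sets]
game_scores = [s[0] for s in stumps]  # training accuracy per game

new_sample = {"g1": 0.85, "g2": 0.2, "g3": 0.5}
prediction = forest_predict(stumps, new_sample)
print(game_scores, prediction)
```

An odd number of trees avoids tied votes for binary labels. Scoring on training data, as the poster describes for the game score, can overstate how well a gene set generalizes, which is the overfitting concern raised in the abstract.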