openSNP - Crowdsourcing Genome Wide Association Studies


Slides of an invited talk at the MPI in Tübingen

  1. Crowdsourcing Genome Wide Association Studies (23.01.12, Bastian Greshake)
  2. some words about me
     • BSc in Life Sciences (2010)
     • Working at Biodiversity & Climate Research Center (since 2010)
     • MSc studies at the Goethe University in Frankfurt/Main (since 2011)
     • Not exactly a biologist with much professional background in human genetics, but...
  3. some words about me
     • some background in data mining (mainly transcriptomics)
     • some experience with web applications
     • interest in social media & crowd-sourcing
     • customer of DTC genetic testing myself
  4. finding DTC results up to now
  5. mining DTC genetic tests
     • results are hidden somewhere on the web
     • often no phenotypic annotation
     • not easily re-usable
  6. let’s code it: openSNP
     • wants to be a central repository for sharing DTC results
     • enables users to share phenotypes as well
     • lowers the barrier to participate
     • motivates sharing through benefits for users
     • can we take it a step further and provide data for GWAS?
  7. mining DTC genetic tests
     • lots of potential for open data (100k+ customers)
     • cheap data source for scientists
     • survey chart: Would you share DTC test results? (n=226); answers: Yes / Only with DTC company / No; shares shown: 6 %, 26 %, 68 %
  8. the front
  9. technical implementation
     • framework: Ruby on Rails
     • database: PostgreSQL
     • task management via Resque (known from GitHub)
     • basic API via JSON queries
  10. other resources
      • Personal Genome Project
        • data is open
        • participation is not
  11. Personal Genome Project
  12. other resources
      • Personal Genome Project
        • data is open
        • participation is not
        • no easy way to download data, no API etc.
      • genomera
        • participation will be open (currently an invite-only beta)
        • focus on small-scale studies/experiments
  13. genomera
  14. problems & potential of patient-driven/crowd-sourced research
      • problems
        • sample sizes
        • bias in participants
        • motivation of participants
        • accuracy of data
      • potential
        • possible sample sizes
        • low costs
        • "warm fuzzy feeling inside" for patients
  15. positive examples: PatientsLikeMe
      • around since ~2006
      • published a dozen studies since then
      • famous example: ALS research on lithium carbonate intake (149 patients, 447 controls)
      Paul Wicks et al. (2011) Accelerated clinical discovery using self-reported patient data collected online and a patient-matching algorithm. Nature Biotechnology 29, 411–414
  16. positive examples: 23andMe
      • published some studies in 2010/2011
      • done with self-reported data
      • studies include 10,000+ to 30,000+ participants
  17. positive examples: 23andMe – general traits
      “Replications of associations [...] for hair color, eye color, and freckling validate the Web-based, self-reporting paradigm. The identification of novel associations for hair morphology [...], freckling [...], the ability to smell the methanethiol produced after eating asparagus [...], and photic sneeze reflex [...] illustrates the power of the approach.”
      Nicolas Eriksson et al. (2010) Web-Based, Participant-Driven Studies Yield Novel Genetic Associations for Common Traits. PLoS Genet 6(6): e1000993. doi:10.1371/journal.pgen.1000993
  18. positive examples: 23andMe – Parkinson’s Disease
      “We discovered two novel, genome-wide significant associations with [Parkinson’s Disease]—both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design.”
      Chuong B. Do et al. (2011) Web-Based Genome-Wide Association Study Identifies Two Novel Loci and a Substantial Genetic Component for Parkinson’s Disease. PLoS Genet 7(6): e1002141. doi:10.1371/journal.pgen.1002141
  19. Quantified Self and Science
  20. Quantified Self Movement
  21. QS projects
      • tracking health in response to work-outs (minimizing impacts of disease/genetic predisposition)
      • tracking response to different drugs
      • tracking well-being in response to eating habits (butter vs arithmetics)
  22. butter vs arithmetics (source: Seth Roberts)
  23. my conclusions
      • technology enables new kinds of research
      • DTC results and patient-driven research can lead to new scientific knowledge
      • can be a valuable addition to traditional research
  24. openSNP: now & future
      • won the Mendeley/PLoS Binary Battle in 2011
      • received some funding from the German Wikimedia foundation to get more people genotyped
      • collaborating with Consent to Research to get an IRB-approved consent process
      • working on implementing the Distributed Annotation System
  25. thanks for your attention (source: CC-BY-NC)
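
Slide 6 asks whether shared DTC results plus user-reported phenotypes could feed a GWAS. The following is a minimal sketch of what the smallest building block of such an analysis might look like: a single-SNP allelic chi-square test between a "case" group and a "control" group. It is not openSNP's implementation. The function names (`parse_genotypes`, `allelic_chi_square`), the rs-id and the toy genotypes are all hypothetical, and the file layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines) is an assumption modelled on the layout commonly seen in DTC raw data downloads.

```python
import io
import math


def parse_genotypes(handle):
    """Parse an assumed tab-separated export into {rsid: genotype}."""
    calls = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        rsid, genotype = fields[0], fields[3]
        calls[rsid] = genotype
    return calls


def allele_counts(genotypes, allele):
    """Return (copies of allele, copies of any other allele), skipping no-calls."""
    hit = other = 0
    for g in genotypes:
        if len(g) != 2 or "-" in g:
            continue  # only diploid calls such as "AG" are counted
        n = g.count(allele)
        hit += n
        other += 2 - n
    return hit, other


def allelic_chi_square(cases, controls, allele):
    """Pearson chi-square (1 degree of freedom) on the 2x2 allele count table."""
    a, b = allele_counts(cases, allele)
    c, d = allele_counts(controls, allele)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: one hypothetical user file and two tiny hypothetical groups.
example_file = io.StringIO("# rsid\tchromosome\tposition\tgenotype\n"
                           "rs0000001\t1\t1000\tAG\n")
user_calls = parse_genotypes(example_file)

case_genotypes = ["AA", "AG", "AA", "AG"]      # users reporting the phenotype
control_genotypes = ["GG", "AG", "GG", "GG"]   # users not reporting it
chi2, p_value = allelic_chi_square(case_genotypes, control_genotypes, "A")
print(user_calls, round(chi2, 3), round(p_value, 4))
```

A real study would repeat this test across hundreds of thousands of SNPs, so it would need a multiple-testing threshold (5e-8 is the conventional genome-wide significance level) and would have to address the problems listed on slide 14, such as participant bias and the accuracy of self-reported phenotypes.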