MyGene.info talk at ISMB/BOSC 2013

762 views

Published on

MyGene.info: Gene Annotation Query as a Service

talk at ISMB/BOSC 2013

Published in: Science, Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
762
On SlideShare
0
From Embeds
0
Number of Embeds
18
Actions
Shares
0
Downloads
9
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

MyGene.info talk at ISMB/BOSC 2013

  1. 1. MyGene.info Chunlei Wu, Ph.D. The Scripps Research Institute La Jolla, CA, USA BOSC 2013 July 20, 2013 - Making Elastic Gene API
  2. 2. Being “Elastic“ ● Fast ● Always ON ● Up-to-date ● Scalable ● Extensible
  3. 3. Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly ...
  4. 4. Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly As A Service? ... Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly As A Service? ...
  5. 5. Working model - 1 Entrez Ensembl UniProt Data sources Merging Query Engine User queries ...
  6. 6. Working model - 1 Entrez Ensembl UniProt Data sources Merging Query Engine ... { “entrezgene“: 1017 } { “uniprot“: “P24941” } A dummy merging example: { “ensemblgene“: “ENSG00000123374” } { “entrezgene“: 1017, “ensemblgene“: “ENSG00000123374”, “uniprot“: “P24941” }
  7. 7. Gene object in noSQL database “key” “document” 1017: { “Symbol”: “CDK2”, “Ensembl”: “ENSG00000123374”, “RefSeq”: [ “NM_001798”, “NM_052827” ], “Reporter”: { “U95A”: [ “1792_g_at”, “1833_at” ], “U133A”:[ “211804_s_at”, “2045252_at”, “211803_at” ] } }
  8. 8. Syncing from data-hub to query instance Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub Public query instance
  9. 9. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub Public query instance
  10. 10. Public query instance http://MyGene.info (currently v2 API, two endpoints) http://MyGene.info/v2/query?q=<query> any query term(s) matching gene hits http://MyGene.info/v2/gene/<geneid> gene id(s) matching gene objects
  11. 11. Public query instance ● Support ALL species, from NCBI (>12K species, >13M genes) ● >40 annotation fields and expanding ● Weekly-updated ● Flexible query interface ● Simple queries ● Fielded queries ● Wildcard queries ● Genomic interval queries ● Species filter ● Returning fields filter ● Support batch queries, JSONP, CORS ● Committed for long-term availability ?q=cdk2 ?q=symbol:cdk2 ?q=cdk* ?q=chr1:1-100,000&species=human ?q=cdk2&species=mouse,rat ?q=cdk2&fields=symbol,homologene http://MyGene.info
  12. 12. Public query instance High-performance host (serving ~500K requests/day) http://MyGene.info
  13. 13. Public query instance MyGene.py - Python wrapper https://pypi.python.org/pypi/mygene Third-party packages pip install mygene
  14. 14. Public query instance MyGene.autocomplete - Gene query autocomplete widget https://bitbucket.org/sulab/mygene.autocomplete Third-party packages
  15. 15. Public query instance MyGene.autocomplete - Gene query autocomplete widget https://bitbucket.org/sulab/mygene.autocomplete Third-party packages
  16. 16. Working model – 2 Entrez Ensembl UniProt Data sources Merging Query Engine User queriesMerging 1 Merging 2 Merging 3 Query Engine Query Engine ...
  17. 17. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub User queries Query Engine Private data hub Merging Merging Private query instancePublic query instance Private 1 Private 2 ...
  18. 18. Private query instance ● Dedicated host ● Same powerful query interface ● Third-party packages still work ● Public data still get sync-ed ● Allow to merge private data
  19. 19. To reach us? Questions on public query instance or interested in setting up your own private query instance? Please let us know: help@mygene.info
  20. 20. Code repositories ● Web front-end https://bitbucket.org/sulab/mygene.info Apache 2 licensed ● Data hub https://bitbucket.org/sulab/mygene.hub GPL v3 licensed
  21. 21. Acknowledgement Funding and Support R01GM083924 Sulab Andrew Su Benjamin Good Max Nannis Salvatore Loguercio Katie Fisch Tobias Meissner
  22. 22. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub User queries Query Engine Private data hub Merging Merging Private query instancePublic query instance Private 1 Private 2 ...
  23. 23. Private query instance ● Same powerful query interface ● Third-party packages still work ● Public data get sync-ed ● Allow to merge private dataPublic data hub Query Engine Private data hub MergingMerging Private query instance Private 1 Private 2 ... Entrez Ensembl UniProt Data sources ...

×