MyGene.info talk at ISMB/BOSC 2013

A
MyGene.info
Chunlei Wu, Ph.D.
The Scripps Research Institute
La Jolla, CA, USA
BOSC 2013
July 20, 2013
- Making Elastic Gene API
Being “Elastic“
●
Fast
●
Always ON
●
Up-to-date
●
Scalable
●
Extensible
Common procedure for gene data retrieval
Entrez
Ensembl
UniProt
Internal gene object
Data sources Data parsing
update regularly
...
Common procedure for gene data retrieval
Entrez
Ensembl
UniProt
Internal gene object
Data sources Data parsing
update regularly
As A Service?
...
Common procedure for gene data retrieval
Entrez
Ensembl
UniProt
Internal gene object
Data sources Data parsing
update regularly
As A Service?
...
Working model - 1
Entrez
Ensembl
UniProt
Data sources
Merging
Query Engine
User queries
...
Working model - 1
Entrez
Ensembl
UniProt
Data sources
Merging
Query Engine
...
{
“entrezgene“: 1017
}
{
“uniprot“: “P24941”
}
A dummy merging example:
{
“ensemblgene“:
“ENSG00000123374”
}
{
“entrezgene“: 1017,
“ensemblgene“:
“ENSG00000123374”,
“uniprot“: “P24941”
}
Gene object in noSQL database
“key” “document”
1017: {
“Symbol”: “CDK2”,
“Ensembl”: “ENSG00000123374”,
“RefSeq”: [
“NM_001798”,
“NM_052827”
],
“Reporter”: {
“U95A”: [
“1792_g_at”,
“1833_at”
],
“U133A”:[
“211804_s_at”,
“2045252_at”,
“211803_at”
]
}
}
Syncing from data-hub to query instance
Merging
Entrez
Ensembl
UniProt
Data sources
...
Query Engine
User queries
Public data hub
Public query instance
Syncing from data-hub to query instances
Merging
Entrez
Ensembl
UniProt
Data sources
...
Query Engine
User queries
Public data hub
Public query instance
Public query instance
http://MyGene.info
(currently v2 API, two endpoints)
http://MyGene.info/v2/query?q=<query>
any query term(s)
matching gene hits
http://MyGene.info/v2/gene/<geneid>
gene id(s)
matching gene objects
Public query instance
●
Support ALL species, from NCBI (>12K species, >13M genes)
●
>40 annotation fields and expanding
●
Weekly-updated
●
Flexible query interface
●
Simple queries
●
Fielded queries
●
Wildcard queries
●
Genomic interval queries
●
Species filter
●
Returning fields filter
●
Support batch queries, JSONP, CORS
●
Committed for long-term availability
?q=cdk2
?q=symbol:cdk2
?q=cdk*
?q=chr1:1-100,000&species=human
?q=cdk2&species=mouse,rat
?q=cdk2&fields=symbol,homologene
http://MyGene.info
Public query instance
High-performance host (serving ~500K requests/day)
http://MyGene.info
Public query instance
MyGene.py - Python wrapper
https://pypi.python.org/pypi/mygene
Third-party packages
pip install mygene
Public query instance
MyGene.autocomplete
- Gene query autocomplete widget
https://bitbucket.org/sulab/mygene.autocomplete
Third-party packages
Public query instance
MyGene.autocomplete
- Gene query autocomplete widget
https://bitbucket.org/sulab/mygene.autocomplete
Third-party packages
Working model – 2
Entrez
Ensembl
UniProt
Data sources
Merging
Query Engine
User queriesMerging 1
Merging 2
Merging 3
Query Engine
Query Engine
...
Syncing from data-hub to query instances
Merging
Entrez
Ensembl
UniProt
Data sources
...
Query Engine
User queries
Public data hub
User queries
Query Engine
Private data hub
Merging
Merging
Private query instancePublic query instance
Private 1
Private 2
...
Private query instance
●
Dedicated host
●
Same powerful query interface
●
Third-party packages still work
●
Public data still get sync-ed
●
Allow to merge private data
To reach us?
Questions on public query instance
or
interested in setting up your own private query instance?
Please let us know:
help@mygene.info
Code repositories
●
Web front-end
https://bitbucket.org/sulab/mygene.info
Apache 2 licensed
●
Data hub
https://bitbucket.org/sulab/mygene.hub
GPL v3 licensed
Acknowledgement
Funding and Support
R01GM083924
Sulab
Andrew Su
Benjamin Good
Max Nannis
Salvatore Loguercio
Katie Fisch
Tobias Meissner
MyGene.info talk at ISMB/BOSC 2013
MyGene.info talk at ISMB/BOSC 2013
Syncing from data-hub to query instances
Merging
Entrez
Ensembl
UniProt
Data sources
...
Query Engine
User queries
Public data hub
User queries
Query Engine
Private data hub
Merging
Merging
Private query instancePublic query instance
Private 1
Private 2
...
Private query instance
●
Same powerful query interface
●
Third-party packages still work
●
Public data get sync-ed
●
Allow to merge private dataPublic data hub
Query Engine
Private data hub
MergingMerging
Private query instance
Private 1
Private 2
...
Entrez
Ensembl
UniProt
Data sources
...
1 of 25

Recommended

F01-Cloud-Mygene.info by
F01-Cloud-Mygene.infoF01-Cloud-Mygene.info
F01-Cloud-Mygene.infoBioinformatics Open Source Conference
849 views17 slides
UCSD / DBMI seminar 2015-02-6 by
UCSD / DBMI seminar 2015-02-6UCSD / DBMI seminar 2015-02-6
UCSD / DBMI seminar 2015-02-6Andrew Su
3.1K views70 slides
Open biomedical knowledge using crowdsourcing and citizen science by
Open biomedical knowledge using crowdsourcing and citizen scienceOpen biomedical knowledge using crowdsourcing and citizen science
Open biomedical knowledge using crowdsourcing and citizen scienceAndrew Su
3.1K views66 slides
Qt Framework Events Signals Threads by
Qt Framework Events Signals ThreadsQt Framework Events Signals Threads
Qt Framework Events Signals ThreadsNeera Mital
6K views17 slides
MyGene.info learn-more by
MyGene.info learn-moreMyGene.info learn-more
MyGene.info learn-moreanewgene
484 views12 slides
Ensembl Browser Workshop by
Ensembl Browser WorkshopEnsembl Browser Workshop
Ensembl Browser WorkshopDenise Carvalho-Silva, PhD
878 views142 slides

More Related Content

Viewers also liked

Next generation sequencing by
Next generation sequencingNext generation sequencing
Next generation sequencingDayananda Salam
21.9K views25 slides
Next Gen Sequencing (NGS) Technology Overview by
Next Gen Sequencing (NGS) Technology OverviewNext Gen Sequencing (NGS) Technology Overview
Next Gen Sequencing (NGS) Technology OverviewDominic Suciu
48.1K views32 slides
NGS technologies - platforms and applications by
NGS technologies - platforms and applicationsNGS technologies - platforms and applications
NGS technologies - platforms and applicationsAGRF_Ltd
15.4K views38 slides
Ngs ppt by
Ngs pptNgs ppt
Ngs pptArcha Dave
58.1K views16 slides
A Comparison of NGS Platforms. by
A Comparison of NGS Platforms.A Comparison of NGS Platforms.
A Comparison of NGS Platforms.mkim8
59K views25 slides
Introduction to next generation sequencing by
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencingVHIR Vall d’Hebron Institut de Recerca
115.1K views62 slides

Viewers also liked(7)

Next generation sequencing by Dayananda Salam
Next generation sequencingNext generation sequencing
Next generation sequencing
Dayananda Salam21.9K views
Next Gen Sequencing (NGS) Technology Overview by Dominic Suciu
Next Gen Sequencing (NGS) Technology OverviewNext Gen Sequencing (NGS) Technology Overview
Next Gen Sequencing (NGS) Technology Overview
Dominic Suciu48.1K views
NGS technologies - platforms and applications by AGRF_Ltd
NGS technologies - platforms and applicationsNGS technologies - platforms and applications
NGS technologies - platforms and applications
AGRF_Ltd15.4K views
Ngs ppt by Archa Dave
Ngs pptNgs ppt
Ngs ppt
Archa Dave58.1K views
A Comparison of NGS Platforms. by mkim8
A Comparison of NGS Platforms.A Comparison of NGS Platforms.
A Comparison of NGS Platforms.
mkim859K views
NGS - Basic principles and sequencing platforms by Annelies Haegeman
NGS - Basic principles and sequencing platformsNGS - Basic principles and sequencing platforms
NGS - Basic principles and sequencing platforms
Annelies Haegeman27.2K views

Similar to MyGene.info talk at ISMB/BOSC 2013

Executing Provenance-Enabled Queries over Web Data by
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataeXascale Infolab
1.5K views28 slides
PERICLES Workflow for the automated updating of Digital Ecosystem Models with... by
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...PERICLES_FP7
197 views41 slides
Building Robust Pipelines with Airflow | Wrangle Conference 2017 by
Building Robust Pipelines with Airflow | Wrangle Conference 2017Building Robust Pipelines with Airflow | Wrangle Conference 2017
Building Robust Pipelines with Airflow | Wrangle Conference 2017Cloudera, Inc.
3.3K views30 slides
Building Robust Pipelines with Airflow by
Building Robust Pipelines with AirflowBuilding Robust Pipelines with Airflow
Building Robust Pipelines with AirflowErin Shellman
2K views30 slides
Analytics of analytics pipelines: from optimising re-execution to general Dat... by
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...Paolo Missier
61 views38 slides
Sharing massive data analysis: from provenance to linked experiment reports by
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
84 views49 slides

Similar to MyGene.info talk at ISMB/BOSC 2013(20)

Executing Provenance-Enabled Queries over Web Data by eXascale Infolab
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
eXascale Infolab1.5K views
PERICLES Workflow for the automated updating of Digital Ecosystem Models with... by PERICLES_FP7
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
PERICLES_FP7197 views
Building Robust Pipelines with Airflow | Wrangle Conference 2017 by Cloudera, Inc.
Building Robust Pipelines with Airflow | Wrangle Conference 2017Building Robust Pipelines with Airflow | Wrangle Conference 2017
Building Robust Pipelines with Airflow | Wrangle Conference 2017
Cloudera, Inc.3.3K views
Building Robust Pipelines with Airflow by Erin Shellman
Building Robust Pipelines with AirflowBuilding Robust Pipelines with Airflow
Building Robust Pipelines with Airflow
Erin Shellman2K views
Analytics of analytics pipelines: from optimising re-execution to general Dat... by Paolo Missier
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Paolo Missier61 views
Sharing massive data analysis: from provenance to linked experiment reports by Gaignard Alban
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
Gaignard Alban84 views
ckan 2.0: Harvesting from other sources by Chengjen Lee
ckan 2.0: Harvesting from other sourcesckan 2.0: Harvesting from other sources
ckan 2.0: Harvesting from other sources
Chengjen Lee3.5K views
20170402 Crop Innovation and Business - Amsterdam by Allen Day, PhD
20170402 Crop Innovation and Business - Amsterdam20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam
Allen Day, PhD657 views
i5k Workspace Workshop - AGS2017 by Monica Poelchau
i5k Workspace Workshop - AGS2017i5k Workspace Workshop - AGS2017
i5k Workspace Workshop - AGS2017
Monica Poelchau133 views
Building a Driver: Lessons Learned From Developing the Internet Explorer Driver by seleniumconf
Building a Driver: Lessons Learned From Developing the Internet Explorer DriverBuilding a Driver: Lessons Learned From Developing the Internet Explorer Driver
Building a Driver: Lessons Learned From Developing the Internet Explorer Driver
seleniumconf984 views
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge by Chunlei Wu
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
Chunlei Wu127 views
iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to ... by Bonnie Hurwitz
iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to ...iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to ...
iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to ...
Bonnie Hurwitz1.8K views
AI Food detector; A model of Generative adversarial network for food Classifier by jimmy majumder
AI Food detector; A model of Generative adversarial network for food ClassifierAI Food detector; A model of Generative adversarial network for food Classifier
AI Food detector; A model of Generative adversarial network for food Classifier
jimmy majumder63 views
Addmi 11-intro to-patterns by odanyboy
Addmi 11-intro to-patternsAddmi 11-intro to-patterns
Addmi 11-intro to-patterns
odanyboy578 views
NEXT- A System for Real-World Development, Evaluation, and Application of Act... by Nicholas Glattard
NEXT- A System for Real-World Development, Evaluation, and Application of Act...NEXT- A System for Real-World Development, Evaluation, and Application of Act...
NEXT- A System for Real-World Development, Evaluation, and Application of Act...
Nicholas Glattard117 views
Performance Analysis of Idle Programs by greenwop
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programs
greenwop1.2K views
Entity Extraction from Natural Language Text using Apache NiFi and Idyl E3 by Mountain Fog
Entity Extraction from Natural Language Text using Apache NiFi and Idyl E3Entity Extraction from Natural Language Text using Apache NiFi and Idyl E3
Entity Extraction from Natural Language Text using Apache NiFi and Idyl E3
Mountain Fog1.9K views

Recently uploaded

POSTER IV LAWCN_ROVER_IUE.pdf by
POSTER IV LAWCN_ROVER_IUE.pdfPOSTER IV LAWCN_ROVER_IUE.pdf
POSTER IV LAWCN_ROVER_IUE.pdfSOCIEDAD JULIO GARAVITO
12 views1 slide
Assessment and Evaluation GROUP 3.pdf by
Assessment and Evaluation GROUP 3.pdfAssessment and Evaluation GROUP 3.pdf
Assessment and Evaluation GROUP 3.pdfkimberlyndelgado18
10 views10 slides
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ... by
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...ILRI
6 views6 slides
Experimental animal Guinea pigs.pptx by
Experimental animal Guinea pigs.pptxExperimental animal Guinea pigs.pptx
Experimental animal Guinea pigs.pptxMansee Arya
40 views16 slides
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ... by
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...ILRI
9 views1 slide
vitamine B1.pptx by
vitamine B1.pptxvitamine B1.pptx
vitamine B1.pptxajithkilpart
29 views22 slides

Recently uploaded(20)

Small ruminant keepers’ knowledge, attitudes and practices towards peste des ... by ILRI
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
ILRI6 views
Experimental animal Guinea pigs.pptx by Mansee Arya
Experimental animal Guinea pigs.pptxExperimental animal Guinea pigs.pptx
Experimental animal Guinea pigs.pptx
Mansee Arya40 views
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ... by ILRI
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
ILRI9 views
A giant thin stellar stream in the Coma Galaxy Cluster by Sérgio Sacani
A giant thin stellar stream in the Coma Galaxy ClusterA giant thin stellar stream in the Coma Galaxy Cluster
A giant thin stellar stream in the Coma Galaxy Cluster
Sérgio Sacani19 views
Oral_Presentation_by_Fatma (2).pdf by fatmaalmrzqi
Oral_Presentation_by_Fatma (2).pdfOral_Presentation_by_Fatma (2).pdf
Oral_Presentation_by_Fatma (2).pdf
fatmaalmrzqi8 views
selection of preformed arch wires during the alignment stage of preadjusted o... by MaherFouda1
selection of preformed arch wires during the alignment stage of preadjusted o...selection of preformed arch wires during the alignment stage of preadjusted o...
selection of preformed arch wires during the alignment stage of preadjusted o...
MaherFouda17 views
Indian council for child welfare by RenuWaghmare2
Indian council for child welfareIndian council for child welfare
Indian council for child welfare
RenuWaghmare27 views
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy... by Anmol Vishnu Gupta
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana... by jahnviarora989
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...
jahnviarora9897 views
2. Natural Sciences and Technology Author Siyavula.pdf by ssuser821efa
2. Natural Sciences and Technology Author Siyavula.pdf2. Natural Sciences and Technology Author Siyavula.pdf
2. Natural Sciences and Technology Author Siyavula.pdf
ssuser821efa11 views
별헤는 사람들 2023년 12월호 전명원 교수 자료 by sciencepeople
별헤는 사람들 2023년 12월호 전명원 교수 자료별헤는 사람들 2023년 12월호 전명원 교수 자료
별헤는 사람들 2023년 12월호 전명원 교수 자료
sciencepeople68 views

MyGene.info talk at ISMB/BOSC 2013

  • 1. MyGene.info Chunlei Wu, Ph.D. The Scripps Research Institute La Jolla, CA, USA BOSC 2013 July 20, 2013 - Making Elastic Gene API
  • 3. Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly ...
  • 4. Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly As A Service? ... Common procedure for gene data retrieval Entrez Ensembl UniProt Internal gene object Data sources Data parsing update regularly As A Service? ...
  • 5. Working model - 1 Entrez Ensembl UniProt Data sources Merging Query Engine User queries ...
  • 6. Working model - 1 Entrez Ensembl UniProt Data sources Merging Query Engine ... { “entrezgene“: 1017 } { “uniprot“: “P24941” } A dummy merging example: { “ensemblgene“: “ENSG00000123374” } { “entrezgene“: 1017, “ensemblgene“: “ENSG00000123374”, “uniprot“: “P24941” }
  • 7. Gene object in noSQL database “key” “document” 1017: { “Symbol”: “CDK2”, “Ensembl”: “ENSG00000123374”, “RefSeq”: [ “NM_001798”, “NM_052827” ], “Reporter”: { “U95A”: [ “1792_g_at”, “1833_at” ], “U133A”:[ “211804_s_at”, “2045252_at”, “211803_at” ] } }
  • 8. Syncing from data-hub to query instance Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub Public query instance
  • 9. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub Public query instance
  • 10. Public query instance http://MyGene.info (currently v2 API, two endpoints) http://MyGene.info/v2/query?q=<query> any query term(s) matching gene hits http://MyGene.info/v2/gene/<geneid> gene id(s) matching gene objects
  • 11. Public query instance ● Support ALL species, from NCBI (>12K species, >13M genes) ● >40 annotation fields and expanding ● Weekly-updated ● Flexible query interface ● Simple queries ● Fielded queries ● Wildcard queries ● Genomic interval queries ● Species filter ● Returning fields filter ● Support batch queries, JSONP, CORS ● Committed for long-term availability ?q=cdk2 ?q=symbol:cdk2 ?q=cdk* ?q=chr1:1-100,000&species=human ?q=cdk2&species=mouse,rat ?q=cdk2&fields=symbol,homologene http://MyGene.info
  • 12. Public query instance High-performance host (serving ~500K requests/day) http://MyGene.info
  • 13. Public query instance MyGene.py - Python wrapper https://pypi.python.org/pypi/mygene Third-party packages pip install mygene
  • 14. Public query instance MyGene.autocomplete - Gene query autocomplete widget https://bitbucket.org/sulab/mygene.autocomplete Third-party packages
  • 15. Public query instance MyGene.autocomplete - Gene query autocomplete widget https://bitbucket.org/sulab/mygene.autocomplete Third-party packages
  • 16. Working model – 2 Entrez Ensembl UniProt Data sources Merging Query Engine User queriesMerging 1 Merging 2 Merging 3 Query Engine Query Engine ...
  • 17. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub User queries Query Engine Private data hub Merging Merging Private query instancePublic query instance Private 1 Private 2 ...
  • 18. Private query instance ● Dedicated host ● Same powerful query interface ● Third-party packages still work ● Public data still get sync-ed ● Allow to merge private data
  • 19. To reach us? Questions on public query instance or interested in setting up your own private query instance? Please let us know: help@mygene.info
  • 20. Code repositories ● Web front-end https://bitbucket.org/sulab/mygene.info Apache 2 licensed ● Data hub https://bitbucket.org/sulab/mygene.hub GPL v3 licensed
  • 21. Acknowledgement Funding and Support R01GM083924 Sulab Andrew Su Benjamin Good Max Nannis Salvatore Loguercio Katie Fisch Tobias Meissner
  • 24. Syncing from data-hub to query instances Merging Entrez Ensembl UniProt Data sources ... Query Engine User queries Public data hub User queries Query Engine Private data hub Merging Merging Private query instancePublic query instance Private 1 Private 2 ...
  • 25. Private query instance ● Same powerful query interface ● Third-party packages still work ● Public data get sync-ed ● Allow to merge private dataPublic data hub Query Engine Private data hub MergingMerging Private query instance Private 1 Private 2 ... Entrez Ensembl UniProt Data sources ...