Overview of the BioThings project (https://biothings.io), highlighting BioThings Studio, a web development environment for building biomedical APIs
2. API – Application Programming Interface
A data API is a way to abstract the data-access layer.
3. [Diagram: Application 1 and Application 2, each with its own Presentation Layer, Business logic Layer, and Data Layer; side labels: View, Controller, Model.]
Repetitive data wrangling:
• Parsing dump files
• ID conversion
• Data merging
• Data transformation
• Source monitoring
• Download scheduler
• … …
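Two of the steps listed above, ID conversion and data merging, can be sketched in a few lines. This is a toy illustration, not the BioThings implementation: the mapping table, the function names `to_ncbi_id` and `merge_sources`, and the two sample sources are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical lookup table: Ensembl gene ID -> NCBI gene ID.
ENSEMBL_TO_NCBI = {"ENSG00000123374": "1017"}


def to_ncbi_id(gene_id, mapping=ENSEMBL_TO_NCBI):
    """Return the NCBI ID for a known Ensembl ID; pass other IDs through."""
    return mapping.get(gene_id, gene_id)


def merge_sources(*sources):
    """Merge records from several sources into one document per gene."""
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            key = to_ncbi_id(record["id"])
            # Copy every field except the source-specific ID.
            fields = {k: v for k, v in record.items() if k != "id"}
            merged[key].update(fields)
    return {key: {"_id": key, **doc} for key, doc in merged.items()}


# Toy records standing in for two parsed dump files.
source_a = [{"id": "1017", "symbol": "CDK2"}]
source_b = [{"id": "ENSG00000123374", "name": "cyclin dependent kinase 2"}]

docs = merge_sources(source_a, source_b)
print(docs["1017"])
```

Every application that consumes these sources would otherwise repeat this kind of code, which is the motivation for moving it into a shared layer.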
[Diagram: Application 1 (Presentation Layer, Business logic Layer, Common Data Layer) and Application 2 (Presentation Layer, Business logic Layer, Data Layer).]
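The idea of a common data layer can be sketched as one data-access interface shared by several applications. Everything named here (`GeneDataLayer`, `InMemoryGeneData`, `symbol_report`, `has_annotation`) is hypothetical and only illustrates the architecture; a real deployment would put an HTTP data API behind the same interface.

```python
from typing import Optional, Protocol


class GeneDataLayer(Protocol):
    """What every application expects from the shared data layer."""

    def get_gene(self, gene_id: str) -> Optional[dict]: ...


class InMemoryGeneData:
    """Stand-in backend; a data API client could replace it unchanged."""

    def __init__(self, docs):
        self._docs = docs

    def get_gene(self, gene_id):
        return self._docs.get(gene_id)


def symbol_report(data: GeneDataLayer, gene_id: str) -> str:
    """Application 1: show the symbol for a gene."""
    doc = data.get_gene(gene_id)
    return doc["symbol"] if doc else "unknown"


def has_annotation(data: GeneDataLayer, gene_id: str, field: str) -> bool:
    """Application 2: check whether a gene carries a given field."""
    doc = data.get_gene(gene_id)
    return bool(doc) and field in doc


shared = InMemoryGeneData({"1017": {"symbol": "CDK2", "taxid": 9606}})
print(symbol_report(shared, "1017"))
print(has_annotation(shared, "1017", "taxid"))
```

Neither application parses files or converts IDs itself; both depend only on `get_gene`, so the backend can change without touching them.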
7. • Get gene object(s) by NCBI or Ensembl gene ID:
• http://mygene.info/v3/gene/1017
• http://mygene.info/v3/gene/ENSG00000123374
• http://mygene.info/v3/gene/1017?fields=symbol,name,pathway,uniprot
• Find matching gene objects with any query terms:
• http://mygene.info/v3/query?q=CDK2
• http://mygene.info/v3/query?q=name:kinase&species=human
• http://mygene.info/v3/query?q=name:kinase AND _exists_:pathway
• http://mygene.info/v3/query?q=pathway.kegg.name:wnt&fields=entrezgene,symbol,taxid,interpro
Batch queries supported via POST
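The example URLs above can also be built programmatically. The sketch below uses only the Python standard library and does not send any request; the helper names `gene_url` and `query_url` are assumptions made for this example, while the base URL and parameters come from the examples above.

```python
from urllib.parse import urlencode

BASE = "http://mygene.info/v3"


def gene_url(gene_id, fields=None):
    """URL for one gene object, optionally limited to some fields."""
    url = f"{BASE}/gene/{gene_id}"
    if fields:
        # Keep commas readable instead of percent-encoding them.
        url += "?" + urlencode({"fields": ",".join(fields)}, safe=",")
    return url


def query_url(q, **params):
    """URL for the query service; extra keyword arguments become parameters."""
    # Spaces in the query string are encoded as "+".
    return f"{BASE}/query?" + urlencode({"q": q, **params}, safe=",:")


print(gene_url("1017", ["symbol", "name", "pathway", "uniprot"]))
print(query_url("name:kinase", species="human"))
print(query_url("name:kinase AND _exists_:pathway"))
```

Encoding the query through `urlencode` avoids sending raw spaces, as in the third example above, which some HTTP clients reject.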
8. MyChem.info (http://mychem.info)
Aggregates annotations for 97 million drugs/chemicals from 12 resources
• I have a list of drug/chemical IDs and want to get annotations about them.
  Drug/chemical annotation service:
  GET /v1/drug/<drugid>
  POST /v1/drug/ (batch mode)
• I want to get matching drugs/chemicals with my query term(s).
  Drug/chemical query service:
  GET /v1/query/?q=<query>
  POST /v1/query/ (batch mode)

MyGene.info (http://mygene.info)
Aggregates annotations for 32 million genes from 30 resources
• I have a list of gene IDs and want to get annotations about them.
  Gene annotation service:
  GET /v3/gene/<geneid>
  POST /v3/gene/ (batch mode)
• I want to get matching genes with my query term(s).
  Gene query service:
  GET /v3/query/?q=<query>
  POST /v3/query/ (batch mode)

MyVariant.info (http://myvariant.info)
Aggregates annotations for 950 million variants from 21 resources
• I have a list of variant IDs and want to get annotations about them.
  Variant annotation service:
  GET /v1/variant/<hgvsid>
  POST /v1/variant/ (batch mode)
• I want to get matching variants with my query term(s).
  Variant query service:
  GET /v1/query/?q=<query>
  POST /v1/query/ (batch mode)
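Because the three services share the same endpoint pattern, one small helper can address all of them. The sketch below only constructs requests and never sends them. The helper names are hypothetical, and the `ids` form parameter for batch mode is an assumption that should be checked against each service's documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and API version per entity type, as listed above.
SERVICES = {
    "gene": ("http://mygene.info", "v3"),
    "variant": ("http://myvariant.info", "v1"),
    "drug": ("http://mychem.info", "v1"),
}


def annotation_url(entity, entity_id=""):
    """GET /<version>/<entity>/<id>, or the batch endpoint when id is empty."""
    host, version = SERVICES[entity]
    return f"{host}/{version}/{entity}/{entity_id}"


def query_url(entity, q):
    """GET /<version>/query/?q=<query> for the given service."""
    host, version = SERVICES[entity]
    return f"{host}/{version}/query/?" + urlencode({"q": q})


def batch_annotation_request(entity, ids):
    """Build (but do not send) a batch-mode POST for a list of IDs."""
    # Assumption: batch mode takes a comma-separated "ids" form field.
    body = urlencode({"ids": ",".join(ids)}).encode("utf-8")
    return Request(annotation_url(entity), data=body, method="POST")


req = batch_annotation_request("gene", ["1017", "1018"])
print(req.get_method(), req.full_url, req.data)
```

Passing `req` to `urllib.request.urlopen` would submit the batch; that step is left out so the sketch runs offline.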
Other BioThings APIs:
• MyDisease.info (mydisease.info)
• BioThings API for Taxonomy (t.biothings.io)
9. Simple to use
Comprehensive
- MyGene.info: 32M genes from 30K species
- MyVariant.info: 874M (700M observed)
- MyChem.info: 97M chemicals/drugs
Developer-friendly (support CORS, gzip, https, msgpack, etc.)
• “fields” parameter to filter down the response to what’s needed
• “fetch_all” feature for streaming large query results
Python, R, JavaScript clients
Usability
Sustainability
Always up-to-date (updated weekly)
High-performance and scalable
High-availability
Enterprise grade
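The "fields" and "fetch_all" features mentioned above can be combined to stream a large result set while keeping each response small. The sketch below assumes the scrolling convention in which a `fetch_all` response carries a `_scroll_id` that is sent back as `scroll_id` to get the next batch; verify this against the API documentation before relying on it. `iter_all_hits` and `fake_fetch` are hypothetical names, and the canned pages stand in for real responses.

```python
def iter_all_hits(fetch, q, fields=None):
    """Yield every hit for a query by following scroll IDs.

    `fetch` is any callable that takes a dict of query parameters and
    returns the decoded JSON response as a dict.
    """
    params = {"q": q, "fetch_all": "true"}
    if fields:
        params["fields"] = ",".join(fields)
    page = fetch(params)
    while page.get("hits"):
        yield from page["hits"]
        scroll_id = page.get("_scroll_id")
        if not scroll_id:
            break
        # Later requests only need the scroll ID (assumed convention).
        page = fetch({"scroll_id": scroll_id})


# Canned pages standing in for real API responses.
pages = iter([
    {"_scroll_id": "abc", "hits": [{"_id": "1"}, {"_id": "2"}]},
    {"_scroll_id": "abc", "hits": [{"_id": "3"}]},
    {"hits": []},  # stand-in for "no more results"
])
calls = []


def fake_fetch(params):
    calls.append(params)
    return next(pages)


ids = [hit["_id"] for hit in iter_all_hits(fake_fetch, "name:kinase", ["symbol"])]
print(ids)
```

Injecting `fetch` keeps the paging logic independent of any HTTP library, so the same generator works with a real client or with canned data in tests.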
10. A Python package that turns data sources into a high-quality API
pip install biothings
https://pypi.org/project/biothings/
13. Register as a new data source
dump data
upload data
inspect data
BioThings Studio is provided as a ready-to-start Docker image.
See tutorial at http://docs.biothings.io/en/latest/doc/studio.html
https://github.com/biothings/biothings_studio
14. Upload and inspect data from the parser
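A parser in this workflow is essentially a function that turns a downloaded file into documents, each with a unique `_id`. The sketch below is a minimal illustration under stated assumptions: the column names, the sample rows, and the `inspect_fields` helper are invented for this example, and the function reads from any iterable of lines so that it can run without a data folder. The Studio tutorial describes the exact signature a real parser should have.

```python
import csv
import io

# Toy tab-separated input; column names and values are made up.
SAMPLE = (
    "variant_id\trsid\tgene\n"
    "chr1:g.100A>G\trs0000001\tGENE1\n"
    "chr2:g.200C>T\trs0000002\tGENE2\n"
)


def load_data(lines):
    """Yield one document per row, keyed by a unique `_id`."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        yield {
            "_id": row["variant_id"],
            "phewas": {"rsid": row["rsid"], "gene": row["gene"]},
        }


def inspect_fields(docs):
    """Summarize which Python types appear under each top-level field."""
    summary = {}
    for doc in docs:
        for key, value in doc.items():
            summary.setdefault(key, set()).add(type(value).__name__)
    return summary


docs = list(load_data(io.StringIO(SAMPLE)))
print(docs[0])
print(inspect_fields(docs))
```

Nesting the source's fields under a single key (`phewas` here) keeps them separate from other sources when documents are later merged.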
15. Make a new “data-build”
16. Create a new release and set up the API
17. Start your API!
• Get PheWAS associations to a specific SNP:
• http://localhost:8000/variant/chr12:g.56364321A>G
• http://localhost:8000/query?q=rs1250552
• http://localhost:8000/query?q=rs1250552&fields=phewas.gwas_associations,phewas.gene,phewas.rsid
• Find PheWAS associations to a gene:
• http://localhost:8000/query?q=CDK2
• Find PheWAS associations to “Asthma”:
• http://localhost:8000/query?q=phewas.gwas_associations:asthma
Also accessible from the Python/R/JavaScript biothings_client
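Two details of the local API examples are easy to trip over: the ">" in an HGVS ID should be percent-encoded in a URL, and dotted names such as `phewas.gene` refer to nested fields in the returned document. The sketch below illustrates both with the standard library only. The helper names and the sample document are hypothetical; real responses may also contain lists, which this helper does not traverse.

```python
from urllib.parse import quote

API = "http://localhost:8000"


def variant_url(hgvs_id):
    """URL for one variant; ':' is kept, '>' becomes '%3E'."""
    return f"{API}/variant/{quote(hgvs_id, safe=':')}"


def get_dotted(doc, path):
    """Follow a dotted field name such as 'phewas.gene' into nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Made-up document shaped like a possible API response.
doc = {"_id": "chr1:g.100A>G", "phewas": {"rsid": "rs0000001", "gene": "GENE1"}}

print(variant_url("chr12:g.56364321A>G"))
print(get_dotted(doc, "phewas.gene"))
```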
19. Accessible
Findable
Interoperable
Reusable
If you want fast and up-to-date access to gene, variant, chemical, and drug data.
If you want to quickly turn your data into a high-performance API.
If you built your API and want others to find it and use it together with other APIs for a specific workflow.
20. Scripps Research
Andrew Su (sulab.org)
Cyrus Afrasiabi
Sebastien Lelong
Jiwen (Kevin) Xin
Marco Cano Alvarado
Xinhua (Jerry) Zhou
Ginger Tsueng
Byung Ryul Jeon
Greg Taylor
Nina Moore
Maastricht Univ.
Michel Dumontier (dumontierlab.com)
Amrapali Zaveri
Kody Moodley
Trish Whetzel (EBI)
Shima Dastgheib (NuMedii)
Ruben Verborgh (Ghent Univ.)
Paul Avillach (Harvard)
Gabor Korodi (Harvard)
Raymond Terryn (Univ. of Miami)
Kathleen Jagodnik (Mount Sinai)
Pedro Assis (Stanford)
Funding support from
NIH Data Commons
API interoperability working group
Univ. of Washington
Sean Mooney
Vikas R Pejaver
Translator, CD2H