Chunlei Wu, Ph.D.
cwu@scripps.edu
@chunleiwu
https://wulab.io
Associate Professor
Dept. of Integrative Structural and Computational Biology
The Scripps Research Institute
La Jolla, CA, USA
07/23/2019
API – Application Programming Interface
Data API is a way to abstract the data-access layer.
[Diagram] Application 1 and Application 2 each have their own Presentation Layer (View), Business logic Layer (Controller), and Data Layer (Model).
Repetitive data wrangling:
• Parsing dump files
• ID conversion
• Data merging
• Data transformation
• Source monitoring
• Download scheduler
• … …
[Diagram] Application 1 and Application 2 keep their own Presentation Layer and Business logic Layer on top of a Common Data Layer.
It's about:
• Modularization
• Reusability
• DRY principle
photo credits:
http://www.edmentum.com/sites/edmentum.com/files/solutions/content/building_0.jpg
http://www.howcsharp.com/img/0/68/dont-repeat-yourself-dry-300x211.jpg
http://blog.capinc.com/wp-content/uploads/2013/02/Recycle_Logo_by_Har1-300x263.png
https://mygene.info
Avg. 10M requests from 14K unique IPs every month
{
  "_id": "1017",
  "symbol": "CDK2",
  "ensembl": "ENSG00000123374",
  "refseq": [
    "NM_001798",
    "NM_052827"
  ],
  "reporter": {
    "U95A": [
      "1792_g_at",
      "1833_at"
    ],
    "U133A": [
      "211804_s_at",
      "2045252_at",
      "211803_at"
    ]
  }
}
Source merging criteria: matching NCBI or Ensembl gene IDs
HGNC
MGI
RGD
Refseq
Ensembl
UniProt
UniGene
Homologene
PantherDB
GO
Reactome
Wikipathways
KEGG
PDB
PFAM
Interpro
Prosite
PIR
Pharmgkb
UMLS
Wikipedia
Pharos
…
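The merging criterion above (records from different sources are joined when their NCBI or Ensembl gene IDs match) can be illustrated with a minimal sketch. The function name `merge_by_gene_id`, the per-source records, and the last-source-wins handling of key collisions are all assumptions made for illustration; the slides do not describe how MyGene.info resolves conflicts.

```python
def merge_by_gene_id(records):
    """Merge per-source records that share a gene ID into one document each."""
    merged = {}
    for record in records:
        gene_id = record["_id"]
        doc = merged.setdefault(gene_id, {"_id": gene_id})
        # Copy every annotation field; a later source overwrites an earlier
        # one on key collisions (a simplification for this sketch).
        doc.update({k: v for k, v in record.items() if k != "_id"})
    return merged


# Hypothetical per-source records, all keyed by NCBI gene ID 1017.
records = [
    {"_id": "1017", "symbol": "CDK2"},
    {"_id": "1017", "ensembl": "ENSG00000123374"},
    {"_id": "1017", "refseq": ["NM_001798", "NM_052827"]},
]
merged = merge_by_gene_id(records)
print(merged["1017"])
```

Keying every source on the same identifier is what lets each new resource be added as one more record stream without touching the others.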
• Get gene object(s) via either NCBI or Ensembl gene IDs:
  - http://mygene.info/v3/gene/1017
  - http://mygene.info/v3/gene/ENSG00000123374
  - http://mygene.info/v3/gene/1017?fields=symbol,name,pathway,uniprot
• Find matching gene objects with any query terms:
  - http://mygene.info/v3/query?q=CDK2
  - http://mygene.info/v3/query?q=name:kinase&species=human
  - http://mygene.info/v3/query?q=name:kinase AND _exists_:pathway
  - http://mygene.info/v3/query?q=pathway.kegg.name:wnt&fields=entrezgene,symbol,taxid,interpro
Batch queries supported via POST
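As a rough sketch of the batch mode mentioned above, the following builds (but does not send) a POST request for several gene IDs using only the Python standard library. The form parameter names `ids` and `fields` reflect my understanding of the public MyGene.info documentation and should be treated as assumptions; `batch_gene_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

MYGENE = "https://mygene.info/v3"


def batch_gene_request(gene_ids, fields=None):
    """Build a POST request asking for several gene objects at once."""
    params = {"ids": ",".join(str(g) for g in gene_ids)}
    if fields:
        params["fields"] = ",".join(fields)
    # Form-encode the parameters into the request body.
    body = urlencode(params).encode("utf-8")
    return Request(
        f"{MYGENE}/gene",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = batch_gene_request([1017, "ENSG00000123374"], fields=["symbol", "name"])
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without network access.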
MyChem.info (http://mychem.info)
Aggregates annotations for 97 million drugs/chemicals from 12 resources
I have a list of drug/chemical IDs and want to get annotations about them.
Drug/chemical annotation service:
GET /v1/drug/<drugid>
POST /v1/drug/ (batch mode)
I want to get matching drugs/chemicals with my query term(s).
Drug/chemical query service:
GET /v1/query/?q=<query>
POST /v1/query/ (batch mode)

MyGene.info (http://mygene.info)
Aggregates annotations for 32 million genes from 30 resources
I have a list of gene IDs and want to get annotations about them.
Gene annotation service:
GET /v3/gene/<geneid>
POST /v3/gene/ (batch mode)
I want to get matching genes with my query term(s).
Gene query service:
GET /v3/query/?q=<query>
POST /v3/query/ (batch mode)

MyVariant.info (http://myvariant.info)
Aggregates annotations for 950 million variants from 21 resources
I have a list of variant IDs and want to get annotations about them.
Variant annotation service:
GET /v1/variant/<hgvsid>
POST /v1/variant/ (batch mode)
I want to get matching variants with my query term(s).
Variant query service:
GET /v1/query/?q=<query>
POST /v1/query/ (batch mode)
Other BioThings APIs:
• MyDisease.info: mydisease.info
• BioThings API for Taxonomy: t.biothings.io
Usability
• Simple to use
• Comprehensive
  - MyGene.info: 32M genes from 30K species
  - MyVariant.info: 874M variants (700M observed)
  - MyChem.info: 97M chemicals/drugs
• Developer-friendly (supports CORS, gzip, HTTPS, msgpack, etc.)
  - “fields” parameter to filter down the response to what’s needed
  - “fetch_all” feature for streaming large query results
• Python, R, JavaScript clients
Sustainability
• Always up-to-date (updated weekly)
• High-performance and scalable
• High-availability
ENTERPRISE GRADE
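The “fetch_all” feature listed above streams a large result set in successive batches. A minimal sketch of a client-side loop follows, assuming (from my reading of the BioThings documentation, not from these slides) that each response carries a `_scroll_id` which is passed back as `scroll_id` to get the next batch, and that results live under `hits`. The `fetch_page` callable is a hypothetical stand-in for the HTTP call, injected so the paging logic can run offline.

```python
def iter_all_hits(fetch_page, query):
    """Yield every hit for `query`, following scroll IDs until exhausted."""
    page = fetch_page({"q": query, "fetch_all": "true"})
    # Stop as soon as a response has no hits (or is an error payload).
    while page.get("hits"):
        yield from page["hits"]
        scroll_id = page.get("_scroll_id")
        if not scroll_id:
            break
        page = fetch_page({"scroll_id": scroll_id})


def fake_fetch(params):
    """Offline stand-in that serves two batches, then an end-of-results reply."""
    if "scroll_id" not in params:
        return {"_scroll_id": "abc", "hits": [{"_id": "1017"}, {"_id": "1018"}]}
    if params["scroll_id"] == "abc":
        return {"_scroll_id": "def", "hits": [{"_id": "1019"}]}
    return {"success": False, "error": "No results to return"}


all_ids = [hit["_id"] for hit in iter_all_hits(fake_fetch, "name:kinase")]
print(all_ids)
```

Writing the loop as a generator keeps memory use flat regardless of how many hits the query matches, which is the point of streaming.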
A Python package that turns data sources into a high-quality API
pip install biothings
https://pypi.org/project/biothings/
https://github.com/biothings/biothings_studio
MyGene.info data sources shown in BioThings Studio
New version downloaded
Needs developer attention
Sebastien
• Write your own parser
Example data source: PheWAS at https://phewascatalog.org/
Code base: https://github.com/kevinxin90/phewas
Tracked for updates
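To make the parser step concrete, here is a minimal sketch of what such a parser might look like. It assumes the convention, as I understand it from the BioThings Studio tutorial, that a parser exposes a `load_data(data_folder)` generator yielding one dictionary per document with an `_id` key. The file name `phewas-catalog.csv` and the column names are hypothetical, and the rsID is used as `_id` only for simplicity; the example API on the later slide is keyed by HGVS IDs.

```python
import csv
import os


def load_data(data_folder):
    """Yield one document per row of a PheWAS-style CSV file."""
    # Hypothetical file name; adjust to whatever the dumper downloaded.
    path = os.path.join(data_folder, "phewas-catalog.csv")
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            yield {
                # Assumption: rsID as document key, for illustration only.
                "_id": row["snp"],
                "phewas": {
                    "rsid": row["snp"],
                    "gene": row["gene_name"],
                    "phenotype": row["phewas phenotype"],
                    "p_value": float(row["p-value"]),
                },
            }
```

Nesting the source's fields under a single `phewas` key keeps them namespaced, which matches the `phewas.gene` and `phewas.rsid` field paths queried later in the deck.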
• Register as a new data source
dump data
upload data
inspect data
BioThings Studio is provided as a ready-to-start Docker image.
See tutorial at http://docs.biothings.io/en/latest/doc/studio.html
• Upload and inspect data from the parser
• Make a new “data-build”
• Create a new release and set up the API
• Start your API!
• Get PheWAS associations to a specific SNP:
  - http://localhost:8000/variant/chr12:g.56364321A>G
  - http://localhost:8000/query?q=rs1250552
  - http://localhost:8000/query?q=rs1250552&fields=phewas.gwas_associations,phewas.gene,phewas.rsid
• Find PheWAS associations to a gene:
  - http://localhost:8000/query?q=CDK2
• Find PheWAS associations to “Asthma”:
  - http://localhost:8000/query?q=phewas.gwas_associations:asthma
Accessible from Python/R/JavaScript biothings_client too
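When these URLs are built in code rather than typed into a browser, characters such as `>` in an HGVS ID need percent-encoding. The sketch below shows one way to do that with the standard library; the helper names `variant_url` and `query_url` are hypothetical, and the base URL is the local address used in the examples above.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8000"


def variant_url(hgvs_id):
    """URL for one variant; keeps ':' readable and encodes '>'."""
    return f"{BASE}/variant/{quote(hgvs_id, safe=':')}"


def query_url(q, fields=None):
    """URL for a query, optionally restricted to selected fields."""
    params = {"q": q}
    if fields:
        params["fields"] = ",".join(fields)
    return f"{BASE}/query?{urlencode(params)}"


print(variant_url("chr12:g.56364321A>G"))
# http://localhost:8000/variant/chr12:g.56364321A%3EG
print(query_url("rs1250552", fields=["phewas.gene", "phewas.rsid"]))
# http://localhost:8000/query?q=rs1250552&fields=phewas.gene%2Cphewas.rsid
```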
pending.biothings.io
Findable, Accessible, Interoperable, Reusable
If you want fast and up-to-date access to gene, variant, chemical, and drug data.
If you want to quickly turn your data into a high-performance API.
If you built your API and want others to find it and use it together with other APIs for a specific workflow.
Scripps Research
Andrew Su (sulab.org)
Cyrus Afrasiabi
Sebastien Lelong
Jiwen (Kevin) Xin
Marco Cano Alvarado
Xinhua (Jerry) Zhou
Ginger Tsueng
Byung Ryul Jeon
Greg Taylor
Nina Moore
Maastricht Univ.
Michel Dumontier
(dumontierlab.com)
Amrapali Zaveri
Kody Moodley
Trish Whetzel (EBI)
Shima Dastgheib (NuMedii)
Ruben Verborgh (Ghent Univ.)
Paul Avillach (Harvard)
Gabor Korodi (Harvard)
Raymond Terryn (Univ. of Miami)
Kathleen Jagodnik (Mount Sinai)
Pedro Assis (Stanford)
Funding support from
NIH Data Commons
API interoperability working group
Univ. of Washington
Sean Mooney
Vikas R Pejaver
Translator, CD2H

BioThings API: Promoting Best-practices via a Biomedical API Development Ecosystem
