AllBio EU CodeFest
bmpvieira.com/allbio14
Bruno Vieira (@bmpvieira)
PhD Student @ Bioinformatics and Population Genomics
Supervisor: Yannick Wurm (@yannick__)
Before:
© 2014 Bruno Vieira CC-BY 4.0
Some problems I faced 
during my research: 
Difficulty getting relevant descriptions 
and datasets from NCBI API using bio* libs 
For web projects, needed to implement 
the same functionality on browser and 
server 
Difficulty writing scalable, reproducible 
and complex bioinformatic pipelines
Bionode.io - Modular and universal bioinformatics
Pipeable UNIX command line tools and
JavaScript / Node.js APIs for bioinformatic
analysis workflows on the server and browser.
Collaborates with:
BioJS - Represent biological data on the web
Dat - Build data pipelines
Dat provides a streaming interface between every file
format and data storage backend. "git for data"
dat-data.com @maxogden @mafintosh
bionode.io (online shell) 
Examples 
BASH 
bionode-ncbi urls assembly Solenopsis invicta | grep genomic.fna
http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000188075.1_Si_gnG/GCA_000188075.1_Si_gnG_genomic.fna.gz
bionode-ncbi download sra arthropoda | bionode-sra
bionode-ncbi download gff bacteria
JavaScript
var ncbi = require('bionode-ncbi')
ncbi.urls('assembly', 'Solenopsis invicta', gotData)
function gotData(urls) {
  var genome = urls[0].genomic.fna
  // download() stands in for any function that fetches the URL
  download(genome)
}
Difficulty getting relevant description and 
datasets from NCBI API using bio* libs 
Python example 
import xml.etree.ElementTree as ET
from Bio import Entrez
Entrez.email = "mail@bmpvieira.com"
# Search the assembly database, then fetch a summary for each hit
esearch_handle = Entrez.esearch(db="assembly", term="Achromyrmex")
esearch_record = Entrez.read(esearch_handle)
for id in esearch_record['IdList']:
    esummary_handle = Entrez.esummary(db="assembly", id=id)
    esummary_record = Entrez.read(esummary_handle)
    documentSummarySet = esummary_record['DocumentSummarySet']
    document = documentSummarySet['DocumentSummary'][0]
    # 'Meta' is an XML fragment with no single root, so wrap it before parsing
    # (Python 3: the value is already a str, so no .encode() is needed)
    metadata_XML = document['Meta']
    metadata = ET.fromstring('<root>' + metadata_XML + '</root>')
    for entry in metadata[1]:
        print(entry.text)
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000188075.1_Si_gnG 
Solution: 
bionode-ncbi
Need to reimplement the same code on 
browser and server. 
Solution: JavaScript everywhere 
Afra 
SequenceServer 
GeneValidator 
BioJS 
Biodalliance 
is converting parsers to 
Bionode
Difficulty writing scalable, reproducible and 
complex bioinformatic pipelines. 
Solution: Node.js Streams everywhere 
var ncbi = require('bionode-ncbi')
var tool = require('tool-stream')
var through = require('through2')
// "dat" is not defined on the slide; it is assumed to be set up elsewhere
var fork1 = through.obj()
var fork2 = through.obj() // not used in the part of the pipeline shown here
ncbi
  .search('sra', 'Solenopsis invicta')
  .pipe(fork1)
  .pipe(dat.reads)
fork1
  .pipe(tool.extractProperty('expxml.Biosample.id'))
  .pipe(ncbi.search('biosample'))
  .pipe(dat.samples)
fork1
  .pipe(tool.extractProperty('uid'))
  .pipe(ncbi.link('sra', 'pubmed'))
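
The forking idea in the pipeline above can be tried without network access. What follows is only a minimal sketch using Node.js core streams, not bionode-ncbi or tool-stream themselves: the `records` array, the local `extractProperty` and `collect` helpers, and `runForkedPipeline` are hypothetical stand-ins invented for illustration. It shows that when one object stream is piped into two branches, each branch receives every record.

```javascript
const { Readable, Transform, PassThrough } = require('stream')

// Hypothetical helper: pluck one property from each object passing through.
function extractProperty (name) {
  return new Transform({
    objectMode: true,
    transform (record, _encoding, done) {
      done(null, record[name])
    }
  })
}

// Hypothetical helper: gather everything a stream emits into an array.
function collect (stream) {
  return new Promise((resolve, reject) => {
    const items = []
    stream.on('data', item => items.push(item))
    stream.on('end', () => resolve(items))
    stream.on('error', reject)
  })
}

// Made-up records standing in for search results.
const records = [
  { uid: '1', title: 'run A' },
  { uid: '2', title: 'run B' }
]

function runForkedPipeline () {
  const fork = new PassThrough({ objectMode: true })
  Readable.from(records).pipe(fork)
  // Two branches attached to the same fork: each one sees every record.
  const uids = collect(fork.pipe(extractProperty('uid')))
  const titles = collect(fork.pipe(extractProperty('title')))
  return Promise.all([uids, titles])
}

runForkedPipeline().then(([uids, titles]) => {
  console.log(uids, titles)
})
```

Both branches are attached before any data flows, so neither misses records; a branch attached later would only see what arrives after it was piped.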
Benefit from other JS 
projects 
Dat BioJS NoFlo
Reusable, small and tested 
modules
Some users and Contributors: 
Dat 
Biodalliance 
BioJS 
Yeo Lab 
(UC San Diego) 
Michael Lovci 
Olga Botvinnik 
Afra 
GeneValidator 
Soon: 
DNADigest
Thanks! 
Acknowledgements:
@yannick__ 
@maxogden 
@mafintosh 
@alanmrice 
@dasmoth 
@biodevops
Why Node.js / JavaScript
applies well to Bioinformatics
Streams
Easy to write CLI wrappers for Streams
Reusable, small and tested modules
Same language everywhere (JavaScript)
Package Manager that works (NPM)
Huge number of modules (93327, 199/day)
Use other JS projects (Dat, BioJS, NoFlo)
Possible to write Desktop GUI apps in JS
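
To illustrate the point about CLI wrappers for streams, here is a minimal sketch using only Node.js core modules. The names `splitLines`, `gcContent` and `gcPipeline`, and the convention of passing `-` to read from stdin, are assumptions made for this example and are not part of Bionode. The same stream pipeline can be used as a library call or wrapped as a pipeable command.

```javascript
const { Transform } = require('stream')

// Split incoming text into lines (assumes plain ASCII input such as DNA).
function splitLines () {
  let rest = ''
  return new Transform({
    readableObjectMode: true,
    transform (chunk, _encoding, done) {
      const parts = (rest + chunk).split('\n')
      rest = parts.pop() // keep the unfinished last line for the next chunk
      for (const part of parts) this.push(part)
      done()
    },
    flush (done) {
      if (rest) this.push(rest)
      done()
    }
  })
}

// One sequence per line in, "sequence<TAB>GC fraction" per line out.
function gcContent () {
  return new Transform({
    objectMode: true,
    transform (line, _encoding, done) {
      const seq = String(line).trim().toUpperCase()
      if (seq === '') return done() // skip blank lines
      const gc = (seq.match(/[GC]/g) || []).length
      done(null, seq + '\t' + (gc / seq.length).toFixed(2) + '\n')
    }
  })
}

// Library entry point: works on any readable stream.
function gcPipeline (input) {
  return input.pipe(splitLines()).pipe(gcContent())
}

// CLI wrapper: only reads stdin when "-" is given as the first argument.
if (process.argv[2] === '-') {
  gcPipeline(process.stdin).pipe(process.stdout)
}
```

Saved as, say, `gc.js` (a hypothetical file name), it could be used in a shell pipeline as `cat seqs.txt | node gc.js -`, while other modules can call `gcPipeline` on any stream.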
Module counts
Package Manager that works 
npm install bionode 
npm install bionode -g 
npm test 
npm start 
npm run test-browser 
npm run build-docs 
npm init 
npm publish 
Not only for JavaScript, C/C++ too: 
Node.js style C/C++ modules 
Native C/C++ running in Google V8
