BioRuby
•a bioinformatics
 library for the Ruby
 language
•>11 years - project
 since Nov. 21, 2000
BioRuby

 is an
 open-source
 project
          BUT, I HAVE A QUESTION...
Aspects of the word ‘OPEN’
 •OPEN for
  redistribution
 •OPEN for source
  code access
 •OPEN for
  contribution
CENTRALIZED APPROACH
• Pros
  –QC for stability and consistency
  –easy to apply a coding standard
  –enables extensive tests and documentation
• Cons
  –heavy burden on release managers
  –longer process, less frequent releases
  –lack of cutting-edge features
Two ways to participate in
  BioRuby development
1. Be a committer
  1.   be a trusted contributor in the community
  2. get an open-bio.org account
  3. be a CVS/SVN committer
2. Send patches to (busy) core-members
  1. wait for patch evaluation
  2. wait for next release of BioRuby
Actions of BioRuby
 •more OPEN for
  source code
  access
 •more OPEN for
  contribution
ACTION 1

  Social Coding Using GitHub

       In 2010, the BioRuby
       project source repository
       moved to GitHub
• Users can fork the code freely.
• Users still have to wait for
  acceptance of pull-requests to get
  their code incorporated into the
  official repository.
ACTION 2

Plug-in system - BioGem
DECENTRALIZED APPROACH
• Enables expanding BioRuby without
  tweaking its stable core
• plug-ins are maintained by their authors
• encourages ‘best practice’ using a tool
  (biogem command)
  – Standard directory structure
  – version control using Git
  – Using the RubyGems packaging system
  – testing and documentation
The Biogems workflow
Biogems.info – a portal site for Biogem users
 • rank in total downloads (rank up & down)
 • citation, current version, date of latest release
 • links to source code
 • status of Travis continuous integration
 • highly motivating (for me)
• Database / web-service API
  – bio ucsc api, intermine, eutils, sequenceserver, goruby, bio ensembl
• Wrapper
  – bio samtools, bio logger, bio bwa, bio signalp, bio sge, bio exportpred, bio tabix
• Application
  – scaffolder, genfrag, bio isoelectric point, bio phyta, bio tm hmm, dna sequence aligner, bio gag, bio kmer counter
• File Parser
  – bio gff3, bio assembly, bio blastxmlparser, bio faster, bio alignment, bio nexml, bio kb illumina, bio octopus, bio affy, bio dbsnp, bio rdf, bio hmmer model, bio hmmer3 report, bio pileup iterator, bio phyloxml
• Visualization
  – bio graphics
• Framework
  – bio ngs
• Toolbox
  – bio genomic interval, bio bigbio, bio hello, bio plasmoap, bio cnls screenscraper, bio data, bio aliphatic index, bio hydropathy, bio gngm
• Biogem Example
  – bio hello
• Biogem Collection
  – bio core
more than 60 Biogems...
(Same Biogem list, with the database access-related and next generation sequencing-related Biogems highlighted.)
Hiro Mishima
•   NOT a core
    developer of
    BioRuby
•   not a computer
    scientist but a
    dentist
•   semi-dry biologist
•   human geneticist
Ruby UCSC API
>40,000
tables!
How to get started


$ gem install bio-ucsc-api



A query written in a fluent interface.

 require 'bio-ucsc'
 # connect to the hg19 database
 Bio::Ucsc::Hg19.connect
 # look up one SNP by its rs ID in the snp131 table
 result =
   Bio::Ucsc::Hg19::Snp131.
   find_by_name("rs56289060")
 puts result.chrom # => "chr1"


SQL made easy
    region = "chr17:7,579,614-7,579,700"
    condition =
      Bio::Ucsc::Hg19::Snp131.
      with_interval(region).select(:name)
    puts condition.to_sql



SELECT name FROM `snp131`
WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0)
 AND ( (chromStart BETWEEN 7579613 AND 7579700)
    OR (chromEnd BETWEEN 7579613 AND 7579700)
    OR (chromStart <= 7579613 AND
        chromEnd >= 7579700) ));
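
The numbers in the generated SQL can be reproduced by hand. The sketch below is not the gem's own code: it assumes the standard UCSC binning scheme (five levels, 128 kb finest bins) and uses two hypothetical helpers, parse_region and overlapping_bins, to show how the 1-based region string becomes the 0-based start 7579613 and the bin list (642,80,9,1,0).

```ruby
# Sketch of the standard UCSC binning scheme (an assumption, not bio-ucsc-api code).
BIN_OFFSETS = [585, 73, 9, 1, 0].freeze  # finest level (128 kb bins) first
FIRST_SHIFT = 17                         # 2**17 bases = 128 kb
NEXT_SHIFT  = 3                          # each coarser level is 8x wider

# Hypothetical helper: parse "chr17:7,579,614-7,579,700" (1-based, inclusive)
# into [chrom, zero_based_start, end], the half-open form used by UCSC tables.
def parse_region(region)
  chrom, range = region.split(":")
  first, last = range.delete(",").split("-").map { |s| Integer(s) }
  [chrom, first - 1, last]
end

# Hypothetical helper: every bin in which a feature overlapping
# the half-open interval [zstart, zend) could be stored.
def overlapping_bins(zstart, zend)
  start_bin = zstart >> FIRST_SHIFT
  end_bin   = (zend - 1) >> FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = ((offset + start_bin)..(offset + end_bin)).to_a
    start_bin >>= NEXT_SHIFT   # move up to the next, coarser level
    end_bin   >>= NEXT_SHIFT
    bins
  end
end

chrom, zstart, zend = parse_region("chr17:7,579,614-7,579,700")
puts "#{chrom} #{zstart} #{zend}"            # chr17 7579613 7579700
puts overlapping_bins(zstart, zend).inspect  # [642, 80, 9, 1, 0]
```

Restricting the query to these bins first lets the database skip most rows before it evaluates the chromStart/chromEnd comparisons.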
FUTURE DIRECTION of BioGem
• QC by peer review is still important.
  –ensures stability and quality of code
   and documentation
  –educates plug-in authors
• R/Bioconductor has an excellent
  peer-review system
  –good coding style and well-formatted
   documentation
  –requires huge human resources and
   effort
Solutions would be…

• recommended collections
   • Bio-Core (Raoul J.P. Bonnal)
• loose/casual peer-review
• need to draw up guidelines for
  designing “good” biogems
ACKNOWLEDGMENTS
• All BioRuby contributors
• Ruby UCSC API
  – Jan Aerts
• The BioRuby Panel
  –   Raoul Bonnal
  –   Naohisa Goto
  –   Francesco Strozzi
  –   Toshiaki Katayama
  –   Pjotr Prins
• Dept. of Human Genetics, Nagasaki Univ.
  – Koh-ichiro Yoshiura
• Google Summer of Code students
• O|B|F – Open Bioinformatics Foundation

Neo4j
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
ScyllaDB
 
AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)
HarpalGohil4
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
leebarnesutopia
 
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsGetting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
ScyllaDB
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Fwdays
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
Enterprise Knowledge
 

Recently uploaded (20)

LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
Discover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched ContentDiscover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched Content
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
 
AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
 
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsGetting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
 

H Mishima - Biogem, Ruby UCSC API, and BioRuby

  • 2. BioRuby •a bioinformatics library for the Ruby language •>11 years - project since Nov. 21, 2000
  • 3. BioRuby is an open-source project BUT, I HAVE A QUESTION...
  • 5. Aspects of the word ‘OPEN’ •OPEN for redistribution •OPEN for source code access •OPEN for contribution
  • 6. CENTRALIZED APPROACH • Pros –QC for stability and consistency –easy to apply coding standard –enables extensive tests and documentation • Cons –heavy burden on release managers –longer process, sparser release –lack of cutting-edge features
  • 7. Two ways to participate in BioRuby development 1. Be a committer: (a) be a trusted contributor in the community, (b) get an open-bio.org account, (c) be a CVS/SVN committer. 2. Send patches to (busy) core-members: (a) wait for patch evaluation, (b) wait for next release of BioRuby.
  • 10. Actions of BioRuby •more OPEN for source code access •more OPEN for contribution
  • 11. ACTION 1 Social Coding Using GitHub In 2010, the BioRuby project source repository moved to GitHub
  • 12. • Users can fork the code freely. • Users still have to wait for acceptance of pull-requests to get their code incorporated into the official repository.
  • 14. DECENTRALIZED APPROACH • Enables expanding BioRuby without tweaking its stable core • plug-ins are maintained by their authors • encourage ‘best practice’ using a tool (biogem command) – Standard directory structure – version control using Git – Using the RubyGems packaging system – testing and documentation
  • 16. Biogems.info – a portal site for Biogem users • rank in total downloads (rank up & down), citation, current version, date of latest release, links to source code, status of Travis continuous integration • highly motivating (me)
  • 17. More than 60 Biogems... Categories on the slide: Database/web-service API, File Parser, Visualization, Framework, Toolbox, Wrapper, Application, Biogem Example, Biogem Collection. Gems shown: bio ucsc api, bio gff3, bio graphics, intermine, bio assembly, eutils, bio blastxmlparser, bio ngs, sequenceserver, bio faster, goruby, bio alignment, bio genomic interval, bio ensembl, bio nexml, bio bigbio, bio kb illumina, bio hello, bio samtools, bio octopus, bio plasmoap, bio logger, bio affy, bio cnls screenscraper, bio bwa, bio dbsnp, bio data, bio signalp, bio rdf, bio aliphatic index, bio sge, bio hmmer model, bio hydropathy, bio exportpred, bio hmmer3 report, bio gngm, bio tabix, bio pileup iterator, bio phyloxml, scaffolder, genfrag, bio isoelectric point, bio phyta, bio core, bio tm hmm, dna sequence aligner, bio gag, bio kmer counter.
  • 18. The Biogem overview from slide 17, with two added labels: Database Access-related; Next Generation Sequencing-related.
  • 19. Hiro Mishima • NOT a core developer of BioRuby • not a computer scientist but a dentist • semi-dry biologist • human geneticist
  • 22. How to get started: $ gem install bio-ucsc-api
  • 23. A query written in fluent interface.
        require 'bio-ucsc'
        Bio::Ucsc::Hg19.connect
        result = Bio::Ucsc::Hg19::Snp131.find_by_name("rs56289060")
        puts result.chrom # => "chr1"
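The term "fluent interface" on this slide refers to method calls that chain because each one returns an object that accepts the next call. The following is a minimal, self-contained sketch of that idea only. `ToyQuery` and its methods are hypothetical names made up for illustration; they are not part of bio-ucsc-api, and the gem's real classes talk to the UCSC database rather than building strings like this.

```ruby
# Hypothetical toy query builder illustrating a fluent (chainable) interface.
# Not the bio-ucsc-api implementation.
class ToyQuery
  def initialize(table)
    @table = table
    @columns = ["*"]
    @conditions = []
  end

  # Choose the columns to return; returns self so calls can be chained.
  def select(*columns)
    @columns = columns.map(&:to_s)
    self
  end

  # Add a raw SQL condition; returns self so calls can be chained.
  def where(condition)
    @conditions << condition
    self
  end

  # Render the accumulated state as a SQL string.
  def to_sql
    sql = "SELECT #{@columns.join(', ')} FROM `#{@table}`"
    sql += " WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    sql
  end
end

query = ToyQuery.new("snp131").where("name = 'rs56289060'").select(:name, :chrom)
puts query.to_sql
```

Because every builder method returns `self`, the order of `where` and `select` in the chain does not matter, and nothing is rendered until `to_sql` is called.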
  • 24. SQL made easy
        region = "chr17:7,579,614-7,579,700"
        condition = Bio::Ucsc::Hg19::Snp131.with_interval(region).select(:name)
        puts condition.to_sql
        SELECT name FROM `snp131`
        WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0) AND (
          (chromStart BETWEEN 7579613 AND 7579700) OR
          (chromEnd BETWEEN 7579613 AND 7579700) OR
          (chromStart <= 7579613 AND chromEnd >= 7579700) ));
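The numbers in the generated SQL can be reproduced by hand. The sketch below assumes the standard UCSC binning scheme (five bin levels, a first shift of 17 bits and 3 bits per further level) and a conversion from the 1-based region string to a 0-based start. The helper names `parse_region` and `overlapping_bins` are hypothetical and are not claimed to be how the gem computes them internally.

```ruby
# Standard UCSC binning constants (assumed; see the lead-in).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0].freeze
BIN_FIRST_SHIFT = 17 # the smallest bins cover 128 kb
BIN_NEXT_SHIFT = 3   # each coarser level covers 8x more

# Parse "chr17:7,579,614-7,579,700" into [chrom, 0-based start, end].
def parse_region(region)
  m = /\A(\w+):([\d,]+)-([\d,]+)\z/.match(region)
  raise ArgumentError, "bad region: #{region}" unless m

  start_one_based = m[2].delete(",").to_i
  stop = m[3].delete(",").to_i
  [m[1], start_one_based - 1, stop]
end

# List every bin, from finest to coarsest level, that can hold a
# feature overlapping the half-open interval [zero_start, stop).
def overlapping_bins(zero_start, stop)
  start_bin = zero_start >> BIN_FIRST_SHIFT
  end_bin = (stop - 1) >> BIN_FIRST_SHIFT
  bins = []
  BIN_OFFSETS.each do |offset|
    (start_bin..end_bin).each { |b| bins << offset + b }
    start_bin >>= BIN_NEXT_SHIFT
    end_bin >>= BIN_NEXT_SHIFT
  end
  bins
end

chrom, zero_start, stop = parse_region("chr17:7,579,614-7,579,700")
bins = overlapping_bins(zero_start, stop)
puts "#{chrom} #{zero_start} #{stop} bins=(#{bins.join(',')})"
# prints: chr17 7579613 7579700 bins=(642,80,9,1,0)
```

The result matches the `bin in (642,80,9,1,0)` clause and the 7579613/7579700 bounds on the slide. Restricting the query to these bins lets the database use the bin index instead of scanning every row on the chromosome.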
  • 26. FUTURE DIRECTION of BioGem • QC by peer review is still important. –ensures stability and quality of code and documentation –educates plug-in authors • R/Bioconductor has an excellent peer-review system –good coding style and well-formatted documentation –requires huge human resources and effort
  • 27. Solutions would be… • recommended collections • Bio-Core (Raoul J.P. Bonnal) • loose/casual peer-review • need to draw up guidelines for designing “good” biogems
  • 29. ACKNOWLEDGMENTS • All BioRuby contributors • Ruby UCSC API – Jan Aerts • The BioRuby Panel – Raoul Bonnal – Naohisa Goto – Francesco Strozzi – Toshiaki Katayama – Pjotr Prins • Dept. of Human Genetics, Nagasaki Univ. – Koh-ichiro Yoshiura • Google Summer of Code students • O|B|F – Open Bioinformatics Foundation
  • 30. or mishima_eng