H Mishima - Biogem, Ruby UCSC API, and BioRuby

  1. BioRuby •a bioinformatics library for the Ruby language •>11 years - project since Nov. 21, 2000
  2. BioRuby is an open-source project BUT, I HAVE A QUESTION...
  3. Aspects of the word ‘OPEN’ •OPEN for redistribution •OPEN for source code access •OPEN for contribution
  4. CENTRALIZED APPROACH • Pros –QC for stability and consistency –easy to apply coding standards –enables extensive tests and documentation • Cons –heavy burden on release managers –longer process, less frequent releases –lack of cutting-edge features
  5. Two ways to participate in BioRuby development 1. Be a committer 1. be a trusted contributor in the community 2. get an open-bio.org account 3. be a CVS/SVN committer 2. Send patches to (busy) core-members 1. wait for patch evaluation 2. wait for next release of BioRuby
  7. Actions of BioRuby •more OPEN for source code access •more OPEN for contribution
  8. ACTION 1 Social Coding Using GitHub In 2010, the BioRuby project source repository moved to GitHub
  9. • Users can fork the code freely. • Users still have to wait for acceptance of pull-requests to get their code incorporated into the official repository.
  10. ACTION 2 Plug-in system - BioGem
  11. DECENTRALIZED APPROACH • Enables expanding BioRuby without tweaking its stable core • plug-ins are maintained by their authors • encourage ‘best practice’ using a tool (biogem command) – Standard directory structure – version control using Git – Using the RubyGems packaging system – testing and documentation
  12. The Biogems workflow
  13. Biogems.info – a portal site for Biogem users • ranking by total downloads (with rank changes up and down) • citation, current version, date of latest release, links to source code, status of Travis continuous integration • highly motivating (for me)
  14. Database /web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag bio kmer counter more than 60 Biogems...
  15. Database /web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag Database Access-related bio kmer counter Next Generation Sequencing-related
  16. Hiro Mishima • NOT a core developer of BioRuby • not a computer scientist but a dentist • semi-dry biologist • human geneticist
  17. Ruby UCSC API
  18. >40,000 tables!
  19. How to get started $ gem install bio-ucsc-api
  20. A query written in fluent interface. require 'bio-ucsc' Bio::Ucsc::Hg19.connect result = Bio::Ucsc::Hg19::Snp131.find_by_name("rs56289060") puts result.chrom # => "chr1"
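As a rough illustration of what "fluent interface" means here, the sketch below chains calls that each return a new query object. `ToyQuery` and its methods are hypothetical names invented for this example; it is a standalone toy, not the gem's implementation, and it does not connect to any database.

```ruby
# Hypothetical toy class (NOT part of bio-ucsc-api): each call returns a
# new query object, so calls can be chained in a fluent style.
class ToyQuery
  def initialize(table, columns: ["*"], conditions: [])
    @table = table
    @columns = columns
    @conditions = conditions
  end

  # Choose the columns to return; yields a new ToyQuery.
  def select(*cols)
    ToyQuery.new(@table, columns: cols.map(&:to_s), conditions: @conditions)
  end

  # Add one raw SQL condition; yields a new ToyQuery.
  # (A real library would escape values instead of taking raw strings.)
  def where(condition)
    ToyQuery.new(@table, columns: @columns, conditions: @conditions + [condition])
  end

  # Render the accumulated state as a SQL string.
  def to_sql
    sql = "SELECT #{@columns.join(', ')} FROM `#{@table}`"
    sql += " WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    sql
  end
end

query = ToyQuery.new("snp131").where("name = 'rs56289060'").select(:chrom)
puts query.to_sql
```

Returning a fresh object from every step keeps intermediate queries reusable, which is one common way to build chainable APIs.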
  21. SQL made easy region = "chr17:7,579,614-7,579,700" condition = Bio::Ucsc::Hg19::Snp131.with_interval(region).select(:name) puts condition.to_sql # => SELECT name FROM `snp131` WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0) AND ( (chromStart BETWEEN 7579613 AND 7579700) OR (chromEnd BETWEEN 7579613 AND 7579700) OR (chromStart <= 7579613 AND chromEnd >= 7579700) ));
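The `bin in (642,80,9,1,0)` list in the SQL above is consistent with the standard UCSC binning scheme (five levels, finest bins of 128 kb). The sketch below shows how such a list and WHERE clause could be derived from a region string. The helper names `parse_region`, `overlapping_bins` and `interval_condition` are hypothetical, and this is not necessarily how the gem computes them internally.

```ruby
# Standard UCSC binning scheme: 5 levels, finest bins span 2**17 = 128 kb,
# and each coarser level covers 8x (2**3) more sequence.
BIN_OFFSETS     = [585, 73, 9, 1, 0].freeze
BIN_FIRST_SHIFT = 17
BIN_NEXT_SHIFT  = 3

# Hypothetical helper: parse "chr17:7,579,614-7,579,700" (1-based, inclusive)
# into [chrom, zero_based_start, end], the convention UCSC tables use.
def parse_region(region)
  m = region.delete(",").match(/\A(\w+):(\d+)-(\d+)\z/)
  raise ArgumentError, "bad region: #{region}" unless m
  [m[1], m[2].to_i - 1, m[3].to_i]
end

# Every bin that may hold a feature overlapping [start, stop) (0-based, half-open).
def overlapping_bins(start, stop)
  first = start >> BIN_FIRST_SHIFT
  last  = (stop - 1) >> BIN_FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = ((offset + first)..(offset + last)).to_a
    first >>= BIN_NEXT_SHIFT   # move up to the next, coarser level
    last  >>= BIN_NEXT_SHIFT
    bins
  end
end

# Assemble a WHERE clause shaped like the one on the slide.
def interval_condition(region)
  chrom, start, stop = parse_region(region)
  bins = overlapping_bins(start, stop).join(",")
  "chrom = '#{chrom}' AND bin in (#{bins}) AND " \
    "((chromStart BETWEEN #{start} AND #{stop}) OR " \
    "(chromEnd BETWEEN #{start} AND #{stop}) OR " \
    "(chromStart <= #{start} AND chromEnd >= #{stop}))"
end

_chrom, start, stop = parse_region("chr17:7,579,614-7,579,700")
p overlapping_bins(start, stop)   # => [642, 80, 9, 1, 0]
puts interval_condition("chr17:7,579,614-7,579,700")
```

Restricting the query to a handful of bins lets the database use its index on `bin` before applying the coordinate comparisons, which is why the bin list appears ahead of the `chromStart`/`chromEnd` tests.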
  22. FUTURE DIRECTION of BioGem • QC by peer review is still important. –ensures stability and quality of code and documentation –educates plug-in authors • R/Bioconductor has an excellent peer-review system –good coding style and well-formatted documentation –requires huge human resources and effort
  23. Solutions would be… • recommended collections • Bio-Core (Raoul J.P. Bonnal) • loose/casual peer-review • need to draw up guidelines for designing “good” biogems
  24. ACKNOWLEDGMENTS • All BioRuby contributors • Ruby UCSC API – Jan Aerts • The BioRuby Panel – Raoul Bonnal – Naohisa Goto – Francesco Strozzi – Toshiaki Katayama – Pjotr Prins • Dept. of Human Genetics, Nagasaki Univ. – Koh-ichiro Yoshiura • Google Summer of Code students • O|B|F – Open Bioinformatics Foundation
  25. or mishima_eng