H Mishima - Biogem, Ruby UCSC API, and BioRuby

Presentation by H Mishima at BOSC2012: Biogem, Ruby UCSC API, and BioRuby

H Mishima - Biogem, Ruby UCSC API, and BioRuby

  1. 1. BioRuby • a bioinformatics library for the Ruby language • >11 years: project since Nov. 21, 2000
  2. 2. BioRuby is an open-source project BUT, I HAVE A QUESTION...
  3. 3. Aspects of the word ‘OPEN’ •OPEN for redistribution •OPEN for source code access •OPEN for contribution
  4. 4. CENTRALIZED APPROACH • Pros – QC for stability and consistency – easy to apply coding standards – enables extensive tests and documentation • Cons – heavy burden on release managers – longer process, sparser releases – lack of cutting-edge features
  5. 5. Two ways to participate in BioRuby development: 1. Be a committer: (a) be a trusted contributor in the community, (b) get an open-bio.org account, (c) be a CVS/SVN committer. 2. Send patches to (busy) core members: (a) wait for patch evaluation, (b) wait for the next release of BioRuby
  6. 6. Two ways to participate in BioRuby development: 1. Be a committer: (a) be a trusted contributor in the community, (b) get an open-bio.org account, (c) be a CVS/SVN committer. 2. Send patches to (busy) core members: (a) wait for patch evaluation, (b) wait for the next release of BioRuby
  7. 7. Actions of BioRuby • more OPEN for source code access • more OPEN for contribution
  8. 8. ACTION 1: Social Coding Using GitHub. In 2010, the BioRuby project source repository moved to GitHub
  9. 9. • Users can fork the code freely. • Users still have to wait for acceptance of pull requests to get their code incorporated into the official repository.
  10. 10. ACTION 2: Plug-in system - BioGem
  11. 11. DECENTRALIZED APPROACH • Enables expanding BioRuby without tweaking its stable core • Plug-ins are maintained by their authors • Encourages ‘best practice’ using a tool (biogem command) – standard directory structure – version control using Git – packaging with the RubyGems system – testing and documentation
  12. 12. The Biogems workflow
  13. 13. Biogems.info – a portal site for Biogem users. Biogems.info shows: rank in total downloads (with rank up and down), citation, current version, date of the latest release, links to source code, status of Travis continuous integration. Highly motivating (for me)
  14. 14. Database /web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag bio kmer counter more than 60 Biogems...
  15. 15. Database /web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag bio kmer counter (legend: Database Access-related, Next Generation Sequencing-related)
  16. 16. Hiro Mishima • NOT a core developer of BioRuby • not a computer scientist but a dentist • semi-dry biologist • human geneticist
  17. 17. Ruby UCSC API
  18. 18. >40,000 tables!
  19. 19. How to get started: $ gem install bio-ucsc-api
  20. 20. A query written in fluent interface.
        require 'bio-ucsc'
        Bio::Ucsc::Hg19.connect
        result = Bio::Ucsc::Hg19::Snp131.find_by_name("rs56289060")
        puts result.chrom # => "chr1"
  21. 21. SQL made easy
        region = "chr17:7,579,614-7,579,700"
        condition = Bio::Ucsc::Hg19::Snp131.with_interval(region).select(:name)
        puts condition.to_sql
      Output:
        SELECT name FROM `snp131`
        WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0) AND (
          (chromStart BETWEEN 7579613 AND 7579700) OR
          (chromEnd BETWEEN 7579613 AND 7579700) OR
          (chromStart <= 7579613 AND chromEND >= 7579700) ));
  22. 22. FUTURE DIRECTION of BioGem • QC by peer review is still important. – ensures stability and quality of code and documentation – educates plug-in authors • R/Bioconductor has an excellent peer-review system – good coding style and well-formatted documentation – requires huge human resources and effort
  23. 23. Solutions would be… • recommended collections – Bio-Core (Raoul J.P. Bonnal) • loose/casual peer review • need to draw up guidelines for designing “good” biogems
  24. 24. ACKNOWLEDGMENTS • All BioRuby contributors • Ruby UCSC API – Jan Aerts • The BioRuby Panel – Raoul Bonnal – Naohisa Goto – Francesco Strozzi – Toshiaki Katayama – Pjotr Prins • Dept. of Human Genetics, Nagasaki Univ. – Koh-ichiro Yoshiura • Google Summer of Code students • O|B|F – Open Bioinformatics Foundation
  25. 25. or mishima_eng
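
The SQL on slide 21 contains the clause `bin in (642,80,9,1,0)`. These values are consistent with the standard UCSC Genome Browser binning scheme, in which each table row carries a precomputed bin number so that a range query only has to scan a few bins. The following is a minimal standalone Ruby sketch of how such a condition could be derived from a region string. It is not the Ruby UCSC API's own implementation: the method names `parse_region`, `overlapping_bins`, and `interval_condition` are hypothetical, and the sketch assumes the standard scheme (five levels, finest bin 128 kb, chromosomes up to 512 Mb) and a 1-based inclusive region string converted to 0-based half-open coordinates.

```ruby
# Offsets of the five bin levels, finest (128 kb) to coarsest (512 Mb),
# as used by the standard UCSC binning scheme.
BIN_OFFSETS     = [585, 73, 9, 1, 0].freeze
BIN_FIRST_SHIFT = 17 # 2**17 bases = 128 kb, the finest bin size
BIN_NEXT_SHIFT  = 3  # each coarser level is 8 times larger

# Hypothetical helper: "chr17:7,579,614-7,579,700" => ["chr17", 7579613, 7579700]
# (1-based inclusive input, 0-based half-open output).
def parse_region(region)
  m = region.delete(",").match(/\A(\w+):(\d+)-(\d+)\z/)
  raise ArgumentError, "bad region: #{region}" unless m
  [m[1], m[2].to_i - 1, m[3].to_i]
end

# Hypothetical helper: every bin, at every level, that can hold a feature
# overlapping the 0-based half-open range [chrom_start, chrom_end).
def overlapping_bins(chrom_start, chrom_end)
  first = chrom_start >> BIN_FIRST_SHIFT
  last  = (chrom_end - 1) >> BIN_FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = (first..last).map { |b| offset + b }
    first >>= BIN_NEXT_SHIFT # move up to the next, coarser level
    last  >>= BIN_NEXT_SHIFT
    bins
  end
end

# Hypothetical helper: a WHERE condition shaped like the one on slide 21.
# Illustration only; parse_region restricts the input to word characters
# and digits before it is interpolated.
def interval_condition(region)
  chrom, s, e = parse_region(region)
  bins = overlapping_bins(s, e).join(",")
  "(chrom = '#{chrom}' AND bin in (#{bins}) AND (" \
    "(chromStart BETWEEN #{s} AND #{e}) OR " \
    "(chromEnd BETWEEN #{s} AND #{e}) OR " \
    "(chromStart <= #{s} AND chromEnd >= #{e})))"
end

puts interval_condition("chr17:7,579,614-7,579,700")
```

For the region on slide 21, `overlapping_bins(7579613, 7579700)` yields 642, 80, 9, 1, 0, the same bin list that appears in the slide's SQL. Restricting on `bin` first lets the database use an index on (chrom, bin) before the more expensive coordinate comparisons are applied.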
