This document discusses three challenges facing open source biological software projects: community engagement, integration with other tools, and democratization (making tools and data more accessible). It gives examples of how the Biopython project addresses them, including the Google Summer of Code program, improved documentation, and cloud computing resources that make biological data and tools easier to distribute and access.
1. Community Integration Democratization
Biopython: challenges
Brad Chapman
Peter Cock
Biopython contributors
http://biopython.org
10 July 2010
2. Community Integration Democratization
3 challenges for successful open source projects
Community
Integration
Democratization
3. Community Integration Democratization
Distributed code access
4. Community Integration Democratization
Recruiting and training
Google Summer of Code
2009 Eric Talevich
phyloXML; Bio.Phylo
Nick Matzke
Biogeographical Phylogenetics
2010 João Rodrigues
Structural biology; Bio.PDB
5. Community Integration Democratization
Answering questions better
6. Community Integration Democratization
Recognizing contributions
7. Community Integration Democratization
Diversity of Python bioinformatics
8. Community Integration Democratization
Interoperability
Avoid re-implementation
Convert core objects
Document workflows with multiple libraries
Communicate better
10. Community Integration Democratization
Documenting standards
11. Community Integration Democratization
Making code easier to use
>>> from Bio import SeqIO
>>> memory_dict = SeqIO.index("in.gb", "genbank")
>>> memory_dict.keys()
['Z78484.1', ... 'Z78471.1']
>>> seq_record = memory_dict["Z78475.1"]
>>> print seq_record.description
P.supardii 5.8S rRNA gene and ITS1 and ITS2 DNA
>>> seq_record.seq
Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGG...GGT', IUPACAmbiguousDNA())
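The session above gets dictionary-style access to records without loading the whole file. A rough way to picture the idea is an index of byte offsets: scan the file once, remember where each record starts, and seek there on demand. The sketch below illustrates that idea for a toy FASTA file using only the standard library. It is not Biopython's implementation; the function names build_fasta_index and fetch_sequence, and the toy records, are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def build_fasta_index(path):
    """Map each record ID to the byte offset of its '>' header line."""
    offsets = {}
    with open(path, "rb") as handle:
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                break  # end of file
            if line.startswith(b">"):
                fields = line[1:].split()
                if fields:
                    # The ID is the first whitespace-delimited token after '>'.
                    offsets[fields[0].decode("ascii")] = position
    return offsets


def fetch_sequence(path, offsets, record_id):
    """Seek to one record and read only that record's sequence lines."""
    chunks = []
    with open(path, "rb") as handle:
        handle.seek(offsets[record_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            chunks.append(line.strip())
    return b"".join(chunks).decode("ascii")


if __name__ == "__main__":
    # Toy data, made up for this sketch (not real sequences).
    toy = b">rec1 first toy record\nACGT\nACGT\n>rec2 second toy record\nGGCC\nTTAA\n"
    with tempfile.NamedTemporaryFile(suffix=".fasta", delete=False) as tmp:
        tmp.write(toy)
        toy_path = tmp.name
    try:
        index = build_fasta_index(toy_path)
        print(sorted(index))  # record IDs found in the file
        print(fetch_sequence(toy_path, index, "rec2"))
    finally:
        os.remove(toy_path)
```

Only the offsets are kept in memory, so the cost of the index grows with the number of records rather than with the size of the sequences, which is the property that makes this style of access attractive for large files.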
12. Community Integration Democratization
Challenges of big data
13. Community Integration Democratization
Cloud: easier to distribute
On-demand computational resources like Amazon EC2
Provide ready-to-go images
Biopython and many associated bioinformatics libraries
Biological data
http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/
14. Community Integration Democratization
Following up
Home http://biopython.org
Code http://github.com/biopython
BOSC Talk to Eric, Tiago, or me