Biopython at BOSC 2010

Brad Chapman, Biologist and Programmer at Mass General Hospital, Boston
Community           Integration        Democratization




            Biopython: challenges

                 Brad Chapman
                   Peter Cock
              Biopython contributors
             http://biopython.org


                  10 July 2010




    3 challenges for successful open source
    projects

            Community
            Integration
            Democratization



Distributed code access



Recruiting and training
    Google Summer of Code

            2009   Eric Talevich
                   phyloXML; Bio.Phylo
                   Nick Matzke
                   Biogeographical Phylogenetics
            2010   João Rodrigues
                   Structural biology; Bio.PDB



Answering questions better



Recognizing contributions



Diversity of Python bioinformatics



Interoperability


            Avoid re-implementation
            Convert core objects
            Document workflows with multiple
            libraries
            Communicate better



Wrapping external tools


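    import subprocess
    from Bio.Blast.Applications import (
            NcbiblastxCommandline)
    # Build the blastx command line; outfmt=5 requests XML output.
    cl = NcbiblastxCommandline(query="opuntia.fasta",
            db="nr", evalue=0.001, outfmt=5,
            out="opuntia.xml")
    # str(cl) is a single command string, so run it through the shell.
    subprocess.call(str(cl), shell=True)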
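
The wrapper idea on this slide can be illustrated without Biopython installed. What follows is a minimal sketch, assuming a hypothetical `CommandLine` helper (it is not the Bio.Blast.Applications API), that turns keyword options into an argument list. Passing a list to `subprocess.run` avoids the need for a shell.

```python
import shlex


class CommandLine:
    """Hypothetical helper: builds an argument list from keyword options."""

    def __init__(self, program, **options):
        self.program = program
        self.options = options

    def as_args(self):
        # One "-name value" pair per option, in the order given.
        args = [self.program]
        for name, value in self.options.items():
            args.extend(["-" + name, str(value)])
        return args

    def __str__(self):
        # Shell-quoted form, useful for logging or copy-and-paste.
        return " ".join(shlex.quote(arg) for arg in self.as_args())


cl = CommandLine("blastx", query="opuntia.fasta", db="nr",
                 evalue=0.001, outfmt=5, out="opuntia.xml")
print(cl)
# To execute (needs the tool on PATH):
#   subprocess.run(cl.as_args(), check=True)
```

Keeping the options as structured data until the last moment is the design choice here: the same object can be printed, logged, or executed.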



Documenting standards



Making code easier to use

    >>> from Bio import SeqIO
    >>> memory_dict = SeqIO.index("in.gb", "genbank")
    >>> memory_dict.keys()
    ['Z78484.1', ... 'Z78471.1']
    >>> seq_record = memory_dict["Z78475.1"]
    >>> print(seq_record.description)
    P.supardii 5.8S rRNA gene and ITS1 and ITS2 DNA
    >>> seq_record.seq
    Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGG...GGT',
            IUPACAmbiguousDNA())
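
The general idea behind dictionary-style access to a large sequence file can be sketched with the standard library alone. This is an illustration under stated assumptions, not Biopython's implementation: it uses FASTA rather than GenBank for brevity, and `index_fasta`, `fetch`, and the demo data are hypothetical. The file is scanned once to record the byte offset of each record, and a lookup then seeks straight to that offset.

```python
import os
import tempfile


def index_fasta(path):
    """Map each record ID to the byte offset of its '>' header line."""
    offsets = {}
    with open(path, "rb") as handle:
        while True:
            pos = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                record_id = line[1:].split()[0].decode("ascii")
                offsets[record_id] = pos
    return offsets


def fetch(path, offsets, record_id):
    """Seek to one record and return (header, sequence) as strings."""
    with open(path, "rb") as handle:
        handle.seek(offsets[record_id])
        header = handle.readline()[1:].strip().decode("ascii")
        chunks = []
        for line in handle:
            if line.startswith(b">"):
                break
            chunks.append(line.strip().decode("ascii"))
    return header, "".join(chunks)


# Tiny made-up FASTA file, for demonstration only.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 first example\nACGT\nACGT\n>seq2 second example\nGGCC\n")
    demo_path = tmp.name

demo_index = index_fasta(demo_path)
demo_record = fetch(demo_path, demo_index, "seq2")
os.remove(demo_path)
```

Only the ID-to-offset mapping is held in memory, so the approach stays cheap even when the file itself is far larger than RAM.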



Challenges of big data



Cloud: easier to distribute

            On-demand computational resources like
            Amazon EC2
            Provide ready-to-go images
            Biopython and many associated
            bioinformatics libraries
            Biological data
    http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/



Following up


       Home http://biopython.org
        Code http://github.com/biopython
       BOSC Talk to Eric, Tiago or myself