SlideShare a Scribd company logo
1 of 16
Download to read offline
Biopython Project Update
     Peter Cock, Plant Pathology, SCRI, Dundee, UK
10th Annual Bioinformatics Open Source Conference (BOSC)
           Stockholm, Sweden, 28 June 2009
Contents

•  Brief introduction to Biopython & history
•  Releases since BOSC 2008
•  Current and future projects
•  CVS, git and github
•  BoF hackathon and tutorial at BOSC 2009
Biopython

•  Free, open source library for bioinformatics
•  Supported by Open Bioinformatics Foundation
•  Runs on Windows, Linux, Mac OS X, etc
•  International team of volunteer developers
•  Currently about three releases per year
•  Extensive “Biopython Tutorial & Cookbook”
•  See www.biopython.org for details
Biopython’s Ten Year History

1999 •  Started by Jeff Chang & Andrew Dalke
2000 •  First release, Biopython 0.90
2001 •  Biopython 1.00, “semi-complete”
 …    •  Biopython 1.10, …, 1.41
2007 •  Biopython 1.43 (Bio.SeqIO), 1.44
2008 •  Biopython 1.45, 1.47, 1.48, 1.49
2009 •  Biopython 1.50, 1.51beta
      •  OA Publication, Cock et al.
Biopython Publication – Cock et al. 2009




          N.B. Open Access!
November 2008 – Biopython 1.49

•  Support for Python 2.6
•  Switched from “Numeric” to “NumPy”
   (important Numerical library for Python)
•  More biological methods on core Seq object
April 2009 – Biopython 1.50

•  New Bio.Motif module for sequence motifs
   (to replace Bio.AliceAce and Bio.MEME)
•  Support for QUAL and FASTQ in Bio.SeqIO
   (important NextGen sequencing formats)
•  Integration of GenomeDiagram for figures
   (Pritchard et al. 2006)
Biopython 1.50 includes GenomeDiagram
                      De novo assembly
                      of 42kb phage from
                      Roche 454 data
                         “Feature Track”
                         showing ORFs

                         Scale tick marks

                         “Barchart Track” of
                         read depth (~100,
                         scale max 200)
Reading a FASTA file with Bio.SeqIO
>FL3BO7415JACDX	
TTAATTTTATTTTGTCGGCTAAAGAGATTTTTAGCTAAACGTTCAATTGCTTTAGCTGAA	
GTACGAGCAGATACTCCAATCGCAATTGTTTCTTCATTTAAAATTAGCTCGTCGCCACCT	
TCAATTGGAAATTTATAATCACGATCTAACCAGATTGGTACATTATGTTTTGCAAATCTT	
GGATGATATTTAATGATGTACTCCATGAATAATGATTCACGTCTACGCGCTGGTTCTCTC	
ATCTTATTTATCGTTAAGCCA	
>FL3BO7415I7AFR	
...	



from Bio import SeqIO	
for rec in SeqIO.parse(open("phage.fasta"), "fasta") :	
    print rec.id, len(rec.seq), rec.seq[:10]+"..."	


FL3BO7415JACDX   261   TTAATTTTAT...	
FL3BO7415I7AFR
FL3BO7415JCAY5
                 267
                 136
                       CATTAACTAA...	
                       TTTCTTTTCT...	                           Focus on the
FL3BO7415JB41R   208   CTCTTTTATG...	
FL3BO7415I6HKB
FL3BO7415I63UC
                 268
                 219
                       GGTATTTGAA...	
                       AACATGTGAG...	
                                                                filename and
...	
                                                                format (“fasta”)…
Reading a FASTQ file with Bio.SeqIO
@FL3BO7415JACDX	
TTAATTTTATTTTGTCGGCTAAAGAGATTTTTAGCTAAACGTTCAATTGCTTTAGCTGAAGTACGAGCAGATACTCCAATCGCAATTGTTTCTTC
ATTTAAAATTAGCTCGTCGCCACCTTCAATTGGAAATTTATAATCACGATCTAACCAGATTGGTACATTATGTTTTGCAAATCTTGGATGATATT
TAATGATGTACTCCATGAATAATGATTCACGTCTACGCGCTGGTTCTCTCATCTTATTTATCGTTAAGCCA	
+	
BBBB2262=1111FFGGGHHHHIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGGFFFFFFFFFFFFFFFFFGB
BBCFFFFFFFFFFFFFFFFFFFFFFFGGGGGGGIIIIIIIGGGIIIGGGIIGGGG@AAAAA?===@@@???	
@FL3BO7415I7AFR	
...	


from Bio import SeqIO	
for rec in SeqIO.parse(open("phage.fastq"), "fastq") :	
    print rec.id, len(rec.seq), rec.seq[:10]+"..."	
    print rec.letter_annotations["phred_quality"][:10], "..."	

FL3BO7415JACDX 261 TTAATTTTAT...	
[33, 33, 33, 33, 17, 17, 21, 17, 28,
FL3BO7415I7AFR 267 CATTAACTAA...	
                                       16] ...	
                                                                   Just filename and
[37, 37, 37, 37, 37, 37, 37, 37, 38,   38] ...	
FL3BO7415JCAY5 136 TTTCTTTTCT...	
[37, 37, 36, 36, 29, 29, 29, 29, 36,   37] ...	
                                                                   format changed
FL3BO7415JB41R 208 CTCTTTTATG...	
[37, 37, 37, 38, 38, 38, 38, 38, 37,   37] ...	                    (“fasta” to “fastq”)
FL3BO7415I6HKB 268 GGTATTTGAA...	
[37, 37, 37, 37, 34, 34, 34, 37, 37,   37] ...	
FL3BO7415I63UC 219 AACATGTGAG...	
[37, 37, 37, 37, 37, 37, 37, 37, 37,   37] ...	
...
June 2009 – Biopython 1.51 beta

•  Support for Illumina 1.3+ FASTQ files
   (in addition to Sanger FASTQ and older
   Solexa/Illumina FASTQ files)
•  Faster parsing of UniProt/SwissProt files
•  Bio.SeqIO now writes feature table in
   GenBank output
                         Already being used at SCRI
                         for genome annotation, e.g.
                         with RAST and Artemis
Google Summer of Code Projects

•  Eric Talevich - Parsing and writing phyloXML

  •  Mentors Brad Chapman & Christian Zmasek

•  Nick Matzke - Biogeographical Phylogenetics

  •  Mentors Stephen Smith, Brad Chapman & David Kidd

•  Hosted by NESCent Phyloinformatics Group

•  Code development on github branches...
Other Notable Active Projects

•  Brad Chapman – GFF parsing
•  Tiago Antão – Population genetics statistics
•  Peter Cock – Parsing Roche 454 SFF files
   (with Jose Blanca, co-author of sff_extract)
•  Plus other ongoing refinements and
   documentation improvements
Distributed Development

•  Currently work from a stable branch in CVS
•  CVS master is mirrored to github.com
•  Several sub-projects are being developed on
   github branches (in public)
•  This is letting us get familiar with git & github
•  Suggested plan is to switch from CVS to git
   summer 2009 (still hosted by OBF), continue
   to push to github for public collaboration
Acknowledgements
•  Other Biopython contributors & developers!
•  Open Bioinformatics Foundation (OBF)
   supports Biopython (and BioPerl etc)
•  Society for General Microbiology
   (SGM) for my travel costs
•  My Biopython work supported by:
  •  EPSRC funded PhD
     (MOAC DTC, University of Warwick, UK)
  •  SCRI (Scottish Crop Research Institute),
     who also paid my conference fees
What next?

•  This afternoon’s “Birds of a Feather” session:
         Biopython Tutorial and/or Hackathon

•  Sign up to our mailing list?




•  Homepage www.biopython.org

More Related Content

Similar to Cock Biopython Bosc2009

BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopythontiago
 
Biopython
BiopythonBiopython
Biopythonbosc
 
Python 2 is dead! Drag your old code into the modern age
Python 2 is dead! Drag your old code into the modern agePython 2 is dead! Drag your old code into the modern age
Python 2 is dead! Drag your old code into the modern ageBecky Smith
 
2016 bioinformatics i_bio_python_wimvancriekinge
2016 bioinformatics i_bio_python_wimvancriekinge2016 bioinformatics i_bio_python_wimvancriekinge
2016 bioinformatics i_bio_python_wimvancriekingeProf. Wim Van Criekinge
 
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기Jeongkyu Shin
 
Python Evolution
Python EvolutionPython Evolution
Python EvolutionQuintagroup
 
BioRuby -- Bioinformatics Library
BioRuby -- Bioinformatics LibraryBioRuby -- Bioinformatics Library
BioRuby -- Bioinformatics Libraryngotogenome
 
Teaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learnedTeaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learnedMartin Christen
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBOSC 2010
 
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...Jian-Hong Pan
 
PyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 TutorialPyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 TutorialJustin Lin
 
Pharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alphaPharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alphaPharo
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 
fastp: the FASTQ pre-processor
fastp: the FASTQ pre-processorfastp: the FASTQ pre-processor
fastp: the FASTQ pre-processorHoffman Lab
 
2015 bioinformatics python_io_wim_vancriekinge
2015 bioinformatics python_io_wim_vancriekinge2015 bioinformatics python_io_wim_vancriekinge
2015 bioinformatics python_io_wim_vancriekingeProf. Wim Van Criekinge
 
New Features of Python 3.10
New Features of Python 3.10New Features of Python 3.10
New Features of Python 3.10Gabor Guta
 
Git 101, or, how to sanely manage your Koha customizations
Git 101, or, how to sanely manage your Koha customizationsGit 101, or, how to sanely manage your Koha customizations
Git 101, or, how to sanely manage your Koha customizationsIan Walls
 

Similar to Cock Biopython Bosc2009 (20)

BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopython
 
Biopython
BiopythonBiopython
Biopython
 
2015 bioinformatics bio_python
2015 bioinformatics bio_python2015 bioinformatics bio_python
2015 bioinformatics bio_python
 
Python 2 is dead! Drag your old code into the modern age
Python 2 is dead! Drag your old code into the modern agePython 2 is dead! Drag your old code into the modern age
Python 2 is dead! Drag your old code into the modern age
 
2016 bioinformatics i_bio_python_wimvancriekinge
2016 bioinformatics i_bio_python_wimvancriekinge2016 bioinformatics i_bio_python_wimvancriekinge
2016 bioinformatics i_bio_python_wimvancriekinge
 
Python Orientation
Python OrientationPython Orientation
Python Orientation
 
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기
그렇게 커미터가 된다: Python을 통해 오픈소스 생태계 가르치기
 
Python Evolution
Python EvolutionPython Evolution
Python Evolution
 
BioRuby -- Bioinformatics Library
BioRuby -- Bioinformatics LibraryBioRuby -- Bioinformatics Library
BioRuby -- Bioinformatics Library
 
Teaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learnedTeaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learned
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_ruby
 
Introduction to FIWARE IoT
Introduction to FIWARE IoTIntroduction to FIWARE IoT
Introduction to FIWARE IoT
 
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...
 
PyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 TutorialPyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 Tutorial
 
Pharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alphaPharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alpha
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
fastp: the FASTQ pre-processor
fastp: the FASTQ pre-processorfastp: the FASTQ pre-processor
fastp: the FASTQ pre-processor
 
2015 bioinformatics python_io_wim_vancriekinge
2015 bioinformatics python_io_wim_vancriekinge2015 bioinformatics python_io_wim_vancriekinge
2015 bioinformatics python_io_wim_vancriekinge
 
New Features of Python 3.10
New Features of Python 3.10New Features of Python 3.10
New Features of Python 3.10
 
Git 101, or, how to sanely manage your Koha customizations
Git 101, or, how to sanely manage your Koha customizationsGit 101, or, how to sanely manage your Koha customizations
Git 101, or, how to sanely manage your Koha customizations
 

More from bosc

Swertz Molgenis Bosc2009
Swertz Molgenis Bosc2009Swertz Molgenis Bosc2009
Swertz Molgenis Bosc2009bosc
 
Bosc Intro 20090627
Bosc Intro 20090627Bosc Intro 20090627
Bosc Intro 20090627bosc
 
Software Patterns Panel Bosc2009
Software Patterns Panel Bosc2009Software Patterns Panel Bosc2009
Software Patterns Panel Bosc2009bosc
 
Schbath Rmes Bosc2009
Schbath Rmes Bosc2009Schbath Rmes Bosc2009
Schbath Rmes Bosc2009bosc
 
Kallio Chipster Bosc2009
Kallio Chipster Bosc2009Kallio Chipster Bosc2009
Kallio Chipster Bosc2009bosc
 
Welch Wordifier Bosc2009
Welch Wordifier Bosc2009Welch Wordifier Bosc2009
Welch Wordifier Bosc2009bosc
 
Rice Emboss Bosc2009
Rice Emboss Bosc2009Rice Emboss Bosc2009
Rice Emboss Bosc2009bosc
 
Prlic Bio Java Bosc2009
Prlic Bio Java Bosc2009Prlic Bio Java Bosc2009
Prlic Bio Java Bosc2009bosc
 
Senger Soaplab Bosc2009
Senger Soaplab Bosc2009Senger Soaplab Bosc2009
Senger Soaplab Bosc2009bosc
 
Hanmer Software Patterns Bosc2009
Hanmer Software Patterns Bosc2009Hanmer Software Patterns Bosc2009
Hanmer Software Patterns Bosc2009bosc
 
Snell Psoda Bosc2009
Snell Psoda Bosc2009Snell Psoda Bosc2009
Snell Psoda Bosc2009bosc
 
Procter Vamsas Bosc2009
Procter Vamsas Bosc2009Procter Vamsas Bosc2009
Procter Vamsas Bosc2009bosc
 
Drablos Composite Motifs Bosc2009
Drablos Composite Motifs Bosc2009Drablos Composite Motifs Bosc2009
Drablos Composite Motifs Bosc2009bosc
 
Fauteux Seeder Bosc2009
Fauteux Seeder Bosc2009Fauteux Seeder Bosc2009
Fauteux Seeder Bosc2009bosc
 
Moeller Debian Bosc2009
Moeller Debian Bosc2009Moeller Debian Bosc2009
Moeller Debian Bosc2009bosc
 
Wilczynski_BNFinder_BOSC2009
Wilczynski_BNFinder_BOSC2009Wilczynski_BNFinder_BOSC2009
Wilczynski_BNFinder_BOSC2009bosc
 
Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009bosc
 
Varre_Biomanycores_BOSC2009
Varre_Biomanycores_BOSC2009Varre_Biomanycores_BOSC2009
Varre_Biomanycores_BOSC2009bosc
 
Trelles_QnormBOSC2009
Trelles_QnormBOSC2009Trelles_QnormBOSC2009
Trelles_QnormBOSC2009bosc
 
Rother_ModeRNA_BOSC2009
Rother_ModeRNA_BOSC2009Rother_ModeRNA_BOSC2009
Rother_ModeRNA_BOSC2009bosc
 

More from bosc (20)

Swertz Molgenis Bosc2009
Swertz Molgenis Bosc2009Swertz Molgenis Bosc2009
Swertz Molgenis Bosc2009
 
Bosc Intro 20090627
Bosc Intro 20090627Bosc Intro 20090627
Bosc Intro 20090627
 
Software Patterns Panel Bosc2009
Software Patterns Panel Bosc2009Software Patterns Panel Bosc2009
Software Patterns Panel Bosc2009
 
Schbath Rmes Bosc2009
Schbath Rmes Bosc2009Schbath Rmes Bosc2009
Schbath Rmes Bosc2009
 
Kallio Chipster Bosc2009
Kallio Chipster Bosc2009Kallio Chipster Bosc2009
Kallio Chipster Bosc2009
 
Welch Wordifier Bosc2009
Welch Wordifier Bosc2009Welch Wordifier Bosc2009
Welch Wordifier Bosc2009
 
Rice Emboss Bosc2009
Rice Emboss Bosc2009Rice Emboss Bosc2009
Rice Emboss Bosc2009
 
Prlic Bio Java Bosc2009
Prlic Bio Java Bosc2009Prlic Bio Java Bosc2009
Prlic Bio Java Bosc2009
 
Senger Soaplab Bosc2009
Senger Soaplab Bosc2009Senger Soaplab Bosc2009
Senger Soaplab Bosc2009
 
Hanmer Software Patterns Bosc2009
Hanmer Software Patterns Bosc2009Hanmer Software Patterns Bosc2009
Hanmer Software Patterns Bosc2009
 
Snell Psoda Bosc2009
Snell Psoda Bosc2009Snell Psoda Bosc2009
Snell Psoda Bosc2009
 
Procter Vamsas Bosc2009
Procter Vamsas Bosc2009Procter Vamsas Bosc2009
Procter Vamsas Bosc2009
 
Drablos Composite Motifs Bosc2009
Drablos Composite Motifs Bosc2009Drablos Composite Motifs Bosc2009
Drablos Composite Motifs Bosc2009
 
Fauteux Seeder Bosc2009
Fauteux Seeder Bosc2009Fauteux Seeder Bosc2009
Fauteux Seeder Bosc2009
 
Moeller Debian Bosc2009
Moeller Debian Bosc2009Moeller Debian Bosc2009
Moeller Debian Bosc2009
 
Wilczynski_BNFinder_BOSC2009
Wilczynski_BNFinder_BOSC2009Wilczynski_BNFinder_BOSC2009
Wilczynski_BNFinder_BOSC2009
 
Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009
 
Varre_Biomanycores_BOSC2009
Varre_Biomanycores_BOSC2009Varre_Biomanycores_BOSC2009
Varre_Biomanycores_BOSC2009
 
Trelles_QnormBOSC2009
Trelles_QnormBOSC2009Trelles_QnormBOSC2009
Trelles_QnormBOSC2009
 
Rother_ModeRNA_BOSC2009
Rother_ModeRNA_BOSC2009Rother_ModeRNA_BOSC2009
Rother_ModeRNA_BOSC2009
 

Recently uploaded

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 

Recently uploaded (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Cock Biopython Bosc2009

  • 1. Biopython Project Update Peter Cock, Plant Pathology, SCRI, Dundee, UK 10th Annual Bioinformatics Open Source Conference (BOSC) Stockholm, Sweden, 28 June 2009
  • 2. Contents •  Brief introduction to Biopython & history •  Releases since BOSC 2008 •  Current and future projects •  CVS, git and github •  BoF hackathon and tutorial at BOSC 2009
  • 3. Biopython •  Free, open source library for bioinformatics •  Supported by Open Bioinformatics Foundation •  Runs on Windows, Linux, Mac OS X, etc •  International team of volunteer developers •  Currently about three releases per year •  Extensive “Biopython Tutorial & Cookbook” •  See www.biopython.org for details
  • 4. Biopython’s Ten Year History 1999 •  Started by Jeff Chang & Andrew Dalke 2000 •  First release, Biopython 0.90 2001 •  Biopython 1.00, “semi-complete” … •  Biopython 1.10, …, 1.41 2007 •  Biopython 1.43 (Bio.SeqIO), 1.44 2008 •  Biopython 1.45, 1.47, 1.48, 1.49 2009 •  Biopython 1.50, 1.51beta •  OA Publication, Cock et al.
  • 5. Biopython Publication – Cock et al. 2009 N.B. Open Access!
  • 6. November 2008 – Biopython 1.49 •  Support for Python 2.6 •  Switched from “Numeric” to “NumPy” (important Numerical library for Python) •  More biological methods on core Seq object
  • 7. April 2009 – Biopython 1.50 •  New Bio.Motif module for sequence motifs (to replace Bio.AliceAce and Bio.MEME) •  Support for QUAL and FASTQ in Bio.SeqIO (important NextGen sequencing formats) •  Integration of GenomeDiagram for figures (Pritchard et al. 2006)
  • 8. Biopython 1.50 includes GenomeDiagram De novo assembly of 42kb phage from Roche 454 data “Feature Track” showing ORFs Scale tick marks “Barchart Track” of read depth (~100, scale max 200)
  • 9. Reading a FASTA file with Bio.SeqIO >FL3BO7415JACDX TTAATTTTATTTTGTCGGCTAAAGAGATTTTTAGCTAAACGTTCAATTGCTTTAGCTGAA GTACGAGCAGATACTCCAATCGCAATTGTTTCTTCATTTAAAATTAGCTCGTCGCCACCT TCAATTGGAAATTTATAATCACGATCTAACCAGATTGGTACATTATGTTTTGCAAATCTT GGATGATATTTAATGATGTACTCCATGAATAATGATTCACGTCTACGCGCTGGTTCTCTC ATCTTATTTATCGTTAAGCCA >FL3BO7415I7AFR ... from Bio import SeqIO for rec in SeqIO.parse(open("phage.fasta"), "fasta") : print rec.id, len(rec.seq), rec.seq[:10]+"..." FL3BO7415JACDX 261 TTAATTTTAT... FL3BO7415I7AFR FL3BO7415JCAY5 267 136 CATTAACTAA... TTTCTTTTCT... Focus on the FL3BO7415JB41R 208 CTCTTTTATG... FL3BO7415I6HKB FL3BO7415I63UC 268 219 GGTATTTGAA... AACATGTGAG... filename and ... format (“fasta”)…
  • 10. Reading a FASTQ file with Bio.SeqIO @FL3BO7415JACDX TTAATTTTATTTTGTCGGCTAAAGAGATTTTTAGCTAAACGTTCAATTGCTTTAGCTGAAGTACGAGCAGATACTCCAATCGCAATTGTTTCTTC ATTTAAAATTAGCTCGTCGCCACCTTCAATTGGAAATTTATAATCACGATCTAACCAGATTGGTACATTATGTTTTGCAAATCTTGGATGATATT TAATGATGTACTCCATGAATAATGATTCACGTCTACGCGCTGGTTCTCTCATCTTATTTATCGTTAAGCCA + BBBB2262=1111FFGGGHHHHIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGGFFFFFFFFFFFFFFFFFGB BBCFFFFFFFFFFFFFFFFFFFFFFFGGGGGGGIIIIIIIGGGIIIGGGIIGGGG@AAAAA?===@@@??? @FL3BO7415I7AFR ... from Bio import SeqIO for rec in SeqIO.parse(open("phage.fastq"), "fastq") : print rec.id, len(rec.seq), rec.seq[:10]+"..." print rec.letter_annotations["phred_quality"][:10], "..." FL3BO7415JACDX 261 TTAATTTTAT... [33, 33, 33, 33, 17, 17, 21, 17, 28, FL3BO7415I7AFR 267 CATTAACTAA... 16] ... Just filename and [37, 37, 37, 37, 37, 37, 37, 37, 38, 38] ... FL3BO7415JCAY5 136 TTTCTTTTCT... [37, 37, 36, 36, 29, 29, 29, 29, 36, 37] ... format changed FL3BO7415JB41R 208 CTCTTTTATG... [37, 37, 37, 38, 38, 38, 38, 38, 37, 37] ... (“fasta” to “fastq”) FL3BO7415I6HKB 268 GGTATTTGAA... [37, 37, 37, 37, 34, 34, 34, 37, 37, 37] ... FL3BO7415I63UC 219 AACATGTGAG... [37, 37, 37, 37, 37, 37, 37, 37, 37, 37] ... ...
  • 11. June 2009 – Biopython 1.51 beta •  Support for Illumina 1.3+ FASTQ files (in addition to Sanger FASTQ and older Solexa/Illumina FASTQ files) •  Faster parsing of UniProt/SwissProt files •  Bio.SeqIO now writes feature table in GenBank output Already being used at SCRI for genome annotation, e.g. with RAST and Artemis
  • 12. Google Summer of Code Projects •  Eric Talevich - Parsing and writing phyloXML •  Mentors Brad Chapman & Christian Zmasek •  Nick Matzke - Biogeographical Phylogenetics •  Mentors Stephen Smith, Brad Chapman & David Kidd •  Hosted by NESCent Phyloinformatics Group •  Code development on github branches...
  • 13. Other Notable Active Projects •  Brad Chapman – GFF parsing •  Tiago Antão – Population genetics statistics •  Peter Cock – Parsing Roche 454 SFF files (with Jose Blanca, co-author of sff_extract) •  Plus other ongoing refinements and documentation improvements
  • 14. Distributed Development •  Currently work from a stable branch in CVS •  CVS master is mirrored to github.com •  Several sub-projects are being developed on github branches (in public) •  This is letting us get familiar with git & github •  Suggested plan is to switch from CVS to git summer 2009 (still hosted by OBF), continue to push to github for public collaboration
  • 15. Acknowledgements •  Other Biopython contributors & developers! •  Open Bioinformatics Foundation (OBF) supports Biopython (and BioPerl etc) •  Society for General Microbiology (SGM) for my travel costs •  My Biopython work supported by: •  EPSRC funded PhD (MOAC DTC, University of Warwick, UK) •  SCRI (Scottish Crop Research Institute), who also paid my conference fees
  • 16. What next? •  This afternoon’s “Birds of a Feather” session: Biopython Tutorial and/or Hackathon •  Sign up to our mailing list? •  Homepage www.biopython.org