EMBOSS: European Molecular Biology Open Software Suite
Peter Rice, pmr@ebi.ac.uk
A quick introduction
Open source package for sequence analysis
ANSI C source code
GPL licensed applications, LGPL libraries
200+ applications
100+ third party applications in 15 associated packages
Project started 1996 at Sanger and HGMP
Now based at EBI
Release 6.1.0 15th July 2009
Funded by UK-BBSRC and EMBL-EBI
BOSC: EMBOSS 2009 29.06.09
A near death experience
April 2004: The UK Medical Research Council decided to close the UK Human Genome Mapping Project Resource Centre (by then renamed the Rosalind Franklin Centre for Genomics Research)
That was where all the EMBOSS developers worked
We announced the potential end of EMBOSS development to our user community
HGMP closed in July 2005
The developers moved to EBI, interim funding to April 2006.
Funding was secured in May 2006 (BBSRC)
… and again in May 2009 (BBSRC)
As far as we are aware, all our academic and industry users continued running EMBOSS … with no risk
That is a huge advantage of open source licensing
Who do we serve?
Expert software developers
Bioinformaticians
Computer scientists
Expert users
Biology research community
Industry
Scientific users
Biology research community
Industry
EMBOSS World Wide
We have users on every continent, and a picture to prove it
This is British Antarctica
We are promised another photo from the frozen North
The first EMBOSS course was in Beijing, April 1999
The wEMBOSS interface is from Canada, Argentina and Belgium
EMBOSS command line interface
EMBOSS applications run from the command line
This is not the only interface
There are over 100 interfaces and packaged systems available
All applications have a command definition file (.acd)
Defines all inputs, outputs, and other options
Read at startup
Contains all command line options with descriptions
Template for any other interface
EMBOSS command line example
% antigenic
Input protein sequence(s): uniprot:actb1_fugru
Minimum length of antigenic region [6]:
Output report [actb1_fugru.antigenic]:
% antigenic uniprot:actb1_fugru -auto
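The second form above, with the sequence on the command line and -auto to suppress prompts, is the one a script or wrapper would normally use. A minimal sketch of such a wrapper follows. The helper names build_emboss_command and run_if_available are hypothetical, not part of EMBOSS, and the minlen qualifier in the usage note is an assumed name for the "Minimum length of antigenic region" option.

```python
import shutil
import subprocess


def build_emboss_command(program, sequence, auto=True, **qualifiers):
    """Return an argv list for a non-interactive EMBOSS run.

    Hypothetical helper: each keyword argument becomes "-name value".
    """
    argv = [program, sequence]
    for name, value in qualifiers.items():
        argv += [f"-{name}", str(value)]
    if auto:
        # -auto accepts the defaults instead of prompting, as in the slide
        argv.append("-auto")
    return argv


def run_if_available(argv):
    """Run the command only if the program is on PATH; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=True)


cmd = build_emboss_command("antigenic", "uniprot:actb1_fugru")
print(" ".join(cmd))  # antigenic uniprot:actb1_fugru -auto
```

If the qualifier name is right, build_emboss_command("antigenic", "uniprot:actb1_fugru", minlen=8) would add "-minlen 8" before -auto. Check the application's own help output before relying on any qualifier name.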
EMBOSS ACD File
application: antigenic [
documentation: "Finds antigenic sites in proteins"
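Because the ACD file is plain text, another interface can read it to find out what an application is and does. The sketch below extracts only the two attributes visible in the fragment above. It is a regular-expression illustration, not the EMBOSS ACD parser, and read_acd_header is a hypothetical name.

```python
import re

# The two lines shown on the slide; the rest of the file is not reproduced here.
ACD_FRAGMENT = '''application: antigenic [
  documentation: "Finds antigenic sites in proteins"
'''


def read_acd_header(text):
    """Return the application name and documentation string, or None if absent."""
    app = re.search(r'^\s*application:\s*(\w+)\s*\[', text, re.MULTILINE)
    doc = re.search(r'^\s*documentation:\s*"([^"]*)"', text, re.MULTILINE)
    return {
        "application": app.group(1) if app else None,
        "documentation": doc.group(1) if doc else None,
    }


header = read_acd_header(ACD_FRAGMENT)
print(header["application"], "-", header["documentation"])
```

A real interface generator would need the full ACD grammar (sections, data types, defaults), which this sketch does not attempt.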
Sequence type for DNA/protein, possible ambiguity codes, gaps
Sequences in files
40+ formats supported - auto detection
Sequence databases
Remote servers
SRS, Entrez, MRS
User-specified URL
Locally indexed - using the original data files
Local script utilities
EMBOSS Web Interface
http://emboss.ch.embnet.org/wEMBOSS/
EMBOSS SoapLab Service
MyGrid/EMBRACE projects: for use by Taverna Workflows
EMBOSS User Survey
EMBOSS Update
Release 6.1.0 as usual on 15th July 2009
New EMBL and UniProt formats
With full set of cross-references
FASTQ short read formats
Jemboss GUI included as standard
Further profiling for enhanced efficiency
2000+ QA tests (more needed)
Updated Phylip 3.68 … and file format variants
Services for EMBRACE/SoapLab2
DAS testing
Example Dasty screen:
Example Ensembl screen:
EMBOSS Future plans
Three open source books: users, developers, admin
Cambridge University Press
Original text can be freely reused
New areas of interest
Metadata and ontologies (EDAM, taxonomy, GO, SO, …)
(all) public data resources
Coordinate systems (Ensembl, gene/protein input/results)
Project-based working
Next-generation sequence data – used by ordinary biologists
100+ new applications
Database index updates
Scientific advisory board
Developer courses: anywhere, any time
The EMBOSS Team
Peter Rice
Alan Bleasby
Jon Ison
Mahmut Uludag
Mon 12:15 Technology Track
Mon 17:45 Poster U43
Wed 13:00 Birds of a Feather
Acknowledgements
EBI: Peter Rice, Alan Bleasby, Jon Ison, Martin Senger, Tom Oinn, Jaina Mistry, Rodrigo Lopez, Sharmilla Pillai, Hamish McWilliam
RFCGR/HGMP: Alan Bleasby, Jon Ison, Tim Carver, Hugh Morgan, Claude Beazley, Lisa Mullan, Damian Counsell, Gary Williams, Val Curwen, Mark Faller, Sinead O’Leary, Thon deBoer, Martin Bishop
LION: Thomas Laurent, Bijay Jassal, Bren Vaughan, Thure Etzold
Sanger Institute: Ian Longden, Richard Bruskiewich, Simon Kelley
National bioinformatics service providers in: Norway, Spain, Italy, Netherlands, Germany, Belgium, Russia, China, Canada, Australia, Argentina
Others: Catherine Letondal, Don Gilbert, Rodger Staden, Bill Pearson, Webb Miller, Marie-Laetitia Denayer, Amandine Schurmann, Gabriele Weiler, Luke McCarthy, David Mathog, David Bauer, Henrikki Almusa, Thomas Siegmund, Scott Markel, Darryl Leon, Bastien Chevreux...