Rice Emboss Bosc2009
Rice Emboss Bosc2009 Presentation Transcript

  • EMBOSS European Molecular Biology Open Software Suite Peter Rice pmr@ebi.ac.uk
  • A quick introduction
    • Open source package for sequence analysis
      • ANSI C source code
      • GPL licensed applications, LGPL libraries
      • 200+ applications
      • 100+ third party applications in 15 associated packages
      • Project started 1996 at Sanger and HGMP
      • Now based at EBI
      • Release 6.1.0 15th July 2009
      • Funded by UK-BBSRC and EMBL-EBI
    BOSC: EMBOSS 2009 29.06.09
  • A near death experience
    • April 2004: The UK Medical Research Council decided to close the UK Human Genome Mapping Project Resource Centre (now the Rosalind Franklin Centre for Genomics Research, RFCGR)
    • That was where all the EMBOSS developers worked
    • We announced the potential end of EMBOSS development to our user community
    • HGMP closed in July 2005
    • The developers moved to EBI, interim funding to April 2006.
    • Funding was secured in May 2006 (BBSRC)
    • … and again in May 2009 (BBSRC)
    • As far as we are aware, all our academic and industry users continued running EMBOSS … with no risk
    • That is a huge advantage of open source licensing
  • Who do we serve?
    • Expert software developers
      • Bioinformaticians
      • Computer scientists
    • Expert users
      • Biology research community
      • Industry
    • Scientific users
      • Biology research community
      • Industry
  • EMBOSS World Wide
    • We have users on every continent, and a picture to prove it: this is British Antarctica
    • We are promised another photo from the frozen North
    • The first EMBOSS course was in Beijing, April 1999
    • The wEMBOSS interface is from Canada, Argentina and Belgium
  • EMBOSS command line interface
    • EMBOSS applications run from the command line
    • This is not the only interface
      • There are over 100 interfaces and packaged systems available
    • All applications have a command definition file (.acd)
      • Defines all inputs, outputs, and other options
      • Read at startup
      • Contains all command line options with descriptions
      • Template for any other interface
  • EMBOSS command line example
    • % antigenic
    • Input protein sequence(s): uniprot:actb1_fugru
    • Minimum length of antigenic region [6]:
    • Output report [actb1_fugru.antigenic]:
    • % antigenic uniprot:actb1_fugru -auto
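
The non-interactive form above can also be assembled from a script. The following is a minimal sketch, not part of EMBOSS: `build_emboss_command` is a hypothetical helper, and it assumes that each qualifier is passed as `-name value`, using the parameter names defined in the application's ACD file (`minlen` and `outfile` for `antigenic`). It only builds the argument list; running it requires an EMBOSS installation.

```python
import shlex


def build_emboss_command(app, sequence, auto=True, **qualifiers):
    """Return an argv list for an EMBOSS application (hypothetical helper).

    app        -- application name, e.g. "antigenic"
    sequence   -- sequence reference, e.g. "uniprot:actb1_fugru"
    auto       -- append -auto so the program does not prompt
    qualifiers -- extra options, assumed to map to "-name value" pairs
    """
    argv = [app, sequence]
    for name, value in qualifiers.items():
        argv += [f"-{name}", str(value)]
    if auto:
        argv.append("-auto")
    return argv


cmd = build_emboss_command(
    "antigenic",
    "uniprot:actb1_fugru",
    minlen=6,
    outfile="actb1_fugru.antigenic",
)

# Show the command as it would be typed at the shell prompt.
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where EMBOSS is installed and the `uniprot` database is configured.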
  • EMBOSS ACD File
    • application: antigenic [
    • documentation: "Finds antigenic sites in proteins"
    • groups: "Protein:Motifs"
    • ]
    • section: input [
    • information: "Input section"
    • type: "page"
    • ]
    • seqall: sequence [
    • parameter: "Y"
    • type: "PureProtein"
    • ]
    • endsection: input
    • section: required [
    • information: "Required section"
    • type: "page"
    • ]
    • integer: minlen [
    • standard: "Y"
    • minimum: "1"
    • maximum: "50"
    • default: "6"
    • information: "Minimum length of antigenic region"
    • ]
    • endsection: required
    • section: output [
    • information: "Output section"
    • type: "page"
    • ]
    • report: outfile [
    • parameter: "Y"
    • rformat: "motif"
    • multiple: "Y"
    • taglist: "int:pos=Max_score_pos"
    • ]
    • endsection: output
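
To illustrate how an ACD file can serve as a template for another interface, here is a rough sketch that reads an excerpt of the `antigenic` definition and rebuilds the command-line prompt from it. This is not the EMBOSS ACD parser (that lives in the C libraries); `parse_acd` and `prompt_for` are hypothetical names, and the regular expressions assume the simple `type: name [ attribute: "value" ... ]` layout shown on the slide, with no `]` inside quoted values.

```python
import re

# Excerpt of the antigenic ACD file shown on the slide.
ACD_TEXT = '''
application: antigenic [
  documentation: "Finds antigenic sites in proteins"
  groups: "Protein:Motifs"
]
section: required [
  information: "Required section"
  type: "page"
]
integer: minlen [
  standard: "Y"
  minimum: "1"
  maximum: "50"
  default: "6"
  information: "Minimum length of antigenic region"
]
endsection: required
'''

# One definition: "<type>: <name> [ ...attributes... ]"
BLOCK = re.compile(r'(\w+):\s*(\w+)\s*\[(.*?)\]', re.S)
# One attribute inside a definition: <key>: "<value>"
ATTR = re.compile(r'(\w+):\s*"([^"]*)"')


def parse_acd(text):
    """Return a list of definitions, skipping section markers (simplified)."""
    entries = []
    for kind, name, body in BLOCK.findall(text):
        if kind == "section":
            continue  # layout grouping only, not an input or output
        entries.append({"type": kind, "name": name, "attrs": dict(ATTR.findall(body))})
    return entries


def prompt_for(entry):
    """Build a prompt in the style 'Information text [default]:'."""
    attrs = entry["attrs"]
    return f'{attrs.get("information", entry["name"])} [{attrs.get("default", "")}]:'


entries = parse_acd(ACD_TEXT)
for entry in entries:
    if entry["type"] != "application":
        print(prompt_for(entry))
```

A web form or GUI generator could walk the same list of definitions, using `minimum`, `maximum` and `default` to validate input instead of printing a prompt.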
  • EMBOSS makes things easy
    • ACD files define sequence input
      • Sequence type for DNA/protein, possible ambiguity codes, gaps
      • Sequences in files
        • 40+ formats supported - auto detection
      • Sequence databases
        • Remote servers
          • SRS, Entrez, MRS
          • User-specified URL
        • Locally indexed - using the original data files
        • Local script utilities
  • EMBOSS Web Interface: http://emboss.ch.embnet.org/wEMBOSS/
  • EMBOSS SoapLab Service: MyGrid/EMBRACE projects, for use by Taverna Workflows
  • EMBOSS User Survey
  • EMBOSS Update
    • Release 6.1.0 as usual on 15th July 2009
    • New EMBL and UniProt formats
      • With full set of cross-references
    • FASTQ short read formats
    • Jemboss GUI included as standard
    • Further profiling for enhanced efficiency
    • 2000+ QA tests (more needed)
    • Updated Phylip 3.68 … and file format variants
    • Services for EMBRACE/SoapLab2
    • DAS testing
  • Example Dasty screen:
  • Example Ensembl screen:
  • EMBOSS Future plans
    • Three open source books: users, developers, admin
      • Cambridge University Press
      • Original text can be freely reused
    • New areas of interest
      • Metadata and ontologies (EDAM, taxonomy, GO, SO, …)
      • (all) public data resources
      • Coordinate systems (ensembl, gene/protein input/results)
      • Project-based working
      • Next-generation sequence data – used by ordinary biologists
      • 100+ new applications
    • Database index updates
    • Scientific advisory board
    • Developer courses: anywhere, any time
  • The EMBOSS Team
    • Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
    • Mon 12:15 Technology Track
    • Mon 17:45 Poster U43
    • Wed 13:00 Birds of a Feather
  • Acknowledgements
    • EBI: Peter Rice, Alan Bleasby, Jon Ison, Martin Senger, Tom Oinn, Jaina Mistry, Rodrigo Lopez, Sharmilla Pillai, Hamish McWilliam
    • RFCGR/HGMP: Alan Bleasby, Jon Ison, Tim Carver, Hugh Morgan, Claude Beazley, Lisa Mullan, Damian Counsell, Gary Williams, Val Curwen, Mark Faller, Sinead O’Leary, Thon deBoer, Martin Bishop
    • LION: Thomas Laurent, Bijay Jassal, Bren Vaughan, Thure Etzold
    • Sanger Institute: Ian Longden, Richard Bruskiewich, Simon Kelley
    • National bioinformatics service providers in: Norway, Spain, Italy, Netherlands, Germany, Belgium, Russia, China, Canada, Australia, Argentina
    • Others: Catherine Letondal, Don Gilbert, Rodger Staden, Bill Pearson, Webb Miller, Marie-Laetitia Denayer, Amandine Schurmann, Gabriele Weiler, Luke McCarthy, David Mathog, David Bauer, Henrikki Almusa, Thomas Siegmund, Scott Markel, Darryl Leon, Bastien Chevreux...
    • IBM, Hewlett-Packard, (Compaq), Apple, SGI, Sun, LION bioscience, SciTegic, Accelrys, Cambridge University Press
    • Open-Bio Foundation, Sourceforge
    • ... And the British Antarctic Survey
    • http://emboss.sourceforge.net
    • http://emboss.open-bio.org/wiki