Biocatalogue, FileQuirks, MyExperiment

Presentation from IIMCB Seminar. Summary from my Fellowship in MyGrid, Manchester

    Presentation Transcript

    • Summary from my fellowship in Manchester: e-Science in Manchester. Jerzy Orłowski
    • What I will talk about
      • Part one: Biocatalogue SearchByData
        • Searching for services that will analyze or process your data file
        • Other ideas that came up along the way
      • Part two: Other things I've done meanwhile
      • Part three: How do they do it in MyGrid
        • New methodology that we could adopt
        • How I can help you, how you can help me
      • Part one: Biocatalogue SearchByData
    • Biocatalogue
      • The BioCatalogue is a catalogue of Life Science Web Services
      • A web service is a network application with a programmatic interface
      • BioCatalogue relies on community annotation
        • Service providers
        • Users
        • Curator
      • Technology: Ruby on Rails
    • Browsing services
    • My contribution
      • Search By Data
        • Ability to find services based not on tags, providers, etc., but on example input files
      • Algorithm based on FileQuirks, developed at GeneSilico
      • The user provides a real input file, which is matched against the example inputs of all the services using regular expressions
      • Services most likely to analyze or process the user's file are returned
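The matching idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual FileQuirks or BioCatalogue code: the three patterns, the `detect_types` and `search_by_data` names, and the scoring rule (count of shared data types) are all hypothetical.

```python
import re

# Hypothetical patterns for a few data types; these are NOT FileQuirks' real rules.
PATTERNS = {
    "fasta": re.compile(r"\A>\S.*\n[A-Za-z*\-\n]+\Z"),
    "integer": re.compile(r"\A\s*-?\d+\s*\Z"),
    "pdb": re.compile(r"^(ATOM|HETATM)\s+\d+", re.MULTILINE),
}


def detect_types(text):
    """Return the set of data type names whose pattern matches the text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def search_by_data(user_file, services):
    """Rank services by the number of data types shared with the user file.

    `services` maps a service name to a list of example input strings.
    Services without any shared type are left out of the result.
    """
    user_types = detect_types(user_file)
    scores = {}
    for service, examples in services.items():
        example_types = set()
        for example in examples:
            example_types |= detect_types(example)
        shared = len(user_types & example_types)
        if shared:
            scores[service] = shared
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Made-up service names and example inputs, for illustration only.
services = {
    "rna_fold": [">example\nGGGAAACCC\n"],
    "fetch_by_id": ["12345"],
    "no_examples": [],
}
ranking = search_by_data(">my_rna\nACGUACGU\n", services)
```

A service with no example inputs (here `no_examples`) can never be returned, which is the limitation discussed on the next slide.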
    • Other ideas – getting example input files
      • The main limitation of Search By Data is lack of example inputs for services
        • For 1169 services with more than 3000 operations, there are no more than 500 example inputs
        • Most inputs are numbers or IDs
      • Idea – get more inputs:
        • From people executing the services
          • Taverna Provenance
          • Soap Servlet
        • By having bots execute services with some data
    • Soap Servlet
      • Automatic generation of a web interface for web services:
      • For users: lets them quickly test or execute a service
      • For us: lets us collect example inputs for services
      • Currently – alpha version
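As a rough illustration of what "automatic generation of a web interface" could look like, the sketch below builds an HTML form from an operation's parameter names. It is not the Soap Servlet code: in a real tool the parameter list would come from the service description (WSDL), which is skipped here, and the `render_form` function, the `predict` operation, and the `sequence` parameter are made up for the example.

```python
from html import escape
from urllib.parse import quote


def render_form(service, operation, parameters):
    """Build an HTML form with one text input per operation parameter."""
    action = "/" + quote(service) + "/" + quote(operation)
    lines = [
        '<form method="post" action="' + escape(action) + '">',
        "  <h2>" + escape(service) + ": " + escape(operation) + "</h2>",
    ]
    for name in parameters:
        safe = escape(name)  # escapes quotes too, so it is safe in attributes
        lines.append(
            "  <label>" + safe + ' <input type="text" name="' + safe + '"></label>'
        )
    lines.append('  <input type="submit" value="Run">')
    lines.append("</form>")
    return "\n".join(lines)


# Hypothetical operation and parameter names for the Afold service.
form_html = render_form("Afold", "predict", ["sequence"])
```

Whatever users type into such a form could be logged (with their consent) as a candidate example input, which is the "for us" benefit listed above.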
    • Soap Servlet interface for Afold
    • Part 2: Other projects I've done meanwhile
      • GeneSilico web services
        • Turning some of our programs into SOAP services
        • ProteinSilico
        • ModeRNA
        • Parts of MetaServer
        • Parts of MetaRNA
        • See: https://wiki.genesilico.pl/GenesilicoSOAPServices
        • Good documentation on BioCatalogue, used to test Search By Data
      • FileQuirks
      • FileQuirks – web server for recognition of biological data types
        • New user interface
        • More data types
        • Help pages
        • Summary sent to NAR (waiting for decision)
      • http://filequirks.genesilico.pl
    • FileQuirks Help Pages
      • I decided to use Joomla CMS
        • Help pages have standard format
        • Joomla makes them easy to write and update
        • The GeneSilico home page is written in Joomla, so it would be easy to migrate/merge, and a graphic template already exists
      • It is easy to add help pages for other services
      • Software and server list on http://www.genesilico.pl/index.php/servers.html is outdated
      • Maybe we should clean it up?
    • Genesilico web services
      • A web service is a network tool with a programmatic API: “program as a service”
      • Pros
        • Compatibility between languages (messages are exchanged as XML)
        • Code reuse – no need to install programs
        • Easy linking with other tools
        • Automatic user interface generation
      • Cons:
        • You have to maintain the server
        • Harder to make it private
        • Less suitable for systems that take a long time to execute
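To make the language-independence point concrete, here is a minimal sketch of a SOAP 1.1 request built with the Python standard library alone. Any language that can produce the same XML can call the same service. The service namespace `urn:example:metamqap`, the operation `evaluate`, and the parameter `pdb` are placeholders, not the real GeneSilico interface; sending the message over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(namespace, operation, arguments):
    """Return a SOAP 1.1 envelope (as a string) calling `operation`.

    `arguments` maps parameter names to values; values are sent as text.
    """
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element("{" + SOAP_NS + "}Envelope")
    body = ET.SubElement(envelope, "{" + SOAP_NS + "}Body")
    call = ET.SubElement(body, "{" + namespace + "}" + operation)
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical namespace, operation and parameter, for illustration only.
request_xml = build_request(
    "urn:example:metamqap",
    "evaluate",
    {"pdb": "ATOM      1  N   MET A   1"},
)
```

In practice a SOAP client library would generate this envelope from the service's WSDL, so callers rarely write the XML by hand.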
    • Example 1
      • MetaMQAP
        • Kudlaty's Chimera MetaMQAP plugin uses MetaMQAP (he wrote his own interface)
        • Toolkit uses MetaMQAP
        • I have also written scripts for using MetaMQAP
        • Conclusions:
          • MetaMQAP needs to be installed and maintained on many different systems by different people
          • Making a SOAP server will save people time
    • Example 2
      • Methods for RNA secondary structure prediction
        • They are used by RNA MetaServer
        • Tomek Puton uses them for CompaRNA
        • They were used by me for testing Search By Data
        • Conclusions:
          • SOAP interface for fast methods exists
          • It just needs updating and incorporating into other tools
    • GeneSilico web services: instructions at https://wiki.genesilico.pl/GenesilicoSOAPServices
    • How do they do it in MyGrid? Some methodology we might adopt or just be aware of
    • Working system
      • They do not do science itself – they make tools for scientists
        • And study how new technologies are adopted in science
      • Every project is a collaboration with other groups
      • There is always more than one person working on a project
        • more than 25% of time is spent in meetings
      • Code developers are not scientists, but employees
        • They do not write papers or grants
    • Working system
      • 2 “uncommon” positions
        • Project manager
          • Not a scientist
          • Not a developer
          • Responsible for:
            • keeping up with release schedule
            • grant schedule
            • cooperation between projects
        • Service curator
          • Not a developer
          • Responsible for
            • keeping in touch with user community
            • Organizing meetings with focus groups, jamborees, etc.
            • Service documentation
            • Service visibility: Wikipedia, Google, links ...
    • Working system
      • No seminars
      • Instead, a weekly meeting with progress on all projects
      • A lot of project dedicated meetings and teleconferences
    • Sharing policy
      • Code and ideas are shared even from the beginning of the project
        • A scientific finding can be published only once, but tools can keep getting better
        • Selling your ideas enables cooperation and makes tools compatible – better grants
        • Publishing your code (git, svn) gets you more users – nice for publications and grants
    • Development
      • Languages: Java and Ruby on Rails
      • All code is under version control
        • Massive branching and merging
      • Dependency management systems (Maven)
      • All services are hosted
        • Collaborations (EMBL-EBI)
        • Corporate hosting
        • Clouds (Amazon EC2)
      • Building a user community
    • Summary – what we could discuss
      • Programmatic interfaces (Middleware)
        • I can make SOAP interfaces for you, deploy and publish them
        • I would require you to use such interfaces in your future code
      • What else I can give:
        • CMS for public help pages for programs and web servers
      • What I'd like to ask
        • To test FileQuirks, GeneSilico web services and Search By Data
      • Sharing policy for software development projects
      • SVN and Web pages cleanup
      • Visibility on the web (proper linking, Wikipedia, various lists, etc.)
    • Acknowledgments
      • MyGrid
        • Carole Goble
        • Charlotte Hooson-Sykes
        • Jiten Bhagat, Franck Tanoh, Shoaib Sufi, Peter Li and others
      • University of Southampton
        • David De Roure
        • David Newman and others
      • Leiden University
        • Marco Roos
      • GeneSilico
        • Janusz Bujnicki
        • Iga Korneta
        • Piotr Iwaniuk, Jakub Jopek, Bartosz Bedyński, Artur Skarżycki