Galaxy


    Galaxy - Presentation Transcript

    1. Galaxy (http://g2.bx.psu.edu)
    2. What is Galaxy? • An open-source framework for integrating various computational tools and databases into a cohesive workspace • A web-based service we (Penn State) provide, integrating many popular tools and resources for comparative genomics • A completely self-contained Python application for building your own Galaxy style sites
    3. Galaxy’s web user interface
    4. Integrating tools into Galaxy
    5. How Galaxy integrates existing web-based tools
    6. Proxy based tools User makes request to Galaxy
    7. Proxy based tools Galaxy delegates request to external site
    8. Proxy based tools External site generates response • If data, Galaxy determines type, processes, and adds to ‘history’ • Otherwise, return response to user
    9. External tools User makes request to Galaxy
    10. External tools Galaxy sends user directly to external site with extra URL data
    11. External tools User interacts directly with external site
    12. External tools When data is generated the user is sent back to Galaxy. Data can be fetched immediately, or wait for notification from the external site
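The external-tool flow on slides 9 to 12 depends on Galaxy sending the user away with "extra URL data" so the external site knows where to send them back. A minimal sketch of building such a redirect URL follows. It is written for modern Python 3 rather than the Python 2.4 mentioned later in the slides. The parameter name `return_url` and the helper `build_external_redirect` are hypothetical, chosen for illustration, and are not taken from Galaxy's code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_external_redirect(external_url, return_url, extra=None):
    """Append a return address (and optional extra data) to an external tool URL.

    `return_url` as a query parameter name is an assumption for this sketch.
    """
    parts = urlsplit(external_url)
    # Keep whatever query parameters the external URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("return_url", return_url))
    # Sort extra items so the resulting URL is deterministic.
    query.extend(sorted((extra or {}).items()))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(build_external_redirect(
        "http://example.org/tool?db=hg18",
        "http://localhost:8080/tool_runner",
        {"tool_id": "demo"},
    ))
```

The external site would read the return address from the query string and redirect the user's browser there once data is ready.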
    13. How Galaxy integrates existing command line tools
    14. HTML inputs generated from abstract parameter description
    15. HTML inputs generated from abstract parameter description
    16. HTML inputs generated from abstract parameter description
    17. HTML inputs generated from abstract parameter description
    18. Tool help generated from a simple text format
    19. Automatic input validation based on type, or more...
    20. Template for generating command line from parameter values
    21. Output datasets generated by the tool
    22. Special actions to be run before / after execution
    23. Functional tests to be run with the “full stack” in place
    24. Running functional tests for a specific tool on the command line
    25. Test results, on command line and as HTML report
    26. Dealing with more complex interface needs
    27. Repeating sets of parameters
    28. Template language for building complex command lines
    29. Conditional groups, grouping constructs can be nested
    30. Command line tool expects a configuration file
    31. Configuration file is generated based on user input
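Slides 20, 28 and 31 describe filling a template with parameter values to produce a command line or a configuration file. The sketch below illustrates the idea with the standard library's `string.Template` as a stand-in; the slides list CheetahTemplate as the actual template engine, which is considerably more capable. The template text, parameter names, and `render_command` helper are all hypothetical.

```python
import shlex
from string import Template

# Hypothetical command template; each $name is filled from a tool parameter.
COMMAND_TEMPLATE = Template(
    "python $script --input $input --window $window --output $output"
)


def render_command(template, params):
    """Substitute shell-quoted parameter values into a command template."""
    quoted = {name: shlex.quote(str(value)) for name, value in params.items()}
    # substitute() raises KeyError if the template needs a missing parameter.
    return template.substitute(quoted)


if __name__ == "__main__":
    print(render_command(COMMAND_TEMPLATE, {
        "script": "tool.py",
        "input": "my data.bed",
        "window": 10,
        "output": "out.bed",
    }))
```

Quoting each value before substitution keeps user-supplied input, such as a file name containing spaces, from being split into several shell arguments.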
    32. Job execution in Galaxy
    33. Flexible execution environment • Dependencies between jobs handled by “JobManager” within Galaxy. • Either in-process with the web application, or a separate process managing a queue to which multiple front-ends submit
    34. Flexible execution environment • Once jobs are ready, submitted to a “JobRunner” • Runners are pluggable • Can have multiple runners, and send jobs to different runners depending on capabilities • Current implementations: • Local runner executing a limited number of local processes • PBS runner dispatches to a cluster of worker nodes • Pluggable queueing policies in the works!
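Slides 33 and 34 describe a manager that tracks dependencies between jobs and hands ready jobs to pluggable runners. A minimal sketch of that division of labour is given below. The class and method names (`Job`, `RecordingRunner`, `dispatch_ready`, `mark_finished`) are assumptions made for illustration and do not reflect Galaxy's actual classes; the stand-in runner only records submissions instead of starting processes or talking to PBS.

```python
class Job:
    """A unit of work with a preferred runner and a set of prerequisite job ids."""

    def __init__(self, job_id, runner="local", depends_on=()):
        self.job_id = job_id
        self.runner = runner
        self.depends_on = set(depends_on)


class RecordingRunner:
    """Stand-in runner: records submitted job ids rather than executing anything."""

    def __init__(self, name):
        self.name = name
        self.submitted = []

    def submit(self, job):
        self.submitted.append(job.job_id)


class JobManager:
    """Holds waiting jobs and dispatches those whose dependencies have finished."""

    def __init__(self, runners):
        self.runners = runners  # maps runner name -> runner object (pluggable)
        self.waiting = []
        self.finished = set()

    def add(self, job):
        self.waiting.append(job)

    def mark_finished(self, job_id):
        self.finished.add(job_id)

    def dispatch_ready(self):
        # A job is ready once every job it depends on has finished.
        ready = [job for job in self.waiting if job.depends_on <= self.finished]
        for job in ready:
            self.waiting.remove(job)
            self.runners[job.runner].submit(job)
        return [job.job_id for job in ready]


if __name__ == "__main__":
    local, pbs = RecordingRunner("local"), RecordingRunner("pbs")
    manager = JobManager({"local": local, "pbs": pbs})
    manager.add(Job("align", runner="pbs"))
    manager.add(Job("summarize", depends_on=["align"]))
    print(manager.dispatch_ready())
    manager.mark_finished("align")
    print(manager.dispatch_ready())
```

Because runners are looked up by name in a dictionary, adding another execution backend in this sketch only requires an object with a `submit` method.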
    35. Deeper customization of Galaxy
    36. Galaxy web interface is easily customized / branded
    37. Custom datatypes • Datatypes supported by a Galaxy instance can be configured at runtime • Completely reengineering “metadata” • Easy way to define custom metadata • Automatically generated editing interfaces (similar to tool interfaces) • Actions on datatypes (displaying at external sites, format conversion) all pluggable • Nothing “genomics” specific will be hardcoded!
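Slide 37 describes datatypes that are registered at runtime, carry their own metadata definitions, and expose pluggable actions such as format conversion. One way such a registry could look is sketched below. Everything here (`DatatypeRegistry`, its methods, the `tabular` example) is hypothetical and meant only to make the idea concrete, not to describe Galaxy's implementation.

```python
class DatatypeRegistry:
    """Runtime registry mapping a file extension to metadata defaults and converters."""

    def __init__(self):
        self._by_extension = {}

    def register(self, extension, metadata_defaults=None, converters=None):
        # Copy the inputs so later changes by the caller do not leak in.
        self._by_extension[extension] = {
            "metadata": dict(metadata_defaults or {}),
            "converters": dict(converters or {}),
        }

    def metadata_for(self, extension):
        """Return a fresh copy of the default metadata for a datatype."""
        return dict(self._by_extension[extension]["metadata"])

    def convert(self, extension, target, text):
        """Run the pluggable converter from `extension` to `target` on `text`."""
        return self._by_extension[extension]["converters"][target](text)


if __name__ == "__main__":
    registry = DatatypeRegistry()
    registry.register(
        "tabular",
        metadata_defaults={"columns": 0},
        converters={"csv": lambda text: text.replace("\t", ",")},
    )
    print(registry.metadata_for("tabular"))
    print(registry.convert("tabular", "csv", "chr1\t100\t200"))
```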
    38. The future
    39. Future tool development • Tools for statistical genetics • Collaborating closely with the “RGenetics” project (http://rgenetics.org) • Tools for phylogenetic analysis • Based on HyPhy (http://hyphy.org)
    40. Workflow support • Workflow construction by example • Users will continue to build analyses as they do now, and will be able to extract portions of their histories as reusable workflows • Will probably work for most existing histories! (we’ve been saving the right data all along) • Explicit workflow construction and editing • Support for repetitive invocation of tools and workflows, and aggregation of results • Saving and sharing of workflows, reproducible!
    41. Some Technical Details
    42. Under the hood • Python 2.4, though some dependencies use CPython-specific extensions • Web framework: PythonPaste, Routes, WebHelpers, Beaker, CheetahTemplate, ... • SQLAlchemy for database abstraction
    43. Out of the box configuration • Just checkout from subversion and run! • All dependencies packaged as eggs • Pure python HTTP server included (paste.httpserver) • Embedded database (sqlite) • Datasets stored on local filesystem • Jobs run locally
    44. PSU production configuration • Deployed behind Apache using mod_proxy • Python threads do not scale across CPUs, so we use both forking and threading, similar to Apache’s worker MPM • PostgreSQL • Jobs dispatched to a PBS cluster using “pbs-python”
    45. The core Galaxy development team
    46. Acknowledgements • Galaxy collaborators: • Ross Lazarus, Sergei Kosakovsky Pond • UCSC Genome Browser team • Biomart team • National Science Foundation

    + boscbosc, 3 years ago


    Title: Galaxy
    Author: James Taylor


    © All Rights Reserved
