Iga workflow
 

    Iga workflow Presentation Transcript

    • IGA Workflow
      Alessandro Gervaso, Vittorio Zamboni
    • Introduction
      IGA Workflow is a web-based tool created using the Django framework.
      It is intended as a management tool for the IGA wet lab.
      We use it to track the lifecycle of biological samples, from vial to file.
    • Overview
      ● Laboratory management, from sample to flowcell
      ● Bioinformatic analyses
      ● Pipeline management
      ● Technology
      ● Other applications and future developments
    • Biology for dummies
    • Technology - Basics
      Database server
      ● PostgreSQL
      ● Redis
      Workers
      ● Celery
      Web server
      ● nginx + uWSGI
    • Overview - Lab
      ● SAMPLE: the basic unit in the lab.
      ● LIBRARY: a treated sample with an attached chemical TAG.
      ● POOL: a set of libraries, ready to be placed on a flowcell lane.
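The sample, library and pool relationship described above can be sketched with plain Python dataclasses. This is only an illustration, not the actual IGA Workflow models (which are presumably Django models); the class fields and the rule that tags must be unique within a pool are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """The basic unit in the lab."""
    name: str


@dataclass
class Library:
    """A treated sample with an attached chemical tag."""
    sample: Sample
    tag: str


@dataclass
class Pool:
    """A set of libraries, ready to be placed on a flowcell lane."""
    name: str
    libraries: List[Library] = field(default_factory=list)

    def add(self, library: Library) -> None:
        # Assumption for this sketch: two libraries in the same pool
        # may not share a tag, otherwise they could not be told apart.
        if any(existing.tag == library.tag for existing in self.libraries):
            raise ValueError(f"tag {library.tag!r} already used in pool {self.name!r}")
        self.libraries.append(library)
```

For instance, `Pool("P1").add(Library(Sample("s1"), "ACGT"))` succeeds, while adding a second library with the tag `"ACGT"` to the same pool raises `ValueError`.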
    • Overview - Lab
    • Overview - Lab
      The main challenge was to replace notebooks with a tool that allows users to:
      ● insert samples, libraries, and pools, and edit them;
      ● create lanes and runs, and the configuration files for the physical sequencer;
      ● collect the sequencer results and map them in an easy way.
      Almost done!
    • Overview - Lab
      We started using the basic Django admin, BUT
      ● the page loading was slow
      ● due to the admin's nature we lacked flexibility
      ● we were forcing the lab people's procedures
      ● management was cumbersome
    • Overview - Analyses
      After the physical sequencing, the raw data (basecalls) must be converted into FASTQ files.
      FASTQ files are essentially FASTA files with embedded per-base quality scores.
      They are the starting point for almost every genomic analysis.
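To make the FASTQ description concrete, here is a minimal parsing sketch. It assumes the common four-line record layout (header, sequence, `+` separator, quality string) with unwrapped lines and Phred+33 quality encoding; the function and type names are made up for this example and are not part of IGA Workflow.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq(lines: Iterable[str], offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ entries (assumed not line-wrapped)."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(it, None)
        separator = next(it, None)
        quality = next(it, None)
        if quality is None:
            raise ValueError("truncated FASTQ record")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # Each quality character encodes one score: ord(char) - offset.
        scores = [ord(char) - offset for char in quality]
        yield FastqRecord(header[1:], sequence, scores)
```

Apart from the quality line and the separator, a record carries the same information as a FASTA entry: a name and a sequence.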
    • Overview - Analyses
      To optimize time and resources we use a cluster of Celery workers.
      ● we track the software packages used
      ● we track their parameters
      ● we create a set of useful stats
    • Technology
      Additions:
      ● Informational Celery tasks (list directory contents, copy files between devices with a D-Bus plugin and UDisks instead of ugly hacks, ...)
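One way to picture the tracking of software packages and parameters is a small provenance record stored alongside each analysis. The sketch below uses only the standard library; the record fields and the tool name are placeholders invented for the example, and the real system may store this information quite differently (for instance in PostgreSQL through Django models).

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class AnalysisRecord:
    """Hypothetical record of what was run and how."""
    tool: str
    version: str
    parameters: Dict[str, Any]
    started_at: str = field(default_factory=_utc_now)


def to_json(record: AnalysisRecord) -> str:
    # Sorted keys keep the stored text stable and easy to diff.
    return json.dumps(asdict(record), sort_keys=True)
```

A worker task could build such a record before launching a tool, for example `AnalysisRecord("example-aligner", "1.0", {"threads": 4})`, and save the JSON next to the results.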
    • Overview - Pipelines
      FASTQ file alignments and assemblies.
      Each analysis uses different software, in sequence or in parallel.
      With hundreds of samples, the analyses can't be handled manually.
    • Pipelines
      The results of each pipeline (like the previous analyses) must be tracked.
      Since CLI-based software is not user-friendly, we developed a graphical pipeline builder.
      Users are able to choose and combine different software packages to perform their own analyses.
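The idea of combining steps in sequence or in parallel can be sketched with two small combinators. This is a toy model built on the standard library, not the actual pipeline builder: the real steps are external programs run by Celery workers, whereas here a step is just a Python callable, and the names `in_sequence` and `in_parallel` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence

Step = Callable[[Any], Any]


def in_sequence(steps: Sequence[Step]) -> Step:
    """Build a step that feeds each step's output into the next one."""
    def run(data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data
    return run


def in_parallel(steps: Sequence[Step]) -> Step:
    """Build a step that runs every step on the same input concurrently."""
    def run(data: Any) -> List[Any]:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(step, data) for step in steps]
            # Results are collected in the order the steps were given.
            return [future.result() for future in futures]
    return run


# Toy usage: clean the input, then compute two independent results.
pipeline = in_sequence([str.strip, in_parallel([str.upper, len])])
result = pipeline("  acgt  ")
```

Because both combinators return an ordinary step, they nest freely, which is roughly what a graphical builder would need underneath.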
    • Pipeline
      ● Workers have different queues in order to handle different tasks
      ● Worker tasks talk to each other through Redis to avoid inconsistencies and to improve performance
    • Pipelines
      Different queues (stored in Redis dbs):
      ○ available
      ○ queued
      ○ active
      ○ completed
      ○ error
    • Pipelines
      Video demonstration of a working pipeline with different kinds of steps.
      (1 video: 1 minute)
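The five queues can be read as the states a pipeline step moves through. The sketch below models them with in-memory sets standing in for Redis; the allowed transitions are an assumption made for the example, since the slides name only the queues themselves.

```python
STATES = ("available", "queued", "active", "completed", "error")

# Assumed lifecycle: available -> queued -> active -> completed or error.
ALLOWED = {
    "available": {"queued"},
    "queued": {"active"},
    "active": {"completed", "error"},
    "completed": set(),
    "error": set(),
}


class StepQueues:
    """In-memory stand-in for the queues kept in Redis."""

    def __init__(self, step_ids):
        self.queues = {state: set() for state in STATES}
        self.queues["available"].update(step_ids)

    def state_of(self, step_id):
        for state, members in self.queues.items():
            if step_id in members:
                return state
        raise KeyError(step_id)

    def move(self, step_id, target):
        current = self.state_of(step_id)
        if target not in ALLOWED[current]:
            raise ValueError(f"cannot move {step_id!r} from {current} to {target}")
        self.queues[current].remove(step_id)
        self.queues[target].add(step_id)
```

If the queues were Redis sets, a move like this could be done atomically with the `SMOVE` command, which is one way several workers could avoid picking up the same step twice.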
    • Under development
      ● simple interface that allows customers to:
        ○ insert their samples directly
        ○ watch the results of their pipeline in a genome browser - also made with Django (see below)!
      ● barcodes
      ● genome browser (like GMOD GBrowse, but with the greatness of Python instead of the confusion of Perl)
    • Genome browser
      An application that allows browsing a genome's annotations (like genes, or where reads are aligned).
      Currently, the best web genome browser is GMOD GBrowse.
    • Genome browser
      The challenge is to develop a genome browser that has a set of basic features and can accept plugins for particular types of data - like GMOD GBrowse.
      In addition, it must be quick and easy to manage - NOT like GMOD GBrowse.
    • Genome browser
      Video demonstration of the genome browser.
      (2 videos: 1 minute + 1 minute)
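A plugin mechanism for particular types of data could look roughly like the registry below, where each plugin returns the features overlapping a requested region. Everything here (the `Feature` type, the `register` decorator, the sample gene coordinates) is hypothetical and only illustrates the idea; it is not the browser's real interface.

```python
from typing import Callable, Dict, List, NamedTuple


class Feature(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    label: str


TrackPlugin = Callable[[str, int, int], List[Feature]]
_plugins: Dict[str, TrackPlugin] = {}


def register(data_type: str):
    """Decorator that registers a track plugin under a data type name."""
    def decorator(func: TrackPlugin) -> TrackPlugin:
        _plugins[data_type] = func
        return func
    return decorator


# Made-up annotations used only for this example.
GENES = [
    Feature("chr1", 100, 500, "geneA"),
    Feature("chr1", 800, 1200, "geneB"),
]


@register("genes")
def gene_track(chrom: str, start: int, end: int) -> List[Feature]:
    # Two half-open intervals overlap when each starts before the other ends.
    return [f for f in GENES if f.chrom == chrom and f.start < end and f.end > start]


def render_region(chrom: str, start: int, end: int) -> Dict[str, List[Feature]]:
    """Ask every registered plugin for the features in the region."""
    return {name: plugin(chrom, start, end) for name, plugin in _plugins.items()}
```

A new data type, such as aligned reads, would then only need its own function decorated with `register`.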
    • Acknowledgments
      ● The wet-lab ladies at IGA
      ● WEBdeBS
      ● Teams that developed:
        ○ Django
        ○ jQuery
        ○ nginx
        ○ uWSGI
        ○ Celery
        ○ Redis
        ○ pip and virtualenv
        ○ PostgreSQL
      ● All the open source projects involved