IGA Workflow
Transcript

  • 1. IGA Workflow
    Alessandro Gervaso, Vittorio Zamboni
  • 2. Introduction
    IGA Workflow is a web-based tool created using the Django framework.
    It is intended as a management tool for the IGA wet-lab.
    We use it to track the lifecycle of biological samples, from vial to file.
  • 3. Overview
    ● Laboratory management, from sample to flowcell
    ● Bioinformatic analyses
    ● Pipeline management
    ● Technology
    ● Other applications and future developments
  • 4. Biology for dummies
  • 5. Technology - Basics
    Database server
    ● PostgreSQL
    ● Redis
    Workers
    ● Celery
    Web server
    ● nginx + uWSGI
  • 6. Overview - Lab
    ● SAMPLE: the basic unit in the lab.
    ● LIBRARY: a treated sample with an attached chemical TAG.
    ● POOL: a set of libraries, ready to be placed on a flowcell lane.
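The relationship between these three units can be sketched as a small data model. This is only an illustration written with plain Python dataclasses, not the Django models the tool itself would define; the class and field names (`Sample`, `Library`, `Pool`, `name`, `tag`) and the rule that tags are unique within a pool are assumptions, not taken from IGA Workflow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """The basic unit in the lab."""
    name: str


@dataclass
class Library:
    """A treated sample with an attached chemical tag."""
    sample: Sample
    tag: str  # hypothetical: the tag is stored as a short string


@dataclass
class Pool:
    """A set of libraries, ready to be placed on a flowcell lane."""
    name: str
    libraries: List[Library] = field(default_factory=list)

    def add(self, library: Library) -> None:
        # Assumption: tags must be unique within a pool so that each
        # result can be traced back to the library it came from.
        if any(lib.tag == library.tag for lib in self.libraries):
            raise ValueError(
                f"tag {library.tag!r} already used in pool {self.name!r}"
            )
        self.libraries.append(library)
```

A pool is then built by creating samples, wrapping each in a `Library` with its tag, and calling `Pool.add` for each library.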
  • 7. Overview - Lab
  • 8. Overview - Lab
    The main challenge was to replace notebooks with a tool that lets users:
    ● insert samples, libraries, and pools, and edit them;
    ● create lanes and runs, along with the configuration files for the physical sequencer;
    ● collect the sequencer results and map them in an easy way.
    Almost done!
  • 9. Overview - Lab
    We started using the basic Django admin, BUT
    ● page loading was slow
    ● due to the admin's nature, we lacked flexibility
    ● we were constraining the lab staff's procedures
    ● management was cumbersome
  • 10. Overview - Analyses
    After the physical sequencing, the raw data (basecalls) must be converted into FASTQ files.
    FASTQ files are essentially FASTA files with embedded quality scores.
    They are the starting point for almost every genomic analysis.
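To make the FASTQ/FASTA relationship concrete, here is a minimal sketch of a reader for the common four-line FASTQ record (header, sequence, separator, quality string). It assumes Phred+33 quality encoding and unwrapped records; the function names `read_fastq` and `to_fasta` are hypothetical and not part of IGA Workflow.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) for each four-line record."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # Assumption: Phred+33 encoding, one quality character per base.
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


def to_fasta(read_id: str, sequence: str) -> str:
    """Drop the quality line: what remains is a FASTA record."""
    return f">{read_id}\n{sequence}\n"


example = io.StringIO("@read1\nACGT\n+\nIIII\n")
for read_id, sequence, scores in read_fastq(example):
    print(to_fasta(read_id, sequence), scores)
```

Dropping the quality line is all it takes to go from a FASTQ record to a FASTA record, which is the sense in which the slide describes FASTQ as FASTA plus quality information.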
  • 11. Overview - Analyses
    To optimize time and resources we use a cluster of Celery workers.
    ● we track the software packages used
    ● we track their parameters
    ● we create a set of useful stats
  • 12. Technology
    Additions:
    ● Informational Celery tasks (list directory contents, copy files between devices with a D-Bus plugin and UDisks instead of ugly hacks, ...)
  • 13. Overview - Pipelines
    FASTQ file alignments and assemblies.
    Each analysis uses different software, in sequence or in parallel.
    With hundreds of samples, the analyses can't be handled manually.
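The "in sequence or in parallel" idea can be sketched as a list of stages, where stages run one after another and the steps inside a stage run concurrently. This is a toy illustration using the standard library's `concurrent.futures`, not the Celery-based implementation the slides describe; `run_pipeline` and the step functions (`trim`, `align`, `assemble`, `report`) are hypothetical placeholders that only build strings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# A step receives all results of the previous stage and returns one result.
Step = Callable[[List[str]], str]


def run_pipeline(stages: List[List[Step]], sample: str) -> List[str]:
    """Run stages in order; steps within a stage run in parallel."""
    inputs = [sample]
    with ThreadPoolExecutor() as pool:
        for stage in stages:
            # map() preserves the order of the steps in the stage.
            results = list(pool.map(lambda step: step(inputs), stage))
            inputs = results
    return inputs


# Hypothetical steps: real ones would invoke external CLI software.
def trim(inputs: List[str]) -> str:
    return f"trimmed({inputs[0]})"


def align(inputs: List[str]) -> str:
    return f"aligned({inputs[0]})"


def assemble(inputs: List[str]) -> str:
    return f"assembled({inputs[0]})"


def report(inputs: List[str]) -> str:
    return " + ".join(inputs)


result = run_pipeline([[trim], [align, assemble], [report]], "sample.fastq")
```

Here `trim` runs first, `align` and `assemble` run in parallel on its output, and `report` combines both results. In Celery, a similar shape is usually built from its `chain` and `group` primitives; whether IGA Workflow does so is not stated in the slides.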
  • 14. Pipelines
    The results of each pipeline (like the previous analyses) must be tracked.
    Since CLI-based software is not user-friendly, we developed a graphical pipeline builder.
    Users can choose and combine different software packages to perform their own analyses.
  • 15. Pipeline
    ● Workers have different queues in order to handle different tasks
    ● Worker tasks communicate with each other through Redis to avoid inconsistencies and to improve performance
  • 16. Pipelines
    Different queues (stored in Redis databases):
      ○ available
      ○ queued
      ○ active
      ○ completed
      ○ error
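One way to picture these five queues is as a board on which a job sits in exactly one queue at a time and moves between them. The sketch below is an in-memory stand-in using `collections.deque`, not Redis; the class name `QueueBoard`, the meaning assigned to each queue, and the table of allowed moves are all assumptions, since the slides only list the queue names.

```python
from collections import deque
from typing import Deque, Dict, Set

STATES = ("available", "queued", "active", "completed", "error")

# Assumption: the allowed moves below are a guess at a sensible lifecycle.
TRANSITIONS: Dict[str, Set[str]] = {
    "available": {"queued"},
    "queued": {"active"},
    "active": {"completed", "error"},
    "completed": set(),
    "error": {"queued"},  # hypothetical: a failed job may be retried
}


class QueueBoard:
    """Keeps every job in exactly one of the named queues."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[str]] = {s: deque() for s in STATES}

    def add(self, job_id: str) -> None:
        self.queues["available"].append(job_id)

    def move(self, job_id: str, source: str, target: str) -> None:
        if target not in TRANSITIONS[source]:
            raise ValueError(f"cannot move from {source} to {target}")
        # deque.remove raises ValueError if the job is not in the source
        # queue, so a job can never end up in two queues at once.
        self.queues[source].remove(job_id)
        self.queues[target].append(job_id)

    def state_of(self, job_id: str) -> str:
        for state, queue in self.queues.items():
            if job_id in queue:
                return state
        raise KeyError(job_id)
```

With a shared store such as Redis, the remove-and-append pair would need to be a single atomic operation so that concurrent workers cannot observe a job in neither queue or in both.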
  • 17. Pipelines
    Video demonstration of a working pipeline using different kinds of steps.
    (1 video: 1 minute)
  • 18. Under development
    ● a simple interface that allows customers to:
      ○ insert their samples directly
      ○ view the results of their pipeline in a genome browser - also made with Django (see below)!
    ● barcodes
    ● genome browser (like GMOD GBrowse, but with the greatness of Python instead of the confusion of Perl)
  • 19. Genome browser
    An application that allows browsing a genome's annotations (such as genes, or where reads are aligned).
    Currently, the best web genome browser is GMOD GBrowse.
  • 20. Genome browser
    The challenge is to develop a genome browser that has a set of basic features and can accept plugins for particular types of data - like GMOD GBrowse.
    In addition, it must be quick and easy to manage - NOT like GMOD GBrowse.
  • 21. Genome browser
    Video demonstration of the genome browser.
    (2 videos: 1 minute + 1 minute)
  • 22. Acknowledgments
    ● The wet-lab ladies at IGA
    ● WEBdeBS
    ● Teams that developed:
      ○ Django
      ○ jQuery
      ○ nginx
      ○ uWSGI
      ○ Celery
      ○ Redis
      ○ pip and virtualenv
      ○ PostgreSQL
      ○ All the open source projects involved