pipeline_structure_overview

Published in: Design

  1. NPG Pipeline Overview: analyse_RTA; PB_cal; post_qseq; post_qc_review; manual QC
  2. analyse_RTA: OLB (Bustard) to create qseq; demultiplex; CASAVA (Gerald) Recalibration
  3. PB_cal: OLB (bcl2qseq) to create qseq; demultiplex; PB_cal lane recalibration; No recalibration
  4. post_qseq: Produce per lane fastq (qseq2fastq.pl); Produce per lane srf (illumina2srf & srf_index_hash); Split out nonconsented data; Split fastqs by multiplex tag; qX_yield; insert_size; adapter; sequence_error; gc_fraction
  5. post_qseq: Create run analysis schema information; contamination; bam file generation; md5 generation; bam_markduplicates
  6. post_qseq: gc_bias; bam indexing; Check cluster counts; Manual QC Stage
  7. post_qc_review: archive_to_sra; archive_to_irods; Upload fastqcheck; Upload auto_qc; Upload illumina analysis; Tidy up staging area
  8. Additional Notes ● Spider runs at the start of every pipeline and Finish runs at the end. ● Spider caches the web pages used throughout the pipeline and sets an environment variable so that all launched jobs can access the cache. ● Finish is essential: it closes off the log files and writes a JSON string listing the processes launched, which the schema generation needs.
  9. Additional Notes ● Status changes have been left out, along with some file checking and the creation of tag-specific lane files, which occur at the start of the primary analysis pipelines. These steps do run, but they are not directly responsible for the files and QC that you see. ● The production version of the primary pipeline launches a version of the secondary pipeline, which creates a Latest_Summary link to its archival files and QC.
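
The Spider and Finish notes describe a wrapper pattern: cache shared inputs once, publish the cache location through an environment variable, run the stages, then record what was launched as JSON. The Python sketch below is only an illustration of that pattern, not the NPG implementation. The variable name `PIPELINE_CACHE_DIR`, the function names, and the shape of the JSON record are all hypothetical, and the page contents are passed in as a dictionary rather than fetched from the web.

```python
import json
import os
from pathlib import Path

# Hypothetical variable name; the slides do not say what the real one is called.
CACHE_ENV_VAR = "PIPELINE_CACHE_DIR"


def spider(pages, cache_dir):
    """Write each page body into cache_dir and export the cache location."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for name, body in pages.items():
        (cache_dir / name).write_text(body)
    # Child jobs inherit the environment, so they can locate the cache.
    os.environ[CACHE_ENV_VAR] = str(cache_dir)
    return cache_dir


def read_cached(name):
    """Read a cached page from inside a job, using only the environment variable."""
    return (Path(os.environ[CACHE_ENV_VAR]) / name).read_text()


def finish(launched):
    """Return a JSON string naming the stages that were launched (assumed format)."""
    return json.dumps({"launched": launched})


def run_pipeline(stages, pages, cache_dir):
    """Run spider first, then each (name, callable) stage in order, then finish."""
    spider(pages, cache_dir)
    launched = []
    for stage_name, job in stages:
        job()
        launched.append(stage_name)
    return finish(launched)
```

A caller would pass stage names taken from the slides, for example `[("analyse_RTA", job_a), ("post_qseq", job_b), ("post_qc_review", job_c)]`, where each job is any callable. Keeping the JSON record as the return value of `finish` mirrors the note that schema generation depends on it.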