BPIPE: BIOINFORMATICS PIPELINE FRAMEWORK
Speaker: Mohamed Nadhir Djekidel (那弟尔)
2015/11/06
WHY WE NEED PIPELINES
➤ A bioinformatics analysis is generally a sequence of steps.
➤ Some analyses need a combination of tools (bowtie, samtools, etc.).
➤ Some tasks are repetitive (especially when there are many files).
➤ The script has to be edited by hand if a program crashes midway.
➤ Scripts are sometimes hard-coded and therefore not portable.
➤ …..
MOTIVATIONS BEHIND BPIPE
➤ A dedicated programming language for defining and executing bioinformatics pipelines
➤ Not much programming skill is needed
➤ Simple definition of tasks
➤ Easy restart of a job from the point of failure
➤ Easy parallelism and job sequence management
➤ Integration with cluster resource managers (SGE, PBS, LSF)
➤ Modular development of reusable pipeline stages
➤ Automatic logging
BPIPE’S ARCHITECTURE
➤ Bpipe language: based on Groovy, but plain shell commands are generally enough.
➤ The Bpipe job management tool: Bash shell + Java
➤ Log management: creates a .bpipe folder
➤ Communication with resource managers: sending jobs to the queue, etc.
BASIC BPIPE STRUCTURES
stage_one
stage_two
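The two stages in the diagram could be written roughly as follows. This is a minimal sketch, assuming a script file called pipeline.groovy; the echo commands are placeholders and do not come from the talk.

```groovy
// pipeline.groovy (hypothetical file name)
// Each stage is a named Groovy closure; exec runs a shell command.
stage_one = {
    exec "echo 'running stage one'"
}

stage_two = {
    exec "echo 'running stage two'"
}

// The + operator runs the stages one after another.
run { stage_one + stage_two }
```

A script like this is normally started with `bpipe run pipeline.groovy`. Older Bpipe releases may need the last line written as `Bpipe.run { ... }`.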
CONVERT A SHELL SCRIPT TO BPIPE
Original BASH script
BPIPE Script
DYNAMIC INPUT AND OUTPUT
Use the variables $input and $output instead of hard-coded file names
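A small sketch of what this looks like in practice. The stage names and the sort and wc commands are illustrative assumptions, not the script shown on the slide.

```groovy
// $input and $output are filled in by Bpipe when a stage runs,
// so no file name is hard-coded inside the stage itself.
sort_lines = {
    exec "sort $input > $output"
}

count_lines = {
    exec "wc -l $input > $output"
}

// The output of sort_lines becomes the input of count_lines.
run { sort_lines + count_lines }
```

The first input is given on the command line, for example `bpipe run pipeline.groovy data.txt` (data.txt being a hypothetical file).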
PARALLEL TASKS
Use square brackets [ ] to specify parallel tasks
Diagram 1: step1, then step2 and step3 in parallel
Diagram 2: step1, then two parallel branches: step2 → step4 and step3 → step5
PARALLEL TASKS - CONT.
Diagram: step1, then two parallel branches (step2 → step4 and step3 → step5), then step6
(step6 waits until both branches have finished)
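In a Bpipe script, parallel branches are written as a comma-separated list inside square brackets. The branching layout described above could be sketched like this; every stage is a placeholder that only echoes its own name.

```groovy
// Placeholder stages, not taken from the talk.
step1 = { exec "echo step1" }
step2 = { exec "echo step2" }
step3 = { exec "echo step3" }
step4 = { exec "echo step4" }
step5 = { exec "echo step5" }
step6 = { exec "echo step6" }

// The two branches (step2 + step4) and (step3 + step5) run side by side.
// step6 starts only after both branches have finished.
run { step1 + [ step2 + step4, step3 + step5 ] + step6 }
```

The simpler case from the first diagram would be `run { step1 + [ step2, step3 ] }`.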
RUN ON A CLUSTER
➤ Create a bpipe.config file in your working directory
➤ Select the SGE system and specify its configuration (optional)
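A configuration file for this might look like the fragment below. It is a sketch only: the queue name is a placeholder, and the options that are available should be checked against the Bpipe documentation for the installed version.

```groovy
// bpipe.config, placed in the directory the pipeline is run from.
// Send commands to Sun Grid Engine instead of running them locally.
executor="sge"

// Optional setting; "all.q" is a hypothetical queue name.
queue="all.q"
```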
PIPELINE REPORT
A file index.html will be generated in the doc folder
INPUT SPLIT
➤ Inputs can be grouped using regular expressions
➤ * acts as a general wildcard and affects the ordering of the files
➤ % marks the part of the file name that is used for splitting
Example
INPUT SPLIT - EXAMPLES
Input
The script
Default parameters
INPUT SPLIT - EXAMPLES
Pass individual files
Order alphabetically
Group files
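The original examples were screenshots, so here is a hedged stand-in. It assumes four hypothetical input files named sample1_R1.txt, sample1_R2.txt, sample2_R1.txt and sample2_R2.txt, and a placeholder stage called show_group.

```groovy
show_group = {
    // $inputs expands to all files handed to this branch.
    exec "echo 'files in this branch:' $inputs"
}

// % marks the part of the name that defines a group; * matches the rest
// without creating new groups. With the four files above this should give
// two parallel branches, one per sample, each receiving two files.
run { "%_R*.txt" * [ show_group ] }
```

It would be run as `bpipe run pipeline.groovy sample*_R*.txt`.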
CONTROLLING OUTPUT NAMING
Filter: keeps the same extension and adds the filter name
file.csv → file.nocomments.csv
Transform: changes the extension
file.csv → file.xml
file_fast.zip
CONTROLLING OUTPUT NAMING
Produce: produces an output file with the specified name
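The three naming constructs could be used as sketched below. The stage names, the output name result.zip and the csv2xml command are assumptions made for illustration; csv2xml in particular is a hypothetical tool.

```groovy
// filter: extension is kept and the filter name is inserted before it
// (naming rule: file.csv -> file.nocomments.csv)
strip_comments = {
    filter("nocomments") {
        exec "grep -v '^#' $input > $output"
    }
}

// transform: the extension is replaced
// (naming rule: file.csv -> file.xml)
to_xml = {
    transform("xml") {
        exec "csv2xml < $input > $output"   // csv2xml is a hypothetical tool
    }
}

// produce: the output file gets exactly the name given here
pack = {
    produce("result.zip") {
        exec "zip $output $input"
    }
}

run { strip_comments + to_xml + pack }
```

In each case $output inside the block refers to the file name derived by the surrounding construct.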
RUNNING R CODE
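The slide's code was an image. As a hedged sketch, Bpipe's R block lets a stage embed R code directly; the stage name, the output name values.pdf and the R commands below are assumptions, and the input is taken to be a text file of numbers.

```groovy
plot_values = {
    produce("values.pdf") {
        // $input and $output are substituted before the code is passed to R.
        R {"""
            values <- scan("$input")
            pdf("$output")
            hist(values)
            dev.off()
        """}
    }
}

run { plot_values }
```

Because the block is a Groovy string, a literal dollar sign in the R code (as in `df$column`) would need to be escaped as `\$`.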
SOME COMMANDS
ADDING INFORMATION TO THE SCRIPT
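One way a Bpipe script can carry descriptive information is through the about and doc statements. The sketch below assumes that this is what the slide showed; the title, description and author are placeholders.

```groovy
// Pipeline-level title.
about title: "Example pipeline"

hello = {
    // Stage-level documentation.
    doc title: "Say hello",
        desc: "Prints a greeting; a placeholder stage",
        author: "someone@example.com"

    exec "echo hello"
}

run { hello }
```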
USEFUL TUTORIALS
➤ Download bpipe: https://github.com/ssadedin/bpipe
➤ Documentation: http://docs.bpipe.org/
➤ A complete workshop: https://github.com/tucano/bpipe_workshop
➤ Paper : http://bioinformatics.oxfordjournals.org/content/28/11/1525.full
THANKS
