Lab meeting—technical talk
GNU Parallel
Coby Viner
Hoffman Lab
Thursday December 7, 2023
Overview
Why use GNU Parallel?
Basic examples from the tutorial
Basic elements of syntax [from the tutorial]
Much more syntax for many other tasks
Selected recent features
More tutorial examples
Some examples of my GNU parallel usage
Why use GNU Parallel?
a shell tool for executing jobs in parallel using one or more computers.
- Easily parallelize perfectly parallel tasks
- For each chromosome…
- For each sex, for each technical replicate, for each hyper-parameter(s)
- Job submission scripts within a for loop
- Improved, cleaner syntax (for the programmer), even in serial
- Facile interleaving of tasks, in the order one is thinking about them
A basic [man page] example: “Working as xargs -n1. Argument appending”
find . -name '*.html' | parallel gzip --best
find . -type f -print0 |
  parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
Easy installation from source
Another basic [man page] example: “Inserting multiple arguments”
bash: /bin/mv: Argument list too long
ls | grep -E '\.log$' | parallel mv {} destdir
ls | grep -E '\.log$' | parallel -m mv {} destdir
Basic elements of syntax [from the tutorial]
Input:
parallel echo ::: A B C # command line
cat abc-file | parallel echo # from STDIN
parallel -a abc-file echo # from a file
Output [line order may vary]:
A
B
C
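The three input styles above are interchangeable. A minimal sketch, assuming GNU Parallel is installed; the temporary directory and file contents are made up for illustration, and -k (--keep-order) is added only so that the output order is deterministic:

```shell
# Assumption: GNU Parallel is on PATH. abc-file is created here for illustration.
workdir=$(mktemp -d)
printf '%s\n' A B C > "$workdir/abc-file"

# -k (--keep-order) prints results in input order, whichever job finishes first.
from_args=$(parallel -k echo ::: A B C)               # arguments on the command line
from_stdin=$(parallel -k echo < "$workdir/abc-file")  # arguments read from STDIN
from_file=$(parallel -k -a "$workdir/abc-file" echo)  # arguments read from a file

printf '%s\n' "$from_args"
rm -r "$workdir"
```

Without -k the same three lines are printed, but possibly in a different order.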
Basic elements of syntax [from the tutorial]
Multiple inputs.
Input:
parallel echo ::: A B C ::: D E F
cat abc-file | parallel -a - -a def-file echo
parallel -a abc-file -a def-file echo
cat abc-file | parallel echo :::: - def-file # alt. file
parallel echo ::: A B C :::: def-file # mix cmd. and file
Output [line order may vary]:
A D
A E
A F
B D
B E
B F
C D
C E
C F
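With several input sources, one job is generated per combination. A small sketch of the mixed form (command-line arguments plus a file), assuming GNU Parallel is installed; def-file is written to a temporary directory purely for illustration:

```shell
# Assumption: GNU Parallel is on PATH. def-file is created here for illustration.
workdir=$(mktemp -d)
printf '%s\n' D E F > "$workdir/def-file"

# ::: takes arguments from the command line, :::: takes them from a file.
# Every value of the first source is combined with every value of the second.
combos=$(parallel -k echo ::: A B C :::: "$workdir/def-file")
combo_count=$(printf '%s\n' "$combos" | wc -l | tr -d ' ')

printf '%s\n' "$combos"
echo "number of jobs: $combo_count"
rm -r "$workdir"
```

Three values in each source give 3 x 3 = 9 jobs.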
Basic elements of syntax [from the tutorial]
Matching input.
Input:
parallel --xapply echo ::: A B C ::: D E F
Output [line order may vary]:
A D
B E
C F
- --xapply (the older name for --link) will wrap, if insufficient input is provided.
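A short sketch of linked input sources, assuming GNU Parallel is installed. It uses --link, which the man page documents as the current name for --xapply; the second call gives the second source one value fewer, so its values are recycled:

```shell
# Assumption: GNU Parallel is on PATH.
# Linked sources: the n-th value of each source is paired, instead of all combinations.
linked=$(parallel -k --link echo ::: A B C ::: D E F)

# The second source is shorter here, so its values are reused to cover A, B and C.
wrapped=$(parallel -k --link echo ::: A B C ::: D E)
wrapped_count=$(printf '%s\n' "$wrapped" | wc -l | tr -d ' ')

printf '%s\n' "$linked"
printf '%s\n' "$wrapped"
```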
Basic elements of syntax [from the tutorial]
Replacement strings: The 7 predefined replacement strings
Input:
parallel echo {} ::: A/B.C
parallel echo {.} ::: A/B.C
Output:
A/B.C
A/B
Rep. string   Result
{}            input line, unchanged
{.}           remove ext.
{/}           remove path
{//}          only path
{/.}          remove ext. and path
{#}           job number
{%}           job slot number
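The path-related replacement strings can be compared side by side on the tutorial's A/B.C input. A minimal sketch, assuming GNU Parallel is installed:

```shell
# Assumption: GNU Parallel is on PATH.
# One job; each replacement string is expanded against the same input, A/B.C.
parts=$(parallel -k echo '{} {.} {/} {//} {/.} {#}' ::: A/B.C)
echo "$parts"   # A/B.C A/B B.C A B 1
```

The fields are, in order: the full input, without extension, without path, path only, without path and extension, and the job number.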
Basic elements of syntax [from the tutorial]
Customizing replacement strings
--extensionreplace to change {.} etc.
Shorthand custom (PCRE+) replacement strings
GNU parallel’s 7 replacement strings:
--rpl '{} '
--rpl '{#} $_=$job->seq()'
--rpl '{%} $_=$job->slot()'
--rpl '{/} s:.*/::'
--rpl '{//} $Global::use{"File::Basename"} ||= eval "use File::Basename; 1;"; $_ = dirname($_);'
--rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
--rpl '{.} s:\.[^/.]+$::'
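A custom replacement string is a tag followed by a Perl expression that edits $_. A minimal sketch, assuming GNU Parallel is installed; the tag name {base} and the file names are hypothetical, and the expression is the same one listed for {/.}:

```shell
# Assumption: GNU Parallel is on PATH. {base} and the paths are illustrative only.
# The expression strips the directory, then the final extension.
names=$(parallel -k --rpl '{base} s:.*/::; s:\.[^/.]+$::' \
  echo '{base}' ::: runs/male_rep1.txt runs/female_rep2.txt)
printf '%s\n' "$names"
```

Naming the tag after what it extracts keeps long command lines readable.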
Basic elements of syntax [from the tutorial]
Multiple input sources and positional replacement:
parallel echo {1} and {2} ::: A B ::: C D
- Always try to define replacements, with {<>} syntax.
- Test with --dry-run first.
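A sketch of what --dry-run shows for the positional example, assuming GNU Parallel is installed. The commands are printed rather than executed, which makes it easy to check the substitutions before running anything:

```shell
# Assumption: GNU Parallel is on PATH.
# --dry-run prints each command that would be run; nothing is executed.
planned=$(parallel -k --dry-run echo {1} and {2} ::: A B ::: C D)
printf '%s\n' "$planned"
# echo A and C
# echo A and D
# echo B and C
# echo B and D
```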
Basic elements of syntax [from the tutorial]
More replacement strings
--plus adds the replacement strings
{+/} {+.} {+..} {+...} {..} {...} {/..} {/...} {##}.
{+foo} matches the opposite of {foo}:
{} =
{+/}/{/} =
{.}.{+.} =
{+/}/{/.}.{+.} =
{..}.{+..} =
{+/}/{/..}.{+..} =
{...}.{+...} =
{+/}/{/...}.{+...}
Basic elements of syntax [from the tutorial]
--plus also adds:
- Since May 2021: {%%regexp} and {##regexp}.
- Since Dec. 2020: {hgrp}, which gives the intersection of the hostgroups of the job and the sshlogin that the job is run on.
- Since May 2020: also activates the replacement strings {slot} = $PARALLEL_JOBSLOT, {sshlogin} = $PARALLEL_SSHLOGIN, {host}.
Performance over time
[Figure: GNU Parallel overhead for different versions; 3000 trials each running 1000 jobs. x-axis: releases 20100424 to 20201022; y-axis: milliseconds/job, ticks from 5 to 12.]
Much more syntax for many other tasks
- --pipe: instead of passing STDIN lines as command arguments, send the data to the STDIN of the command
- Useful for command_A | command_B | command_C, where command_B is slow
- Remote execution to directly parallelize over multiple machines
- Working directly with a SQL database
- Shebang: often cat input_file | parallel command, but can do
  #!/usr/bin/parallel --shebang -r echo
- As a counting semaphore: parallel --semaphore or sem
- Default is one slot: a mutex
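A minimal sketch of --pipe, assuming GNU Parallel is installed. Here wc -l stands in for the slow command_B: the input is split into blocks of 100 records (-N 100), each block is sent to the STDIN of its own wc -l, and the per-block counts are summed afterwards:

```shell
# Assumption: GNU Parallel is on PATH. wc -l is a stand-in for a slow filter.
# Each line of input ends up in exactly one block, so the counts add up to 1000.
total=$(seq 1000 | parallel --pipe -N 100 wc -l | awk '{ sum += $1 } END { print sum }')
echo "lines processed: $total"
```

Summing the partial results makes the outcome independent of how the blocks are scheduled.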
Selected recent features (post-2020)
- --latest-line shows only the latest line of running jobs.
- --color colors output in different colors per job (and additional related features).
- --sshlogin: now quite fully-featured
- --delay 123auto will auto-adjust --delay. If jobs fail due to being spawned too quickly, --delay will exponentially increase.
- --memsuspend
- {= =}: includes yyyy_mm_dd_hh_mm_ss(), yyyy_mm_dd_hh_mm(), etc.
- --filter, e.g., {1} < {2}+1.
- --template <text file>, with replacement strings. Replaces the replacement strings and saves it under a new filename.
Another [man page] example: “Aggregating content of files”
parallel --header : echo x{X}y{Y}z{Z} \> \
  x{X}y{Y}z{Z} \
  ::: X {1..5} ::: Y {01..10} ::: Z {1..5}
parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*
This runs: cat x1y*z1 > x1z1, ∀x∀z
Another [man page] example: directly call SLURM
#!/bin/bash
#SBATCH --time 00:02:00
#SBATCH --ntasks=4
#SBATCH --job-name GnuParallelDemo
#SBATCH --output gnuparallel.out
module purge
module load gnu_parallel
my_parallel="parallel --delay .2 -j $SLURM_NTASKS"
my_srun="srun --export=all --exclusive -n1"
my_srun="$my_srun --cpus-per-task=1 --cpu-bind=cores"
$my_parallel "$my_srun" echo This is job {} ::: {1..20}
Another [man page] example: myprog on FASTA input
cat file.fasta |
  parallel --pipe -N1 --recstart '>' --rrs \
    'read a; echo Name: "$a"; myprog $(tr -d "\n")'
Another [man page] example: fastq-reader on interleaved FASTQ input
parallel --pipe-part -a big.fq --block -1 --regexp \
  --recend '\n' --recstart '@.*(/1| 1:.*)\n[A-Za-z\n\.~]' \
  fastq-reader
Another [man page] example: simple scheduler
true >jobqueue;
while true; do
  tail -n+0 -f jobqueue |
    (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
     perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
     (seq 1000 >> jobqueue &);
     echo Done appending dummy data forcing tail to exit)
  mv j2 jobqueue
done
# Day time
echo 50% > jobfile
cp day_server_list ~/.parallel/sshloginfile
# Night time
echo 100% > jobfile
cp night_server_list ~/.parallel/sshloginfile
Post-meme2images inkscape conversions for publication-ready CentriMo plots and sequence logos
parallel inkscape --vacuum-defs --export-pdf={.}.pdf {} \
  ::: "$centrimo_eps_1" "$centrimo_eps_2"
parallel "inkscape --vacuum-defs --export-pdf={.}.pdf {};
  pdfcrop --hires --clip --margins '0 0 0 -12' {.}.pdf;
  mv -f {.}-crop.pdf {.}.pdf
  " ::: logo+([[:digit:]])$VECTOR_FILE_EXT
Fixing directory structures: symbolic link issues (for data provenance)
parallel --dry-run -j 1 --rpl \
  '{s} s@.*?((?:fe)?male_\d+-\d+).*@$1@' "{s}; \
  ln -s /\$(readlink {}) {}" \
  ::: $(find . -mindepth 3 -maxdepth 3 -xtype l)
parallel --rpl \
  '{s} s:.+?/(.+?)_peaks.narrowPeak.gz$:\1_summits.bed.gz:' \
  "ln -s ../../linked-2015-10-07-.../data/MACS/{s} \
  {//}/" ::: */*_peaks.narrowPeak.gz
parallel -j 1 --rpl '{...} s:/.*::;' \
  "dir=\$(readlink -f {} | \
  sed -r 's:/linked.+?/:/{...}/:'); \
  mkdir \$dir; rm -f {}; ln -s \$dir {//}/M-ChIP_runs" \
  ::: $(find linked-2016-01-31-* -type l -name 'M-ChIP_runs')
Exploring/collating complex CentriMo results
parallel --dry-run -j 1 \
  --rpl '{sex} s:.*?(\w*male)-\d+_\d+.*:\1:' \
  --rpl '{rep} s:.*?male-(\d+_\d+).*:\1:' \
  --rpl "{TFinfo} s:.*?([^/]+)-expandedTo500bpRegions-mod.*:\1:" \
  --rpl '{thresh} s:.*?\d+_\d+-(0.\d+).*:\1:' \
  "awk '\$0 !~ /^#/ {\$1=\"\"; \$2=\"\";
    print \"{TFinfo}\",\"{sex}\",\"{rep}\",\"{thresh}\"\$0;}' {} |
    sed -r 's/[[:space:]]+/\t/g'" \
  ::: $(find ../MEME-ChIP_runs-initial_controls/ \
    -mindepth 5 -wholename \
    '*hypothesis_testing_selected_controlledVars/centrimo_out/centrimo.txt' | head
  )
Processing ChIP-seq peak data with MACS
parallel --rpl '{/..SRF} s:../\w+[-.](SRF\w*).*:$1:i;' \
  macs14 callpeak -t {} -n {/..SRF} -g 'mm' \
  -s 51 --bw 150 -S -p 0.0001 \
  ::: ../*.alignment.mm8.bed.gz
parallel "zcat {} | awk 'BEGIN{FS=OFS=\"\t\"} NR > 1
  {print \$2,\$3,\$4;}' |
  pigz -9 > {/.}.bed.gz" ::: ../*MACS_peaks_annot.txt.gz
liftoverAll '.bed.gz'
Processing ChIP-seq peak data with MACS
function liftoverAll {
  parallel liftOver {} "$LIFTOVER_CHAIN_FILE_FULL_PATH" \
    ../$LIFTED_OVER_DIR_NAME/{/.}.liftedmm9 \
    ../$LIFTED_OVER_DIR_NAME/{/.}.unlifted \
    ::: *"$1"
  pigz -9 ../$LIFTED_OVER_DIR_NAME/*.liftedmm9 \
    ../$LIFTED_OVER_DIR_NAME/*.unlifted
}
parallel -j ${NSLOTS:=1} --xapply \
  --rpl '{r} s:.*RS(\d+).*:\1:' \
  "$MACS_CMD_AND_COMMON_PARAMS -f BAMPE -n 'M-r{1r}' \
  -t {1} -c {2} |& tee -a '$OUT_DIR/M-r{1r}.log'" \
  ::: $IN_DIR/1494*@(1|2|3).bam \
  ::: $IN_DIR/1494*@(4|5|6).bam
Pipeline: processing bisulfite sequencing data with Methpipe
merge_methcount_cmds=$(
  parallel -j $NSLOTS --joblog "x.log" \
    --rpl '{-../} s:.*/::; s:(\.[^.]+)+$::; s:-\d+$::;' \
    --dry-run \
    "echo \"$MODULE_LOAD_CMD export LC_ALL=C;
    cat $ALIGNED_DIR/{-../}*.tomr |
    sort -k 1,1 -k 2,2n -k 3,3n -k 6,6 | … |
    methcounts -v -c $BISMARK_REF
    -o $COUNTS_DIR/{-../}_pool_ALL.meth /dev/stdin\"
    | tee -a /dev/stderr | qsub …" \
    ::: $IN_DIR/*.1.fastq.gz | sort -V | uniq
)
Run BedTools Coverage on many files
parallel -j $OMP_NUM_THREADS \
  --delay 1 --lb --resume --resume-failed --joblog \
  "$BASE_DATA_DIR/$(basename $0 .sh)-${output_job_file_suffix#-}.log" \
  --plus --rpl '{acc} s:.*(SRR\w+).*:$1:' --tag --tagstring '{1acc}' "
  ... [expanded below]
  " ::: $BASE_PEAK_DIR/*/peaks.bed* ::: '-hist' '-d' :::+ 'hist' 'pos'
temp_BED="$(mktemp).bed"
bedtools slop -g $assembly -i "{1}" -b $SLOP_LEN > $temp_BED
cat -- "$file" | bedtools coverage $BAM_IN_PARAM stdin $BED_IN_PARAM \
  $temp_BED {2} -iobuf $READ_BUF_SIZE
...
Run BedTools Coverage on many files
if [[ ! -s $"$output_basename-reverse$output_ext" ]]; then
parallel --env _ -j $(($OMP_NUM_THREADS>1 ? 2 : 1)) --lb -I@@ \
  --rpl '@name@ s:\+:forward:; s:-:reverse:' "
  $MODULE_LOAD_CMD
  strand_spec_output_file=\"$output_basename-@name@$output_ext\"
  sambamba view -t $(($OMP_NUM_THREADS>8 ? 8 : 1)) -f bam -l 0 -F \"strand
  $BAM_input_file $CHR_SUBSET | run_bedtools_cov_cmd 'stdin' > $stra
  if [[ @@ == '-' ]]; then
    sed -i 's/+/-/' $strand_spec_output_file
  fi
  pigz -9 -p $(($OMP_NUM_THREADS>8 ? 4 : 1)) $strand_spec_output_file
  " ::: '+' '-'

More Related Content

Similar to GNU Parallel: Lab meeting—technical talk

Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Eelco Visser
 
Incredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and GeneratorsIncredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and Generatorsdantleech
 
Python Programming Basics for begginners
Python Programming Basics for begginnersPython Programming Basics for begginners
Python Programming Basics for begginnersAbishek Purushothaman
 
Makefile for python projects
Makefile for python projectsMakefile for python projects
Makefile for python projectsMpho Mphego
 
CS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: IntroductionCS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: IntroductionEelco Visser
 
Plugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsPlugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsSadayuki Furuhashi
 
Ansible Configuration Management Tool 소개 및 활용
Ansible Configuration Management Tool 소개 및 활용 Ansible Configuration Management Tool 소개 및 활용
Ansible Configuration Management Tool 소개 및 활용 Steven Shim
 
Simple tools to fight bigger quality battle
Simple tools to fight bigger quality battleSimple tools to fight bigger quality battle
Simple tools to fight bigger quality battleAnand Ramdeo
 
BDD with Behat and Symfony2
BDD with Behat and Symfony2BDD with Behat and Symfony2
BDD with Behat and Symfony2katalisha
 
Working Effectively With Legacy Perl Code
Working Effectively With Legacy Perl CodeWorking Effectively With Legacy Perl Code
Working Effectively With Legacy Perl Codeerikmsp
 
Fabien Potencier "Symfony 4 in action"
Fabien Potencier "Symfony 4 in action"Fabien Potencier "Symfony 4 in action"
Fabien Potencier "Symfony 4 in action"Fwdays
 
Reusando componentes Zope fuera de Zope
Reusando componentes Zope fuera de ZopeReusando componentes Zope fuera de Zope
Reusando componentes Zope fuera de Zopementtes
 
Visual Studio .NET2010
Visual Studio .NET2010Visual Studio .NET2010
Visual Studio .NET2010Satish Verma
 
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years later
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years laterSymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years later
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years laterHaehnchen
 
Aspect-oriented programming in Perl
Aspect-oriented programming in PerlAspect-oriented programming in Perl
Aspect-oriented programming in Perlmegakott
 
Hands-on with the Symfony2 Framework
Hands-on with the Symfony2 FrameworkHands-on with the Symfony2 Framework
Hands-on with the Symfony2 FrameworkRyan Weaver
 
How Symfony changed my life (#SfPot, Paris, 19th November 2015)
How Symfony changed my life (#SfPot, Paris, 19th November 2015)How Symfony changed my life (#SfPot, Paris, 19th November 2015)
How Symfony changed my life (#SfPot, Paris, 19th November 2015)Matthias Noback
 
End-to-end CI/CD deployments of containerized applications using AWS services
End-to-end CI/CD deployments of containerized applications using AWS servicesEnd-to-end CI/CD deployments of containerized applications using AWS services
End-to-end CI/CD deployments of containerized applications using AWS servicesMassimo Ferre'
 

Similar to GNU Parallel: Lab meeting—technical talk (20)

Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?
 
Incredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and GeneratorsIncredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and Generators
 
Python Programming Basics for begginners
Python Programming Basics for begginnersPython Programming Basics for begginners
Python Programming Basics for begginners
 
Makefile for python projects
Makefile for python projectsMakefile for python projects
Makefile for python projects
 
CS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: IntroductionCS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: Introduction
 
Plugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsPlugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGems
 
Ansible Configuration Management Tool 소개 및 활용
Ansible Configuration Management Tool 소개 및 활용 Ansible Configuration Management Tool 소개 및 활용
Ansible Configuration Management Tool 소개 및 활용
 
C# tutorial
C# tutorialC# tutorial
C# tutorial
 
Simple tools to fight bigger quality battle
Simple tools to fight bigger quality battleSimple tools to fight bigger quality battle
Simple tools to fight bigger quality battle
 
BDD with Behat and Symfony2
BDD with Behat and Symfony2BDD with Behat and Symfony2
BDD with Behat and Symfony2
 
Working Effectively With Legacy Perl Code
Working Effectively With Legacy Perl CodeWorking Effectively With Legacy Perl Code
Working Effectively With Legacy Perl Code
 
Python1
Python1Python1
Python1
 
Fabien Potencier "Symfony 4 in action"
Fabien Potencier "Symfony 4 in action"Fabien Potencier "Symfony 4 in action"
Fabien Potencier "Symfony 4 in action"
 
Reusando componentes Zope fuera de Zope
Reusando componentes Zope fuera de ZopeReusando componentes Zope fuera de Zope
Reusando componentes Zope fuera de Zope
 
Visual Studio .NET2010
Visual Studio .NET2010Visual Studio .NET2010
Visual Studio .NET2010
 
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years later
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years laterSymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years later
SymfonyCon Berlin 2016 - Symfony Plugin for PhpStorm - 3 years later
 
Aspect-oriented programming in Perl
Aspect-oriented programming in PerlAspect-oriented programming in Perl
Aspect-oriented programming in Perl
 
Hands-on with the Symfony2 Framework
Hands-on with the Symfony2 FrameworkHands-on with the Symfony2 Framework
Hands-on with the Symfony2 Framework
 
How Symfony changed my life (#SfPot, Paris, 19th November 2015)
How Symfony changed my life (#SfPot, Paris, 19th November 2015)How Symfony changed my life (#SfPot, Paris, 19th November 2015)
How Symfony changed my life (#SfPot, Paris, 19th November 2015)
 
End-to-end CI/CD deployments of containerized applications using AWS services
End-to-end CI/CD deployments of containerized applications using AWS servicesEnd-to-end CI/CD deployments of containerized applications using AWS services
End-to-end CI/CD deployments of containerized applications using AWS services
 

More from Hoffman Lab

Efficient querying of genomic reference databases with gget
GNU Parallel: Lab meeting—technical talk

  • 1. Lab meeting— technical talk Coby Viner Use cases Basic examples Basic syntax Additional syntax Recent features More examples More examples More examples More examples More examples Real examples Lab meeting—technical talk GNU Parallel Coby Viner Hoffman Lab Thursday December 7, 2023
  • 2. Overview
       - Why use GNU Parallel?
       - Basic examples from the tutorial
       - Basic elements of syntax [from the tutorial]
       - Much more syntax for many other tasks
       - Selected recent features
       - More tutorial examples (five sections)
       - Some examples of my GNU parallel usage
  • 9. Why use GNU Parallel?
       A shell tool for executing jobs in parallel using one or more computers.
       - Easily parallelize perfectly parallel tasks
           - For each chromosome…
           - For each sex, for each technical replicate, for each hyper-parameter(s)
       - Job submission scripts within a for loop
       - Improved, cleaner syntax (for the programmer), even in serial
       - Facile interleaving of tasks, in the order one is thinking about them
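The "for each chromosome" and nested "for each sex, for each replicate" cases above can be sketched as follows. This is a minimal illustration, not taken from the slides: the function names and the `echo` placeholder command are assumptions, standing in for a real per-chromosome job.

```shell
# Serial baseline: one chromosome at a time.
serial_demo() {
    for chr in 1 2 X; do
        echo "processing chr${chr}"
    done
}

# The same work with GNU Parallel; -k (--keep-order) prints results in input order.
parallel_demo() {
    parallel -k echo "processing chr{}" ::: 1 2 X
}

# Nested loops become multiple input sources (all combinations are run).
nested_demo() {
    parallel -k echo "{1} replicate {2}" ::: female male ::: 1 2
}
```

Calling `parallel_demo` should print the same three lines as `serial_demo`; without `-k` the order may vary, since jobs finish independently.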
  • 11. A basic [man page] example: “Working as xargs -n1. Argument appending”
          find . -name '*.html' | parallel gzip --best
          find . -type f -print0 | parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
  • 16. Easy installation from source
  • 18. Another basic [man page] example: “Inserting multiple arguments”
        With too many files, mv *.log destdir can fail with:
          bash: /bin/mv: Argument list too long
        Run one mv per file instead:
          ls | grep -E '\.log$' | parallel mv {} destdir
        Or insert as many arguments per mv as fit (-m):
          ls | grep -E '\.log$' | parallel -m mv {} destdir
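The difference between the two commands above can be previewed with `--dry-run`. This is a small sketch with assumed file names (`a.log`, `b.log`, `c.log`) and assumed function names; `-j1` is used only so that `-m` does not split the arguments across several job slots.

```shell
# Default: one mv command per input file.
preview_one_per_file() {
    parallel -k --dry-run mv {} destdir ::: a.log b.log c.log
}

# With -m: as many arguments as fit are placed into a single command.
preview_batched() {
    parallel -j1 -m --dry-run mv {} destdir ::: a.log b.log c.log
}
```

`preview_one_per_file` should list three separate `mv` commands, while `preview_batched` should list a single `mv` with all three files.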
  • 20. Basic elements of syntax [from the tutorial]
        Input:
          parallel echo ::: A B C          # command line
          cat abc-file | parallel echo     # from STDIN
          parallel -a abc-file echo        # from a file
        Output [line order may vary]:
          A
          B
          C
  • 22. Basic elements of syntax [from the tutorial]
        Multiple inputs. Input:
          parallel echo ::: A B C ::: D E F
          cat abc-file | parallel -a - -a def-file echo
          parallel -a abc-file -a def-file echo
          cat abc-file | parallel echo :::: - def-file    # alt. file
          parallel echo ::: A B C :::: def-file           # mix cmd. and file
        Output [line order may vary]:
          A D
          A E
          A F
          B D
          B E
          B F
          C D
          C E
          C F
  • 25. Basic elements of syntax [from the tutorial]
        Matching input. Input:
          parallel --xapply echo ::: A B C ::: D E F
        Output [line order may vary]:
          A D
          B E
          C F
        - --xapply will wrap, if insufficient input is provided.
  • 27. Basic elements of syntax [from the tutorial]
        Replacement strings: the 7 predefined replacement strings
        Input:
          parallel echo {} ::: A/B.C
          parallel echo {.} ::: A/B.C
        Output:
          A/B.C
          A/B
        Rep. string   Result
        {.}           remove ext.
        {/}           remove path
        {//}          only path
        {/.}          remove ext. and path
        {#}           job number
        {%}           job slot number
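For readers used to plain shell, the path-related replacement strings in the table above can be compared with parameter expansion. This is only a rough analogy for the sample input `A/B.C` from the slide; the variable names are assumptions, and edge cases (no extension, no directory part) can differ from what GNU Parallel does.

```shell
path='A/B.C'

no_ext="${path%.*}"        # like {.}  : A/B
base="${path##*/}"         # like {/}  : B.C
dir="${path%/*}"           # like {//} : A
base_no_ext="${base%.*}"   # like {/.} : B

printf '%s\n' "$no_ext" "$base" "$dir" "$base_no_ext"
```

The replacement strings save writing these expansions by hand inside every command line.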
  • 28. Basic elements of syntax [from the tutorial]
        Customizing replacement strings
        - --extensionreplace to change {.} etc.
        - Shorthand custom (PCRE+) replacement strings
        GNU parallel's 7 replacement strings:
          --rpl '{} '
          --rpl '{#} $_=$job->seq()'
          --rpl '{%} $_=$job->slot()'
          --rpl '{/} s:.*/::'
          --rpl '{//} $Global::use{"File::Basename"} ||= eval "use File::Basename; 1;"; $_ = dirname($_);'
          --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
          --rpl '{.} s:\.[^/.]+$::'
  • 31. Basic elements of syntax [from the tutorial]
        Multiple input sources and positional replacement:
          parallel echo {1} and {2} ::: A B ::: C D
        - Always try to define replacements, with {<>} syntax.
        - Test with --dry-run first.
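The advice to test with `--dry-run` can be tried on the positional-replacement example above. A minimal sketch: the function name is an assumption, and `-k` is added only to make the printed order deterministic.

```shell
# Print the commands GNU Parallel would run, without running them.
preview_jobs() {
    parallel -k --dry-run echo {1} and {2} ::: A B ::: C D
}
```

Calling `preview_jobs` should print four commands, starting with `echo A and C`, one per combination of the two input sources. Once the printed commands look right, drop `--dry-run` to run them.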
  • 32. Basic elements of syntax [from the tutorial]
        More replacement strings
        --plus adds the replacement strings {+/} {+.} {+..} {+...} {..} {...} {/..} {/...} {##}.
        {+foo} matches the opposite of {foo}:
          {} = {+/}/{/}
             = {.}.{+.}
             = {+/}/{/.}.{+.}
             = {..}.{+..}
             = {+/}/{/..}.{+..}
             = {...}.{+...}
             = {+/}/{/...}.{+...}
  • 35. Basic elements of syntax [from the tutorial]
        --plus also adds:
        - Since May 2021: {%%regexp} and {##regexp}.
        - Since Dec. 2020: {hgrp}, which gives the intersection of the hostgroups of the job and the sshlogin that the job is run on.
        - Since May 2020: also activates the replacement strings {slot} = $PARALLEL_JOBSLOT, {sshlogin} = $PARALLEL_SSHLOGIN, {host}.
  • 36. Performance over time
        [Figure: "GNU Parallel overhead for different versions", 3000 trials each running 1000 jobs. x-axis: version, 20100424 to 20201022; y-axis: milliseconds/job, ticks from 5 to 12.]
  • 43. Much more syntax for many other tasks
        - --pipe: instead of using STDIN as command arguments, the data is sent to the STDIN of the command
            - command_A | command_B | command_C, where command_B is slow
        - Remote execution to directly parallelize over multiple machines
        - Working directly with a SQL database
        - Shebang: often cat input_file | parallel command, but can do
            #!/usr/bin/parallel --shebang -r echo
        - As a counting semaphore: parallel --semaphore or sem
            - Default is one slot: a mutex
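The `--pipe` mode above can be sketched with a toy input. This is an illustration under assumptions: the function name is hypothetical, `seq 100` stands in for real data, and `wc -l` stands in for the slow `command_B`. Here `-N 25` asks for 25 records (lines) per job.

```shell
# Split STDIN into chunks of 25 lines and send each chunk to the STDIN of wc -l.
count_lines_in_chunks() {
    seq 100 | parallel -k --pipe -N 25 wc -l
}
```

Each job should report the number of lines it received, and the counts add up to the 100 input lines. In a real pipeline the chunks would be processed concurrently, which is what helps when the middle stage is the bottleneck.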
  • 51. Selected recent features (post-2020)
        - --latest-line shows only the latest line of running jobs.
        - --color colors output in different colors per job (and additional related features).
        - --sshlogin: now quite fully-featured.
        - --delay 123auto will auto-adjust --delay. If jobs fail due to being spawned too quickly, --delay will exponentially increase.
        - --memsuspend
        - {= =}: includes yyyy_mm_dd_hh_mm_ss(), yyyy_mm_dd_hh_mm(), etc.
        - --filter, e.g., {1} < {2}+1.
        - --template <text file>: replaces the replacement strings in the text file and saves the result under a new filename.
  • 53. Another [man page] example: “Aggregating content of files”
          parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
            ::: X {1..5} ::: Y {01..10} ::: Z {1..5}
          parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*
        This runs: cat x1y*z1 > x1z1, ∀x∀z
  • 54. Another [man page] example: directly call SLURM
          #!/bin/bash
          #SBATCH --time 00:02:00
          #SBATCH --ntasks=4
          #SBATCH --job-name GnuParallelDemo
          #SBATCH --output gnuparallel.out
          module purge
          module load gnu_parallel
          my_parallel="parallel --delay .2 -j $SLURM_NTASKS"
          my_srun="srun --export=all --exclusive -n1"
          my_srun="$my_srun --cpus-per-task=1 --cpu-bind=cores"
          $my_parallel "$my_srun" echo This is job {} ::: {1..20}
  • 55. Another [man page] example: myprog on FASTA input
          cat file.fasta |
            parallel --pipe -N1 --recstart '>' --rrs \
              'read a; echo Name: "$a"; myprog $(tr -d "\n")'
  • 56. Another [man page] example: fastq-reader on interleaved FASTQ input
          parallel --pipe-part -a big.fq --block -1 --regexp \
            --recend '\n' --recstart '@.*(/1| 1:.*)\n[A-Za-z\n\.~]' \
            fastq-reader
  • 58. Another [man page] example: simple scheduler
          true >jobqueue;
          while true; do
            tail -n+0 -f jobqueue |
              (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
               perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
               (seq 1000 >> jobqueue &);
               echo Done appending dummy data forcing tail to exit)
            mv j2 jobqueue
          done

          # Day time
          echo 50% > jobfile
          cp day_server_list ~/.parallel/sshloginfile
          # Night time
          echo 100% > jobfile
          cp night_server_list ~/.parallel/sshloginfile
  • 60. Post-meme2images inkscape conversions for publication-ready CentriMo plots and sequence logos
          parallel inkscape --vacuum-defs --export-pdf={.}.pdf {} \
            ::: "$centrimo_eps_1" "$centrimo_eps_2"

          parallel "inkscape --vacuum-defs --export-pdf={.}.pdf {};
                    pdfcrop --hires --clip --margins '0 0 0 -12' {.}.pdf;
                    mv -f {.}-crop.pdf {.}.pdf" \
            ::: logo+([:digit:])$VECTOR_FILE_EXT
  • 61. Fixing directory structures—symbolic link issues (for data provenance) parallel --dry-run -j 1 --rpl '{s} s@.*?((?:fe)?male_d+-d+).*@$1@' ”{s}; ln -s /$(readlink {}) {}” ::: $(find . -mindepth 3 -maxdepth 3 -xtype l)
  • 62. Fixing directory structures—symbolic link issues (for data provenance) parallel --dry-run -j 1 --rpl '{s} s@.*?((?:fe)?male_d+-d+).*@$1@' ”{s}; ln -s /$(readlink {}) {}” ::: $(find . -mindepth 3 -maxdepth 3 -xtype l) parallel --rpl '{s} s:.+?/(.+?)_peaks.narrowPeak.gz$: 1_summits.bed.gz:' ”ln -s ../../linked-2015-10-07-.../data/MACS/{s} {//}/” ::: */*_peaks.narrowPeak.gz
  • 63. Fixing directory structures: symbolic link issues (for data provenance)
      parallel --dry-run -j 1 --rpl '{s} s@.*?((?:fe)?male_\d+-\d+).*@$1@' "{s}; ln -s /\$(readlink {}) {}" ::: $(find . -mindepth 3 -maxdepth 3 -xtype l)
      parallel --rpl '{s} s:.+?/(.+?)_peaks.narrowPeak.gz$:\1_summits.bed.gz:' "ln -s ../../linked-2015-10-07-.../data/MACS/{s} {//}/" ::: */*_peaks.narrowPeak.gz
      parallel -j 1 --rpl '{...} s:/.*::;' "dir=\$(readlink -f {} | sed -r 's:/linked.+?/:/{...}/:'); mkdir \$dir; rm -f {}; ln -s \$dir {//}/M-ChIP_runs" ::: $(find linked-2016-01-31-* -type l -name 'M-ChIP_runs')
  • 64. Exploring/collating complex CentriMo results
      parallel --dry-run -j 1 \
        --rpl '{sex} s:.*?(\w*male)-\d+_\d+.*:\1:' \
        --rpl '{rep} s:.*?male-(\d+_\d+).*:\1:' \
        --rpl "{TFinfo} s:.*?([^/]+)-expandedTo500bpRegions-mod.*:\1:" \
        --rpl '{thresh} s:.*?\d+_\d+-(0.\d+).*:\1:' \
        "awk '\$0 !~ /^#/ {\$1=\"\"; \$2=\"\"; print \"{TFinfo}\",\"{sex}\",\"{rep}\",\"{thresh}\"\$0;}' {} | sed -r 's/[[:space:]]+/\t/g'" \
        ::: $(find ../MEME-ChIP_runs-initial_controls/ -mindepth 5 -wholename '*hypothesis_testing_selected_controlledVars/centrimo_out/centrimo.txt' | head)
  • 66. Processing ChIP-seq peak data with MACS
      parallel --rpl '{/..SRF} s:../\w+[-.](SRF\w*).*:$1:i;' macs14 callpeak -t {} -n {/..SRF} -g 'mm' -s 51 --bw 150 -S -p 0.0001 ::: ../*.alignment.mm8.bed.gz
      parallel "zcat {} | awk 'BEGIN{FS=OFS=\"\t\"} NR > 1 {print \$2,\$3,\$4;}' | pigz -9 > {/.}.bed.gz" ::: ../*MACS_peaks_annot.txt.gz
      liftoverAll '.bed.gz'
  • 68. Processing ChIP-seq peak data with MACS
      function liftoverAll {
          parallel liftOver {} "$LIFTOVER_CHAIN_FILE_FULL_PATH" ../$LIFTED_OVER_DIR_NAME/{/.}.liftedmm9 ../$LIFTED_OVER_DIR_NAME/{/.}.unlifted ::: *"$1"
          pigz -9 ../$LIFTED_OVER_DIR_NAME/*.liftedmm9 ../$LIFTED_OVER_DIR_NAME/*.unlifted
      }
      parallel -j ${NSLOTS:=1} --xapply --rpl '{r} s:.*RS(\d+).*:\1:' "$MACS_CMD_AND_COMMON_PARAMS -f BAMPE -n 'M-r{1r}' -t {1} -c {2} |& tee -a '$OUT_DIR/M-r{1r}.log'" ::: $IN_DIR/1494*@(1|2|3).bam ::: $IN_DIR/1494*@(4|5|6).bam
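With `--xapply` (listed together with `--link` in the GNU Parallel man page), each job takes one argument from each input source instead of every combination, so the first treatment BAM is paired with the first control BAM, and so on. A plain-bash sketch of that pairing for two equal-length sources; the file names and the function name `pair_inputs` are hypothetical placeholders:

```shell
#!/usr/bin/env bash
# Sketch of the pairing --xapply performs on two equal-length input sources:
# the i-th treatment is matched with the i-th control (no cross product).
# File names are made up for illustration.
treatments=(1494_RS1.bam 1494_RS2.bam 1494_RS3.bam)
controls=(1494_RS4.bam 1494_RS5.bam 1494_RS6.bam)

pair_inputs() {
    local i
    for i in "${!treatments[@]}"; do
        # {1} would be the treatment, {2} the control, in the slide's command.
        printf '%s %s\n' "${treatments[i]}" "${controls[i]}"
    done
}

pair_inputs
```

Three jobs result, one per treatment and control pair, rather than the nine that two unlinked `:::` sources would generate.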
  • 69. Pipeline: processing bisulfite sequencing data with Methpipe
      merge_methcount_cmds=$(
        parallel -j $NSLOTS --joblog "x.log" \
          --rpl '{-../} s:.*/::; s:(\.[^.]+)+$::; s:-\d+$::;' --dry-run \
          "echo \"$MODULE_LOAD_CMD export LC_ALL=C; cat $ALIGNED_DIR/{-../}*.tomr | sort -k 1,1 -k 2,2n -k 3,3n -k 6,6 | ... | methcounts -v -c $BISMARK_REF -o $COUNTS_DIR/{-../}_pool_ALL.meth /dev/stdin\" | tee -a /dev/stderr | qsub ... ::: $IN_DIR/*.1.fastq.gz | sort -V | uniq
      )
  • 71. Run BedTools Coverage on many files
      parallel -j $OMP_NUM_THREADS --delay 1 --lb --resume --resume-failed \
        --joblog "$BASE_DATA_DIR/$(basename $0 .sh)-${output_job_file_suffix#-}.log" \
        --plus --rpl '{acc} s:.*(SRR\w+).*:$1:' --tag --tagstring '{1acc}' \
        " ... [expanded below] " \
        ::: $BASE_PEAK_DIR/*/peaks.bed* ::: '-hist' '-d' :::+ 'hist' 'pos'

      temp_BED="$(mktemp).bed"
      bedtools slop -g $assembly -i "{1}" -b $SLOP_LEN > $temp_BED
      cat -- "$file" | bedtools coverage $BAM_IN_PARAM stdin $BED_IN_PARAM $temp_BED {2} -iobuf $READ_BUF_SIZE ...
  • 72. Run BedTools Coverage on many files
      if [[ ! -s $"$output_basename-reverse$output_ext" ]]; then
        parallel --env _ -j $(($OMP_NUM_THREADS>1 ? 2 : 1)) --lb -I@@ \
          --rpl '@name@ s:\+:forward:; s:-:reverse:' "
            $MODULE_LOAD_CMD
            strand_spec_output_file=\"$output_basename-@name@$output_ext\"
            sambamba view -t $(($OMP_NUM_THREADS>8 ? 8 : 1)) -f bam -l 0 -F \"strand $BAM_input_file $CHR_SUBSET | run_bedtools_cov_cmd 'stdin' > \$stra
            if [[ @@ == '-' ]]; then
              sed -i 's/+/-/' \$strand_spec_output_file
            fi
            pigz -9 -p $(($OMP_NUM_THREADS>8 ? 4 : 1)) \$strand_spec_output_file
          " ::: '+' '-'
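In the BedTools coverage command with `::: '-hist' '-d' :::+ 'hist' 'pos'`, the `:::+` source is linked to the source before it, so each flag keeps its own label while the pair is still combined with every peak file. A plain-bash sketch of the resulting argument combinations; the two file names and the function name `list_jobs` are hypothetical stand-ins for the glob in the talk:

```shell
#!/usr/bin/env bash
# Sketch of the argument combinations produced by
#   ::: files ::: '-hist' '-d' :::+ 'hist' 'pos'
# Every file is crossed with the flags; each flag stays linked to its label.
files=(A/peaks.bed B/peaks.bed)   # made-up stand-ins for $BASE_PEAK_DIR/*/peaks.bed*
flags=(-hist -d)
labels=(hist pos)

list_jobs() {
    local f i
    for f in "${files[@]}"; do
        for i in "${!flags[@]}"; do
            # Columns correspond to {1}, {2}, and {3} in the parallel command.
            printf '%s\t%s\t%s\n' "$f" "${flags[i]}" "${labels[i]}"
        done
    done
}

list_jobs
```

Two files therefore yield four jobs, and combinations such as `-hist` with `pos` never occur.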