Productivity

Joachim Jacob
8 and 15 November 2013
Multiple commands
In bash, commands put on one line can be
separated by “;”
$ wget http://homepage.tudelft.nl/19j49/t-SNE_files/tSNE_linux.tar.gz ; tar xvfz tSNE_linux.tar.gz
Multiple commands
Commands in a one-liner can also be separated by
&& or ||
&& Only execute the command if the preceding one
finished correctly.
$ curl corz.org/ip && echo 'n'

|| (not a pipe!) - Inverse of the above. Only execute
the command if the preceding one did not finish
successfully.
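The three separators can be compared side by side. This is a minimal sketch that uses the standard commands true (always succeeds) and false (always fails) as stand-ins for real commands; the variable names are made up for the illustration.

```shell
# ';' runs the next command no matter how the previous one ended
semicolon=$(false ; echo "ran")

# '&&' runs the next command only if the previous one succeeded
and_result=$(true && echo "ran")

# '||' runs the next command only if the previous one failed
or_result=$(false || echo "ran")

# here the previous command succeeded, so '||' skips the echo
skipped=$(true || echo "ran")

echo "semicolon=$semicolon and=$and_result or=$or_result skipped='$skipped'"
```

A common use is chaining a download and an unpack step with && so that tar only runs when the download worked.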
Piping a list of files with xargs
A pipe passes the output of one command to the input of the next.
$ ls | less

Some commands require the file name to be
passed as an argument, instead of the content of the file. E.g. this
doesn't work:
$ ls | file
Usage: file [-bchikLlNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
            [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
       file -C [-m magicfiles]
       file [--help]
Piping a list of files with xargs
Some commands require the file name to be
passed as an argument, instead of the content of the file.
xargs passes the output of a command as a list of
arguments to another program.
$ ls | xargs file
bin:                      directory
buddy.sh:                 Bourne-Again shell script, ASCII text executable
Compression_exercise:     directory
Desktop:                  directory
Documents:                directory
Downloads:                directory
FastQValidator.0.1.1.tgz: gzip compressed data, from Unix, last modified: Fri Oct 19 16:44:23 2012
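The same pattern works with any command that expects file names as arguments. The sketch below is an illustration only: it creates two small files (a.txt and b.txt, names invented here) in a temporary directory and lets xargs hand their names to wc -l.

```shell
# scratch directory with two small files
workdir=$(mktemp -d)
printf 'one\n' > "$workdir/a.txt"
printf 'two\nthree\n' > "$workdir/b.txt"

# ls prints one name per line; xargs turns those lines into arguments,
# so this ends up running: wc -l a.txt b.txt
line_counts=$(cd "$workdir" && ls | xargs wc -l)
echo "$line_counts"
```

Note that plain xargs splits on whitespace, so file names containing spaces need extra care.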
.bashrc
~/.bashrc is a hidden configuration file for bash in
your home directory.
It configures the prompt in your terminal.
It contains aliases to commands.
alias example
When you enter a command line, bash first checks
whether the first word is an alias. If it is, the word
is replaced by the text of the alias before the command is run.
You can specify aliases in .bashrc. An example:
Alias example
Some interesting aliases
alias ll='ls -lh'
alias dirsize="du -sh */"
alias uncom='grep -v -E "^#|^$"'
alias hosts="cat /etc/hosts"
# single quotes keep bash from expanding $0 when the alias is defined
alias dedup='awk '\''! x[$0]++'\'


Aliases are perfectly suited for storing one-liners: find
some at
https://wikis.utexas.edu/display/bioiteam/Scott%27s+list+of+linux+one-liners
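To see what a stored one-liner does before putting it in an alias, run it on a small input. The sketch below runs the awk command behind the dedup alias on three made-up lines: awk prints a line only the first time it sees it, so the input order is kept.

```shell
# x[$0]++ counts how often each whole line ($0) has been seen;
# '!' makes the test true only on the first occurrence
deduped=$(printf 'chr1\nchr2\nchr1\n' | awk '! x[$0]++')
echo "$deduped"
```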
Alias exercise
→ exercise link
Finding stuff: locate
Extremely quick and convenient:
locate
However, it won't find the newest files you created.
First you need to update the database by running (usually as root):
updatedb
It accepts wildcards. Quote them so bash does not expand them first. Example:
$ locate '*.sam'
Bonus: How to filter on a certain location?
Finding stuff: find
More elaborate tool to find stuff:
$ find . -name alignment.sam
find needs options to tell it what to look for:
-name : to search on the name of the file
-type : to search for the type: (f)ile, (d)irectory, (l)ink
-perm : to search for the permissions (octal such as 755, or symbolic such as u=rwx)
…
This is the power tool to find stuff.
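The options can be combined. The sketch below is an illustration under assumptions: it builds a small directory tree in a temporary location (the names results and notes.txt are invented) and searches it with -type and -name.

```shell
# scratch tree: two alignment.sam files at different depths, plus one other file
workdir=$(mktemp -d)
mkdir "$workdir/results"
touch "$workdir/alignment.sam" "$workdir/results/alignment.sam" "$workdir/notes.txt"

# regular files named alignment.sam, anywhere below workdir
sam_count=$(find "$workdir" -type f -name 'alignment.sam' | wc -l)

# directories named results
result_dirs=$(find "$workdir" -type d -name 'results')

echo "found $sam_count sam files; directory: $result_dirs"
```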
Finding stuff: find
The most powerful option of find:
-exec Execute a command on the found entities.
$ find . -name '*.gz'
./DRR000542_2.fastq.subset.gz
./DRR000542_1.fastq.subset.gz
./DRR000545_2.fastq.subset.gz
./DRR000545_1.fastq.subset.gz
$ find . -name '*.gz' -exec gunzip {} \;
$ ls
DRR000542_1.fastq.subset DRR000545_1.fastq.subset
DRR000542_2.fastq.subset DRR000545_2.fastq.subset
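The -exec syntax has two parts that are easy to get wrong: {} is replaced by each found file, and the command must be closed with an escaped semicolon (\;) so that bash passes it on to find. A minimal sketch on throwaway files (the reads_*.txt names are invented for the example):

```shell
# scratch directory with two gzip-compressed files
workdir=$(mktemp -d)
printf 'ACGT\n' > "$workdir/reads_1.txt"
printf 'TTGA\n' > "$workdir/reads_2.txt"
gzip "$workdir"/reads_*.txt

# run gunzip once per found file; the quotes stop bash from expanding *.gz itself
find "$workdir" -name '*.gz' -exec gunzip {} \;

# no .gz files should be left
remaining=$(find "$workdir" -name '*.gz' | wc -l)
echo "compressed files left: $remaining"
```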
Command substitution in bash
In bash, the output of commands can be directly
stored in a variable. Put the command between
back-ticks.
$ test=`ls -l`
$ echo $test
total 7929624 -rw-rw-r-- 1 joachim joachim 15326 May 10
2013 0538c2b.jpg -rw-rw-r-- 1 joachim joachim 4914797 Nov
8 16:15 18d7alY
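Bash also accepts $( ) instead of back-ticks, which is easier to read and to nest. The sketch below uses made-up variable names and also shows why the listing above comes out on a single line: without double quotes around the variable, bash splits the value into words and echo joins them with spaces.

```shell
# $( ) is equivalent to back-ticks
listing=$(printf 'file_a\nfile_b\n')

# unquoted: newlines are lost, everything ends up on one line
one_line=$(echo $listing)

# quoted: the line breaks are kept
n_lines=$(echo "$listing" | wc -l)

echo "unquoted: $one_line (quoted version has $n_lines lines)"
```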
Command substitution in bash
A variable can also contain a list. A list contains
several entities (e.g. files).
Extracting the first 1,000,000 lines from compressed text files:
for file in `ls DRR00054*tar.gz`;
do zcat "$file" | head -n 1000000 > "${file%.gz}.subset"; done




The output of ls is put in a list. 'for' assigns the names of the files,
one after the other, to the variable file. This variable is used in the
one-liner zcat | head. ${file%.gz} is the file name with the trailing .gz removed.
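A variation on this loop, as a sketch under assumptions: it lets bash expand the wildcard directly instead of going through ls, which also copes with spaces in file names, and it uses gzip -dc, which does the same job as zcat. The sample_*.txt.gz files and the 10-line cut-off are invented so the example runs on its own.

```shell
# scratch directory with two compressed files of 50 lines each
workdir=$(mktemp -d)
for i in 1 2; do
    seq 1 50 | gzip > "$workdir/sample_$i.txt.gz"
done

# the wildcard expands to the list of matching files
for file in "$workdir"/sample_*.txt.gz; do
    # ${file%.gz} strips the trailing .gz from the name
    gzip -dc "$file" | head -n 10 > "${file%.gz}.subset"
done

subset_lines=$(wc -l < "$workdir/sample_1.txt.subset")
echo "lines in subset: $subset_lines"
```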
Keywords
.bashrc
;
alias
prompt
locate
find
Command substitution

Write in your own words what the terms mean
Break

Productivity tips - Introduction to linux for bioinformatics
