1 Before we start...
You can get information about any external (= shell-independent) command by consulting
the UNIX Manual Pages:
$> man command
For more information about the manual system itself, do:
$> man man
If you are looking for a keyword blah in a man page, after launching the man program,
type /blah and hit enter.
- to find the next occurrence of the word you are looking for, use n.
- to quit the man page, simply hit q.
2 Filesystem exploration
2.1 Filesystem structure & namespace
The filesystem is an abstraction used by UNIX to access information storage devices
(hard drives, floppies, cd-roms, network storage). The filesystem is organised in
directories that form a tree. Each directory is a node of this tree. Files, and other objects
like Unix sockets, FIFOs, and devices, are all leaves of this tree.
Each node in the tree can be uniquely identified by the sequence of parent nodes to
traverse before reaching the object. This is called a path. A path is the sequence of the
names of the nodes (directories), separated by slashes. (For example: an/example/of/path)
The root of the ﬁlesystem tree is a special directory called / (slash). Thus, paths starting
with a / are considered absolute (they start at the root of the ﬁlesystem). On the other
hand, paths not starting with a / are understood as relative to the current directory.
Every user gets a personal directory. Traditionally it is located in /home/username.
There is also a shortcut for home directories: a tilde (~) designates the home
directory of the current user; a tilde followed by a username (~alice) designates the home
directory of this user.
Programs are stored in /bin, /sbin, /usr/bin, /usr/sbin, /usr/local/bin and /usr/local/sbin.
All the system-wide configuration files are located in /etc.
Temporary files can be created in /tmp. Files you create there do not count against
your quota, so it is a convenient place to work. However, the contents of /tmp are
usually deleted at the next reboot.
Hidden files or directories start with a dot (.).
When you work in your shell, you always have a current directory.
2.2.1 cd — moving around
To change your current directory, do:
$> cd /my/new/directory
To go back to your home directory, do:
$> cd
To go back one directory in the directory tree, do:
$> cd ..
2.2.2 pwd — knowing where you are
At any moment, you can use the pwd command (Print Working Directory) to display
your current directory.
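As a small illustration of absolute and relative paths with cd and pwd, here is a sketch that works in a throwaway directory. The directory names a and b and the variables demo and here are made up for the example.

```shell
# Work inside a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"

cd "$demo/a/b"        # absolute path: it starts with /
cd ..                 # relative path: up one level, now in .../a
here=$(pwd)
basename "$here"      # prints: a

cd b                  # relative path: back down into b
basename "$(pwd)"     # prints: b
```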
2.2.3 ls — examining directories
Use the command ls to list the contents of a directory. The syntax is
$> ls directory
If directory is omitted, ls assumes it is "." (the current directory).
Most common options are:
-a : to include hidden ﬁles
-l : to get a long listing (including ﬁle sizes, dates and permissions)
-d : to list the directory itself (as in its parent) instead of its contents (useful when you
do "ls -ld *" for instance).
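The options above can be compared side by side. This is only a sketch in a scratch directory; the file names (notes.txt, .hidden, docs) and variable names are invented for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir docs
touch notes.txt .hidden docs/inner.txt

plain=$(ls)            # hidden files are skipped: docs, notes.txt
all=$(ls -a)           # also lists ., .. and .hidden
dironly=$(ls -d docs)  # the directory itself, not its contents
echo "$plain"
echo "$all"
echo "$dironly"        # prints: docs
ls -l notes.txt        # long listing: permissions, owner, size, date
```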
2.3 Organising directories
2.3.1 mkdir - creating new directories
$> mkdir directory
2.3.2 rmdir - deleting empty directories
$> rmdir directory
Note: the directory must be empty. To delete a non-empty directory, use rm with the
recursive option (-R)
2.4 Managing ﬁles
Preliminary warning: Unlike other systems, Unix is very permissive regarding file names.
In fact, you can use almost any printable or non-printable character, including
linefeed, backspace, tab and so on. However, many characters have a special meaning
to the shell, leading to unpredictable results. A strangely named file can become
undeletable from the shell, mangle the directory listing display or even be interpreted as
an argument to a command, changing the behavior of the command.
Conclusion: When choosing a name for a file, try to use only alphanumerics and the
following characters: - . and do NOT start a file name with a dash (-).
2.4.1 touch — creating empty ﬁle
touch won't erase anything even if the file already exists, so it's a safe way of creating
an empty file:
$> touch filename
2.4.2 cp — copying ﬁles
Two possible usages:
$> cp source destination
to copy a source ﬁle into a destination ﬁle, and
$> cp source1 source2 source3 destination/
to copy several ﬁles into a destination directory, keeping the original ﬁlenames.
2.4.3 rm — removing ﬁles
To delete ﬁle(s):
$> rm file1 file2 file3 ...
To delete a file whose name starts with - (for example -toto):
$> rm -- -toto
To remove a directory together with its entire contents:
$> rm -r directory/
2.4.4 mv — moving and renaming ﬁles
To move a ﬁle:
$> mv file /destination/directory
To rename a ﬁle:
$> mv oldname newname
Note: with the option "-i", you are asked for confirmation every time a file is deleted
or overwritten. It is a good idea to make this the default, e.g. by defining appropriate
aliases. For example, with bash:
$> alias rm="rm -i"
If you do that, you can temporarily negate the -i ﬂag with a -f (force) ﬂag.
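The file commands of this section can be tried together in a scratch directory. The sketch below uses invented names (report.txt, copy.txt, backup) and is not meant as a recipe for real data.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir backup
touch report.txt                 # create an empty file
cp report.txt copy.txt           # copy one file to another
cp report.txt copy.txt backup/   # copy several files into a directory
mv copy.txt renamed.txt          # rename a file
touch -- -toto                   # a file whose name starts with a dash
rm -- -toto                      # "--" marks the end of the options
rm -r backup                     # remove the directory and its contents
ls                               # prints: renamed.txt and report.txt
```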
2.4.5 chown, chgrp, chmod — managing permissions
You can use these tools to manage access control for your files. You can change the
owner of one or more files with chown:
$> chown owner file1 file2 file3...
You can use chgrp to change the ﬁle(s) group:
$> chgrp group file1 file2 file3...
You can use chmod to change the access mode of a ﬁle(s):
$> chmod mode file1 file2 file3...
mode being of the form <who><action><right>, where:
who can be one of u(ser), g(roup), o(thers) or a(ll)
action can be one of + (allow), - (deny), = (allow only this)
right can be one of r(ead), w(rite) or x(ecute) (and some others, too; look at the man page)
For a file, the meanings of the rights are straightforward:
r: you can read the ﬁle
w: you can write in the file (and therefore erase its contents)
x: you can execute the ﬁle
For a directory, the meanings are more complex:
r: you can list the directory contents
w: you can add new files in the directory, and delete existing ones (technically, that
means you can modify all the files you can read, and at least delete all the other files)
x: you can traverse the directory (meaning you can access the ﬁles in it)
Some common modes you can use:
go= to deny access to everybody but yourself
a+x to make a script executable for everybody
go=x to make a secret directory (other people can access the
contents only if they know the exact ﬁle names)
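To see how the symbolic modes translate into the permission column of ls -l, here is a short sketch. The names script.sh and secret are placeholders, and the work happens in a scratch directory.

```shell
demo=$(mktemp -d)
cd "$demo"
printf '#!/bin/sh\necho hello\n' > script.sh

chmod u=rw,go= script.sh     # owner may read and write, nobody else anything
ls -l script.sh              # mode column starts with -rw-------
chmod a+x script.sh          # add execute for user, group and others
ls -l script.sh              # mode column now starts with -rwx--x--x

mkdir secret
chmod go=x secret            # others may traverse it but not list it
ls -ld secret                # -d shows the directory itself
```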
2.5 Archiving ﬁles
Some widely used archiving formats (.zip, .rar) include both file collating and
compression. Under UNIX, those two tasks are handled by separate utilities.
2.5.1 tar — packing many ﬁles in one
To pack files into a "tarball" (archive), do:
$> tar -cf new_archive.tar file1 file2 file3 ...
To unpack a tarball, do:
$> tar -xf archive.tar
Warning: don’t use absolute ﬁlenames (those starting with a /) when creating a tarball,
otherwise tar will attempt to put them on the same absolute location when extracting,
which can lead to problems if you take ﬁles from one system to another.
2.5.2 gzip - compressing ﬁles
To compress a ﬁle, do:
$> gzip filename
This will replace the file with a compressed version named filename.gz.
To decompress it, do:
$> gunzip filename.gz
The recommended approach is to ﬁrst use tar to collate many ﬁles, then use gzip to
compress the resulting tarball.
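The tar-then-gzip round trip can be sketched as follows. The directory project and its two files are invented, and everything happens in a scratch directory.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir project
echo "first"  > project/one.txt
echo "second" > project/two.txt

tar -cf project.tar project      # relative names, as advised above
gzip project.tar                 # project.tar becomes project.tar.gz

rm -r project                    # pretend we moved to another machine
gunzip project.tar.gz            # back to project.tar
tar -xf project.tar              # recreates project/ and its files
cat project/one.txt              # prints: first
```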
2.6 Working with text ﬁles
2.6.1 more & less — reading ﬁles quickly
To read a text ﬁle, use
$> more filename
You can advance in the text by hitting space or enter. You can go back by hitting ”b”.
You can search for a keyword, or even a whole regular expression (see below) by hitting
/ (slash), typing in the keyword or regexp, and hitting enter. After a search, you can hit
”n” to go to the next match.
2.6.2 cat — quickly creating text ﬁles
When you want to quickly create short text ﬁles, you don’t have to open a heavy editor.
$> cat > filename
then type the contents of the ﬁle. When you’re ﬁnished, go to a new line and hit Ctrl-D.
2.7 Finding Files
find looks recursively for files matching some criteria in a given directory. The
syntax is
$> find directory (criteria...)
To look in the current directory for regular files with names ending in .java, do
$> find . -type f -name "*.java"
To get a list of all the ﬁles and directories in the current directory (that is, without
applying any criterion), do
$> find .
Warning: You have to quote the wildcards, otherwise the shell will expand them, which
is not what you want (see below about shell wildcards).
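The effect of quoting the wildcard can be observed directly. In this sketch (file names invented), the current directory contains exactly one .java file, so the unquoted pattern is expanded by the shell before find ever sees it.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p src/util
touch Main.java src/App.java src/util/Tool.java src/notes.txt

# Quoted: find receives *.java and applies it in every subdirectory.
found=$(find . -type f -name "*.java")
echo "$found"                 # three files, in no guaranteed order

# Unquoted: the shell first expands *.java to Main.java, so find
# only looks for files called Main.java.
unquoted=$(find . -type f -name *.java)
echo "$unquoted"              # prints: ./Main.java
```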
3 Examining ﬁles
3.0.2 Identifying ﬁle types: ﬁle
Some common desktop operating systems use file name extensions (.txt, .c, .mp3, etc.)
to denote the file type. With UNIX, you don't have to use file extensions to denote a
file's type.
When you don’t know what a particular ﬁle contains, use the ﬁle command. It guesses
the ﬁle type by looking directly at the contents of the ﬁle, regardless of its actual name.
$> file filename
3.0.3 Counting the number of lines in a text ﬁle: wc
wc with the -l switch counts the number of lines in a file. For example,
usnoop | grep Syn | wc -l counts how many lines of the usnoop output contain Syn.
$> cat textfile | wc -l
3.0.4 Searching text ﬁles for strings: grep
Grep looks for lines in a text stream matching a particular pattern, and prints only the
matching (or non-matching) lines. The syntax is
$> grep pattern file1 file2 file3...
The pattern can be a simple keyword. For a fixed-string pattern, passing the -F option to
grep will make it faster. For more complex patterns, you have to use regular expressions.
4 Regular Expressions
RegExpes are the UNIX way of describing text patterns. They are used in many text
processing languages such as sed, awk and perl. RegExpes are just simple text, but
some characters acquire a special meaning:
. (Dot) Stands for any character
? (Question mark) The previous character is optional
* (Star) The previous character repeats zero or more times
+ (Plus) The previous character repeats one or more times
^ (Caret) Stands for the beginning of the line
$ (Dollar) Stands for the end of the line
You can also express a choice among many characters by listing them between
brackets: [123] means either 1, 2 or 3.
You can negate the choice by prepending it with a caret: [^abc] means any character
but a, b or c.
You can aggregate contiguous characters with a dash: [a-z] means any lowercase
alphabetic character, [0-9] means any digit, and so on.
^a*$ Matches the empty line, a, aa, aaa, aaaa, ...
abc* Matches anything containing ab, abc, abcc, abccc, abcccc, ...
^ab[cd] Matches anything starting with abc or abd
^ab.*cd$ Matches anything starting with ab and ending with cd
^ab.c$ Matches abac, abbc, abcc, abdc, abec, ab c, ab1c, ab$c, ...
For more information, you can refer to the grep manpage (man grep), or better, to the
perl RegExp manpage (perldoc perlre).
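A few of the patterns above can be checked with grep on a small word list. The file words.txt and its contents are made up for this sketch. Note that with plain grep the ? and + operators usually require the -E option; the patterns used here do not need it.

```shell
demo=$(mktemp -d)
cd "$demo"
printf '%s\n' abc abd abe abxxcd ab1c xabc > words.txt

grep '^ab[cd]' words.txt      # prints: abc and abd
grep '^ab.*cd$' words.txt     # prints: abxxcd
grep '^ab.c$' words.txt       # prints: ab1c
grep -F 'b1' words.txt        # fixed string, no regexp: prints ab1c
```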
5 usnoop : selecting which packets to be captured
You can use a pcap filter with usnoop or tcpdump to choose which packets you want to
capture. Here are some common filters:
5.0.5 arp, tcp, icmp and udp keywords
Use one of these keywords to catch only a particular protocol. For example, if you want
to catch only ARP requests and answers, use:
$> usnoop arp
5.0.6 host, net and port keywords
host hostname to catch only traffic going to or coming from this hostname.
net network to catch only traffic going to or coming from this network.
port portnumber to catch only traffic going to or coming from this specific port number.
$> usnoop host www.epfl.ch
$> usnoop net 128.178
$> usnoop port 80
5.0.7 src and dst keywords
You can specify a transfer direction for host, net and port. If you want to match packets
coming from a source port, use src port; if you want to match packets going to a
particular destination host, use dst host.
$> usnoop dst port 80
$> usnoop src host 220.127.116.11
5.0.8 matching ﬂags in tcp packets
If you want to check for these common TCP flags, use:
SYN : 'tcp[13] & 2 != 0'
FIN : 'tcp[13] & 1 != 0'
ACK : 'tcp[13] & 16 != 0'
RST : 'tcp[13] & 4 != 0'
$> usnoop 'tcp[13] & 2 != 0'
5.0.9 using and, or, ( ) and not between keywords
Just combine the preceding keywords with and, or, ( ) and not to build powerful filters:
Catch all DNS traffic:
$> usnoop udp and port 53
Catch TCP traffic with the SYN and ACK bits set coming from in3sun1:
$> usnoop 'tcp[13] & 2 != 0 and tcp[13] & 16 != 0 and src host in3sun1'
Catch all TCP and UDP traffic to or from www.epfl.ch, but discard TCP traffic on port 22:
$> usnoop '((tcp and not port 22) or udp) and host www.epfl.ch'
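The numeric masks in the flag filters (1, 2, 4 and 16) each select one bit of the TCP flags byte. The arithmetic behind the test "& mask != 0" can be reproduced with shell arithmetic; the variable names below are chosen for the example and no packet capture is involved.

```shell
# TCP flag bits, with the same values as the masks used in the filters.
FIN=1; SYN=2; RST=4; ACK=16

flags=$(( SYN | ACK ))        # a SYN+ACK segment carries both bits
echo "$flags"                 # prints: 18
[ $(( flags & SYN )) -ne 0 ] && echo "SYN set"
[ $(( flags & ACK )) -ne 0 ] && echo "ACK set"
[ $(( flags & RST )) -ne 0 ] || echo "RST clear"
[ $(( flags & FIN )) -ne 0 ] || echo "FIN clear"
```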
6 Using the shell
Whenever you want to use a program expecting a list of ﬁles as arguments, you can
use wildcards to have the shell build an appropriate list for you. Shell wildcards work
somewhat like regexpes, but are much simpler.
Warning: regexpes and shell wildcards are not compatible. A valid wildcard is often
not a valid regexp, and conversely.
The two most common wildcards are the star (*), which stands for zero, one or more
characters of any sort, and the question mark (?), which stands for exactly one, but any,
character. You can also specify a choice of possible characters by listing them between
square brackets ([123] stands for either 1, 2, or 3).
Whenever you use an argument containing wildcards, the shell will automatically
replace it by a list of file names in the current directory matching the pattern. For instance,
for deleting all the java source files in a directory, do
$> rm *.java
Note that the shell only looks in the current directory. If you want a list of ﬁles found
through an entire directory tree, you have to use the ﬁnd command in conjunction with
xargs or a backtick operator (see below).
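Because the shell does the expansion, echo is a harmless way to preview what a wildcard will turn into before handing it to rm. The file names in this sketch are invented.

```shell
demo=$(mktemp -d)
cd "$demo"
touch A.java B.java notes1.txt notes2.txt notes9.txt

echo *.java            # prints: A.java B.java
echo notes?.txt        # prints: notes1.txt notes2.txt notes9.txt
echo notes[12].txt     # prints: notes1.txt notes2.txt
rm *.java              # rm receives the two names A.java and B.java
ls                     # only the three notes files remain
```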
6.2 I/O Redirection
In Unix, every running program can read data from a special stream called "Standard
Input" (or STDIN), and write data to two other special streams (Standard Output and
Standard Error, respectively STDOUT and STDERR).
By default, when you run a command, STDIN is associated with your keyboard, while
STDOUT and STDERR are associated with your display. That means the program reads
what you type and prints both its results and its error messages on your screen.
You can redirect STDOUT to a file, so that the command writes its results in a file rather
than on your screen. To do so, use the notation >file.
$> ls -d * >catalog
will create a list of the ﬁles in the current directory and write it in a ﬁle named ”catalog”.
Also, you can redirect STDIN so that the program takes its input from a file rather
than from the keyboard, with the notation <file.
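The three streams can be redirected independently. The sketch below also uses 2>file, the Bourne-shell notation for redirecting STDERR, which is not covered in the text above; the file names are invented.

```shell
demo=$(mktemp -d)
cd "$demo"
touch a.txt b.txt

ls > catalog                 # STDOUT goes to the file, not the screen
wc -l < catalog              # STDIN comes from the file; counts 3 lines,
                             # because catalog already existed when ls ran
# STDERR (descriptor 2) goes to the file "errors"; ls fails here.
ls no-such-file 2> errors || echo "ls failed, message saved in errors"
```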
6.3 Pipes and ﬁlters
Sometimes, you want to chain the effects of two commands.
You can use the pipe symbol ”|” to connect the STDOUT of a command to the STDIN
of a second one.
UNIX has a lot of small "filter" programs that you can use to process the results from
other commands.
When used without ﬁlenames, more will act as a pager for its STDIN. You can then
use it to add paging to the output of any command:
$> usnoop -i /tmp/file | more
The tee command gets data from its STDIN and puts it both in STDOUT and in a file.
You can use it to both view and save the results in a ﬁle. For example,
$> usnoop -i /tmp/file | tee results | more
will both page the output and save it in the results file.
The sort command can sort output lines in alphabetic or numeric order (use option -n
for numeric sort). You can reverse the sort with the -r option. To list all the files in a
directory by ascending number of words, you can do:
$> wc -w * | sort -n
To create a sorted version of a ﬁle, you can do
$> sort <original >sorted
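The difference between alphabetic and numeric sorting, and the way tee taps a pipeline, can be seen on a three-line file. The file names numbers and sorted are placeholders for this sketch.

```shell
demo=$(mktemp -d)
cd "$demo"
printf '%s\n' 10 9 100 > numbers

sort numbers             # alphabetic order: 10, 100, 9
sort -n numbers          # numeric order:    9, 10, 100
# tee saves the reversed numeric order in "sorted" and passes it on.
sort -rn numbers | tee sorted | wc -l
cat sorted               # prints: 100, 10, 9
```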
6.3.4 sed, awk, perl
sed (Stream EDitor) is a powerful program that can be used to perform various transfor-
mations on a text stream. It is line-oriented, and also relies on regular expressions for
its pattern matching. It is too complex to be explained here, but plenty of information
is available in the man page.
awk is another text-processing program, oriented towards lines and fields. It can be used to
process text-file databases. You can also look up more information in the man page.
Those two programs are practical for small processing tasks, but their performance and
flexibility are limited. Much more complex tasks can be achieved with the Perl
language, which is becoming widely used among Unix distributions. Perl programming
is beyond the scope of this document.
6.4 Turning a stream into arguments
6.4.1 The backtick operator `
The backticks take the output of a command and expand it into an argument list.
For instance, to recursively grep all java files in a directory for a specific keyword, you
can first do
$> find . -type f -name "*.java"
to get a list of all the java files in a directory. Then, to turn it into an argument list for
grep, do
$> grep keyword `find . -type f -name "*.java"`
xargs reads lines from its STDIN and turns them into arguments for a command. It
works pretty much like the backtick, except it is more flexible. For the example above,
an equivalent would be
$> find . -type f -name "*.java" | xargs grep keyword
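Both forms can be compared on a tiny source tree. In this sketch the file names and contents are invented, and grep's -l option (print only the names of matching files) is used to keep the output short. Both forms split the list on whitespace, so file names containing spaces would confuse them.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p src
echo "public class A {}" > A.java
echo "// keyword here"   > src/B.java

# Backticks: the file list is pasted in as grep's arguments.
grep -l keyword `find . -type f -name "*.java"`         # prints: ./src/B.java

# xargs: the same list arrives through a pipe instead.
find . -type f -name "*.java" | xargs grep -l keyword   # prints: ./src/B.java
```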
6.5 Job control
6.5.1 ; — synchronous execution
If you want to execute several commands sequentially, you can separate them with
a semicolon (;):
$> ls /bin; ls /sbin
Each separate command will start when the previous one is completed, regardless of
its exit status.
6.5.2 & and wait — asynchronous execution
You can execute several commands in parallel by separating them with &; each command
starts immediately. This will print file.ps while simultaneously creating a compressed
version file.ps.gz:
$> lp file.ps & gzip file.ps
If you do not specify the last command, you return to the shell. Therefore, you can
launch a command in the background by appending a & to it. However, the command
will still use your terminal as standard input and output, so you have to use proper
redirection when launching commands in the background. This runs a command in the
background, saving its output in a file 'results' and discarding the errors (bash syntax):
$> command >results 2>/dev/null &
If you need a command sequence to stop and wait for a background program to complete,
use wait.
6.5.3 Conditional synchronous execution
Occasionally, you want to execute a command only if the previous one succeeded (or
not). For this, you can use the && and || operators.
&& acts the same as ; except that it will not continue if the first command fails.
|| will execute the second command only if the previous one failed.
$> cp a b && rm a
will copy file a to file b, then erase a only if the copy succeeded.
$> cp a b ; rm a
will erase a even if the copy failed.
$> mv a b || cp a b
will rename a to b, or fallback to making a copy if the rename failed (for instance, if a
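The && and || operators can be tried out safely in a scratch directory. In this sketch the file named missing deliberately does not exist, so that cp fails; its error message is discarded with 2>/dev/null.

```shell
demo=$(mktemp -d)
cd "$demo"
echo data > a

cp a b && rm a                 # the copy worked, so a is removed
ls                             # prints: b

cp missing c 2>/dev/null && echo "copied"        # cp fails: nothing printed
cp missing c 2>/dev/null || echo "copy failed"   # prints: copy failed
```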
6.5.4 ps — showing running processes
You can use the ps command to show a list of the processes running on your machine and
their IDs.
With no argument, ps will show the processes owned by you that were run from the
current terminal.
The '-A' option lists all processes running on the machine. The '-l' option shows
detailed information for each process.
6.5.5 kill — sending signals
UNIX processes can also communicate with signals. Signals can be used to interrupt,
restart or destroy processes. The syntax is
$> kill -signal PID
To kill a process given its pid, or
$> kill -signal %job
To kill a process by job number (see below).
If you do not specify a signal, the TERM signal is assumed.
The most common signals are:
INT interruption (program should interrupt what it’s doing)
STOP suspends the execution of the current process (the execution can be resumed later)
CONT resume the execution of a suspended process (i.e. with the STOP signal)
TERM termination (program must clean up garbage and exit immediately)
KILL kills the process (should be only used in last resort.)
You can get a list of available signals by typing
$> kill -l
$> kill -TERM 2245
$> kill -TERM %2
6.5.6 shortcuts to send signals
You can send a SIGINT to the current program running in a terminal by hitting Ctrl-C.
You can suspend it (the program is stopped and can continue later) by hitting
Ctrl-Z, which sends SIGTSTP, the keyboard variant of SIGSTOP.
6.5.7 jobs, fg and bg - controlling job execution
Each program either stopped or running in background becomes a ”Job” of the shell.
You can display current jobs and their numbers with the command jobs.
You can bring back a job that you stopped with Ctrl-Z, or that was running in the
background, with the command
$> fg <job number>
You can run a stopped job in the background with
$> bg <job number>
If you don’t give a job number, fg and bg will assume the most recent job.
Finally, you can refer to a job wherever a PID is expected with the construct
%<job number>. For instance, to send the HUP signal to job number 5, do
$> kill -HUP %5
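In a script, where job numbers are less convenient, the same idea works with PIDs. The sketch below relies on $!, the shell variable holding the PID of the most recent background command (not covered above), and uses sleep as a stand-in for a long-running program.

```shell
sleep 30 &                 # start a background process
pid=$!                     # PID of the most recent background command
kill -TERM "$pid"          # ask it to terminate

status=0
wait "$pid" || status=$?   # wait for it to finish and keep its exit status
echo "exit status: $status"   # non-zero, because a signal ended the process
```

In bash the reported status is 128 plus the signal number, so 143 for TERM.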
7.0.8 PostScript ﬁles
Beyond printing simple text ﬁles, most UNIX printers support the PostScript language
to print more complex documents. You can send either text or PostScript ﬁles to the
printer with lp.
7.0.9 lp — printing .ps and text ﬁles
$> lp -c file.ps
The -c option copies ﬁles to the spooler before printing. It can be needed in IN1 and
IN3 because of conﬁguration glitches.
7.0.10 a2ps — printing any ﬁle types ! (almost)
a2ps is a ﬁltering program that produces PostScript ﬁles out of lots of different ﬁle
types. You can send the ﬁles to the printer using lp:
$> a2ps -o MyClass.ps MyClass.java
$> lp MyClass.ps && rm MyClass.ps
Thanks to the people who contributed to this document:
- Matthias Grossglauser
- Olivier Hochreutiner
- Sebastien Mathieu