Unix primer


Published in: Technology

  1. Unix Primer
     Maxime Augier, 03.04.2004

     1 Before we start...

     You can get information about any external (= shell-independent) command by consulting the UNIX manual pages:
     $> man command
     For more information about the manual system itself, do:
     $> man man
     If you are looking for a keyword blah in a man page, launch the man program, then type /blah<enter>.
     • To find the next occurrence of the word you are looking for, use n.
     • To quit the man page, simply hit q.

     2 Filesystem exploration

     2.1 Filesystem structure & namespace

     The filesystem is an abstraction used by UNIX to access information storage devices (hard drives, floppies, CD-ROMs, network storage). The filesystem is organised in directories that form a tree. Each directory is a node of this tree. Files, and other objects like Unix sockets, FIFOs and devices, are all leaves of this tree.

     Each node in the tree can be uniquely identified by the sequence of parent nodes to traverse before reaching the object. This is called a path. A path is the sequence of the node (directory) names, separated by slashes. (For example: an/example/of/path)

     The root of the filesystem tree is a special directory called / (slash). Thus, paths starting with a / are considered absolute (they start at the root of the filesystem). On the other hand, paths not starting with a / are understood as relative to the current directory.

     Every user gets a personal directory. Traditionally it is located in /home/username. There is also a shortcut for home directories: a tilde (~) designates the home directory of the current user; a tilde followed by a username (~alice) designates the home directory of that user.

  2. Programs are stored in /bin, /sbin, /usr/bin, /usr/sbin, /usr/local/bin and /usr/local/sbin.

     All the system-wide configuration files are located in /etc.

     Temporary files can be created in /tmp. Files you create there do not count against your quota, so /tmp is a convenient place to work. However, the contents of /tmp will eventually be deleted upon reboot.

     Hidden files or directories start with a dot (.).

     2.2 Navigating

     When you work in your shell, you always have a current directory.

     2.2.1 cd - moving around

     To change your current directory, do:
     $> cd /my/new/directory
     To go back to your home directory, do:
     $> cd
     To go back one directory in the directory tree, do:
     $> cd ..

     2.2.2 pwd - knowing where you are

     At any moment, you can use the pwd command (Print Working Directory) to display your current directory:
     $> pwd

     2.2.3 ls - examining directories

     Use the command ls to list the contents of a directory. The syntax is
     $> ls directory
     If directory is omitted, ls assumes it is "." (the current directory). The most common options are:
     -a : to include hidden files
     -l : to get a long listing (including file sizes, dates and permissions)
     -d : to list the directory itself (as it appears in its parent) instead of its contents (useful when you do "ls -ld *", for instance)
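The navigation commands above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original slides: the directory and file names (projects, notes, .hidden, visible) are made up for illustration, and it assumes the mktemp command is available to create the scratch directory.

```shell
# Work in a throwaway directory so nothing outside it is touched.
sandbox=$(mktemp -d)
cd "$sandbox"                    # absolute path: it starts with /

mkdir -p projects/notes          # hypothetical names, for illustration only
touch projects/.hidden projects/visible

cd projects/notes                # relative path: resolved from the current directory
cd ..                            # up one level, into "projects"
here=$(pwd)                      # where are we now?
echo "current directory: $here"

plain=$(ls)                      # hidden files are left out
all=$(ls -a)                     # hidden files, . and .. are included
ls -ld .                         # describes the directory itself, not its contents
```

Comparing the contents of the variables plain and all shows the effect of the -a option on the file whose name starts with a dot.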
  3. 2.3 Organising directories

     2.3.1 mkdir - creating new directories

     $> mkdir directory

     2.3.2 rmdir - deleting empty directories

     $> rmdir directory
     Note: the directory must be empty. To delete a non-empty directory, use rm with the recursive option (-R).

     2.4 Managing files

     Preliminary warning: unlike other systems, Unix is very permissive regarding file names. In fact, you can use almost any character, printable or not, including linefeed, backspace, tab and so on. However, many characters have a special meaning to the shell, leading to unpredictable results. A strangely named file can become undeletable from the shell, mangle the directory listing display, or even be interpreted as an argument to a command, changing the behavior of the command.

     Conclusion: when choosing a name for a file, try to use only alphanumerics and the characters - and ., and do NOT start a file name with a dash (-).

     2.4.1 touch - creating an empty file

     touch won't erase anything even if the file already exists, so it is a safe way of creating an empty file.
     $> touch filename

     2.4.2 cp - copying files

     Two possible usages:
     $> cp source destination
     to copy a source file into a destination file, and
     $> cp source1 source2 source3 destination/
     to copy several files into a destination directory, keeping the original filenames.

     2.4.3 rm - removing files

     To delete file(s):
     $> rm file1 file2 file3 ...
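As a rough illustration of the directory and file commands introduced so far, the sketch below creates, copies and removes a few files inside a scratch directory. The names docs, backup and report.txt are invented for the example, and mktemp is assumed to be available.

```shell
work=$(mktemp -d)
cd "$work"

mkdir docs
echo "first draft" > docs/report.txt
touch docs/report.txt                        # existing file: its contents are kept
touch docs/empty.txt                         # missing file: created empty

cp docs/report.txt docs/copy.txt             # one file into another file
mkdir backup
cp docs/report.txt docs/empty.txt backup/    # several files into a directory

# rmdir refuses to remove a directory that still contains files.
if rmdir backup 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir on a non-empty directory: $rmdir_result"

rm backup/report.txt backup/empty.txt
rmdir backup                                 # now empty, so this succeeds
```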
  4. To delete a file that starts with - (for example -toto):
     $> rm -- -toto
     To remove the entire contents of a directory:
     $> rm -r directory/

     2.4.4 mv - moving and renaming files

     To move a file:
     $> mv file /destination/directory
     To rename a file:
     $> mv oldname newname
     Note: with the option "-i", you are asked for confirmation every time a file is deleted or overwritten. It is a good idea to make this the default by defining appropriate aliases, e.g. with bash:
     $> alias rm="rm -i"
     If you do that, you can temporarily negate the -i flag with a -f (force) flag.

     2.4.5 chown, chgrp, chmod - managing permissions

     You can use these tools to manage access control for your files.
     You can change the owner of one or more files with chown:
     $> chown owner file1 file2 file3...
     You can use chgrp to change the group of the file(s):
     $> chgrp group file1 file2 file3...
     You can use chmod to change the access mode of the file(s):
     $> chmod mode file1 file2 file3...
     mode being of the form <who><action><right>, where:
     who can be one of u(ser), g(roup), o(thers) or a(ll)
     action can be one of + (allow), - (deny), = (allow only this)
     right can be one of r(ead), w(rite) or x(ecute) (and some others, too: look at the man page)

     For a file, the meanings of the rights are straightforward:
     r: you can read the file
     w: you can write to the file (and therefore overwrite or empty it; whether you can delete it depends on the permissions of its directory)
     x: you can execute the file

     For a directory, the meanings are more complex:
     r: you can list the directory contents
  5. w: you can add new files in the directory, and delete existing ones (in practice, that means you can replace all the files you can read, and at least delete all the files you can't).
     x: you can traverse the directory (meaning you can access the files in it)

     Some common modes you can use:
     go=   to deny access to everybody but yourself
     a+x   to make a script executable for everybody
     go=x  to make a secret directory (other people can access the contents only if they know the exact file names)

     2.5 Archiving files

     Some widely used archiving formats (.zip, .rar) include both file collating and compression. Under UNIX, those two tasks are handled by separate utilities.

     2.5.1 tar - packing many files in one

     To pack files into a "tarball" (archive), do:
     $> tar -cf new_archive.tar file1 file2 file3 ...
     To unpack a tarball, do:
     $> tar -xf archive.tar
     Warning: don't use absolute filenames (those starting with a /) when creating a tarball, otherwise tar will attempt to put them in the same absolute location when extracting, which can lead to problems if you take files from one system to another.

     2.5.2 gzip - compressing files

     To compress a file, do:
     $> gzip filename
     This will create a compressed version of the file, named filename.gz.
     To decompress it, do:
     $> gunzip filename.gz
     The recommended approach is to first use tar to collate many files, then use gzip to compress the resulting tarball.

     2.6 Working with text files

     2.6.1 more & less - reading files quickly

     To read a text file, use
  6. $> more filename
     You can advance in the text by hitting space or enter. You can go back by hitting "b". You can search for a keyword, or even a whole regular expression (see below), by hitting / (slash), typing in the keyword or regexp, and hitting enter. After a search, you can hit "n" to go to the next match.

     2.6.2 cat - quickly creating text files

     When you want to quickly create short text files, you don't have to open a heavy editor. Instead, type
     $> cat > filename
     then type the contents of the file. When you're finished, go to a new line and hit Ctrl-D.

     2.7 Finding files

     2.7.1 find

     find looks recursively for files matching some criteria in a given directory. The syntax is
     $> find directory (criteria...)
     To look in the current directory for regular files with names ending in .java, do
     $> find . -type f -name "*.java"
     To get a list of all the files and directories in the current directory (that is, without applying any criterion), do
     $> find .
     Warning: you have to quote the wildcards, otherwise the shell will expand them, which is not what you want (see below about shell wildcards).

     3 Examining files

     3.0.2 Identifying file types: file

     Some common desktop operating systems use file name extensions (.txt, .c, .mp3, etc.) to denote the file type. With UNIX, you don't have to use file extensions to denote a particular file type. When you don't know what a particular file contains, use the file command. It guesses the file type by looking directly at the contents of the file, regardless of its actual name.
     $> file filename
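The tar, gzip and find commands described above can be combined as in the following sketch. It is only an illustration under assumptions: the src directory and its files are invented, mktemp is assumed to be available, and the archive is built from relative names as the tar warning recommends.

```shell
work=$(mktemp -d)
cd "$work"

mkdir src
echo 'class A {}' > src/A.java
echo 'class B {}' > src/B.java
echo 'notes'      > src/README

# Collate with relative names, then compress the resulting tarball.
tar -cf src.tar src
gzip src.tar                         # produces src.tar.gz and removes src.tar

# Unpack somewhere else: decompress first, then extract.
mkdir restore
cp src.tar.gz restore/
cd restore
gunzip src.tar.gz
tar -xf src.tar

# The wildcard is quoted, so find (not the shell) matches the names.
java_files=$(find . -type f -name "*.java" | sort)
echo "$java_files"
```

Because the archive holds relative names, extracting it inside restore recreates src there instead of overwriting the original directory.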
  7. 3.0.3 Counting the number of lines in a text file: wc

     wc with the -l switch counts the number of lines in a file. For example, you can use usnoop | grep Syn | wc -l if you want to know how many lines containing Syn appear in your usnoop trace.
     $> cat textfile | wc -l

     3.0.4 Searching text files for strings: grep

     grep looks for lines in a text stream matching a particular pattern, and prints only the matching (or non-matching) lines. The syntax is
     $> grep pattern file1 file2 file3...
     The pattern can be a simple keyword. For a fixed-string pattern, passing the -F option to grep will make it faster. For more complex patterns, you have to use regular expressions (or RegExpes).

     4 Regular Expressions

     RegExpes are the UNIX way of describing text patterns. They are used in many text processing languages such as sed, awk and perl. RegExpes are just simple text, but some characters acquire a special meaning:

     Character            Meaning
     .  (Dot)             Stands for any character
     ?  (Question mark)   The previous character is optional
     *  (Star)            The previous character repeats zero or more times
     +  (Plus)            The previous character repeats one or more times
     ^  (Caret)           Stands for the beginning of the line
     $  (Dollar)          Stands for the end of the line

     You can also express a choice among many characters by listing them between brackets: [123] means either 1, 2 or 3.
     You can negate the choice by prepending it with a caret: [^abc] means any character but a, b or c.
     You can aggregate contiguous characters with a dash: [a-z] means any lowercase alphabetic character, [0-9] means any digit, and so on.

     Examples:
     ^a*$      Matches the empty line, a, aa, aaa, aaaa, ...
     abc*      Matches anything containing ab, abc, abcc, abccc, abcccc, ...
     ^ab[cd]   Matches anything starting with abc or abd
     ^ab.*cd$  Matches anything starting with ab and ending with cd
     ^ab.c$    Matches abac, abbc, abcc, abdc, abec, ab c, ab1c, ab$c, ...

     For more information, you can refer to the grep manpage (man grep), or better, to the perl RegExp manpage (perldoc perlre).
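The patterns from the examples above can be checked with grep -c, which prints the number of matching lines. This is a small sketch with an invented word list (words.txt); note that with grep the ? and + operators belong to the extended syntax, so the last pattern is run with the -E option.

```shell
work=$(mktemp -d)
cd "$work"
# Eight lines, one of them empty (the "" argument).
printf '%s\n' abc abd abxcd ab aaa "" abcc ab.c > words.txt

total=$(wc -l < words.txt | tr -d ' ')     # number of lines in the file
start=$(grep -c '^ab[cd]' words.txt)       # lines starting with abc or abd
both=$(grep -c '^ab.*cd$' words.txt)       # lines starting with ab and ending with cd
only_a=$(grep -c '^a*$' words.txt)         # only a's, which includes the empty line
regex_dot=$(grep -c 'ab.' words.txt)       # the dot stands for any character
fixed_dot=$(grep -cF 'ab.' words.txt)      # -F: the dot is a literal dot
plus=$(grep -cE '^abc+$' words.txt)        # + needs -E with grep

echo "$total $start $both $only_a $regex_dot $fixed_dot $plus"
# prints: 8 3 1 2 5 1 2
```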
  8. 5 usnoop: selecting which packets to capture

     You can use a pcap filter with usnoop or tcpdump to choose which packets you want to capture. Here are some common filters:

     5.0.5 arp, tcp, icmp and udp keywords

     Use one of these keywords to catch only a particular protocol. For example, if you want to catch only arp requests and answers, use:
     $> usnoop arp

     5.0.6 host, net and port keywords

     host hostname    to catch only traffic going to or coming from this hostname
     net network      to catch only traffic going to or coming from this network
     port portnumber  to catch only traffic going to or coming from this specific port number
     $> usnoop host
     $> usnoop net 128.178
     $> usnoop port 80

     5.0.7 src and dst keywords

     You can specify a transfer direction for host, net and port. If you want to match packets coming from a source port, use src port; if you want to match packets going to a particular destination host, use dst host.
     $> usnoop dst port 80
     $> usnoop src host

     5.0.8 Matching flags in tcp packets

     If you want to check for these common tcp flags, use:
     SYN : 'tcp[13] & 2 != 0'
     FIN : 'tcp[13] & 1 != 0'
     ACK : 'tcp[13] & 16 != 0'
     RST : 'tcp[13] & 4 != 0'
     $> usnoop 'tcp[13] & 2 != 0'
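The flag filters above are bit tests on byte 13 of the TCP header. The arithmetic can be reproduced with ordinary shell arithmetic, as in this sketch; it does not capture any traffic, the has function is a helper invented for the example, and the bit values are the ones listed above.

```shell
# Flag bits in byte 13 of the TCP header, as listed above.
FIN=1; SYN=2; RST=4; ACK=16

flags=$(( SYN | ACK ))          # a SYN+ACK segment: 2 + 16 = 18

has() {                         # has MASK: succeeds if that bit is set in $flags
    [ $(( flags & $1 )) -ne 0 ]
}

has "$SYN" && syn=yes || syn=no
has "$ACK" && ack=yes || ack=no
has "$FIN" && fin=yes || fin=no
has "$RST" && rst=yes || rst=no

echo "flags=$flags SYN=$syn ACK=$ack FIN=$fin RST=$rst"
# prints: flags=18 SYN=yes ACK=yes FIN=no RST=no
```

A filter such as 'tcp[13] & 2 != 0' applies the same test to each captured packet.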
  9. 5.0.9 Using and, or, ( ) and not between keywords

     Just combine the preceding keywords with and, or, ( ) and not to build powerful filters.
     Catch all DNS traffic:
     $> usnoop udp and port 53
     Catch tcp traffic with the SYN and ACK bits set, coming from in3sun1:
     $> usnoop 'tcp[13] & 2 != 0 and tcp[13] & 16 != 0 and src host in3sun1'
     Catch all tcp and udp traffic from but discard tcp port 22 one:
     $> usnoop '((tcp and not port 22) or udp) and host'

     6 Using the shell

     6.1 Wildcards

     Whenever you want to use a program expecting a list of files as arguments, you can use wildcards to have the shell build an appropriate list for you. Shell wildcards work somewhat like regexpes, but are much simpler.
     Warning: regexpes and shell wildcards are not compatible. A valid wildcard is often not a valid regexp, and conversely.

     The two most common wildcards are the star (*), which stands for zero, one or more characters of any sort, and the question mark (?), which stands for exactly one character, but any character. You can also specify a choice of possible characters by listing them between square brackets ([123] stands for either 1, 2, or 3).

     Whenever you use an argument containing wildcards, the shell will automatically replace it by a list of file names in the current directory matching the pattern. For instance, to delete all the java source files in a directory, do
     $> rm *.java
     Note that the shell only looks in the current directory. If you want a list of files found through an entire directory tree, you have to use the find command in conjunction with xargs or a backtick operator (see below).

     6.2 I/O Redirection

     In Unix, every running program can read data from a special stream called "Standard Input" (or STDIN), and write data to two other special streams (Standard Output and Standard Error, respectively STDOUT and STDERR).
     By default, when you run a command, STDIN is associated with your keyboard, while STDOUT and STDERR are associated with your display. That means the program reads what you type and displays both its results and its error messages on your screen.
     You can redirect STDOUT to a file, so that the command writes its results to a file rather than to your screen. To do so, use the notation >file.
  10. $> ls -d * >catalog
      will create a list of the files in the current directory and write it in a file named "catalog".
      Also, you can redirect STDIN so that the program takes its input from a file rather than from the keyboard, with the notation <file.

      6.3 Pipes and filters

      Sometimes, you want to chain the effects of two commands. You can use the pipe symbol "|" to connect the STDOUT of a command to the STDIN of a second one. UNIX has a lot of small "filter" programs that you can use to process the results from a command.

      6.3.1 more

      When used without filenames, more will act as a pager for its STDIN. You can then use it to add paging to the output of any command:
      $> usnoop -i /tmp/file | more

      6.3.2 tee

      The tee command gets data from its STDIN and puts it both in STDOUT and in a file. You can use it to both view and save the results in a file. For example,
      $> usnoop -i /tmp/file | tee results | more
      will both page the output and save it in the results file.

      6.3.3 sort

      The sort command can sort output lines in alphabetic or numeric order (use option -n for a numeric sort). You can reverse the sort with the -r option.
      To list all the files in a directory by ascending number of words, you can do:
      $> wc -w * | sort -n
      To create a sorted version of a file, you can do
      $> sort <original >sorted

      6.3.4 sed, awk, perl

      sed (Stream EDitor) is a powerful program that can be used to perform various transformations on a text stream. It is line-oriented, and also relies on regular expressions for its pattern matching. It is too complex to be explained here, but plenty of information is available in the man page.
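Redirections, pipes, tee and sort from the sections above can be exercised together, as in this sketch. The file names (fruits.txt, numbers.txt, seen.txt and so on) are made up for the example, and mktemp is assumed to be available.

```shell
work=$(mktemp -d)
cd "$work"
printf '%s\n' pear apple fig > fruits.txt
printf '%s\n' 10 9 100 > numbers.txt

ls > catalog                            # STDOUT goes to the file "catalog"
sort < fruits.txt > sorted.txt          # STDIN from one file, STDOUT to another

# tee keeps a copy in seen.txt and passes the same data on to wc.
count=$(sort fruits.txt | tee seen.txt | wc -l | tr -d ' ')

alpha=$(sort numbers.txt | head -n 1)        # alphabetic order puts "10" first
numeric=$(sort -n numbers.txt | head -n 1)   # numeric order puts 9 first
largest=$(sort -rn numbers.txt | head -n 1)  # reversed numeric order puts 100 first

echo "$count $alpha $numeric $largest"
# prints: 3 10 9 100
```

The difference between the alpha and numeric results is the reason the -n option exists: without it, sort compares the lines character by character.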
  11. awk is another text-processing program, oriented towards lines and fields. It can be used to process text file databases. You can also look up more information in the man page.
      Those two programs are practical for small processing tasks, but their performance and flexibility are limited. Much more complex tasks can be achieved with the Perl language, which is becoming widely used among unix distributions. Perl programming is beyond the scope of this document.

      6.4 Turning a stream into arguments

      6.4.1 The backtick operator ``

      The backticks take the output of a command and expand it into an argument list. For instance, to recursively grep all java files in a directory for a specific keyword, you can use:
      $> find . -type f -name "*.java"
      to get a list of all the java files in a directory. Then, to turn it into an argument list for grep, do
      $> grep keyword `find . -type f -name "*.java"`

      6.4.2 xargs

      xargs reads lines from its STDIN and turns them into arguments for a command. It works pretty much like the backtick, except it is more flexible. For the example above, an equivalent would be
      $> find . -type f -name "*.java" | xargs grep keyword

      6.5 Job control

      6.5.1 ; - synchronous execution

      If you want to execute several commands sequentially, you can separate them with semicolons (;):
      $> ls /bin; ls /sbin
      Each separate command will start when the previous one is completed, regardless of its exit status.

      6.5.2 & and wait - asynchronous execution

      You can execute several commands in parallel; each command starts immediately. This will print while simultaneously creating a compressed version:
      $> lp & gzip
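The backtick and xargs forms from section 6.4 can be compared on a small, invented directory tree, as in this sketch. The app directory and its files are hypothetical, mktemp is assumed to be available, and the -l option of grep (print only the names of matching files) is used so that both results are easy to compare. Both forms split on whitespace, so they assume file names without spaces.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p app/util
echo 'int keyword = 1;' > app/Main.java
echo 'int other = 2;'   > app/util/Helper.java
echo 'keyword in notes' > app/notes.txt

# Backticks: the output of find becomes the argument list of grep.
with_backticks=$(grep -l keyword `find . -type f -name "*.java"`)

# xargs: the same list arrives on STDIN and is turned into arguments.
with_xargs=$(find . -type f -name "*.java" | xargs grep -l keyword)

echo "$with_backticks"
echo "$with_xargs"
# both print: ./app/Main.java
```

notes.txt contains the keyword too, but it is never searched because find only hands the .java files to grep.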
  12. If you do not specify a command after the &, you return to the shell. Therefore, you can launch a command in the background by appending a & to it. However, the command will still use your terminal as standard input and output, so you have to use proper redirection when launching commands in the background. This runs a command in the background, saving its output in a file 'results' and discarding the errors:
      $> command >results 2>/dev/null &
      If you need a command sequence to stop and wait for a background program to complete, use the wait command.

      6.5.3 Conditional synchronous execution

      Occasionally, you want to execute a command only if the previous one succeeded (or not). For this, you can use the && and || operators.
      && acts the same as ; except that it will not continue if the first command fails.
      || will execute the second command only if the previous one failed.
      $> cp a b && rm a
      will copy file a to file b, then erase a only if the copy succeeded.
      $> cp a b ; rm a
      will erase a even if the copy failed.
      $> mv a b || cp a b
      will rename a to b, or fall back to making a copy if the rename failed (for instance, if a was write-protected).

      6.5.4 ps - showing running processes

      You can use the ps command to show a list of the processes running on your machine, along with their IDs.
      With no argument, ps will show the processes owned by you that were started from the same terminal.
      The '-A' option lists all processes running on the machine. The '-l' option shows detailed information for each process.

      6.5.5 kill - sending signals

      UNIX processes can also communicate with signals. Signals can be used to interrupt, restart or destroy processes. The syntax is
      $> kill -signal PID
      to kill a process given its PID, or
      $> kill -signal %job
  13. to kill a process by job number (see below). If you do not specify a signal, the TERM signal is assumed.
      The most common signals are:
      INT   interruption (program should interrupt what it's doing)
      STOP  suspends the execution of the current process (the execution can be resumed later)
      CONT  resumes the execution of a suspended process (e.g. one suspended with the STOP signal)
      TERM  termination (program should clean up and exit immediately)
      KILL  kills the process (should only be used as a last resort)
      You can get a list of available signals by typing
      $> kill -l
      Examples:
      $> kill -TERM 2245
      $> kill -TERM %2

      6.5.6 Shortcuts to send signals

      You can send a SIGINT to the current program running in a terminal by hitting Ctrl-C.
      You can send a SIGTSTP (the terminal's variant of STOP: the program is suspended and can continue later) by hitting Ctrl-Z.

      6.5.7 jobs, fg and bg - controlling job execution

      Each program either stopped or running in the background becomes a "job" of the shell. You can display current jobs and their numbers with the command jobs.
      You can bring back a job that you stopped with Ctrl-Z, or that was running in the background, with the command
      $> fg <job number>
      You can run a stopped job in the background with
      $> bg <job number>
      If you don't give a job number, fg and bg will assume the most recent job.
      Finally, you can refer to a job where kill expects a PID with the construct %<job number>. For instance, to send the HUP signal to job number 5, do
      $> kill -HUP %5
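The conditional operators, background execution, wait and kill described in section 6.5 can be combined as in the sketch below. It makes a few assumptions: the file names a, b, c and log are invented, sleep 30 stands in for any long-running command, and the special variable $! (the PID of the most recent background command, not mentioned in the slides) is used because job numbers such as %1 are mostly useful in an interactive shell.

```shell
work=$(mktemp -d)
cd "$work"
echo data > a

cp a b && rm a                       # a is removed only because the copy succeeded
cp missing c 2>/dev/null || echo "copy failed" > log   # runs only because cp failed

sleep 30 >/dev/null 2>&1 &           # a long command, started in the background
pid=$!                               # PID of the most recent background command
kill -TERM "$pid"                    # ask it to terminate

status=0
wait "$pid" || status=$?             # wait returns the exit status of that process
echo "background command ended with status $status"
```

A status greater than 128 reported by wait indicates that the command was ended by a signal rather than exiting on its own.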
  14. 7 Printing

      7.0.8 PostScript files

      Beyond printing simple text files, most UNIX printers support the PostScript language to print more complex documents. You can send either text or PostScript files to the printer with lp.

      7.0.9 lp - printing .ps and text files

      $> lp -c
      The -c option copies files to the spooler before printing. It can be needed in IN1 and IN3 because of configuration glitches.

      7.0.10 a2ps - printing (almost) any file type

      a2ps is a filtering program that produces PostScript files out of lots of different file types. You can send the files to the printer using lp:
      $> a2ps -o
      $> lp && rm

      8 Acknowledgements

      Thanks to the people who contributed to this document:
      • Matthias Grossglauser
      • Olivier Hochreutiner
      • Sebastien Mathieu