240-491 Adv. UNIX: Filters/4


  1. Advanced UNIX: 240-491 Special Topics in Comp. Eng. 1, Semester 2, 2000-2001. 4. Filters (Part II, Sobell) <ul><li>Objectives </li></ul><ul><ul><li>to discuss five useful filters: tr, grep, awk, sed, and find </li></ul></ul>
  2. 1. tr <ul><li>Format: </li></ul><ul><ul><li>tr [options] string1 [string2] </li></ul></ul><ul><li>tr reads its standard input, translates each character in string1 to the corresponding character in string2, and writes the result to standard output. </li></ul>
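  3. Examples <ul><li>$ echo 12abc3def4 | tr 'abcdef' 'xyzabc' </li></ul><ul><ul><li>12xyz3abc4 </li></ul></ul><ul><li>$ echo 12abc3def4 | tr '[a-c][d-f]' '[x-z][a-c]' </li></ul><ul><ul><li>12xyz3abc4 </li></ul></ul><ul><li>$ cat foo.txt | tr '[A-Z]' '[a-z]' </li></ul><ul><ul><li>converts upper case to lower case </li></ul></ul>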
  4. <ul><li>$ tr '\015' ' ' < file1 > file2 </li></ul><ul><ul><li>\015 is carriage return </li></ul></ul><ul><li>$ cat mail.txt | tr -s '\011' ' ' > new-mail.txt </li></ul><ul><ul><li>\011 is tab; a literal tab character inside the quotes also works </li></ul></ul><ul><ul><li>-s squeezes each run of repeated string2 characters in the output to a single character </li></ul></ul><ul><li>$ echo Can you read this? | tr -d 'aeiou' </li></ul><ul><ul><li>Cn y rd ths? </li></ul></ul>
  5. "rot13" Text <ul><li>$ echo Gur chapuyvar bs gur wbxr vf ... | tr '[N-Z][A-M][n-z][a-m]' '[A-M][N-Z][a-m][n-z]' </li></ul><ul><ul><li>The punchline of the joke is ... </li></ul></ul><ul><li>Popular in the 1970s and 1980s. </li></ul>
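The tr examples above can be tried directly at a shell prompt. The following is a minimal sketch, assuming a POSIX shell and a tr that accepts plain ranges such as a-z without brackets (the bracketed form on the slides also works, because the brackets simply translate to themselves). The variable names are illustrative only.

```shell
# Translate: lower case to upper case.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')
# Squeeze: collapse each run of blanks to one blank.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')
# Delete: remove every lower-case vowel.
no_vowels=$(printf 'Can you read this?\n' | tr -d 'aeiou')
# rot13: rotate each letter 13 places; applying it twice restores the text.
rot13=$(printf 'Gur chapuyvar\n' | tr 'A-Za-z' 'N-ZA-Mn-za-m')

printf '%s\n' "$upper" "$squeezed" "$no_vowels" "$rot13"
```

Because rot13 is its own inverse, the same tr command both encodes and decodes.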
  6. 2. grep <ul><li>Format: </li></ul><ul><ul><li>grep [options] pattern [file-list] </li></ul></ul><ul><li>Search one or more files, line by line, for a pattern (a regular expression). Actions taken depend on options. </li></ul>
  7. Variants of grep <ul><li>grep: uses basic RE patterns. </li></ul><ul><li>fgrep: fast grep; the pattern can only be an ordinary string. </li></ul><ul><li>egrep: extended grep; the pattern can use full (extended) REs. </li></ul>
  8. grep options <ul><li>-c print a count of matching lines </li></ul><ul><li>-i ignore case in pattern during search </li></ul><ul><li>-l list filenames with match </li></ul><ul><li>-n precede each matching line by a line number </li></ul><ul><li>-v print lines that do not match pattern </li></ul>
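  9. Examples <ul><li>File testa: </li></ul><ul><ul><li>aaabb </li></ul></ul><ul><ul><li>bbbcc </li></ul></ul><ul><ul><li>ff-ff </li></ul></ul><ul><ul><li>cccdd </li></ul></ul><ul><ul><li>dddaa </li></ul></ul><ul><li>File testb: </li></ul><ul><ul><li>aaaaa </li></ul></ul><ul><ul><li>bbbbb </li></ul></ul><ul><ul><li>ccccc </li></ul></ul><ul><ul><li>ddddd </li></ul></ul><ul><li>File testc: </li></ul><ul><ul><li>AAAAA </li></ul></ul><ul><ul><li>BBBBB </li></ul></ul><ul><ul><li>CCCCC </li></ul></ul><ul><ul><li>DDDDD </li></ul></ul>continued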
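  10. <ul><li>$ grep bb testa </li></ul><ul><ul><li>aaabb </li></ul></ul><ul><ul><li>bbbcc </li></ul></ul><ul><li>$ grep -v bb testa </li></ul><ul><ul><li>ff-ff </li></ul></ul><ul><ul><li>cccdd </li></ul></ul><ul><ul><li>dddaa </li></ul></ul><ul><li>$ grep -n bb testa </li></ul><ul><ul><li>1:aaabb </li></ul></ul><ul><ul><li>2:bbbcc </li></ul></ul>continued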
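  11. <ul><li>$ grep bb * </li></ul><ul><ul><li>testa:aaabb </li></ul></ul><ul><ul><li>testa:bbbcc </li></ul></ul><ul><ul><li>testb:bbbbb </li></ul></ul><ul><li>$ grep -i bb * (and $ grep -i BB * gives the same output) </li></ul><ul><ul><li>testa:aaabb </li></ul></ul><ul><ul><li>testa:bbbcc </li></ul></ul><ul><ul><li>testb:bbbbb </li></ul></ul><ul><ul><li>testc:BBBBB </li></ul></ul>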
  12. Fancier Patterns <ul><li>$ grep 'fun..ion' file </li></ul><ul><li>$ grep -n '^#define' file </li></ul><ul><li>$ grep '^#de[a-z]*' file </li></ul><ul><li>$ egrep 'while|if' *.c </li></ul><ul><li>$ egrep '[0-9]+' *.c </li></ul>
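The grep options and variants above can be checked against the three sample files. This is a sketch under a few assumptions: a POSIX shell with mktemp available, and the standard option spellings grep -E and grep -F, which current systems prefer over the egrep and fgrep command names used on the slides. The variable names are illustrative.

```shell
# Recreate the three sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' aaabb bbbcc ff-ff cccdd dddaa > "$dir/testa"
printf '%s\n' aaaaa bbbbb ccccc ddddd > "$dir/testb"
printf '%s\n' AAAAA BBBBB CCCCC DDDDD > "$dir/testc"

count=$(grep -c bb "$dir/testa")           # number of matching lines
inverse=$(grep -vc bb "$dir/testa")        # number of non-matching lines
files=$(cd "$dir" && grep -l bb test?)     # names of files with a match
nocase=$(cd "$dir" && grep -il bb test?)   # same, ignoring case
either=$(grep -Ec 'aa|dd' "$dir/testa")    # extended RE with alternation
fixed=$(grep -Fc 'ff-ff' "$dir/testa")     # fixed string, no RE characters

printf '%s\n' "$count" "$inverse" "$files" "$nocase" "$either" "$fixed"
rm -r "$dir"
```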
  13. 3. awk <ul><li>Format: </li></ul><ul><ul><li>awk program file-list </li></ul></ul><ul><ul><li>awk -f program-file file-list </li></ul></ul><ul><li>awk is a pattern scanning and action processing language. </li></ul><ul><li>The action language is very similar to C. </li></ul>
  14. Overview <ul><li>3.1. Patterns & Actions </li></ul><ul><li>3.2. awk Processing Cycle </li></ul><ul><li>3.3. How awk Sees a Line </li></ul><ul><li>3.4. Pattern Expressions </li></ul><ul><li>3.5. ',' Range Operator </li></ul>continued
  15. <ul><li>3.6. Many Built-in Functions </li></ul><ul><li>3.7. BEGIN and END </li></ul><ul><li>3.8. First awk Program File: pr_header </li></ul><ul><li>3.9. Action Language </li></ul><ul><li>3.10. Associative Arrays </li></ul>
  16. 3.1. Patterns & Actions <ul><li>An awk program consists of a sequence of pattern-action pairs: </li></ul><ul><ul><li>pattern {action} </li></ul></ul><ul><ul><li>pattern {action} </li></ul></ul><ul><ul><li>: </li></ul></ul>
  17. 3.2. awk Processing Cycle <ul><li>1. Read the next input line. </li></ul><ul><li>2. Apply all awk patterns sequentially. </li></ul><ul><li>3. If a pattern matches, do its action. </li></ul><ul><li>4. Go to step (1). </li></ul>
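  18. Example <ul><li>$ cat cars </li></ul><ul><ul><li>plym fury 77 73 2500 </li></ul></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>ford mustang 65 45 10000 </li></ul></ul><ul><ul><li>volvo gl 78 102 9850 </li></ul></ul><ul><ul><li>ford ltd 83 15 10500 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul><ul><ul><li>ford thundbd 84 10 17000 </li></ul></ul><ul><ul><li>toyota tercel 82 180 750 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><ul><li>ford bronco 83 25 9500 </li></ul></ul>continued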
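  19. <ul><li>$ awk '/chevy/ {print}' cars </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><li>$ awk '/chevy/' cars (a missing action means print) </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><li>$ awk '/^h/' cars </li></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul>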
  20. 3.3. How awk Sees a Line <ul><li>awk views each line as a record consisting of fields separated by whitespace (blanks or tabs). </li></ul><ul><li>Each field is referred to by a variable called $<number> : </li></ul><ul><ul><li>$1, $2, $3, etc. </li></ul></ul><ul><ul><li>$0 refers to the whole line (record) </li></ul></ul><ul><li>The current line number is stored in NR. </li></ul>continued
  21. <ul><li>$ awk '{print $3, $1}' cars </li></ul><ul><ul><li>77 plym </li></ul></ul><ul><ul><li>79 chevy </li></ul></ul><ul><ul><li>65 ford </li></ul></ul><ul><ul><li>: </li></ul></ul><ul><ul><li>83 ford </li></ul></ul><ul><li>$ awk '/chevy/ {print $3, $1}' cars </li></ul><ul><ul><li>79 chevy </li></ul></ul><ul><ul><li>80 chevy </li></ul></ul><ul><ul><li>65 chevy </li></ul></ul>
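The field variables can be explored on a single record. The sketch below uses one line from the cars file; NF (the number of fields) and $NF (the last field) are standard awk variables that the slides do not cover, and the shell variable names are illustrative.

```shell
# One record from the cars file.
line='plym fury 77 73 2500'

# NR = record number, NF = number of fields, $NF = last field.
fields=$(printf '%s\n' "$line" | awk '{ print NR, NF, $1, $NF }')

# Assigning to a field rebuilds $0 with single blanks between the fields.
swapped=$(printf '%s\n' "$line" | awk '{ tmp = $1; $1 = $2; $2 = tmp; print }')

printf '%s\n' "$fields" "$swapped"
```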
  22. 3.4. Pattern Expressions <ul><li>Format: </li></ul><ul><ul><li>variable OP pattern </li></ul></ul><ul><li>OP forms: </li></ul><ul><ul><li>matching: ~ !~ </li></ul></ul><ul><ul><li>comparison: < <= == != >= > </li></ul></ul><ul><ul><li>boolean: && || ! </li></ul></ul>continued
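  23. <ul><li>$ awk '$1 ~ /h/' cars </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><li>$ awk '$1 ~ /^h/' cars </li></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul>continued
  24. <ul><li>$ awk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars </li></ul><ul><ul><li>65 mustang $10000 </li></ul></ul><ul><ul><li>84 thundbd $17000 </li></ul></ul><ul><ul><li>82 tercel $750 </li></ul></ul><ul><li>$ awk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars </li></ul><ul><ul><li>65 ford $10000 </li></ul></ul><ul><ul><li>65 fiat $450 </li></ul></ul><ul><ul><li>65 chevy $1550 </li></ul></ul>continued
  25. <ul><li>$ awk '$3 == 65' cars </li></ul><ul><ul><li>ford mustang 65 45 10000 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><li>$ awk '$5 <= 3000' cars </li></ul><ul><ul><li>plym fury 77 73 2500 </li></ul></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><ul><li>toyota tercel 82 180 750 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul>continued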
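  26. <ul><li>$ awk '$5 >= "2000" && $5 < "9000"' cars (quoted constants force a string comparison, so 450 and 750 also match) </li></ul><ul><ul><li>plym fury 77 73 2500 </li></ul></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul><ul><ul><li>toyota tercel 82 180 750 </li></ul></ul><ul><li>$ awk '$5 >= 2000 && $5 < 9000' cars (unquoted constants give a numeric comparison) </li></ul><ul><ul><li>plym fury 77 73 2500 </li></ul></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul>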
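The difference between the two comparisons on this slide is easy to miss. The sketch below repeats both tests on a four-line subset of the cars file (the subset and the variable names are chosen here for illustration). With quoted constants awk compares strings character by character, so "450" counts as greater than "2000" because "4" sorts after "2".

```shell
# A four-line subset of the cars file.
sample='plym fury 77 73 2500
fiat 600 65 115 450
ford ltd 83 15 10500
toyota tercel 82 180 750'

# Quoted constants: $5 is compared as a string.
as_string=$(printf '%s\n' "$sample" | awk '$5 >= "2000" && $5 < "9000" { print $1 }')

# Unquoted constants: $5 is compared as a number.
as_number=$(printf '%s\n' "$sample" | awk '$5 >= 2000 && $5 < 9000 { print $1 }')

echo "string comparison: " $as_string
echo "numeric comparison:" $as_number
```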
  27. 3.5. ',' Range Operator <ul><li>Format: </li></ul><ul><ul><li>pattern1 , pattern2 </li></ul></ul><ul><li>Select a range of lines. </li></ul><ul><ul><li>the first line of the range matches pattern1 </li></ul></ul><ul><ul><li>the last line of the range matches pattern2 </li></ul></ul><ul><li>May return several groups of lines. </li></ul>continued
  28. <ul><li>$ awk '/volvo/ , /fiat/' cars </li></ul><ul><ul><li>volvo gl 78 102 9850 </li></ul></ul><ul><ul><li>ford ltd 83 15 10500 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><li>$ awk 'NR == 2 , NR == 4' cars </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>ford mustang 65 45 10000 </li></ul></ul><ul><ul><li>volvo gl 78 102 9850 </li></ul></ul>continued
  29. <ul><li>$ awk '/chevy/ , /ford/' cars (three groups) </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>ford mustang 65 45 10000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>fiat 600 65 115 450 </li></ul></ul><ul><ul><li>honda accord 81 30 6000 </li></ul></ul><ul><ul><li>ford thundbd 84 10 17000 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><ul><li>ford bronco 83 25 9500 </li></ul></ul>
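To see where each range starts and stops, it can help to print line numbers instead of the lines themselves. The sketch below recreates the cars file in a temporary file (mktemp is assumed to be available) and prints NR for each selected line. The last command is an addition that the slides do not show: a range whose end pattern never matches runs to the end of the input.

```shell
cars=$(mktemp)
cat > "$cars" <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 10000
volvo gl 78 102 9850
ford ltd 83 15 10500
chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9500
EOF

by_number=$(awk 'NR == 2, NR == 4 { print NR }' "$cars")
by_pattern=$(awk '/chevy/, /ford/ { print NR }' "$cars")
# The end pattern never matches, so the range runs to the last line.
open_ended=$(awk '/toyota/, /no-such-make/ { print NR }' "$cars")

echo "NR range:     " $by_number
echo "chevy to ford:" $by_pattern
echo "open-ended:   " $open_ended
rm "$cars"
```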
  30. 3.6. Many Built-in Functions <ul><li>length(str) returns the length of string str; length with no argument returns the length of the current line. </li></ul><ul><li>split(string, array, delimiter) splits string into parts based on the delimiter, and places them in array: </li></ul><ul><ul><li>split("a bcd ef g1", arr, " ") </li></ul></ul>continued
  31. <ul><li>$ awk 'length > 23 {print NR}' cars </li></ul><ul><ul><li>3 </li></ul></ul><ul><ul><li>9 </li></ul></ul><ul><ul><li>10 </li></ul></ul>
  32. 3.7. BEGIN and END <ul><li>BEGIN {action} is executed before the first line is processed. </li></ul><ul><li>END {action} is executed after the last line is processed. </li></ul><ul><li>$ awk 'END {print NR, "cars for sale."}' cars </li></ul><ul><ul><li>12 cars for sale. </li></ul></ul>
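The split() call from the previous slide can be run on its own inside a BEGIN action, which needs no input file. A small sketch; the second call, which splits a time string on ":", is an added illustration rather than something from the slides.

```shell
result=$(awk 'BEGIN {
    n = split("a bcd ef g1", arr, " ")   # n is the number of pieces
    print n, arr[1], arr[4], length(arr[2])
    m = split("07:14", t, ":")           # any delimiter string can be used
    print m, t[1], t[2]
}')
printf '%s\n' "$result"
```

split() returns the number of pieces, and the array indexes start at 1.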
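  33. 3.8. First awk Program File <ul><li>$ cat pr_header </li></ul><ul><ul><li># </li></ul></ul><ul><ul><li># pr_header </li></ul></ul><ul><ul><li># </li></ul></ul><ul><ul><li>BEGIN { </li></ul></ul><ul><ul><li>print "Make Model Year Miles Price" </li></ul></ul><ul><ul><li>print "---------------------------------" </li></ul></ul><ul><ul><li>} </li></ul></ul><ul><ul><li>{print} </li></ul></ul>continued
  34. <ul><li>$ awk -f pr_header cars </li></ul><ul><ul><li>Make Model Year Miles Price </li></ul></ul><ul><ul><li>--------------------------------- </li></ul></ul><ul><ul><li>plym fury 77 73 2500 </li></ul></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>: </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul><ul><ul><li>ford bronco 83 25 9500 </li></ul></ul>
  35. redirect_out <ul><li>$ cat redirect_out </li></ul><ul><ul><li>/chevy/ {print > "chev.txt"} </li></ul></ul><ul><ul><li>/ford/ {print > "ford.txt"} </li></ul></ul><ul><ul><li>END {print "done."} </li></ul></ul><ul><li>$ awk -f redirect_out cars </li></ul><ul><ul><li>done. </li></ul></ul><ul><li>$ cat chev.txt </li></ul><ul><ul><li>chevy nova 79 60 3000 </li></ul></ul><ul><ul><li>chevy nova 80 50 3500 </li></ul></ul><ul><ul><li>chevy impala 65 85 1550 </li></ul></ul>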
  36. 3.9. Action Language <ul><li>Very C like: </li></ul><ul><ul><li>var = expr </li></ul></ul><ul><ul><li>if (cond) stat1 else stat2 </li></ul></ul><ul><ul><li>while (cond) stat </li></ul></ul><ul><ul><li>for (expr1; cond; expr2) stat </li></ul></ul><ul><ul><li>printf "format", expr1, expr2, ... </li></ul></ul><ul><ul><li>{ stat1 ; stat2; ... ; statN } </li></ul></ul><ul><li>User-defined variables do not need to be declared. </li></ul>continued
  37. <ul><li>Long statements, conditions, and expressions may need to be typed over several lines. </li></ul><ul><li>Use '\' at the end of a line to hide the newline: </li></ul><ul><ul><li>if ($3 > 2000 && \ </li></ul></ul><ul><ul><li>$3 < 3000) print $3 </li></ul></ul>
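  38. price_range <ul><li>$ cat price_range </li></ul><ul><ul><li>{ </li></ul></ul><ul><ul><li>if ($5 <= 5000) $5 = "inexpensive" </li></ul></ul><ul><ul><li>else if ($5 > 5000 && $5 < 10000) $5 = "please ask" </li></ul></ul><ul><ul><li>else if ($5 >= 10000) $5 = "expensive" </li></ul></ul><ul><ul><li>printf "%-10s %-8s 19%2d %5d %-12s\n", $1, $2, $3, $4, $5 </li></ul></ul><ul><ul><li>} </li></ul></ul>continued
  39. <ul><li>$ awk -f price_range cars </li></ul><ul><ul><li>plym fury 1977 73 inexpensive </li></ul></ul><ul><ul><li>chevy nova 1979 60 inexpensive </li></ul></ul><ul><ul><li>: </li></ul></ul><ul><ul><li>ford bronco 1983 25 please ask </li></ul></ul>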
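  40. summary <ul><li>$ cat summary </li></ul><ul><ul><li>BEGIN { </li></ul></ul><ul><ul><li>yearsum = 0 ; costsum = 0 </li></ul></ul><ul><ul><li>newcostsum = 0 ; newcnt = 0 </li></ul></ul><ul><ul><li>} </li></ul></ul><ul><ul><li>{ yearsum += $3 ; costsum += $5 } </li></ul></ul><ul><ul><li>$3 > 80 { newcostsum += $5 ; newcnt++ } </li></ul></ul><ul><ul><li>END { </li></ul></ul><ul><ul><li>printf "Avg. car age: %3.1f yrs\n", 90 - (yearsum/NR) </li></ul></ul><ul><ul><li>printf "Avg. car cost: $%7.2f\n", costsum/NR </li></ul></ul><ul><ul><li>printf "Avg. newer car cost: $%7.2f\n", newcostsum/newcnt </li></ul></ul><ul><ul><li>} </li></ul></ul>continued
  41. <ul><li>$ awk -f summary cars </li></ul><ul><ul><li>Avg. car age: 13.2 yrs </li></ul></ul><ul><ul><li>Avg. car cost: $6216.67 </li></ul></ul><ul><ul><li>Avg. newer car cost: $8750.00 </li></ul></ul>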
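  42. 3.10. Associative Arrays <ul><li>Arrays that use strings as indexes: </li></ul><ul><ul><li>array[string] = value </li></ul></ul><ul><li>Special for-loop for awk arrays: </li></ul><ul><ul><li>for (elem in array) action </li></ul></ul>continued
  43. manuf <ul><li>$ cat manuf </li></ul><ul><ul><li>{manuf[$1]++} </li></ul></ul><ul><ul><li>END { </li></ul></ul><ul><ul><li>for (name in manuf) </li></ul></ul><ul><ul><li>print name, manuf[name] </li></ul></ul><ul><ul><li>} </li></ul></ul>continued
  44. <ul><li>$ awk -f manuf cars (the order of a for-in loop is not defined, so the lines may appear in any order) </li></ul><ul><ul><li>honda 1 </li></ul></ul><ul><ul><li>fiat 1 </li></ul></ul><ul><ul><li>volvo 1 </li></ul></ul><ul><ul><li>ford 4 </li></ul></ul><ul><ul><li>plym 1 </li></ul></ul><ul><ul><li>chevy 3 </li></ul></ul><ul><ul><li>toyota 1 </li></ul></ul>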
  45. Sorted Output <ul><li>Sort by first column (i.e. by name): </li></ul><ul><ul><li>$ awk -f manuf cars | sort </li></ul></ul><ul><li>Sort by second column (i.e. by number): </li></ul><ul><ul><li>$ awk -f manuf cars | sort -k 2 </li></ul></ul><ul><ul><li>sort +1 is the obsolete form of -k 2; use -k 2n to compare the column numerically </li></ul></ul>
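The counting and sorting steps can be combined in one pipeline. The sketch below feeds only the first column of the cars file (the makes) to an inline version of the manuf program, then sorts by count, largest first, with ties broken by name. The sort keys and LC_ALL=C are choices made here for a repeatable order; the slides do not prescribe them.

```shell
# First column of the cars file, one make per record after word splitting.
makes='plym chevy ford volvo ford chevy fiat honda ford toyota chevy ford'

sorted=$(printf '%s\n' $makes |
    awk '{ count[$1]++ } END { for (m in count) print m, count[m] }' |
    LC_ALL=C sort -k2,2nr -k1,1)

printf '%s\n' "$sorted"
```

Sorting after awk is the usual workaround for the undefined order of the for-in loop.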
  46. 4. sed <ul><li>Format: </li></ul><ul><ul><li>sed 'list of ed commands' file </li></ul></ul><ul><li>Read lines one at a time from the input file: </li></ul><ul><ul><li>apply the ed commands in order to each line </li></ul></ul><ul><ul><li>write the edited line to stdout </li></ul></ul><ul><li>ed is an old UNIX editor </li></ul><ul><ul><li>vi without full-screen mode </li></ul></ul><ul><ul><li>did you think vi was tough :) </li></ul></ul>
  47. 4.1. Search and Replace <ul><li>The 's' command searches for a pattern (a regular expression), and replaces it with the new string: </li></ul><ul><ul><li>'s/pattern/new-string/g' </li></ul></ul><ul><ul><li>'g' means global (everywhere on the line); without it only the first match on each line is replaced </li></ul></ul>
  48. Examples <ul><li>$ sed 's/UNIX/UNIX(TM)/g' file > new-file </li></ul><ul><li>$ sed 's/^/TAB/' file > new-file </li></ul><ul><ul><li>put a tab at the start of every line (no g needed); TAB stands for a literal tab character </li></ul></ul><ul><li>$ sed 's/[ TAB][ TAB]*/\NEWLINE/g' file > new-file </li></ul><ul><ul><li>replace every sequence of blanks or tabs with a newline; NEWLINE stands for an actual line break typed after the backslash </li></ul></ul><ul><ul><li>this splits the input into 1 word/line </li></ul></ul>continued
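  49. <ul><li>$ who </li></ul><ul><ul><li>ad tty1 Sep 29 07:14 </li></ul></ul><ul><ul><li>ron tty3 Sep 29 10:31 </li></ul></ul><ul><ul><li>td tty4 Sep 29 08:36 </li></ul></ul><ul><li>$ who | sed 's/ .* / /' </li></ul><ul><ul><li>ad 07:14 </li></ul></ul><ul><ul><li>ron 10:31 </li></ul></ul><ul><ul><li>td 08:36 </li></ul></ul><ul><li>The pattern matches a blank and everything that follows it (as much as possible, including more blanks) up to the last blank, and replaces all of that with a single blank. </li></ul>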
  50. More Information <ul><li>sed can use most ed commands, not just s. </li></ul><ul><li>See the entry on sed in Sobell, p.680-691. </li></ul>
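The effect of the g flag and of the greedy ".*" can be checked on single lines of input. A minimal sketch; the sample strings come from the slides, while the -e option (several commands, applied in order) and the "user: " prefix are additions for illustration.

```shell
# Without g only the first match on the line is replaced.
first_only=$(printf 'UNIX is UNIX\n' | sed 's/UNIX/UNIX(TM)/')
# With g every match on the line is replaced.
every=$(printf 'UNIX is UNIX\n' | sed 's/UNIX/UNIX(TM)/g')
# " .* " is greedy: it spans from the first blank to the last blank.
greedy=$(printf 'ad tty1 Sep 29 07:14\n' | sed 's/ .* / /')
# Several commands can be given with -e; they run in order on each line.
two=$(printf 'ad tty1 Sep 29 07:14\n' | sed -e 's/ .* / /' -e 's/^/user: /')

printf '%s\n' "$first_only" "$every" "$greedy" "$two"
```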
  51. 5. find <ul><li>Format: </li></ul><ul><ul><li>find starting-directory matching-conditions-and-actions </li></ul></ul><ul><li>find searches the starting directory and all the directories below it. </li></ul><ul><ul><li>it carries out the specified actions on the files that match the specified conditions </li></ul></ul>
  52. Basic Example <ul><li>Assume we are in my home directory, and want to find the cars file (used in the awk examples): </li></ul><ul><ul><li>$ find . -name cars -print </li></ul></ul><ul><ul><li>./teach/adv-unix/filters/cars </li></ul></ul><ul><li>"." is the starting point, "-name cars" is the condition, and "-print" is the action. </li></ul>
  53. 5.1. Some Matching Conditions <ul><li>-name nm the filename is nm </li></ul><ul><li>-type ty ty is a file type: f = file, d = directory, etc. </li></ul><ul><li>-user usr the file's owner is usr </li></ul><ul><li>-group grp the file's group owner is grp </li></ul>continued
  54. <ul><li>-atime n file was last accessed exactly n days ago </li></ul><ul><li>-mtime n file was last modified exactly n days ago </li></ul><ul><li>-size n file is exactly n 512-byte blocks long </li></ul><ul><li>Write +n or -n to mean more than or less than n. </li></ul>
  55. 5.2. Example Conditions <ul><li>-mtime +7 last modified more than 7 days ago </li></ul><ul><li>-size +100 larger than 50K </li></ul><ul><li>"And"ing conditions: </li></ul><ul><ul><li>-atime +60 -mtime +120 </li></ul></ul><ul><ul><li>files last accessed more than 2 months ago and last modified more than 4 months ago </li></ul></ul>continued
  56. <ul><li>"Or"ing conditions (the parentheses must be escaped from the shell): </li></ul><ul><ul><li>\( -mtime +7 -o -atime +30 \) </li></ul></ul><ul><ul><li>files last modified more than 7 days ago or last accessed more than 30 days ago </li></ul></ul><ul><li>"Not": </li></ul><ul><ul><li>-name '*.dat' ! -name gold.dat </li></ul></ul><ul><ul><li>all ".dat" files except gold.dat (the pattern is quoted so that the shell does not expand it) </li></ul></ul>
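The "and", "or" and "not" forms can be tried safely in a scratch directory. The sketch below assumes mktemp is available; the file names other than gold.dat, core and junk are invented for the example, and the output is piped through sort only to make the order repeatable.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a.dat" "$dir/gold.dat" "$dir/sub/b.dat" "$dir/sub/core" "$dir/junk"

# "Not": every .dat file except gold.dat.
dat=$(cd "$dir" && find . -name '*.dat' ! -name gold.dat -print | LC_ALL=C sort)

# "Or" inside escaped parentheses, "and"ed with -type f.
cj=$(cd "$dir" && find . \( -name core -o -name junk \) -type f -print | LC_ALL=C sort)

# Directories only; the starting point itself is included.
dirs=$(cd "$dir" && find . -type d -print | LC_ALL=C sort)

printf '%s\n' "$dat" "$cj" "$dirs"
rm -r "$dir"
```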
  57. 5.3. Some Actions <ul><li>-print display pathname of matching file </li></ul><ul><li>-exec cmd execute cmd on file </li></ul><ul><li>-ok cmd prompt before executing cmd on file </li></ul><ul><li>Commands must end with \; (a semicolon escaped from the shell) and use {} to mean the matching file, e.g.: </li></ul><ul><ul><li>-ok rm {} \; </li></ul></ul>
  58. 5.4. Examples <ul><li>$ find . -name '*.c' -print </li></ul><ul><ul><li>Starting from the current directory, display the pathnames of all the files ending in ".c". </li></ul></ul><ul><li>$ find . \( -name core -o -name junk \) -print -ok rm {} \; </li></ul><ul><ul><li>Print the pathnames of all the core and junk files in the current directory and below, and prompt to remove them. </li></ul></ul>continued
  59. <ul><li>$ find /usr -size +100 -mtime +30 -exec ls -l {} \; </li></ul><ul><ul><li>Display a long listing of all the files under /usr that are larger than about 50K (100 512-byte blocks) and have not been modified in a month. </li></ul></ul>
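The -size condition counts 512-byte blocks, rounding the file size up, which can be confirmed with files of known size. A sketch assuming mktemp is available; the file names and sizes are invented for the example.

```shell
dir=$(mktemp -d)
: > "$dir/empty.txt"                                                    # 0 bytes = 0 blocks
printf 'x' > "$dir/small.txt"                                           # 1 byte = 1 block
awk 'BEGIN { for (i = 0; i < 2000; i++) printf "x" }' > "$dir/big.txt"  # 2000 bytes = 4 blocks

# More than 2 blocks: only big.txt qualifies.
big=$(cd "$dir" && find . -type f -size +2 -print)

# More than 0 blocks: every non-empty file.
nonempty=$(cd "$dir" && find . -type f -size +0 -print | LC_ALL=C sort)

# -exec runs the command once per matching file; {} is replaced by its pathname.
(cd "$dir" && find . -type f -size +2 -exec ls -l {} \;)

printf '%s\n' "$big" "$nonempty"
rm -r "$dir"
```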
  60. 5.5. Problems with Permissions <ul><li>A find over the entire filesystem will print many error messages when access is denied to other users' directories. </li></ul><ul><li>These error messages (sent to stderr) can be redirected to /dev/null (a UNIX "black hole"). </li></ul>
  61. Example <ul><li>Search for a file/directory called zip anywhere below the root directory: </li></ul><ul><ul><li>$ find / -name zip -print </li></ul></ul><ul><ul><li>find: /exports/tmp/code/4210341: Permission denied </li></ul></ul><ul><ul><li>find: /exports/tmp/code/4210389: Permission denied </li></ul></ul><ul><ul><li>find: /exports/home/suthon/private: Permission denied </li></ul></ul><ul><ul><li>find: /exports/home/cj/mail: Permission denied </li></ul></ul><ul><ul><li>: </li></ul></ul>continued
  62. <ul><li>Redirect standard error to the black hole using 2> : </li></ul><ul><li>$ find / -name zip -print 2> /dev/null </li></ul><ul><ul><li>/exports/home/s4110068/project/zip </li></ul></ul><ul><ul><li>/exports/home/s4110316/project/zip </li></ul></ul><ul><ul><li>/exports/home/s4110316/zip </li></ul></ul><ul><ul><li>/exports/home/s4110316/zip/zip </li></ul></ul>
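The redirection can be demonstrated without searching the whole filesystem. The sketch below provokes a find error with a starting point that does not exist (the name no-such-dir is hypothetical) instead of a permission failure, since that behaves the same way for every user. It then shows that 2> /dev/null discards only the error messages.

```shell
dir=$(mktemp -d)
mkdir "$dir/zip"

# The second starting point does not exist, so find writes an error to stderr
# and exits non-zero; "|| true" keeps a "set -e" script going.
found=$(find "$dir" "$dir/no-such-dir" -name zip -print 2> /dev/null) || true

# The reverse: keep only the error messages and discard the normal output.
errors=$(find "$dir" "$dir/no-such-dir" -name zip -print 2>&1 > /dev/null) || true

echo "stdout: $found"
echo "stderr: $errors"
rm -r "$dir"
```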