Unix - Class7 - awk



This presentation gives a brief introduction to extracting and reporting textual data with awk.

This presentation is a work in progress. More material on awk may be added in the future.

Published in: Technology

Unix - Class7 - awk

  1. UNIX - awk: Data Extraction and Formatted Reporting Tool. Presentation by Nihar R Paital
  2. Introduction
     Developers: Alfred Aho, Peter Weinberger, Brian Kernighan
     Appears in: Version 7 UNIX onwards
     Developed during: 1970s
     Developed at: Bell Labs
     Category: UNIX utility
     Supported by: All UNIX flavors
  3. Definition
     The AWK utility is a data extraction and reporting tool that uses a data-driven scripting language consisting of a set of actions to be taken against textual data (either in files or data streams) for the purpose of producing formatted reports.
  4. It performs basic text formatting on an input stream (a file or input from a pipeline).
     Formatting using an input file:
         $ awk '{print $n}' filename
     Example:
         $ awk '{print $1}' awk.txt > awk.txt.bak
     Formatting using a filter in a pipeline:
         $ generate_data | awk '{print $1}'
     Example:
         $ cat awk.txt | awk '{print $1}' > awk.txt.bak
     Before proceeding to the next slide, please create a file named awk.txt with the following contents:
         [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
         [28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider"
  5. Basic but important for awk
     Syntax: awk '{print $n}' filename
     From a pipeline: generate_data | awk '{print $n}'
     Put the awk program in single quotes so that the shell does not expand $n itself.
     An awk action starts with a "{" and ends with a "}".
     $0 is the entire line.
     Awk parses each line into fields automatically, using any run of whitespace (spaces, tabs) as the delimiter.
     The fields of a line are available as $1, $2, $3, and so on.
     NF: a special variable that contains the number of fields in the current line. The last field can be printed as $NF.
     NR: a special variable that contains the number of the line (record) currently being processed.
  6. Basic Examples
         $ awk '{print $0}' awk.txt
     Prints every line of the file unchanged.
         $ echo this is a test | awk '{print $3}'
     Prints "a".
         $ echo this is a test | awk '{print $NF}'
     Prints "test".
         $ awk '{print $1, $(NF-2)}' awk.txt
     Prints the first field and the third-from-last field of each line of awk.txt.
         $ awk '{print NR ") " $1 " -> " $(NF-2)}' awk.txt
     Output:
         1) [28/Sep/2010:04:08:20] -> 200
         2) [28/Sep/2010:04:20:11] -> 304
  7. Advanced use of awk
     The examples below assume that logs.txt contains the same two lines as awk.txt, so the timestamp is the first field.
         $ awk '{print $1}' logs.txt
     Output:
         [28/Sep/2010:04:08:20]
         [28/Sep/2010:04:20:11]
     The date field is separated by "/" and ":" characters. Suppose we want to print only:
         [28/Sep/2010
         [28/Sep/2010
     Command:
         $ awk '{print $1}' logs.txt | awk 'BEGIN{FS=":"}{print $1}'
     Output:
         [28/Sep/2010
         [28/Sep/2010
     Here FS=":" sets the field separator to a colon (:).
         $ awk '{print $1}' logs.txt | awk 'BEGIN{FS=":"}{print $1}' | sed 's/\[//'
     Output:
         28/Sep/2010
         28/Sep/2010
     Here sed replaces the "[" with an empty string. The "[" is escaped because it is a special character in a regular expression.
  8. Advanced use of awk
     To return only the lines with status 200:
         $ awk '{if ($(NF-2) == "200") {print $0}}' logs.txt
     Output:
         [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
     To keep a running total of the status field:
         $ awk '{a+=$(NF-2); print "Total so far:", a}' logs.txt
     Output:
         Total so far: 200
         Total so far: 504
     To print only the final total, use an END block, which runs after the last line has been read:
         $ awk '{a+=$(NF-2)} END{print "Total:", a}' logs.txt
     Output:
         Total: 504
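
The special variables from slide 5 (NR, NF, $1, $NF) can be tried together on the two sample lines from slide 4. The following is a minimal sketch, not part of the original slides: it writes the sample lines to a temporary file (the variable names log_file and field_report are made up for illustration) and prints one summary line per record.

```shell
#!/bin/sh
# Recreate the two sample log lines from slide 4 in a temporary file.
# (log_file and field_report are illustrative names, not from the slides.)
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
[28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
[28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider"
EOF

# NR is the current record number, NF the number of fields on that line,
# $1 the first field and $NF the last field.
field_report=$(awk '{ print NR ": " NF " fields, first=" $1 ", last=" $NF }' "$log_file")
printf '%s\n' "$field_report"

# Clean up the temporary file.
rm -f "$log_file"
```

Each sample line splits into seven whitespace-separated fields, which is why $(NF-2) in the slides selects the status code.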
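
The date extraction on slide 7 chains two awk commands and one sed command. As an alternative sketch, not taken from the slides, the same result can be produced by a single awk program using the standard built-ins split() and substr(). It assumes the timestamp is the first field, as in the sample lines from slide 4; the variable names log_file and dates are made up for illustration.

```shell
#!/bin/sh
# Recreate the two sample log lines from slide 4 in a temporary file.
# (log_file and dates are illustrative names, not from the slides.)
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
[28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
[28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider"
EOF

# Assumption: the timestamp is field 1.
# split() cuts the timestamp at each ":" so parts[1] is "[28/Sep/2010";
# substr(parts[1], 2) then drops the leading "[".
dates=$(awk '{ split($1, parts, ":"); print substr(parts[1], 2) }' "$log_file")
printf '%s\n' "$dates"

# Clean up the temporary file.
rm -f "$log_file"
```

Doing the work in one awk program avoids starting three processes, at the cost of a slightly longer action.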
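
The filter on slide 8 uses an if statement inside the action. Awk also accepts a condition written as a pattern in front of the action, and a pattern with no action prints the matching line. The sketch below, which is an illustration rather than the author's own example, shows that form next to the END total from slide 8; the variable names log_file, ok_lines and total are made up for illustration.

```shell
#!/bin/sh
# Recreate the two sample log lines from slide 4 in a temporary file.
# (log_file, ok_lines and total are illustrative names, not from the slides.)
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
[28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
[28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider"
EOF

# A pattern without an action prints every line for which it is true,
# so this keeps only the lines whose third-from-last field is 200.
ok_lines=$(awk '$(NF-2) == "200"' "$log_file")
printf '%s\n' "$ok_lines"

# Sum the status field over all lines; the END block runs once after the last line.
total=$(awk '{ sum += $(NF-2) } END { print "Total:", sum }' "$log_file")
printf '%s\n' "$total"

# Clean up the temporary file.
rm -f "$log_file"
```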