SYSTEM PROGRAMMING
OLUFEMI OLOLADE OLAEWE
ASSISTANT SOFTWARE DEVELOPER
[BSc., M.Phil.]
INTRODUCTION TO AWK UTILITY
AWK is a programming language created by Aho, Weinberger, and Kernighan, whose initials give the language its name.
It is useful for:
• manipulation of data files,
• text retrieval and processing,
• generation of reports, and
• prototyping and experimenting with algorithms.
Versions: awk, nawk, mawk, pgawk, and gawk (GNU).
AWK CONTD
An AWK program is a sequence of pattern {action} pairs and function definitions.
Short programs are entered on the command line, usually enclosed in single quotes (' ') to avoid shell interpretation.
Longer programs can be read in from a file with the -f option.
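The two ways of supplying a program can be sketched as follows. The sample data and the file names data.txt and first.awk are assumptions made up for this illustration.

```shell
# Scratch directory with a small, hypothetical input file.
workdir=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$workdir/data.txt"

# 1) Short program on the command line, protected from the shell by single quotes.
inline_out=$(awk '{ print $1 }' "$workdir/data.txt")

# 2) The same program stored in a file and loaded with the -f option.
printf '{ print $1 }\n' > "$workdir/first.awk"
file_out=$(awk -f "$workdir/first.awk" "$workdir/data.txt")

printf '%s\n' "$inline_out"
printf '%s\n' "$file_out"
rm -rf "$workdir"
```

Both invocations run the same program, so they should print the same first column.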
SYNTAX
• pattern {action}
• One, but not both, of pattern and {action} can be omitted,
• i.e., a program must have a pattern, an {action}, or both.
• If the pattern is missing, the action is applied to all lines (every line implicitly matches);
• if the action is missing, the matched line is printed (the action is implicitly {print}).
• E.g., the command awk '/for/' testfile prints all lines of testfile that contain the string "for".
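A minimal sketch of the three forms, using a hypothetical three-line input in place of testfile:

```shell
# Hypothetical stand-in for the contents of testfile.
input='for each item
while true
wait for input'

# Pattern only: every matching line is printed (implicit { print }).
pattern_only=$(printf '%s\n' "$input" | awk '/for/')

# Action only: the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern and action: the action runs only on matching lines.
both=$(printf '%s\n' "$input" | awk '/for/ { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$pattern_only" "$action_only" "$both"
```

Note that /for/ matches anywhere in the line, so the third line is selected as well as the first.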
BASIC TERMINOLOGY OF INPUT FILES
• Data on the input file is broken into records as determined by the record separator variable, RS.
• By default, RS = "\n", i.e. newline.
• Each line of data or text on the input file is therefore a record.
• Records are read in one at a time, and the current record is stored in the field variable $0.
• A record is split into fields, which are stored in the field buffers $1, $2, ..., $NF.
• A field is thus a unit of data in a line (record).
• Each field in a record is separated from the other fields by the field separator, FS.
• The default field separator is whitespace.
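The record and field terminology can be tried out with a short sketch. The input strings are invented for illustration, and -F is the standard command-line option for setting FS.

```shell
# Default FS: any run of whitespace separates fields.
fields_default=$(printf 'one two   three\n' | awk '{ print NF ": " $1 " " $NF }')

# FS set to a colon with -F; $NF is the last field of the record.
fields_colon=$(printf 'root:x:0\n' | awk -F: '{ print $1, $NF }')

# RS set to a semicolon: each ;-separated chunk becomes one record ($0).
records=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')

printf '%s\n%s\n%s\n' "$fields_default" "$fields_colon" "$records"
```

With the default separators the first command sees one record of three fields; changing RS makes the same kind of input yield three one-field records.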
SOME SYSTEM/BUILT-IN VARIABLES
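As a minimal sketch, assuming the standard POSIX awk variables NR (number of the current record), NF (number of fields in the current record) and OFS (output field separator), with made-up input:

```shell
# NR counts records read so far; NF counts fields in the current record.
counts=$(printf 'a b\nc d e\n' | awk '{ print NR, NF }')

# OFS is placed between the comma-separated arguments of print.
joined=$(printf 'a b\nc d e\n' | awk 'BEGIN { OFS = "-" } { print $1, $2 }')

printf '%s\n%s\n' "$counts" "$joined"
```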
EXAMPLE 1
EXAMPLE 2
• A pattern can be:
– BEGIN, END,
– expression
– expression, expression (a range of records, from one matching the first expression to one matching the second)
• Note that BEGIN and END patterns require an action.
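The expression and range forms can be sketched like this; the five-line input is hypothetical.

```shell
input='intro
start
body
stop
outro'

# Single expression pattern: select the record whose number is 2.
second=$(printf '%s\n' "$input" | awk 'NR == 2')

# Range pattern: from the first record matching /start/ through the next matching /stop/.
range=$(printf '%s\n' "$input" | awk '/start/, /stop/')

printf '%s\n---\n%s\n' "$second" "$range"
```

Neither program has an action, so the selected records are simply printed.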
• An AWK script can be divided into three main parts as follows:
• BEGIN: performs pre-processing that must be completed before awk starts reading records from the input file.
• It is mostly used to initialize variables and to create report headings.
• BODY: contains the main processing logic to be applied to input records,
• like a loop that processes input data one record at a time:
• the body executes once for each record.
• END: contains post-processing logic to be executed after all input data have been processed.
• Logic such as printing a report's grand total is performed in this part of the script.
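The three parts can be sketched as a small report. The item names and quantities are invented, and the variable name total is an assumption for this example.

```shell
report=$(printf 'pens 3\nbooks 5\n' | awk '
BEGIN { print "ITEM QTY" }            # runs once, before any record is read
      { total += $2; print $1, $2 }   # body: runs once for each record
END   { print "TOTAL", total }        # runs once, after the last record
')

printf '%s\n' "$report"
```

The heading comes from BEGIN, one line per record from the body, and the grand total from END.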
STATEMENTS
• Statements in an AWK program are terminated by newlines, semicolons or both.
• Groups of statements such as actions or loop bodies are blocked via {...} as in C.
• The last statement in a block doesn't need a terminator.
• Blank lines have no meaning; an empty statement is terminated with a semicolon.
• Long statements can be continued with a backslash, \.
• A statement can be broken without a backslash after a comma, left brace, &&, ||, do, else, the right parenthesis of an if, while or for statement, and the right parenthesis of a function definition.
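A short sketch of these rules in one BEGIN action; the variable names x, y and sum are arbitrary.

```shell
out=$(awk 'BEGIN {
    x = 1; y = 2            # two statements on one line, separated by a semicolon
    sum = x + \
          y                 # a long statement continued with a backslash
    if (x < y &&
        sum == 3)           # broken after && and after the right parenthesis of if
        print "sum", sum    # last statement in the block: no terminator needed
}')

printf '%s\n' "$out"
```

The backslash has to be the last character on its line, which is why the comment sits on the continuation line instead.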
EXPRESSIONS AND OPERATORS
Primary AWK expressions are
– numeric constants,
– string constants,
– variables,
– fields,
– arrays and
– function calls.
The identifier for a variable, array or function can be a sequence of
– letters, digits and underscores
– that does not start with a digit.
• Variables are not declared; they exist when first referenced and are initialized to null (the empty string, which acts as 0 in numeric context).
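The behaviour of undeclared variables can be checked with a small sketch; s, n and count are arbitrary names that are never assigned before use.

```shell
out=$(awk 'BEGIN {
    print "[" s "]"     # s was never set: it acts as the empty string
    print n + 0         # n was never set: it acts as 0 in arithmetic
    count++             # first reference creates count, then increments it
    print count
}')

printf '%s\n' "$out"
```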
New expressions are composed with the following operators in order of increasing precedence.
EXPRESSION PATTERN TYPES
• An expression pattern uses matching:
• it either searches through an entire record for a possible match using a regular expression enclosed in slashes (/.../),
• or explicitly searches for a match in a particular field or group of fields using the operators ~ (match) or !~ (no match).
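Both kinds of matching can be sketched on a hypothetical two-column input of names and roles:

```shell
input='alice admin
bob user
carol admin'

# Whole-record match: /bob/ is tested against $0.
whole=$(printf '%s\n' "$input" | awk '/bob/')

# Field match: only the second field is tested.
admins=$(printf '%s\n' "$input" | awk '$2 ~ /admin/ { print $1 }')

# Negated field match: records whose second field does not match.
others=$(printf '%s\n' "$input" | awk '$2 !~ /admin/ { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$whole" "$admins" "$others"
```

Restricting the match to $2 avoids accidental hits elsewhere in the record, for example a user whose name happens to contain "admin".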
Introduction to AWK utility on unix.pptx