Colloquium - awk
v1.0
A. Magee
April 4, 2010
Outline
1 Introduction
What does awk offer?
When should I use awk?
2 Learning by example
Sample File
Polling a Field
Doing a Little Math
What does awk offer?
awk is a text processor that works well on database-like files: one record per line, split into fields.
It operates on a file or stream of characters in which a newline character
terminates each line.
It works best on files with a consistent field delimiter such as whitespace,
a comma, or a colon.
It can operate on only the lines that you describe.
It can make programmatic text manipulation quick and painless.
When should I use awk?
For parsing well-structured data.
For editing a file at precisely defined places.
When you are too lazy (or smart) to open a WYSIWYG editor.
A sample file
Here’s a short file from an ls listing that we can play with; let’s call it
sample.txt.
drwxr-xr-x 22 root root 4096 2010-02-15 12:59 .
drwxr-xr-x 22 root root 4096 2010-02-15 12:59 ..
drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin
drwxr-xr-x 3 root root 4096 2010-02-27 19:27 boot
lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
drwxr-xr-x 14 root root 3200 2010-01-17 11:45 dev
drwxr-xr-x 85 root root 12288 2010-04-04 22:16 etc
lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob
Another sample file
Here’s a short file from a database that we can play with; let’s call it
sample2.txt.
psmith01 CLASS2B YEAR2 1 N ADVANCED STAFF 1 Y Y
smehta CLASS3G LOCAL 1 Y STANDARD PUPIL 2.1 N Y
mrsjohns SNHOJ UNRESTRICTED -1 Y ADVANCED STAFF 2 Y N
psmith02 CLASS4D UKSCHOOLS 0 N ADVANCED STAFF 10 Y Y
scohen CLASS3G LOCAL 2 Y STANDARD PUPIL 1 N N
swright CLASS1J YEAR1 1 N STANDARD PUPIL 1 N Y
amarkov CLASS4E UKSCHOOLS 3 Y STANDARD PUPIL 1 N N
Example 1
> awk '{print NF}' sample.txt
8
8
8
8
10
8
8
10
Each line awk processes is called a record.
As with many commands, we generally wrap the awk program in single
quotes so that the shell passes it through untouched.
{...}: A command group.
NF: The number of fields in the record.
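A minimal sketch of how NF follows the field count from record to record. The three-line listing is a stand-in for sample.txt, and the variable names `listing` and `nf_out` are made up for this illustration:

```shell
# Stand-in for sample.txt: two directories and one symbolic link.
listing='drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin
lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
drwxr-xr-x 14 root root 3200 2010-01-17 11:45 dev'

# For each record print its number, its field count, and its last field.
nf_out=$(printf '%s\n' "$listing" | awk '{print NR ": " NF " fields, last = " $NF}')
echo "$nf_out"
```

The symbolic link carries two extra fields (the arrow and the target), which is why its count is 10 rather than 8.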
Example 2
> awk '/^l/ {print $NF}' sample.txt
media/cdrom
/usr/bob
/.../: Selects any record that matches the regex between the slashes.
In this case we match any line that starts with the letter l.
{...}: A command group.
$NF: The last field of the line.
This command prints all the destinations of the symbolic links from
the listing.
What’s another way to get the same results?
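Two possible answers to that question, sketched on a short stand-in for sample.txt (the variable names are illustrative, not from the talk): match the regex against the first field only with the `~` operator, or select records by their field count.

```shell
# Stand-in for sample.txt: one directory and two symbolic links.
listing='drwxr-xr-x 3 root root 4096 2010-02-27 19:27 boot
lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob'

# Alternative 1: test only the first field against the regex.
by_field=$(printf '%s\n' "$listing" | awk '$1 ~ /^l/ {print $NF}')

# Alternative 2: in this listing only symbolic links have 10 fields.
by_count=$(printf '%s\n' "$listing" | awk 'NF == 10 {print $10}')

echo "$by_field"
echo "$by_count"
```

The field-count version depends on the shape of this particular listing; a file name containing spaces would break it, so the regex form is probably the safer habit.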
Example 3
> awk '{print NR,$0}' sample.txt
1 drwxr-xr-x 22 root root 4096 2010-02-15 12:59 .
2 drwxr-xr-x 22 root root 4096 2010-02-15 12:59 ..
3 drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin
4 drwxr-xr-x 3 root root 4096 2010-02-27 19:27 boot
5 lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
6 drwxr-xr-x 14 root root 3200 2010-01-17 11:45 dev
7 drwxr-xr-x 85 root root 12288 2010-04-04 22:16 etc
8 lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob
NR: The current record number.
$0: The entire current record.
This simply prints each line preceded by its record number.
Example 4
> awk '{print $NR}' sample.txt
drwxr-xr-x
22
root
root
11
2010-01-17
22:16
home
What does this silly command do?
Could it be useful?
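One way to read this command: `$NR` is field number NR of record number NR, so it walks down the diagonal of the input. A small sketch on a made-up 3x3 grid (the data and variable names are assumptions for illustration):

```shell
# Hypothetical grid: the row letter and column digit make each cell easy to spot.
grid='a1 a2 a3
b1 b2 b3
c1 c2 c3'

# Record 1 prints field 1, record 2 prints field 2, and so on.
diag=$(printf '%s\n' "$grid" | awk '{print $NR}')
echo "$diag"
```

Picking out a diagonal is exactly what a matrix calculation may need.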
Example 5
> awk -F, 'BEGIN {prod = 1} {prod *= $NR} END {print prod}' diag.dat
24
The file diag.dat contains a square upper-triangular matrix.
The determinant of such a matrix is simply the product of the
diagonal entries.
prod must be initialized to 1; an uninitialized variable is treated as 0.
Initializations are done in the BEGIN {...} block.
The END block holds the commands that run after all the
records are processed. Keep END and its opening brace on the same line.
-F: Set the field separator (here a comma; a regular expression also works).
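A sketch of the determinant calculation on inline data. The slides do not show diag.dat, so the 4x4 matrix below is an assumption, chosen so that its diagonal multiplies to 24:

```shell
# Hypothetical comma-separated upper-triangular matrix (not the real diag.dat).
matrix='1,5,6,7
0,2,8,9
0,0,3,1
0,0,0,4'

# Multiply the diagonal entries: field NR of record NR.
det=$(printf '%s\n' "$matrix" | awk -F, 'BEGIN {prod = 1} {prod *= $NR} END {print prod}')

# Without the BEGIN block prod starts out as 0, so the product stays 0.
no_init=$(printf '%s\n' "$matrix" | awk -F, '{prod *= $NR} END {print prod}')

echo "with BEGIN: $det"
echo "without BEGIN: $no_init"
```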
Non-explicit Details
> awk '{sum += $5; print $5} END {print "total: "sum}' sample.txt
4096
4096
4096
4096
11
3200
12288
22
total: 31905
Variables need no declaration; an uninitialized variable is the empty string, which counts as 0 in arithmetic.
This C-like syntax sums the fifth field over all records.
Commands in a {...} are separated by semicolons (;).
General structure is
BEGIN {...} pattern {...} pattern {...} ... END {...}
Variables are not strongly typed. They may be a string or number
depending on how you operate on them.
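A short sketch of the loose typing described here. The variable names `nothing`, `x`, and `typed` are invented for the example; everything runs in a BEGIN block, so no input file is needed:

```shell
typed=$(awk 'BEGIN {
    print "[" nothing "]"    # uninitialized variable used as a string: empty
    print nothing + 0        # the same variable used as a number: 0
    x = "3"                  # assigned as a string
    print x + 4              # arithmetic treats it as the number 3
    print x "4"              # juxtaposition concatenates strings
}')
echo "$typed"
```

The operator, not the variable, decides whether a value is read as a string or as a number.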
Example 6 & 7
> awk '{sum += $8} END {print sum/NR}' sample2.txt
2.2625
This is not correct! (Compute by hand to verify.)
Examine the file carefully to understand why.
> awk '!/^#/ {sum += $8; cnt++} END {print sum/cnt}' sample2.txt
2.58571
Here the problem has been resolved by keeping a count of the lines
matched: NR counts every record, including lines that carry no data.
The seven records listed earlier sum to 18.1 in field 8; 18.1/8 = 2.2625, while 18.1/7 = 2.58571.
Notice that lines starting with a # have been excluded.
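A sketch of the same pitfall on a shorter input. The `#` header line is an assumption (the slides never show it), and the guard against an empty count is an addition that the original commands do not have:

```shell
# Four records taken from sample2.txt plus an assumed comment line.
data='# assumed header line, not shown in the slides
psmith01 CLASS2B YEAR2 1 N ADVANCED STAFF 1 Y Y
mrsjohns SNHOJ UNRESTRICTED -1 Y ADVANCED STAFF 2 Y N
scohen CLASS3G LOCAL 2 Y STANDARD PUPIL 1 N N
swright CLASS1J YEAR1 1 N STANDARD PUPIL 1 N Y'

# Naive mean: divides by NR, which also counts the comment line (5 / 5).
naive=$(printf '%s\n' "$data" | awk '{sum += $8} END {print sum/NR}')

# Corrected mean: skip comment lines and divide by the records actually summed (5 / 4).
fixed=$(printf '%s\n' "$data" | awk '!/^#/ {sum += $8; cnt++} END {if (cnt > 0) print sum/cnt}')

echo "naive: $naive"
echo "fixed: $fixed"
```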
Example 8
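Recall the sed addressing model x~y (every y-th line, starting at line x).
> awk '(1+NR)%3 == 0 {print $0}' sample2.txt
psmith01 CLASS2B YEAR2 1 N ADVANCED STAFF 1 Y Y
psmith02 CLASS4D UKSCHOOLS 0 N ADVANCED STAFF 10 Y Y
amarkov CLASS4E UKSCHOOLS 3 Y STANDARD PUPIL 1 N N
NB: NR starts at 1, not 0, so this selects records 2, 5 and 8 (record 1 is evidently the # line excluded in Example 7).
In sed terms, x is 2 and y is 3.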
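A more general sketch of the same idea, assuming we want GNU sed's first~step selection spelled out with named parameters. The variables `first` and `step` are hypothetical names passed in with -v:

```shell
# Select records first, first+step, first+2*step, ... from ten numbered lines.
picked=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 |
    awk -v first=2 -v step=3 'NR >= first && (NR - first) % step == 0')
echo "$picked"
```

A pattern with no {...} prints the matching record, so no explicit print is needed here.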
3 Appendix
Tons of Control
More Built-Ins
FILENAME - Input file name.
FS - The field separator.
RS - The record separator (default is newline).
OFS - Output field separator.
ORS - Output record separator.
OFMT - Output format for numbers.
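A sketch of three of these variables at work. The colon-separated input is invented (loosely modelled on a passwd file), and the variable names are illustrative:

```shell
# Hypothetical colon-separated records.
accounts='root:x:0:0:root:/root:/bin/sh
bob:x:1000:1000:Bob:/usr/bob:/bin/sh'

# FS splits the input on colons; OFS joins the output with commas.
# Assigning to a field ($1 = $1) makes awk rebuild $0 using OFS.
csv=$(printf '%s\n' "$accounts" | awk 'BEGIN {FS = ":"; OFS = ","} {$1 = $1; print}')
echo "$csv"

# OFMT controls how print formats non-integer numbers.
rounded=$(awk 'BEGIN {OFMT = "%.2f"; print 3.14159}')
echo "$rounded"
```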
Math Functions
Relationals: <, <=, !=, ==, >=, >
Operators: +, -, *, /, ^, %
Also pre- and post- increment and decrement.
++, --
Assignment: =, +=, -=, *=, /=, %=
Many other math operations: sqrt(), log(), exp(), int(), etc.
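A quick sketch exercising a few of these operators and functions in a BEGIN block (no input file needed; the variable name `math` is arbitrary):

```shell
math=$(awk 'BEGIN {
    print 2 ^ 10          # exponentiation: 1024
    print 17 % 5          # remainder: 2
    print sqrt(144)       # square root: 12
    print int(7 / 2)      # truncation toward zero: 3
    print exp(0)          # e to the power 0: 1
    print log(1)          # natural logarithm: 0
}')
echo "$math"
```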
String Functions
substr(string, begin, length)
split(string, array, separator)
index(string, substring)
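A sketch of the three functions applied to one date string. The string and the names `s`, `parts`, and `n` are chosen for illustration:

```shell
strings=$(awk 'BEGIN {
    s = "2010-04-04"
    print substr(s, 1, 4)        # 4 characters starting at position 1 (positions start at 1)
    n = split(s, parts, "-")     # fills parts[1..n] and returns n
    print n, parts[2]
    print index(s, "-04")        # position of the first occurrence, 0 if absent
}')
echo "$strings"
```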
Control Structures
if ... else
while
for
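A sketch showing all three structures on a single made-up input line. The syntax mirrors C, and the names `i`, `n`, and `total` are arbitrary:

```shell
flow=$(printf '%s\n' '3 -1 4' | awk '{
    # for loop over the fields, with if ... else inside
    for (i = 1; i <= NF; i++) {
        if ($i < 0) {
            print $i, "negative"
        } else {
            print $i, "non-negative"
        }
    }
    # while loop: add the fields up from last to first
    n = NF
    total = 0
    while (n > 0) {
        total += $n
        n--
    }
    print "sum", total
}')
echo "$flow"
```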
19 / 19
Colloquium - awk, v1.0
A. Magee

More Related Content

What's hot

What's hot (20)

ARC - Moqod mobile talks meetup
ARC - Moqod mobile talks meetupARC - Moqod mobile talks meetup
ARC - Moqod mobile talks meetup
 
The CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGitThe CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGit
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Handling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVMHandling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVM
 
Demystifying the Go Scheduler
Demystifying the Go SchedulerDemystifying the Go Scheduler
Demystifying the Go Scheduler
 
C++ Core Guidelines
C++ Core GuidelinesC++ Core Guidelines
C++ Core Guidelines
 
Exploit techniques - a quick review
Exploit techniques - a quick reviewExploit techniques - a quick review
Exploit techniques - a quick review
 
Dusting the globe: analysis of NASA World Wind project
Dusting the globe: analysis of NASA World Wind projectDusting the globe: analysis of NASA World Wind project
Dusting the globe: analysis of NASA World Wind project
 
Building Custom PHP Extensions
Building Custom PHP ExtensionsBuilding Custom PHP Extensions
Building Custom PHP Extensions
 
Php and threads ZTS
Php and threads ZTSPhp and threads ZTS
Php and threads ZTS
 
SymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performancesSymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performances
 
Introduction to Rust language programming
Introduction to Rust language programmingIntroduction to Rust language programming
Introduction to Rust language programming
 
Python Programming Essentials - M31 - PEP 8
Python Programming Essentials - M31 - PEP 8Python Programming Essentials - M31 - PEP 8
Python Programming Essentials - M31 - PEP 8
 
Linux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLiteLinux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLite
 
Python Programming Essentials - M13 - Tuples
Python Programming Essentials - M13 - TuplesPython Programming Essentials - M13 - Tuples
Python Programming Essentials - M13 - Tuples
 
Oop object oriented programing topics
Oop object oriented programing topicsOop object oriented programing topics
Oop object oriented programing topics
 
A few words about OpenSSL
A few words about OpenSSLA few words about OpenSSL
A few words about OpenSSL
 
Bridge TensorFlow to run on Intel nGraph backends (v0.5)
Bridge TensorFlow to run on Intel nGraph backends (v0.5)Bridge TensorFlow to run on Intel nGraph backends (v0.5)
Bridge TensorFlow to run on Intel nGraph backends (v0.5)
 
Python and Ruby implementations compared by the error density
Python and Ruby implementations compared by the error densityPython and Ruby implementations compared by the error density
Python and Ruby implementations compared by the error density
 
Checking the Open-Source Multi Theft Auto Game
Checking the Open-Source Multi Theft Auto GameChecking the Open-Source Multi Theft Auto Game
Checking the Open-Source Multi Theft Auto Game
 

Similar to Awk Introduction

COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
cargillfilberto
 
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
drandy1
 
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
monicafrancis71118
 
Objectives Assignment 09 Applications of Stacks COS.docx
Objectives Assignment 09 Applications of Stacks COS.docxObjectives Assignment 09 Applications of Stacks COS.docx
Objectives Assignment 09 Applications of Stacks COS.docx
dunhamadell
 
Assignment1 B 0
Assignment1 B 0Assignment1 B 0
Assignment1 B 0
Mahmoud
 

Similar to Awk Introduction (20)

awk_intro.ppt
awk_intro.pptawk_intro.ppt
awk_intro.ppt
 
Awk programming
Awk programming Awk programming
Awk programming
 
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
 
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
 
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docxCOMM 166 Final Research Proposal GuidelinesThe proposal should.docx
COMM 166 Final Research Proposal GuidelinesThe proposal should.docx
 
Unix day4 v1.3
Unix day4 v1.3Unix day4 v1.3
Unix day4 v1.3
 
Unix Tutorial
Unix TutorialUnix Tutorial
Unix Tutorial
 
Python Programming Basics for begginners
Python Programming Basics for begginnersPython Programming Basics for begginners
Python Programming Basics for begginners
 
Nzitf Velociraptor Workshop
Nzitf Velociraptor WorkshopNzitf Velociraptor Workshop
Nzitf Velociraptor Workshop
 
Swift, swiftly
Swift, swiftlySwift, swiftly
Swift, swiftly
 
Oct.22nd.Presentation.Final
Oct.22nd.Presentation.FinalOct.22nd.Presentation.Final
Oct.22nd.Presentation.Final
 
C tour Unix
C tour UnixC tour Unix
C tour Unix
 
C++ programming
C++ programmingC++ programming
C++ programming
 
Objectives Assignment 09 Applications of Stacks COS.docx
Objectives Assignment 09 Applications of Stacks COS.docxObjectives Assignment 09 Applications of Stacks COS.docx
Objectives Assignment 09 Applications of Stacks COS.docx
 
Assignment1 B 0
Assignment1 B 0Assignment1 B 0
Assignment1 B 0
 
C++ programming
C++ programmingC++ programming
C++ programming
 
Linux intro 3 grep + Unix piping
Linux intro 3 grep + Unix pipingLinux intro 3 grep + Unix piping
Linux intro 3 grep + Unix piping
 
Linux class 15 26 oct 2021
Linux class 15   26 oct 2021Linux class 15   26 oct 2021
Linux class 15 26 oct 2021
 
Unix interview questions
Unix interview questionsUnix interview questions
Unix interview questions
 
The Ring programming language version 1.8 book - Part 45 of 202
The Ring programming language version 1.8 book - Part 45 of 202The Ring programming language version 1.8 book - Part 45 of 202
The Ring programming language version 1.8 book - Part 45 of 202
 

Recently uploaded

The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 

Recently uploaded (20)

W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 

Awk Introduction

  • 1. Colloquium - awk v1.0 A. Magee April 4, 2010 1 / 19 Colloquium - awk, v1.0 A. Magee
  • 2. Outline 1 Introduction What does awk offer? When should I use awk? 2 Learning by example Sample File Polling a Field Doing a Little Math 2 / 19 Colloquium - awk, v1.0 A. Magee
  • 3. Outline 1 Introduction What does awk offer? When should I use awk? 2 Learning by example Sample File Polling a Field Doing a Little Math 2 / 19 Colloquium - awk, v1.0 A. Magee
  • 4. Introduction What? What does awk offer? awk is a text processor that works well on database types of files. It operates on a file or stream of characters where a newline character terminates a line. It works best on files with unique text item delimiters like whitespace, comma, colon, etc. It can operate on specific lines that you describe. It can make programatic text manipulation quick and painless. 3 / 19 Colloquium - awk, v1.0 A. Magee
  • 5. Introduction What? What does awk offer? awk is a text processor that works well on database types of files. It operates on a file or stream of characters where a newline character terminates a line. It works best on files with unique text item delimiters like whitespace, comma, colon, etc. It can operate on specific lines that you describe. It can make programatic text manipulation quick and painless. 3 / 19 Colloquium - awk, v1.0 A. Magee
  • 6. Introduction What? What does awk offer? awk is a text processor that works well on database types of files. It operates on a file or stream of characters where a newline character terminates a line. It works best on files with unique text item delimiters like whitespace, comma, colon, etc. It can operate on specific lines that you describe. It can make programatic text manipulation quick and painless. 3 / 19 Colloquium - awk, v1.0 A. Magee
  • 7. Introduction When? When should I use awk? For parsing well structured data. For editing a file at precisely defined places. When you are too lazy (or smart) to open a WYSIWYG editor. 4 / 19 Colloquium - awk, v1.0 A. Magee
  • 8. Introduction When? When should I use awk? For parsing well structured data. For editing a file at precisely defined places. When you are too lazy (or smart) to open a WYSIWYG editor. 4 / 19 Colloquium - awk, v1.0 A. Magee
  • 9. Introduction When? When should I use awk? For parsing well structured data. For editing a file at precisely defined places. When you are too lazy (or smart) to open a WYSIWYG editor. 4 / 19 Colloquium - awk, v1.0 A. Magee
  • 10. Examples Sample File A sample file Here’s a short file from an ls listing that we can play with, let’s call it sample.txt. drwxr-xr-x 22 root root 4096 2010-02-15 12:59 . drwxr-xr-x 22 root root 4096 2010-02-15 12:59 .. drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin drwxr-xr-x 3 root root 4096 2010-02-27 19:27 boot lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom drwxr-xr-x 14 root root 3200 2010-01-17 11:45 dev drwxr-xr-x 85 root root 12288 2010-04-04 22:16 etc lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob 5 / 19 Colloquium - awk, v1.0 A. Magee
  • 11. Examples Sample File Another sample file Here’s a short file from a database that we can play with, let’s call it sample2.txt. psmith01 CLASS2B YEAR2 1 N ADVANCED STAFF 1 Y Y smehta CLASS3G LOCAL 1 Y STANDARD PUPIL 2.1 N Y mrsjohns SNHOJ UNRESTRICTED -1 Y ADVANCED STAFF 2 Y N psmith02 CLASS4D UKSCHOOLS 0 N ADVANCED STAFF 10 Y Y scohen CLASS3G LOCAL 2 Y STANDARD PUPIL 1 N N swright CLASS1J YEAR1 1 N STANDARD PUPIL 1 N Y amarkov CLASS4E UKSCHOOLS 3 Y STANDARD PUPIL 1 N N 6 / 19 Colloquium - awk, v1.0 A. Magee
  • 12. Examples Polling Example 1 > awk ’{print NF}’ sample.txt 8 8 8 8 10 8 8 10 Each line awk processes in called a record. As with many commands we generally want to wrap our expression with quotes. {...}: A command group. NF: The number of fields in the record. 7 / 19 Colloquium - awk, v1.0 A. Magee
  • 13. Examples Polling Example 1 > awk ’{print NF}’ sample.txt 8 8 8 8 10 8 8 10 Each line awk processes in called a record. As with many commands we generally want to wrap our expression with quotes. {...}: A command group. NF: The number of fields in the record. 7 / 19 Colloquium - awk, v1.0 A. Magee
  • 14. Examples Polling Example 1 > awk ’{print NF}’ sample.txt 8 8 8 8 10 8 8 10 Each line awk processes in called a record. As with many commands we generally want to wrap our expression with quotes. {...}: A command group. NF: The number of fields in the record. 7 / 19 Colloquium - awk, v1.0 A. Magee
  • 15. Examples Polling Example 2 > awk ’/ˆl/ {print $NF}’ sample.txt media/cdrom /usr/bob /.../: This matches any line containing the regex. In this case we match any line that starts with the letter l. {...}: A command group. $NF: The last field of the line. This command prints all the destinations of the symbolic links from the listing. What’s another way to get the same results? 8 / 19 Colloquium - awk, v1.0 A. Magee
  • 16. Examples Polling Example 2 > awk ’/ˆl/ {print $NF}’ sample.txt media/cdrom /usr/bob /.../: This matches any line containing the regex. In this case we match any line that starts with the letter l. {...}: A command group. $NF: The last field of the line. This command prints all the destinations of the symbolic links from the listing. What’s another way to get the same results? 8 / 19 Colloquium - awk, v1.0 A. Magee
  • 17. Examples Polling Example 2 > awk ’/ˆl/ {print $NF}’ sample.txt media/cdrom /usr/bob /.../: This matches any line containing the regex. In this case we match any line that starts with the letter l. {...}: A command group. $NF: The last field of the line. This command prints all the destinations of the symbolic links from the listing. What’s another way to get the same results? 8 / 19 Colloquium - awk, v1.0 A. Magee
  • 18. Examples, Polling: Example 3
      > awk '{print NR,$0}' sample.txt
      1 drwxr-xr-x 22 root root 4096 2010-02-15 12:59 .
      2 drwxr-xr-x 22 root root 4096 2010-02-15 12:59 ..
      3 drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin
      4 drwxr-xr-x 3 root root 4096 2010-02-27 19:27 boot
      5 lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
      6 drwxr-xr-x 14 root root 3200 2010-01-17 11:45 dev
      7 drwxr-xr-x 85 root root 12288 2010-04-04 22:16 etc
      8 lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob
      NR: The current record number.
      $0: The entire record (all of its fields).
      This simply prints each line preceded by its record number.
  • 19. Examples, Polling: Example 4
      > awk '{print $NR}' sample.txt
      Output (one value per line): drwxr-xr-x 22 root root 11 2010-01-17 22:16 home
      What does this silly command do? Could it be useful?
  • 20. Examples, Math: Example 5
      > awk -F, 'BEGIN {prod = 1} {prod *= $NR} END {print prod}' diag.dat
      24
      The file diag.dat contains a square upper-triangular matrix.
      The determinant of such a matrix is simply the product of its diagonal entries.
      $NR selects field number NR of record NR, i.e. the diagonal entry of each row.
      prod must be initialized to 1; an uninitialized variable is treated as 0 in arithmetic.
      Initializations are done in the BEGIN {...} block.
      The END keyword marks the commands that run after all records are processed.
      -F: Sets the field delimiter (here a comma).
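The contents of diag.dat are not shown in the deck, so the sketch below uses a hypothetical 4x4 comma-separated upper-triangular matrix whose diagonal entries happen to multiply to 24. It only illustrates how `$NR` walks down the diagonal.

```shell
# Hypothetical stand-in for diag.dat (the real file is not shown in the slides).
matrix='1,5,6,7
0,2,8,9
0,0,3,4
0,0,0,4'

# On record NR, $NR is the NR-th field, i.e. the diagonal entry of that row.
det=$(printf '%s\n' "$matrix" |
  awk -F, 'BEGIN {prod = 1} {prod *= $NR} END {print prod}')
echo "$det"
```

Dropping the `BEGIN {prod = 1}` block would leave `prod` at 0, so the product would stay 0 for any input.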
  • 25. Examples, Math: Non-explicit Details
      > awk '{sum += $5; print $5} END {print "total: "sum}' sample.txt
      Final line of output: total: 31905
      Variables do not need to be declared; an unset variable is empty (0 when used as a number).
      This C-like syntax prints and sums the fifth field of each record.
      Commands inside {...} are separated by semicolons (;).
      The general structure is BEGIN {...} pattern {...} pattern {...} ... END {...}
      Variables are not strongly typed. A variable acts as a string or a number depending on how you operate on it.
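A minimal sketch of the general structure and of the loose typing described on this slide, using three lines copied from the sample listing. The labels printed and the variable names (`dirs`, `never_set`, and so on) are made up for illustration.

```shell
listing='drwxr-xr-x 2 root root 4096 2010-02-27 19:25 bin
lrwxrwxrwx 1 root root 11 2008-03-08 08:56 cdrom -> media/cdrom
lrwxrwxrwx 1 root root 22 2010-02-10 12:09 home -> /usr/bob'

# BEGIN runs once before input, END once after; rules in between run per record.
report=$(printf '%s\n' "$listing" | awk '
BEGIN { print "size report" }
/^d/  { dirs++ }                 # only records starting with d
      { sum += $5 }              # no pattern: every record
END   { print "total: " sum; print "dirs: " (dirs + 0) }')
echo "$report"

# The same variable behaves as a number or a string depending on the operator;
# a variable that was never assigned acts as 0 and as the empty string.
typed=$(awk 'BEGIN { x = "3"; print x + 4, x "4", never_set + 0, length(never_set) }')
echo "$typed"
```

In the second command, `x + 4` is arithmetic while `x "4"` is string concatenation.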
  • 26. Examples, Math: Examples 6 & 7
      > awk '{sum += $8} END {print sum/NR}' sample2.txt
      2.2625
      This is not correct! (Compute by hand to verify.)
      Examine the file carefully to understand why.
      > awk '!/^#/ {sum += $8; cnt++} END {print sum/cnt}' sample2.txt
      2.58571
      Here the problem is resolved by counting only the lines that were matched.
      Notice that lines starting with a # are excluded.
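The pitfall can be reproduced on a tiny input. The data below is hypothetical (sample2.txt is not reproduced here): one comment line followed by two data lines, with the value in the second field rather than the eighth.

```shell
# Hypothetical data: a comment header plus two scores.
data='# user score
alice 2
bob 4'

# Naive average: the header adds 0 to sum but still counts in NR.
naive=$(printf '%s\n' "$data" | awk '{sum += $2} END {print sum/NR}')
# Corrected average: skip comment lines and count only what was summed.
fixed=$(printf '%s\n' "$data" | awk '!/^#/ {sum += $2; cnt++} END {print sum/cnt}')

echo "naive=$naive fixed=$fixed"
```

The naive version divides 6 by 3 records; the corrected version divides 6 by the 2 records that actually carry data.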
  • 28. Examples, Math: Example 8
      Recall the sed addressing model x~y.
      > awk '(1+NR)%3 == 0 {print $0}' sample2.txt
      psmith01 CLASS2B YEAR2 1 N ADVANCED STAFF 1 Y Y
      psmith02 CLASS4D UKSCHOOLS 0 N ADVANCED STAFFE 10 Y Y
      amarkov CLASS4E UKSCHOOLS 3 Y STANDARD PUPIL 1 N N
      NB: NR starts at 1, so this selects records 2, 5, 8, ...; in sed terms, x is 2 and y is 3.
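A quick way to check which records the modulo test selects is to print NR next to each match. The eight placeholder records `r1` to `r8` below are hypothetical.

```shell
# Eight placeholder records; print the record number alongside each match.
picked=$(printf 'r1\nr2\nr3\nr4\nr5\nr6\nr7\nr8\n' |
  awk '(1+NR)%3 == 0 {print NR, $0}')
echo "$picked"
```

The first record has NR equal to 1, so the matches are records 2, 5 and 8. With GNU sed, the address `2~3` should select the same lines, though `first~step` is a GNU extension and may not exist in other sed implementations.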
  • 29. Appendix (outline): 3 Appendix, Tons of Control
  • 30. Appendix, Tons of Control: More Built-Ins
      FILENAME - Input file name.
      FS - The field separator.
      RS - The record separator (default is newline).
      OFS - Output field separator.
      ORS - Output record separator.
      OFMT - Output format for numbers.
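A short sketch of a few of these built-ins in use, on hypothetical colon-separated input. It converts the delimiter with FS and OFS and shows OFMT affecting how a non-integer number is printed.

```shell
# FS splits the input on colons; OFS joins the output with commas.
# Assigning $1 = $1 forces awk to rebuild the record using OFS.
converted=$(printf 'a:b:c\nd:e:f\n' |
  awk 'BEGIN {FS = ":"; OFS = ","} {$1 = $1; print}')
echo "$converted"

# OFMT controls how print formats non-integer numbers.
rounded=$(awk 'BEGIN {OFMT = "%.2f"; print 3.14159}')
echo "$rounded"
```

Without the `$1 = $1` assignment, `print` would emit the record unchanged, colons included.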
  • 31. Appendix, Tons of Control: Math Functions
      Relationals: <, <=, !=, ==, >=, >
      Operators: +, -, *, /, ^, %
      Also pre- and post-increment and decrement: ++, --
      Assignment: =, +=, -=, *=, /=, %=
      Many other math operations: sqrt(), log(), exp(), int(), etc.
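The operators and functions listed here can be tried without any input file by putting everything in a BEGIN block. The values are arbitrary and chosen so the results are whole numbers.

```shell
# Exponent, modulo, truncation, a few math functions, and post/pre-increment.
math=$(awk 'BEGIN {
  i = 5; j = i++; k = ++i      # j gets 5 (old value), k gets 7
  print 2^10, 17 % 5, int(7 / 2), sqrt(16), exp(0), log(1), j, k
}')
echo "$math"
```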
  • 32. Appendix, Tons of Control: String Functions
      substr(string, begin, length)
      split(string, array, separator)
      index(string, substring)
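As an illustration of the three functions, the sketch below takes apart a date string in the same format as the sample listing. String positions in awk are counted from 1, and `split` returns the number of pieces it produced.

```shell
strings=$(awk 'BEGIN {
  s = "2010-04-04"
  n = split(s, parts, "-")     # parts[1]="2010", parts[2]="04", parts[3]="04"
  # year via substr, piece count, month via split, position of the first dash
  print substr(s, 1, 4), n, parts[2], index(s, "-")
}')
echo "$strings"
```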
  • 33. Appendix, Tons of Control: Control Structures
      if ... else
      while
      for
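The control structures follow C syntax. A minimal sketch that uses all three in one BEGIN block, with arbitrary loop bounds:

```shell
control=$(awk 'BEGIN {
  for (i = 1; i <= 3; i++) {
    if (i % 2 == 0) {
      print i, "even"
    } else {
      print i, "odd"
    }
  }
  n = 3
  while (n > 0) {
    printf "%d ", n            # printf does not append a newline
    n--
  }
  print "liftoff"
}')
echo "$control"
```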