University of North Texas
The sed Stream Editor
• sed is a non-interactive, line-oriented stream editor that
processes one line at a time
– Useful in text processing and especially performing in-place
substitution
– sed can make global substitutions of matched regex patterns with
specific text
• Example
– How do you change all occurrences of the word "the" or "The" to uppercase
"THE" in a file called file1?
sed -r "s/(The|the)/THE/g" file1
• Usage:
sed -r "s/REGEX/TEXT/g" filename
– Substitutes (replaces) occurrence(s) of REGEX with the given TEXT
– If filename is omitted, reads from standard input
– sed has other uses, but most can be emulated with substitutions
– Output is written to the terminal (standard output)
• For permanent changes, redirect the output to a new file or
make the changes in place using the -i option
• Example
– Replaces all occurrences of 143 with 390 in file2.txt
sed -r "s/143/390/g" file2.txt
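The two ways of keeping the result can be sketched as follows. This is a minimal illustration assuming GNU sed; the scratch directory, the sample lines, and the .bak backup suffix are made up for the example.

```shell
# Work in a scratch directory so no real file is modified.
dir=$(mktemp -d)
printf 'room 143\ncall 143-143\n' > "$dir/file2.txt"

# 1) Redirect the edited stream to a new file; file2.txt stays untouched.
sed -r "s/143/390/g" "$dir/file2.txt" > "$dir/file2.new"

# 2) Edit in place; the .bak suffix (our choice) keeps a backup of the original.
sed -r -i.bak "s/143/390/g" "$dir/file2.txt"

new_copy=$(cat "$dir/file2.new")
in_place=$(cat "$dir/file2.txt")
backup=$(cat "$dir/file2.txt.bak")
printf '%s\n' "$in_place"
```

Redirecting to the same file that sed is reading (sed ... file > file) would truncate the input before it is read, which is why a new file name or -i is needed.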
• sed is line-oriented; processes input a line at a time
-r option enables extended regular expressions
– ( ) , | , + , ? and { } act as operators without backslash escapes
s for substitute
g flag after last / asks for a global match (replace all matches on the line)
• Special characters must be escaped to match them literally
sed -r "s/http:\/\//https:\/\//g" urls.txt
• sed can use delimiters besides / to make more readable
sed -r "s#http://#https://#g" urls.txt
• Example
sed -r "s/([A-Za-z]+), ([A-Za-z]+)/\2 \1/g" names.txt
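The delimiter choice and the back-references above can be tried on single lines of input. This is a small sketch assuming GNU sed; the URL and the name are invented sample data.

```shell
# The same substitution written with two different delimiters.
url_slash=$(printf 'http://example.org\n' | sed -r "s/http:\/\//https:\/\//g")
url_hash=$(printf 'http://example.org\n' | sed -r "s#http://#https://#g")

# \1 and \2 refer to the first and second parenthesized groups,
# so "Last, First" is rewritten as "First Last".
swapped=$(printf 'Lovelace, Ada\n' | sed -r "s/([A-Za-z]+), ([A-Za-z]+)/\2 \1/g")

printf '%s\n%s\n%s\n' "$url_slash" "$url_hash" "$swapped"
```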
sed Usage
• Edit files too large for interactive editing
• Edit any size files where editing sequence is too complicated to
type in interactive mode
• Perform “multiple global” editing functions efficiently in one
pass through the input
• Edit multiple files automatically
• Good tool for writing conversion programs
sed Syntax
sed [-n] [-e] ['command'] [file…]
sed [-n] [-f script] [file…]
• Options
-n            only print lines specified with the print command (or the 'p'
              flag of the substitute ('s') command)
-f script     next argument is a filename containing editing commands
              If the first line of the script is "#n", sed acts as if -n had
              been specified
-e command    next argument is an editing command rather than a filename;
              useful if multiple commands are specified
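The three options can be compared side by side. This is a minimal sketch assuming GNU sed; words.txt and edit.sed are hypothetical file names created only for the example.

```shell
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$dir/words.txt"

# -n suppresses the automatic output; only the p flag prints,
# so just the line that was changed appears.
only_changed=$(sed -n 's/beta/BETA/p' "$dir/words.txt")

# Two -e commands are applied, in order, to every line.
two_cmds=$(sed -e 's/alpha/A/' -e 's/gamma/G/' "$dir/words.txt")

# The same two commands stored in a script file and loaded with -f.
printf 's/alpha/A/\ns/gamma/G/\n' > "$dir/edit.sed"
from_file=$(sed -f "$dir/edit.sed" "$dir/words.txt")

printf '%s\n' "$only_changed" "$two_cmds"
```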
How Does sed Work?
• sed reads line of input
– Line of input is copied into a temporary buffer called pattern space
– Editing commands are applied
• Subsequent commands are applied to line in the pattern space,
not the original input line
• Once finished, line is sent to output (unless -n option was used)
– Line is removed from pattern space
• sed reads next line of input, until end of file
• Note that input file is unchanged!
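A short way to observe the pattern space at work is to chain two substitutions. This is a sketch with an invented one-line file, pets.txt.

```shell
dir=$(mktemp -d)
printf 'cat\n' > "$dir/pets.txt"

# The second command sees "dog" (the result of the first command),
# not the original input line "cat".
chained=$(sed -e 's/cat/dog/' -e 's/dog/bird/' "$dir/pets.txt")

# The input file itself is not modified.
original=$(cat "$dir/pets.txt")

printf '%s\n%s\n' "$chained" "$original"
```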
sed Scripts
• A script is nothing more than a file of commands
• Each command consists of an address and an action, where
the address can be a regular expression or line number
[Figure: a script is a list of commands, each made up of an address and an action]
• As each line of the input file is read, sed reads the first
command of the script and checks the address against the
current input line:
– If there is a match, command executed
– If there is no match, command ignored
– sed then repeats this action for every command in script file
• When it has reached the end of the script, sed outputs the
current line (pattern space) unless the -n option has been set
Flow of Control
• sed then reads the next line in the input file and restarts from
the beginning of the script file
• All commands in the script file are compared to, and
potentially act on, all lines in the input file
[Figure: each input line passes through cmd 1, cmd 2, ..., cmd n of the script; the line is written to output automatically only without -n, otherwise only by a print cmd]
sed Commands
• sed commands have the general form
[address[, address]][!]command [arguments]
• sed copies each input line into a pattern space
– If address of the command matches line in pattern space, command is
applied to that line
– If command has no address, it is applied to each line as it enters
pattern space
– If a command changes the line in pattern space, subsequent
commands operate on the modified line
• When all commands have been applied, the line in pattern space
is written to standard output and a new line is read into
pattern space
Addressing
• Address determines which lines in the input file are to be
processed by the command(s)
– Either a line number or a pattern, enclosed in slashes / … /
– If no address is specified, then command is applied to each
input line
• Most commands will accept two addresses
– If only one address is given, command operates only on that line
– If two comma separated addresses are given, then command operates
on a range of lines between the first and second address, inclusively
• The ! operator can be used to negate an address
– Command applied to all lines that do NOT match address
Commands
• Command is a single letter
• Example:
– Deletion: d
[address1][,address2]d
• Delete the addressed line(s) from the pattern space
– Line(s) not passed to standard output
• A new line of input is read and editing resumes with the first
command of the script
Delete Address-Command Examples
d                  deletes all lines
6d                 deletes line 6
/^$/d              deletes all blank lines
1,10d              deletes lines 1 through 10
1,/^$/d            deletes from line 1 through the first blank line
/^$/,$d            deletes from first blank line through last line of file
/^$/,10d           deletes from first blank line through line 10
/^ya*y/,/[0-9]$/d  deletes from first line that begins with yy, yay, yaay,
                   yaaay, etc. through first line that ends with a digit
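A few of the address forms in the table can be checked against a small file. This is a sketch only; lines.txt and its five lines (the third one blank) are invented for the example.

```shell
dir=$(mktemp -d)
printf 'one\ntwo\n\nfour\nfive\n' > "$dir/lines.txt"

no_blank=$(sed '/^$/d' "$dir/lines.txt")       # pattern address: every blank line
no_second=$(sed '2d' "$dir/lines.txt")         # line-number address: line 2 only
header_gone=$(sed '1,/^$/d' "$dir/lines.txt")  # range: line 1 through first blank line
tail_gone=$(sed '/^$/,$d' "$dir/lines.txt")    # range: first blank line through end of file

printf '%s\n' "$header_gone"
```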
Delete Command (d) Example
• Remove Part-time data from “tuition.data” file
Input data:
cat tuition.data
Part-time 1003.99
Two-thirds-time 1506.49
Full-time 2012.29
Output after applying delete command:
sed -e '/^Part-time/d' tuition.data
Two-thirds-time 1506.49
Full-time 2012.29
Multiple Commands
• Braces { } used to apply multiple commands to an address
[address][,address]{
command1
command2
command3
}
• The opening brace { must be the last character on a line
• The closing brace } must be on a line by itself
– No spaces following the braces
• Alternatively, use “;” after each command:
[address][,address]{command1; command2; command3; }
• Without braces, the address applies only to the first command:
'[address][,address]command1; command2; command3'
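The difference between the braced and unbraced forms shows up on a three-line file. This is a sketch assuming GNU sed, which accepts the one-line { ...; } form; fruit.txt is invented sample data.

```shell
dir=$(mktemp -d)
printf 'red apple\ngreen apple\nred pear\n' > "$dir/fruit.txt"

# With braces, both substitutions are restricted to lines matching /red/.
grouped=$(sed '/red/{s/red/RED/; s/apple/APPLE/; }' "$dir/fruit.txt")

# Without braces, the address covers only the first command,
# so the second substitution runs on every line.
ungrouped=$(sed '/red/s/red/RED/; s/apple/APPLE/' "$dir/fruit.txt")

printf '%s\n---\n%s\n' "$grouped" "$ungrouped"
```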
sed Commands
• sed contains many editing commands, though only a few are
mentioned here
s substitute
a append
i insert
c change
d delete
p print
r read
w write
y transform
= display line number
N append next line to current one
q quit
Print
• Print command (p) used to force pattern space to be output,
useful if -n option has been specified
• Syntax:
[address1[,address2]]p
– Note: if -n or #n option has not been specified, p will cause the line to
be output twice!
• Examples:
1,5p       will display lines 1 through 5
/^$/,$p    will display lines from first blank line through last line of file
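The doubling effect mentioned in the note is easy to reproduce. This is a minimal sketch using an invented three-line file, abc.txt.

```shell
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/abc.txt"

# Without -n, lines 1-2 are printed by p and again by the automatic output.
doubled=$(sed '1,2p' "$dir/abc.txt")

# With -n, only the lines selected by p are printed.
selected=$(sed -n '1,2p' "$dir/abc.txt")

printf '%s\n---\n%s\n' "$doubled" "$selected"
```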
Substitute
• Syntax:
[address(es)]s/pattern/replacement/[flags]
– pattern : search pattern
– replacement : replacement string for pattern
– flags : optionally any of the following
n a number from 1 to 512 indicating which occurrence of pattern
should be replaced
g global, replace all occurrences of pattern in pattern space
p print contents of pattern space
Substitute Examples
s/Puff Daddy/P. Diddy/
– Substitutes P. Diddy for the first occurrence of Puff Daddy in the
pattern space
s/Four/Five/2
– Substitutes Five for the second occurrence of Four in the pattern space
(i.e., each line)
s/paper/plastic/p
– Substitutes plastic for the first occurrence of paper and outputs
(prints) pattern space
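The effect of each flag can be seen by running the same pattern over one line. This is a sketch; the input strings are invented.

```shell
line='Four Four Four'

first=$(printf '%s\n' "$line" | sed 's/Four/Five/')    # no flag: first occurrence only
second=$(printf '%s\n' "$line" | sed 's/Four/Five/2')  # numeric flag: second occurrence only
every=$(printf '%s\n' "$line" | sed 's/Four/Five/g')   # g flag: all occurrences

# With -n, the p flag prints only lines where a substitution was made.
printed=$(printf 'paper bag\ncloth bag\n' | sed -n 's/paper/plastic/p')

printf '%s\n' "$first" "$second" "$every" "$printed"
```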
Replacement Patterns
• Substitute can use several special characters in the
replacement string
& replaced by entire string matched in regular expression for pattern
\n replaced by nth substring (or sub-expression) previously specified
using "\(" and "\)"
\ used to escape the ampersand (&) and the backslash (\)
Replacement Pattern Examples
"the UNIX operating system …"
s/.NI./wonderful &/
--> "the wonderful UNIX operating system …"
"unix is fun"
sed 's/\([[:alpha:]]\)\([^ \n]*\)/\2\1ay/g'
--> "nixuay siay unfay"
cat file3
first:second
one:two
sed 's/\(.*\):\(.*\)/\2:\1/' file3
second:first
two:one
Append, Insert, and Change
• Syntax for these commands is a little strange because they must
be specified on multiple lines
• Append
[address]a\
text
• Insert
[address]i\
text
• Change
[address(es)]c\
text
• Append (a) and Insert (i) are for single lines only, not a range
Append Command (a) Example
sed script to append dashed line after each input line:
cat tuition.append.sed
a\
--------------------------
Input data:
cat tuition.data
Part-time 1003.99
Two-thirds-time 1506.49
Full-time 2012.29
Output after applying the append command:
sed -f tuition.append.sed tuition.data
Part-time 1003.99
--------------------------
Two-thirds-time 1506.49
--------------------------
Full-time 2012.29
--------------------------
Insert Command (i) Example
sed script to insert “Tuition List” as report title before line 1:
cat tuition.insert.sed
1 i\
Tuition List
Input data:
cat tuition.data
Part-time 1003.99
Two-thirds-time 1506.49
Full-time 2012.29
Output after applying the insert command:
sed -f tuition.insert.sed tuition.data
Tuition List
Part-time 1003.99
Two-thirds-time 1506.49
Full-time 2012.29
Change Command (c) Example
sed script to change tuition cost from 1003.99 to 1100.00:
cat tuition.change.sed
1 c\
Part-time 1100.00
Input data:
cat tuition.data
Part-time 1003.99
Two-thirds-time 1506.49
Full-time 2012.29
Output after applying the change command:
sed -f tuition.change.sed tuition.data
Part-time 1100.00
Two-thirds-time 1506.49
Full-time 2012.29
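The three commands can also be combined in a single script. The sketch below assumes GNU sed; the script name tuition.sed and the use of $ (last line) as the append address are choices made for this example, not taken from the slides.

```shell
dir=$(mktemp -d)
printf 'Part-time 1003.99\nTwo-thirds-time 1506.49\nFull-time 2012.29\n' > "$dir/tuition.data"

# Each of a, i, c is followed by a backslash, with its text on the next line.
cat > "$dir/tuition.sed" <<'EOF'
1i\
Tuition List
1c\
Part-time 1100.00
$a\
--------------------------
EOF

report=$(sed -f "$dir/tuition.sed" "$dir/tuition.data")
printf '%s\n' "$report"
```

The insert must come before the change in the script: c replaces the line and starts the next cycle, so any command listed after it would not run for line 1.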
Complement (!) Operator
• If an address is followed by exclamation point (!), associated
command is applied to all lines that don’t match address or
address range
• Examples:
/black/!s/cow/horse/
substitute horse for cow on all lines except those that
contain black
"The brown cow" --> "The brown horse"
"The black cow" --> "The black cow"
1,5!d delete all lines except 1 through 5
• Print lines that do not contain “obsolete”
sed -n -e '/obsolete/!p' input-file
Read and Write File Commands
• Syntax: r filename
– Queue contents of filename to be read and inserted into output
stream at end of current cycle, or when next input line is read
• If filename cannot be read, it is treated as if it were an empty file,
without any error indication
• Syntax: w filename
– Write the pattern space to filename
– The filename will be created (or truncated) before the first
input line is read
– All w commands which refer to the same filename are output through
the same FILE stream
cat tmp
one two three
one three five
two four six
sed 'r tmp'
My first line of input      ---> no read until the first line is taken from the input
My first line of input
one two three
one three five
two four six
My next line
My next line^D
sed 'w tmp1'
hello 1
hello 1
hello 2
hello 2
hello 3
hello 3^D
cat tmp1
hello 1
hello 2
hello 3
Line Number
• Line number command (=) writes the current line number
before each matched/output line
• Examples:
sed -e '/Two-thirds-time/=' tuition.data
sed -e '/^[0-9][0-9]/=' inventory
sed '=' tmp1
1
hello 1
2
hello 2
3
hello 3
sed -n '=' tmp1
1
2
3
Transform
• Transform command (y) operates like tr, doing a one-to-one or
character-to-character replacement
– Accepts zero, one or two addresses
[address[,address]]y/abc/xyz/
– Every a within the specified address(es) is transformed to an x, b to y
and c to z
• Examples
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
– Changes all lower case characters on addressed line to upper case
sed -e '1,10y/abcd/wxyz/' datafile
– Must have same number of characters
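Both transforms can be tried on single lines of input. This is a small sketch; the input strings are invented.

```shell
# Map every lowercase letter to its uppercase counterpart.
upper=$(printf 'hello sed\n' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# One-to-one mapping: a->w, b->x, c->y, d->z; other characters are unchanged.
mapped=$(printf 'abcd cab\n' | sed 'y/abcd/wxyz/')

printf '%s\n%s\n' "$upper" "$mapped"
```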
Quit
• Syntax: [addr]q
– Quit (exit sed) when addr is encountered
• It takes at most a single line address
– Once a line matching the address is reached, script will be terminated
– Can be used to save time when you only want to process some portion
of the beginning of a file
• Example
– To print the first 100 lines of a file (like head)
sed '100q' filename
– sed will, by default, send the first 100 lines of filename to standard
output and then quit processing
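The same idea on a smaller scale, as a sketch with invented five-line input: the line that triggers q is still printed before sed exits.

```shell
# Quit after line 3; lines 4 and 5 are never processed.
first_three=$(printf '1\n2\n3\n4\n5\n' | sed '3q')
printf '%s\n' "$first_three"
```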
The gawk Programming Language
• A scripting language used for manipulating data and
generating reports
– Geared towards working with delimited fields on a line-by-line basis
• Summary of gawk operations
– Scans a file line-by-line
– Splits each input line into fields
– Compares each input line and fields to the specified pattern
– Performs the requested action(s) on lines matching the specified
pattern
Structure of a gawk Program
• A gawk program consists of:
– An optional BEGIN segment
• For processing to execute prior to reading input
– Pattern – Action pairs
• Processing for input data
• For each pattern matched, the corresponding action is taken
– An optional END segment
• Processing after end of input data
BEGIN {action}
pattern {action}
pattern {action}
.
.
.
pattern {action}
END {action}
Running a gawk Program
• There are several ways to run a gawk program
gawk 'program' input_file(s)
– Program and input files are provided as command-line
arguments
gawk 'program'
– Program is a command-line argument
– Input is taken from standard input
gawk -f program_file input_files
– Program is read from a file
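The three invocation styles can be compared on the same input. This is a sketch only: in.txt and second.awk are invented names, and the awk_cmd variable is a convenience added here so the example also runs where only another POSIX awk is installed.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

dir=$(mktemp -d)
printf 'a b\nc d\n' > "$dir/in.txt"
printf '{ print $2 }\n' > "$dir/second.awk"

from_args=$("$awk_cmd" '{ print $2 }' "$dir/in.txt")         # program and input file as arguments
from_stdin=$("$awk_cmd" '{ print $2 }' < "$dir/in.txt")      # input from standard input
from_file=$("$awk_cmd" -f "$dir/second.awk" "$dir/in.txt")   # program read from a file

printf '%s\n' "$from_args"
```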
Patterns and Actions
• Search a set of files for patterns
• Perform specified actions upon lines or fields that contain
instances of patterns
• Does not alter input files
• Process one input line at a time
• This is similar to sed
Pattern-Action Structure
• Every program statement has to have a pattern or an action or
both
• Default pattern is to match all lines
• Default action is to print current record
• Patterns are simply listed; actions are enclosed in { }
– Some actions can be similar to C code
• gawk scans a sequence of input lines, or records, one by one,
searching for lines that match the pattern
– Meaning of match depends on the pattern
Patterns
• Selector that determines whether action is to be executed
• Pattern can be:
– Special token BEGIN or END
– Regular expression (enclosed with / /)
– Relational or string match expression
– ! negates the match
– Arbitrary combination of the above using && and/or ||
• /UNT/ matches if the string “UNT” is in the record
• x > 0 matches if the condition is true
• /UNT/ && (name == "UNIX Tools")
BEGIN and END Patterns
• BEGIN and END provide a way to gain control before and after
processing, for initialization and wrap-up
BEGIN
– Actions are performed before first input line is read
END
– Actions are done after the last input line has been processed.
Actions
• Action may include a list of one or more C like statements, as
well as arithmetic and string expressions and assignments and
multiple output streams
• Action performed on every line that matches pattern
– If pattern not provided, action performed on every input line
– If action not provided, all matching lines sent to standard
output
• Since patterns and actions are optional, actions must be
enclosed in braces to distinguish them from patterns
Introductory Example
ls | gawk '
BEGIN { print "List of html files:" }
/\.html$/ { print }
END { print "There you go!" }
'
List of html files:
index.html
as1.html
as2.html
There you go!
Variables
• gawk scripts can define and use variables
BEGIN { sum = 0 }
{ sum++ }
END { print sum }
• Some variables are predefined
Basic gawk Terminology
• gawk supports two types of buffers
– Field
• A unit of data in a line separated from other fields by the field
separator
– Record
• A collection of fields in a line (file made up of records)
• Default field separator is whitespace
• Namespace for fields in current record: $1, $2, etc.
– The $0 variable contains the entire record (i.e., line)
• Example
– Given line of input: "This class is fun!"
– $1 = "This", $2 = "class", etc.
Records
• Default record separator is newline
– By default, gawk processes its input one line at a time
• Could be any other regular expression
• RS: record separator
– Can be changed in BEGIN action
• NR is the variable whose value is the number of the current
record.
Fields
• Each input line is split into fields
– FS: field separator
• Default is whitespace (1 or more spaces or tabs)
– gawk -Fc option sets FS to the character c
• Can also be changed in BEGIN
– $0 is the entire line
– $1 is the first field, $2 is the second field, ….
• Only fields begin with $, variables do not
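Field splitting with the default separator, with -F, and with FS set in BEGIN can be sketched as follows. The input lines are invented, and awk_cmd is a convenience variable added here so the example falls back to another POSIX awk if gawk is missing.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

# Default FS: fields are separated by runs of spaces or tabs.
by_space=$(printf 'This class is fun!\n' | "$awk_cmd" '{ print NF, $1, $NF }')

# -F: sets the field separator to a colon.
by_flag=$(printf 'root:x:0\n' | "$awk_cmd" -F: '{ print $1, $3 }')

# The same separator assigned in a BEGIN action.
by_begin=$(printf 'root:x:0\n' | "$awk_cmd" 'BEGIN { FS = ":" } { print $1, $3 }')

printf '%s\n%s\n%s\n' "$by_space" "$by_flag" "$by_begin"
```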
Some gawk System Variables
• gawk supports a number of system variables
– FS          Field separator (default = space)
– RS          Record separator (default = \n)
– NF          Number of fields in current record
– NR          Number of the current record
– OFS         Output field separator (default = space)
– ORS         Output record separator (default = \n)
– FILENAME    Current filename
– ARGC/ARGV   Get arguments from command line
Simple Output from gawk
• Printing every line
– If action has no pattern, action is performed on all input lines
• { print } prints all input lines to standard out
• { print $0 } will do the same thing
• Printing certain fields
– Multiple items can be printed on the same output line with a single
print statement
– { print $1, $3 }
– Expressions separated by a comma are, by default, separated by a
single space when printed (OFS)
More Output from gawk
• NF, the Number of Fields
– Any valid expression can be used after a $ to indicate the contents of a
particular field
– One built-in expression is NF, or Number of Fields
– { print NF, $1, $NF } will print number of fields, first field, and
last field in the current record
– { print $(NF-2) } prints the third to last field
• Computing and printing
– You can also do computations on the field values and include the
results in your output
– { print $1, $2 * $3 }
• Printing line numbers
– The built-in variable NR can be used to print line numbers
– { print NR, $0 } prints each line prefixed with its line number
• Putting text in the output
– You can also add other text to the output besides what is in the
current record
– { print "total pay for", $1, "is", $2 * $3 }
– Note that the inserted text needs to be surrounded by double quotes
Formatted Output from gawk
• Lining up fields
– Like C, gawk has a printf function for producing formatted
output
– printf has the form printf( format, val1, val2, val3, … )
{ printf("total pay for %s is $%.2f\n", $1, $2 * $3) }
– When using printf, formatting is under your control so no automatic
spaces or newlines are provided by gawk
– You have to insert them yourself.
{ printf("%-8s %6.2f\n", $1, $2 * $3 ) }
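The column-aligning format can be tried on two sample records. This is a sketch: the names, rates and hours are invented, and awk_cmd is a convenience variable added here so the example also runs with another POSIX awk.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

# Fields: name, hourly rate, hours worked.
# %-8s left-justifies the name in 8 columns; %6.2f right-justifies the pay in 6.
pay_table=$(printf 'Beth 4.00 10\nKathy 15.50 10\n' |
  "$awk_cmd" '{ printf("%-8s %6.2f\n", $1, $2 * $3) }')

printf '%s\n' "$pay_table"
```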
Selection
• gawk patterns are good for selecting specific lines from the
input for further processing
– Selection by comparison
$2 >= 5 { print }
– Selection by computation
$2 * $3 > 50 { printf("%6.2f for %s\n", $2 * $3, $1) }
– Selection by text content
$1 == "UNT"
$2 ~ /UNT/
– Combinations of patterns
$2 >= 4 || $3 >= 20
– Selection by line number
NR >= 10 && NR <= 20
Arithmetic and Variables
• gawk variables take on numeric (floating point) or string
values according to context
• User-defined variables are unadorned (i.e., they do not need
to be declared)
• By default, user-defined variables are initialized to the null
string which has numerical value 0
Computing with gawk
• Counting is easy to do with gawk
$3 > 15 { emp = emp + 1 }
END { print emp, "employees worked more than 15 hrs" }
• Computing sums and averages is also simple
{ pay = pay + $2 * $3 }
END { print NR, "employees"
      print "total pay is", pay
      print "average pay is", pay/NR
}
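The counting and summing programs can be merged into one and run on a few records. This is a sketch: the employee data is invented, the merged program is one possible arrangement rather than the slides' own, and awk_cmd is a convenience variable for systems without gawk.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

# Fields: name, hourly rate, hours worked.
summary=$(printf 'Beth 4.00 10\nDan 3.75 20\nMary 5.50 20\n' | "$awk_cmd" '
$3 > 15 { emp = emp + 1 }          # count employees with more than 15 hours
        { pay = pay + $2 * $3 }    # accumulate total pay over every record
END     { print emp, "employees worked more than 15 hrs"
          print "total pay is", pay
          print "average pay is", pay/NR }')

printf '%s\n' "$summary"
```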
Handling Text
• One major advantage of gawk is its ability to handle strings as
easily as many languages handle numbers
• gawk variables can hold strings of characters as well as
numbers, and gawk conveniently translates back and forth as
needed
• This program finds the employee who is paid the most per
hour:
# Fields: employee, payrate
$2 > maxrate { maxrate = $2; maxemp = $1 }
END { print "highest hourly rate:",
maxrate, "for", maxemp }
String Manipulation
• String Concatenation
– New strings can be created by combining old ones
{ names = names $1 " " }
END { print names }
• Printing the Last Input Line
– Although NR retains its value after the last input line has been read,
$0 does not
{ last = $0 }
END { print last }
Built-In Functions
• gawk contains a number of built-in functions: length is one of
them
• Counting lines, words, and characters using length (similar to
wc)
{ nc = nc + length($0) + 1
nw = nw + NF
}
END { print NR, "lines,", nw, "words,", nc,
"characters" }
• substr(s, m, n) produces substring of s that begins at position
m and is at most n characters long
Control Flow Statements
• gawk provides several control flow statements for making
decisions and writing loops
• If-Then-Else
$2 > 6 { n = n + 1; pay = pay + $2 * $3 }
END { if (n > 0)
          print n, "employees, total pay is",
                pay, "average pay is", pay/n
      else
          print "no employees are paid more than $6/hour"
}
Loop Control
• While
# interest1 - compute compound interest
# input: amount, rate, years
# output: compound value at end of each year
{ i = 1
  while (i <= $3) {
      printf("\t%.2f\n", $1 * (1 + $2) ^ i)
      i = i + 1
  }
}
• Do-While
do {
    statement1
}
while (expression)
for Statements
• For
# interest2 - compute compound interest
# input: amount, rate, years
# output: compound value at end of each year
{ for (i = 1; i <= $3; i = i + 1)
      printf("\t%.2f\n", $1 * (1 + $2) ^ i)
}
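Running the for-loop version on one line of input shows the values it produces. This is a sketch: the amount, rate and number of years are invented, and awk_cmd is a convenience variable for systems without gawk.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

# Input fields: amount, rate, years. One tab-indented value is printed per year.
growth=$(printf '1000 0.05 2\n' | "$awk_cmd" '
{ for (i = 1; i <= $3; i = i + 1)
      printf("\t%.2f\n", $1 * (1 + $2) ^ i)
}')

printf '%s\n' "$growth"
```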
Arrays
• Array elements are not declared
• Array subscripts can have any value:
– Numbers
– Strings (associative arrays)
• Examples
arr[3]="value"
grade["Smith"]=40.3
Array Example
# reverse - print input in reverse order by line
{ line[NR] = $0 } # remember each line
END {
for (i=NR; (i > 0); i=i-1) {
print line[i]
}
}
• Use for loop to read associative array
for (v in array) { … }
– Assigns to v each subscript of array (unordered)
– Element is array[v]
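An associative array with the for (v in array) loop can be sketched as a word counter. The input is invented, awk_cmd is a convenience variable for systems without gawk, and the output is piped through sort because the loop visits subscripts in no particular order.

```shell
# Prefer gawk; fall back to any POSIX awk if gawk is not installed.
awk_cmd=$(command -v gawk || command -v awk)

# count[] is indexed by the string in field 1; elements spring into
# existence (with value 0) the first time they are referenced.
counts=$(printf 'b\na\nb\n' |
  "$awk_cmd" '{ count[$1]++ } END { for (w in count) print w, count[w] }' |
  sort)

printf '%s\n' "$counts"
```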
Operators
= assignment operator
– Sets a variable equal to a value or string
== equality operator
– Returns TRUE if both sides are equal
!= inverse equality operator
&& logical AND
|| logical OR
! logical NOT
<, >, <=, >= relational operators
+, -, /, *, %, ^ arithmetic operators
Built-In Functions
• Arithmetic
– sin, cos, atan2, exp, int, log, rand, sqrt
• String
– length, substr, split
• Output
– print, printf
• Special
– system - executes a Unix/Linux command
• system("clear") to clear the screen
• Note double quotes around the Unix command
– exit - stop reading input and go immediately to the END pattern-action
pair if it exists, otherwise exit the script
gawk Examples
• Records and fields
gawk '{print NR, $0}' emp1
• Space as field separator
gawk '{print NR, $1, $2, $5}' emp1
• Colon as field separator
gawk -F: '/Jones/{print $1, $2}' emp2
• Match input record
gawk -F: '/00$/' emp2
• Explicit match
gawk '$5 ~ /\.[7-9]+/' emp3
• Matching with regexes
gawk '$2 !~ /E/{print $1, $2}' emp3
gawk '/^[ns]/{print $1}' emp3
Week 7 - Assignment: Clarify Your Research Mindset
Instructions
This week, you were able to focus on applied projects in
healthcare administration and how
the results of these projects can effect change in healthcare
organizations and systems. As
you near the end of this course, it is time to re-evaluate your
research mindset and identify
any questions you may have moving forward. The DHA faculty
are here to support you in
your efforts as you imagine yourself at a higher degree!
For this assignment, you will reflect and re-examine your
research mindset and skills. Ask
yourself the following as you reflect:
Where were you when you started this course and where are you
now in terms of
your research mindset and skills?
What areas did you grow in? What areas do you feel are still
lacking and need further
guidance?
What questions do you have about research methodology and
design or your
potential project that your faculty mentor can help answer?
What area of interest do you think you will do your project on?
How can your project potentially make a difference? Why is it
important?
What is the contribution to an organization, the profession, or
practice?
Length: 2-3 pages minimum, not including title and reference
pages
References: Include a minimum of 3 scholarly resources
Your reflection paper should demonstrate thoughtful
consideration of the ideas and concepts
that are presented in the course and provide new thoughts and
insights relating directly to this
topic. Your response should reflect graduate-level writing and
APA standards.
CSCE 3600: Systems Programming
Minor Assignment 5 – sed and gawk
Due: Wednesday, November 27, 2019 at 11:59 PM
PROGRAM DESCRIPTION:
In this assignment, you will write sed and gawk commands to
accomplish certain
requested functionality. Given the many powerful features of
sed and gawk, you are
provided with a link to a tutorial for sed as well as gawk to
assist you in completing this
assignment.
Using sed
For help using sed, you may find the tutorial
http://www.grymoire.com/Unix/Sed.html or
the actual sed manual
https://www.gnu.org/software/sed/manual/sed.html useful.
a) The // character sequence is often known as C++ style or
single-line comments,
while the /* … */ character sequence is often known as C-style
or multi-line
comments. As an example, assume the following myprog.c file
that has a mix of
both C-style and C++ style comments:
// This is a test of how this works
#include <stdio.h>
int main()
{
// declare some variables here
int num1 = 4;
float num2 = 3.5;
// print the result
printf("The result is %f\n", num1 * num2); // this does it
/* does it work? */
return 0;
}
Write a one-line sed command that transforms all of the C++
style (i.e., single-line
comments) to C-style (i.e., multi-line comments) and also
upper-cases the comment
text so that after running the appropriate sed command, the
following would be
output to the terminal:
/* THIS IS A TEST OF HOW THIS WORKS */
#include <stdio.h>
int main()
{
/* DECLARE SOME VARIABLES HERE */
int num1 = 4;
float num2 = 3.5;
/* PRINT THE RESULT */
printf("The result is %f\n", num1 * num2); /* THIS DOES IT */
/* does it work? */
return 0;
}
Write a single sed script called minor5_1.sed.
sed -r -f minor5_1.sed myprog.c
b) Consider the following file called dates.txt containing some
birthdays formatted
as <dd> <mm> <yyyy>:
02 01 2002
27 03 2005
19 04 1999
30 11 2007
Write a single sed script called minor5_2.sed that does the
following:
1. Swaps the day and month entries of each input
In this file, for example, my sed script should print the
following:
$ sed -r -f minor5_2.sed dates.txt
01 02 2002
03 27 2005
04 19 1999
11 30 2007
This sed script file will be submitted to Canvas.
Using gawk
For help using gawk, you may find the tutorial
http://www.grymoire.com/Unix/Awk.html
or the gawk manual
https://www.gnu.org/software/gawk/manual/gawk.html useful.
a) Consider the following file called sides.txt:
x y z
1 2 3
2 2 3
4 3 5
-1 0 1
Each field in this file is separated by a tab and each record is
separated by a newline
character. gawk is a very powerful utility; it can make
calculations as well as support
branching statements and function calls. For this part, you will
write a gawk program
file where the program and input file are provided as command-line arguments to
compute if the supplied numbers on a line of input constitute a
valid set of side
lengths of a triangle. Recall that a set of values (x,y,z)
constitute the sides of a
triangle if no side is equal to or larger than the sum of the other
two sides. Also, no
side dimension should be zero or negative.
For example, with the first line (ignoring the header line) above,
containing values 1, 2 and
3, the output should be ‘NO’. For the second line above,
containing values 2, 2, and
3, the output should be ‘YES’.
The outputs when run with the above data should be:
NO
YES
YES
NO
Your gawk program should be invoked as:
gawk -f minor5_1.gawk sides.txt
b) Consider the following file called grades.txt:
Last     First   T1  T2  T3
-------  ----    --  --  --
Adams    Will    86  77  74
Bell     Adam    98  83
Evans    Kris    92  87  96
Jackson  Ben     68  75  82
Morrow   Claire  51  62  66
Pratt    Joseph  38
Smith    Tommy   99  83  94
Welsh    Jenny   62  43  54
Each line (i.e., record) contains the last name, then first name,
followed by one to
three test scores between 0 and 99 similar to what is shown in
the file above. A
newline character separates each record. Write a single gawk
program file called
minor5_2.gawk that will read the grades.txt file and calculate the
grades for each
student listed, where 89.5 or greater is an “A”, 79.5 or greater is
a “B”, etc. You will
print out the entire record and add their grade. If a student is
missing two or more
scores, however, he/she will automatically earn an “F”.
However, if the student is
missing at most one score, then the other scores should be used
to grade along with
a 0 for the missed score. You will also print out his/her entire
record, substituting “--”
in place of the missing field, along with their grade and a note
indicating that the
student has missing scores. Obviously, you need to skip the first
two lines that
contain header information.
In this file, for example, the gawk program should print the
following:
$ gawk -f minor5_2.gawk grades.txt
Adams    Will    86  77  74  => C
Bell     Adam    98  83  --  => D (missing score)
Evans    Kris    92  87  96  => A
Jackson  Ben     68  75  82  => C
Morrow   Claire  51  62  66  => D
Pratt    Joseph  38  --  --  => F (missing scores)
Smith    Tommy   99  83  94  => A
Welsh    Jenny   62  43  54  => F
Formatting properly in columns as shown is required. This gawk
program file will be
submitted to Canvas.
REQUIREMENTS:
Your sed script and gawk program files should include your
name and EUID at
the top of the file. No other comments are needed in these files.
files on our CSE
machines (e.g., cse01, cse02, …, cse06), to make sure that they
indeed work.
Your solution to the one-line sed script and gawk program can
be typed (or
copied and pasted) to this document and will be submitted to
Canvas.
works correctly on the
CSE machines (e.g., cse01, cse02, …, cse06), so you should
make sure that
your program runs on a CSE machine. Please include any
special instructions
required to run your sed script and gawk program.
programming assignment that must be
the sole work of the
individual student. Any instance of academic dishonesty will
result in a grade of
“F” for the course, along with a report filed into the Academic
Integrity Database.
SUBMISSION:
Electronically submit this file with your typed one-
line solutions for sed
and gawk along with your sed scripts minor5_1.sed,
minor5_2.sed and
gawk programs minor5_1.gawk and minor5_2.gawk to the
Minor 5 dropbox in
Canvas by the due date and time.