1. Introduction to Unix and OS
Module 3
grep and Regular Expression
Dr. Girisha G S
Dept. of CSE
SoE, DSU, Bengaluru
2. Basic Regular Expression
What is a regular expression?
- A regular expression is a pattern containing metacharacters, used to describe a complex
search/match
Classification of regular expressions
1. Basic Regular Expression (BRE)
2. Extended Regular Expression (ERE)
- The grep command supports
-> BRE by default
-> ERE with the -E option
- Basic regular expressions make it possible to match a group of similar patterns
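The BRE/ERE difference can be seen with one pattern run both ways. This is a minimal sketch, assuming a grep that treats + as an ordinary character in a BRE (the POSIX rule); the sample lines and the temporary file are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary file
demo=$(mktemp)
printf '%s\n' 'a+b' 'aab' 'ab' > "$demo"

# BRE (default): + is an ordinary character, so only the literal text a+b matches
bre_out=$(grep 'a+b' "$demo")

# ERE (-E): + means "one or more of the preceding a", followed by b
ere_out=$(grep -E 'a+b' "$demo")

echo "BRE matches:"
echo "$bre_out"
echo "ERE matches:"
echo "$ere_out"

rm -f "$demo"
```

The same pattern selects different lines, so it matters which RE set grep is told to use.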
3. The BRE character subset
Metacharacter   Matches
*               Zero or more occurrences of the preceding expression
                E.g. a* matches no 'a', a, aa, aaa, ...
.               Any single character
                E.g. .* matches any number of characters (or none)
[ ]             Any single character within the brackets
                E.g. [pqr] matches a single character p, q, or r
                     [1-4] matches a single digit from 1 to 4
^               Anchors the pattern to the beginning of a line
                E.g. ^The matches any line that begins with The
$               Anchors the pattern to the end of a line
                E.g. ab$ matches any line that ends with ab
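Each metacharacter in the table can be tried out with grep. The sketch below uses a handful of made-up lines in a temporary file; the file contents and variable names are assumptions for illustration only.

```shell
# Hypothetical sample lines, written to a temporary file
demo=$(mktemp)
printf '%s\n' 'The cat' 'See The end' 'cab' 'item 4' 'item 7' > "$demo"

begins=$(grep '^The' "$demo")    # lines that begin with The
ends=$(grep 'ab$' "$demo")       # lines that end with ab
digit=$(grep '[1-4]' "$demo")    # lines containing a digit from 1 to 4
dot=$(grep 'c.b' "$demo")        # c, then any one character, then b
star=$(grep -c 'a*' "$demo")     # a* can match zero a's, so every line is counted

echo "^The  : $begins"
echo "ab\$   : $ends"
echo "[1-4] : $digit"
echo "c.b   : $dot"
echo "a*    : $star lines"

rm -f "$demo"
```

Note that a* on its own selects every line, because zero occurrences of 'a' is also a match.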
4. The character class ( [ ] )
- Matches any one of the characters in the brackets
E.g. To match Agarwal or agrawal, the RE is
[Aa]g[ar][ra]wal
The *
Matches zero or more occurrences of the preceding expression
E.g. 1 g* matches zero or more occurrences of g
i.e. no 'g', g, gg, ggg, ...
E.g. 2 To match Agarwal, agrawal, or aggarwal, the RE is
[Aa]gg*[ar][ra]wal
The Dot(.)
- A dot (.) matches any single character
- .* signifies any number of characters, or none
E.g. Match all 4-character words beginning with 2
2...
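The three patterns on this slide can be checked against a word list. A minimal sketch, assuming one word per line in a temporary file (the list is made up); grep's -x option is used so that a pattern must match the whole line, which makes it easy to see exactly which words each RE accepts.

```shell
# Hypothetical word list, one word per line
words=$(mktemp)
printf '%s\n' Agarwal agrawal aggarwal gupta 2233 22 23456 > "$words"

# -x: the pattern must match the entire line
two=$(grep -x '[Aa]g[ar][ra]wal' "$words")      # character classes only
three=$(grep -x '[Aa]gg*[ar][ra]wal' "$words")  # g* also allows a second g
four=$(grep -x '2...' "$words")                 # 2 followed by exactly three characters

echo "[Aa]g[ar][ra]wal   :" $two
echo "[Aa]gg*[ar][ra]wal :" $three
echo "2...               :" $four

rm -f "$words"
```

Without -x, grep would also select longer lines that merely contain a match, such as 23456 for the pattern 2...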
5. Specifying pattern location (^ & $)
E.g. 1 Match lines starting with 2
^2
E.g. 2 Match lines ending with a salary between 7000 and 7999
7...$
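The two anchored patterns can be tried on a small employee file. This is a sketch under assumptions: the records below are invented, and the layout (fields separated by |, salary as the last field) is assumed, not taken from the slides. It also shows that 7...$ selects a line ending in 17000, so a tighter pattern is included for comparison.

```shell
# Hypothetical records: id|name|designation|department|salary
emp=$(mktemp)
printf '%s\n' \
  '2233|a.k. shukla|g.m.|sales|6000' \
  '2365|barun sengupta|director|personnel|7800' \
  '9876|jai sharma|director|production|7000' \
  '1006|chanchal singhvi|director|sales|17000' > "$emp"

starts=$(grep -c '^2' "$emp")                 # lines whose first character is 2
loose=$(grep -c '7...$' "$emp")               # lines ending in 7 plus any three characters
tight=$(grep -c '|7[0-9][0-9][0-9]$' "$emp")  # last field is exactly 7 plus three digits

echo "^2                  : $starts lines"
echo "7...\$               : $loose lines (includes 17000)"
echo "|7[0-9][0-9][0-9]\$  : $tight lines"

rm -f "$emp"
```

In a BRE the | is an ordinary character, so the last pattern matches the field separator literally.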
grep & RE
- Regular expressions can be used with the grep command
Example 1: search emp.lst for lines containing Agarwal or agrawal
$ grep '[aA]g[ar][ar]wal' emp.lst
Example 2: search emp.lst for lines containing Agarwal, agrawal, or aggarwal
$ grep '[aA]gg*[ar][ar]wal' emp.lst
Example 3: print all lines from fruitlist.txt that begin with the letter 'a', followed by any one
character, followed by the letter sequence 'ple'
$ grep '^a.ple' fruitlist.txt
Example 4: print all lines from sample.txt that end with the letter 'e'
$ grep 'e$' sample.txt
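Examples 3 and 4 can be reproduced with a throwaway file. A minimal sketch: the contents of fruitlist.txt are not given in the slides, so the lines below are assumptions, and both patterns are run on the same file for brevity.

```shell
# Hypothetical fruitlist.txt, created in a temporary directory
dir=$(mktemp -d)
printf '%s\n' apple pineapple maple 'ample supply' banana > "$dir/fruitlist.txt"

# Lines beginning with a, then any one character, then ple
begin=$(grep '^a.ple' "$dir/fruitlist.txt")

# Lines ending with the letter e
end=$(grep 'e$' "$dir/fruitlist.txt")

echo "^a.ple:"
echo "$begin"
echo "e\$:"
echo "$end"

rm -rf "$dir"
```

pineapple contains "apple" but is not selected by ^a.ple, because the match must start at the beginning of the line.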
6. Extended Regular Expression
- Extended REs make it possible to match dissimilar patterns with
a single expression
- This set uses additional metacharacters
- The grep command uses them with the -E option
The ERE set
+ Matches one or more occurrences of the preceding expression
E.g. a+ matches a, aa, aaa, ...
[ab]+ matches ab, abaab, ababab, ...
[a-z]+ matches any string of lowercase letters
? Matches zero or one occurrence of the preceding expression
E.g. ab?c matches either ac or abc; here b is optional
| Indicates alternation
E.g. (ab|cd) matches either ab or cd
( ) Groups a series of REs together into a new RE
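The ERE metacharacters above can be tested one at a time. A minimal sketch, assuming a made-up word list with one word per line; -x is combined with -E so that each pattern must match the whole word.

```shell
# Hypothetical word list, one word per line
words=$(mktemp)
printf '%s\n' ac abc abbc ab cd aaa abab > "$words"

opt=$(grep -E -x 'ab?c' "$words")       # b is optional: zero or one b
alt=$(grep -E -x '(ab|cd)' "$words")    # either ab or cd
plus=$(grep -E -x 'a+' "$words")        # one or more a's and nothing else
class=$(grep -E -x '[ab]+' "$words")    # any non-empty mix of a and b

echo "ab?c    :" $opt
echo "(ab|cd) :" $alt
echo "a+      :" $plus
echo "[ab]+   :" $class

rm -f "$words"
```

abbc is rejected by ab?c because ? allows at most one occurrence of b.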
7. grep and ERE
- Use grep's -E option to use ERE
Example 1: search emp.lst for lines containing Agarwal or aggarwal
$ grep -E '[aA]gg?arwal' emp.lst
Example 2: search emp.lst for lines containing sengupta or dasgupta
$ grep -E '(sen|das)gupta' emp.lst
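Both ERE examples can be run against a stand-in for emp.lst. This is a sketch under assumptions: the records and the |-separated layout are invented for illustration, since the slides do not show the file's contents.

```shell
# Hypothetical stand-in for emp.lst: id|name|designation|department|salary
emp=$(mktemp)
printf '%s\n' \
  '2476|anil aggarwal|manager|sales|5000' \
  '3564|sudhir Agarwal|executive|personnel|7500' \
  '0110|v.k. agrawal|g.m.|marketing|9000' \
  '2365|barun sengupta|director|personnel|7800' \
  '1265|s.n. dasgupta|manager|sales|5600' \
  '5423|n.k. gupta|chairman|admin|5400' > "$emp"

# g? makes the second g optional: Agarwal and aggarwal match, agrawal does not
agg=$(grep -E '[aA]gg?arwal' "$emp")

# Alternation inside a group: sengupta or dasgupta, but not plain gupta
gup=$(grep -E '(sen|das)gupta' "$emp")

echo "$agg"
echo "$gup"

rm -f "$emp"
```

The line for agrawal is not selected by the first pattern, which agrees with the slide listing only Agarwal and aggarwal for it.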
Editor's Notes
We use commands that filter data to select only the portion of data that we wish to view or operate on. They can be used to process information in powerful ways, such as restructuring output to generate useful reports, modifying text in files, and many other system administration tasks.
If an expression uses metacharacters, it is termed a regular expression.
The above RE matches two names
This pattern matches 3 patterns