Using Regular Expressions in Grep
Dan Morrill, Highline Community College, April 2, 2013
You will need: superheroes.txt, a Linux computer, and grep (already on your Linux box).
Grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) for lines containing a match to the given PATTERN. By default, grep prints the matching lines. In addition, two variant programs egrep and fgrep are available. Egrep is the same as grep -E. Fgrep is the same as grep -F.
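As a minimal sketch of the description above, the following shell snippet searches a named file, then standard input, and then compares a plain pattern with -F and -E. The sample names and the temporary file are made up for illustration; the real superheroes.txt is not reproduced in this transcript.

```shell
# Hypothetical sample data (an assumption, not the real superheroes.txt).
sample=$(mktemp)
printf '%s\n' 'Batman' 'Catwoman' 'Spider-Man' 'Mr. Fantastic' 'Mrs Incredible' > "$sample"

# Search a named file for lines containing "man" (case sensitive).
from_file=$(grep 'man' "$sample")

# Search standard input instead: no file name is given to grep.
from_stdin=$(cat "$sample" | grep 'man')

# Default (basic regex): the dot matches any character, so "Mrs" matches too.
as_regex=$(grep 'Mr.' "$sample")

# -F (same as fgrep): the pattern is a fixed string, so the dot is literal.
as_fixed=$(grep -F 'Mr.' "$sample")

# -E (same as egrep): the pattern is an extended regular expression.
as_extended=$(grep -E '^(Bat|Cat)' "$sample")

printf '%s\n' "$from_file" "$as_regex" "$as_fixed" "$as_extended"
rm -f "$sample"
```

With this sample, the file and standard-input searches return the same two lines, and only -F restricts 'Mr.' to the name that really contains a period.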
Commonly used options:
-E (--extended-regexp): interpret the pattern as an extended regular expression
-G (--basic-regexp): interpret the pattern as a basic regular expression (this is the default)
-i: ignore case (important, because matching is case sensitive by default)
-n: print the line number each match was found on
-r (-R): recursive; search all the files under a directory for the pattern
-v: invert the match; show all lines that do not match the pattern
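The options listed above can be tried out with a short script such as the one below. It is only a sketch: the directory, file contents, and variable names are assumptions made up for illustration.

```shell
# Hypothetical sample data in a temporary directory (an assumption).
dir=$(mktemp -d)
printf '%s\n' 'Batman' 'Spider-Man' 'Black Widow' 'Catwoman' > "$dir/superheroes.txt"

# -i: ignore case, so "man" also matches "Man" in Spider-Man.
ignore_case=$(grep -i 'man' "$dir/superheroes.txt")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'Black' "$dir/superheroes.txt")

# -v: print only the lines that do NOT match.
inverted=$(grep -v -i 'man' "$dir/superheroes.txt")

# -r: search every file under the directory; matches are
# prefixed with the name of the file they were found in.
recursive=$(grep -r 'Widow' "$dir")

printf '%s\n' "$ignore_case" "$numbered" "$inverted" "$recursive"
rm -rf "$dir"
```

Here -n reports "3:Black Widow", and -v with -i leaves only "Black Widow", the one name without "man" in any case.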
Grep is designed to search for data in a file or in a list, such as the output of another command (for example, ps -ef | grep http). If you want to find a specific item in a text document, you can also use grep:
    grep -i man superheroes.txt
This finds every line in the file that contains "man", in upper or lower case, and prints it to the screen.
    grep -i black superheroes.txt
You should see a list of people who have the word black in their names.
    grep -i cat superheroes.txt
You should see a list of people who have the word cat in their names.
    grep -i spider superheroes.txt
You should see a list of people who have the word spider in their names.
    grep -v -i spider superheroes.txt
You should see a list of people without the word spider in their names.
Basic pattern elements:
. (period): match any single character
^: match the empty string at the beginning of the line
$: match the empty string at the end of the line
A: match an uppercase A
a: match a lowercase a
\d: match a digit (Perl-style syntax; with GNU grep this needs -P, otherwise use [0-9])
\D: match any non-digit character (Perl-style syntax; otherwise use [^0-9])
[A-E]: match any one uppercase letter A through E (A, B, C, D, E)
[^A-E]: match any one character other than uppercase A through E
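A small sketch of these elements in action follows. The names are hypothetical sample data, [0-9] stands in for the digit class so the patterns work with plain grep, and LC_ALL=C is set as a precaution so that the range [A-E] covers only the five uppercase letters.

```shell
# Use the C locale so character ranges behave predictably.
export LC_ALL=C

# Hypothetical sample data (an assumption).
sample=$(mktemp)
printf '%s\n' 'Batman' 'Catwoman' 'Agent 13' 'Elektra' 'Hulk' > "$sample"

# ^ and . : lines that start with any one character followed by "at".
any_then_at=$(grep '^.at' "$sample")      # Batman, Catwoman

# $ : lines that end with "k".
ends_with_k=$(grep 'k$' "$sample")        # Hulk

# [0-9] : lines that contain a digit.
has_digit=$(grep '[0-9]' "$sample")       # Agent 13

# [A-E] : lines whose first character is A, B, C, D, or E.
starts_a_to_e=$(grep '^[A-E]' "$sample")  # Batman, Catwoman, Agent 13, Elektra

# [^A-E] : lines whose first character is anything else.
starts_other=$(grep '^[^A-E]' "$sample")  # Hulk

printf '%s\n' "$any_then_at" "$ends_with_k" "$has_digit" "$starts_a_to_e" "$starts_other"
rm -f "$sample"
```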
Repetition and grouping (extended regular expressions):
X?: match zero or one occurrence of the capital letter X
X*: match zero or more capital X's
X+: match one or more capital X's
(abc|def)+: match one or more repetitions of abc or def; abc, def, and abcdef would all match
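To see how the repetition operators differ, the sketch below runs each one against the same short list of made-up test strings. The strings and variable names are assumptions, and each pattern is anchored with ^ and $ so that it has to match the whole line.

```shell
export LC_ALL=C

# Hypothetical test strings, one per line (an assumption).
strings=$(printf '%s\n' 'a' 'aX' 'aXX' 'abc' 'abcdef' 'abx')

# X? : "a" followed by zero or one X.
optional=$(printf '%s\n' "$strings" | grep -E '^aX?$')        # a, aX

# X* : "a" followed by zero or more X.
star=$(printf '%s\n' "$strings" | grep -E '^aX*$')            # a, aX, aXX

# X+ : "a" followed by one or more X.
plus=$(printf '%s\n' "$strings" | grep -E '^aX+$')            # aX, aXX

# (abc|def)+ : one or more repetitions of "abc" or "def".
group=$(printf '%s\n' "$strings" | grep -E '^(abc|def)+$')    # abc, abcdef

printf '%s\n' "$optional" "$star" "$plus" "$group"
```

Without the anchors, a pattern such as X* would match every line, because zero occurrences of X can be found anywhere.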
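    grep -E '^Bat' superheroes.txt
Matches names that start with Bat (note the capital B). What would I use to make it not case sensitive?
    grep -E '^(bat|Bat|cat|Cat)' superheroes.txt
Matches all names in the file that start with bat, Bat, cat, or Cat.
    grep -i -E '^(bat|cat)' superheroes.txt
Matches all names that start with bat or cat regardless of case (similar to the second example without so much typing).
    grep -i -E '[^b]at' superheroes.txt
Matches lines in which "at" is preceded by a character other than b, so "bat" on its own does not count. Because of -i, GNU grep also treats an uppercase B as excluded here. A line is still printed if any other "at" in it qualifies.
Use straight single quotes around each pattern so the shell passes it to grep unchanged.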
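The examples above can be checked against a small file. This is a sketch with hypothetical names standing in for superheroes.txt, and the last pattern is run without -i so that the effect of [^b] on a lowercase b alone is easy to see.

```shell
export LC_ALL=C

# Hypothetical sample data (an assumption, not the real superheroes.txt).
sample=$(mktemp)
printf '%s\n' 'Batman' 'batgirl' 'Catwoman' 'Black Cat' 'Wombat' > "$sample"

# Case-sensitive anchor: only names starting with a capital "Bat".
cap_bat=$(grep -E '^Bat' "$sample")                    # Batman

# Spelling out both cases by hand.
explicit=$(grep -E '^(bat|Bat|cat|Cat)' "$sample")     # Batman, batgirl, Catwoman

# Same result on this data with -i and a shorter pattern.
with_i=$(grep -i -E '^(bat|cat)' "$sample")            # Batman, batgirl, Catwoman

# "at" preceded by any character except a lowercase b.
not_b=$(grep -E '[^b]at' "$sample")                    # Batman, Catwoman, Black Cat

printf '%s\n' "$cap_bat" "$explicit" "$with_i" "$not_b"
rm -f "$sample"
```

"Black Cat" is matched by the last pattern through "Cat", while "batgirl" and "Wombat" are skipped because their only "at" follows a lowercase b.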