09 string processing_with_regex copy
Unix / Linux Fundamentals

  • Discussion: the importance of regular expressions for even the most basic automated task; usage in programming languages and IT; the difference between matching and substitution
  • Draw: Explanation scheme
  • Exercise: use egrep with a regular expression to print all user records for users whose names begin with u or r
  • Exercise: use sed to view only the 'root' user record from /etc/passwd. Exercise: use sed to print the /etc/shadow file while removing the 'user' records
  • If time permits: show a short example of AWK (use Perl in the exercise for "fast" students). Exercise for home / later: explain the example

Presentation Transcript

  • String Processing with REGEX
  • Regular Expressions
      • A regular expression is a means of defining text patterns. It is widely used in automating computing tasks.
      • Regular expressions can be used in programming languages such as Perl, AWK and Tcl, and in Linux power tools such as sed, grep, awk, expr and vi.
      • There are two main uses of regular expressions:
          – Matching
          – Substitution
      Note: a regular expression is also referred to as a regexp or REGEX.
  • Regular Expressions
      • REGEX are different from the shell's meta-characters, even though they use similar characters. They should always be quoted to protect them from the shell.
      Note: REGEX syntax varies between commands, with some variants and additions. If something does not work, or if in doubt, consult the man page of that program.
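The quoting advice can be checked with a small experiment. This is a minimal sketch assuming a POSIX shell; the scratch directory and the file names `ua` and `ub` are made up for illustration.

```shell
# Create a scratch directory containing two files the glob u* can match.
tmpdir=$(mktemp -d)
touch "$tmpdir/ua" "$tmpdir/ub"

# Unquoted: the shell expands u* to the matching file names
# before the command ever sees the pattern.
unquoted=$(cd "$tmpdir" && echo u*)

# Quoted: the pattern reaches the command unchanged.
quoted=$(cd "$tmpdir" && echo 'u*')

echo "unquoted: $unquoted"   # -> unquoted: ua ub
echo "quoted:   $quoted"     # -> quoted:   u*

rm -rf "$tmpdir"
```

The same thing happens when the command is grep or sed: without quotes, the tool may receive a list of file names instead of the pattern.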
  • Regular Expressions
      • Below is a list of some common REGEX meta-characters and their meanings:
          . - matches any single character.
          [list] - matches any single character in the list.
          [range] - matches any single character in the range.
          [^range] - matches any single character not in the list or range.
          * - matches the previous character 0 or more times.
          {n} - matches the previous character exactly n times.
          {n,} - matches the previous character at least n times.
          {n,m} - matches the previous character between n and m times.
          ^ - anchors the match to the start of the line.
          $ - anchors the match to the end of the line.
          \ - quote. Cancels the meaning of a meta-character.
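The meta-characters above can be tried against a handful of sample words. This is a minimal sketch using `grep -E` (the extended-regex mode that egrep provides); the word list and the helper name `match` are assumptions made for illustration.

```shell
# match REGEX: print the sample words that match REGEX, on one line.
match() {
  printf '%s\n' cat cot coot cooot ct scat c.t |
    grep -E -- "$1" |
    paste -s -d ' ' -
}

match 'c.t'          # .       -> cat cot scat c.t
match '^c[ao]t$'     # [list]  -> cat cot
match '^c[^a]t$'     # [^list] -> cot c.t
match '^co*t$'       # *       -> cot coot cooot ct
match '^co{2}t$'     # {n}     -> coot
match '^co{2,3}t$'   # {n,m}   -> coot cooot
match 'c\.t'         # \       -> c.t
```

Without the `^` and `$` anchors a pattern may match anywhere in the line, which is why `c.t` also matches `scat`.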
  • Regular Expressions | - Logical OR & - Logical AND ! - Logical NOT • Regular Expression parsing is done simply be interpreting each char, from left to right. When matching, each text line will be tested for a match against the Regular Expression every time a new character is being parsed • Each character matches itself, unless it is a meta-character.
  • Regular Expressions
      • Some examples of REGEX matching:
          # egrep '^u' /etc/passwd
          uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
          user1:x:500:500::/home/user1:/bin/bash
          # egrep '^[^a-v]' /etc/passwd
          webalizer:x:67:67:Webalizer:/var/www/usage:/sbin/nologin
  • sed
      • 'sed' is a stream editor; it parses and edits text according to a predefined set of commands.
      • Syntax:
          sed [options] 'command(s)' [file]
      • Options:
          -i edit the file in place
          -e add a command; repeat it to run multiple commands
          -n do not output lines by default
      By default, the 'sed' command does not change the contents of files. The safer way to make changes is to redirect the output of 'sed' into a new file.
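The `-n` and `-e` options and the redirect-to-a-new-file approach can be sketched as follows. The file contents are made up for illustration. `-i` is left out on purpose, because its argument handling differs between GNU and BSD sed.

```shell
# Two scratch files: a source and a destination for the edited copy.
src=$(mktemp)
out=$(mktemp)
printf '%s\n' 'one two' 'three four' > "$src"

# -n suppresses the default output; the p command prints matching lines only.
only_three=$(sed -n '/three/p' "$src")

# -e can be repeated to run several commands on every line.
both=$(sed -e 's/one/1/' -e 's/four/4/' "$src")

# Redirect the result into a new file; the source file is left untouched.
sed 's/two/2/' "$src" > "$out"
changed=$(cat "$out")
original=$(cat "$src")

echo "$only_three"   # -> three four
echo "$both"         # -> 1 two / three 4 (two lines)
echo "$changed"      # -> one 2 / three four (two lines)

rm -f "$src" "$out"
```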
  • sed
      • 'sed' uses regular expression commands to do both matching and text manipulation.
      • Regular expression commands can be confusing, as the command can appear on either side of the regexp.
      • Syntax:
          [command]/regexp/[command][arguments]
          – '/regexp/p' Print matched text (to be used with '-n')
          – '/regexp/d' Delete matched text
          – 's/regexp/string/[g]' Substitute matched text with string
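The three command forms can be tried on a few passwd-style records. This is a minimal sketch; the records are illustrative sample data and the helper name `sample` is hypothetical. Note that `p` and `d` act on whole lines that match the regexp.

```shell
# Illustrative passwd-style records (sample data, not read from /etc/passwd).
sample() {
  printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin' \
    'user1:x:500:500::/home/user1:/bin/bash' \
    'webalizer:x:67:67:Webalizer:/var/www/usage:/sbin/nologin'
}

# /regexp/p with -n: print only the lines that match.
printed=$(sample | sed -n '/^root/p' | cut -d: -f1)

# /regexp/d: delete the lines that match, print the rest.
kept=$(sample | sed '/^u/d' | cut -d: -f1 | paste -s -d ' ' -)

# s/regexp/string/ replaces the first match on each line;
# the g flag replaces every match.
first_only=$(echo 'a-b-c' | sed 's/-/+/')
every=$(echo 'a-b-c' | sed 's/-/+/g')

echo "$printed"      # -> root
echo "$kept"         # -> root webalizer
echo "$first_only"   # -> a+b-c
echo "$every"        # -> a+b+c
```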
  • sed
      • 'sed' is one of the more complex Linux power tools. For most advanced usages, it has two main competitors: Perl and 'awk'. Both are fully featured programming languages.
      • Example
          # cat file
          one two three four five six
          # sed 's/\([a-z]*\) \([a-z]*\) \([a-z]*\)/\1 SECRET \3/g' file
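The substitution in the example relies on groups and back-references: `\(...\)` captures part of the match, and `\1` and `\3` reuse the first and third captures in the replacement. The sketch below assumes the file holds two lines, `one two three` and `four five six`. That layout is an assumption, because the flattened slide text does not show the line breaks.

```shell
# Assumed file layout: two lines of three words each.
file=$(mktemp)
printf '%s\n' 'one two three' 'four five six' > "$file"

# Basic regex: groups are written \( \), back-references \1 and \3.
basic=$(sed 's/\([a-z]*\) \([a-z]*\) \([a-z]*\)/\1 SECRET \3/g' "$file")

# Extended regex (-E): groups are written with plain parentheses.
extended=$(sed -E 's/([a-z]*) ([a-z]*) ([a-z]*)/\1 SECRET \3/g' "$file")

echo "$basic"
# -> one SECRET three
# -> four SECRET six

rm -f "$file"
```

Both forms give the same result here; the middle word of each line is replaced and the outer words are kept.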