Regular Expressions In Action

It explains the building blocks of regular expressions and their usage with easy-to-understand examples.

    Regular Expressions In Action: Presentation Transcript

    • Regular Expression in Action
      Brief overview of Regular Expression building blocks and tools with a practical example
      Muhammad Sheraz Siddiqi
      http://www.sherazsiddiqi.com/
    • This Presentation…
      What are Regular Expressions
      Tools to learn
      Literal characters and Special characters
      Building blocks of Regular Expressions
      Grouping and Backreferences
      Unicode characters in regular expressions
      Regex Matching Modes
      Lookarounds
      Parse a log file…
    • What are Regular Expressions?
      Regular expressions provide a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters.
    • Tools to learn?
      The Regex Coach is a graphical application for Windows that can be used to experiment with regular expressions interactively.
      http://weitz.de/regex-coach/
      Notepad++ is a text editor that supports find and replace using regular expressions.
      http://notepad-plus-plus.org/
      A web-based regular expression tester:
      http://www.regular-expressions.info/javascriptexample.html
    • Literal and Special characters
      The most basic regular expression consists of a literal, which behaves just like plain string matching. For example:
      cat will match cat in About cats and dogs.
      Special characters, known as meta characters, need to be escaped with a backslash (\) in regular expressions if they are used as part of a literal:
      dogs\. will match dogs. in About cats and dogs.
      Meta characters are:
      [ \ ^ $ . | ? * + ( ) {
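The difference between a literal and an escaped meta character can be tried out with a minimal sketch in Python's standard `re` module. The choice of Python and the second subject string `hotdogs!` are assumptions made for illustration; they are not part of the slides.

```python
import re

text = "About cats and dogs."

# A literal pattern behaves like a plain substring search.
literal = re.search(r"cat", text)

# Unescaped, the dot is a meta character and matches any character.
loose = re.search(r"dogs.", "hotdogs!")

# Escaped, the dot only matches a literal full stop.
strict_hit = re.search(r"dogs\.", text)
strict_miss = re.search(r"dogs\.", "hotdogs!")

# re.escape inserts the backslashes for a string meant to be taken literally.
escaped = re.escape("dogs.")

print(literal.group(), loose.group(), strict_hit.group(), strict_miss, escaped)
```

Building a pattern from arbitrary text is where `re.escape` is most useful, since it avoids escaping each meta character by hand.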
    • Character Classes and Shorthands
      With a "character class", also called a "character set", you can tell the regex engine to match only one out of several characters. For example:
      gr[ae]y will match both grey and gray.
      Ranges can be specified using a dash. For example:
      [0-9] will match any digit from 0 to 9.
      [0-9a-fA-F] will match any single hexadecimal digit.
      A caret after the opening square bracket negates the character class, so that it matches any character that is not in the class. For example:
      [^0-9] will match anything except a digit.
      q[^u] will not match Iraq, but it will match in Iraq is a country, because the class still has to match one character (here the space after the q).
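The character class examples above can be checked with a short Python sketch. The subject strings `grey gray groy`, `0x1Fg` and `a1b2` are made up for illustration.

```python
import re

# gr[ae]y: exactly one of a or e between gr and y.
spellings = re.findall(r"gr[ae]y", "grey gray groy")

# [0-9a-fA-F]: one hexadecimal digit per match.
hex_digits = re.findall(r"[0-9a-fA-F]", "0x1Fg")

# [^0-9]: any single character that is not a digit.
non_digits = re.findall(r"[^0-9]", "a1b2")

# q[^u] needs a character after the q, so a q at the very end cannot match.
at_end = re.search(r"q[^u]", "Iraq")
mid_sentence = re.search(r"q[^u]", "Iraq is a country")

print(spellings, hex_digits, non_digits, at_end, repr(mid_sentence.group()))
```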
    • Character Classes and Shorthands
      Most meta characters work without escaping inside character classes; only ], \, ^ and - keep a special meaning there. For example:
      [+*] is a valid expression and matches either * or +.
      There are some pre-defined character classes known as shorthand character classes:
      \w stands for [A-Za-z0-9_]
      \s stands for [ \t\r\n]
      \d stands for [0-9]
      If a character class is repeated by using the ?, * or + operators, the entire character class will be repeated, and not just the character that it matched. For example:
      [0-9]+ can match 837 as well as 222
      ([0-9])\1+ will match 222 but not 837.
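A hedged Python sketch of the shorthand classes and of class repetition follows. One assumption to flag: in Python 3 the shorthands are Unicode-aware by default, so the `re.ASCII` flag is passed here to bring `\w` and `\d` in line with the ASCII definitions on the slide. The sample strings are invented for illustration.

```python
import re

# Inside a class, + and * are ordinary characters.
operators = re.findall(r"[+*]", "2+3*4")

# re.ASCII limits the shorthands to their ASCII meaning.
words = re.findall(r"\w+", "snake_case 42!", re.ASCII)
digits = re.findall(r"\d", "a1b22", re.ASCII)
parts = re.split(r"\s+", "a \t b\r\nc", flags=re.ASCII)

# [0-9]+ repeats the class, so each digit may differ.
any_digits = [bool(re.fullmatch(r"[0-9]+", s)) for s in ("837", "222")]

# ([0-9])\1+ repeats the character that was matched, so all digits must be equal.
same_digit = [bool(re.fullmatch(r"([0-9])\1+", s)) for s in ("837", "222")]

print(operators, words, digits, parts, any_digits, same_digit)
```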
    • Building blocks of Regular Exp.
      The famous dot "." operator matches any single character (by default, anything except a newline). For example:
      a.b will match abb, aab, a+b etc.
      ^ and $ are used to match the start and end of the subject string. For example:
      ^My.*\.$ will match anything starting with My and ending with a dot.
      The pipe operator is used to match a string against either its left or its right part. For example:
      (cat|dog) can match either cat or dog.
      Question:
      If the expression is Get|GetValue|Set|SetValue and the string is SetValue, what will this match and why?
      What if the expression becomes Get(Value)?|Set(Value)?
      * or {0,} and + or {1,} are used to control repetitions.
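The building blocks above, including the alternation question, can be explored with the following Python sketch. The results for the question hold for backtracking engines such as Python's `re`, where the leftmost alternative that succeeds wins; engines that return the longest alternative (POSIX style) behave differently. The subject strings other than SetValue are invented.

```python
import re

# The dot matches any one character; "ab" is too short for a.b.
dot_hits = re.findall(r"a.b", "abb aab a+b ab")

# ^ and $ pin the pattern to the start and end of the subject.
anchored = bool(re.search(r"^My.*\.$", "My name is Sheraz."))
not_anchored = bool(re.search(r"^My.*\.$", "Oh My."))

# Alternatives are tried left to right, so Set wins before SetValue is tried.
first_alternative = re.match(r"Get|GetValue|Set|SetValue", "SetValue").group()

# The optional group is greedy, so Value is consumed when it is present.
optional_group = re.match(r"Get(Value)?|Set(Value)?", "SetValue").group()

# * allows zero repetitions, + and {1,} require at least one.
star = bool(re.fullmatch(r"ab*", "a"))
plus = bool(re.fullmatch(r"ab+", "a"))
braces = bool(re.fullmatch(r"ab{1,}", "abbb"))

print(dot_hits, anchored, not_anchored, first_alternative, optional_group)
print(star, plus, braces)
```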
    • Grouping and Backreferences
      Round brackets, besides grouping part of a regular expression together, also create a "backreference". A backreference stores the part of the string matched by the part of the regular expression inside the parentheses. For example:
      ([0-9])\1+ will match 222 but not 837.
      If backreferences are not required, you can optimize the regular expression with a non-capturing group: Set(?:Value)?
      Backreferences can be used in the expression itself or in the replacement text. For example:
      <([A-Za-z][A-Za-z0-9]*)>.*</\1> will match an opening tag together with its matching closing tag.
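As a rough illustration in Python, the sketch below uses a backreference inside the pattern, in replacement text, and then compares a capturing group with a non-capturing one. The sample inputs (`<b>bold</b>`, `key=value`) and the swap pattern are assumptions made for the example.

```python
import re

# \1 must repeat whatever tag name the first group captured.
TAG_RE = re.compile(r"<([A-Za-z][A-Za-z0-9]*)>.*</\1>")
matched = TAG_RE.search("<b>bold</b>")
mismatched = TAG_RE.search("<b>bold</i>")

# In replacement text, \1 and \2 insert the captured groups.
swapped = re.sub(r"(\w+)=(\w+)", r"\2=\1", "key=value")

# (?:...) groups without storing anything for later reference.
capturing = re.match(r"Set(Value)?", "SetValue")
non_capturing = re.match(r"Set(?:Value)?", "SetValue")

print(matched.group(1), mismatched, swapped)
print(capturing.groups(), non_capturing.groups())
```

Both `Set(Value)?` and `Set(?:Value)?` match the same text; only the stored groups differ.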
    • Unicode characters in Regular Exp.
      Unicode characters can be used as \uxxxx in regular expressions. For example:
      عطاری can be matched in an expression as: \u0639\u0637\u0627\u0631\u06cc
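A minimal Python sketch of the same idea: the `re` module accepts `\uxxxx` escapes inside a pattern, so the word can be matched without typing the characters themselves. The surrounding text `x ` and ` y` is an assumption added to show a search inside a longer string.

```python
import re

# The subject is built from the same five code points listed on the slide.
word = "\u0639\u0637\u0627\u0631\u06cc"

# In a raw string the escapes reach the regex engine, which decodes them itself.
pattern = re.compile(r"\u0639\u0637\u0627\u0631\u06cc")

found = pattern.search("x " + word + " y")
print(found.span())
```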
    • Regular Exp. Matching Modes
      /i makes the regex match case insensitive.
      [A-Z] will match A and a with this modifier.
      /s enables "single-line mode". In this mode, the dot matches newlines as well.
      .* will match all of sheraz\r\nattari with this modifier.
      /m enables "multi-line mode". In this mode, the caret and dollar match before and after newlines in the subject string.
      ^.*$ can then match the first line, sheraz, of sheraz\r\nattari on its own (in some engines the match also includes the \r).
      /x enables "free-spacing mode". In this mode, whitespace between regex tokens is ignored, and an unescaped # starts a comment.
      #sheraz\r\n\r\n.* will match only sheraz with this modifier, because #sheraz is treated as a comment up to the line break and only .* remains as the pattern.
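The modifiers above are written in Perl-style notation. As a sketch under that assumption, Python's `re` module exposes the same modes as flags: `re.IGNORECASE`, `re.DOTALL`, `re.MULTILINE` and `re.VERBOSE`. To keep the results engine-independent, the subject below uses a plain `\n` line break instead of the `\r\n` in the slide.

```python
import re

subject = "sheraz\nattari"

# /i  ->  re.IGNORECASE
case_insensitive = bool(re.fullmatch(r"[A-Z]", "a", re.IGNORECASE))

# Default mode: the dot stops at the newline.
default_dot = re.match(r".*", subject).group()

# /s  ->  re.DOTALL: the dot also matches the newline.
dotall = re.match(r".*", subject, re.DOTALL).group()

# /m  ->  re.MULTILINE: ^ and $ also match at each line boundary.
single_line_anchors = re.findall(r"^\w+$", subject)
multi_line_anchors = re.findall(r"^\w+$", subject, re.MULTILINE)

# /x  ->  re.VERBOSE: whitespace is ignored and # starts a comment.
verbose = re.compile(
    r"""
    # this whole line is a comment
    .*      # any run of characters except a newline
    """,
    re.VERBOSE,
)
verbose_match = verbose.match(subject).group()

print(case_insensitive, default_dot, repr(dotall))
print(single_line_anchors, multi_line_anchors, verbose_match)
```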
    • Lookarounds with Conditions…
      A conditional is a special construct that first evaluates a lookaround, then executes one sub-regex if the lookaround succeeds and another sub-regex if the lookaround fails.
      Example of positive lookahead:
      q(?=uv*) will match the q in quvvvv and qu.
      Example of negative lookahead:
      q(?!uv*) will match a q that is not followed by u.
      Example of positive lookbehind:
      (?<=b)a will match an a preceded by b, as in ba.
      Example of negative lookbehind:
      (?<!b)a will match an a not preceded by b, as in ca and da.
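The four lookaround examples can be tried in Python as sketched below. The subject strings are invented, and the sketch reports match positions to show that a lookaround checks its surroundings without consuming them. Conditionals on a lookaround are not shown, since Python's `re` does not offer that form.

```python
import re

lookahead_subject = "quvvvv qu qa"
lookbehind_subject = "ba ca da"

def starts(pattern, subject):
    """Return the start index of every match of pattern in subject."""
    return [m.start() for m in re.finditer(pattern, subject)]

# Positive lookahead: q only when u (plus optional v's) follows.
positive_ahead = starts(r"q(?=uv*)", lookahead_subject)

# Negative lookahead: q only when no u follows.
negative_ahead = starts(r"q(?!uv*)", lookahead_subject)

# Positive lookbehind: a only when b comes right before it.
positive_behind = starts(r"(?<=b)a", lookbehind_subject)

# Negative lookbehind: a only when b does not come right before it.
negative_behind = starts(r"(?<!b)a", lookbehind_subject)

# The lookahead is not part of the match: only the q is returned.
matched_text = re.search(r"q(?=uv*)", lookahead_subject).group()

print(positive_ahead, negative_ahead, positive_behind, negative_behind, matched_text)
```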
    • Parse a log file…
      Example 1: I have an access log file (access.log) from a Helix DNA server. I want to calculate how many times each content item is accessed and update the download and listen counts of each item in the database.
      Exp: ^(.*)asxgen/Data/Naat/Download(.*)/(\d+)\.(mp3|rm)(.*)$
      Replace: UPDATE DB.TBL set col=col + COUNT where id=\3;
      Example 2: I have an application-generated log file (applog.txt) from a web application. I want to fetch the required information from the relevant rows. To remove the irrelevant rows:
      Exp: ^(?!((.*)ID:\s(.*)\sStatus:\s(.*))).*$
      Replace: Empty string
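Both log examples can be sketched in Python roughly as follows. This is only an illustration: the sample log lines are hypothetical (the real Helix and application log layouts are not shown in the slides), and the helper names `count_hits` and `keep_relevant` are invented here. The two expressions are the ones from the slide.

```python
import re
from collections import Counter

# Example 1: group 3 captures the numeric content id.
ACCESS_RE = re.compile(r"^(.*)asxgen/Data/Naat/Download(.*)/(\d+)\.(mp3|rm)(.*)$")

# Example 2: matches every line that lacks the "ID: ... Status: ..." part.
IRRELEVANT_RE = re.compile(r"^(?!((.*)ID:\s(.*)\sStatus:\s(.*))).*$")

def count_hits(lines):
    """Count how often each content id appears in the access log lines."""
    counts = Counter()
    for line in lines:
        m = ACCESS_RE.match(line)
        if m:
            counts[m.group(3)] += 1
    return counts

def keep_relevant(lines):
    """Drop the lines that the negative-lookahead expression matches."""
    return [line for line in lines if not IRRELEVANT_RE.match(line)]

# Hypothetical sample data, made up for this sketch.
access_lines = [
    "GET /asxgen/Data/Naat/Download/a/101.mp3 200",
    "GET /asxgen/Data/Naat/Download/b/101.rm 200",
    "GET /asxgen/Data/Naat/Download/a/202.mp3 200",
    "GET /index.html 200",
]
app_lines = ["ID: 7 Status: OK", "noise line", "ID: 8 Status: FAIL"]

counts = count_hits(access_lines)
# The ids come from \d+, so they contain digits only; real code should
# still pass them to the database as query parameters.
statements = [
    f"UPDATE DB.TBL set col=col + {n} where id={content_id};"
    for content_id, n in sorted(counts.items())
]
relevant = keep_relevant(app_lines)

print(statements)
print(relevant)
```

Counting first and emitting one UPDATE per id corresponds to the COUNT placeholder in the slide's replacement text; a plain find-and-replace in an editor would instead produce one statement per log line.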
    • Questions Please…..
      Thank you for being here…
      Most of the content is taken from:
      http://www.regular-expressions.info/
      me@sherazsiddiqi.com