5.
Simple Regular Expressions
● Traditional regular expressions.
● Not a standard.
● Supported by some applications for backwards
compatibility.
● Deprecated.
6.
POSIX Basic Regular
Expressions
● Created to provide a common standard for Unix
tools.
● Designed to be backwards compatible with
traditional regular expressions.
● Adopted as the default syntax of many Unix
tools.
● Some metacharacters require escaping.
7.
POSIX Extended Regular
Expressions
● Adds some new metacharacters.
● Metacharacters do not require escaping.
● Dropped support for back references (\n), although
some tools still accept them as an extension.
● Many Unix tools provide support with a
command line argument (usually -E).
8.
Perl Regular Expressions
● Adds lazy quantification, named capture groups
and recursive patterns.
● Adopted by many programming languages due
to its power.
● Requires non-alphanumeric delimiters around
expression.
● Other languages only implement a subset, so
implementations vary.
10.
Basic Metacharacters
. Matches any single character (by default, excluding newlines).
^ Matches the beginning of a string.
$ Matches the end of a string.
| Matches the expression before or after it (think ||).
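The four metacharacters above can be tried out with a short sketch. It assumes Python's `re` module as the flavor; the sample string and variable names are illustrative only and do not come from the slides.

```python
import re

text = "cat hat bat"  # illustrative sample string

# . matches any single character (except a newline by default)
dot_matches = re.findall(r".at", text)

# ^ anchors the match to the beginning of the string
starts_with_cat = re.search(r"^cat", text) is not None
starts_with_hat = re.search(r"^hat", text) is not None

# $ anchors the match to the end of the string
ends_with_bat = re.search(r"bat$", text) is not None

# | matches the expression on either side of it
either = re.findall(r"cat|bat", text)

print(dot_matches, starts_with_cat, starts_with_hat, ends_with_bat, either)
```

Raw strings (`r"..."`) are used so that Python does not interpret backslashes before the regex engine sees them.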
11.
Character Classes
[ ] Matches any one character within the set.
[^ ] Matches any one character NOT within the set.
[n-m] Matches any one character in the range n through m.
Examples:
[A-Za-z0-9]
[^G-Zg-z _]
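As a rough illustration, the two example classes above can be applied to a made-up string. This sketch assumes Python's `re` module; the `sample` text is hypothetical.

```python
import re

sample = "Room 42, Gate_B!"  # hypothetical input

# [A-Za-z0-9] keeps only ASCII letters and digits
alnum = "".join(re.findall(r"[A-Za-z0-9]", sample))

# [^G-Zg-z _] keeps every character that is NOT in G-Z, g-z,
# a space or an underscore
negated = "".join(re.findall(r"[^G-Zg-z _]", sample))

print(alnum)
print(negated)
```

Each class matches exactly one character per match, which is why `findall` returns single characters here.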
12.
Shorthand Character Classes
\s Any whitespace character, such as spaces, tabs and newlines.
Same as [\n\r\t ] (many flavors also include \f and \v).
\w Any word character.
Same as [A-Za-z0-9_]
\d Any digit character.
Same as [0-9]
\S, \W, \D Negated versions of the above. Can be used inside character
classes, but this can be confusing.
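A small sketch of the shorthand classes, assuming Python's `re` module. In Python, `\w` and `\d` are Unicode-aware for text patterns, so the `re.ASCII` flag is passed here to get the narrower definitions listed above. The sample line is made up.

```python
import re

line = "id_7\tcost: 42 USD\n"  # made-up sample line

# \s: whitespace characters (tab, spaces, newline)
whitespace = re.findall(r"\s", line, flags=re.ASCII)

# \w+: runs of word characters [A-Za-z0-9_]
words = re.findall(r"\w+", line, flags=re.ASCII)

# \d: single digits [0-9]
digits = re.findall(r"\d", line, flags=re.ASCII)

# \S+: runs of non-whitespace characters (negated \s)
chunks = re.findall(r"\S+", line, flags=re.ASCII)

print(whitespace, words, digits, chunks)
```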
13.
Quantifiers
* Match the preceding expression 0 or more times.
+ Match the preceding expression 1 or more times.
? Match the preceding expression 0 or 1 time.
{m,n} Match the preceding expression at least m times but no more than n times.
{m,} Match the preceding expression at least m times with no maximum.
{,n} Match the preceding expression no more than n times with no minimum (not supported by every flavor).
{n} Match the preceding expression exactly n times.
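The quantifiers above can be compared side by side with a sketch like the following. It assumes Python's `re` module, which accepts every form listed, including `{,n}`. The helper `matches` is a hypothetical convenience, not part of any library.

```python
import re

def matches(pattern, s):
    """Return True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# Each quantifier applies to the preceding expression, here the letter b.
star     = [matches(r"ab*", s) for s in ("a", "ab", "abbb")]
plus     = [matches(r"ab+", s) for s in ("a", "ab", "abbb")]
optional = [matches(r"ab?", s) for s in ("a", "ab", "abbb")]
bounded  = [matches(r"ab{2,3}", s) for s in ("ab", "abb", "abbb", "abbbb")]
at_least = [matches(r"ab{2,}", s) for s in ("ab", "abb", "abbbb")]
at_most  = [matches(r"ab{,2}", s) for s in ("a", "abb", "abbb")]
exact    = [matches(r"ab{2}", s) for s in ("ab", "abb", "abbb")]

print(star, plus, optional, bounded, at_least, at_most, exact)
```

`re.fullmatch` is used instead of `re.search` so that extra trailing characters count as a failure, which makes the upper bounds visible.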
14.
Lazy Quantifiers
Standard Quantifiers are greedy.
Example:
Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
"Hello .*"
Match: "Hello World" example. That would be "Hallo Welt"
15.
Lazy Quantifiers
Use ? to make a quantifier lazy.
Example:
Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
"Hello .*?"
Match: "Hello World"
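The difference between the greedy and lazy forms can be checked with a short sketch. It assumes Python's `re` module and treats the sample text as a single line, since `.` does not match newlines by default.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        'That would be "Hallo Welt" in German.')

# Greedy: .* grabs as much as possible, then backtracks to the LAST quote.
greedy = re.search(r'"Hello .*"', text).group()

# Lazy: .*? grabs as little as possible, stopping at the FIRST closing quote.
lazy = re.search(r'"Hello .*?"', text).group()

print(greedy)
print(lazy)
```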
16.
Grouping
() Group the expression and capture the text.
(?: ) Group the expression but DO NOT capture the text.
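A minimal sketch of the two group types, assuming Python's `re` module. The pattern and input are illustrative only.

```python
import re

# (Hello) is a capturing group; (?:World|Earth) only groups the alternation.
m = re.search(r"(Hello) (?:World|Earth)", "Hello Earth")

whole_match = m.group(0)   # the full matched text
captured = m.groups()      # only capturing groups are listed here

print(whole_match, captured)
```

The non-capturing group still takes part in the match, but it does not get a group number, so later backreferences are not shifted by it.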
17.
Backreferences
\1 through \9 reference previously captured text.
Example:
Many programming courses start with a "Hello World"
example. 'Hello World' examples are extremely simple,
especially when they just output "Hello World'.
('|")Hello World(\1)
Matches: "Hello World" and 'Hello World', but not the mismatched "Hello World'.
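The backreference example can be reproduced with the following sketch, which assumes Python's `re` module. The backreference `\1` must repeat whichever quote character group 1 captured, so the mismatched pair is skipped.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        "'Hello World' examples are extremely simple, "
        "especially when they just output \"Hello World'.")

# Group 1 captures the opening quote; \1 requires the same quote to close.
pattern = re.compile(r"""('|")Hello World\1""")

quoted = [m.group(0) for m in pattern.finditer(text)]

print(quoted)
```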
18.
Word Boundaries
\b matches the position between a word character
(\w) and a non-word character (\W).
Example:
Hello World
o\b
Hello| World
19.
Word Boundaries
\B matches any position that is not a word boundary,
such as between two word characters (\w\w).
Example:
Hello World
o\B
Hello Wo|rld
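Both boundary assertions can be checked with a short sketch, assuming Python's `re` module. Raw strings matter here: in an ordinary Python string, `"\b"` is a backspace character rather than a word boundary.

```python
import re

text = "Hello World"

# o\b: an "o" that is immediately followed by a word boundary
at_boundary = [m.start() for m in re.finditer(r"o\b", text)]

# o\B: an "o" that is immediately followed by a non-boundary position
inside_word = [m.start() for m in re.finditer(r"o\B", text)]

print(at_boundary, inside_word)
```

The reported indexes are the positions of the matched "o": index 4 is the end of "Hello", index 7 is the "o" inside "World".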
20.
Lookaheads
(?= ) matches a position that is directly followed by
the expression, without consuming it.
Example:
Hello World sounds better than "Hello Earth".
Hello(?= World)
Matches only the first Hello, the one followed by " World".
21.
Lookbehinds
(?<= ) matches a position that is directly preceded by
the expression, without consuming it.
Example:
Hello World sounds better than "Hello Earth".
(?<=")Hello
Matches only the second Hello, the one preceded by a double quote.
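The positive lookahead and lookbehind examples can be verified with a sketch along these lines, assuming Python's `re` module. The variable names are illustrative.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Positive lookahead: "Hello" only where " World" follows.
ahead = re.search(r"Hello(?= World)", text)

# Positive lookbehind: "Hello" only where a double quote precedes it.
behind = re.search(r'(?<=")Hello', text)

# The assertions are zero-width, so only "Hello" itself is matched.
print(ahead.group(), ahead.start())
print(behind.group(), behind.start())
```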
22.
Negative Lookaheads
(?! ) matches a position that is NOT directly followed
by the expression.
Example:
Hello World sounds better than "Hello Earth".
Hello(?! World)
Matches only the second Hello, the one followed by " Earth".
23.
Negative Lookbehinds
(?<! ) matches a position that is NOT directly preceded
by the expression.
Example:
Hello World sounds better than "Hello Earth".
(?<!")Hello
Matches only the first Hello, the one not preceded by a double quote.
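The negative forms can be checked the same way. This sketch assumes Python's `re` module and collects the start index of every match so the two patterns can be compared.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Negative lookahead: "Hello" that is NOT followed by " World".
not_followed = [m.start() for m in re.finditer(r"Hello(?! World)", text)]

# Negative lookbehind: "Hello" that is NOT preceded by a double quote.
not_preceded = [m.start() for m in re.finditer(r'(?<!")Hello', text)]

print(not_followed, not_preceded)
```

Index 0 is the unquoted "Hello" at the start of the sentence and index 32 is the one inside the quotation.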
24.
Conditionals
(?(condition)then|else)
● condition can be a lookahead or a lookbehind; some flavors accept only a reference to a capture group.
● If condition is matched, then must match for the
expression to pass.
● If condition is not matched, else must match for
the expression to pass.
25.
Conditionals
Example:
Hello World sounds better than "Hello Earth".
Hello (?(?=World)World|Earth)
Matches both Hello World and Hello Earth.
Hello (?(?=People)People|Earth)
Matches only Hello Earth.
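Python's `re` module does not accept a lookahead or lookbehind as the condition, so the sketch below swaps in the other common form of conditional: a capture group reference, `(?(1)then|else)`. The pattern is a hypothetical example, not the one from the slides. It requires a closing quote only when an opening quote was captured.

```python
import re

# Group 1 optionally captures an opening quote.
# (?(1)") means: if group 1 took part in the match, a closing quote is required.
pattern = re.compile(r'(")?Hello (?:World|Earth)(?(1)")')

quoted_ok     = pattern.fullmatch('"Hello Earth"') is not None
unquoted_ok   = pattern.fullmatch('Hello World') is not None
missing_close = pattern.fullmatch('"Hello World') is not None
stray_close   = pattern.fullmatch('Hello World"') is not None

print(quoted_ok, unquoted_ok, missing_close, stray_close)
```

The else branch is omitted here, which Python allows; an omitted branch matches the empty string.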
26.
Modifiers
i Case insensitive matching.
s . matches newline characters.
m ^ and $ match after and before newlines (respectively).
x Whitespace within the expression is ignored unless escaped.
g Matches globally (finds every match instead of stopping at the first).
27.
Modifiers
● (?a) turns modifier a on.
● (?-a) turns modifier a off.
● (?a: ) applies modifier a to the enclosed group only.
Examples:
(?i)WORLD(?-i)
(?i-s)WORLD.(?s-i)
(?i:WORLD)
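How modifiers are spelled varies by flavor. The sketch below assumes Python's `re` module, where the letters above map to flag constants, the scoped form `(?i:...)` is available from Python 3.6, and there is no g modifier (`re.findall` and `re.finditer` play that role). The sample text is made up.

```python
import re

text = "hello\nWORLD!"  # made-up two-line sample

# i: case-insensitive matching
case_insensitive = re.search(r"world", text, re.IGNORECASE).group()

# s: . also matches newline characters
dot_default = re.search(r"hello.WORLD", text) is not None
dot_all = re.search(r"hello.WORLD", text, re.DOTALL) is not None

# m: ^ and $ also match at line breaks
first_words_default = re.findall(r"^\w+", text)
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# x: unescaped whitespace in the pattern is ignored
verbose = re.search(r"hello \s WORLD", text, re.VERBOSE) is not None

# Scoped inline modifier: only the group is case-insensitive
scoped = re.search(r"(?i:world)!", text).group()

print(case_insensitive, dot_default, dot_all)
print(first_words_default, first_words_multiline, verbose, scoped)
```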