Introduction to Regular Expressions

Introduces regular expressions and their power. Example code at http://resume.jesse-anderson.com/regex.zip.

Published in: Technology
Transcript of "Introduction to Regular Expressions"

  1. Regular Expressions, by Jesse Anderson
  2. What Are They?
     • Language to parse text
     • Apply logic and constraints
     • Concise (but not readable)
     • Consistent (mostly)
     • Widely supported in programming languages
  3. Hello Regex

         Source Text                 Regular Expression   Yield
         "hello world"               "hello"              { "hello" }
         "hello world hello world"   "hello"              { "hello", "hello" }
         "hello world hello world"   "world"              { "world", "world" }
         "hello world hello world"   "hello world"        { "hello world", "hello world" }
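The yields in the Hello Regex table can be reproduced with a short Python sketch. The helper name `yields` and the `rows` list are made up for this illustration; `re.findall` is the standard-library call that returns every non-overlapping match.

```python
import re

# Each row pairs a source text with a regular expression from the table.
rows = [
    ("hello world", "hello"),
    ("hello world hello world", "hello"),
    ("hello world hello world", "world"),
    ("hello world hello world", "hello world"),
]


def yields(source, pattern):
    """Return every non-overlapping match of pattern in source."""
    return re.findall(pattern, source)


for source, pattern in rows:
    print(f"{source!r} with {pattern!r} yields {yields(source, pattern)}")
```

Each pattern here is a plain literal, so the matches are simply the occurrences of that literal in the source text.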
  4. Java Regex Code

         import java.util.regex.Matcher;
         import java.util.regex.Pattern;

         Pattern pattern = Pattern.compile("hello");
         Matcher matcher = pattern.matcher("hello world");

         // Find all matches
         while (matcher.find()) {
             // Get the matching string
             String match = matcher.group();
             // match = "hello"
         }
  5. C# Regex Code

         using System.Text.RegularExpressions;

         foreach (Match match in Regex.Matches("hello world", "hello",
                 RegexOptions.IgnoreCase)) {
             // Get the matching string (a separate name, since "match" is the loop variable)
             string value = match.Value;
             // value = "hello"
         }
  6. Python Regex Code

         import re

         regex = re.compile("hello")
         # search() returns a match object for the first match, or None
         results = regex.search("hello world")
         # results.group() == "hello"
  7. Perl Regex Code

         $value = "hello world";
         # The parentheses capture the match into $1
         $value =~ m/(hello)/;
         $result = $1;
         # $result = "hello"
  8. The (Ugly) Alternative

         String needle = "hello";
         String haystack = "hello world hello world";
         int index = 0;

         while ((index = haystack.indexOf(needle, index)) != -1) {
             String match = haystack.substring(index, index + needle.length());
             index++;
         }
  9. Regex Metacharacters
     • * - Match the preceding element zero or more times
     • ? - Match the preceding element zero or one time
     • + - Match the preceding element one or more times
     • ^ - Match the start of a string
     • $ - Match the end of a string
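A minimal Python sketch of the five metacharacters on the slide above. The patterns and sample strings in `checks` are invented for illustration; each tuple records whether the pattern is expected to match the text.

```python
import re

# (pattern, text, expected result) triples, one group per metacharacter.
checks = [
    (r"^ab*c$", "ac", True),          # * allows zero "b" characters
    (r"^ab*c$", "abbbc", True),       # * allows many "b" characters
    (r"^colou?r$", "color", True),    # ? allows zero "u" characters
    (r"^colou?r$", "colour", True),   # ? allows one "u" character
    (r"^colou?r$", "colouur", False), # ? does not allow two
    (r"^ab+c$", "ac", False),         # + requires at least one "b"
    (r"^ab+c$", "abc", True),
    (r"^hello", "hello world", True),  # ^ anchors at the start
    (r"^world", "hello world", False),
    (r"world$", "hello world", True),  # $ anchors at the end
    (r"hello$", "hello world", False),
]

for pattern, text, expected in checks:
    found = re.search(pattern, text) is not None
    print(f"{pattern!r} on {text!r}: {found} (expected {expected})")
```

The quantifiers apply to whatever comes directly before them, which is why `ab*c` affects only the `b`.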
  10. Character Classes

          POSIX       Shorthand   Longhand          Description
          [:word:]    \w          [A-Za-z0-9_]      Alphanumeric chars. and underscore
                      \W          [^A-Za-z0-9_]     Non-alphanumeric chars.
          [:alpha:]               [A-Za-z]          Alphabetic chars.
          [:blank:]               [ \t]             Space and tab
          [:digit:]   \d          [0-9]             Numeric characters
                      \D          [^0-9]            Non-numeric chars.
          [:space:]   \s          [ \t\r\n\v\f]     Whitespace characters
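The shorthand and longhand columns can be compared directly in Python. This is a sketch under two assumptions: the sample `text` is made up, and the `re.ASCII` flag is passed because Python 3 otherwise treats `\w`, `\d`, and `\s` as Unicode-aware, which would not line up with the ASCII longhand. Python's `re` module does not support the POSIX bracket names, so that column is left out.

```python
import re

# Made-up sample containing letters, digits, an underscore, and whitespace.
text = "user_42 logged in\tat 09:15\n"

# (shorthand, longhand) pairs taken from the table.
pairs = [
    (r"\w", r"[A-Za-z0-9_]"),
    (r"\W", r"[^A-Za-z0-9_]"),
    (r"\d", r"[0-9]"),
    (r"\D", r"[^0-9]"),
    (r"\s", r"[ \t\r\n\v\f]"),
]


def same_matches(shorthand, longhand, sample):
    """True if both forms match exactly the same characters in sample."""
    # re.ASCII restricts the shorthand classes to their ASCII definitions.
    return re.findall(shorthand, sample, re.ASCII) == re.findall(longhand, sample)


for shorthand, longhand in pairs:
    print(f"{shorthand} vs {longhand}: {same_matches(shorthand, longhand, text)}")
```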
  11. Groups

          Source Text          Regular Expression           Yield
          "hello world"        "([a-z]+)\s+([a-z]+)"        { "hello world", "hello", "world" }
          "hello world12345"   "([a-z]+)\s+([a-z]+)"        { "hello world", "hello", "world" }
          "hello world12345"   "([a-z]+)\s+([a-z]+)(\d+)"   { "hello world12345", "hello", "world", "12345" }
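The rows of the Groups table can be checked with a small Python sketch. The helper name `groups` is an assumption for this example; it returns the whole match followed by each captured group, which mirrors the layout of the Yield column.

```python
import re


def groups(pattern, source):
    """Return [whole match, group 1, group 2, ...] for the first match, or []."""
    match = re.search(pattern, source)
    if match is None:
        return []
    # group(0) is the entire match; groups() holds the parenthesized captures.
    return [match.group(0), *match.groups()]


print(groups(r"([a-z]+)\s+([a-z]+)", "hello world"))
print(groups(r"([a-z]+)\s+([a-z]+)", "hello world12345"))
print(groups(r"([a-z]+)\s+([a-z]+)(\d+)", "hello world12345"))
```

In the second row the digits are left out of the match because `[a-z]+` stops at the first non-letter; the third row adds `(\d+)` to capture them.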
  12. Example
      • Example that parses, cleans up, and normalizes input
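The transcript does not include the example itself, so the following is only a hypothetical illustration of the parse, clean up, and normalize idea, not the presenter's code. It assumes the input is a ten-digit phone number written in one of several loose formats; the `PHONE` pattern and the `normalize_phone` helper are invented for this sketch.

```python
import re

# Hypothetical pattern: optional parentheses around the first three digits,
# with optional spaces, dots, or hyphens between the three digit groups.
PHONE = re.compile(r"^\s*\(?(\d{3})\)?[\s.-]*(\d{3})[\s.-]*(\d{4})\s*$")


def normalize_phone(raw):
    """Return the number as 'NNN-NNN-NNNN', or None if raw does not parse."""
    match = PHONE.match(raw)
    if match is None:
        return None
    # Groups hold the three digit runs; the separators are discarded.
    return "-".join(match.groups())


for raw in [" (555) 123-4567 ", "555.123.4567", "5551234567", "not a number"]:
    print(f"{raw!r} -> {normalize_phone(raw)!r}")
```

The groups do the parsing, discarding the separators does the cleanup, and rejoining with a single delimiter does the normalization.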
  13. Recommended Reading
      • Mastering Regular Expressions by Jeffrey Friedl
      • Regular Expressions Cheat Sheet: http://www.addedbytes.com/cheat-sheets/regular-e
      • Regex Evaluator: http://www.cuneytyilmaz.com/prog/jrx/
