SlideShare a Scribd company logo
1 of 7
Download to read offline
DZone, Inc. | www.dzone.com 
By Mike Keith 
#127 
Getting Started with JPA 2.0 Get More Refcardz! Visit refcardz.com 
CONTENTS INCLUDE: 
n Whats New in JPA 2.0 
n JDBC Properties 
n Access Mode 
n Mappings 
n Shared Cache 
n Additional API and more... 
Getting Started with 
JPA 2.0
#196 
DZone, Inc. | www.dzone.com Get More Refcardz! Visit refcardz.com 
Regular Expressions 
By: Callum Macrae 
About Regular Expressions 
A regular expression, also known as a regex or regexp, is a way of defining 
a search pattern. Think of regexes as wildcards on steroids. Using 
wildcards, *.txt matches anything followed by the string “.txt”. But regular 
expressions let you e.g.: match specific characters, refer back to previous 
matches in the expression, include conditionals within the expression, and 
much more. 
This Refcard assumes basic familiarity with program structures, but no 
prior knowledge of regular expressions. It would be wise to play with as 
many of the examples as possible, since mastering regex requires a good 
deal of practical experience. 
There is no single standard that defines what regular expressions must 
and must not do, so this Refcard does not cover every regex version or 
‘flavor’. Parts of this Refcard will apply to some flavors (for example, PHP 
and Perl), while the same syntax will throw parse errors with other flavors 
(for example, JavaScript, which has a surprisingly poor regex engine). 
Regular expressions are supported in most programming and scripting 
languages, including Java, .NET, Perl, Python, PHP, JavaScript, MySQL, 
Ruby, and XML. Languages that do not natively support regular 
expressions nearly always have libraries available to add support. If 
you prefer one regex flavor to the flavor the language natively supports, 
libraries are often available which will add support for your preferred flavor 
(albeit often using a slightly different syntax to the native expressions). 
The Structure of a Regular Expression 
All regular expressions follow the same basic structure: expression plus 
flag. In the following example, "regexp?" is the expression, and "mg" is the 
flag used. 
/regexp?/mg 
Certain programming languages may also allow you to specify your 
expression as a string. This allows you to manipulate the expression 
programmatically – something you are very likely to do, once you get the 
hang of writing regexes. For example, the following code will specify the 
same regular expression as above in JavaScript: 
new RegExp(“regexp?”, “mg”); // returns /regexp?/mg 
Basic Features 
Normal Characters 
A regular expression containing no special characters ($()*+.?[^|) will match 
exactly what is contained within the expression. The following expression 
will match the "world" in "Hello world!": 
“Hello world!”.match(/world/); // “world” 
These characters are treated as special characters: 
$()*+.?[^| 
and need to be escaped using the backslash (“”) character: 
“Price: $20”.match(/$20/); // “$20” 
(For more on handling special characters, see page 2 below.) 
Character Types 
Regex includes character types that can be used to match a range of 
commonly used characters: 
Type Description 
d Decimal digit: [0-9] 
D Not a decimal digit: [^0-9] 
s Whitespace character: [ nrtf] 
S Not a whitespace character: [^ nrtf] 
w Word character: [a-zA-Z0-9_] 
W Not a word character: [^a-zA-Z0-9_] 
Comments 
Comments in regex are inserted like this: 
“Hello world!”.match(/wor(?#comment)ld/); // “world” 
Many flavors of regex (including Java, .Net, Perl, Python, Ruby, and XPath) 
include a free-spacing mode (enabled using the x flag). This lets you use 
just a hash character to start a comment: 
/ 
(0?d|1[0-2]) # Month 
/ # Separator 
(20dd) # Year between 2000 and 2099 
/x 
CONTENTS INCLUDE: 
❱ About Regular Expressions 
❱ Advanced Features 
❱ The Dot Operator 
❱ Quantifiers 
❱ Flags 
❱ Character Classes... and more! 
Regular Expressions
2 Regular Expressions 
To provide immediate feedback on regex match results, this Refcard will 
include matches in comments, like this: 
DZone, Inc. | www.dzone.com 
“Regex is awesome”.match(/regexp?/i); // “Regex” 
The Dot Operator 
The dot character (".") can be used to match any character except for new 
line characters (n and r). Be careful with the dot operator: it may match 
more than you would intuitively guess. 
For example, the following regular expression will match the letter c, 
followed by any character that isn't a new line, followed by the letter t: 
“cat”.match(/c.t/); // “cat” 
The dot operator will only match one character (“a”). 
Contrast this simple usage of the dot operator with the following, 
dangerous usage: 
/<p class=”(.+)”>/ 
That expression initially looks like it would match an opening HTML 
paragraph tag, make sure that the only attribute is the class attribute, 
and store the class attribute in a group. The “+” is used to quantify the 
dot operator: the “+” says ‘match one or more’. Most of the expression is 
treated as literal text, and the (.+) looks for a string of any character of any 
length and stores it in a group. However, it doesn’t work. Here’s what it 
matches instead: 
“<p class=’test’ onclick=’evil()’>”.match(/<p class=’(.+)’>/); 
// [“<p class=’test’ onclick=’evil()’>”, 
// “test’ onclick=’evil()”] 
Instead of matching just the class, it has matched the on-click attribute. 
This is because the dot operator, matched with the “+” quantifier, picked 
up everything before the closing greater-than. The first lesson: the dot 
operator can be a clumsy weapon. Always use the most targeted feature 
of regex you can. The second lesson: quantifiers are ex-tremely powerful, 
especially with the dot operator. 
The Pipe Character 
The pipe character causes a regular expression to match either the left side 
or the right side. For example, the following expression will match either 
"abc" or "def": 
/abc|def/ 
Anchors 
Regex anchors match specific, well-defined points, like the start of the 
expression or the end of a word. Here are the most important: 
Char Description Example 
^ The caret character matches 
the beginning of a string or the 
beginning of a line in multi-line 
mode. 
/^./ 
Matches "a" in "abcndef", 
or both "a" and "d" in mul-ti- 
line mode. 
$ The dollar character matches the 
end of a string, or the end of a line 
in multi-line mode. 
/.$/ 
Matches "f" in "abcndef", 
or both "c" and "f" in mul-ti- 
line mode. 
A Always matches the start of a 
string, irrespective of multi-line 
mode. 
/A./ 
Matches "a" in "abcndef". 
Z Always matches the end of a 
string, irrespective of multi-line 
mode. If the last character is a line 
break, it will match the character 
before. 
/.Z/ 
Matches "f" in "abc 
ndefn". Matches "f" in 
"abcndef" 
Char Description Example 
z Always matches the end of a 
string, irrespective of multi-line 
mode. Will always match the last 
character. 
/.z/ 
Matches "f" in "abcndef". 
Matches "n" in "abc 
ndefn" 
b Matches a word boundary: the 
position between a word character 
(w) and a non-word character 
or the start or end of the string ( 
W|^|$). 
/.b/ 
Matches "o", " " and "d" in 
"Hello world". 
B Matches a non-word boundary: 
the position between two word 
characters or two non-word 
characters. 
/B.B/ 
Matches every letter but 
"H" and "o" in "Hello". 
< Matches the start of a word - 
similar to b, but only at the start 
of the word. 
/<./ 
Matches "H" and "w" in 
"Hello world". 
> Matches the end of a word - 
similar to b, but only at the end of 
the word. 
/.>/ 
Matches "o" and "d" in 
"Hello world". 
Groups 
Regex groups lump just about anything together. Groups are commonly 
used to match part of a regular expression and refer back to it later, or to 
repeat a group multiple times. 
In this Refcard, groups will be represented like this: 
“Regex is awesome”.match(/Regex is (awesome)/); 
// [“Regex is awesome”, “awesome”] 
Capturing Groups 
The most basic group, the capturing group, is marked simply by putting 
parentheses around whatever you want to group: 
“Hello world!”.match(/Hello (w+)/); // [“Hello world”, “world”] 
(Note the w character type, introduced on page 1.) Here the capturing 
group is saved and can be referred to later. 
Non-capturing Groups 
There are also non-capturing groups, which will not save the matching text 
and so cannot be referred back to later. We can get these by putting “?:” 
after the opening parenthesis of a group: 
“Hello world!”.match(/Hello (?:w+)/); // [“Hello world”] 
Using Capturing vs. Non-capturing Groups 
Use capturing groups only when you will need to refer to the grouped text 
later. Non-capturing groups save memory and simplify debugging by not 
filling up the returned array with data that you don't need. 
Named Capture Groups 
When a regex includes a named capture group, the func-tion will return an 
object such that you can refer to the matched text using the name of the 
group as the property: 
“Hello world!”.match(/Hello (?<item>w+)/); 
// {“0”: “Hello world”, “1”: “world”, “item”: “world”} 
The name of the group is surrounded by angle brackets. 
Backreferences 
You can refer back to a previous capturing group by using a backreference. 
After matching a group, refer back to the text matched in that group by 
inserting a backward slash followed by the number of the capturing group. 
For example, the following will match the same character four times in a row:
3 Regular Expressions 
DZone, Inc. | www.dzone.com 
“aabbbb”.match(/(w)1{3}/); // bbbb 
In this example, the group matches the first “b”; then 1 refers to that “b” and 
matches the letters after it three times (the “3” inside the curly brackets). 
Capturing groups are numbered in numerical order starting from 1. 
A more complicated (but quite common) use-case involves matching 
repeated words. To do this, we first want to match a word. We’ll match 
a word by grabbing a word boundary, followed by a group capturing the 
first word, followed by a space, followed by a backreference to match 
the second word, and then a word boundary to make sure that we’re not 
matching word-parts plus whole words: 
“hello hello world!”.match(/b(w+) 1b/); “hello hello” 
It is possible to have a backreference refer to a named capture group, too, 
by using k<groupname>. The following regular expression will do the same 
as above: 
/b(?<word>w+) k<word>b/ 
Groups and Operator Precedence 
Groups can be used to control effective order of operations. For example, 
the pipe character has the lowest operator precedence, so pipes are often 
used within groups to con-trol the insertion point of the logical ‘or’. The 
following expression will match either "abcdef" or "abcghi": 
/abc(def|ghi)/ 
Character Classes 
A character class allows you to match a range or set of characters. The 
following uses a character class to match any of the contained letters (in 
this case, any vowel): 
/[aeiou]/ 
The following regex will match “c”, followed by a vowel, followed by “t”; it 
will match “cat” and “cot”, but not “ctt” or “cyt”: 
/c[aeiou]t/ 
You can also match a range of characters using a character class. For 
example, /[a-i]/ will match any of the letters between a and i (inclusive). 
Character classes work with numbers too. This is particularly useful for 
matching e.g. date-ranges. The following regular expression will match a 
date between 1900 and 2013: 
/19[0-9][0-9]|200[0-9]|201[0-3]/ 
Ranges and specific matches can be combined: /[a-mxyz]/ will match the 
letters a through m, plus the letters x, y, and z. 
Negated Character Classes 
We can also use character classes to specify characters we don’t want 
to match. These are called negated character classes, and are created by 
putting a caret (“^”) at the be-ginning of the class. This indicates that we 
want to match a character that doesn’t match any of the characters in a 
character class. 
The following negated character class will match a “c”, followed by a non-vowel, 
followed by a “t”. This means that it will match “cyt”, “c0t” and “c.t”, 
but will not match “cat”, “cot”, “ct” or “cyyt”: 
/c[^aeiou]t/ 
Hot 
Tip 
Be careful to distinguish between negated character classes 
(“match not-x”) and simply not matching a character (“do 
not match x”). 
Quantifiers 
Quantifiers are used to repeat something multiple times. The basic quanti-fiers 
are: + * ? 
The most basic quantifier is "+", which means that something should 
be repeated one or more times. For example, the following expression 
matches one or more vowels: 
“queueing”.match(/[aeiou]+/); // “ueuei” 
Note that it doesn’t match the previous match (so not the first “u” multiple 
times), but rather the previous expression, i.e. another vowel. In order to 
match the previous match, use a group and a backreference, along with the 
“*” operator. 
The “*” operator is similar to the “+” operator, except that “*” matches zero 
or more instead of one or more of an expression. 
“ct”.match(/ca*t/); // “ct” 
“cat”.match(/ca*t/); // “cat” 
“caat”.match(/ca*t/); // “caat” 
To match the same character from a match one or more times, use a 
group, a backreference, and the “*” operator: 
“saae”.match(/([aeiou])1*/); // “aa” 
In the string “queuing,” this expression would match only “u”. In “aaeaa”, it 
would match only the first “aa”. 
The “?” character is the final basic quantifier. It makes the previous 
expression optional: 
“ct”.match(/ca?t/); // “ct” 
“cat”.match(/ca?t/); // “cat” 
“caat”.match(/ca?t/); // no match 
We can make more powerful quantifiers using curly braces. The following 
expressions will both match exactly five oc-currences of the letter “a”. 
/^a{5}$/ 
/^aaaaa$/ 
The first is a lot more readable, especially with higher numbers or longer 
groups, where it would be impractical to type them all out. 
We can use curly braces to repeat something between a range of times: 
/^a{3,5}$/ 
That will match the letter “a” repeated 3, 4, or 5 times. 
If you want to match something repeated up to a certain number of times, 
you can use 0 as the first number. If you want to match something more 
than a certain number with no maximum, you can just leave the second 
number blank: 
/^a{3,}$/ 
This will match the letter “a” repeated 3 or more times with no maximum 
number of repetitions. 
Greedy vs. Lazy Matches 
There are two types of matches: greedy or lazy. Most flavors of regular 
expressions are greedy by default.
4 Regular Expressions 
DZone, Inc. | www.dzone.com 
To make a quantifier lazy, append a question mark to it. 
The difference is best illustrated using an example: 
‘He said “Hello” and then “world!”’.match(/”.+”/); 
// “Hello” and then “world!” 
‘He said “Hello” and then “world!”’.match(/”.+?”/); 
// “Hello” 
The first match is greedy. It has matched as many characters as it can 
(between the first and last quotes). The second match is lazy. It has matched 
as few characters as possible (between the first and second quotes). 
Note that the question mark does not make the quantifier optional (which is 
what the question mark is usually used for in regular expressions). 
More examples of greedy vs. lazy quantification: 
‘He said “Hello” and then “world!”’.match(/”.+?”/); 
// “Hello” 
‘He said “Hello” and then “world!”’.match(/”.*?”/); 
// “Hello” 
“aaa”.match(/aa?a/); // “aaa” 
“aaa”.match(/aa??a/); // “aa” 
‘He said “Hello” and then “world!”’. match(/”.{2,}?”/); 
// “Hello” 
Note the difference the question mark makes. 
Hot 
Tip 
Be aware of differences between greedy and lazy matching 
when writing regular expressions. A greedy expression 
can be rewritten into a shorter lazy expression. For regex 
beginners, the difference often accounts for unexpected, 
counterintuitive results. 
Flags 
Flags (also known as "modifiers") can be used to modify how the regular 
expression engine will treat the regular expression. In most flavors of 
regular expression, flags are specified after the final forward slash, like this: 
/regexp?/mg 
In this case, the flags are m and g. 
Flag support does vary among regex flavors, so check the documentation 
for the regex engine in your language or library. 
The most common flags are: i, g, m, s, and x. 
The i flag is the case insensitive flag. This tells the regex engine to ignore 
the case. For example, the character class [a-z] will match [a-zA-Z] when 
the i flag is used and the literal character “c” will also match C. 
“CAT”.match(/c[aeiou]t/); // no match 
“CAT”.match(/c[aeiou]t/i); // “CAT” 
The g flag is the global flag. This tells the regex engine to match every 
instance of the match, not just the first (which is the default). For example, 
when replacing a match with a string in JavaScript, using the global flag 
means that every match will be replaced, not just the first. The following 
code (in JavaScript) demonstrates this: 
“cat cat”.replace(/cat/, “dog”); // “dog cat” 
“cat cat”.replace(/cat/g, “dog”); // “dog dog” 
The m flag is the multiline flag. This tells the regex engine to match “^” and 
“$” to the beginning and end of a line respectively, instead of the beginning 
and end of the string (which is the default). Note that A and Z will still only 
match the beginning and end of the string, regardless of what multi-line 
mode is set to. 
In the following code example, we see a lazy match from ^ to $ (we’re using 
[sS] because the dot operator doesn’t match new lines by default). When 
^ and $ match the beginning of the line (when multi-line mode is enabled), it 
returns only one line of text. Otherwise, it returns everything. 
“1st linen2nd line”.match(/^[sS]+?$/); // “1st linen2nd line” 
“1st linen2nd line”.match(/^[sS]+?$/m); // “1st line” 
The s flag is the singleline flag (also called the dotall flag in some flavors). By 
default, the dot operator will match every character except for new lines. With 
the s flag enabled, the dot operator will match everything, including new lines. 
“This isna test”.match(/^.+$/); // no match 
“This isna test”.match(/^.+$/s); // “This isna test” 
Hot 
Tip 
This behavior can be simulated in languages where the s 
flag isn’t supported by using [sS] instead of the s flag. 
The x flag is the extended mode flag, which activates extended mode. 
Extended mode allows you to make your regular expressions more 
readable by adding whitespace and comments. Consider the following 
two expressions, which both do exactly the same thing, but are drastically 
different in terms of readability. 
/([01]?d{1,3}|200d|201[0-3])([-/.])(0?[1-9]|1[012])2(0?[1-9]|[12]d|3[01])/ 
/ 
([01]?d{1,3}|200d|201[0-3]) # year 
([-/.]) # separator 
(0?[1-9]|1[012]) # month 
2 # same separator 
(0?[1-9]|[12]d|3[01]) # date 
/x 
Modifying Flags Mid-Expression 
In some flavors of regular expressions, it is possible to modify a flag mid-expression 
using a syntax similar (but not identical) to groups. 
To enable a flag mid-expression, use “(?i)” where i is the flag you want to 
activate: 
“foo bar”.match(/foo bar/); // “foo bar” 
“foo BAR”.match(/foo bar/); // no match 
“foo bar”.match(/foo (?i)bar/); // “foo bar” 
“foo BAR”.match(/foo (?i)bar/); // “foo BAR” 
To disable a flag, use “(?-i)”, where i is the flag you want to disable: 
“foo bar”.match(/foo bar/i); // “foo bar” 
“foo BAR”.match(/foo bar/i); // “foo BAR” 
“foo bar”.match(/foo (?-i)bar/i); // “foo bar” 
“foo BAR”.match(/foo (?-i)bar/i); // no match 
You can activate a flag and then deactivate it: 
/foo (?i)bar(?-i) world/ 
And you can activate or deactivate multiple flags at the same time by using 
“(?ix-gm)” where i and x are the flags we want enabling and g and m the 
ones we want disabling. 
Handling Special Characters 
To handle special characters in a regular expression, either (a) use their 
ASCII or Unicode numbers (in a variety of forms), or (b) just escape them. 
Any character that isn’t $()*+.?[]^{|} will match itself, even if it is a special 
character. For example, the following will match an umlaut: 
/ß/
5 Regular Expressions 
To match any of the characters mentioned above ()$()*+.?[]^{|}() you can 
prepend them with a backward slash to mark them as literal characters: 
DZone, Inc. | www.dzone.com 
/$/ 
It is impractical to put some special characters, such as emoji (which will 
not display in most operating systems) and invisible characters, right into a 
regular expression. To solve this problem, when writing regular expressions 
just use their ASCII or Unicode character codes. 
The first ASCII syntax allows you to specify the ASCII character code as 
octal, which must begin with a 0 or it will be parsed as a backreference. To 
match a space, we can use this: 
/040/ 
You can match ASCII characters using hexadecimal, too. The following will 
also match a space: 
/x20/ 
You can also match characters using their Unicode character codes, but 
only as hexadecimal. The following will match a space: 
/u0020/ 
A few characters can also be matched using shorter escapes. We saw a 
few of them previously (such as t and n for tab and new line characters, 
respectively – pretty standard), and the following table defines all of them 
properly: 
Escape Description Character code 
a The alarm character (bell). u0007 
b Backspace character, but only when 
in a character class—word boundary 
otherwise. 
u0008 
e Escape character. u001B 
f Form feed character. u000C 
n New line character. u000A 
r Carriage return character. u000D 
t Tab character. u0009 
v Vertical tab character. u000B 
Finally, it is possible to specify control characters by pre-pending the letter 
of the character with c. The following will match the character created by 
ctrl+c: 
/cC/ 
Advanced Features 
Advanced features vary more widely by flavor, but several are especially 
useful and worth covering here. 
POSIX Character Classes 
POSIX character classes are a special type of character class which 
allow you to match a common range of characters—such as printing 
characters—without having to type out an ordinary character class every 
time. 
The table below lists all the POSIX character classes available, plus their 
normal character class equivalents. 
Class Description Equivalent 
[:alnum:] Letters and digits [a-zA-Z0-9] 
Class Description Equivalent 
[:alpha:] Letters [a-zA-Z] 
[:ascii:] Character codes 0 - x7F [x00-x7F] 
[:blank:] Space or tab character [ t] 
[:cntrl:] Control characters [x00-x1Fx7F] 
[:digit:] Decimal digits [0-9] 
[:graph:] Printing characters (minus spaces) [x21-x7E] 
[:lower:] Lower case letters [a-z] 
[:print:] Printing characters (including 
spaces) 
[x20-x7E] 
[:punct:] Punctuation: [:graph:] minus 
[:alnum:] 
[:space:] White space [ trnvf] 
[:upper:] Upper case letters [A-Z] 
[:word:] Digits, letters, and underscores [a-zA-Z0-9_] 
[:xdigit:] Hexadecimal digits [a-fA-F0-9] 
aSERTIONS 
Lookahead Assertions 
Assertions match characters but don’t return them. Consider the 
expression “>”, which will match only if there is an end of word (i.e., 
the engine will look for (W|$)), but won’t actually return the space or 
punctuation mark following the word. 
Assertions can accomplish the same thing, but explicitly and therefore with 
more control. 
For a positive lookahead assertion, use “(?=x)”. 
The following assertion will behave exactly like >: 
/.(?=W|$)/ 
This will match the “o” and “d” in the string “Hello world” but it will not 
match the space or the end. 
Consider the following, more string-centered example. This expression 
matches “bc” in the string “abcde”: 
abcde”.match(/[a-z]{2}(?=d)/); // “bc” 
Or consider the following example, which uses a normal capturing group: 
“abcde”.match(/[a-z]{2}(d)/); // [“bcd”, “d”] 
While the expression with the capturing group matches the “d” and then 
returns it, the assertion matches the character and does not return it. This 
can be useful for backrefer-ences or to avoid cluttering up the return value. 
The assertion is the “(?=d)” bit. Pretty much anything you would put in a 
group in a regular expression can be put into an assertion. 
Negative Lookahead Assertions 
Use a negative lookahead assertion (“(?!x)”) when you want to match 
something that is not followed by something else. For example, the 
following will match any letter that is not followed by a vowel: 
This expression returns e, l, and o in hello. Here we notice a difference 
between positive assertions and negative assertions. The following positive 
lookahead does not behave the same as the previous negative lookahead: 
/w(?=[^aeiou])/
6 Regular Expressions 
That would match only the e and the l in hello. A negative assertion does 
not behave like a negated character class. A negative assertion simply 
means that the expression contained within the negative assertion should 
not be matched—not that something that doesn’t match the nega-tive 
assertion should be present. 
Lookbehind Assertions 
As with lookahead assertions, you can use negative look-behind assertions 
to specify that something should not be there. There are two syntaxes for 
doing this; the following two expressions both look for a letter that does not 
have a vowel before it. 
As with negative lookahead assertions, this expression looks for letters that 
do not have vowels before them – not letters with non-vowels before them. 
You can have multiple assertions in a row, for example to assert three digits 
in a row that are not “123”. 
When you shouldn't use Regular Expressions 
Regexes are amazing. Beginners are sometimes tempted to regexify 
everything. This is a mistake. In many cases, it is better to use a parser 
instead. Here are some such cases: 
Parsing XML / HTML. There are many technical reasons why HTML and 
XML should never be parsed using regex, but all far too long winded for 
a cheatsheet - see the Chomsky hierarchy for an explanation. Instead of 
using regular expressions, you can simply use an XML or HTML parser 
instead, of which there are many available for almost every language. 
Email addresses. The RFCs specifying what makes a valid email address 
ABOUT the Author Recommended Book 
Browse our collection of over 150 Free Cheat Sheets 
Upcoming Refcardz 
Free PDF 
DZone, Inc. 
150 Preston Executive Dr. 
Suite 201 
Cary, NC 27513 
888.678.0399 
919.678.0300 
Refcardz Feedback Welcome 
refcardz@dzone.com 
Sponsorship Opportunities 
sales@dzone.com 
/(?<![aeiou]+)w/ 
/(?!=[aeiou]+)w/ 
Copyright © 2013 DZone, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval 
system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior 
written permission of the publisher. 
$7.95 
Version 1.0 
DZone communities deliver over 6 million pages each month to 
more than 3.3 million software developers, architects and decision 
makers. DZone offers something for everyone, including news, 
tutorials, cheat sheets, blogs, feature articles, source code and more. 
“"DZone is a developer's dream",” says PC Magazine. 
and what does not are extremely complicated. A fairly famous regular 
expression which satisfies RFC-822 is over 6,500 characters long: (see 
http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html). Instead of 
using regular expressions: 
1. split the email address into two parts (address and domain) 
2. check for control characters 
3. check for domain validity 
URLs. While there are standards (e.g. http://url.spec.whatwg.org/), they are 
often broken (e.g. Wikipedia uses brackets in their URLs). 
For string operations. If you’re just checking whether a string contains 
a certain word, regular expressions are overkill. Instead, use a function 
designed specifically for working with strings. 
Huge library of community-contributed regexes, rated by users: 
http://regexlib.com/ 
Complete tutorial for beginners, with exercises: 
http://regex.learncodethehardway.org/book/ 
Regex Tuesday: over a dozen regex challenges, with varying levels of 
difficulty: http://callumacrae.github.io/regex-tuesday/ 
This cookbook provides more than 100 recipes 
to help you crunch data and manipulate text 
with regular expressions. Every programmer 
can find uses for regular expressions, but their 
power doesn't come worry-free. Even seasoned 
users often suffer from poor performance, false 
positives, false negatives, or perplexing bugs. 
Regular Expressions Cookbook offers step-by-step 
instructions for some of the most common 
tasks involving this tool, with recipes for C#, Java, 
JavaScript, Perl, PHP, Python, Ruby, and VB.NET. 
BUY NOW! 
Callum Macrae is an experienced front-end developer with a love of all things 
JavaScript and the author of Learning from jQuery. He regularly contributes 
JavaScript-related patches to open source projects, primarily phpBB where he 
is a team member. He also runs the Regex Tuesday Challenges, where regex 
challenges are still occasionally posted. When not coding or writing, Callum 
spends most of his time drumming or cycling. 
Wordpress 
Javascript Debugging 
OpenLayers 
Play 
RESOURCES

More Related Content

What's hot

Lecture 23
Lecture 23Lecture 23
Lecture 23rhshriva
 
3.2 javascript regex
3.2 javascript regex3.2 javascript regex
3.2 javascript regexJalpesh Vasa
 
Textpad and Regular Expressions
Textpad and Regular ExpressionsTextpad and Regular Expressions
Textpad and Regular ExpressionsOCSI
 
Regular expressions in Perl
Regular expressions in PerlRegular expressions in Perl
Regular expressions in PerlGirish Manwani
 
Perl programming language
Perl programming languagePerl programming language
Perl programming languageElie Obeid
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressionsBen Brumfield
 
Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionProf. Wim Van Criekinge
 
Introduction to Perl - Day 2
Introduction to Perl - Day 2Introduction to Perl - Day 2
Introduction to Perl - Day 2Dave Cross
 
Php String And Regular Expressions
Php String  And Regular ExpressionsPhp String  And Regular Expressions
Php String And Regular Expressionsmussawir20
 
Finaal application on regular expression
Finaal application on regular expressionFinaal application on regular expression
Finaal application on regular expressionGagan019
 
Ruby for Java Developers
Ruby for Java DevelopersRuby for Java Developers
Ruby for Java DevelopersRobert Reiz
 
Regular Expression
Regular ExpressionRegular Expression
Regular ExpressionLambert Lum
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
 
Ruby_Coding_Convention
Ruby_Coding_ConventionRuby_Coding_Convention
Ruby_Coding_ConventionJesse Cai
 

What's hot (20)

Lecture 23
Lecture 23Lecture 23
Lecture 23
 
3.2 javascript regex
3.2 javascript regex3.2 javascript regex
3.2 javascript regex
 
Textpad and Regular Expressions
Textpad and Regular ExpressionsTextpad and Regular Expressions
Textpad and Regular Expressions
 
Regular expressions in Perl
Regular expressions in PerlRegular expressions in Perl
Regular expressions in Perl
 
Perl programming language
Perl programming languagePerl programming language
Perl programming language
 
Bioinformatica p2-p3-introduction
Bioinformatica p2-p3-introductionBioinformatica p2-p3-introduction
Bioinformatica p2-p3-introduction
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressions
 
Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introduction
 
Perl Scripting
Perl ScriptingPerl Scripting
Perl Scripting
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Introduction to Perl - Day 2
Introduction to Perl - Day 2Introduction to Perl - Day 2
Introduction to Perl - Day 2
 
Php String And Regular Expressions
Php String  And Regular ExpressionsPhp String  And Regular Expressions
Php String And Regular Expressions
 
Finaal application on regular expression
Finaal application on regular expressionFinaal application on regular expression
Finaal application on regular expression
 
Bioinformatics p2-p3-perl-regexes v2014
Bioinformatics p2-p3-perl-regexes v2014Bioinformatics p2-p3-perl-regexes v2014
Bioinformatics p2-p3-perl-regexes v2014
 
Ruby for Java Developers
Ruby for Java DevelopersRuby for Java Developers
Ruby for Java Developers
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular Expressions
 
Regex lecture
Regex lectureRegex lecture
Regex lecture
 
Perl Introduction
Perl IntroductionPerl Introduction
Perl Introduction
 
Ruby_Coding_Convention
Ruby_Coding_ConventionRuby_Coding_Convention
Ruby_Coding_Convention
 

Viewers also liked

Biostats Origional
Biostats OrigionalBiostats Origional
Biostats Origionalsanchitbaba
 
Guj pakshik kushi_mahotsav
Guj pakshik kushi_mahotsavGuj pakshik kushi_mahotsav
Guj pakshik kushi_mahotsavHimesh Prajapati
 
Miles Corporate ppt v3 May 2012
Miles Corporate ppt v3 May 2012Miles Corporate ppt v3 May 2012
Miles Corporate ppt v3 May 2012Akash Anand
 
Data warehousing
Data warehousingData warehousing
Data warehousingkeeyre
 
Advanced Organic Chemistry
Advanced Organic ChemistryAdvanced Organic Chemistry
Advanced Organic Chemistrysanchitbaba
 
Miles Corporate 20072011
Miles Corporate 20072011Miles Corporate 20072011
Miles Corporate 20072011Akash Anand
 
IFN Islamic Finance News- Special Report
IFN Islamic Finance News- Special ReportIFN Islamic Finance News- Special Report
IFN Islamic Finance News- Special ReportAkash Anand
 
Miles Corporate Presentation
Miles Corporate PresentationMiles Corporate Presentation
Miles Corporate PresentationAkash Anand
 
Partnering with Miles
Partnering with MilesPartnering with Miles
Partnering with MilesAkash Anand
 
Desafios do Desenvolvimento de Front-end em um e-commerce
Desafios do Desenvolvimento de Front-end em um e-commerceDesafios do Desenvolvimento de Front-end em um e-commerce
Desafios do Desenvolvimento de Front-end em um e-commerceEduardo Shiota Yasuda
 
Lead drug discovery
Lead drug discoveryLead drug discovery
Lead drug discoverysanchitbaba
 
Advanced organic chemistry
Advanced organic chemistryAdvanced organic chemistry
Advanced organic chemistrysanchitbaba
 
Anti Parkinsonism
Anti ParkinsonismAnti Parkinsonism
Anti Parkinsonismsanchitbaba
 

Viewers also liked (15)

Biostats Origional
Biostats OrigionalBiostats Origional
Biostats Origional
 
Guj pakshik kushi_mahotsav
Guj pakshik kushi_mahotsavGuj pakshik kushi_mahotsav
Guj pakshik kushi_mahotsav
 
Miles Corporate ppt v3 May 2012
Miles Corporate ppt v3 May 2012Miles Corporate ppt v3 May 2012
Miles Corporate ppt v3 May 2012
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Advanced Organic Chemistry
Advanced Organic ChemistryAdvanced Organic Chemistry
Advanced Organic Chemistry
 
Miles Corporate 20072011
Miles Corporate 20072011Miles Corporate 20072011
Miles Corporate 20072011
 
IFN Islamic Finance News- Special Report
IFN Islamic Finance News- Special ReportIFN Islamic Finance News- Special Report
IFN Islamic Finance News- Special Report
 
Ucits
UcitsUcits
Ucits
 
Miles Corporate Presentation
Miles Corporate PresentationMiles Corporate Presentation
Miles Corporate Presentation
 
Partnering with Miles
Partnering with MilesPartnering with Miles
Partnering with Miles
 
Desafios do Desenvolvimento de Front-end em um e-commerce
Desafios do Desenvolvimento de Front-end em um e-commerceDesafios do Desenvolvimento de Front-end em um e-commerce
Desafios do Desenvolvimento de Front-end em um e-commerce
 
Lead drug discovery
Lead drug discoveryLead drug discovery
Lead drug discovery
 
Advanced organic chemistry
Advanced organic chemistryAdvanced organic chemistry
Advanced organic chemistry
 
Front-end Culture @ Booking.com
Front-end Culture @ Booking.comFront-end Culture @ Booking.com
Front-end Culture @ Booking.com
 
Anti Parkinsonism
Anti ParkinsonismAnti Parkinsonism
Anti Parkinsonism
 

Similar to Regular expressions

Regular expressions
Regular expressionsRegular expressions
Regular expressionsRaghu nath
 
Regular Expressions and You
Regular Expressions and YouRegular Expressions and You
Regular Expressions and YouJames Armes
 
Maxbox starter20
Maxbox starter20Maxbox starter20
Maxbox starter20Max Kleiner
 
Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017Sandy Smith
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondMax Shirshin
 
Can someone please fix my code for a hashtable frequencey counter I.pdf
Can someone please fix my code for a hashtable frequencey counter I.pdfCan someone please fix my code for a hashtable frequencey counter I.pdf
Can someone please fix my code for a hashtable frequencey counter I.pdfanjandavid
 
Don't Fear the Regex - CapitalCamp/GovDays 2014
Don't Fear the Regex - CapitalCamp/GovDays 2014Don't Fear the Regex - CapitalCamp/GovDays 2014
Don't Fear the Regex - CapitalCamp/GovDays 2014Sandy Smith
 
Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015Sandy Smith
 
Don't Fear the Regex LSP15
Don't Fear the Regex LSP15Don't Fear the Regex LSP15
Don't Fear the Regex LSP15Sandy Smith
 
Regular expressions in Python
Regular expressions in PythonRegular expressions in Python
Regular expressions in PythonSujith Kumar
 
Class 5 - PHP Strings
Class 5 - PHP StringsClass 5 - PHP Strings
Class 5 - PHP StringsAhmed Swilam
 
Stop overusing regular expressions!
Stop overusing regular expressions!Stop overusing regular expressions!
Stop overusing regular expressions!Franklin Chen
 
Strings,patterns and regular expressions in perl
Strings,patterns and regular expressions in perlStrings,patterns and regular expressions in perl
Strings,patterns and regular expressions in perlsana mateen
 

Similar to Regular expressions (20)

Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Regular Expressions and You
Regular Expressions and YouRegular Expressions and You
Regular Expressions and You
 
Maxbox starter20
Maxbox starter20Maxbox starter20
Maxbox starter20
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Quick start reg ex
Quick start reg exQuick start reg ex
Quick start reg ex
 
Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And Beyond
 
2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex
 
Can someone please fix my code for a hashtable frequencey counter I.pdf
Can someone please fix my code for a hashtable frequencey counter I.pdfCan someone please fix my code for a hashtable frequencey counter I.pdf
Can someone please fix my code for a hashtable frequencey counter I.pdf
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Don't Fear the Regex - CapitalCamp/GovDays 2014
Don't Fear the Regex - CapitalCamp/GovDays 2014Don't Fear the Regex - CapitalCamp/GovDays 2014
Don't Fear the Regex - CapitalCamp/GovDays 2014
 
Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015
 
JavaScript.pptx
JavaScript.pptxJavaScript.pptx
JavaScript.pptx
 
First steps in C-Shell
First steps in C-ShellFirst steps in C-Shell
First steps in C-Shell
 
Don't Fear the Regex LSP15
Don't Fear the Regex LSP15Don't Fear the Regex LSP15
Don't Fear the Regex LSP15
 
Regular expressions in Python
Regular expressions in PythonRegular expressions in Python
Regular expressions in Python
 
Adv. python regular expression by Rj
Adv. python regular expression by RjAdv. python regular expression by Rj
Adv. python regular expression by Rj
 
Class 5 - PHP Strings
Class 5 - PHP StringsClass 5 - PHP Strings
Class 5 - PHP Strings
 
Stop overusing regular expressions!
Stop overusing regular expressions!Stop overusing regular expressions!
Stop overusing regular expressions!
 
Strings,patterns and regular expressions in perl
Strings,patterns and regular expressions in perlStrings,patterns and regular expressions in perl
Strings,patterns and regular expressions in perl
 

Recently uploaded

A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts servicesonalikaur4
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Deliverybabeytanya
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of indiaimessage0108
 
Complet Documnetation for Smart Assistant Application for Disabled Person
Complet Documnetation   for Smart Assistant Application for Disabled PersonComplet Documnetation   for Smart Assistant Application for Disabled Person
Complet Documnetation for Smart Assistant Application for Disabled Personfurqan222004
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Roomdivyansh0kumar0
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
Denver Web Design brochure for public viewing
Denver Web Design brochure for public viewingDenver Web Design brochure for public viewing
Denver Web Design brochure for public viewingbigorange77
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Lucknow
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Roomdivyansh0kumar0
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 

Recently uploaded (20)

Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of india
 
Complet Documnetation for Smart Assistant Application for Disabled Person
Complet Documnetation   for Smart Assistant Application for Disabled PersonComplet Documnetation   for Smart Assistant Application for Disabled Person
Complet Documnetation for Smart Assistant Application for Disabled Person
 
sasti delhi Call Girls in munirka 🔝 9953056974 🔝 escort Service-
sasti delhi Call Girls in munirka 🔝 9953056974 🔝 escort Service-sasti delhi Call Girls in munirka 🔝 9953056974 🔝 escort Service-
sasti delhi Call Girls in munirka 🔝 9953056974 🔝 escort Service-
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
 
Denver Web Design brochure for public viewing
Denver Web Design brochure for public viewingDenver Web Design brochure for public viewing
Denver Web Design brochure for public viewing
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 

Regular expressions

  • 1. DZone, Inc. | www.dzone.com By Mike Keith #127 Getting Started with JPA 2.0 Get More Refcardz! Visit refcardz.com CONTENTS INCLUDE: n Whats New in JPA 2.0 n JDBC Properties n Access Mode n Mappings n Shared Cache n Additional API and more... Getting Started with JPA 2.0
  • 2. #196 DZone, Inc. | www.dzone.com Get More Refcardz! Visit refcardz.com Regular Expressions By: Callum Macrae About Regular Expressions A regular expression, also known as a regex or regexp, is a way of defining a search pattern. Think of regexes as wildcards on steroids. Using wildcards, *.txt matches anything followed by the string “.txt”. But regular expressions let you e.g.: match specific characters, refer back to previous matches in the expression, include conditionals within the expression, and much more. This Refcard assumes basic familiarity with program structures, but no prior knowledge of regular expressions. It would be wise to play with as many of the examples as possible, since mastering regex requires a good deal of practical experience. There is no single standard that defines what regular expressions must and must not do, so this Refcard does not cover every regex version or ‘flavor’. Parts of this Refcard will apply to some flavors (for example, PHP and Perl), while the same syntax will throw parse errors with other flavors (for example, JavaScript, which has a surprisingly poor regex engine). Regular expressions are supported in most programming and scripting languages, including Java, .NET, Perl, Python, PHP, JavaScript, MySQL, Ruby, and XML. Languages that do not natively support regular expressions nearly always have libraries available to add support. If you prefer one regex flavor to the flavor the language natively supports, libraries are often available which will add support for your preferred flavor (albeit often using a slightly different syntax to the native expressions). The Structure of a Regular Expression All regular expressions follow the same basic structure: expression plus flag. In the following example, "regexp?" is the expression, and "mg" is the flag used. /regexp?/mg Certain programming languages may also allow you to specify your expression as a string. This allows you to manipulate the expression programmatically – something you are very likely to do, once you get the hang of writing regexes. For example, the following code will specify the same regular expression as above in JavaScript: new RegExp(“regexp?”, “mg”); // returns /regexp?/mg Basic Features Normal Characters A regular expression containing no special characters ($()*+.?[^|) will match exactly what is contained within the expression. The following expression will match the "world" in "Hello world!": “Hello world!”.match(/world/); // “world” These characters are treated as special characters: $()*+.?[^| and need to be escaped using the backslash (“”) character: “Price: $20”.match(/$20/); // “$20” (For more on handling special characters, see page 2 below.) Character Types Regex includes character types that can be used to match a range of commonly used characters: Type Description d Decimal digit: [0-9] D Not a decimal digit: [^0-9] s Whitespace character: [ nrtf] S Not a whitespace character: [^ nrtf] w Word character: [a-zA-Z0-9_] W Not a word character: [^a-zA-Z0-9_] Comments Comments in regex are inserted like this: “Hello world!”.match(/wor(?#comment)ld/); // “world” Many flavors of regex (including Java, .Net, Perl, Python, Ruby, and XPath) include a free-spacing mode (enabled using the x flag). This lets you use just a hash character to start a comment: / (0?d|1[0-2]) # Month / # Separator (20dd) # Year between 2000 and 2099 /x CONTENTS INCLUDE: ❱ About Regular Expressions ❱ Advanced Features ❱ The Dot Operator ❱ Quantifiers ❱ Flags ❱ Character Classes... and more! Regular Expressions
  • 3. 2 Regular Expressions To provide immediate feedback on regex match results, this Refcard will include matches in comments, like this: DZone, Inc. | www.dzone.com “Regex is awesome”.match(/regexp?/i); // “Regex” The Dot Operator The dot character (".") can be used to match any character except for new line characters (n and r). Be careful with the dot operator: it may match more than you would intuitively guess. For example, the following regular expression will match the letter c, followed by any character that isn't a new line, followed by the letter t: “cat”.match(/c.t/); // “cat” The dot operator will only match one character (“a”). Contrast this simple usage of the dot operator with the following, dangerous usage: /<p class=”(.+)”>/ That expression initially looks like it would match an opening HTML paragraph tag, make sure that the only attribute is the class attribute, and store the class attribute in a group. The “+” is used to quantify the dot operator: the “+” says ‘match one or more’. Most of the expression is treated as literal text, and the (.+) looks for a string of any character of any length and stores it in a group. However, it doesn’t work. Here’s what it matches instead: “<p class=’test’ onclick=’evil()’>”.match(/<p class=’(.+)’>/); // [“<p class=’test’ onclick=’evil()’>”, // “test’ onclick=’evil()”] Instead of matching just the class, it has matched the on-click attribute. This is because the dot operator, matched with the “+” quantifier, picked up everything before the closing greater-than. The first lesson: the dot operator can be a clumsy weapon. Always use the most targeted feature of regex you can. The second lesson: quantifiers are ex-tremely powerful, especially with the dot operator. The Pipe Character The pipe character causes a regular expression to match either the left side or the right side. For example, the following expression will match either "abc" or "def": /abc|def/ Anchors Regex anchors match specific, well-defined points, like the start of the expression or the end of a word. Here are the most important: Char Description Example ^ The caret character matches the beginning of a string or the beginning of a line in multi-line mode. /^./ Matches "a" in "abcndef", or both "a" and "d" in mul-ti- line mode. $ The dollar character matches the end of a string, or the end of a line in multi-line mode. /.$/ Matches "f" in "abcndef", or both "c" and "f" in mul-ti- line mode. A Always matches the start of a string, irrespective of multi-line mode. /A./ Matches "a" in "abcndef". Z Always matches the end of a string, irrespective of multi-line mode. If the last character is a line break, it will match the character before. /.Z/ Matches "f" in "abc ndefn". Matches "f" in "abcndef" Char Description Example z Always matches the end of a string, irrespective of multi-line mode. Will always match the last character. /.z/ Matches "f" in "abcndef". Matches "n" in "abc ndefn" b Matches a word boundary: the position between a word character (w) and a non-word character or the start or end of the string ( W|^|$). /.b/ Matches "o", " " and "d" in "Hello world". B Matches a non-word boundary: the position between two word characters or two non-word characters. /B.B/ Matches every letter but "H" and "o" in "Hello". < Matches the start of a word - similar to b, but only at the start of the word. /<./ Matches "H" and "w" in "Hello world". > Matches the end of a word - similar to b, but only at the end of the word. /.>/ Matches "o" and "d" in "Hello world". Groups Regex groups lump just about anything together. Groups are commonly used to match part of a regular expression and refer back to it later, or to repeat a group multiple times. In this Refcard, groups will be represented like this: “Regex is awesome”.match(/Regex is (awesome)/); // [“Regex is awesome”, “awesome”] Capturing Groups The most basic group, the capturing group, is marked simply by putting parentheses around whatever you want to group: “Hello world!”.match(/Hello (w+)/); // [“Hello world”, “world”] (Note the w character type, introduced on page 1.) Here the capturing group is saved and can be referred to later. Non-capturing Groups There are also non-capturing groups, which will not save the matching text and so cannot be referred back to later. We can get these by putting “?:” after the opening parenthesis of a group: “Hello world!”.match(/Hello (?:w+)/); // [“Hello world”] Using Capturing vs. Non-capturing Groups Use capturing groups only when you will need to refer to the grouped text later. Non-capturing groups save memory and simplify debugging by not filling up the returned array with data that you don't need. Named Capture Groups When a regex includes a named capture group, the func-tion will return an object such that you can refer to the matched text using the name of the group as the property: “Hello world!”.match(/Hello (?<item>w+)/); // {“0”: “Hello world”, “1”: “world”, “item”: “world”} The name of the group is surrounded by angle brackets. Backreferences You can refer back to a previous capturing group by using a backreference. After matching a group, refer back to the text matched in that group by inserting a backward slash followed by the number of the capturing group. For example, the following will match the same character four times in a row:
  • 4. 3 Regular Expressions DZone, Inc. | www.dzone.com “aabbbb”.match(/(w)1{3}/); // bbbb In this example, the group matches the first “b”; then 1 refers to that “b” and matches the letters after it three times (the “3” inside the curly brackets). Capturing groups are numbered in numerical order starting from 1. A more complicated (but quite common) use-case involves matching repeated words. To do this, we first want to match a word. We’ll match a word by grabbing a word boundary, followed by a group capturing the first word, followed by a space, followed by a backreference to match the second word, and then a word boundary to make sure that we’re not matching word-parts plus whole words: “hello hello world!”.match(/b(w+) 1b/); “hello hello” It is possible to have a backreference refer to a named capture group, too, by using k<groupname>. The following regular expression will do the same as above: /b(?<word>w+) k<word>b/ Groups and Operator Precedence Groups can be used to control effective order of operations. For example, the pipe character has the lowest operator precedence, so pipes are often used within groups to con-trol the insertion point of the logical ‘or’. The following expression will match either "abcdef" or "abcghi": /abc(def|ghi)/ Character Classes A character class allows you to match a range or set of characters. The following uses a character class to match any of the contained letters (in this case, any vowel): /[aeiou]/ The following regex will match “c”, followed by a vowel, followed by “t”; it will match “cat” and “cot”, but not “ctt” or “cyt”: /c[aeiou]t/ You can also match a range of characters using a character class. For example, /[a-i]/ will match any of the letters between a and i (inclusive). Character classes work with numbers too. This is particularly useful for matching e.g. date-ranges. The following regular expression will match a date between 1900 and 2013: /19[0-9][0-9]|200[0-9]|201[0-3]/ Ranges and specific matches can be combined: /[a-mxyz]/ will match the letters a through m, plus the letters x, y, and z. Negated Character Classes We can also use character classes to specify characters we don’t want to match. These are called negated character classes, and are created by putting a caret (“^”) at the be-ginning of the class. This indicates that we want to match a character that doesn’t match any of the characters in a character class. The following negated character class will match a “c”, followed by a non-vowel, followed by a “t”. This means that it will match “cyt”, “c0t” and “c.t”, but will not match “cat”, “cot”, “ct” or “cyyt”: /c[^aeiou]t/ Hot Tip Be careful to distinguish between negated character classes (“match not-x”) and simply not matching a character (“do not match x”). Quantifiers Quantifiers are used to repeat something multiple times. The basic quanti-fiers are: + * ? The most basic quantifier is "+", which means that something should be repeated one or more times. For example, the following expression matches one or more vowels: “queueing”.match(/[aeiou]+/); // “ueuei” Note that it doesn’t match the previous match (so not the first “u” multiple times), but rather the previous expression, i.e. another vowel. In order to match the previous match, use a group and a backreference, along with the “*” operator. The “*” operator is similar to the “+” operator, except that “*” matches zero or more instead of one or more of an expression. “ct”.match(/ca*t/); // “ct” “cat”.match(/ca*t/); // “cat” “caat”.match(/ca*t/); // “caat” To match the same character from a match one or more times, use a group, a backreference, and the “*” operator: “saae”.match(/([aeiou])1*/); // “aa” In the string “queuing,” this expression would match only “u”. In “aaeaa”, it would match only the first “aa”. The “?” character is the final basic quantifier. It makes the previous expression optional: “ct”.match(/ca?t/); // “ct” “cat”.match(/ca?t/); // “cat” “caat”.match(/ca?t/); // no match We can make more powerful quantifiers using curly braces. The following expressions will both match exactly five oc-currences of the letter “a”. /^a{5}$/ /^aaaaa$/ The first is a lot more readable, especially with higher numbers or longer groups, where it would be impractical to type them all out. We can use curly braces to repeat something between a range of times: /^a{3,5}$/ That will match the letter “a” repeated 3, 4, or 5 times. If you want to match something repeated up to a certain number of times, you can use 0 as the first number. If you want to match something more than a certain number with no maximum, you can just leave the second number blank: /^a{3,}$/ This will match the letter “a” repeated 3 or more times with no maximum number of repetitions. Greedy vs. Lazy Matches There are two types of matches: greedy or lazy. Most flavors of regular expressions are greedy by default.
  • 5. 4 Regular Expressions DZone, Inc. | www.dzone.com To make a quantifier lazy, append a question mark to it. The difference is best illustrated using an example: ‘He said “Hello” and then “world!”’.match(/”.+”/); // “Hello” and then “world!” ‘He said “Hello” and then “world!”’.match(/”.+?”/); // “Hello” The first match is greedy. It has matched as many characters as it can (between the first and last quotes). The second match is lazy. It has matched as few characters as possible (between the first and second quotes). Note that the question mark does not make the quantifier optional (which is what the question mark is usually used for in regular expressions). More examples of greedy vs. lazy quantification: ‘He said “Hello” and then “world!”’.match(/”.+?”/); // “Hello” ‘He said “Hello” and then “world!”’.match(/”.*?”/); // “Hello” “aaa”.match(/aa?a/); // “aaa” “aaa”.match(/aa??a/); // “aa” ‘He said “Hello” and then “world!”’. match(/”.{2,}?”/); // “Hello” Note the difference the question mark makes. Hot Tip Be aware of differences between greedy and lazy matching when writing regular expressions. A greedy expression can be rewritten into a shorter lazy expression. For regex beginners, the difference often accounts for unexpected, counterintuitive results. Flags Flags (also known as "modifiers") can be used to modify how the regular expression engine will treat the regular expression. In most flavors of regular expression, flags are specified after the final forward slash, like this: /regexp?/mg In this case, the flags are m and g. Flag support does vary among regex flavors, so check the documentation for the regex engine in your language or library. The most common flags are: i, g, m, s, and x. The i flag is the case insensitive flag. This tells the regex engine to ignore the case. For example, the character class [a-z] will match [a-zA-Z] when the i flag is used and the literal character “c” will also match C. “CAT”.match(/c[aeiou]t/); // no match “CAT”.match(/c[aeiou]t/i); // “CAT” The g flag is the global flag. This tells the regex engine to match every instance of the match, not just the first (which is the default). For example, when replacing a match with a string in JavaScript, using the global flag means that every match will be replaced, not just the first. The following code (in JavaScript) demonstrates this: “cat cat”.replace(/cat/, “dog”); // “dog cat” “cat cat”.replace(/cat/g, “dog”); // “dog dog” The m flag is the multiline flag. This tells the regex engine to match “^” and “$” to the beginning and end of a line respectively, instead of the beginning and end of the string (which is the default). Note that A and Z will still only match the beginning and end of the string, regardless of what multi-line mode is set to. In the following code example, we see a lazy match from ^ to $ (we’re using [sS] because the dot operator doesn’t match new lines by default). When ^ and $ match the beginning of the line (when multi-line mode is enabled), it returns only one line of text. Otherwise, it returns everything. “1st linen2nd line”.match(/^[sS]+?$/); // “1st linen2nd line” “1st linen2nd line”.match(/^[sS]+?$/m); // “1st line” The s flag is the singleline flag (also called the dotall flag in some flavors). By default, the dot operator will match every character except for new lines. With the s flag enabled, the dot operator will match everything, including new lines. “This isna test”.match(/^.+$/); // no match “This isna test”.match(/^.+$/s); // “This isna test” Hot Tip This behavior can be simulated in languages where the s flag isn’t supported by using [sS] instead of the s flag. The x flag is the extended mode flag, which activates extended mode. Extended mode allows you to make your regular expressions more readable by adding whitespace and comments. Consider the following two expressions, which both do exactly the same thing, but are drastically different in terms of readability. /([01]?d{1,3}|200d|201[0-3])([-/.])(0?[1-9]|1[012])2(0?[1-9]|[12]d|3[01])/ / ([01]?d{1,3}|200d|201[0-3]) # year ([-/.]) # separator (0?[1-9]|1[012]) # month 2 # same separator (0?[1-9]|[12]d|3[01]) # date /x Modifying Flags Mid-Expression In some flavors of regular expressions, it is possible to modify a flag mid-expression using a syntax similar (but not identical) to groups. To enable a flag mid-expression, use “(?i)” where i is the flag you want to activate: “foo bar”.match(/foo bar/); // “foo bar” “foo BAR”.match(/foo bar/); // no match “foo bar”.match(/foo (?i)bar/); // “foo bar” “foo BAR”.match(/foo (?i)bar/); // “foo BAR” To disable a flag, use “(?-i)”, where i is the flag you want to disable: “foo bar”.match(/foo bar/i); // “foo bar” “foo BAR”.match(/foo bar/i); // “foo BAR” “foo bar”.match(/foo (?-i)bar/i); // “foo bar” “foo BAR”.match(/foo (?-i)bar/i); // no match You can activate a flag and then deactivate it: /foo (?i)bar(?-i) world/ And you can activate or deactivate multiple flags at the same time by using “(?ix-gm)” where i and x are the flags we want enabling and g and m the ones we want disabling. Handling Special Characters To handle special characters in a regular expression, either (a) use their ASCII or Unicode numbers (in a variety of forms), or (b) just escape them. Any character that isn’t $()*+.?[]^{|} will match itself, even if it is a special character. For example, the following will match an umlaut: /ß/
  • 6. 5 Regular Expressions To match any of the characters mentioned above ()$()*+.?[]^{|}() you can prepend them with a backward slash to mark them as literal characters: DZone, Inc. | www.dzone.com /$/ It is impractical to put some special characters, such as emoji (which will not display in most operating systems) and invisible characters, right into a regular expression. To solve this problem, when writing regular expressions just use their ASCII or Unicode character codes. The first ASCII syntax allows you to specify the ASCII character code as octal, which must begin with a 0 or it will be parsed as a backreference. To match a space, we can use this: /040/ You can match ASCII characters using hexadecimal, too. The following will also match a space: /x20/ You can also match characters using their Unicode character codes, but only as hexadecimal. The following will match a space: /u0020/ A few characters can also be matched using shorter escapes. We saw a few of them previously (such as t and n for tab and new line characters, respectively – pretty standard), and the following table defines all of them properly: Escape Description Character code a The alarm character (bell). u0007 b Backspace character, but only when in a character class—word boundary otherwise. u0008 e Escape character. u001B f Form feed character. u000C n New line character. u000A r Carriage return character. u000D t Tab character. u0009 v Vertical tab character. u000B Finally, it is possible to specify control characters by pre-pending the letter of the character with c. The following will match the character created by ctrl+c: /cC/ Advanced Features Advanced features vary more widely by flavor, but several are especially useful and worth covering here. POSIX Character Classes POSIX character classes are a special type of character class which allow you to match a common range of characters—such as printing characters—without having to type out an ordinary character class every time. The table below lists all the POSIX character classes available, plus their normal character class equivalents. Class Description Equivalent [:alnum:] Letters and digits [a-zA-Z0-9] Class Description Equivalent [:alpha:] Letters [a-zA-Z] [:ascii:] Character codes 0 - x7F [x00-x7F] [:blank:] Space or tab character [ t] [:cntrl:] Control characters [x00-x1Fx7F] [:digit:] Decimal digits [0-9] [:graph:] Printing characters (minus spaces) [x21-x7E] [:lower:] Lower case letters [a-z] [:print:] Printing characters (including spaces) [x20-x7E] [:punct:] Punctuation: [:graph:] minus [:alnum:] [:space:] White space [ trnvf] [:upper:] Upper case letters [A-Z] [:word:] Digits, letters, and underscores [a-zA-Z0-9_] [:xdigit:] Hexadecimal digits [a-fA-F0-9] aSERTIONS Lookahead Assertions Assertions match characters but don’t return them. Consider the expression “>”, which will match only if there is an end of word (i.e., the engine will look for (W|$)), but won’t actually return the space or punctuation mark following the word. Assertions can accomplish the same thing, but explicitly and therefore with more control. For a positive lookahead assertion, use “(?=x)”. The following assertion will behave exactly like >: /.(?=W|$)/ This will match the “o” and “d” in the string “Hello world” but it will not match the space or the end. Consider the following, more string-centered example. This expression matches “bc” in the string “abcde”: abcde”.match(/[a-z]{2}(?=d)/); // “bc” Or consider the following example, which uses a normal capturing group: “abcde”.match(/[a-z]{2}(d)/); // [“bcd”, “d”] While the expression with the capturing group matches the “d” and then returns it, the assertion matches the character and does not return it. This can be useful for backrefer-ences or to avoid cluttering up the return value. The assertion is the “(?=d)” bit. Pretty much anything you would put in a group in a regular expression can be put into an assertion. Negative Lookahead Assertions Use a negative lookahead assertion (“(?!x)”) when you want to match something that is not followed by something else. For example, the following will match any letter that is not followed by a vowel: This expression returns e, l, and o in hello. Here we notice a difference between positive assertions and negative assertions. The following positive lookahead does not behave the same as the previous negative lookahead: /w(?=[^aeiou])/
  • 7. 6 Regular Expressions That would match only the e and the l in hello. A negative assertion does not behave like a negated character class. A negative assertion simply means that the expression contained within the negative assertion should not be matched—not that something that doesn’t match the nega-tive assertion should be present. Lookbehind Assertions As with lookahead assertions, you can use negative look-behind assertions to specify that something should not be there. There are two syntaxes for doing this; the following two expressions both look for a letter that does not have a vowel before it. As with negative lookahead assertions, this expression looks for letters that do not have vowels before them – not letters with non-vowels before them. You can have multiple assertions in a row, for example to assert three digits in a row that are not “123”. When you shouldn't use Regular Expressions Regexes are amazing. Beginners are sometimes tempted to regexify everything. This is a mistake. In many cases, it is better to use a parser instead. Here are some such cases: Parsing XML / HTML. There are many technical reasons why HTML and XML should never be parsed using regex, but all far too long winded for a cheatsheet - see the Chomsky hierarchy for an explanation. Instead of using regular expressions, you can simply use an XML or HTML parser instead, of which there are many available for almost every language. Email addresses. The RFCs specifying what makes a valid email address ABOUT the Author Recommended Book Browse our collection of over 150 Free Cheat Sheets Upcoming Refcardz Free PDF DZone, Inc. 150 Preston Executive Dr. Suite 201 Cary, NC 27513 888.678.0399 919.678.0300 Refcardz Feedback Welcome refcardz@dzone.com Sponsorship Opportunities sales@dzone.com /(?<![aeiou]+)w/ /(?!=[aeiou]+)w/ Copyright © 2013 DZone, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. $7.95 Version 1.0 DZone communities deliver over 6 million pages each month to more than 3.3 million software developers, architects and decision makers. DZone offers something for everyone, including news, tutorials, cheat sheets, blogs, feature articles, source code and more. “"DZone is a developer's dream",” says PC Magazine. and what does not are extremely complicated. A fairly famous regular expression which satisfies RFC-822 is over 6,500 characters long: (see http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html). Instead of using regular expressions: 1. split the email address into two parts (address and domain) 2. check for control characters 3. check for domain validity URLs. While there are standards (e.g. http://url.spec.whatwg.org/), they are often broken (e.g. Wikipedia uses brackets in their URLs). For string operations. If you’re just checking whether a string contains a certain word, regular expressions are overkill. Instead, use a function designed specifically for working with strings. Huge library of community-contributed regexes, rated by users: http://regexlib.com/ Complete tutorial for beginners, with exercises: http://regex.learncodethehardway.org/book/ Regex Tuesday: over a dozen regex challenges, with varying levels of difficulty: http://callumacrae.github.io/regex-tuesday/ This cookbook provides more than 100 recipes to help you crunch data and manipulate text with regular expressions. Every programmer can find uses for regular expressions, but their power doesn't come worry-free. Even seasoned users often suffer from poor performance, false positives, false negatives, or perplexing bugs. Regular Expressions Cookbook offers step-by-step instructions for some of the most common tasks involving this tool, with recipes for C#, Java, JavaScript, Perl, PHP, Python, Ruby, and VB.NET. BUY NOW! Callum Macrae is an experienced front-end developer with a love of all things JavaScript and the author of Learning from jQuery. He regularly contributes JavaScript-related patches to open source projects, primarily phpBB where he is a team member. He also runs the Regex Tuesday Challenges, where regex challenges are still occasionally posted. When not coding or writing, Callum spends most of his time drumming or cycling. Wordpress Javascript Debugging OpenLayers Play RESOURCES