Ciarán Walsh's PHPNW08 slides:
In the right hands regular expressions can be a powerful tool, but it’s also far too easy for them to be used badly, or in the wrong situations.
This talk will kick off with a look at alternatives to regular expressions, for when the power of pattern matching is not required, and will also go over some cases when there are better alternatives available.
Then there will be a brief refresher on pattern syntax and some general tips and tricks to help when constructing regular expressions, before we go on to look at some situations where the use of pattern matching is a good fit, how to solve some common problems, and some common pitfalls when writing patterns.
4. Regular Expression Basics: Anchors
    ^    Matches at the beginning of a line
    $    Matches at the end of a line
5. Regular Expression Basics: Character Classes
    [abc]     Matches one of ‘a’, ‘b’ or ‘c’
    [a-c]     Same as above (character range)
    [^abc]    Matches one character that is not listed
    .         Matches any single character (except a newline, by default)
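As a quick check of the anchor and character-class syntax above, here is a minimal sketch using preg_match(). The input strings and variable names are made up for illustration and do not come from the slides.

```php
<?php
// Illustrative inputs only; none of these strings come from the slides.
$startsWithCat = preg_match('/^cat/', 'catalog');   // "cat" sits at the start
$endsWithCat   = preg_match('/cat$/', 'catalog');   // "cat" is not at the end
$hasAbc        = preg_match('/[abc]/', 'xyzb');     // contains 'b'
$hasRange      = preg_match('/[a-c]/', 'xyzb');     // same set, written as a range
$notListed     = preg_match('/^[^abc]$/', 'd');     // one character outside the set
$anyChar       = preg_match('/^c.t$/', 'cut');      // '.' stands in for 'u'
$dotNewline    = preg_match('/^c.t$/', "c\nt");     // '.' skips newlines by default
```

preg_match() returns 1 on a match and 0 otherwise, so each variable can be used directly in a condition.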
6. Regular Expression Basics: Alternation
    a|b        Matches one of ‘a’ or ‘b’
    dog|cat    Matches one of “dog” or “cat”
7. Regular Expression Basics: Quantifiers (repetition)
    {x,y}    Matches a minimum of x and a maximum of y occurrences; either can be omitted
    *        Matches zero or more occurrences (any amount). Same as {0,}
    +        Matches one or more occurrences. Same as {1,}
    ?        Matches zero or one occurrences. Same as {0,1}
8. Regular Expression Basics: Grouping
    (…)    Groups the contents of the parentheses.
           Affects alternation and quantifiers.
           Allows parts of the match to be captured.
    for|backward      “for” or “backward”
    (for|back)ward    “forward” or “backward”
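The difference between for|backward and (for|back)ward is easier to see when run. The sketch below adds ^ and $ anchors and uses test strings such as 'fortune' and 'colou?r'. These are illustrative assumptions, not taken from the slides.

```php
<?php
// Without a group, alternation splits the whole pattern: "^for" OR "backward$".
$loose = preg_match('/^for|backward$/', 'fortune');     // matches via "^for"

// With a group, only "for"/"back" alternate; "ward" is always required.
$strict = preg_match('/^(for|back)ward$/', 'fortune');  // no match

// The group also captures the alternative that matched.
preg_match('/^(for|back)ward$/', 'backward', $m);
$captured = $m[1];                                      // 'back'

// Quantifiers apply to the preceding item (or group).
$tooMany  = preg_match('/^a{2,3}$/', 'aaaa');           // four 'a's exceed {2,3}
$optional = preg_match('/^colou?r$/', 'color');         // 'u' is optional
$repeated = preg_match('/^(ab)+$/', 'ababab');          // group repeated three times
```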
13. Testing for a Substring
    if (preg_match('/foo/', $var))
    if (strpos($var, 'foo') !== false)

    if (preg_match('/foo/i', $var))
    if (stripos($var, 'foo') !== false)
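The strict !== false comparison in the strpos() form matters because strpos() returns the integer 0 when the needle is found at the very start of the string. A small sketch, with a made-up value for $var:

```php
<?php
$var = 'foobar';                        // illustrative value, not from the slides

$pos = strpos($var, 'foo');             // int 0: found at offset zero
$looseCheck  = (bool) $pos;             // false, because 0 is falsy (wrong answer)
$strictCheck = ($pos !== false);        // true (the form shown on the slide)

// Case-insensitive variants: both report a match for 'FOObar'.
$viaStripos = stripos('FOObar', 'foo') !== false;
$viaRegex   = preg_match('/foo/i', 'FOObar') === 1;
```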
19. Using Regular Expressions
    Postcodes
        /[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}/
    IP Addresses
        @^({1,2})/({1,2})/({4})$@
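The postcode pattern can be tried as follows. The sample inputs are illustrative. The second, anchored pattern is an assumption added here to show how ^ and $ change the result when the input carries extra characters. It is not part of the slide.

```php
<?php
$pattern  = '/[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}/';    // as on the slide
$anchored = '/^[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}$/';  // assumption: anchors added

// Hypothetical sample inputs for illustration.
$inputs = ['M1 1AA', 'SW1A 1AA', 'M1 1AAX', 'hello'];

$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        'slide'    => preg_match($pattern, $input) === 1,
        'anchored' => preg_match($anchored, $input) === 1,
    ];
}
```

Without anchors, 'M1 1AAX' still matches because the pattern finds 'M1 1AA' inside it. Whether that is acceptable depends on whether the goal is to find a postcode in text or to validate a whole field.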
21. You don’t need to use /…/ to denote a pattern!
    preg_match('/<b><s>.+<\/s>.+<\/b>/', $html)
    preg_match('@<b><s>.+</s>.+</b>@', $html)
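Choosing a delimiter that does not occur in the pattern removes the need to escape it. The sketch below runs both forms against a made-up $html value and should give the same result for each. The sample markup is an assumption for illustration.

```php
<?php
$html = '<b><s>old</s> new</b>';    // illustrative input, not from the slides

// With "/" as the delimiter, every literal slash has to be escaped.
$withSlashes = preg_match('/<b><s>.+<\/s>.+<\/b>/', $html);

// With "@" as the delimiter, the closing tags can be written as-is.
$withAt = preg_match('@<b><s>.+</s>.+</b>@', $html);
```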
22. Greediness
    $html = <<<HTML
    <span>some text</span><span>some more text!</span>
    HTML;

    preg_match("@<span>(.+)</span>@", $html, $matches);
    echo $matches[0];

    preg_match("@<span>(.+?)</span>@", $html, $matches);
    echo $matches[0];
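To make the difference concrete, the sketch below keeps the two results in separate variables ($greedy and $lazy, names chosen here for illustration). It also shows a negated character class as one possible alternative to the lazy quantifier. That third pattern is an addition for comparison, not something from the slides.

```php
<?php
$html = '<span>some text</span><span>some more text!</span>';

// Greedy: .+ grabs as much as it can, so the match runs to the LAST </span>.
preg_match('@<span>(.+)</span>@', $html, $greedy);

// Lazy: .+? grabs as little as it can, so the match stops at the FIRST </span>.
preg_match('@<span>(.+?)</span>@', $html, $lazy);

// Alternative (assumption, not from the slides): forbid '<' inside the capture.
preg_match('@<span>([^<]+)</span>@', $html, $negated);
```

Here $greedy[0] spans both elements, while $lazy[0] and $negated[0] contain only the first one.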
23. You can make your pattern readable!
    preg_match('`^(+)://(?:(.+?):(.+?)@)?(.+?)(+)$`', $s, $matches)

    preg_match('`
        ^
        (+)://        # Protocol
        (?:
            (.+?)     # Username
            :         # :
            (.+?)     # Password
            @         # @
        )?            # Username/password are optional
        (.+?)         # Hostname
        (+)           # Top-level domain
        $
    `x', $s, $matches);
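A runnable sketch of the same idea using the x modifier, which makes the engine ignore whitespace in the pattern and treat # as the start of a comment. The character classes below ([a-z]+, [^:@/]+ and so on) and the sample URLs are assumptions chosen for illustration. They are not the slide's own pattern, and this is not a complete URL validator.

```php
<?php
// Assumed character classes throughout; only the overall layout follows the slide.
$pattern = '`
    ^
    ([a-z]+)://     # Protocol
    (?:
        ([^:@/]+)   # Username
        :
        ([^@/]+)    # Password
        @
    )?              # Username/password are optional
    ([^/]+?)        # Hostname
    (\.[a-z]+)      # Top-level domain
    $
`x';

// Hypothetical inputs for illustration.
preg_match($pattern, 'ftp://user:secret@example.com', $withLogin);
preg_match($pattern, 'http://www.example.org', $withoutLogin);
```

With the first input, $withLogin holds 'ftp', 'user', 'secret', 'example' and '.com' in groups 1 to 5. With the second, the optional username and password groups come back as empty strings. Note that the comments must not contain the delimiter character (a backtick here), since it would end the pattern early.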