Regular Expressions 2007

Beginners guide to using Regular Expressions in PHP

Presentation Transcript

  • Regular Expressions for PHP: Adding magic to your programming. Geoffrey Dunn (geoff@warmage.com)
  • What are Regular Expressions
    • Regular expressions are a syntax to match text.
    • They date back to mathematical notation made in the 1950s.
    • They became embedded in Unix systems through tools like ed and grep.
  • What are RE
    • Perl in particular promoted the use of very complex regular expressions.
    • They are now available in all popular programming languages.
    • They allow much more complex matching than strpos()
  • Why use RE
    • You can use RE to enforce rules on formats like phone numbers, email addresses or URLs.
    • You can use them to find key data within logs, configuration files or webpages.
  • Why use RE
    • They can quickly make complex replacements, such as finding all email addresses in a page and rewriting them as address [AT] site [dot] com.
    • You can make your code really hard to understand
  • Syntax basics
    • In PHP's preg functions the entire regular expression sits between two delimiters, conventionally forward slashes (/).
    • abc - most characters match themselves. This looks for the exact character sequence a, b and then c.
    • . - a period matches any single character (except a newline, unless the s modifier is set).
    • [abc] - square brackets match any one of the characters inside. Here: a, b or c.
  • Syntax basics
    • ? - marks the previous item as optional, so a? means there might be an a.
    • (abc)* - parentheses group patterns and the asterisk marks zero or more of the previous item, here the whole group. So this would match an empty string or abcabcabcabc.
    • \.+ - the backslash is an all-purpose escape character, so \. is a literal period. The + marks one or more of the previous item. So this would match ......
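The basic syntax above can be tried out directly with preg_match(), which returns 1 when the pattern is found and 0 when it is not. The following is a minimal sketch; the test strings and the labels in $checks are made up for illustration, and the colou?r pattern is borrowed from a later slide.

```php
<?php
// Each pattern is delimited by forward slashes, as on the slides.
$checks = [
    'literal sequence' => preg_match('/abc/', 'xxabcxx'),      // finds "abc" inside the text
    'any character'    => preg_match('/a.c/', 'a-c'),          // "." stands in for "-"
    'dot vs newline'   => preg_match('/a.c/', "a\nc"),         // "." skips newlines by default
    'dot with s flag'  => preg_match('/a.c/s', "a\nc"),        // the s modifier lets "." match "\n"
    'character class'  => preg_match('/[abc]/', 'zzbzz'),      // one of a, b or c
    'optional item'    => preg_match('/^colou?r$/', 'color'),  // the "u" may be absent
    'repeated group'   => preg_match('/^(abc)*$/', 'abcabc'),  // group repeated twice
    'empty is allowed' => preg_match('/^(abc)*$/', ''),        // zero repetitions
    'escaped period'   => preg_match('/^\.+$/', '......'),     // literal periods only
    'escape matters'   => preg_match('/^\.+$/', 'abc'),        // letters are not periods
];

// Print one line per check so the results can be compared with the comments.
foreach ($checks as $label => $result) {
    echo str_pad($label, 18), $result === 1 ? 'match' : 'no match', PHP_EOL;
}
```

Single-quoted PHP strings are used for the patterns so that the backslash reaches the regular expression engine unchanged.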
  • More syntax tricks
    • [0-4] - match any digit from 0 to 4.
    • [^0-4] - match any character that is not a digit from 0 to 4.
    • \sword\s - match word where there is white space before and after.
    • \bword\b - \b marks a word boundary. This could be at white space, a new line, punctuation, or the start or end of the string.
  • More syntax tricks
    • \d{3,12} - \d matches any digit ([0-9]) while the braces mark the min and max count of the previous item. In this case 3 to 12 digits.
    • [a-z]{8,} - must be at least 8 lowercase letters.
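The difference between \s and \b, and the effect of ranges and braces, may be easier to see side by side. This is a small sketch with made-up sample strings; the variable names are only for illustration.

```php
<?php
// \sword\s needs white space on both sides, so it misses a word at the end of the text.
$with_spaces   = preg_match('/\sword\s/', 'the last word');   // nothing follows "word"
// \bword\b only needs a boundary, and the end of the string counts as one.
$with_boundary = preg_match('/\bword\b/', 'the last word');
// A boundary also prevents matches inside longer words.
$inside_word   = preg_match('/\bword\b/', 'swordsman');

// Ranges and negated ranges inside square brackets.
$low_digit     = preg_match('/^[0-4]$/', '3');
$not_low_digit = preg_match('/^[^0-4]$/', '7');               // 7 is outside 0-4

// Counted repetition with braces.
$digits_ok     = preg_match('/^\d{3,12}$/', '12345');         // five digits
$digits_short  = preg_match('/^\d{3,12}$/', '12');            // fewer than three
$letters_ok    = preg_match('/^[a-z]{8,}$/', 'password');     // exactly eight letters
```

Here only $with_spaces, $inside_word and $digits_short come back as 0; the rest are 1.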
  • Matching Text
    • Simple check: preg_match('/^[a-z0-9]+@([a-z0-9]+\.)*[a-z0-9]+$/i', $email_address) > 0
    • Finding: preg_match('/colou?r:\s+([a-zA-Z]+)/', $text, $matches); echo $matches[1];
    • Find all: preg_match_all('/<([^>]+)>/', $html, $tags); echo $tags[1][1]; ($tags[0] holds the full matches, $tags[1] the first group)
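The three calls above could be put together roughly as follows. The helper name looks_like_email() and the sample strings are assumptions for illustration, and the type declarations assume PHP 7 or later. The email pattern is the loose one from the slide, not a full validator: it rejects a dot in the part before the @, for example.

```php
<?php
// Hypothetical helper wrapping the "simple check" pattern.
function looks_like_email(string $email_address): bool
{
    return preg_match('/^[a-z0-9]+@([a-z0-9]+\.)*[a-z0-9]+$/i', $email_address) === 1;
}

// Finding: index 0 holds the whole match, index 1 the first parenthesised group.
$colour = null;
$text = 'Settings: colour: red, size: large';
if (preg_match('/colou?r:\s+([a-zA-Z]+)/', $text, $matches)) {
    $colour = $matches[1];
}

// Find all: by default results are grouped per pattern group,
// so $tags[0] lists whole matches and $tags[1] lists the first group.
$html = '<p>Some <b>bold</b> text</p>';
preg_match_all('/<([^>]+)>/', $html, $tags);
$tag_names = $tags[1];
```

With this input $colour ends up as "red" and $tag_names holds p, b, /b and /p in that order.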
  • Matching Lines
    • This is more for looking through files but could be for any array of text.
    • $new_lines = preg_grep('/Jan[a-z]*[\s\/-](20)?07/', $old_lines);
    • To get the lines that do not match, add a third parameter of PREG_GREP_INVERT rather than complicating your regular expression into something like /^[^\/]|(\/[^p])|(\/p[^r]) etc...
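A short sketch of both directions of preg_grep() follows. The log lines are invented for illustration. One detail worth knowing is that preg_grep() keeps the keys of the input array, so the result may have gaps in its numbering.

```php
<?php
// Made-up log lines for illustration.
$old_lines = [
    'Jan 07 server started',
    'January-2007 backup complete',
    'Feb 07 server restarted',
    'Jan/07 disk check',
];

$pattern = '/Jan[a-z]*[\s\/-](20)?07/';

// Lines that match the pattern (original keys 0, 1 and 3 are kept).
$january = preg_grep($pattern, $old_lines);

// Lines that do NOT match, via the third parameter.
$remaining = preg_grep($pattern, $old_lines, PREG_GREP_INVERT);

// array_values() renumbers the result from 0 if the gaps are unwanted.
$remaining = array_values($remaining);
```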
  • Replacing text
    • $post = preg_replace('/([^@\s]+)@([a-zA-Z\d_-]+)\.([a-zA-Z\d_.-]+)/', '$1 [AT] $2 [dot] $3', $post);
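One possible way to do the address rewriting described earlier is preg_replace_callback(), which passes each match to a function so that every dot in the domain can be replaced. This is a sketch rather than the presenter's method: the helper name obfuscate_emails(), the address pattern and the sample text are all assumptions, and the pattern is intentionally simple.

```php
<?php
// Hypothetical helper: rewrites addresses such as user@site.com into
// "user [AT] site [dot] com".
function obfuscate_emails(string $post): string
{
    $result = preg_replace_callback(
        '/([a-z0-9._-]+)@([a-z0-9-]+(?:\.[a-z0-9-]+)+)/i',
        function (array $m): string {
            // $m[1] is the part before the @, $m[2] the full domain.
            return $m[1] . ' [AT] ' . str_replace('.', ' [dot] ', $m[2]);
        },
        $post
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $post;
}

$post = 'Contact geoff@warmage.com or admin@mail.example.org today.';
$safe = obfuscate_emails($post);
```

The full stop that ends the sentence is left alone because the domain part only accepts a dot that is followed by more letters, digits or hyphens.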
  • Splitting text
    • $date_parts = preg_split('/[-.,\/]+/', $date_string);
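The split pattern accepts hyphens, periods, commas and slashes as separators, so several date styles can be handled by the same call. The sample date strings below are made up for illustration.

```php
<?php
$pattern = '/[-.,\/]+/';

$iso   = preg_split($pattern, '2007-01-15');    // ['2007', '01', '15']
$mixed = preg_split($pattern, '15/01.2007');    // ['15', '01', '2007']

// Because of the +, a run of separators counts as a single split point.
$run   = preg_split($pattern, '15--01,,2007');  // ['15', '01', '2007']

// A leading or trailing separator yields empty elements unless
// PREG_SPLIT_NO_EMPTY is passed as the flags argument.
$trimmed = preg_split($pattern, '/2007/01/', -1, PREG_SPLIT_NO_EMPTY);  // ['2007', '01']
```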
  • Tips
    • Comment what your regular expression is doing.
    • Test your regular expression for speed. Some can cause a noticeable slowdown.
    • There are plenty of simple uses like /Width: (\d+)/
    • Watch out for greedy expressions. E.g. /(<(.+)>)/ will not pull out “b” and “/b” from “<b>test</b>” but instead will pull “b>test</b”. An easy way to change this behaviour is to make the quantifier lazy: /(<(.+?)>)/
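The greedy and lazy behaviour can be compared on the same input, and the x modifier offers one way to follow the tip about commenting a pattern. This is a sketch with simplified patterns and an invented Width string. Under the x modifier white space in the pattern is ignored, so the space after the label is written as \s* here.

```php
<?php
$html = '<b>test</b>';

// Greedy: .+ runs to the last ">" in the string.
preg_match_all('/<(.+)>/', $html, $greedy);
// Lazy: .+? stops at the first ">" it can.
preg_match_all('/<(.+?)>/', $html, $lazy);
// Alternative: exclude ">" from the repeated class instead.
preg_match_all('/<([^>]+)>/', $html, $negated);

// The x modifier ignores white space in the pattern and allows # comments.
$width_pattern = '/
    Width:      # literal label
    \s*         # optional white space
    (\d+)       # the number to capture
/x';
preg_match($width_pattern, 'Width: 640', $width);
```

Here $greedy[1] holds the single entry b>test</b, while $lazy[1] and $negated[1] both hold b and /b.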
  • References
    • http://en.wikipedia.org/wiki/Regular_expressions
    • http://php.net/manual/en/ref.pcre.php
  • Thank you