Regular Expressions and You
 

    Presentation Transcript

    • Regular Expressions and You
      An introduction to regular expressions.
      James I. Armes
      Web Developer, AllPlayers.com
      @jamesiarmes
    • Email Validation Examples
      ^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,4}$
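To make the simple pattern above concrete, here is a minimal sketch that runs it against a few addresses. The sample addresses are hypothetical and chosen only for illustration; the last one shows a limitation of the {2,4} rule for the top-level domain.

```javascript
// The simple email pattern from the slide, with its backslash escapes written out.
const simpleEmail = /^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,4}$/;

// Hypothetical sample addresses (not from the presentation).
const emailSamples = [
  "james@example.com",               // plain address
  "first.last+tag@mail.example.org", // dots and a plus sign in the local part
  "no-at-sign.example.com",          // missing "@"
  "user@example.museum",             // valid domain, but the TLD is longer than 4 letters
];

const emailResults = emailSamples.map((address) => simpleEmail.test(address));
console.log(emailResults); // [ true, true, false, false ]
```

The rejected .museum address is a reminder that a short pattern like this trades correctness for readability.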
    • Email Validation Examples
      [Slide shows a single regular expression, several thousand characters long, that validates addresses against the full RFC 822 grammar.]
    • Types of Regular Expressions
      ● Simple Regular Expressions
      ● POSIX Basic Regular Expressions
      ● POSIX Extended Regular Expressions
      ● Perl Regular Expressions
    • Simple Regular Expressions
      ● Traditional regular expressions.
      ● Not a standard.
      ● Supported by some applications for backwards compatibility.
      ● Deprecated.
    • POSIX Basic Regular Expressions
      ● Created to provide a common standard for Unix tools.
      ● Designed to be backwards compatible with traditional regular expressions.
      ● Adopted as the default syntax of many Unix tools.
      ● Some metacharacters require escaping.
    • POSIX Extended Regular Expressions
      ● Adds some new metacharacters.
      ● Metacharacters do not require escaping.
      ● Dropped support for backreferences (\n).
      ● Many Unix tools provide support with a command line argument (usually -E).
    • Perl Regular Expressions
      ● Adds lazy quantification, named capture groups and recursive patterns.
      ● Adopted by many programming languages due to its power.
      ● Requires non-alphanumeric delimiters around the expression.
      ● Other languages only implement a subset, so implementations vary.
    • Syntax
    • Basic Metacharacters
      .  Matches any single character (except a newline, unless the s modifier is set).
      ^  Matches the beginning of a string.
      $  Matches the end of a string.
      |  Matches the expression before or after it (think ||).
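The four basic metacharacters can be tried out with a short sketch like the following. The test strings are made up for illustration.

```javascript
// "." stands for any single character (by default, anything except a line break).
const dotMatches = /h.t/.test("hot");            // true: "." matches "o"

// "^" and "$" anchor the pattern to the start and end of the string.
const anchoredExact = /^cat$/.test("cat");       // true
const anchoredExtra = /^cat$/.test("concat");    // false: extra text before "cat"

// "|" matches the expression on its left or the one on its right.
const eitherAnimal = /^(cat|dog)$/.test("dog");  // true

console.log(dotMatches, anchoredExact, anchoredExtra, eitherAnimal);
```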
    • Character Classes
      []     Match any character within the group.
      [^ ]   Match any character NOT within the group.
      [n-m]  Match a range of characters.
      Examples:
        [A-Za-z0-9]
        [^G-Zg-z _]
    • Shorthand Character Classes
      \s  Any whitespace character such as a space, tab, or newline. Same as [\n\r\t ].
      \w  Any word character. Same as [A-Za-z0-9_].
      \d  Any digit character. Same as [0-9].
      \S, \W, \D  Negated versions of the above. They can be used inside character classes, but the result can be confusing.
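A small sketch of character classes and their shorthand forms follows. The input strings are invented for the example; note that in JavaScript \s covers a few more whitespace characters than the [\n\r\t ] equivalent given above.

```javascript
// A range-based class: one or more hexadecimal digits.
const isHex = /^[A-Fa-f0-9]+$/.test("1a2B");                  // true

// The negated class from the slide: anything except G-Z, g-z, space, or underscore.
const keptChars = "Face_9 Gz".match(/[^G-Zg-z _]/g);          // ["F", "a", "c", "e", "9"]

// Shorthand classes: \s (whitespace), \w (word characters), \d (digits).
const words = "a b\tc\nd".split(/\s+/);                       // ["a", "b", "c", "d"]
const identifiers = "order_42 shipped".match(/\w+/g);         // ["order_42", "shipped"]
const numbers = "room 101, floor 3".match(/\d+/g);            // ["101", "3"]

// \D is the negation of \d: strip everything that is not a digit.
const digitsOnly = "2024-01-15".replace(/\D/g, "");           // "20240115"

console.log(isHex, keptChars, words, identifiers, numbers, digitsOnly);
```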
    • Quantifiers
      *      Match the preceding expression 0 or more times.
      +      Match the preceding expression 1 or more times.
      ?      Match the preceding expression 0 or 1 time.
      {m,n}  Match the preceding expression at least m times but no more than n times.
      {m,}   Match the preceding expression at least m times with no maximum.
      {,n}   Match the preceding expression no more than n times with no minimum (not supported by every flavor; {0,n} is the portable form).
      {n}    Match the preceding expression exactly n times.
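The quantifiers can be compared side by side with a sketch like this one. The patterns and inputs are illustrative only. As far as I know, JavaScript treats {,n} as literal text rather than a quantifier, so the sketch uses {0,n} instead.

```javascript
// Each entry: [description, pattern, input, expected result].
const quantifierCases = [
  ["* allows zero b's",         /^ab*c$/,    "ac",     true],
  ["* allows many b's",         /^ab*c$/,    "abbbc",  true],
  ["+ needs at least one b",    /^ab+c$/,    "ac",     false],
  ["? makes the u optional",    /^colou?r$/, "color",  true],
  ["{2,4} rejects one digit",   /^\d{2,4}$/, "1",      false],
  ["{2,4} accepts three",       /^\d{2,4}$/, "123",    true],
  ["{2,4} rejects five",        /^\d{2,4}$/, "12345",  false],
  ["{3,} has no upper bound",   /^\d{3,}$/,  "123456", true],
  ["{0,2} has no lower bound",  /^\d{0,2}$/, "",       true],
  ["{5} needs exactly five",    /^\d{5}$/,   "1234",   false],
];

// Collect the descriptions of any case whose result differs from the expectation.
const quantifierFailures = quantifierCases
  .filter(([, pattern, input, expected]) => pattern.test(input) !== expected)
  .map(([description]) => description);

console.log(quantifierFailures.length === 0 ? "all cases behave as described" : quantifierFailures);
```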
    • Lazy Quantifiers
      Standard quantifiers are greedy.
      Example:
        Text:    Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
        Pattern: "Hello .*"
        Match:   "Hello World" example. That would be "Hallo Welt"
    • Lazy Quantifiers
      Use ? to make a quantifier lazy.
      Example:
        Text:    Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
        Pattern: "Hello .*?"
        Match:   "Hello World"
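The difference between the greedy and lazy examples can be reproduced with a short sketch. It assumes the sample text is a single line, since . does not match line breaks by default.

```javascript
const courseText =
  'Many programming courses start with a "Hello World" example. ' +
  'That would be "Hallo Welt" in German.';

// Greedy: .* grabs as much as it can, so the match runs to the LAST quote.
const greedyMatch = courseText.match(/"Hello .*"/)[0];

// Lazy: .*? grabs as little as it can, so the match stops at the NEXT quote.
const lazyMatch = courseText.match(/"Hello .*?"/)[0];

console.log(greedyMatch); // "Hello World" example. That would be "Hallo Welt"
console.log(lazyMatch);   // "Hello World"
```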
    • Grouping
      ()     Group the expression and capture the text.
      (?: )  Group the expression but DO NOT capture the text.
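To see what capturing changes in practice, this sketch matches the same text with and without a non-capturing group. The date string is a hypothetical input.

```javascript
// Three capturing groups: year, month, and day are all available afterwards.
const allCaptured = "2024-01-15".match(/(\d{4})-(\d{2})-(\d{2})/);
// allCaptured[0] is the whole match; [1], [2], [3] are the captured groups.

// (?: ) still groups the year, but does not capture it,
// so the month becomes group 1 and the day becomes group 2.
const yearSkipped = "2024-01-15".match(/(?:\d{4})-(\d{2})-(\d{2})/);

console.log(allCaptured.slice(1)); // [ '2024', '01', '15' ]
console.log(yearSkipped.slice(1)); // [ '01', '15' ]
```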
    • Backreferences
      \1 through \9 reference previously captured text.
      Example:
        Text:    Many programming courses start with a "Hello World" example. Hello World examples are extremely simple, especially when they just output "Hello World.
        Pattern: (|")Hello World(\1)
        Matches: "Hello World" (with both quotes), then Hello World twice; the last match leaves out the unbalanced opening quote.
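The backreference example can be checked with a sketch like the one below. The group (|") captures either nothing or an opening quote, and \1 then requires the same text again after "Hello World".

```javascript
const quotedText =
  'Many programming courses start with a "Hello World" example. ' +
  'Hello World examples are extremely simple, ' +
  'especially when they just output "Hello World.';

// \1 must repeat whatever group 1 captured: an empty string or a double quote.
const balancedMatches = quotedText.match(/(|")Hello World(\1)/g);

// The first match keeps both quotes; the last one cannot include the
// unbalanced opening quote, so only the bare words match.
console.log(balancedMatches); // [ '"Hello World"', 'Hello World', 'Hello World' ]
```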
    • Word Boundaries
      \b matches the position between a word character (\w) and a non-word character (\W).
      Example:
        Text:    Hello World
        Pattern: o\b
        Match:   Hello| World
    • Word Boundaries
      \B matches any position that is not a word boundary, such as the position between two word characters (\w\w).
      Example:
        Text:    Hello World
        Pattern: o\B
        Match:   Hello Wo|rld
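The two word-boundary examples can be verified by looking at where each pattern matches. The whole-word search at the end is an extra illustration that is not in the slides.

```javascript
const helloWorld = "Hello World";

// o\b: an "o" followed by a word boundary. Only the "o" ending "Hello" qualifies.
const boundaryIndex = /o\b/.exec(helloWorld).index;    // 4

// o\B: an "o" NOT followed by a word boundary. Only the "o" inside "World" qualifies.
const nonBoundaryIndex = /o\B/.exec(helloWorld).index; // 7

// A common use of \b: match a whole word, not a fragment of a longer word.
const wholeWords = "cat concat cats".match(/\bcat\b/g); // ["cat"]

console.log(boundaryIndex, nonBoundaryIndex, wholeWords);
```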
    • Lookaheads
      (?= ) matches a position that is directly followed by the expression, without consuming it.
      Example:
        Text:    Hello World sounds better than "Hello Earth".
        Pattern: Hello(?= World)
        Match:   the first Hello only (the one followed by " World")
    • Lookbehinds
      (?<= ) matches a position that is directly preceded by the expression, without consuming it.
      Example:
        Text:    Hello World sounds better than "Hello Earth".
        Pattern: (?<=")Hello
        Match:   the second Hello only (the one preceded by a quote)
    • Lookaheads
      (?! ) matches a position that is NOT directly followed by the expression.
      Example:
        Text:    Hello World sounds better than "Hello Earth".
        Pattern: Hello(?! World)
        Match:   the second Hello only (the one not followed by " World")
    • Lookbehinds
      (?<! ) matches a position that is NOT directly preceded by the expression.
      Example:
        Text:    Hello World sounds better than "Hello Earth".
        Pattern: (?<!")Hello
        Match:   the first Hello only (the one not preceded by a quote)
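All four lookaround examples can be run against the same sentence. This sketch reports the index where each pattern matches; it assumes an engine with lookbehind support, which recent versions of Node.js and most current browsers provide. The helper name findIndexes is made up for the example.

```javascript
const lookaroundText = 'Hello World sounds better than "Hello Earth".';

// Hypothetical helper: return the start index of every match of a global pattern.
function findIndexes(pattern) {
  return [...lookaroundText.matchAll(pattern)].map((match) => match.index);
}

// The first "Hello" starts at index 0, the quoted one at index 32.
const positiveLookahead = findIndexes(/Hello(?= World)/g);  // [0]
const positiveLookbehind = findIndexes(/(?<=")Hello/g);     // [32]
const negativeLookahead = findIndexes(/Hello(?! World)/g);  // [32]
const negativeLookbehind = findIndexes(/(?<!")Hello/g);     // [0]

console.log(positiveLookahead, positiveLookbehind, negativeLookahead, negativeLookbehind);
```

In every case the match itself is just "Hello"; the lookaround only tests the surrounding text and never becomes part of the match.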
    • Conditionals
      (?(condition)then|else)
      ● condition is typically a lookahead or a lookbehind; some flavors, such as PCRE, also accept a reference to a capture group.
      ● If condition is matched, then must match for the expression to pass.
      ● If condition is not matched, else must match for the expression to pass.
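    • Conditionals
      Example:
        Text:    Hello World sounds better than "Hello Earth".
        Pattern: Hello (?(?<=World)World|Earth)
        Pattern: Hello (?(?<=People)People|Earth)
      As written, each condition is a lookbehind tested immediately after "Hello ", where the preceding text can never be "World" or "People". Both patterns therefore take the else branch and match only Hello Earth. With a lookahead condition such as (?=World), the first pattern would also match Hello World.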
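To my knowledge, JavaScript's RegExp does not support the (?(condition)then|else) syntax. A rough equivalent, sketched below, pairs a positive lookahead with the then branch and a negative lookahead with the else branch. Note that it uses lookahead conditions, which is an assumption that differs from the lookbehinds in the example above.

```javascript
const conditionalText = 'Hello World sounds better than "Hello Earth".';

// "If the next text is World, match World; otherwise match Earth."
const worldElseEarth = /Hello (?:(?=World)World|(?!World)Earth)/g;

// "If the next text is People, match People; otherwise match Earth."
const peopleElseEarth = /Hello (?:(?=People)People|(?!People)Earth)/g;

const worldResults = conditionalText.match(worldElseEarth);   // ["Hello World", "Hello Earth"]
const peopleResults = conditionalText.match(peopleElseEarth); // ["Hello Earth"]

console.log(worldResults, peopleResults);
```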
    • Modifiers
      i  Case insensitive matching.
      s  . matches newline characters.
      m  ^ and $ match after and before newlines (respectively).
      x  Whitespace within the expression is ignored unless escaped.
      g  Match globally (find every match, not just the first).
    • Modifiers
      ● (?a) turns modifier a on.
      ● (?-a) turns modifier a off.
      Examples:
        (?i)WORLD(?-i)
        (?i-s)WORLD.(?s-i)
        (?i:WORLD)
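In JavaScript, modifiers are usually written as flags after the closing delimiter rather than inline. The sketch below shows the i, s, m, and g flags; as far as I know, JavaScript has no x flag, and inline forms such as (?i) are not available in every engine, so they are left out here.

```javascript
// i: case insensitive matching.
const caseInsensitive = /world/i.test("Hello WORLD");          // true

// s: lets "." match newline characters.
const dotWithoutS = /Hello.World/.test("Hello\nWorld");        // false
const dotWithS = /Hello.World/s.test("Hello\nWorld");          // true

// m: lets ^ and $ match at line breaks, not just at the ends of the string.
const wholeStringOnly = /^\w+$/.test("one\ntwo");              // false
const perLine = "one\ntwo".match(/^\w+$/gm);                   // ["one", "two"]

// g: apply the pattern to every match instead of only the first.
const firstOnly = "a-b-c".replace(/-/, "+");                   // "a+b-c"
const everyMatch = "a-b-c".replace(/-/g, "+");                 // "a+b+c"

console.log(caseInsensitive, dotWithoutS, dotWithS, wholeStringOnly, perLine, firstOnly, everyMatch);
```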
    • Language Implementations
    • JavaScript
      ● RegExp object.
        – var expression = new RegExp("World", "g");
        – var expression = /World/g;
      ● String.match()
      ● String.replace()
      ● String.split()
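The RegExp forms and string methods listed above fit together roughly as follows. The sample strings are hypothetical.

```javascript
// Two equivalent ways to build the same global pattern.
const fromConstructor = new RegExp("World", "g");
const fromLiteral = /World/g;

const farewell = "Hello World, goodbye World";

// String.match() with a global pattern returns every match.
const allWorlds = farewell.match(fromLiteral);               // ["World", "World"]

// String.replace() with a global pattern replaces every match.
const replaced = farewell.replace(fromConstructor, "Earth"); // "Hello Earth, goodbye Earth"

// String.split() accepts a pattern as the separator.
const fields = "a, b,c ,d".split(/\s*,\s*/);                 // ["a", "b", "c", "d"]

console.log(allWorlds, replaced, fields);
```

The constructor form is mainly useful when the pattern is assembled at run time from a string; otherwise the literal form is shorter and avoids double escaping.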
    • Perl
      ● if ($string =~ /regex/)
      ● $string =~ s/regex/replacement/
      ● Regexp::Common
        – http://search.cpan.org/dist/Regexp-Common/
        – Provides common expressions.
        – Examples:
          ● IP Address
          ● Credit Card Number
          ● Profanity
    • PHP
      ● ereg vs. preg
        – preg uses Perl syntax.
        – ereg uses POSIX Extended syntax.
        – preg is much faster.
        – ereg has been deprecated as of PHP 5.3 (and was removed in PHP 7.0).
    • PHP
      ● preg_match()
      ● preg_match_all()
      ● preg_replace()
      ● preg_split()
      ● preg_quote()
      ● http://www.php.net/manual/en/book.pcre.php
      ● http://php.net/manual/reference.pcre.pattern.modifiers.php
    • Tools and Resources
      ● txt2regex - http://aurelio.net/txt2regex/
      ● Reggy (mac) - http://reggyapp.com/
      ● Patterns (mac) - http://krillapps.com/patterns/
      ● Web based - http://regex.larsolavtorvik.com/
      ● Regular-Expressions.info (reference) - http://www.regular-expressions.info/
    • Thanks!
      http://xkcd.com/208/