Regex Presentation

Published in: Technology

Transcript

  • 1. Regular Expressions. What are regular expressions? Regular expressions (regex) are a way to define a pattern of characters to match in a text string.
  • 2. Not all regex flavors are equal. PCRE: Perl Compatible Regular Expressions. POSIX: Basic Regex (BRE, the "simple" regular expressions) and Extended Regex (ERE, used by PHP's ereg_... functions) (1986!), deprecated as of PHP 5.3.0.
  • 3. Components ● Literal Characters ● Character Classes ● Shorthand Character Classes / 'dot' ● Non-Printable Characters / Anchors ● Quantifiers ● Modifiers
  • 4. Literal Characters: plain characters that match themselves, exactly as you would expect. e.g. /a/ -> matches 'a' in a text
  • 5. Character Classes ● Group of characters [abcdefgh0123] ● Range of characters [a-h0-3] ● Negated range of characters [^i-z4-9] e.g. /[gG]uide[lL]ine/ -> guideline, Guideline, guideLine or GuideLine
  • 6. Shorthand Char. Classes: \d -> [0-9]; \w -> [a-zA-Z0-9_]; \s -> whitespace (space, [\t\r\n] and form feed). Negated: \D -> [^0-9]; \W -> [^a-zA-Z0-9_]; \S -> non-whitespace. '.' -> any character!! (except a newline, by default)
  • 7. Non-printable Characters: \t -> tab character (ASCII 0x09); \r -> carriage return (0x0D); \n -> line feed (0x0A); also \a (bell, 0x07), \e (escape, 0x1B), \f (form feed, 0x0C), \v (vertical tab, 0x0B). \xFF -> character by hexadecimal index in the character set, e.g. \xA9 -> copyright symbol in Latin-1. \uFFFF -> Unicode character, e.g. \u20AC -> the euro currency sign (flavor-dependent; PCRE writes this as \x{20AC}). Anchors: ^ -> beginning of the string; $ -> end of the string; \b -> word boundary; \B -> not a word boundary
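  • 8. Quantifiers: REGEXES ARE GREEDY. {min,max} / {min,} / {,max} / {exact} ({,max} is not supported by every flavor; {0,max} is the portable spelling). ? -> {0,1}; + -> {1,}; * -> {0,}. Careful when using /.*/: it matches as much as it can. Lazy quantifiers: +? and *? match as little as possible. | : not a quantifier, simply 'OR'
  • 9. Modifiers: //i : case insensitive; //m : multiline; //x : ignore whitespace in the pattern. Internal option setting: (? .. ), such as (?i). e.g. /ab(?i)c/ -> matches "abc" and "abC"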
  • 10. Subpatterns: a pattern in a pattern in ... Can be nested!! e.g. /((red|white) (king|queen))/ -> red king, white king, red queen, white queen
  • 11. PHP & Regex: preg_... : PCRE (strpos() or strstr() are faster for plain substring searches); ereg_... : POSIX, deprecated in 5.3.0 (preg_ is often faster); mb_ereg_... : "multibyte"
  • 12. Resources & Tools ● http://www.regular-expressions.info ● http://en.wikipedia.org/wiki/Regular_expression ● http://be.php.net/manual/en/regexp.reference.php ● http://regexpal.com/ ● http://www.fileformat.info/tool/regex.htm ● http://www.regexbuddy.com/ ● http://www.ultrapico.com/Expresso.htm
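
The constructs from slides 5 to 10 can be tried out in a few lines. The sketch below is an illustration only, not part of the original slides: it uses Python's `re` module as a stand-in, which is not PCRE but treats these particular constructs the same way. The sample strings are made up, and the scoped inline option `(?i:...)` assumes Python 3.6 or later (Python does not accept a bare `(?i)` in the middle of a pattern the way PCRE does).

```python
import re

# Slide 5: a character class accepts any one of the listed characters.
guideline = re.compile(r"[gG]uide[lL]ine")
class_hits = guideline.findall("guideline, GuideLine, guide-line")
print(class_hits)  # ['guideline', 'GuideLine']

# Slide 6: \d is shorthand for [0-9].
digits = re.findall(r"\d+", "Room 101, floor 3")
print(digits)  # ['101', '3']

# Slide 7: \b matches only at a word boundary, so "kingdom" and
# "viking" are skipped.
words = re.findall(r"\bking\b", "king kingdom viking king")
print(words)  # ['king', 'king']

# Slide 8: greedy .* runs to the last '>', lazy .*? stops at the first.
html = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.*>", html)
lazy = re.findall(r"<.*?>", html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# Slide 9: a flag for the whole pattern versus an option scoped to
# one part of it (here only the trailing 'c' ignores case).
flag_match = re.fullmatch(r"abc", "ABC", re.IGNORECASE) is not None
scoped_match = re.fullmatch(r"ab(?i:c)", "abC") is not None
scoped_miss = re.fullmatch(r"ab(?i:c)", "ABC") is None
print(flag_match, scoped_match, scoped_miss)  # True True True

# Slide 10: nested subpatterns are numbered by their opening parenthesis.
m = re.fullmatch(r"((red|white) (king|queen))", "white queen")
groups = m.groups()
print(groups)  # ('white queen', 'white', 'queen')
```

In PHP the same patterns would go through the preg_ functions from slide 11, with the delimiters and modifiers written as on the slides (for example /[gG]uide[lL]ine/ or /abc/i).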