Introduction to regular expressions
A quick-start introduction to the world of regular expressions, covering special characters, quantifiers and character classes.

Assumes no knowledge of regular expressions.

Published in: Technology

  • Plurals
  • There's a lot of shorthand when talking about Perl. e.g. Array of Arrays. I'll try to avoid this shorthand.
  • See handout
  •  – reject any match where the cursor is not now at the end of the input
  • There are loads on your handout
  • Transcript

    • 1. Quick Intro to Regexen. Brian McCauley (nobull), Birmingham.pm
    • 2. About this talk
      • For Perl Newbies
      • 3. For Regex Newbies
      • 4. Assumes programming experience
      • 5. Only scratches surface
        • Full tutorial could last days
      • Takes some liberties
      • 6. Somewhat revised compared to proceedings
      • 7. Not suitable for world authorities!
    • 8. What is a RE?
      • Compact description of a set of strings
      • 9. Notation does not a regex make
      • 10. We're talking Perl notation
    • 11. Truly “Regular”?
      • “Regular expression” from formal language theory
      • 12. True regular expressions only a tiny subset of what we commonly mean
      • 13. Perl5 (Java, Ruby etc.) regexes are perhaps better called “patterns”
        • I'll tend to use the terms interchangeably
    • 14. Notational aside
      • Perl patterns conventionally written between //
      • 15. One writes “the pattern /foo/”
        • Looks just like pattern match operator
        • 16. But it's not
      • I'm talking about the pattern
      • 17. I'm not talking about the match operator
    • 18. Simple regex syntax
      • Literal characters / tokens match a literal
        • Alphanumerics
        • 19. Escaped non-alphanumerics
        • 20. (Most) double-quotish escapes
      • Anything else may have special meaning
        • Without specials, a pattern describes one string
      • Concatenation is concatenation
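The literal-matching rules above can be sketched as follows. This is an illustration only, not code from the talk: it assumes Python's `re` module as a stand-in, since it shares this part of Perl's notation, and the sample strings are made up.

```python
import re

# A pattern with no special characters describes exactly one string;
# concatenating the tokens f, o, o describes "foo".
plain = re.search(r"foo", "seafood")

# Escaped non-alphanumerics match themselves literally.
price = re.search(r"\$5\.00", "cost: $5.00")

# Unescaped, "." is special, so this pattern also describes "5x00".
loose = re.search(r"5.00", "5x00")

# A double-quotish escape such as \t matches the character it denotes.
tab = re.search(r"a\tb", "a\tb")

print(plain.group(), price.group(), loose.group(), tab is not None)
```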
    • 21. “Matches” v “Describes”
      • Initially said “RE describes a set of strings”
      • 22. Why do I keep saying “matches”?
      • 23. Can also think of a pattern as a bit of code
        • Passed an input string (and a cursor)
        • 24. Locates string described by the RE (following the cursor)
        • 25. May also record additional information
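The "pattern as a bit of code" view can be sketched like this. It is a hypothetical example that assumes Python's `re` module in place of Perl: the compiled pattern is handed an input string and a start offset (the cursor), and the match object records where the described string was found.

```python
import re

pattern = re.compile(r"an")
text = "banana"

# Passed an input string and a cursor (offset 0 by default),
# the pattern locates the first described string at or after it.
first = pattern.search(text)

# Passed a later cursor, the same pattern locates the next occurrence.
second = pattern.search(text, first.end())

# The additional information recorded here is the position of each match.
print(first.span(), second.span())
```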
    • 26. “Matches” v “Matches”
      • People use “matches” loosely
      • 27. Shorthand terminology
        • Usually clear from context
        • 28. Confusion if shorthand taken literally
    • 29. Alternation
      • Match “this or that”
      • 30. Lower precedence than concatenation
      • 31. Parentheses DWIM
      • 32. Grouping with parentheses has a side-effect
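A minimal sketch of the precedence rule and of grouping, assuming Python's `re` module as a stand-in for Perl notation. The patterns and strings are invented for illustration.

```python
import re

# Alternation binds more loosely than concatenation, so this pattern
# means "gray" or "grey cat", not "gray cat" or "grey cat".
loose = re.compile(r"gray|grey cat")

# Parentheses do what you mean: the choice is limited to one letter.
grouped = re.compile(r"gr(a|e)y cat")

m1 = loose.fullmatch("gray")        # matches: the whole first alternative
m2 = grouped.fullmatch("gray")      # None: " cat" is now required
m3 = grouped.fullmatch("grey cat")  # matches

# The side effect of grouping: the parentheses also capture what they matched.
print(m3.group(1))
```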
    • 33. Character classes
      • Alternation of a single token (character)
      • 34. Negation
        • /[^ac]/ any single character other than 'a' or 'c'
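As a sketch of the two bullets above (using Python's `re` module, assumed here because its class syntax matches Perl's for this case), a class behaves like an alternation of single characters and `^` negates it:

```python
import re

text = "abcd"

# [ac] is an alternation of single characters: 'a' or 'c'.
in_class = re.findall(r"[ac]", text)

# The same set written as an explicit alternation.
as_alternation = re.findall(r"a|c", text)

# [^ac] is any single character other than 'a' or 'c'.
negated = re.findall(r"[^ac]", text)

print(in_class, as_alternation, negated)
```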
    • 35. Shorthand character classes
      • The (almost) universal class
        • Sometimes any character at all (depends on switches)
      • “Well known” classes
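The shorthand classes can be sketched as below. The example assumes Python's `re` module, where `.`, `\d`, `\w` and `\s` behave like their Perl counterparts on ASCII input, and where the `re.S` flag plays the role of Perl's /s switch.

```python
import re

# "." is the almost-universal class: by default it skips newline.
default_dot = re.findall(r".", "a\nb")

# With the "single line" switch it matches any character at all.
dotall_dot = re.findall(r".", "a\nb", re.S)

# Well-known classes: digits, word characters, whitespace.
digits = re.findall(r"\d", "a1 b22")
words = re.findall(r"\w", "a-b_")
spaces = re.findall(r"\s", "a b\tc")

print(default_dot, dotall_dot, digits, words, spaces)
```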
    • 36. Character encoding
      • Beyond chr(127) “DWIM” gets complicated!
        • Locales, Unicode (the utf8 flag)
        • 37. Exact version of Perl
        • 38. Cited as one of the most annoying features in Perl
    • 39. Quantifiers
      • Match a number of repeats of pattern
      • 40. Pattern, not string, repeated
      • 41. Range (can be open-ended)
      • 42. Precedence
    • 43. Quantifiers
      • Shorthand forms for well known ranges
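The quantifier rules from the two slides above can be sketched as follows. Python's `re` module is assumed as a stand-in for Perl notation, and the test strings are hypothetical.

```python
import re

# The pattern is repeated, not the string: [ab]{2} accepts "ab",
# not only "aa" or "bb".
mixed = re.fullmatch(r"[ab]{2}", "ab")

# A range can be open-ended: two or more 'a's.
open_ended = re.fullmatch(r"a{2,}", "aaaa")
too_few = re.fullmatch(r"a{2,}", "a")          # None

# Precedence: a quantifier applies only to the token before it.
tail_only = re.fullmatch(r"ab+", "abab")       # None: only 'b' repeats
grouped = re.fullmatch(r"(?:ab)+", "abab")     # the group repeats

# Shorthand forms: * is {0,}, + is {1,}, ? is {0,1}.
star = re.fullmatch(r"a*", "")
plus = re.fullmatch(r"a+", "")                 # None: needs at least one
optional = re.fullmatch(r"colou?r", "color")
```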
    • 44. Best match
      • Theoretical RE just defines a set of strings
      • 45. Matching in Perl also says what it matched
        • But a lot of possible matches
        • 46. 19 in all!
      • Choose the first match found
        • For some definition of “first”
    • 47. First match
      • Must match complete pattern
      • 48. First starting position in input
      • 49. First choice in alternation
      • 50. Most repeats in repeat
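The four rules for choosing the "first" match can be sketched like this. The example assumes Python's `re` module, which is a backtracking engine and follows the same rules for these simple cases. The inputs are made up.

```python
import re

# First starting position in the input wins, whatever the order of
# the alternatives: 'a' at offset 1 beats "bb" at offset 2.
earliest = re.search(r"b+|a", "xabb").group()

# At a given position, the first alternative that works is chosen,
# even though "ab" would also have matched.
first_choice = re.search(r"a|ab", "ab").group()

# A repeat takes as many repetitions as it can.
most_repeats = re.search(r"a+", "caaat").group()

# But the complete pattern must match: a+ gives one 'a' back
# so that the final 'a' has something to match.
whole = re.search(r"(a+)a", "aaa")

print(earliest, first_choice, most_repeats, whole.group(1))
```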
    • 51. Non-greedy
      • Usual rule “as many repeats as possible”
      • 52. Can also go for the fewest
      • 53. Only useful in the context of a larger expression
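A small sketch of greedy against non-greedy repeats, assuming Python's `re` module (the `?` suffix works the same way as in Perl). The tag-like input is only an illustration.

```python
import re

text = "<a><b>"

# Usual rule: as many repeats as possible.
greedy = re.search(r"<.+>", text).group()

# Non-greedy: as few repeats as still allow the rest to match.
lazy = re.search(r"<.+?>", text).group()

# On its own a non-greedy repeat is not useful: it simply takes
# the minimum, here zero characters.
alone = re.search(r"a*?", "aaa").group()

print(repr(greedy), repr(lazy), repr(alone))
```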
    • 54. Greedy but impatient
      • Remember (non-)greediness is local
      • 55. This is sometimes called “eager” or “impatient”
        • I've got a complete match so take it
      • But “must match whole pattern” still applies
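To illustrate the points above, here is a hypothetical sketch using Python's `re` module, assumed to behave like Perl for these patterns. The engine stops at the first complete match it finds, and a non-greedy repeat still has to stretch until the whole pattern matches.

```python
import re

# Impatient: the first alternative 'a' leads to a complete match
# (c? is greedy but finds no 'c' after it), so the engine takes it,
# although "abc" was also a possible match.
eager = re.search(r"(?:a|ab)c?", "abc").group()

# Greediness is local: the non-greedy a+? would like a single 'a',
# but the whole pattern must match, so it extends to reach the 'b'.
stretched = re.search(r"a+?b", "aaab").group()

print(eager, stretched)
```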
    • 56. Anchors
      • Zero-width assertions - match the empty string
      • 57. Only where something that I assert holds true
        • Gross simplification!
      • These assertions also called “anchors”
        • Using term “anchor” for the more complex zero-width assertions can result in false expectations
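The simple anchors can be sketched as below. The talk's own examples are on the handout and are not reproduced here; this assumes Python's `re` module, where `^`, `$` and `\b` act as zero-width assertions much as in Perl.

```python
import re

# ^ asserts "the cursor is at the start of the input".
at_start = re.search(r"^cat", "cat flap")
not_at_start = re.search(r"^cat", "tomcat")      # None

# $ asserts "the cursor is at the end of the input".
at_end = re.search(r"cat$", "tomcat")

# \b asserts a word boundary, so "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

# Zero-width: the match consumes nothing.
empty = re.search(r"^", "abc").span()

print(whole_words, empty)
```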
    • 58. Capturing
      • Match can return more than overall position
      • 59. Records last cursor position at each ( )
      • 60. “captures” the bit between
      • There's an overhead so can group without capture
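A minimal sketch of capturing, assuming Python's `re` module as a stand-in for Perl. The pattern and input are invented, and `(?:...)` is the group-without-capture form in both languages.

```python
import re

m = re.search(r"(\d+)-(\d+)", "pages 12-34")

# Each pair of parentheses records where it started and ended,
# and so "captures" the text in between.
first, second = m.group(1), m.group(2)
first_span, second_span = m.span(1), m.span(2)

# Grouping without capture: (?:...) groups for the quantifier only,
# so the single capturing group here is (c).
only = re.search(r"(?:ab)+(c)", "ababc").groups()

print(first, first_span, second, second_span, only)
```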
    • 63. Back references
      • Match whatever a previous capture matched
      • Diagram labels: 2nd capture (any single character); as few characters as possible; the character we captured before
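Back references can be sketched as follows. The slide's own pattern did not survive in the transcript, so the patterns below are hypothetical stand-ins, written for Python's `re` module, where `\1` refers to the first capture as it does in Perl.

```python
import re

# Capture any single character, skip as few characters as possible,
# then require the character we captured before.
m = re.search(r"(.).*?\1", "abcabc")
matched, captured = m.group(), m.group(1)

# A common use: find a doubled letter.
doubled = re.search(r"(\w)\1", "hello").group()

print(matched, captured, doubled)
```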
    • 64. Switches
      • Vagueness earlier
      • 65. Controlled by switches
        • Usually referred to as /i /m /x and /s
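The four switches have close counterparts in Python's `re` module, which is assumed here in place of Perl: `re.I`, `re.M`, `re.X` and `re.S` roughly correspond to /i, /m, /x and /s. A sketch with made-up inputs:

```python
import re

# /i: ignore case.
insensitive = re.search(r"perl", "PERL", re.I)

# /m: ^ and $ also match at line boundaries inside the input.
single = re.findall(r"^\w+", "one\ntwo")
multi = re.findall(r"^\w+", "one\ntwo", re.M)

# /s: "." matches newline too.
no_s = re.search(r"a.b", "a\nb")          # None
with_s = re.search(r"a.b", "a\nb", re.S)

# /x: whitespace and comments inside the pattern are ignored.
verbose = re.search(r"""
    \d+     # one or more digits
    \s*     # optional whitespace
    kg      # literal unit
""", "12 kg", re.X)

print(single, multi, verbose.group())
```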
    • 66. The rest!
      • This is only a tiny subset
      • 67. Lots more assertions
      • 68. The Perl substitution operator s///
      • 69. Naming your captures
      • 70. Embedding Perl code in your regex
      • 71. Creating complex grammars by defining named subpatterns and using them later
      • 72. It would take an hour just to enumerate them!
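A brief taste of three of the items above, sketched in Python's `re` module as an assumed stand-in. Note the differences from Perl: `re.sub` takes the place of the s/// operator, and Python spells named captures `(?P<name>...)` where Perl writes `(?<name>...)`. Embedded code and grammars have no direct equivalent in `re` and are not shown.

```python
import re

# Substitution: replace every match of the pattern.
replaced = re.sub(r"colou?r", "hue", "color and colour")

# Named captures: refer to a group by name instead of by number.
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2009-11")
year, month = date.group("year"), date.group("month")

# One of the further assertions: lookahead checks what follows
# without including it in the match.
amount = re.search(r"\d+(?= kg)", "12 kg").group()

print(replaced, year, month, amount)
```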
    • 73. Live floor show
      • Requests?
      • 74. Questions?