Beneath the Surface - Rubyconf 2013

This is the final version of this talk, given at RubyConf 2013

Many of us approach regular expressions with a certain fear and trepidation, using them only when absolutely necessary. We can get by when we need to use them, but we hesitate to dive any deeper into their cryptic world. Ruby has so much more to offer us. This talk showcases the incredible power of Ruby and the Onigmo regex library Ruby runs on. It takes you on a journey beneath the surface, exploring the beauty, elegance, and power of regular expressions. You will discover the flexible, dynamic, and eloquent ways to harness this beauty and power in your own code.


Transcript

  • 1. Beneath the Surface: Regular Expressions in Ruby @nellshamrell (Photo By Mr. Christopher Thomas, Creative Commons Attribution-ShareAlike 2.0 Generic License)
  • 2. ^4[0-9]{12}(?:[0-9]{3})?$ Source: regular-expressions.info
  • 3. We fear what we do not understand
  • 4. Regular Expressions + Ruby (Photo By Shayan, Creative Commons Attribution-ShareAlike 2.0 Generic License)
  • 5. Regex Matching in Ruby: Ruby Methods, Onigmo
  • 6. Onigmo
  • 7. Oniguruma, Fork, Onigmo
  • 8. Onigmo: Reads Regex
  • 9. Onigmo: Reads Regex, Parses Into Abstract Syntax Tree
  • 10. Onigmo: Reads Regex, Parses Into Abstract Syntax Tree, Compiles Into Series of Instructions
  • 11. Finite State Machines Photo By Felipe Skroski Creative Commons Attribution Generic 2.0
  • 12. A Finite State Machine Shows How Something Works
  • 13. Annie the Dog
  • 14. Annie the Dog: In the House, Out of House
  • 15. Annie the Dog: In the House, Out of House, with one Door transition
  • 16. Annie the Dog: In the House, Out of House, with two Door transitions
  • 17. Finite State Machine
  • 18. Finite State Machine
  • 19. Finite State Machine
  • 20. Multiple States
  • 21. /force/
  • 22. re = /force/
        string = "Use the force"
        re.match(string)
  • 23. [Diagram: states f, o, r, c, e for /force/ against "Use the force"] Path Doesn’t Match
  • 24. [Same diagram] Still Doesn’t Match
  • 25. [Same diagram] (Fast Forward) Path Matches!
  • 26-29. [Same diagram, four frames]
  • 30. [Same diagram] We Have A Match!
  • 31. re = /force/
        string = "Use the force"
        re.match(string) # => #<MatchData "force">
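
The walkthrough in slides 22 to 31 can be reproduced in a few lines. This is a minimal sketch that uses only Ruby's core `Regexp#match` and `MatchData` methods; the variable names `skipped`, `start_index`, and `matched_text` are illustrative and do not come from the talk.

```ruby
re = /force/
string = "Use the force"

match = re.match(string)

# Text the engine moved past before an attempt succeeded.
skipped = match.pre_match
# Character index where the successful attempt started.
start_index = match.begin(0)
# The text consumed by the pattern itself.
matched_text = match[0]

puts "skipped:  #{skipped.inspect}"      # skipped:  "Use the "
puts "start at: #{start_index}"          # start at: 8
puts "matched:  #{matched_text.inspect}" # matched:  "force"
```

The skipped prefix corresponds to the "Path Doesn’t Match" frames: the engine tries each starting position in turn until the `f` state can be entered.
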
  • 32. Alternation Photo By Shayan Creative Commons Attribution Generic 2.0
  • 33. Pipe /Y(olk|oda)/
  • 34. re = /Y(olk|oda)/
        string = "Yoda"
        re.match(string)
  • 35. [Diagram: Y, then branches o, l, k and o, d, a for /Y(olk|oda)/ against "Yoda"]
  • 36. [Same diagram] Which To Choose?
  • 37. [Same diagram] Saves To Backtrack Stack
  • 38. [Same diagram] Uh Oh, No Match
  • 39. [Same diagram] Backtracks To Here
  • 40-41. [Same diagram, two frames]
  • 42. [Same diagram] We Have A Match!
  • 43. re = /Y(olk|oda)/
        string = "Yoda"
        re.match(string) # => #<MatchData "Yoda">
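
As a small sketch of the alternation example, the capture group shows which branch finally succeeded. The inputs "Yolk" and "Yoga" are extra illustrations that do not appear in the slides.

```ruby
re = /Y(olk|oda)/

yoda = re.match("Yoda") # first branch fails at "l", second branch succeeds
yolk = re.match("Yolk") # first branch succeeds, no backtracking needed
yoga = re.match("Yoga") # both branches fail, so there is no match

puts yoda[0]      # Yoda
puts yoda[1]      # oda
puts yolk[1]      # olk
puts yoga.inspect # nil
```

Group 1 holds only the text matched inside the parentheses, so it is a convenient way to see which alternative the engine ended up taking.
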
  • 44. Quantifiers Photo By Fancy Horse Creative Commons Attribution Generic 2.0
  • 45. Plus Quantifier /No+/
  • 46. re = /No+/
        string = "Noooo"
        re.match(string)
  • 47-48. [Diagram: N, then o with a loop back to o, for /No+/ against "Noooo"; two frames]
  • 49. [Same diagram] Return Match? Or Keep Looping?
  • 50. [Same diagram] Greedy Quantifier Keeps Looping
  • 51. Greedy quantifiers match as much as possible
  • 52. Greedy quantifiers use maximum effort for maximum return
  • 53-54. [Same diagram, two frames]
  • 55. [Same diagram] We Have A Match!
  • 56. re = /No+/
        string = "Noooo"
        re.match(string) # => #<MatchData "Noooo">
  • 57. Lazy Quantifiers
  • 58. Lazy quantifiers match as little as possible
  • 59. Lazy quantifiers use minimum effort for minimum return
  • 60. Makes Quantifier Lazy /No+?/
  • 61. re = /No+?/
        string = "Noooo"
        re.match(string)
  • 62-63. [Diagram: N, then o with a loop back to o, for /No+?/ against "Noooo"; two frames]
  • 64. [Same diagram] Return Match? Or Keep Looping?
  • 65. [Same diagram] We Have A Match!
  • 66. re = /No+?/
        string = "Noooo"
        re.match(string) # => #<MatchData "No">
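
The greedy and lazy results above can be compared side by side. This is a minimal sketch using `String#[]` with a regex argument; the third pattern, `/No+?!/`, and its input are additions for illustration, showing that a lazy quantifier still takes more characters when the rest of the pattern requires it.

```ruby
string = "Noooo"

# Greedy: keeps looping on "o" for as long as it can.
greedy = string[/No+/]
# Lazy: returns as soon as the minimum (one "o") is satisfied.
lazy = string[/No+?/]
# Lazy, but the trailing "!" forces it to consume every "o" first.
lazy_with_tail = "Noooo!"[/No+?!/]

puts greedy         # Noooo
puts lazy           # No
puts lazy_with_tail # Noooo!
```
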
  • 67. Greedy quantifiers are greedy but reasonable
  • 68. Star Quantifier /.*moon/
  • 69. re = /.*moon/
        string = "That's no moon"
        re.match(string)
  • 70-71. [Diagram: . with a loop, then m, o, o, n for /.*moon/ against "That's no moon"; two frames]
  • 72. [Same diagram] Loops
  • 73. [Same diagram] (Fast Forward) Which To Match?
  • 74-76. [Same diagram, three frames] Keeps Looping
  • 77. [Same diagram] No More Characters?
  • 78. [Same diagram] Backtrack or Fail?
  • 79-81. [Same diagram, three frames] Backtracks
  • 82. [Same diagram] Backtracks, Huzzah!
  • 83-85. [Same diagram, three frames]
  • 86. [Same diagram] We Have A Match!
  • 87. re = /.*moon/
        string = "That's no moon"
        re.match(string) # => #<MatchData "That's no moon">
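
A short sketch of the same behaviour: the greedy `.*` first runs to the end of the string, then gives characters back until `moon` can match. The second input string, with two occurrences of "moon", is an added illustration that makes the effect of backtracking visible.

```ruby
single = "That's no moon"
double = "no moon, full moon tonight"

# With one "moon", the whole string is matched after backtracking.
whole = single[/.*moon/]
# Greedy .* backs off only as far as the LAST "moon".
greedy = double[/.*moon/]
# Lazy .*? stops at the FIRST "moon" instead.
lazy = double[/.*?moon/]

puts whole  # That's no moon
puts greedy # no moon, full moon
puts lazy   # no moon
```
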
  • 88. Backtracking = Slow
  • 89. /No+w+/
  • 90. re = /No+w+/
        string = "Noooo"
        re.match(string)
  • 91-92. [Diagram: N, then o with a loop, then w with a loop, for /No+w+/ against "Noooo"; two frames]
  • 93-95. [Same diagram, three frames] Loops
  • 96. [Same diagram] Uh Oh
  • 97. [Same diagram] Uh Oh, Backtrack or Fail?
  • 98-100. [Same diagram, three frames] Backtracks
  • 101. [Same diagram] Match FAILS
  • 102. Possessive Quantifiers
  • 103. Possessive quantifiers do not backtrack
  • 104. Makes Quantifier Possessive /No++w+/
  • 105-106. [Diagram: N, then o with a loop, then w with a loop, for /No++w+/ against "Noooo"; two frames]
  • 107-109. [Same diagram, three frames] Loops
  • 110. [Same diagram]
  • 111. [Same diagram] Loops, Uh Oh, Backtrack or Fail?
  • 112. [Same diagram] Match FAILS
  • 113. Possessive quantifiers fail faster by controlling backtracking
  • 114. Use possessive quantifiers with caution
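
The following sketch compares the greedy and possessive patterns from the slides, then shows one reason for the caution. The patterns `/No+o/` and `/No++o/` and the input "Nooow" are assumptions added for illustration; they are not from the talk.

```ruby
greedy = /No+w+/
possessive = /No++w+/

# Both fail on "Noooo"; the possessive version fails without backtracking.
greedy_fail = greedy.match("Noooo")
possessive_fail = possessive.match("Noooo")

# Both succeed when a "w" really follows the run of "o" characters.
greedy_ok = greedy.match("Nooow")[0]
possessive_ok = possessive.match("Nooow")[0]

# Caution: a possessive quantifier never gives characters back,
# so a pattern that depends on backtracking stops matching.
needs_backtrack = /No+o/.match("Noooo")[0]
never_matches = /No++o/.match("Noooo")

puts greedy_fail.inspect     # nil
puts possessive_fail.inspect # nil
puts greedy_ok               # Nooow
puts possessive_ok           # Nooow
puts needs_backtrack         # Noooo
puts never_matches.inspect   # nil
```

In `/No++o/`, the `o++` consumes every "o" and refuses to release one for the final `o`, so the match fails even though the greedy form succeeds.
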
  • 115. Tying It All Together Photo By Keith Ramos Creative Commons Attribution 2.0 Generic
  • 116. snake_case to CamelCase
  • 117. snake_case to CamelCase: Find first letter of string and capitalize it
  • 118. snake_case to CamelCase: Find first letter of string and capitalize it; find any character that follows an underscore and capitalize it
  • 119. snake_case to CamelCase: Find first letter of string and capitalize it; find any character that follows an underscore and capitalize it; remove underscores
  • 120. snake_case to CamelCase: Find first letter of string and capitalize it
  • 121-123. case_converter_spec.rb
        before(:each) do
          @case_converter = CaseConverter.new
        end
        it "capitalizes the first letter" do
          result = @case_converter.upcase_chars("method")
          result.should == "Method"
        end
  • 124. Anchors Match To Beginning Of String /\A/
  • 125. Matches Any Word Character /\A\w/
  • 126-127. case_converter.rb
        def upcase_chars(string)
          re = /\A\w/
          string.gsub(re) { |char| char.upcase }
        end
  • 128. [Same code] Spec Passes!
  • 129-130. case_converter_spec.rb
        it "capitalizes the first letter" do
          result = @case_converter.upcase_chars("_method")
          result.should == "_Method"
        end
  • 131. [Same spec] Spec Fails!
  • 132. Spec Failure: Expected: "_Method" Got: "_method"
  • 133. Problem: Matches Letters AND Underscores /\A\w/
  • 134. Matches Only Lowercase Letters /\A[a-z]/
  • 135. Matches an underscore /\A_[a-z]/
  • 136. Makes underscore optional /\A_?[a-z]/
  • 137. case_converter.rb
        def upcase_chars(string)
          re = /\A_?[a-z]/
          string.gsub(re) { |char| char.upcase }
        end
  • 138. [Same code] Spec Passes!
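
The two attempts above can be compared directly. This is a minimal sketch; the variable names are illustrative, and the multi-line input comparing `\A` with `^` is an addition that is not in the slides. In Ruby, `^` matches at the start of every line, while `\A` matches only at the start of the string.

```ruby
first_try = /\A\w/        # \w also matches "_", so the letter is never reached
second_try = /\A_?[a-z]/  # optional underscore, then a lowercase letter

before = "_method".gsub(first_try) { |char| char.upcase }
after = "_method".gsub(second_try) { |char| char.upcase }
plain = "method".gsub(second_try) { |char| char.upcase }

# Anchor comparison on a two-line string.
every_line = "some\nmethod".gsub(/^[a-z]/) { |char| char.upcase }
string_start = "some\nmethod".gsub(/\A[a-z]/) { |char| char.upcase }

puts before               # _method
puts after                # _Method
puts plain                # Method
puts every_line.inspect   # "Some\nMethod"
puts string_start.inspect # "Some\nmethod"
```

Upcasing the matched text "_m" works because `String#upcase` leaves the underscore unchanged.
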
  • 139. snake_case to CamelCase: Find any character that follows an underscore and capitalize it
  • 140-141. case_converter_spec.rb
        it "capitalizes letters after an underscore" do
          result = @case_converter.upcase_chars("some_method")
          result.should == "Some_Method"
        end
  • 142. /\A_?[a-z]/
  • 143. Pipe For Alternation /\A_?[a-z]|[a-z]/
  • 144. Look Behind /\A_?[a-z]|(?<=_)[a-z]/
  • 145. case_converter.rb
        def upcase_chars(string)
          re = /\A_?[a-z]|(?<=_)[a-z]/
          string.gsub(re) { |char| char.upcase }
        end
  • 146. [Same code] Spec Passes!
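
A small sketch of what the look-behind contributes: `(?<=_)` checks that an underscore precedes the letter without including it in the match. The comparison pattern `/_[a-z]/` and the extended-mode (`/x`) spelling are additions for illustration; the `/x` flag makes the engine ignore whitespace inside the pattern, which is one way to write the regex in spaced-out pieces.

```ruby
with_lookbehind = /(?<=_)[a-z]/
without_lookbehind = /_[a-z]/

# The look-behind is zero-width, so only the letter is returned.
letters_only = "some_method".scan(with_lookbehind)
# Without it, the underscore is part of the match.
with_underscore = "some_method".scan(without_lookbehind)

# The full pattern, and an equivalent spelling in extended mode.
compact = /\A_?[a-z]|(?<=_)[a-z]/
extended = / \A _? [a-z] | (?<=_) [a-z] /x

compact_result = "some_method".gsub(compact) { |char| char.upcase }
extended_result = "some_method".gsub(extended) { |char| char.upcase }

puts letters_only.inspect    # ["m"]
puts with_underscore.inspect # ["_m"]
puts compact_result          # Some_Method
puts extended_result         # Some_Method
```
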
  • 147. snake_case to CamelCase: Remove underscores
  • 148-150. case_converter_spec.rb
        it "removes underscores" do
          result = @case_converter.rmv_underscores("some_method")
          result.should == "somemethod"
        end
  • 151. Matches An Underscore /_/
  • 152-153. case_converter.rb
        def rmv_underscores(string)
          re = /_/
          string.gsub(re, "")
        end
  • 154. [Same code] Spec Passes!
  • 155. snake_case to CamelCase: Combine results of two methods
  • 156-158. case_converter_spec.rb
        it "converts snake_case to CamelCase" do
          result = @case_converter.snake_to_camel("some_method")
          result.should == "SomeMethod"
        end
  • 159. case_converter.rb
        def snake_to_camel(string)
          upcase_chars(string)
        end
  • 160. case_converter.rb
        def snake_to_camel(string)
          rmv_underscores(upcase_chars(string))
        end
  • 161. [Same code] Spec Passes!
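
Put together, the three methods from the slides might be assembled as follows. This is a sketch reconstructed from the transcript, not a copy of the author's repository; the class layout is assumed, and the "_private_method" input is an extra illustration.

```ruby
class CaseConverter
  # Upcases the first letter (after an optional leading underscore)
  # and every lowercase letter that directly follows an underscore.
  def upcase_chars(string)
    re = /\A_?[a-z]|(?<=_)[a-z]/
    string.gsub(re) { |char| char.upcase }
  end

  # Removes every underscore from the string.
  def rmv_underscores(string)
    string.gsub(/_/, "")
  end

  # Combines the two steps: capitalize first, then strip underscores.
  def snake_to_camel(string)
    rmv_underscores(upcase_chars(string))
  end
end

converter = CaseConverter.new
puts converter.upcase_chars("some_method")       # Some_Method
puts converter.snake_to_camel("some_method")     # SomeMethod
puts converter.snake_to_camel("_private_method") # PrivateMethod
```

The order of the two steps matters: the underscores have to stay in place until the look-behind has used them to locate the letters to capitalize.
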
  • 162. Code is available here: https://github.com/nellshamrell/snake_to_camel_case
  • 163. Conclusion Photo By Steve Jurvetson Creative Commons Attribution Generic 2.0
  • 164. Develop regular expressions in small pieces
  • 165. If you write code, you can write regular expressions
  • 166. Move beyond the fear
  • 167. Nell Shamrell, Software Development Engineer, Blue Box, @nellshamrell (Resources: https://gist.github.com/nellshamrell/6031738) (Photo By Leonardo Pallotta, Creative Commons Attribution Generic 2.0)