And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

A wise hacker said: “Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”
Regular expressions are a powerful tool and a first-class citizen in Ruby, so it is tempting to overuse them. But knowing them and using them properly is a fundamental asset for every developer.
We’ll see hands-on examples of proper regexp usage in Ruby code. We’ll also look at bad and ugly cases and learn how to approach writing, testing and debugging regular expressions.

Published in: Technology
Transcript of "And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli"

  1. And now you have two problems
     Ruby regular expressions for fun and profit
     Luca Mearelli @lmea
     Codemotion Rome - 2013
  2. Regular expressions
     patterns to describe the contents of a text
     • cat catch indicate ...
     • 2013-03-22, YYYY-MM-DD, ...
     • $ 12,500.80
  3. Regexps: good for...
     Pattern matching
     Search and replace
  4. Regexp in Ruby
     Regexp object: Regexp.new("cat")
     literal notation #1: %r{cat}
     literal notation #2: /cat/
  5. Regexp syntax
     literals: /cat/ matches any ‘cat’ substring
     the dot: /./ matches any character
     character classes: /[aeiou]/ /[a-z]/ /[01]/
     negated character classes: /[^abc]/
  6. Regexp syntax: Modifiers
     case insensitive: /./i
     only interpolate #{} blocks once: /./o
     multiline mode - . will match newline: /./m
     extended mode - whitespace is ignored: /./x
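The modifiers listed on this slide can be tried out with a small sketch like the following. The constant names and sample strings are made up for illustration and are not from the talk.

```ruby
# /i: case-insensitive matching
CASE_INSENSITIVE = /cat/i

# /m: the dot also matches newline characters
DOT_MATCHES_NEWLINE = /a.b/m

# /x: whitespace and comments inside the pattern are ignored
DATE_PATTERN = /
  (\d{4}) -   # year
  (\d{2}) -   # month
  (\d{2})     # day
/x

p "My CAT" =~ CASE_INSENSITIVE                  # => 3
p "a\nb" =~ /a.b/                               # => nil (no /m, so . stops at the newline)
p "a\nb" =~ DOT_MATCHES_NEWLINE                 # => 0
p DATE_PATTERN.match("2013-03-22").captures     # => ["2013", "03", "22"]
```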
  7. Regexp syntax: Shorthand classes
     /\d/ digit            /\D/ non digit
     /\s/ whitespace       /\S/ non whitespace
     /\w/ word character   /\W/ non word character
     /\h/ hexdigit         /\H/ non hexdigit
  8. Regexp syntax: Anchors
     /^/  beginning of line     /$/  end of line
     /\b/ word boundary         /\B/ non word boundary
     /\A/ beginning of string   /\z/ end of string
     /\Z/ end of string. If string ends with a newline, it matches just before newline
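The difference between line anchors and string anchors is easiest to see on a multi-line string. This is a minimal sketch; `TEXT` is an invented sample, not something from the slides.

```ruby
# Two lines, with a trailing newline at the end of the string.
TEXT = "first line\nsecond line\n"

p TEXT.scan(/^\w+/)    # => ["first", "second"]  (^ matches at every line start)
p TEXT.scan(/\A\w+/)   # => ["first"]            (\A matches only at the string start)
p TEXT =~ /line$/      # => 6    ($ matches before any newline)
p TEXT =~ /line\Z/     # => 18   (\Z matches just before the final newline)
p TEXT =~ /line\z/     # => nil  (\z matches only at the very end, after the newline)

# \b requires a word boundary, so the "cat" inside "concatenate" is skipped.
p "concatenate cat" =~ /\bcat\b/   # => 12
```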
  9. Regexp syntax
     alternation: /cat|dog/ matches ‘cats and dogs’
     0-or-more: /ab*/ matches ‘a’ ‘ab’ ‘abb’...
     1-or-more: /ab+/ matches ‘ab’ ‘abb’ ...
     given-number: /ab{2}/ matches ‘abb’ but not ‘ab’ or the whole ‘abbb’ string
  10. Regexp syntax
      greedy matches: /.+cat/ matches ‘the cat is catching a mouse’
      lazy matches: /.+?\scat/ matches ‘the cat is catching a mouse’
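The transcript loses whatever highlighting showed which part of the sentence each pattern consumes. A quick sketch makes the matched substrings visible (the constant names are illustrative only):

```ruby
SENTENCE = "the cat is catching a mouse"

# Greedy: .+ grabs as much as possible, then backs off to the LAST "cat".
GREEDY_MATCH = SENTENCE[/.+cat/]

# Lazy: .+? grabs as little as possible, stopping at the FIRST " cat".
LAZY_MATCH = SENTENCE[/.+?\scat/]

p GREEDY_MATCH   # => "the cat is cat"
p LAZY_MATCH     # => "the cat"
```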
  11. Regexp syntax
      grouping: /(\d{3}\.){3}\d{3}/ matches IP-like strings
      capturing: /a (cat|dog)/ the match is captured in $1 to be used later
      non capturing: /a (?:cat|dog)/ no content captured
      atomic grouping: /(?>a+)/ doesn’t backtrack
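A small sketch of the four kinds of groups, using invented sample strings. The atomic-group case compares a plain `a+` with `(?>a+)` on an input where a match is only possible if the quantifier gives a character back.

```ruby
IP_LIKE = /(\d{3}\.){3}\d{3}/
p "192.168.001.010" =~ IP_LIKE                  # => 0

# Capturing group: the alternative that matched is available as group 1.
CAPTURED = /a (cat|dog)/.match("I have a dog")
p CAPTURED[1]                                   # => "dog"

# Non-capturing group: same match, but nothing is stored.
UNCAPTURED = /a (?:cat|dog)/.match("I have a dog")
p UNCAPTURED.captures                           # => []

# Plain a+ can give back one "a" so that the trailing "ab" still matches.
p "aaab" =~ /a+ab/                              # => 0
# The atomic group keeps every "a" it consumed, so "ab" can never match.
p "aaab" =~ /(?>a+)ab/                          # => nil
```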
  12. String substitution
      "My cat eats catfood".sub(/cat/, "dog")
      # => My dog eats catfood
      "My cat eats catfood".gsub(/cat/, "dog")
      # => My dog eats dogfood
      "My cat eats catfood".gsub(/\bcat(\w+)/, "dog\\1")
      # => My cat eats dogfood
      "My cat eats catfood".gsub(/\bcat(\w+)/){|m| $1.reverse}
      # => My cat eats doof
  13. String parsing
      "Codemotion Rome: Mar 20 to Mar 23".scan(/\w{3} \d{1,2}/)
      # => ["Mar 20", "Mar 23"]
      "Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/)
      # => [["Mar", "20"], ["Mar", "23"]]
      "Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/){|a,b| puts b+"/"+a}
      # 20/Mar
      # 23/Mar
      # => "Codemotion Rome: Mar 20 to Mar 23"
  14. Regexp methods
      if "what a wonderful world" =~ /(world)/
        puts "hello #{$1.upcase}"
      end
      # hello WORLD
      if /(world)/.match("The world")
        puts "hello #{$1.upcase}"
      end
      # hello WORLD
      match_data = /(world)/.match("The world")
      puts "hello #{match_data[1].upcase}"
      # hello WORLD
  15. Rails app examples
      # in routing
      match 'path/:id', :constraints => { :id => /[A-Z]\d{5}/ }
      # in validations
      validates :phone, :format => /\A\d{2,4}\s*\d+\z/
      validates :phone, :format => { :with => /\A\d{2,4}\s*\d+\z/ }
      validates :phone, :format => { :without => /\A02\s*\d+\z/ }
  16. Rails examples
      # in ActiveModel::Validations::NumericalityValidator
      def parse_raw_value_as_an_integer(raw_value)
        raw_value.to_i if raw_value.to_s =~ /\A[+-]?\d+\Z/
      end
      # in ActionDispatch::RemoteIp::IpSpoofAttackError
      # IP addresses that are "trusted proxies" that can be stripped from
      # the comma-delimited list in the X-Forwarded-For header. See also:
      # http://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spaces
      TRUSTED_PROXIES = %r{
        ^127\.0\.0\.1$                | # localhost
        ^(10                          | # private IP 10.x.x.x
          172\.(1[6-9]|2[0-9]|3[0-1]) | # private IP in the range 172.16.0.0 .. 172.31.255.255
          192\.168                      # private IP 192.168.x.x
         )\.
      }x
      WILDCARD_PATH = %r{\*([^/\)]+)\)?$}
  17. Regexps are dangerous
      "If I was going to place a bet on something about Rails security, it'd be that there are more regex vulnerabilities in the tree. I am uncomfortable with how much Rails leans on regex for policy decisions."
      Thomas H. Ptacek (Founder @ Matasano, Feb 2013)
  18. Tip #1
      Beware of nested quantifiers
      /(x+x+)+y/ =~ 'xxxxxxxxxy'
      /(xx+)+y/ =~ 'xxxxxxxxxx'
      /(?>x+x+)+y/ =~ 'xxxxxxxxx'
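To see why nested quantifiers are risky, note that `(x+x+)+` can split a run of x's in exponentially many ways, and a backtracking engine may try them all before giving up when no `y` follows. The sketch below only checks, on short inputs, that an atomic group and a flattened pattern accept and reject the same strings. `FLAT` is a rewrite assumed here for illustration and does not appear on the slide.

```ruby
NESTED = /(x+x+)+y/      # many ways to split the x's between the two x+ and the outer +
ATOMIC = /(?>x+x+)+y/    # the atomic group never gives characters back once it has matched
FLAT   = /xx+y/          # assumed equivalent: "two or more x, then y", no nesting at all

WITH_Y    = "xxxxxxxxxy"
WITHOUT_Y = "xxxxxxxxxx"

[NESTED, ATOMIC, FLAT].each do |pattern|
  # Every pattern finds the match at index 0 and rejects the string without a y.
  p [pattern.source, WITH_Y =~ pattern, WITHOUT_Y =~ pattern]
end
```

The inputs are deliberately short. With much longer runs of x and no trailing y, the nested version is the one that may become slow, depending on the Ruby version and its regexp engine.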
  19. Tip #2
      Don’t make everything optional
      /[-+]?[0-9]*\.?[0-9]*/ =~ '.'
      /[-+]?([0-9]*\.?[0-9]+|[0-9]+)/
      /[-+]?[0-9]*\.?[0-9]+/
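When every element of a pattern is optional, the pattern also matches the empty string, a lone sign, or a lone dot. The sketch below wraps the slide's patterns in `\A...\z` so they validate a whole string. Those anchors and the helper name `looks_like_number?` are assumptions added for illustration.

```ruby
ALL_OPTIONAL = /\A[-+]?[0-9]*\.?[0-9]*\z/
STRICTER     = /\A[-+]?([0-9]*\.?[0-9]+|[0-9]+)\z/

# Hypothetical helper: true when the whole string matches the given pattern.
def looks_like_number?(pattern, text)
  !!(text =~ pattern)
end

p looks_like_number?(ALL_OPTIONAL, "")      # => true  (nothing is required)
p looks_like_number?(ALL_OPTIONAL, ".")     # => true
p looks_like_number?(STRICTER, "")          # => false (at least one digit is required)
p looks_like_number?(STRICTER, ".")         # => false
p looks_like_number?(STRICTER, "-3.14")     # => true
p looks_like_number?(STRICTER, ".5")        # => true
```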
  20. Tip #3
      Evaluate tradeoffs
      [The slide shows a regexp several thousand characters long for validating e-mail addresses in full, built from repeated fragments such as (?:(?:\r\n)?[ \t])*. Its text is garbled beyond recovery in this transcript.]
      /\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/
  21. Tip #4
      Capture repeated groups and don’t repeat a captured group
      /!(abc|123)+!/ =~ '!abc123!'
      # $1 == '123'
      /!((abc|123)+)!/ =~ '!abc123!'
      # $1 == 'abc123'
  22. Tip #5
      Use interpolation with care
      str = "cat"
      /#{str}/ =~ "My cat eats catfood"
      /#{Regexp.quote(str)}/ =~ "My cat eats catfood"
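With `str = "cat"` both forms behave the same. The difference only shows up when the interpolated string contains regexp metacharacters. A minimal sketch, with hypothetical helper names and sample strings:

```ruby
# Hypothetical helper: interpolates the fragment as-is, so it is read as a pattern.
def contains_pattern?(text, fragment)
  !!(text =~ /#{fragment}/)
end

# Hypothetical helper: escapes the fragment first, so it is read as literal text.
def contains_literal?(text, fragment)
  !!(text =~ /#{Regexp.quote(fragment)}/)
end

p contains_pattern?("catfood costs 1x50", "1.50")   # => true  (the dot matched "x")
p contains_literal?("catfood costs 1x50", "1.50")   # => false
p contains_literal?("catfood costs 1.50", "1.50")   # => true

# Unescaped input can also be an invalid pattern and raise at match time.
begin
  contains_pattern?("a(b", "(")
rescue RegexpError => e
  puts "invalid pattern: #{e.message}"
end
p contains_literal?("a(b", "(")                     # => true
```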
  23. Tip #6
      Don’t use ^ and $ to match the string’s beginning and end
      validates :url, :format => /^https?/
      "http://example.com" =~ /^https?/
      "javascript:alert('hello!');%0Ahttp://example.com"
      "javascript:alert('hello!');\nhttp://example.com" =~ /^https?/
      "javascript:alert('hello!');\nhttp://example.com" =~ /\Ahttps?/
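The slide's point can be checked outside Rails with plain Ruby. In this sketch the helper `accepted?` and the constant names are invented for illustration. The payload is the decoded form of the `%0A` example, where the newline starts a second line that begins with `http`.

```ruby
LINE_ANCHORED   = /^https?/     # ^ matches at the start of ANY line
STRING_ANCHORED = /\Ahttps?/    # \A matches only at the start of the string

PAYLOAD = "javascript:alert('hello!');\nhttp://example.com"

# Hypothetical stand-in for a format validation: true when the pattern matches.
def accepted?(pattern, value)
  !!(value =~ pattern)
end

p accepted?(LINE_ANCHORED, "http://example.com")     # => true
p accepted?(LINE_ANCHORED, PAYLOAD)                  # => true  (second line starts with http)
p accepted?(STRING_ANCHORED, "http://example.com")   # => true
p accepted?(STRING_ANCHORED, PAYLOAD)                # => false
```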
  24. From 060bb7250b963609a0d8a5d0559e36b99d2402c6 Mon Sep 17 00:00:00 2001
      From: joernchen of Phenoelit <joernchen@phenoelit.de>
      Date: Sat, 9 Feb 2013 15:46:44 -0800
      Subject: [PATCH] Fix issue with attr_protected where malformed input could circumvent protection

      Fixes: CVE-2013-0276
      ---
       activemodel/lib/active_model/attribute_methods.rb | 2 +-
       activemodel/lib/active_model/mass_assignment_security/permission_set.rb | 2 +-
       2 files changed, 2 insertions(+), 2 deletions(-)

      diff --git a/activemodel/lib/active_model/attribute_methods.rb b/activemodel/lib/active_model/attribute_methods.rb
      index f033a94..96f2c82 100644
      --- a/activemodel/lib/active_model/attribute_methods.rb
      +++ b/activemodel/lib/active_model/attribute_methods.rb
      @@ -365,7 +365,7 @@ module ActiveModel
                 end

                 @prefix, @suffix = options[:prefix] || '', options[:suffix] || ''
      -          @regex = /^(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})$/
      +          @regex = /\A(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})\z/
                 @method_missing_target = "#{@prefix}attribute#{@suffix}"
                 @method_name = "#{prefix}%s#{suffix}"
               end
      diff --git a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
      index a1fcdf1..10faa29 100644
      --- a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
      +++ b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
      @@ -19,7 +19,7 @@ module ActiveModel
             protected

               def remove_multiparameter_id(key)
      -          key.to_s.gsub(/\(.+/, '')
      +          key.to_s.gsub(/\(.+/m, '')
               end
             end
      --
      1.8.1.1
  25. From 99123ad12f71ce3e7fe70656810e53133665527c Mon Sep 17 00:00:00 2001
      From: Aaron Patterson <aaron.patterson@gmail.com>
      Date: Fri, 15 Mar 2013 15:04:00 -0700
      Subject: [PATCH] fix protocol checking in sanitization [CVE-2013-1857]

      Conflicts:
        actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
      ---
       .../action_controller/vendor/html-scanner/html/sanitizer.rb | 4 ++--
       actionpack/test/template/html-scanner/sanitizer_test.rb | 10 ++++++++++
       2 files changed, 12 insertions(+), 2 deletions(-)

      diff --git a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
      index 02eea58..994e115 100644
      --- a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
      +++ b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
      @@ -66,7 +66,7 @@ module HTML

           # A regular expression of the valid characters used to separate protocols like
           # the ':' in 'http://foo.com'
      -    self.protocol_separator = /:|(&#0*58)|(&#x70)|(%|&#37;)3A/
      +    self.protocol_separator = /:|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i

           # Specifies a Set of HTML attributes that can have URIs.
           self.uri_attributes = Set.new(%w(href src cite action longdesc xlink:href lowsrc))
      @@ -171,7 +171,7 @@ module HTML

             def contains_bad_protocols?(attr_name, value)
               uri_attributes.include?(attr_name) &&
      -        (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(%|&#37;)3A/ && !allowed_protocols.include?(value.split(protocol_separator).first.downcase))
      +        (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i && !allowed_protocols.include?(value.split(protocol_separator).first.downcase.strip))
             end
           end
         end
      diff --git a/actionpack/test/template/html-scanner/sanitizer_test.rb b/actionpack/test/template/html-scanner/sanitizer_test.rb
      index 4e2ad4e..dee60c9 100644
      --- a/actionpack/test/template/html-scanner/sanitizer_test.rb
      +++ b/actionpack/test/template/html-scanner/sanitizer_test.rb
      @@ -176,6 +176,7 @@ class SanitizerTest < ActionController::TestCase
           %(<IMG SRC="jav ascript:alert(XSS);">),
  26. Tools
      Print a cheatsheet!
      Info: http://www.regular-expressions.info
      Debug: http://rubular.com http://rubyxp.com
      Visualize: http://www.regexper.com/
  27. Thank you!