And Now You Have Two Problems

Luca Mearelli
Craftsman, developer
And now you have
two problems
Ruby regular expressions for fun and profit
Luca Mearelli @lmea
Codemotion Rome - 2013
Regular expressions
Patterns to describe the contents of a text:
• cat, catch, indicate, ...
• 2013-03-22, YYYY-MM-DD, ...
• $ 12,500.80
Regexps: good for...
Pattern matching
Search and replace
Regexp in Ruby
Regexp object: Regexp.new("cat")
literal notation #1: %r{cat}
literal notation #2: /cat/
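As a small sketch (the variable names are only illustrative), the three notations build equivalent Regexp objects, and %r{} is convenient when the pattern itself contains slashes:

```ruby
# Three ways to build the same pattern.
from_constructor = Regexp.new("cat")
from_percent_r   = %r{cat}
from_slashes     = /cat/

# Regexp#== compares the pattern source and the options.
same = (from_constructor == from_percent_r) && (from_percent_r == from_slashes)
puts same

# With %r{} the slashes in the pattern need no escaping.
path_index = ("/usr/local/bin" =~ %r{/usr/local})
puts path_index.inspect
```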
Regexp syntax
literals: /cat/ matches any ‘cat’ substring
the dot: /./ matches any character except a newline
character classes: /[aeiou]/ /[a-z]/ /[01]/
negated character classes: /[^abc]/
Regexp syntax
Modifiers
case insensitive: /./i
only interpolate #{} blocks once: /./o
multiline mode - '.' will match newline: /./m
extended mode - whitespace is ignored: /./x
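A minimal sketch of three of these modifiers in action (the sample strings and variable names are made up for illustration; /o is not shown):

```ruby
# /i: case-insensitive matching
case_insensitive = !!("CAT" =~ /cat/i)

# /m: the dot also matches a newline
dot_without_m = !!("a\nb" =~ /a.b/)
dot_with_m    = !!("a\nb" =~ /a.b/m)

# /x: whitespace and comments inside the pattern are ignored
date = /
  (\d{4}) -   # year
  (\d{2}) -   # month
  (\d{2})     # day
/x
year = "2013-03-22"[date, 1]

puts case_insensitive, dot_without_m, dot_with_m, year
```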
Regexp syntax
Shorthand classes
/\d/ digit /\D/ non digit
/\s/ whitespace /\S/ non whitespace
/\w/ word character /\W/ non word character
/\h/ hexdigit /\H/ non hexdigit
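For illustration, a short sketch of the shorthand classes on a made-up sample string:

```ruby
sample = "Tel 06 1234"

digits = sample.scan(/\d+/)       # runs of digits
words  = sample.scan(/\w+/)       # runs of word characters
blanks = sample.scan(/\s/).size   # number of whitespace characters

# \h matches a single hexadecimal digit
hex_ok  = !!("7f3a" =~ /\A\h+\z/)
hex_bad = !!("7g3a" =~ /\A\h+\z/)

p digits, words, blanks, hex_ok, hex_bad
```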
Regexp syntax
Anchors
/^/ beginning of line /$/ end of line
/\b/ word boundary /\B/ non word boundary
/\A/ beginning of string /\z/ end of string
/\Z/ end of string; if the string ends with a newline, it matches just before the newline
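The difference between line anchors and string anchors can be sketched as follows (the two-line sample text is an assumption made for the example):

```ruby
text = "first line\nsecond line\n"

# ^ works per line, so it also matches after the embedded newline.
line_starts = text.scan(/^\w+/)

# \A only matches at the very beginning of the string.
string_start = text.scan(/\A\w+/)

# \z is the absolute end; \Z also matches just before a final newline.
strict_end  = !!(text =~ /line\z/)
lenient_end = !!(text =~ /line\Z/)

# \b keeps 'cat' from matching inside 'catalog'.
whole_words = "cat catalog".scan(/\bcat\b/)

p line_starts, string_start, strict_end, lenient_end, whole_words
```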
Regexp syntax
alternation: /cat|dog/ matches ‘cat’ and ‘dog’ in ‘cats and dogs’
0-or-more: /ab*/ matches ‘a’ ‘ab’ ‘abb’...
1-or-more: /ab+/ matches ‘ab’ ‘abb’ ...
given-number: /ab{2}/ matches ‘abb’ but not ‘ab’ or the whole ‘abbb’ string
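The alternation and quantifier rules above can be checked with a few one-liners (a sketch; the sample strings are chosen only for illustration):

```ruby
alternatives = "cats and dogs".scan(/cat|dog/)
zero_or_more = "a ab abb".scan(/ab*/)
one_or_more  = "a ab abb".scan(/ab+/)

# {2} asks for exactly two b's: only 'abb' out of 'abbb' is matched.
exactly_two = "abbb"[/ab{2}/]
too_short   = "ab"[/ab{2}/]

p alternatives, zero_or_more, one_or_more, exactly_two, too_short
```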
Regexp syntax
greedy matches: /.+cat/ matches as much as possible of ‘the cat is catching a mouse’
lazy matches: /.+?\scat/ matches as little as possible of ‘the cat is catching a mouse’
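To see exactly which part of the sentence each pattern consumes, a short sketch using String#[] with a regexp:

```ruby
sentence = "the cat is catching a mouse"

# Greedy: .+ grabs as much as it can, then gives back until 'cat' fits.
greedy = sentence[/.+cat/]

# Lazy: .+? grows one character at a time and stops at the first fit.
lazy = sentence[/.+?\scat/]

p greedy, lazy
```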
Regexp syntax
grouping: /(\d{3}\.){3}\d{3}/ matches IP-like strings
capturing: /a (cat|dog)/ the match is captured in $1 to be used later
non capturing: /a (?:cat|dog)/ no content captured
atomic grouping: /(?>a+)/ doesn’t backtrack
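A sketch of how the four kinds of group behave; the input strings are assumptions picked to make the differences visible:

```ruby
# Grouping: the quantifier {3} applies to the whole parenthesized part.
ip_like = !!("192.168.100.200" =~ /(\d{3}\.){3}\d{3}/)

# Capturing vs non-capturing groups.
captured     = /a (cat|dog)/.match("a dog")[1]
not_captured = /a (?:cat|dog)/.match("a dog").captures

# Atomic group: once a+ has consumed the a's it will not give any back,
# so the trailing 'ab' can never match here.
plain_index  = ("aaab" =~ /a+ab/)
atomic_index = ("aaab" =~ /(?>a+)ab/)

p ip_like, captured, not_captured, plain_index, atomic_index
```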
String substitution
"My cat eats catfood".sub(/cat/, "dog")
# => "My dog eats catfood"
"My cat eats catfood".gsub(/cat/, "dog")
# => "My dog eats dogfood"
"My cat eats catfood".gsub(/\bcat(\w+)/, "dog\\1")
# => "My cat eats dogfood"
"My cat eats catfood".gsub(/\bcat(\w+)/) { |m| $1.reverse }
# => "My cat eats doof"
String parsing
"Codemotion Rome: Mar 20 to Mar 23".scan(/\w{3} \d{1,2}/)
# => ["Mar 20", "Mar 23"]
"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/)
# => [["Mar", "20"], ["Mar", "23"]]
"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/) { |a, b| puts b + "/" + a }
# 20/Mar
# 23/Mar
# => "Codemotion Rome: Mar 20 to Mar 23"
Regexp methods
if "what a wonderful world" =~ /(world)/
  puts "hello #{$1.upcase}"
end
# hello WORLD
if /(world)/.match("The world")
  puts "hello #{$1.upcase}"
end
# hello WORLD
match_data = /(world)/.match("The world")
puts "hello #{match_data[1].upcase}"
# hello WORLD
Rails app examples
# in routing
match 'path/:id', :constraints => { :id => /[A-Z]\d{5}/ }
# in validations
validates :phone, :format => /\A\d{2,4}\s*\d+\z/
validates :phone, :format => { :with => /\A\d{2,4}\s*\d+\z/ }
validates :phone, :format => { :without => /\A02\s*\d+\z/ }
Rails examples
# in ActiveModel::Validations::NumericalityValidator
def parse_raw_value_as_an_integer(raw_value)
  raw_value.to_i if raw_value.to_s =~ /\A[+-]?\d+\Z/
end
# in ActionDispatch::RemoteIp::IpSpoofAttackError
# IP addresses that are "trusted proxies" that can be stripped from
# the comma-delimited list in the X-Forwarded-For header. See also:
# http://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spaces
TRUSTED_PROXIES = %r{
  ^127\.0\.0\.1$                | # localhost
  ^(10                          | # private IP 10.x.x.x
    172\.(1[6-9]|2[0-9]|3[0-1]) | # private IP in the range 172.16.0.0 .. 172.31.255.255
    192\.168                      # private IP 192.168.x.x
   )\.
}x
WILDCARD_PATH = %r{\*([^/\)]+)\)?$}
Regexps are
dangerous
"If I was going to place a bet on something
about Rails security, it'd be that there are more
regex vulnerabilities in the tree. I am
uncomfortable with how much Rails leans on
regex for policy decisions."
Thomas H. Ptacek (Founder @ Matasano, Feb 2013)
Tip #1
Beware of nested quantifiers
/(x+x+)+y/ =~ 'xxxxxxxxxy'
/(xx+)+y/ =~ 'xxxxxxxxxx'
/(?>x+x+)+y/ =~ 'xxxxxxxxx'
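A sketch of why the nested form is risky: the nested and the atomic pattern give the same answers, but when the final 'y' is missing a backtracking engine may try many ways of splitting the run of x's between the quantifiers before giving up. The strings are kept short here so the example runs instantly; the variable names are illustrative.

```ruby
nested = /(x+x+)+y/
atomic = /(?>x+x+)+y/

with_y    = "x" * 9 + "y"
without_y = "x" * 9

# Both patterns agree on the result...
nested_hit  = (nested =~ with_y)
atomic_hit  = (atomic =~ with_y)

# ...and both fail without the 'y'. The atomic group fails after far
# fewer attempts because it never re-splits the x's it has consumed.
nested_miss = (nested =~ without_y)
atomic_miss = (atomic =~ without_y)

p nested_hit, atomic_hit, nested_miss, atomic_miss
```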
Tip #2
Don’t make everything optional
/[-+]?[0-9]*\.?[0-9]*/ =~ '.'
/[-+]?([0-9]*\.?[0-9]+|[0-9]+)/
/[-+]?[0-9]*\.?[0-9]+/
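A sketch of what goes wrong when every part is optional: the pattern accepts a lone dot and even matches the empty string, while requiring at least one digit rejects both. The sample inputs are assumptions chosen for the example.

```ruby
all_optional = /[-+]?[0-9]*\.?[0-9]*/
stricter     = /[-+]?[0-9]*\.?[0-9]+/

lone_dot    = all_optional.match(".")[0]     # a bare dot is accepted
empty_match = all_optional.match("abc")[0]   # zero-length match at position 0

rejected_dot = stricter.match(".")           # no digits, no match
signed       = stricter.match("-12.5")[0]
leading_dot  = stricter.match(".5")[0]

p lone_dot, empty_match, rejected_dot, signed, leading_dot
```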
Tip #3
Evaluate tradeoffs
/\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/
/(?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t]
)+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:
rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(
?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[
t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-0
31]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*
](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+
(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:
(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z
|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)
?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:
rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[
t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)
?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t]
)*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[
t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*
)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t]
)+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)
*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+
|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:r
n)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:
rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t
]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031
]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](
?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?
:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?
:rn)?[ t])*))*>(?:(?:rn)?[ t])*)|(?:[^()<>@,;:".[] 000-031]+(?:(?
:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?
[ t]))*"(?:(?:rn)?[ t])*)*:(?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[]
000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|
.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>
@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"
(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t]
)*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:
".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?
:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[
]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-
031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(
?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;
:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([
^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:"
.[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[
]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".
[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]
r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[]
000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]
|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 0
00-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|
.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,
;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?
:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*
(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".
[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[
^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]
]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*)(?:,s*(
?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:
".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(
?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[
["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t
])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t
])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?
:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|
Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:
[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[
]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)
?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["
()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)
?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>
@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[
t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,
;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t]
)*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:
".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?
(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".
[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:
rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[[
"()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])
*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])
+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:
.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z
|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(
?:rn)?[ t])*))*)?;s*)/
Tip #4
Capture repeated groups and don’t repeat a captured group
/!(abc|123)+!/ =~ '!abc123!'
# $1 == '123'
/!((abc|123)+)!/ =~ '!abc123!'
# $1 == 'abc123'
Tip #5
Use interpolation with care
str = "cat"
/#{str}/ =~ "My cat eats catfood"
/#{Regexp.quote(str)}/ =~ "My cat eats catfood"
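With a plain word like "cat" both forms behave the same; the difference shows up as soon as the interpolated string contains a metacharacter. A sketch, assuming a hypothetical input of "a.c":

```ruby
str = "a.c"   # hypothetical input containing a metacharacter

unquoted = /#{str}/
quoted   = /#{Regexp.quote(str)}/

# Unquoted, the dot is a wildcard and happily matches 'b'.
unquoted_matches_abc = !!(unquoted =~ "abc")

# Quoted, the dot is literal.
quoted_matches_abc     = !!(quoted =~ "abc")
quoted_matches_literal = !!(quoted =~ "a.c")

p unquoted_matches_abc, quoted_matches_abc, quoted_matches_literal
```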
Tip #6
Don’t use ^ and $ to match the string’s beginning and end
validates :url, :format => /^https?/
"http://example.com" =~ /^https?/
"javascript:alert('hello!');%0Ahttp://example.com"
"javascript:alert('hello!');\nhttp://example.com" =~ /^https?/
"javascript:alert('hello!');\nhttp://example.com" =~ /\Ahttps?/
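The results of the Tip #6 expressions can be sketched as follows: the line anchor accepts the payload because "http" starts a line after the embedded newline, while the string anchor rejects it. The variable names are illustrative.

```ruby
payload = "javascript:alert('hello!');\nhttp://example.com"
honest  = "http://example.com"

# ^ matches at the start of any line, including the second one.
line_anchored_accepts_payload = !!(payload =~ /^https?/)

# \A matches only at the start of the whole string.
string_anchored_accepts_payload = !!(payload =~ /\Ahttps?/)
string_anchored_accepts_honest  = !!(honest =~ /\Ahttps?/)

p line_anchored_accepts_payload,
  string_anchored_accepts_payload,
  string_anchored_accepts_honest
```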
From 060bb7250b963609a0d8a5d0559e36b99d2402c6 Mon Sep 17 00:00:00 2001
From: joernchen of Phenoelit <joernchen@phenoelit.de>
Date: Sat, 9 Feb 2013 15:46:44 -0800
Subject: [PATCH] Fix issue with attr_protected where malformed input could
circumvent protection
Fixes: CVE-2013-0276
---
activemodel/lib/active_model/attribute_methods.rb | 2 +-
activemodel/lib/active_model/mass_assignment_security/permission_set.rb | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/activemodel/lib/active_model/attribute_methods.rb b/activemodel/lib/active_model/attribute_methods.rb
index f033a94..96f2c82 100644
--- a/activemodel/lib/active_model/attribute_methods.rb
+++ b/activemodel/lib/active_model/attribute_methods.rb
@@ -365,7 +365,7 @@ module ActiveModel
end
@prefix, @suffix = options[:prefix] || '', options[:suffix] || ''
- @regex = /^(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})$/
+ @regex = /\A(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})\z/
@method_missing_target = "#{@prefix}attribute#{@suffix}"
@method_name = "#{prefix}%s#{suffix}"
end
diff --git a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
index a1fcdf1..10faa29 100644
--- a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
+++ b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
@@ -19,7 +19,7 @@ module ActiveModel
protected
def remove_multiparameter_id(key)
- key.to_s.gsub(/\(.+/, '')
+ key.to_s.gsub(/\(.+/m, '')
end
end
--
1.8.1.1
From 99123ad12f71ce3e7fe70656810e53133665527c Mon Sep 17 00:00:00 2001
From: Aaron Patterson <aaron.patterson@gmail.com>
Date: Fri, 15 Mar 2013 15:04:00 -0700
Subject: [PATCH] fix protocol checking in sanitization [CVE-2013-1857]
Conflicts:
actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
---
.../action_controller/vendor/html-scanner/html/sanitizer.rb | 4 ++--
actionpack/test/template/html-scanner/sanitizer_test.rb | 10 ++++++++++
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
index 02eea58..994e115 100644
--- a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
+++ b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
@@ -66,7 +66,7 @@ module HTML
# A regular expression of the valid characters used to separate protocols like
# the ':' in 'http://foo.com'
- self.protocol_separator = /:|(&#0*58)|(&#x70)|(%|&#37;)3A/
+ self.protocol_separator = /:|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i
# Specifies a Set of HTML attributes that can have URIs.
self.uri_attributes = Set.new(%w(href src cite action longdesc xlink:href lowsrc))
@@ -171,7 +171,7 @@ module HTML
def contains_bad_protocols?(attr_name, value)
uri_attributes.include?(attr_name) &&
- (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(%|&#37;)3A/ && !allowed_protocols.include?(value.split(protocol_separator).first.downcase))
+ (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i && !allowed_protocols.include?(value.split(protocol_separator).first.downcase.strip))
end
end
end
diff --git a/actionpack/test/template/html-scanner/sanitizer_test.rb b/actionpack/test/template/html-scanner/sanitizer_test.rb
index 4e2ad4e..dee60c9 100644
--- a/actionpack/test/template/html-scanner/sanitizer_test.rb
+++ b/actionpack/test/template/html-scanner/sanitizer_test.rb
@@ -176,6 +176,7 @@ class SanitizerTest < ActionController::TestCase
%(<IMG SRC="jav&#x0A;ascript:alert('XSS');">),
Tools
Print a cheatsheet!
Info:
http://www.regular-expressions.info
Debug:
http://rubular.com
http://rubyxp.com
Visualize:
http://www.regexper.com/
Thank you!
And Now You Have Two Problems

  • 1. And now you have two problems Ruby regular expressions for fun and profit Luca Mearelli @lmea Codemotion Rome - 2013
  • 2. @lmea Regular expressions •cat catch indicate ... •2013-03-22, YYYY-MM-DD, ... •$ 12,500.80 patterns to describe the contents of a text
  • 3. @lmea Regexps: good for... Pattern matching Search and replace
  • 4. @lmea Regexp in ruby Regexp object: Regexp.new("cat") literal notation #1: %r{cat} literal notation #2: /cat/
  • 5. @lmea Regexp syntax literals: /cat/ matches any ‘cat’ substring the dot: /./ matches any character character classes: /[aeiou]/ /[a-z]/ /[01]/ negated character classes: /[^abc]/
  • 6. @lmea Regexp syntax case insensitive: /./i only interpolate #{} blocks once: /./o multiline mode - '.' will match newline: /./m extended mode - whitespace is ignored: /./x Modifiers
  • 7. @lmea Regexp syntax /d/ digit /D/ non digit /s/ whitespace /S/ non whitespace /w/ word character /W/ non word character /h/ hexdigit /H/ non hexdigit Shorthand classes
  • 8. @lmea Regexp syntax /^/ beginning of line /$/ end of line /b/ word boundary /B/ non word boundary /A/ beginning of string /z/ end of string /Z/ end of string. If string ends with a newline, it matches just before newline Anchors
  • 9. @lmea Regexp syntax alternation: /cat|dog/ matches ‘cats and dogs’ 0-or-more: /ab*/ matches ‘a’ ‘ab’ ‘abb’... 1-or-more: /ab+/ matches ‘ab’ ‘abb’ ... given-number: /ab{2}/ matches ‘abb’ but not ‘ab’ or the whole ‘abbb’ string
  • 10. @lmea Regexp syntax greedy matches: /.+cat/ matches ‘the cat is catching a mouse’ lazy matches: /.+?scat/ matches ‘the cat is catching a mouse’
  • 11. @lmea Regexp syntax grouping: /(d{3}.){3}d{3}/ matches IP- like strings capturing: /a (cat|dog)/ the match is captured in $1 to be used later non capturing: /a (?:cat|dog)/ no content captured atomic grouping: /(?>a+)/ doesn’t backtrack
  • 12. @lmea String substitution "My cat eats catfood".sub(/cat/, "dog") # => My dog eats catfood "My cat eats catfood".gsub(/cat/, "dog") # => My dog eats dogfood "My cat eats catfood".gsub(/bcat(w+)/, "dog1") # => My cat eats dogfood "My cat eats catfood".gsub(/bcat(w+)/){|m| $1.reverse} # => My cat eats doof
  • 13. @lmea String parsing "Codemotion Rome: Mar 20 to Mar 23".scan(/w{3} d{1,2}/) # => ["Mar 20", "Mar 23"] "Codemotion Rome: Mar 20 to Mar 23".scan(/(w{3}) (d{1,2})/) # => [["Mar", "20"], ["Mar", "23"]] "Codemotion Rome: Mar 20 to Mar 23".scan(/(w{3}) (d{1,2})/) {|a,b| puts b+"/"+a} # 20/Mar # 23/Mar # => "Codemotion Rome: Mar 20 to Mar 23"
  • 14. @lmea Regexp methods if "what a wonderful world" =~ /(world)/ puts "hello #{$1.upcase}" end # hello WORLD if /(world)/.match("The world") puts "hello #{$1.upcase}" end # hello WORLD match_data = /(world)/.match("The world") puts "hello #{match_data[1].upcase}" # hello WORLD
  • 15. @lmea Rails app examples # in routing match 'path/:id', :constraints => { :id => /[A-Z]d{5}/ } # in validations validates :phone, :format => /Ad{2,4}s*d+z/ validates :phone, :format => { :with=> /Ad{2,4}s*d+z/ } validates :phone, :format => { :without=> /A02s*d+z/ }
  • 16. @lmea Rails examples # in ActiveModel::Validations::NumericalityValidator def parse_raw_value_as_an_integer(raw_value) raw_value.to_i if raw_value.to_s =~ /A[+-]?d+Z/ end # in ActionDispatch::RemoteIp::IpSpoofAttackError # IP addresses that are "trusted proxies" that can be stripped from # the comma-delimited list in the X-Forwarded-For header. See also: # http://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spaces TRUSTED_PROXIES = %r{ ^127.0.0.1$ | # localhost ^(10 | # private IP 10.x.x.x 172.(1[6-9]|2[0-9]|3[0-1]) | # private IP in the range 172.16.0.0 .. 172.31.255.255 192.168 # private IP 192.168.x.x ). }x WILDCARD_PATH = %r{*([^/)]+))?$}
  • 17. @lmea Regexps are dangerous "If I was going to place a bet on something about Rails security, it'd be that there are more regex vulnerabilities in the tree. I am uncomfortable with how much Rails leans on regex for policy decisions." Thomas H. Ptacek (Founder @ Matasano, Feb 2013)
  • 18. @lmea Tip #1 Beware of nested quantifiers /(x+x+)+y/ =~ 'xxxxxxxxxy' /(xx+)+y/ =~ 'xxxxxxxxxx' /(?>x+x+)+y/ =~ 'xxxxxxxxx'
  • 19. @lmea Tip #2 Don’t make everything optional /[-+]?[0-9]*.?[0-9]*/ =~ '.' /[-+]?([0-9]*.?[0-9]+|[0-9]+)/ /[-+]?[0-9]*.?[0-9]+/
  • 20. @lmea Tip #3 Evaluate tradeoffs /\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/ versus the full RFC 822 address regex, which runs to several thousand characters: /(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*) ... )?;\s*)/
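To make the tradeoff concrete, here is a small sketch that exercises the short email pattern from this slide. The `/i` flag, the helper name `looks_like_email?`, and the sample addresses are assumptions added for illustration.

```ruby
# The short pattern from the slide. The /i flag is an assumption so that
# lowercase addresses match.
SIMPLE_EMAIL = /\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/i

# Hypothetical helper: true when the text contains something email-like.
def looks_like_email?(text)
  SIMPLE_EMAIL.match?(text)
end

puts looks_like_email?("luca@example.com")       # an everyday address
puts looks_like_email?("user@example.museum")    # TLD longer than four letters
puts looks_like_email?("john..doe@example.com")  # consecutive dots in the local part
```

The short pattern accepts the first address, rejects the second (a false negative for long TLDs), and accepts the third (a false positive). That is the price of a pattern that fits on one line; whether it is acceptable depends on what the match is used for.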
  • 21. @lmea Tip #4 Capture repeated groups and don’t repeat a captured group /!(abc|123)+!/ =~ '!abc123!' # $1 == '123' /!((abc|123)+)!/ =~ '!abc123!' # $1 == 'abc123'
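The difference between the two patterns on this slide can be checked with `MatchData` instead of the `$1` global. This is a small sketch; the constant and variable names are illustrative.

```ruby
TEXT = "!abc123!"

# Repeating a capturing group: the group keeps only its last iteration.
REPEATED_GROUP = /!(abc|123)+!/

# Capturing the repetition: an outer group wraps the whole repeated run.
GROUP_AROUND_REPETITION = /!((abc|123)+)!/

last_only = REPEATED_GROUP.match(TEXT)
whole_run = GROUP_AROUND_REPETITION.match(TEXT)

puts last_only[1]  # 123
puts whole_run[1]  # abc123
puts whole_run[2]  # 123 (the inner group still holds only the last iteration)
```

If the inner group is needed only for alternation, it can be made non-capturing with `(?:abc|123)` so that the outer group is the only capture.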
  • 22. @lmea Tip #5 use interpolation with care str = "cat" /#{str}/ =~ "My cat eats catfood" /#{Regexp.quote(str)}/ =~ "My cat eats catfood"
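With `str = "cat"` both patterns behave the same, so the risk is easier to see with input that contains regexp metacharacters. A sketch with hypothetical helper names and sample strings:

```ruby
# Interpolates the fragment as regexp source: metacharacters stay active.
def raw_pattern(fragment)
  /#{fragment}/
end

# Escapes the fragment first, so it is matched literally.
def quoted_pattern(fragment)
  /#{Regexp.quote(fragment)}/
end

puts raw_pattern("$5.00").match?("It costs $5.00")         # false: $ acts as an anchor
puts quoted_pattern("$5.00").match?("It costs $5.00")      # true
puts raw_pattern("c.t").match?("My cat eats catfood")      # true: the dot matches 'a'
puts quoted_pattern("c.t").match?("My cat eats catfood")   # false: no literal "c.t"
```

`Regexp.quote` is an alias of `Regexp.escape`. When the interpolated string comes from user input, escaping also prevents malformed input such as "(" from raising a `RegexpError`.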
  • 23. @lmea Tip #6 Don’t use ^ and $ to match the string’s beginning and end validates :url, :format => /^https?/ "http://example.com" =~ /^https?/ "javascript:alert('hello!');%0Ahttp://example.com" "javascript:alert('hello!');\nhttp://example.com" =~ /^https?/ "javascript:alert('hello!');\nhttp://example.com" =~ /\Ahttps?/
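In Ruby, `^` and `$` match at every line boundary, not only at the ends of the string. The sketch below reruns the slide's example outside of a Rails validation; the constant names are illustrative.

```ruby
LINE_ANCHORED   = /^https?/    # ^ matches after any newline
STRING_ANCHORED = /\Ahttps?/   # \A matches only at the start of the string

# The URL-decoded form of the %0A payload shown on the slide.
PAYLOAD = "javascript:alert('hello!');\nhttp://example.com"

puts LINE_ANCHORED.match?("http://example.com")    # true
puts LINE_ANCHORED.match?(PAYLOAD)                 # true: the second line starts with http
puts STRING_ANCHORED.match?(PAYLOAD)               # false
puts STRING_ANCHORED.match?("http://example.com")  # true
```

For the end of the string the counterpart is `\z`. `\Z` also matches just before a trailing newline, which is usually not what a validation wants.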
  • 24. @lmea
    From 060bb7250b963609a0d8a5d0559e36b99d2402c6 Mon Sep 17 00:00:00 2001
    From: joernchen of Phenoelit <joernchen@phenoelit.de>
    Date: Sat, 9 Feb 2013 15:46:44 -0800
    Subject: [PATCH] Fix issue with attr_protected where malformed input could circumvent protection
    Fixes: CVE-2013-0276
    ---
     activemodel/lib/active_model/attribute_methods.rb | 2 +-
     activemodel/lib/active_model/mass_assignment_security/permission_set.rb | 2 +-
     2 files changed, 2 insertions(+), 2 deletions(-)
    diff --git a/activemodel/lib/active_model/attribute_methods.rb b/activemodel/lib/active_model/attribute_methods.rb
    index f033a94..96f2c82 100644
    --- a/activemodel/lib/active_model/attribute_methods.rb
    +++ b/activemodel/lib/active_model/attribute_methods.rb
    @@ -365,7 +365,7 @@ module ActiveModel
             end
             @prefix, @suffix = options[:prefix] || '', options[:suffix] || ''
    -        @regex = /^(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})$/
    +        @regex = /\A(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})\z/
             @method_missing_target = "#{@prefix}attribute#{@suffix}"
             @method_name = "#{prefix}%s#{suffix}"
           end
    diff --git a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
    index a1fcdf1..10faa29 100644
    --- a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
    +++ b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb
    @@ -19,7 +19,7 @@ module ActiveModel
         protected
         def remove_multiparameter_id(key)
    -      key.to_s.gsub(/\(.+/, '')
    +      key.to_s.gsub(/\(.+/m, '')
         end
       end
    --
    1.8.1.1
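Both changes in this patch come down to how newlines interact with `^`, `$` and `.`. The sketch below reproduces the two before/after expressions in isolation. The method names, the empty prefix and suffix, and the sample inputs are hypothetical; it does not reproduce the actual exploit.

```ruby
# gsub from permission_set.rb, before and after the patch.
def strip_multiparameter_old(key)
  key.to_s.gsub(/\(.+/, '')    # '.' stops at a newline
end

def strip_multiparameter_new(key)
  key.to_s.gsub(/\(.+/m, '')   # /m lets '.' match newlines too
end

# Hypothetical key with a newline after the multiparameter suffix.
KEY = "id(1i)\nadmin"

p strip_multiparameter_old(KEY)  # "id\nadmin": text after the newline survives
p strip_multiparameter_new(KEY)  # "id"

# The attribute_methods.rb regex, simplified to an empty prefix and suffix.
OLD_ATTRIBUTE_REGEX = /^(.+?)$/
NEW_ATTRIBUTE_REGEX = /\A(.+?)\z/

# Hypothetical multi-line attribute name.
NAME = "name\nsomething_else"

p OLD_ATTRIBUTE_REGEX.match?(NAME)  # true: the first line alone satisfies ^...$
p NEW_ATTRIBUTE_REGEX.match?(NAME)  # false: the whole string must match
```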
  • 25. @lmea
    From 99123ad12f71ce3e7fe70656810e53133665527c Mon Sep 17 00:00:00 2001
    From: Aaron Patterson <aaron.patterson@gmail.com>
    Date: Fri, 15 Mar 2013 15:04:00 -0700
    Subject: [PATCH] fix protocol checking in sanitization [CVE-2013-1857]
    Conflicts: actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
    ---
     .../action_controller/vendor/html-scanner/html/sanitizer.rb | 4 ++--
     actionpack/test/template/html-scanner/sanitizer_test.rb | 10 ++++++++++
     2 files changed, 12 insertions(+), 2 deletions(-)
    diff --git a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
    index 02eea58..994e115 100644
    --- a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
    +++ b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb
    @@ -66,7 +66,7 @@ module HTML
         # A regular expression of the valid characters used to separate protocols like
         # the ':' in 'http://foo.com'
    -    self.protocol_separator = /:|(&#0*58)|(&#x70)|(%|&#37;)3A/
    +    self.protocol_separator = /:|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i
         # Specifies a Set of HTML attributes that can have URIs.
         self.uri_attributes = Set.new(%w(href src cite action longdesc xlink:href lowsrc))
    @@ -171,7 +171,7 @@ module HTML
         def contains_bad_protocols?(attr_name, value)
           uri_attributes.include?(attr_name) &&
    -      (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(%|&#37;)3A/ && !allowed_protocols.include?(value.split(protocol_separator).first.downcase))
    +      (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i && !allowed_protocols.include?(value.split(protocol_separator).first.downcase.strip))
         end
       end
     end
    diff --git a/actionpack/test/template/html-scanner/sanitizer_test.rb b/actionpack/test/template/html-scanner/sanitizer_test.rb
    index 4e2ad4e..dee60c9 100644
    --- a/actionpack/test/template/html-scanner/sanitizer_test.rb
    +++ b/actionpack/test/template/html-scanner/sanitizer_test.rb
    @@ -176,6 +176,7 @@ class SanitizerTest < ActionController::TestCase
         %(<IMG SRC="jav&#x0A;ascript:alert('XSS');">),