RegEx 101

Todd Benson
Overview

• What is RegEx
• RegEx Basics
• Uses for RegEx
• Useful RegExpressions
What is RegEx?

“In computing, a regular
expression (abbreviated regex or regexp) is a
sequence of characters that forms a...
• “Some people, when confronted with a
problem, think ‘I know, I'll use regular
expressions.’ Now they have two problems.”...
Why RegEx?

• Tools use it: Nessus, Burp, W3AF
• All programming languages use it
• Excellent tool to have in the toolbox
RegEx Basics: Literal Matches

Literal Matches
‘bat’ matches ‘bat’

12 special characters - \ ^ $ . | ? * + ( ) [ ]
These m...
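A minimal sketch of the escaping rule, using Python's standard `re` module. Python is an assumption here (the slides do not name a language), and the sample strings are made up for illustration.

```python
import re

# '.' is a special character: it matches any single character except a newline.
words = ["bat", "cat", "hat", "at"]
dot_matches = [w for w in words if re.fullmatch(r".at", w)]

# To match a special character literally, escape it with a backslash.
price_literal = re.search(r"\$5", "cost: $5")

# Unescaped, '$' is an end-of-line anchor, so '$5' cannot match here.
price_anchor = re.search(r"$5", "cost: $5")

# re.escape() adds the backslashes for you.
escaped = re.escape("a.b")

print(dot_matches, bool(price_literal), price_anchor, escaped)
```

`re.escape` is handy when the text to match comes from user input and may contain special characters.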
RegEx Basics: Character Classes

Character Classes
• -- [ ]
‘[bc]at’ will match ‘bat’ or ‘cat’

• --[^ ]
[^A-Z] will matc...
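The two character-class forms can be tried out with a short Python sketch (Python's `re` module is an assumption; the test strings are illustrative only).

```python
import re

# [bc] matches exactly one character: either 'b' or 'c'.
words = ["bat", "cat", "hat"]
bc_matches = [w for w in words if re.fullmatch(r"[bc]at", w)]

# [^A-Z] is a negated class: any one character that is NOT an uppercase letter.
non_upper = re.findall(r"[^A-Z]", "RegEx 101")

print(bc_matches)
print(non_upper)
```

Note that a negated class still consumes a character, so the space and the digits in the second example are matched as well.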
RegEx Basics: Shorthand Character Classes

Shorthand Character Classes
• \d
Same as [0-9]

• \D
Same as [^0-9]

• \w
Same as ...
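A short Python sketch of the shorthand classes. One assumption to flag: in Python 3, `\d`, `\w`, and `\s` are Unicode-aware by default, so the `re.ASCII` flag is used here to get the ASCII-only ranges the slide describes. The sample text is made up.

```python
import re

text = "Room 101, floor_2"

# \d with re.ASCII behaves like [0-9].
digits = re.findall(r"\d", text, re.ASCII)

# \w with re.ASCII behaves like [0-9A-Za-z_].
word_runs = re.findall(r"\w+", text, re.ASCII)

# \D is the negation of \d: runs of non-digit characters.
non_digit_runs = re.findall(r"\D+", "a1b22c", re.ASCII)

# \s covers tab, line feed, form feed, carriage return, and space.
fields = re.split(r"\s+", "a b\tc\nd")

print(digits, word_runs, non_digit_runs, fields)
```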
RegEx Basics: Anchors

Anchors
• ^
Beginning of line
‘rpm -qa|grep ^ao’ would list all packages that start with
‘ao’

• $
...
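The anchors can be sketched in Python as follows. The package names and sample strings are hypothetical, and the word-boundary pattern uses `\w*` for the trailing characters, which is an assumption about what the slide intends by "zero or more characters".

```python
import re

# '^' anchors the match at the beginning of the string (or line).
packages = ["aom-1.0", "bash-5.1", "aosp-tools"]  # hypothetical package names
starts_with_ao = [p for p in packages if re.search(r"^ao", p)]

# '$' anchors the match at the end: three consecutive digits closing the line.
ends_with_digits = re.search(r"[0-9][0-9][0-9]$", "build 123") is not None

# '\b' is a word boundary: 'W', any character, 'n', then the rest of the word.
w_words = re.findall(r"\bW.n\w*\b", "Win Windows Won Wonton Water")

print(starts_with_ao, ends_with_digits, w_words)
```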
RegEx Basics: Non-Printable

Non-printable
• -- \n
New Line

• -- \r
Carriage Return
RegEx Basics: Groups

Groups
• --( )
Defines the scope and precedence of operators
‘Write(ln)?’ matches ‘Write’ and
‘Write...
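Grouping and alternation can be checked with a small Python sketch (Python's `re` module is an assumption; 'Writel' is an extra made-up test string).

```python
import re

# '?' applies to the whole group, so 'ln' is optional as a unit.
candidates = ["Write", "Writeln", "Writel"]
write_variants = [s for s in candidates if re.fullmatch(r"Write(ln)?", s)]

# '|' inside a group picks one alternative; group(1) reports which one matched.
grey = re.fullmatch(r"Gr(a|e)y", "Grey")
captured = grey.group(1)

print(write_variants, captured)
```

Without the parentheses, `Writeln?` would make only the final `n` optional, which is why the group matters for precedence.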
RegEx Basics: Quantification

Quantification
Shows how often a token or group is allowed to
occur
• ?
Zero or one
‘a?’ wil...
RegEx Basics: Quantification (Cont.)

Quantification
Shows how often a token or group is allowed to
occur
• +
One or more
...
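The four quantifiers can be compared side by side in a Python sketch. `re.fullmatch` is used so that each sample string must match in its entirety; the sample strings are illustrative.

```python
import re

# Each pattern is paired with sample strings to test against it.
samples = {
    r"a?": ["", "a", "aa"],                            # zero or one
    r"a*": ["", "a", "aaaa"],                          # zero or more
    r"a+": ["", "a", "aaaa"],                          # one or more
    r"a{3,7}": ["aa", "aaa", "aaaaaaa", "aaaaaaaa"],   # between 3 and 7
}

# Keep only the strings that the pattern matches completely.
results = {
    pattern: [s for s in strings if re.fullmatch(pattern, s)]
    for pattern, strings in samples.items()
}

for pattern, matched in results.items():
    print(pattern, matched)
```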
Uses: Searches

• Errors
(error|exception|illegal|invalid|fail|stack|access|directory|file|not found|unknown|uid=|varchar...
Uses: Searches (Cont.)

• DOM XSS
((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|g...
Uses: Searching Logs

• grep -v '156\.132\.142\.1[1-9]\b' /var/log/apache2/other_vhosts_access.log | grep -v '156\.132\.103\.'
• cat
/...
Uses: VI Search and Replace

• SS#
:%s/\d\{3}-\d\{2}-\d\{4}/123-45-6789/g
• email
:%s/[0-9A-Za-z._%+-]\+@[0-9A-Za-z._%+-]\+\.[A-Za-z...
Uses: Command Line

openssl ciphers | sed 's/:/\n/g' | sort
Uses: Output Mangling

while read -r line; do host "$line"; done < ips.txt | sed 's/ has address / \/ /g' > foo.txt
Uses: Programming

• Sanitizing input
$name = preg_replace("/<\s*?\/?script\s*?>/i", "&lt;script&gt;", $name);
Useful RegExes
• SS#
\d{3}-\d{2}-\d{4}
• Phone#
(\(?\d{3}\)?[ \-.])?\d{3}[ \-.]\d{4}
• IP Addresses
\b((25[0-5]|2[0-4][0-9]|[01]?[0-9...
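Two of these patterns, the SS# and IP address expressions, can be exercised with a short Python sketch. The log line, the redaction placeholder, and the variable names are hypothetical; the patterns are the ones from the slide with their backslashes in place.

```python
import re

# SS#: three digits, two digits, four digits, separated by hyphens.
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")

# IPv4: four octets in the range 0-255, separated by literal dots.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = re.compile(r"\b(" + OCTET + r"\.){3}" + OCTET + r"\b")

log_line = "user 123-45-6789 logged in from 192.168.1.254"  # made-up sample

# Replace anything that looks like an SS# with a placeholder.
redacted = SSN.sub("XXX-XX-XXXX", log_line)

# Pull the first IP address out of the line.
found = IPV4.search(log_line)
ip_text = found.group(0) if found else None

# An octet above 255 is rejected.
bad_ip_accepted = IPV4.fullmatch("256.1.1.1") is not None

print(redacted)
print(ip_text, bad_ip_accepted)
```

Building the IP pattern from a shared `OCTET` string is a readability choice for this sketch; the resulting expression is the same as the single-line version.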
Questions?
Go forth and RegEx…
References

• Web Application Hacker's Handbook
• http://regex.info/blog/2006-09-15/247#comment-3085
• http://en.wi...
Basic introduction into regular expressions and some ways they may be used

Published in: Technology

Transcript of "Regex 101"

  1. RegEx 101 Todd Benson
  2. Overview • What is RegEx • RegEx Basics • Uses for RegEx • Useful RegExpressions
  3. What is RegEx? “In computing, a regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.” - Wikipedia
  4. “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.” - Jamie Zawinski
  5. Why RegEx? • Tools use it: Nessus, Burp, W3AF • All programming languages use it • Excellent tool to have in the toolbox
  6. RegEx Basics: Literal Matches ‘bat’ matches ‘bat’ • 12 special characters - \ ^ $ . | ? * + ( ) [ ] These must be escaped with ‘\’, e.g. ‘\$’ • . Any single character: ‘.at’ matches ‘bat’, ‘cat’, and ‘hat’
  7. RegEx Basics: Character Classes • -- [ ] ‘[bc]at’ will match ‘bat’ or ‘cat’ • -- [^ ] ‘[^A-Z]’ will match any character that is not a capital letter
  8. RegEx Basics: Shorthand Character Classes • \d Same as [0-9] • \D Same as [^0-9] • \w Same as [0-9A-Za-z_] • \W Same as [^0-9A-Za-z_] • \s Tab, line feed, form feed, carriage return, and space • \S Anything other than tab, line feed, etc.
  9. RegEx Basics: Anchors • ^ Beginning of line ‘rpm -qa|grep ^ao’ would list all packages that start with ‘ao’ • $ End of line ‘[0-9][0-9][0-9]$’ would find every line that ends with 3 consecutive digits • \b Word boundary ‘\bW.n\w*\b’ looks for words that begin with ‘W’ followed by any character followed by ‘n’ followed by zero or more word characters ‘Win’ ‘Windows’ ‘Won’ ‘Wonton’ ‘Winter’ ‘Wonderland’ ‘Wonder’ all match
  10. RegEx Basics: Non-Printable • -- \n New Line • -- \r Carriage Return
  11. RegEx Basics: Groups • -- ( ) Defines the scope and precedence of operators ‘Write(ln)?’ matches ‘Write’ and ‘Writeln’ • -- | OR ‘Gr(a|e)y’ matches ‘Gray’ and ‘Grey’ ‘(ITSO|OITS)’ matches ‘ITSO’ or ‘OITS’
  12. RegEx Basics: Quantification Shows how often a token or group is allowed to occur • ? Zero or one ‘a?’ will match ‘’ and ‘a’ • * Zero or more ‘a*’ will match ‘’ and ‘a’ and ‘aaaaaaaaa’
  13. RegEx Basics: Quantification (Cont.) Shows how often a token or group is allowed to occur • + One or more ‘a+’ will match ‘a’ and ‘aaaaaaaaaaaa’ • {,} Minimum and maximum ‘a{3,7}’ will match between 3 and 7 consecutive ‘a’ characters
  14. Uses: Searches • Errors (error|exception|illegal|invalid|fail|stack|access|directory|file|not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password) • Redirects (document|window)\.
  15. Uses: Searches (Cont.) • DOM XSS ((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\() • DOM XSS (location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)
  16. Uses: Searching Logs • grep -v '156\.132\.142\.1[1-9]\b' /var/log/apache2/other_vhosts_access.log | grep -v '156\.132\.103\.' • cat /var/log/apache2/other_vhosts_access.log | grep -o '\s[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\s' | sort -t . -k 3,3n -k 4,4n | uniq
  17. Uses: VI Search and Replace • SS# :%s/\d\{3}-\d\{2}-\d\{4}/123-45-6789/g • email :%s/[0-9A-Za-z._%+-]\+@[0-9A-Za-z._%+-]\+\.[A-Za-z]\{2,4}/john.doe@ao.uscourts.gov/g
  18. Uses: Command Line openssl ciphers | sed 's/:/\n/g' | sort
  19. Uses: Output Mangling while read -r line; do host "$line"; done < ips.txt | sed 's/ has address / \/ /g' > foo.txt
  20. Uses: Programming • Sanitizing input $name = preg_replace("/<\s*?\/?script\s*?>/i", "&lt;script&gt;", $name);
  21. Useful RegExes • SS# \d{3}-\d{2}-\d{4} • Phone# (\(?\d{3}\)?[ \-.])?\d{3}[ \-.]\d{4} • IP Addresses \b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b • email [0-9A-Z._%+-]+@[0-9A-Z._%+-]+\.[A-Z]{2,4} • Find Base64 (?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)? • Credit Card# - HTML Tags - Dates
  22. Questions?
  23. Go forth and RegEx…
  24. References • Web Application Hacker's Handbook • http://regex.info/blog/2006-09-15/247#comment-3085 • http://en.wikipedia.org/wiki/Regular_expression • https://isc.sans.edu/regex.html • http://www.regular-expressions.info/examples.html • http://blog.spiderlabs.com/2013/02/easy-dom-basedxss-detection-via-regexes.html • https://en.wikipedia.org/wiki/Regular_expression • www.xkcd.com