Regexes:
It's magic!
“Some people, when confronted with a problem,
think 'I know, I'll use regular expressions!'
Now they have two problems.”
*
Perl style regex:
It's magic done right!
Metacharacters
^ beginning
$ end
. anything
\ escape

/^....G..AA$/
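
A quick sketch of the pattern above in Python's `re` module, which follows Perl-style syntax. The test strings are made up for illustration, not real reads.

```python
import re

# ^ anchors the start, $ the end, and each . stands for any one character.
pattern = re.compile(r"^....G..AA$")

# Made-up strings for illustration.
candidates = ["ACTGGTTAA", "ACTGCTTAA", "ACTGGTTAAC", "1234G56AA"]
matches = [s for s in candidates if pattern.search(s)]
print(matches)  # ['ACTGGTTAA', '1234G56AA']

# Escaping the dot makes it match a literal "." only.
literal_dot = re.compile(r"^3\.14$")
print(bool(literal_dot.search("3.14")))  # True
print(bool(literal_dot.search("3x14")))  # False
```

Note that `.` happily accepts digits too, which is why the last candidate matches.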
Escaped characters
\s whitespace
\S not-whitespace
\w word
\d digit
\. dot
\\ backslash

/^\w\w\w\wG\w\wAA$/

/^\d\d\d\d\d\d\d\d$/
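
A small sketch of the escaped character classes in Python, assuming the same Perl-style meaning for `\w`, `\d` and `\s`. All input strings are invented for the example.

```python
import re

# \w is a word character (letter, digit or underscore), \d is a digit.
word_read = re.compile(r"^\w\w\w\wG\w\wAA$")
eight_digits = re.compile(r"^\d\d\d\d\d\d\d\d$")

# Invented inputs: the second one has a space where a word character is expected.
word_hits = [s for s in ["ACTGGTTAA", "AC GGTTAA", "1234G56AA"] if word_read.search(s)]
digit_hits = [s for s in ["24121999", "2412199", "2412-999"] if eight_digits.search(s)]
print(word_hits)   # ['ACTGGTTAA', '1234G56AA']
print(digit_hits)  # ['24121999']

# \s matches any whitespace, so \s+ splits on tabs and runs of spaces alike.
fields = re.split(r"\s+", "chr1\t100  200")
print(fields)  # ['chr1', '100', '200']
```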
Repetition
? 0 or 1 time
* 0 or more times
+ 1 or more times
*? ungreedy *
+? ungreedy +
{m} m times
{m,n} m up to n times
{m,n}? ungreedy {m,n}

/^\w{4}G\w{2}AA$/

/^\d{1,2}\d{1,2}\d{2,4}$/
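
Greedy versus ungreedy is easiest to see side by side. This is a minimal Python sketch; the tagged text and the digit strings are made up for the example.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy + grabs as much as it can: one match from the first < to the last >.
greedy = re.findall(r"<.+>", text)
# Ungreedy +? stops at the first > it can: one match per tag.
lazy = re.findall(r"<.+?>", text)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# {m,n} bounds the repeat count: 1-2 digits, 1-2 digits, 2-4 digits,
# so anything from 4 to 8 digits in total fits.
loose_date = re.compile(r"^\d{1,2}\d{1,2}\d{2,4}$")
lengths_ok = [s for s in ["123", "2412", "24121999", "241219999"] if loose_date.search(s)]
print(lengths_ok)  # ['2412', '24121999']
```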
Grouping
[ABC] any of these characters
(AB|BC|CA) any of these expressions
(THIS!) save this
[A-Za-z0-9] ranges

/^[ACTG]{4}G[ACTG]{2}AA$/
/^(0?[1-9]|[12]\d|3[01])(0?[1-9]|1[0-2])(\d{2}|\d{4})$/
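
Parentheses do two jobs: they group alternatives and they save what matched. A Python sketch follows, using a day/month/year pattern in the style of the one above, written so that day 00 and month 0 are rejected. The sample strings are invented.

```python
import re

# Character class: only the four bases are allowed where [ACTG] appears.
bases_only = re.compile(r"^[ACTG]{4}G[ACTG]{2}AA$")
print(bool(bases_only.search("ACTGGTTAA")))  # True
print(bool(bases_only.search("ACNGGTTAA")))  # False, N is not in the class

# Three groups: day, month, year. Each (...) saves its own piece.
date = re.compile(r"^(0?[1-9]|[12]\d|3[01])(0?[1-9]|1[0-2])(\d{2}|\d{4})$")

hit = date.search("24121999")
parts = hit.groups() if hit else None
print(parts)  # ('24', '12', '1999')

# Day 32 does not fit any alternative in the first group.
miss = date.search("32011999")
print(miss)  # None
```

The pattern still knows nothing about month lengths or leap years, so 31 February passes.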
OVERKILL

http://nbviewer.ipython.org/url/norvig.com/ipython/xkcd1313.ipynb
In Python (sigh...)
E.g.: finding files

ls -la | grep '->' | grep -v 'bubo' | grep -v 'Daniel'
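
Chaining grep with grep -v keeps lines that match one pattern and drops lines that match others. The same filter can be sketched in Python. The listing lines below are invented, and `keep` is a hypothetical helper name.

```python
import re

# Invented lines in the shape of `ls -la` output.
listing = [
    "lrwxrwxrwx 1 Daniel users  9 Jan  1 10:00 data -> /mnt/data",
    "-rw-r--r-- 1 alice  users 90 Jan  1 10:00 notes.txt",
    "lrwxrwxrwx 1 alice  users  9 Jan  1 10:00 bubo -> /mnt/bubo",
    "lrwxrwxrwx 1 alice  users 10 Jan  1 10:00 reads -> /mnt/reads",
]


def keep(line):
    """Keep symlink lines (containing '->') that mention neither 'bubo' nor 'Daniel'."""
    return (
        re.search(r"->", line) is not None      # like grep '->'
        and re.search(r"bubo", line) is None    # like grep -v 'bubo'
        and re.search(r"Daniel", line) is None  # like grep -v 'Daniel'
    )


symlinks = [line for line in listing if keep(line)]
print(symlinks)
```

Only the `reads` symlink survives all three filters.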
E.g.: demultiplexing FASTQ
1. Barcode
2. Primer
3. Random nucleotides

grep -P '1:N:0:ACTGGTT' -A3 --no-group-separator multiplex_R1.fastq |
  grep -P '^[ACTGN]{4}CCC[ACGT]T[GC]AGATA' -A2 -B1 --no-group-separator > deplexed_R1.fq
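
The same two-step filter can be sketched in Python, assuming plain 4-line FASTQ records with the barcode in the header line. The function name `demultiplex` and the sample records are made up for illustration, and reading `[ACTGN]{4}` as the random nucleotides followed by the primer is my interpretation of the labels above.

```python
import re

HEADER = re.compile(r"1:N:0:ACTGGTT")                 # barcode in the header line
READ = re.compile(r"^[ACTGN]{4}CCC[ACGT]T[GC]AGATA")  # 4 random bases, then the primer


def demultiplex(lines):
    """Yield 4-line records whose header has the barcode and whose read starts with the primer pattern."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if HEADER.search(header) and READ.search(seq):
            yield header, seq, plus, qual


# Invented records: right barcode and primer, wrong primer, wrong barcode.
sample = [
    "@read1 1:N:0:ACTGGTT", "NACTCCCATGAGATAGGGG", "+", "I" * 19,
    "@read2 1:N:0:ACTGGTT", "NACTGGGATGAGATAGGGG", "+", "I" * 19,
    "@read3 1:N:0:TTTTTTT", "NACTCCCATGAGATAGGGG", "+", "I" * 19,
]
kept = list(demultiplex(sample))
print([record[0] for record in kept])  # ['@read1 1:N:0:ACTGGTT']
```

Working record by record avoids the context-line bookkeeping that -A and -B need in the grep version.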
E.g.: paper figures!
From the subset of unique sequences that span the
entire region under study, how many unique
sequences are matched by each primer combination?
Sed: find & replace
“Are you gonna talk about
vim regexes?”
“Sed regexes are weird”
My workaround:
use ranges
[0-9]
[A-Z]
[a-z]
[A-Za-z]

E.g.:
“Oh noes, Americans don't know how to separate decimals!”
sed 's/\./,/g' hisfile.tab > myfile.tab
“Oh noes, this bloody file was edited in Windows!”
sed 's/\r$//' theirfile.tab > decentfile.tab
“Oh noes, CASAVA 1.6 has a slash in it!”
sed 's,/1, 1:N:0:NNNNNN,' oldfile.fq > newfile.fq
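
For comparison, a sketch of the same three replacements with Python's `re.sub`. The input strings are invented, and the first line shows what goes wrong when the dot is left unescaped.

```python
import re

# An unescaped . matches every character, so everything becomes a comma.
oops = re.sub(r".", ",", "3.14")
print(oops)  # ,,,,

# Escaped, only the literal decimal points are replaced (like the g flag in sed).
decimal_fixed = re.sub(r"\.", ",", "3.14\t2.71")
print(decimal_fixed)  # 3,14	2,71

# Strip the carriage return that Windows leaves at the end of a line.
unix_line = re.sub(r"\r$", "", "sample\t42\r")
print(repr(unix_line))  # 'sample\t42'

# Replace only the first "/1", as sed does without the g flag.
new_header = re.sub(r"/1", " 1:N:0:NNNNNN", "@read_001/1", count=1)
print(new_header)  # @read_001 1:N:0:NNNNNN
```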
Other neat stuff
grep (-c)
sort (-n, -r, -k, -t)
uniq -c
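
What these tools do together (count matching lines, tally duplicates, sort by count) can be sketched in Python with `collections.Counter`. The sequences are made up for the example.

```python
import re
from collections import Counter

# Made-up sequences, one per line.
lines = ["ACTG", "TTGA", "ACTG", "GGCC", "ACTG", "TTGA"]

# Like grep -c '^AC': count the lines that match.
n_matching = sum(1 for line in lines if re.search(r"^AC", line))
print(n_matching)  # 3

# Like sort | uniq -c: tally how often each distinct line occurs.
counts = Counter(lines)

# Like a reverse numeric sort on the count column: most frequent first.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(ranked)  # [('ACTG', 3), ('TTGA', 2), ('GGCC', 1)]
```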
LMGTFY:
sed
http://www.tutorialspoint.com/unix/unix-regular-expressions.htm
grep
http://linux.about.com/od/commands/l/blcmdl1_grep.htm
Perl
http://www.cs.tut.fi/~jkorpela/perl/regexp.html
Python
http://docs.python.org/2/howto/regex.html
Vim
http://vimregex.com/
sed 's/fear of regex/love of regex/g'

Nerd talk: regexes