SlideShare a Scribd company logo
1 of 18
Regular Expressions

21 Januari 2014
Sido Grond
Exception Twente
1
Index

•
•
•
•
•
•
•

Introduction
Applications
Constructs
Regular expressions in C#
Demos
Pros and cons
Conclusions

2
Introduction
• Regular Expression (RE or regex) is a text
string that describes a search pattern
• Some similarity to wildcards (e.g. *.txt)
• Platform/language independent but some
minor differences among them
• Available in a.o. C#, Perl, Python, Java,
Javascript, Visual Studio, Notepad++, Linux
command line tools
3
Applications
• Syntax highlighting
• Find and replace
– Visual Studio (as usual: different syntax)
– Notepad++

• Text searching
– Unix tools: grep, sed, find

• Programming
– Pattern matching, filtering, replacing
4
Constructs: single character
Regex

Strings that match

Strings that don’t match

a

“abc”, “bad”

“xyz”

^a

“abc”

“bad”, “xyz”, “^a”

a$

“cba”

“abc”, “bad”, “xyz”

^[a-z]$

“a”, “x”

“aa”, “A”

^[a-z0-9+-]$

“s”, “3”, “+”, “-”

“12”, “Q”, “a1”

^[^a-zA-Z]$

“5”, “+”

“p”, “R”, “abc”, “15”

^[a-zA-Z]

“b”, “C123”, “dd”

“1cba”, “ f”(note space)

[^a-zA-Z]

“1cba”, “ f”

“b”, “dd”

^.$

“m”, “3”, “%”

“13”, “%a”, “xyz”

.

“%”, “e4”, “xyz”, “.”

“”

.

“.”

“%”, “e4”, “xyz”
5
Constructs: word groups
•
•
•
•
•
•

d is shorthand for Digits [0-9]
w is shorthand for Words [a-zA-Z0-9]
s is shorthand for whiteSpace [ trn]
D is shorthand for non-Digits [^0-9]
W is shorthand for non-Words [^a-zA-Z0-9]
S is shorthand for non-whiteSpace [^ trn]

6
Constructs: multiple character
Regex

Strings that match

Strings that don’t match

abc

“abcde”, “rabco”

“bac”, “ABC”, “bcde”

^abc

“abc”, “abcde”

“rabco”, “ABC”, “bcde”

Ed

“ME3”, “E85”, “EBE5” “ACME”, “E”, “E 8”

Ed+$

“ME35”, “EBE5”

“E85x”, “E3EB”, “E 8”

foo|bar

“foot”, “bart”

“ooba”

a(b|c)d

“Tabd”, “acdc”

“abcd”, “aed”, “ad”

^foo|bar

“foo1”, “Harbar”

“toofoo”

^(foo|bar)

“foo1”, “bart”

“toofoo”, “Harbar”

7
Constructs: quantifiers
Regex

Strings that match

Strings that don’t match

^a?$

“”, “a”

“aa”, “aaaaa”

^a*$

“a”, “aaaaa”, “”

“aaab”, “c”, “ba”

^a+$

“a”, “aaaaa”

“aaab”, “c”, “ba”, “”

^a{4}$

“aaaa”

“a”, “aaaaa”, “”

^a{2,6}$

“aa”, “aaa”, “aaaaaa”

“a”, “”, “aaaaaaa”

(e|o){2}

“koe”, “booom”, “veel” “kol”, “omo”, “vele”

e|(o{2})

“nest”, “booom”, “koe” “kol”

(a[0-9]*){2}

“aa”, “ba45a”, “a3a0”

“abacus”, “a3b8a0”

8
Constructs: advanced
Regex

String

First match

Greedy

<.+>

“This is a <B>first</B> test”

“<B>first</B>”

Lazy

<.+?>

“This is a <B>first</B> test”

“<B>”

Greedy/lazy repetition
Regex

Strings that match

Groups

^(a|b)c(d)$

“acd”, “bcd”

0:”acd” 1:”a” 2:”d”

^(?:a|b)c(d)$

same as above

0:”acd” 1:”d”

^(?<name>a|b)c(d)$

same as above

0:”acd” 1:”d” name:”a”

Grouping
9
Constructs: advanced
Regex

Strings that match

Strings that don’t match

^([a-c])x1x1

“axaxa”, “bxbxbyyyy”

“axaxb”, “bxaxc”

<(b)><(i)>.*?</2></1>

“<b><i>bla</i></b>”

“<b><i>bla</b></i>”

Backreferences: inside regex
Regex

Strings that match Replace pattern Result strings

^(var)(1|2)$

“var1”, “var2”

1iable

“variable”

^(a|b)c(d|e)

“acd”, “bcd”

2XXX1

“dXXXa”, “dXXXb”

Backreferences: find and replace
10
Constructs: advanced
Regex

Strings that match

Strings that don’t match

([a-b](?=x))

“blaax”, “bxa”

“ab”, “bacx”

((?<=x)[a-b])

“yyxa”, “bxb”

“ab”, “aax”

([a-b](?!x))

“bla”, “a”, “bxa”

“bxc”, “ax”

((?<!x)[a-b])

“ral”, “dbx”, “bxa”

“xa”, “lxb”

Look ahead/behind

11
Regexes in C#
• using System.Text.RegularExpressions
• Regex reg = new Regex(“a(b|c)d”);
–
–
–
–
–

reg.IsMatch(“abd”);
reg.Matches(“abdEacd”);
reg.Groups(“abdEabdC”);
reg.Split(“abdEabdC”);
reg.Replace(“abdEabdC”, “X”);

true
2 Matches
2 Groups per match
2 Strings (“E” and “C”)
“XEXC”

• Single/Multiline options: is linebreak(n)
special character

12
Regexes in C#
• Demo

13
Regexes in editors
• Notepad++

14
Regexes in Linux
• grep, sed, find, vi

15
Pros and cons
• Advantages
–
–
–
–
–

Very flexible
Fast processing
Language independent
A lot of work in a single line of code
Often simpler than ‘substring+indexes’ approach

• Disadvantages
– Hard to read, for example ‘?’ has three meanings depending on
context
– Hard to debug: no info given when no match
– Compilation only at runtime
– Typos are very easily made (e.g. forget escape character)

16
Conclusions
“Some people, when confronted with a
problem, think ‘I know, I'll use regular
expressions.’ Now they have two problems.”
Jamie Zawinski
Don’t overuse it!
17
Conclusions
•
•
•
•

Very handy tool for string matching and replacing
Built-in support in most programming languages
Support in/for multiple applications
More info
– http://www.regular-expressions.info/
– http://msdn.microsoft.com/en-us/library/az24scfc
%28v=vs.110%29.aspx

• Fun
– http://regex.alf.nu/
– http://www.i-programmer.info/news/144-graphics-and-games/5450can-you-do-the-regular-expression-crossword.html
18

More Related Content

What's hot

Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressionsBen Brumfield
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
 
Bytecode manipulation with Javassist for fun and profit
Bytecode manipulation with Javassist for fun and profitBytecode manipulation with Javassist for fun and profit
Bytecode manipulation with Javassist for fun and profitJérôme Kehrli
 
Python Foundation – A programmer's introduction to Python concepts & style
Python Foundation – A programmer's introduction to Python concepts & stylePython Foundation – A programmer's introduction to Python concepts & style
Python Foundation – A programmer's introduction to Python concepts & styleKevlin Henney
 
Cascading Style Sheets - Part 01
Cascading Style Sheets - Part 01Cascading Style Sheets - Part 01
Cascading Style Sheets - Part 01Hatem Mahmoud
 
LPW: Beginners Perl
LPW: Beginners PerlLPW: Beginners Perl
LPW: Beginners PerlDave Cross
 
.block() 은 신중하게
.block() 은 신중하게.block() 은 신중하게
.block() 은 신중하게NAVER SHOPPING
 
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbAirbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbLucidworks
 
Introduction to Object Oriented Programming
Introduction to Object Oriented ProgrammingIntroduction to Object Oriented Programming
Introduction to Object Oriented ProgrammingMoutaz Haddara
 
Basic Python Programming: Part 01 and Part 02
Basic Python Programming: Part 01 and Part 02Basic Python Programming: Part 01 and Part 02
Basic Python Programming: Part 01 and Part 02Fariz Darari
 
Introduction to method overloading &amp; method overriding in java hdm
Introduction to method overloading &amp; method overriding  in java  hdmIntroduction to method overloading &amp; method overriding  in java  hdm
Introduction to method overloading &amp; method overriding in java hdmHarshal Misalkar
 
Javascript built in String Functions
Javascript built in String FunctionsJavascript built in String Functions
Javascript built in String FunctionsAvanitrambadiya
 

What's hot (20)

Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressions
 
CSS
CSSCSS
CSS
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular Expressions
 
Bytecode manipulation with Javassist for fun and profit
Bytecode manipulation with Javassist for fun and profitBytecode manipulation with Javassist for fun and profit
Bytecode manipulation with Javassist for fun and profit
 
Python Foundation – A programmer's introduction to Python concepts & style
Python Foundation – A programmer's introduction to Python concepts & stylePython Foundation – A programmer's introduction to Python concepts & style
Python Foundation – A programmer's introduction to Python concepts & style
 
Cascading Style Sheets - Part 01
Cascading Style Sheets - Part 01Cascading Style Sheets - Part 01
Cascading Style Sheets - Part 01
 
Python Data-Types
Python Data-TypesPython Data-Types
Python Data-Types
 
LPW: Beginners Perl
LPW: Beginners PerlLPW: Beginners Perl
LPW: Beginners Perl
 
.block() 은 신중하게
.block() 은 신중하게.block() 은 신중하게
.block() 은 신중하게
 
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbAirbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
 
Html forms
Html formsHtml forms
Html forms
 
Php array
Php arrayPhp array
Php array
 
Javascript
JavascriptJavascript
Javascript
 
Introduction to Object Oriented Programming
Introduction to Object Oriented ProgrammingIntroduction to Object Oriented Programming
Introduction to Object Oriented Programming
 
skip list
skip listskip list
skip list
 
Basic Python Programming: Part 01 and Part 02
Basic Python Programming: Part 01 and Part 02Basic Python Programming: Part 01 and Part 02
Basic Python Programming: Part 01 and Part 02
 
Introduction to method overloading &amp; method overriding in java hdm
Introduction to method overloading &amp; method overriding  in java  hdmIntroduction to method overloading &amp; method overriding  in java  hdm
Introduction to method overloading &amp; method overriding in java hdm
 
Searching and sorting
Searching  and sortingSearching  and sorting
Searching and sorting
 
Html forms
Html formsHtml forms
Html forms
 
Javascript built in String Functions
Javascript built in String FunctionsJavascript built in String Functions
Javascript built in String Functions
 

Viewers also liked

Regex Presentation
Regex PresentationRegex Presentation
Regex Presentationarnolambert
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular ExpressionsMatt Casto
 
Field Extractions: Making Regex Your Buddy
Field Extractions: Making Regex Your BuddyField Extractions: Making Regex Your Buddy
Field Extractions: Making Regex Your BuddyMichael Wilde
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressionsa3sec
 
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondMax Shirshin
 
Processing Regex Python
Processing Regex PythonProcessing Regex Python
Processing Regex Pythonprimeteacher32
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsEran Zimbler
 
Talking about advantages and disadvantages
Talking about advantages and disadvantagesTalking about advantages and disadvantages
Talking about advantages and disadvantageseoihelen
 
Regular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsRegular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsMesut Günes
 
compiler and their types
compiler and their typescompiler and their types
compiler and their typespatchamounika7
 
Lecture: Regular Expressions and Regular Languages
Lecture: Regular Expressions and Regular LanguagesLecture: Regular Expressions and Regular Languages
Lecture: Regular Expressions and Regular LanguagesMarina Santini
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011photomatt
 

Viewers also liked (18)

Regex Presentation
Regex PresentationRegex Presentation
Regex Presentation
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular Expressions
 
Field Extractions: Making Regex Your Buddy
Field Extractions: Making Regex Your BuddyField Extractions: Making Regex Your Buddy
Field Extractions: Making Regex Your Buddy
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Andrei's Regex Clinic
Andrei's Regex ClinicAndrei's Regex Clinic
Andrei's Regex Clinic
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
ООП_лекция_11
ООП_лекция_11ООП_лекция_11
ООП_лекция_11
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressions
 
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And Beyond
 
Processing Regex Python
Processing Regex PythonProcessing Regex Python
Processing Regex Python
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Talking about advantages and disadvantages
Talking about advantages and disadvantagesTalking about advantages and disadvantages
Talking about advantages and disadvantages
 
Regular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsRegular Expression (Regex) Fundamentals
Regular Expression (Regex) Fundamentals
 
compiler and their types
compiler and their typescompiler and their types
compiler and their types
 
Lecture: Regular Expressions and Regular Languages
Lecture: Regular Expressions and Regular LanguagesLecture: Regular Expressions and Regular Languages
Lecture: Regular Expressions and Regular Languages
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011
 

Similar to Regular Expressions

Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in pythonJohn(Qiang) Zhang
 
Understanding Regular expressions: Programming Historian Study Group, Univers...
Understanding Regular expressions: Programming Historian Study Group, Univers...Understanding Regular expressions: Programming Historian Study Group, Univers...
Understanding Regular expressions: Programming Historian Study Group, Univers...Allison Jai O'Dell
 
APMG juni 2014 - Regular Expression
APMG juni 2014 - Regular ExpressionAPMG juni 2014 - Regular Expression
APMG juni 2014 - Regular ExpressionByte
 
Certified bit coded regular expression parsing
Certified bit coded regular expression parsingCertified bit coded regular expression parsing
Certified bit coded regular expression parsingrodrigogribeiro
 
Using Regular Expressions and Staying Sane
Using Regular Expressions and Staying SaneUsing Regular Expressions and Staying Sane
Using Regular Expressions and Staying SaneCarl Brown
 
Type Checking Scala Spark Datasets: Dataset Transforms
Type Checking Scala Spark Datasets: Dataset TransformsType Checking Scala Spark Datasets: Dataset Transforms
Type Checking Scala Spark Datasets: Dataset TransformsJohn Nestor
 
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in Practice
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in PracticeWeek-2: Theory & Practice of Data Cleaning: Regular Expressions in Practice
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in PracticeBertram Ludäscher
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular ExpressionsJesse Anderson
 
OpenCog Developer Workshop
OpenCog Developer WorkshopOpenCog Developer Workshop
OpenCog Developer WorkshopIbby Benali
 
2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekingeProf. Wim Van Criekinge
 
Avoiding JavaScript Pitfalls Through Tree Hugging
Avoiding JavaScript Pitfalls Through Tree HuggingAvoiding JavaScript Pitfalls Through Tree Hugging
Avoiding JavaScript Pitfalls Through Tree Huggingzefhemel
 
Slides chapter3part1 ruby-forjavaprogrammers
Slides chapter3part1 ruby-forjavaprogrammersSlides chapter3part1 ruby-forjavaprogrammers
Slides chapter3part1 ruby-forjavaprogrammersGiovanni924
 
(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?Tomasz Wrobel
 

Similar to Regular Expressions (20)

Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in python
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Regular expression for everyone
Regular expression for everyoneRegular expression for everyone
Regular expression for everyone
 
Understanding Regular expressions: Programming Historian Study Group, Univers...
Understanding Regular expressions: Programming Historian Study Group, Univers...Understanding Regular expressions: Programming Historian Study Group, Univers...
Understanding Regular expressions: Programming Historian Study Group, Univers...
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Regex 101
Regex 101Regex 101
Regex 101
 
APMG juni 2014 - Regular Expression
APMG juni 2014 - Regular ExpressionAPMG juni 2014 - Regular Expression
APMG juni 2014 - Regular Expression
 
P3 2018 python_regexes
P3 2018 python_regexesP3 2018 python_regexes
P3 2018 python_regexes
 
Json the-x-in-ajax1588
Json the-x-in-ajax1588Json the-x-in-ajax1588
Json the-x-in-ajax1588
 
Certified bit coded regular expression parsing
Certified bit coded regular expression parsingCertified bit coded regular expression parsing
Certified bit coded regular expression parsing
 
Using Regular Expressions and Staying Sane
Using Regular Expressions and Staying SaneUsing Regular Expressions and Staying Sane
Using Regular Expressions and Staying Sane
 
Type Checking Scala Spark Datasets: Dataset Transforms
Type Checking Scala Spark Datasets: Dataset TransformsType Checking Scala Spark Datasets: Dataset Transforms
Type Checking Scala Spark Datasets: Dataset Transforms
 
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in Practice
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in PracticeWeek-2: Theory & Practice of Data Cleaning: Regular Expressions in Practice
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in Practice
 
P3 2017 python_regexes
P3 2017 python_regexesP3 2017 python_regexes
P3 2017 python_regexes
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular Expressions
 
OpenCog Developer Workshop
OpenCog Developer WorkshopOpenCog Developer Workshop
OpenCog Developer Workshop
 
2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge
 
Avoiding JavaScript Pitfalls Through Tree Hugging
Avoiding JavaScript Pitfalls Through Tree HuggingAvoiding JavaScript Pitfalls Through Tree Hugging
Avoiding JavaScript Pitfalls Through Tree Hugging
 
Slides chapter3part1 ruby-forjavaprogrammers
Slides chapter3part1 ruby-forjavaprogrammersSlides chapter3part1 ruby-forjavaprogrammers
Slides chapter3part1 ruby-forjavaprogrammers
 
(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?
 

Recently uploaded

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Regular Expressions

  • 1. Regular Expressions 21 Januari 2014 Sido Grond Exception Twente 1
  • 3. Introduction • Regular Expression (RE or regex) is a text string that describes a search pattern • Some similarity to wildcards (e.g. *.txt) • Platform/language independent but some minor differences among them • Available in a.o. C#, Perl, Python, Java, Javascript, Visual Studio, Notepad++, Linux command line tools 3
  • 4. Applications • Syntax highlighting • Find and replace – Visual Studio (as usual: different syntax) – Notepad++ • Text searching – Unix tools: grep, sed, find • Programming – Pattern matching, filtering, replacing 4
  • 5. Constructs: single character Regex Strings that match Strings that don’t match a “abc”, “bad” “xyz” ^a “abc” “bad”, “xyz”, “^a” a$ “cba” “abc”, “bad”, “xyz” ^[a-z]$ “a”, “x” “aa”, “A” ^[a-z0-9+-]$ “s”, “3”, “+”, “-” “12”, “Q”, “a1” ^[^a-zA-Z]$ “5”, “+” “p”, “R”, “abc”, “15” ^[a-zA-Z] “b”, “C123”, “dd” “1cba”, “ f”(note space) [^a-zA-Z] “1cba”, “ f” “b”, “dd” ^.$ “m”, “3”, “%” “13”, “%a”, “xyz” . “%”, “e4”, “xyz”, “.” “” . “.” “%”, “e4”, “xyz” 5
  • 6. Constructs: word groups • • • • • • d is shorthand for Digits [0-9] w is shorthand for Words [a-zA-Z0-9] s is shorthand for whiteSpace [ trn] D is shorthand for non-Digits [^0-9] W is shorthand for non-Words [^a-zA-Z0-9] S is shorthand for non-whiteSpace [^ trn] 6
  • 7. Constructs: multiple character Regex Strings that match Strings that don’t match abc “abcde”, “rabco” “bac”, “ABC”, “bcde” ^abc “abc”, “abcde” “rabco”, “ABC”, “bcde” Ed “ME3”, “E85”, “EBE5” “ACME”, “E”, “E 8” Ed+$ “ME35”, “EBE5” “E85x”, “E3EB”, “E 8” foo|bar “foot”, “bart” “ooba” a(b|c)d “Tabd”, “acdc” “abcd”, “aed”, “ad” ^foo|bar “foo1”, “Harbar” “toofoo” ^(foo|bar) “foo1”, “bart” “toofoo”, “Harbar” 7
  • 8. Constructs: quantifiers Regex Strings that match Strings that don’t match ^a?$ “”, “a” “aa”, “aaaaa” ^a*$ “a”, “aaaaa”, “” “aaab”, “c”, “ba” ^a+$ “a”, “aaaaa” “aaab”, “c”, “ba”, “” ^a{4}$ “aaaa” “a”, “aaaaa”, “” ^a{2,6}$ “aa”, “aaa”, “aaaaaa” “a”, “”, “aaaaaaa” (e|o){2} “koe”, “booom”, “veel” “kol”, “omo”, “vele” e|(o{2}) “nest”, “booom”, “koe” “kol” (a[0-9]*){2} “aa”, “ba45a”, “a3a0” “abacus”, “a3b8a0” 8
  • 9. Constructs: advanced Regex String First match Greedy <.+> “This is a <B>first</B> test” “<B>first</B>” Lazy <.+?> “This is a <B>first</B> test” “<B>” Greedy/lazy repetition Regex Strings that match Groups ^(a|b)c(d)$ “acd”, “bcd” 0:”acd” 1:”a” 2:”d” ^(?:a|b)c(d)$ same as above 0:”acd” 1:”d” ^(?<name>a|b)c(d)$ same as above 0:”acd” 1:”d” name:”a” Grouping 9
  • 10. Constructs: advanced Regex Strings that match Strings that don’t match ^([a-c])x1x1 “axaxa”, “bxbxbyyyy” “axaxb”, “bxaxc” <(b)><(i)>.*?</2></1> “<b><i>bla</i></b>” “<b><i>bla</b></i>” Backreferences: inside regex Regex Strings that match Replace pattern Result strings ^(var)(1|2)$ “var1”, “var2” 1iable “variable” ^(a|b)c(d|e) “acd”, “bcd” 2XXX1 “dXXXa”, “dXXXb” Backreferences: find and replace 10
  • 11. Constructs: advanced Regex Strings that match Strings that don’t match ([a-b](?=x)) “blaax”, “bxa” “ab”, “bacx” ((?<=x)[a-b]) “yyxa”, “bxb” “ab”, “aax” ([a-b](?!x)) “bla”, “a”, “bxa” “bxc”, “ax” ((?<!x)[a-b]) “ral”, “dbx”, “bxa” “xa”, “lxb” Look ahead/behind 11
  • 12. Regexes in C# • using System.Text.RegularExpressions • Regex reg = new Regex(“a(b|c)d”); – – – – – reg.IsMatch(“abd”); reg.Matches(“abdEacd”); reg.Groups(“abdEabdC”); reg.Split(“abdEabdC”); reg.Replace(“abdEabdC”, “X”); true 2 Matches 2 Groups per match 2 Strings (“E” and “C”) “XEXC” • Single/Multiline options: is linebreak(n) special character 12
  • 13. Regexes in C# • Demo 13
  • 14. Regexes in editors • Notepad++ 14
  • 15. Regexes in Linux • grep, sed, find, vi 15
  • 16. Pros and cons • Advantages – – – – – Very flexible Fast processing Language independent A lot of work in a single line of code Often simpler than ‘substring+indexes’ approach • Disadvantages – Hard to read, for example ‘?’ has three meanings depending on context – Hard to debug: no info given when no match – Compilation only at runtime – Typos are very easily made (e.g. forget escape character) 16
  • 17. Conclusions “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.” Jamie Zawinski Don’t overuse it! 17
  • 18. Conclusions • • • • Very handy tool for string matching and replacing Built-in support in most programming languages Support in/for multiple applications More info – http://www.regular-expressions.info/ – http://msdn.microsoft.com/en-us/library/az24scfc %28v=vs.110%29.aspx • Fun – http://regex.alf.nu/ – http://www.i-programmer.info/news/144-graphics-and-games/5450can-you-do-the-regular-expression-crossword.html 18