And are they contagious?
There is no single official standard for
regular expressions, so there is no one
definition.

Simply put, a regular expression is a
text pattern used to search and/or
replace text.
                                    Easy peasy!
Perl programming language
Perl-compatible
.NET
Java
JavaScript
…
                  What, no cherry flavour?
Back to grammar school!
The regex a matches any occurrence of that character: the a in Jack and the standalone a in

Jack is a boy.

The regex cat matches the cat inside cats in

About cats and dogs.
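
A minimal sketch of literal matching, assuming Python's built-in re module (the slides themselves are not tied to any one tool):

```python
import re

# Every occurrence of the literal character a
a_matches = re.findall(r"a", "Jack is a boy.")
print(a_matches)  # ['a', 'a']

# The literal sequence cat, found inside the word cats
cat_match = re.search(r"cat", "About cats and dogs.")
print(cat_match.group(), cat_match.span())  # cat (6, 9)
```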
square bracket                [
backslash                     \
caret                         ^
dollar sign                   $
period or dot                 .
vertical bar or pipe symbol   |
question mark                 ?
asterisk or star              *
plus sign                     +
opening round bracket         (
closing round bracket         )
opening curly bracket         {
Special characters are reserved for special use.

They need to be preceded by a backslash if you want to
match them as literal characters.

This is called escaping.

If you want to match 1+1=2 the correct regex is 1\+1=2
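
The effect of escaping can be checked with a short sketch, again assuming Python's re module. The helper re.escape is a Python convenience, not something the slides mention:

```python
import re

text = "1+1=2"

# Unescaped, + is a quantifier ("one or more 1"), so the literal text is not found
unescaped = re.search(r"1+1=2", text)
print(unescaped)  # None

# Escaped, + is a literal plus sign
escaped = re.search(r"1\+1=2", text)
print(escaped.group())  # 1+1=2

# re.escape adds the necessary backslashes for you
auto_pattern = re.escape(text)
print(re.fullmatch(auto_pattern, text) is not None)  # True
```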
tab                 \t
carriage return     \r
line feed           \n

beginning of line   ^
end of line         $
word boundary       \b
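
A small sketch of these anchors and escapes in Python's re module. The sample text is made up for illustration, and the re.MULTILINE flag is the Python way to let ^ and $ match at every line rather than only at the start and end of the whole text:

```python
import re

text = "cat concat cats\ncat"

# \b keeps the match to whole words only
whole_words = re.findall(r"\bcat\b", text)
print(whole_words)  # ['cat', 'cat']

# ^ and $ anchor to line starts and line ends when MULTILINE is set
line_starts = re.findall(r"^cat", text, flags=re.MULTILINE)
line_ends = re.findall(r"cat$", text, flags=re.MULTILINE)
print(len(line_starts), len(line_ends))  # 2 1

# \t matches a tab character
fields = re.split(r"\t", "a\tb\tc")
print(fields)  # ['a', 'b', 'c']
```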
If the regular expression engine is Unicode enabled, you can
search for any character using its Unicode code point.

Depending on syntax: \u0000 or \x{0000}

Hard space             \u00A0 or \x{00A0}
® sign                 \u00AE or \x{00AE}
...
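
As a sketch, Python's re module accepts the \uXXXX form inside a pattern; the \x{XXXX} form belongs to other flavours such as Perl and is not used here. The sample text is invented for the example:

```python
import re

# Sample text containing a hard space (U+00A0) and a ® sign (U+00AE)
text = "100\u00a0km \u00ae"

hard_spaces = re.findall(r"\u00A0", text)
print(len(hard_spaces))  # 1

# Replace hard spaces by normal spaces
cleaned = re.sub(r"\u00A0", " ", text)

has_registered_sign = re.search(r"\u00AE", text) is not None
print(has_registered_sign)  # True
```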
Quantifiers allow you to specify the number of
occurrences to match against

X?             X, once or not at all
X*             X, zero or more times
X+             X, one or more times
X{n}           X, exactly n times
X{n,}          X, at least n times
X{n,m}         X, at least n but not more than m times
The regex colou?r matches both colour and color.

You can also group items together by using round brackets:

Nov(ember)? will match Nov and November

The regex a+ is the same as a{1,} and matches a or aaaaa

The regex w{3} matches the www in www.qa-distiller.com
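
The quantifier examples can be tried out like this, assuming Python's re module. The test words and the (?:...) non-capturing group are choices made for this sketch so that findall returns the whole match:

```python
import re

# ? makes the u optional
words = ["colour", "color", "colr"]
colour_hits = [w for w in words if re.fullmatch(r"colou?r", w)]
print(colour_hits)  # ['colour', 'color']

# A group followed by ? makes the whole group optional
months = re.findall(r"Nov(?:ember)?", "Nov 5 and November 6")
print(months)  # ['Nov', 'November']

# a+ and a{1,} accept exactly the same strings
samples = ["", "a", "aaaaa", "ab"]
same = all(
    bool(re.fullmatch(r"a+", s)) == bool(re.fullmatch(r"a{1,}", s))
    for s in samples
)
print(same)  # True

# {3} means exactly three times
www = re.search(r"w{3}", "www.qa-distiller.com").group()
print(www)  # www
```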
Simply place the characters you want to match between
square brackets.

If you want to match an a or an e, use [ae]. You could
use this in gr[ae]y to match either gray or grey.

A character class matches only a single character. The
order of the characters inside the brackets is not important.

You can also use ranges. [0-9] matches a single digit
between 0 and 9.
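
A quick sketch of character classes in Python's re module; the sample strings are made up:

```python
import re

sample = "gray, grey, groy"

# [ae] accepts exactly one character: an a or an e
greys = re.findall(r"gr[ae]y", sample)
print(greys)  # ['gray', 'grey']

# The order inside the brackets makes no difference
same_result = re.findall(r"gr[ea]y", sample) == greys
print(same_result)  # True

# A range matches one character out of the range
digits = re.findall(r"[0-9]", "Call 112 now")
print(digits)  # ['1', '1', '2']
```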
Typing a caret after the opening square bracket will negate
the character class.

q[^u] means: "a q followed by a character that is not a u".

It will match the q and the space after the q in

Iraq is a political quagmire.

but not the q of quagmire because it is followed by the
letter u
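
The same example as a sketch in Python's re module. Note that the negated class still has to match a character, so a q at the very end of the text is not found:

```python
import re

matches = re.findall(r"q[^u]", "Iraq is a political quagmire.")
print(matches)  # ['q ']  (the q plus the space after it)

# Nothing follows the q here, so [^u] has no character to match
at_end = re.search(r"q[^u]", "Iraq")
print(at_end)  # None
```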
\d            digit                  [0-9]
\w            word character         [A-Za-z0-9_]
\s            whitespace             [ \t\r\n]

Negated versions

\D            not a digit            [^\d]
\W            not a word character   [^\w]
\S            not a whitespace       [^\s]
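
A sketch of the shorthand classes in Python's re module. The re.ASCII flag is an assumption made here so that \d and \w behave exactly like the bracket expressions in the table; without it Python also accepts non-ASCII digits and letters. The sample text is invented:

```python
import re

text = "Room 42, floor_3"

numbers = re.findall(r"\d+", text, flags=re.ASCII)
print(numbers)  # ['42', '3']

words = re.findall(r"\w+", text, flags=re.ASCII)
print(words)  # ['Room', '42', 'floor_3']

# \S is the negated version of \s: runs of anything but whitespace
chunks = re.findall(r"\S+", text)
print(chunks)  # ['Room', '42,', 'floor_3']
```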
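The dot matches a single character, without caring what
that character is. In most flavours the only exception is
a line break.

The regex e. matches each e together with the character after it in

Houston, we have a problem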
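
A sketch of the dot in Python's re module. The re.DOTALL flag shown at the end is Python's switch for letting the dot match a line break as well:

```python
import re

dot_matches = re.findall(r"e.", "Houston, we have a problem")
print(dot_matches)  # ['e ', 'e ', 'em']

# By default the dot does not match a line feed
no_newline = re.search(r"e.", "e\n")
with_dotall = re.search(r"e.", "e\n", flags=re.DOTALL)
print(no_newline is None, with_dotall is not None)  # True True
```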
If you want to search for cat or dog, separate both options
with a vertical bar or pipe symbol:

cat|dog matches the cat in

Are you sure you want a cat?

You can add more options like this:

green|black|yellow|white
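
Alternation as a sketch in Python's re module; the second sample sentence is made up:

```python
import re

first = re.search(r"cat|dog", "Are you sure you want a cat?")
print(first.group())  # cat

colours = re.findall(
    r"green|black|yellow|white",
    "a black and white cat on a yellow sofa",
)
print(colours)  # ['black', 'white', 'yellow']
```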
Which of the following completely matches regex a(ab)*a

1)   abababa
2)   aaba
3)   aabbaa
4)   aba
5)   aabababa
Which of the following completely matches regex ab+c?

1)   abc
2)   ac
3)   abbb
4)   bbc
5)   abbcc
Which of the following completely matches regex a.[bc]+

1)   abc
2)   abbbbbbbb
3)   azc
4)   abcbcbcbc
5)   ac
6)   asccbbbbcbcccc
Which of the following completely matches regex
(very )+(fat )?(tall|ugly) man

1)   very fat man
2)   fat tall man
3)   very very fat ugly man
4)   very very very tall man
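
To check your answers to the four questions above, a sketch like the following can help. It assumes Python's re module, where re.fullmatch only succeeds if the whole string matches, and it assumes that the question mark in ab+c? is part of the regex. Run it to see which option numbers match each pattern:

```python
import re

# Each regex from the questions with its numbered candidates
quizzes = {
    r"a(ab)*a": ["abababa", "aaba", "aabbaa", "aba", "aabababa"],
    r"ab+c?": ["abc", "ac", "abbb", "bbc", "abbcc"],
    r"a.[bc]+": ["abc", "abbbbbbbb", "azc", "abcbcbcbc", "ac",
                 "asccbbbbcbcccc"],
    r"(very )+(fat )?(tall|ugly) man": [
        "very fat man",
        "fat tall man",
        "very very fat ugly man",
        "very very very tall man",
    ],
}

# Collect the option numbers (starting at 1) that match completely
answers = {
    pattern: [
        number
        for number, candidate in enumerate(candidates, start=1)
        if re.fullmatch(pattern, candidate)
    ]
    for pattern, candidates in quizzes.items()
}

for pattern, numbers in answers.items():
    print(pattern, "->", numbers)
```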
Still awake?
Positive lookahead:           X(?=Y)

Match X only if it is followed by Y
Yamagata(?= Europe) matches the first Yamagata in

Yamagata Europe, Yamagata Intech Solutions

Negative lookahead:           X(?!Y)

Match X only if it is not followed by Y
Yamagata(?! Europe) matches the second Yamagata in

Yamagata Europe, Yamagata Intech Solutions
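
A sketch of both lookaheads in Python's re module. The start positions show which Yamagata was found; the lookahead text itself is never part of the match:

```python
import re

text = "Yamagata Europe, Yamagata Intech Solutions"

positive = [m.start() for m in re.finditer(r"Yamagata(?= Europe)", text)]
print(positive)  # [0]   (the first Yamagata)

negative = [m.start() for m in re.finditer(r"Yamagata(?! Europe)", text)]
print(negative)  # [17]  (the second Yamagata)

# Only Yamagata is matched, not the text checked by the lookahead
matched_text = re.search(r"Yamagata(?= Europe)", text).group()
print(matched_text)  # Yamagata
```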
Positive lookbehind:         (?<=Y)X

Match X only if it follows Y
(?<=a)b matches the first b in

thingamabob

Negative lookbehind:         (?<!Y)X

Match X only if it does not follow Y
(?<!a)b matches the last b in

thingamabob
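
The lookbehind examples as a sketch in Python's re module, which supports lookbehind as long as the text to look back at has a fixed length:

```python
import re

word = "thingamabob"

after_a = [m.start() for m in re.finditer(r"(?<=a)b", word)]
print(after_a)  # [8]   (the b in "ab")

not_after_a = [m.start() for m in re.finditer(r"(?<!a)b", word)]
print(not_after_a)  # [10]  (the final b, which follows an o)
```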
Round brackets create a capturing group that you can refer back to.

You can use the backreference with a backslash + the number of the
group.

The regex Java(script) is a \1ing language matches
Javascript is a scripting language

The regex (Java)(script) is a \2ing language that is not the same as \1
matches
Javascript is a scripting language that is not the same as Java
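
Both backreference examples as a sketch in Python's re module:

```python
import re

one_group = re.fullmatch(
    r"Java(script) is a \1ing language",
    "Javascript is a scripting language",
)
print(one_group.group(1))  # script

two_groups = re.fullmatch(
    r"(Java)(script) is a \2ing language that is not the same as \1",
    "Javascript is a scripting language that is not the same as Java",
)
print(two_groups.groups())  # ('Java', 'script')
```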
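Use the regex \b(\w+) \1\b to find doubled words. It finds haar haar and in in in the Dutch sentence

Ze streelde haar haar in in de auto.

With exceptions:

\b(?!haar\b)(\w+) \1\b

This finds only in in, because haar haar ("her hair") is a legitimate repetition in Dutch:

Ze streelde haar haar in in de auto.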
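
A sketch of the doubled-word check in Python's re module, with and without the exception for haar:

```python
import re

sentence = "Ze streelde haar haar in in de auto."

# Any word followed by a space and the same word again
doubled = [m.group() for m in re.finditer(r"\b(\w+) \1\b", sentence)]
print(doubled)  # ['haar haar', 'in in']

# The negative lookahead skips candidates that start with the word haar
filtered = [
    m.group() for m in re.finditer(r"\b(?!haar\b)(\w+) \1\b", sentence)
]
print(filtered)  # ['in in']
```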
You want to add brackets around step numbers:

This is step 5 from chapter 1. Continue with step 45 from page 15.

Use the regex ([sS]tep) (\d+) to find all instances.

Replace it by \1 (\2)

Or alternatively (?<=[sS]tep )\d+ by (\0), where \0 stands for the whole match
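
Both replacements as a sketch in Python's re module. One flavour difference to be aware of: Python writes the whole match in a replacement as \g<0> rather than \0:

```python
import re

text = "This is step 5 from chapter 1. Continue with step 45 from page 15."

# Variant 1: capture the word and the number, then rebuild the text
with_groups = re.sub(r"([sS]tep) (\d+)", r"\1 (\2)", text)
print(with_groups)

# Variant 2: match only the number, using a lookbehind for the word
with_lookbehind = re.sub(r"(?<=[sS]tep )\d+", r"(\g<0>)", text)
print(with_lookbehind == with_groups)  # True
```

Chapter and page numbers stay untouched in both variants because they are not preceded by the word step.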
Powerful, for individual text-based files
More powerful, batch operations, command line
No back references
RegEx Text File Filter
RegEx search
Very limited
Powerful, called GREP
Some people, when confronted with a problem, think
"I know, I'll use regular expressions."
Now they have two problems.


-> Do not try to do everything in one uber-regex
-> Regular expressions are not parsers
http://www.regular-expressions.info

An Introduction to Regular expressions
