Regular Expression

Published in: Technology, Business
Transcript of "Regular Expression"

  1. Regular Expression Supported By : java.util.regex
  2. Introduction & uses
       • A regular expression is a way to describe a group of strings based on common characteristics shared by each string in the group.
       • In practice, we write a sequence of characters called a pattern and use it to find the character sequences that match it.
       • Generally used in text parsing, search and replace, editing, and other kinds of text manipulation.
  3. Pattern & Matcher Objects
       • Pattern
           • A Pattern object is a compiled representation of a regular expression.
           • This class doesn't have public constructors.
           • To create a pattern, we invoke one of its public static compile methods.
       • Matcher
           • A Matcher object is the engine that interprets the pattern and performs match operations against an input string.
           • Like the Pattern class, Matcher defines no public constructors.
           • We get a Matcher object by invoking matcher() on a Pattern object.
  4. Performing Pattern Matching
       • static Pattern compile(String regex)
       • Matcher matcher(CharSequence str)
       • Example:
           Pattern p = Pattern.compile("sec-58");
           Matcher m = p.matcher("sec-58");
           System.out.println(m.matches());   // the whole input must match
           m = p.matcher("Sec-58");           // matching is case-sensitive by default
           System.out.println(m.matches());
       • Output:
           true
           false
  5. Performing Pattern Matching contd.
       • boolean find()
           • Determines whether a subsequence of the input sequence matches the pattern.
       • String group()
           • Returns the input subsequence matched by the previous match.
       • int start()
           • Returns the start index of the current match in the input sequence.
       • int end()
           • Returns the index one past the end of the current match.
       • start() and end() both throw IllegalStateException if no match has been attempted yet or the previous match operation failed.
       • String replaceAll(String)
           • Replaces every occurrence of a matching sequence with another sequence.
  6. Performing Pattern Matching contd.
       • Example:
           Pattern p = Pattern.compile("sec-58");
           Matcher m = p.matcher("C-58,sec-58;D-20,sec-58;F-14,sec-57;C-45,sec-58");
           // find() advances to the next match on each call
           while (m.find())
           {
               System.out.println(m.group() + " Starting at " + m.start());
           }
       • Output:
           sec-58 Starting at 5
           sec-58 Starting at 17
           sec-58 Starting at 41
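The slides list replaceAll(String) without showing it in use. A minimal sketch, reusing the input from the find() example; the class name ReplaceAllDemo, the helper relabel, and the replacement text "sec-99" are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceAllDemo {
    // Replaces every occurrence of the literal text "sec-58" in the input.
    // (Hypothetical helper, not part of the original slides.)
    static String relabel(String input, String replacement) {
        Pattern p = Pattern.compile("sec-58");
        Matcher m = p.matcher(input);
        return m.replaceAll(replacement);
    }

    public static void main(String[] args) {
        String input = "C-58,sec-58;D-20,sec-58;F-14,sec-57;C-45,sec-58";
        System.out.println(relabel(input, "sec-99"));
    }
}
```

Note that in the replacement string the characters $ and \ have special meaning (group references and escapes), so a replacement containing them would need Matcher.quoteReplacement.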
  7. Metacharacters
       • The API supports a number of special characters that affect the way a pattern is matched.
       • Supported metacharacters are:
           ( [ { \ ^ - $ | ] } ) ? * + .
       • Note: in certain contexts, the characters listed above are not treated as metacharacters.
       • There are 2 ways to force a metacharacter to be treated as an ordinary character:
           • Precede the metacharacter with a backslash, or
           • Enclose it within \Q and \E.
       • Example:
           // In a Java string literal, each regex backslash is written as \\
           Pattern p = Pattern.compile("\\+\\.");   // the same text can be quoted as "\\Q+.\\E"
           Matcher m = p.matcher("+.+");
           while (m.find())
           {
               System.out.println(m.group());
           }
       • Output : +.
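To make the two escaping styles concrete, here is a small sketch that contrasts an unescaped + with the backslash form and the \Q...\E form. The class EscapeDemo, its method names, and the sample text "1+1" are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Unescaped: + is a quantifier, so this means "one or more 1s, then a 1".
    static boolean unescaped(String input) {
        return Pattern.compile("1+1").matcher(input).find();
    }

    // Backslash escape: + is an ordinary character here.
    static boolean viaBackslash(String input) {
        return Pattern.compile("1\\+1").matcher(input).find();
    }

    // \Q...\E quoting: everything in between is taken literally.
    static boolean viaQuote(String input) {
        return Pattern.compile("\\Q1+1\\E").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(unescaped("1+1=2"));     // false
        System.out.println(viaBackslash("1+1=2"));  // true
        System.out.println(viaQuote("1+1=2"));      // true
    }
}
```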
  8. Character Classes
       • A character class is a set of characters enclosed within square brackets.
       • It specifies the characters that will successfully match a single character from a given input string.

       Regular Expression   Conditions under which there'll be a match
       [abc]                a, b, or c (simple class)
       [^abc]               Any character except a, b, or c (negation)
       [a-zA-Z]             a through z, or A through Z, inclusive (range)
       [a-d[m-p]]           a through d, or m through p: [a-dm-p] (union)
       [a-z&&[def]]         d, e, or f (intersection)
       [a-z&&[^bc]]         a through z, except for b and c: [ad-z] (subtraction)
       [a-z&&[^m-p]]        a through z, and not m through p: [a-lq-z] (subtraction)
  9. Simple Classes
       • Example:
           // H or h, followed by one of a, b, i, t
           Pattern p = Pattern.compile("[Hh][abit]");
           Matcher m = p.matcher("Hi, how r u?.Hey! Shall we go for the dinner tonight.");
           while (m.find())
           {
               System.out.println(m.group());
           }
       • Output:
           Hi
           ha
           ht
  10. Negation
       • To match all the characters except those listed within brackets, insert the ^ metacharacter at the beginning of the character class.
       • Example:
           // any character other than H or h, followed by any character other than a, b, i, t
           Pattern p = Pattern.compile("[^Hh][^abit]");
           Matcher m = p.matcher("Hindustan times.");
           while (m.find())
           {
               System.out.print(m.group() + " ");
           }
       • Output:
           in du an im es
  11. Ranges
       • Metacharacter used is - (hyphen)
       • Ex : a-z, A-P, 5-8
       • Example:
           // a character outside 2-6, a literal hyphen, then a digit in 0-7
           // (outside square brackets the hyphen is an ordinary character)
           Pattern p = Pattern.compile("[^2-6]-[0-7]");
           Matcher m = p.matcher("4500-569-3286-5639");
           while (m.find())
           {
               System.out.print(m.group() + " ");
           }
       • Output:
           0-5 9-3
  12. Union
       • Used to create a single character class composed of 2 or more separate character classes.
       • To create a union, simply nest one class inside the other, such as [0-5[6-8]].
       • Example:
           // a digit in 4-8, followed by a digit in 5-9 or one of 0, 2
           Pattern p = Pattern.compile("[4-8][[5-9][02]]");
           Matcher m = p.matcher("4590-569-3286-5639");
           while (m.find())
           {
               System.out.print(m.group() + " ");
           }
       • Output:
           45 56 86 56
  13. Intersections
       • Used to create a single character class matching only the characters common to all of its nested classes.
       • Ex : [0-6&&[345]]
       • Example:
           // digits that are both in 2-6 and in {2, 3, 4, 7, 8}, i.e. 2, 3, or 4
           Pattern p = Pattern.compile("[2-6&&[23478]]");
           Matcher m = p.matcher("978979321326");
           while (m.find())
           {
               System.out.print(m.group() + " ");
           }
       • Output:
           3 2 3 2
  14. Subtraction
       • Used to negate one or more nested character classes.
       • Ex : [0-6&&[^345]]
       • Example:
           // digits in 2-6 that are not 2, 3, or 4, i.e. 5 or 6
           Pattern p = Pattern.compile("[2-6&&[^234]]");
           Matcher m = p.matcher("978979321326");
           while (m.find())
           {
               System.out.print(m.group() + " ");
           }
       • Output:
           6
  15. Predefined Character Classes
       • The Pattern API contains a number of useful predefined character classes, which offer convenient shorthands for commonly used regular expressions.

       Shorthand   Character class
       . (dot)     Any character
       \d          A digit: [0-9]
       \D          A non-digit: [^0-9]
       \s          A whitespace character: [ \t\n\x0B\f\r]
       \S          A non-whitespace character: [^\s]
       \w          A word character: [a-zA-Z_0-9]
       \W          A non-word character: [^\w]
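The shorthands in the table can be tried out with a small helper that collects every match. This is only a sketch: the class PredefinedClassDemo, the helper findAll, and the sample sentence are assumptions made for illustration. Remember that each backslash is doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredefinedClassDemo {
    // Collects the text of every match of regex in input (hypothetical helper).
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "Room 42, floor 7";
        System.out.println(findAll("\\d+", input)); // [42, 7]
        System.out.println(findAll("\\w+", input)); // [Room, 42, floor, 7]
    }
}
```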
  17. Quantifiers
       • Allow us to specify the number of occurrences to match against.
       • Types : 3
           • Greedy
           • Reluctant
           • Possessive

       Greedy    Reluctant   Possessive   Meaning
       X?        X??         X?+          X, once or not at all
       X*        X*?         X*+          X, zero or more times
       X+        X+?         X++          X, one or more times
       X{n}      X{n}?       X{n}+        X, exactly n times
       X{n,}     X{n,}?      X{n,}+       X, at least n times
       X{n,m}    X{n,m}?     X{n,m}+      X, at least n but not more than m times
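The slides that follow exercise ?, * and +, but not the counted forms in the table. A brief sketch of X{n,m} in its three variants; the class BoundedQuantifierDemo, the helper findAll, and the inputs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundedQuantifierDemo {
    // Collects the text of every match of regex in input (hypothetical helper).
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Greedy: takes as many as allowed (3), leaving a single a that cannot match.
        System.out.println(findAll("a{2,3}", "aaaa"));   // [aaa]
        // Reluctant: takes as few as allowed (2), so two matches fit.
        System.out.println(findAll("a{2,3}?", "aaaa"));  // [aa, aa]
        // Greedy gives one a back so the trailing a can match...
        System.out.println(findAll("a{2,3}a", "aaa"));   // [aaa]
        // ...possessive never gives anything back, so there is no match.
        System.out.println(findAll("a{2,3}+a", "aaa"));  // []
    }
}
```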
  18. Greedy Quantifiers
       • Example:
           String regex1 = "a?";
           String regex2 = "a*";
           String regex3 = "a+";
           // a? and a* both accept zero occurrences, so they match the empty input
           Pattern p = Pattern.compile(regex1);
           Matcher m = p.matcher("");
           if (m.find())
               System.out.println("Match found at " + m.start() + " ending at " + m.end());
           p = Pattern.compile(regex2);
           m = p.matcher("");
           if (m.find())
               System.out.println("Match found at " + m.start() + " ending at " + m.end());
           // a+ needs at least one a, so nothing is printed for it
           p = Pattern.compile(regex3);
           m = p.matcher("");
           if (m.find())
               System.out.println("Match found at " + m.start() + " ending at " + m.end());
       • Output:
           Match found at 0 ending at 0
           Match found at 0 ending at 0
  19. Zero-length Matches
       • Both a* and a? allow zero occurrences of the letter a.
       • Cases where a zero-length match can occur:
           • In an empty input string.
           • At the beginning of an input string.
           • After the last character of an input string.
           • Between any 2 characters of an input string.
       • A zero-length match always starts and ends at the same index position.
       • Example:
           Pattern p = Pattern.compile("a?");
           Matcher m = p.matcher("a");
           while (m.find())
           {
               System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end());
           }
       • Output (the second line is a zero-length match, so group() is empty):
           a found at 0 ending at 1
            found at 1 ending at 1
  20. Zero-length Matches contd.
       • Example:
           Pattern p = Pattern.compile("a?");
           Matcher m = p.matcher("aaaa");
           while (m.find())
           {
               System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end());
           }
       • Output (the last line is a zero-length match):
           a found at 0 ending at 1
           a found at 1 ending at 2
           a found at 2 ending at 3
           a found at 3 ending at 4
            found at 4 ending at 4
  21. Zero-length Matches contd.
       • Example:
           Pattern p = Pattern.compile("a*");
           Matcher m = p.matcher("aaaa");
           while (m.find())
           {
               System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end());
           }
       • Output (the last line is a zero-length match):
           aaaa found at 0 ending at 4
            found at 4 ending at 4
  22. Zero-length Matches contd.
       • Example:
           // a+ requires at least one a, so no zero-length match is reported
           Pattern p = Pattern.compile("a+");
           Matcher m = p.matcher("aaaa");
           while (m.find())
           {
               System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end());
           }
       • Output:
           aaaa found at 0 ending at 4
  23. Capturing Groups & Character Classes with Quantifiers
       • On its own, a quantifier applies only to the single character that precedes it. To quantify more than one character, attach it to a character class or a capturing group.
       • abc+ means a, followed by b, followed by c one or more times.
       • [abc]+ means a or b or c, one or more times.
       • (abc)+ means the group "abc" one or more times.
       • Example:
           Pattern p = Pattern.compile("(hp)+");
           Matcher m = p.matcher("hphclhpcompaq IBM hp");
           while (m.find())
           { System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end()); }
       • Output:
           hp found at 0 ending at 2
           hp found at 5 ending at 7
           hp found at 18 ending at 20
  24. Character Classes with Quantifiers
       • Example:
           // one or more characters, each of which is h or p
           Pattern p = Pattern.compile("[hp]+");
           Matcher m = p.matcher("hphclhpcompaq IBM hp");
           while (m.find())
           { System.out.println(m.group() + " found at " + m.start() + " ending at " + m.end()); }
       • Output:
           hph found at 0 ending at 3
           hp found at 5 ending at 7
           p found at 10 ending at 11
           hp found at 18 ending at 20
  25. Difference Among Greedy, Reluctant & Possessive Quantifiers
       • Greedy quantifier:
           • It makes the matcher read in the entire input string before attempting the first match.
           • If the first match attempt (i.e., the entire string) fails, the matcher backs off the input string by one character and tries again.
           • This process is repeated until a match is found or there are no more characters left to back off from.
           • Depending on the quantifier used in the expression, the last thing it will try matching against is 1 or 0 characters.
       • Example:
           Pattern = .*foo
           Input String = xfooxxxxxxfoo
           Match found at index 0 & ending at index 13.
  26. Difference Among Greedy, Reluctant & Possessive Quantifiers
       • Reluctant quantifier:
           • Starts at the beginning of the input string, then reads one character at a time while looking for a match.
           • The last thing it tries is the entire input string.
       • Example:
           Pattern = .*?foo
           Input String = xfooxxxxxxfoo
           Match found at index 0 & ending at index 4.
           Match found at index 4 & ending at index 13.
  27. Difference Among Greedy, Reluctant & Possessive Quantifiers
       • Possessive quantifier:
           • Reads in the entire input string.
           • It tries once and only once for a match and never backs off, so nothing is left over for the rest of the pattern.
       • Example:
           Pattern = .*+foo
           Input String = xfooxxxxxxfoo
           No match found
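The three examples above can be reproduced side by side. A minimal sketch, assuming a helper that reports each match as "start-end"; the class QuantifierModesDemo and the method spans are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierModesDemo {
    // Returns "start-end" for every match of regex in input (hypothetical helper).
    static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.println(spans(".*foo", input));   // greedy:     [0-13]
        System.out.println(spans(".*?foo", input));  // reluctant:  [0-4, 4-13]
        System.out.println(spans(".*+foo", input));  // possessive: []
    }
}
```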
  28. Capturing Groups
       • A way to treat multiple characters as a single unit.
       • A group is created by placing the characters to be grouped inside a set of parentheses, such as (890).
       • Numbering
           • Capturing groups are numbered by counting their opening parentheses from left to right.
           • The expression ((A)(B(C))) has 4 groups:
               1. ((A)(B(C)))
               2. (A)
               3. (B(C))
               4. (C)
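The numbering rule can be checked by printing each group of ((A)(B(C))) after a match. This is a sketch only; the class GroupNumberingDemo and the helper groups are assumptions, and group 0 (the whole match) is included first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {
    // Returns group 0 followed by groups 1..groupCount() if the whole input matches.
    static List<String> groups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Index in the list = group number
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
    }
}
```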
  29. Counting the Capturing Groups
       • groupCount() : Returns the number of capturing groups present in the matcher's pattern.
       • There is also a special group, group 0. It always represents the entire expression and is not included in the total reported by groupCount().
       • public int start(int group) : Returns the start index of the subsequence captured by the given group during the previous match.
       • public int end(int group) : Returns the index of the last character, plus one, of the subsequence captured by the given group during the previous match.
       • public String group(int group) : Returns the input subsequence captured by the given group during the previous match operation.
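A short sketch of the per-group accessors listed above. The year-month pattern, the input "date: 2024-05", the class GroupIndexDemo, and both helper methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupIndexDemo {
    // Number of capturing groups in the pattern (group 0 is not counted).
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // For the first match, describes each group as "number:text@start-end".
    static List<String> describe(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                out.add(g + ":" + m.group(g) + "@" + m.start(g) + "-" + m.end(g));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String regex = "(\\d{4})-(\\d{2})";
        System.out.println(countGroups(regex));               // 2
        System.out.println(describe(regex, "date: 2024-05")); // [1:2024@6-10, 2:05@11-13]
    }
}
```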
  30. Backreferences
       • The portion of the input string that matches a capturing group is saved in memory for later recall via backreferences.
       • A backreference is specified in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled.
       • Ex : (\d\d) defines one capturing group matching 2 digits in a row, which can be recalled later in the expression via the backreference \1.
       • Example
           • Regex : (\d\d)\1
           • String : 1212
           • String found at index 0 & ending at 4.
           • Regex : (\d\d)\1
           • String : 1234
           • No match found.
       • Note : For nested capturing groups, backreferencing works in exactly the same way: specify a backslash followed by the number of the group to be recalled.
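The backreference example translates directly to Java, with each regex backslash doubled in the string literal. A minimal sketch; the class BackreferenceDemo and the method hasRepeatedPair are hypothetical names.

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // True if the whole input is two digits followed by the same two digits again.
    static boolean hasRepeatedPair(String input) {
        return Pattern.matches("(\\d\\d)\\1", input);
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedPair("1212")); // true
        System.out.println(hasRepeatedPair("1234")); // false
    }
}
```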
  31. Boundary Matchers
       • Situation : We want to find a word in a file, but only if it appears at the beginning or end of a line.

       Boundary Matcher   Meaning
       ^                  Beginning of a line
       $                  End of a line
       \b                 A word boundary
       \B                 A non-word boundary
       \A                 Beginning of the input
       \G                 End of the previous match
       \Z                 End of the input but for the final terminator, if any
       \z                 The end of the input
  32. Boundary Matchers
       • Regex : ^dog$
           • Input String : dog
           • Match found at index 0 ending at index 3.
       • Regex : ^dog$
           • Input String : "     dog" (with leading spaces)
           • No match found
       • Regex : \s*dog$
           • Input String : "     dog" (five leading spaces)
           • Match found at index 0 ending at index 8.
       • Regex : ^dog\w*
           • Input String : dogblahblah
           • Match found at index 0 ending at index 11.
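The ^ and $ cases above can be verified with a helper that reports the first match as "start-end". This is a sketch under assumptions: the class AnchorDemo and the method firstSpan are invented, and the padded input is taken to have five leading spaces, consistent with the end index 8 reported above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorDemo {
    // Returns "start-end" of the first match, or "none" if there is no match.
    static String firstSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + "-" + m.end() : "none";
    }

    public static void main(String[] args) {
        String padded = "     dog"; // five leading spaces (assumed)
        System.out.println(firstSpan("^dog$", "dog"));          // 0-3
        System.out.println(firstSpan("^dog$", padded));         // none
        System.out.println(firstSpan("\\s*dog$", padded));      // 0-8
        System.out.println(firstSpan("^dog\\w*", "dogblahblah")); // 0-11
    }
}
```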
  33. Boundary Matchers
       • Regex : \bdog\b
           • Input String : The dog plays in the yard
           • "dog" found at index 4 ending at index 7.
       • Regex : \bdog\b
           • Input String : The doggie plays in the yard.
           • No match found
       • Regex : \bdog\B
           • Input String : The dog plays in the yard
           • No match found.
       • Regex : \bdog\B
           • Input String : The doggie plays in the yard.
           • "dog" found at index 4 ending at index 7.
  34. Boundary Matchers
       • Regex : dog
           • Input String : dog dog
           • "dog" found at index 0 ending at index 3.
           • "dog" found at index 4 ending at index 7.
       • Regex : \Gdog
           • Input String : dog dog
           • "dog" found at index 0 ending at index 3.
           • The second dog is not reported because it does not begin where the previous match ended.
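The word-boundary and \G cases can be run the same way. A minimal sketch; the class WordBoundaryDemo and the helper spans are illustrative names, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Returns "start-end" for every match of regex in input (hypothetical helper).
    static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String dog = "The dog plays in the yard";
        String doggie = "The doggie plays in the yard.";
        System.out.println(spans("\\bdog\\b", dog));     // [4-7]
        System.out.println(spans("\\bdog\\b", doggie));  // []
        System.out.println(spans("\\bdog\\B", dog));     // []
        System.out.println(spans("\\bdog\\B", doggie));  // [4-7]
        System.out.println(spans("dog", "dog dog"));     // [0-3, 4-7]
        System.out.println(spans("\\Gdog", "dog dog"));  // [0-3]
    }
}
```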
