An Introduction to Regular expressions

Published in: Technology

  1. 1. And are they contagious?
  2. 2. There is no official standard for regular expressions, so no real definition. Simply put, you can call it a text pattern to search and/or replace text. Easy peasy!
  3. 3. Perl programming language, Perl-compatible, .NET, Java, JavaScript… What, no cherry flavour?
  4. 4. Back to grammar school!
  5. 5. The regex a matches any occurrence of that character, as in "Jack is a boy." The regex cat matches the cat in "About cats and dogs."
  6. 6. square bracket [; backslash \; caret ^; dollar sign $; period or dot .; vertical bar or pipe symbol |; question mark ?; asterisk or star *; plus sign +; opening round bracket (; closing round bracket ); opening curly bracket {
  7. 7. Special characters are reserved for special use. They need to be preceded by a backslash if you want to match them as literal characters. This is called escaping. If you want to match 1+1=2, the correct regex is 1\+1=2
  8. 8. tab \t; carriage return \r; line feed \n; beginning of line ^; end of line $; word boundary \b
  9. 9. If regular expressions are Unicode enabled, you can search for any character using its Unicode value. Depending on syntax: \u0000 or \x{0000}. Hard space: \u00A0 or \x{00A0}; ® sign: \u00AE or \x{00AE}; ...
  10. 10. Quantifiers allow you to specify the number of occurrences to match against. X?: X, once or not at all; X*: X, zero or more times; X+: X, one or more times; X{n}: X, exactly n times; X{n,}: X, at least n times; X{n,m}: X, at least n but not more than m times
  11. 11. The regex colou?r matches both colour and color. You can also group items together by using brackets: Nov(ember)? will match Nov and November. The regex a+ is the same as a{1,} and matches a or aaaaa. The regex w{3} matches the www in www.qa-distiller.com
  12. 12. Simply place the characters you want to match between square brackets. If you want to match an a or an e, use [ae]. You could use this in gr[ae]y to match either gray or grey. A character class matches only a single character; the order of the characters inside it is not important. You can also use ranges: [0-9] matches a single digit between 0 and 9
  13. 13. Typing a caret after the opening square bracket will negate the character class. q[^u] means: "a q followed by a character that is not a u". It will match the q and the space after the q in "Iraq is a political quagmire." but not the q of quagmire, because it is followed by the letter u
  14. 14. \d: digit, [0-9]; \w: word character, [A-Za-z0-9_]; \s: whitespace, [ \t\r\n]. Negated versions: \D: not a digit, [^\d]; \W: not a word character, [^\w]; \S: not a whitespace, [^\s]
  15. 15. The dot matches a single character, without caring what that character is. In "Houston, we have a problem" the regex e. matches the e plus the following space in "we " and "have ", and the em in "problem"
  16. 16. If you want to search for cat or dog, separate both options with a vertical bar or pipe symbol: cat|dog matches the cat in "Are you sure you want a cat?" You can add more options like this: green|black|yellow|white
  17. 17. Which of the following completely matches the regex a(ab)*a: 1) abababa; 2) aaba; 3) aabbaa; 4) aba; 5) aabababa
  18. 18. Which of the following completely matches the regex ab+c?: 1) abc; 2) ac; 3) abbb; 4) bbc; 5) abbcc
  19. 19. Which of the following completely matches the regex a.[bc]+: 1) abc; 2) abbbbbbbb; 3) azc; 4) abcbcbcbc; 5) ac; 6) asccbbbbcbcccc
  20. 20. Which of the following completely matches the regex (very )+(fat )?(tall|ugly) man: 1) very fat man; 2) fat tall man; 3) very very fat ugly man; 4) very very very tall man
  21. 21. Still awake?
  22. 22. Positive lookahead: X(?=X). Match something that is followed by something. Yamagata(?= Europe) matches the first Yamagata in "Yamagata Europe, Yamagata Intech Solutions". Negative lookahead: X(?!X). Match something that is not followed by something. Yamagata(?! Europe) matches the second Yamagata in "Yamagata Europe, Yamagata Intech Solutions"
  23. 23. Positive lookbehind: (?<=X)X. Match something following something. (?<=a)b matches the first b in thingamabob. Negative lookbehind: (?<!X)X. Match something not following something. (?<!a)b matches the last b in thingamabob
  24. 24. Round brackets create a backreference. You can use the backreference with a backslash + the number of the backreference. The regex Java(script) is a \1ing language matches "Javascript is a scripting language". The regex (Java)(script) is a \2ing language that is not the same as \1 matches "Javascript is a scripting language that is not the same as Java"
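  25. 25. Use the regex \b(\w+) \1\b to find doubled words. In the Dutch sentence "Ze streelde haar haar in in de auto." ("She stroked her hair in in the car.") it finds both "haar haar" and "in in". With exceptions: \b(?!haar\b)(\w+) \1\b finds only "in in" and skips the legitimate "haar haar"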
  26. 26. You want to add brackets around step numbers: "This is step 5 from chapter 1. Continue with step 45 from page 15." Use the regex ([sS]tep) (\d+) to find all instances. Replace it by \1 (\2). Or alternatively, replace (?<=[sS]tep )\d+ by (\0)
  27. 27. Powerful, for individual text-based files; more powerful, batch operations, command line; no back references; RegEx Text File Filter; RegEx search; very limited; powerful, called GREP
  28. 28. Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. -> Do not try to do everything in one uber-regex -> Regular expressions are not parsers
  29. 29. http://www.regular-expressions.info
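
The quiz on slides 17 to 20 can be checked mechanically. The sketch below uses Python's `re` module, which is only one of the flavours listed on slide 3, and it assumes that the question mark on slide 18 belongs to the regex (ab+c?). The names `QUIZ`, `complete_matches` and `answers` are made up for this illustration. "Completely matches" is read here as `re.fullmatch`, which requires the pattern to cover the whole string.

```python
import re

# Quiz regexes from slides 17 to 20, each with its candidate strings.
QUIZ = {
    r"a(ab)*a": ["abababa", "aaba", "aabbaa", "aba", "aabababa"],
    r"ab+c?": ["abc", "ac", "abbb", "bbc", "abbcc"],
    r"a.[bc]+": ["abc", "abbbbbbbb", "azc", "abcbcbcbc", "ac", "asccbbbbcbcccc"],
    r"(very )+(fat )?(tall|ugly) man": [
        "very fat man",
        "fat tall man",
        "very very fat ugly man",
        "very very very tall man",
    ],
}


def complete_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


answers = {pattern: complete_matches(pattern, cands) for pattern, cands in QUIZ.items()}

for pattern, matched in answers.items():
    print(f"{pattern}: {matched}")
```

Under these assumptions the answers are options 2 and 5 on slide 17, options 1 and 3 on slide 18, every option except 5 on slide 19, and options 3 and 4 on slide 20.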

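The lookaround and backreference slides (22 to 26) can be tried out in the same way. This is a minimal sketch in Python's `re` flavour, not the syntax of any particular tool from the slides: Python writes the whole match in a replacement string as `\g<0>` rather than `\0`, and it only accepts fixed-width lookbehinds, which `(?<=[sS]tep )` happens to be. All variable names are hypothetical.

```python
import re

names = "Yamagata Europe, Yamagata Intech Solutions"
# Lookahead: start positions of the Yamagata that is / is not followed by " Europe".
followed = [m.start() for m in re.finditer(r"Yamagata(?= Europe)", names)]      # [0]
not_followed = [m.start() for m in re.finditer(r"Yamagata(?! Europe)", names)]  # [17]

word = "thingamabob"
# Lookbehind: positions of the b that is / is not preceded by an a.
after_a = [m.start() for m in re.finditer(r"(?<=a)b", word)]      # [8]
not_after_a = [m.start() for m in re.finditer(r"(?<!a)b", word)]  # [10]

text = "This is step 5 from chapter 1. Continue with step 45 from page 15."
# Capturing groups: \1 is the word "step", \2 is the number.
with_groups = re.sub(r"([sS]tep) (\d+)", r"\1 (\2)", text)
# Lookbehind: only the digits are matched, so the whole match is wrapped.
with_lookbehind = re.sub(r"(?<=[sS]tep )\d+", r"(\g<0>)", text)

sentence = "Ze streelde haar haar in in de auto."
# \1 repeats whatever the first group captured, which finds doubled words.
doubled = [m.group(0) for m in re.finditer(r"\b(\w+) \1\b", sentence)]
# The negative lookahead excludes "haar" from the doubled-word search.
doubled_except_haar = [
    m.group(0) for m in re.finditer(r"\b(?!haar\b)(\w+) \1\b", sentence)
]

print(with_groups)
print(doubled, doubled_except_haar)
```

Both replacements produce "This is step (5) from chapter 1. Continue with step (45) from page 15.", leaving the chapter and page numbers untouched because they do not follow the word step.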