
Algorithms Lecture 8: Pattern Algorithms

We will discuss the following: Pattern searching, Naive Pattern Searching, Regular Expression.

Published in: Education

  1. Analysis and Design of Algorithms: Pattern Algorithms
  2. Pattern searching, Naive Pattern Searching, Regular Expression
  3. 3. Analysis and Design of Algorithms Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database, pattern searching algorithms are used to show the search results.
  4. Naive Pattern Searching
  5. Slide the pattern over the text one position at a time and check for a match at each position. When a match is found, slide by 1 again to check for subsequent matches.
  6. Example 1: Input: txt[] = "THIS IS A TEST TEXT", pat[] = "TEST". Output: Pattern found at index 10.
  7. Example 2: Input: txt[] = "AABAACAADAABAABA", pat[] = "AABA". Output: Pattern found at index 0, Pattern found at index 9, Pattern found at index 12.
  8. 8. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A  j i
  9. 9. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A  j i i+j
  10. 10. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A  j i i+j
  11. 11. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A  j i i+j
  12. 12. Analysis and Design of Algorithms  Pattern found at index 0 A A B A A C A A D A A B A A B A A A B A i
  13. 13. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  14. 14. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  15. 15. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  16. 16. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  17. 17. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  18. 18. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  19. 19. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  20. 20. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  21. 21. Analysis and Design of Algorithms  Pattern found at index 9 A A B A A C A A D A A B A A B A A A B A i
  22. 22. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  23. 23. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B A i
  24. 24. Analysis and Design of Algorithms  Pattern found at index 12 A A B A A C A A D A A B A A B A A A B A i
  25. 25. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A B i
  26. 26. Analysis and Design of Algorithms  Compare A A B A A C A A D A A B A A B A A A i
  27. Input: txt[] = "AABAACAADAABAABA", pat[] = "AABA". Output: Pattern found at index 0, Pattern found at index 9, Pattern found at index 12.
  28. Python Code: (slides 28-29; the code itself is not reproduced in this transcript)
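
Since the slide's code is not reproduced here, the following is only a minimal sketch of naive pattern searching as described in the slides above, not the lecture's own code. The function name `naive_search` and the choice to return a list of indices are assumptions made for illustration.

```python
def naive_search(pat, txt):
    """Return every index i at which pat occurs in txt (naive sliding scan)."""
    m, n = len(pat), len(txt)
    found = []
    # Try each shift i at which the whole pattern still fits inside the text.
    for i in range(n - m + 1):
        j = 0
        # Compare pat[j] with txt[i + j] until a mismatch or the pattern ends.
        while j < m and txt[i + j] == pat[j]:
            j += 1
        if j == m:
            found.append(i)
    return found


for index in naive_search("AABA", "AABAACAADAABAABA"):
    print("Pattern found at index", index)
```

Run on Example 2, this should print the three indices listed on the slide (0, 9, and 12). Overlapping matches are reported because the shift always advances by 1, even after a match.
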
  30. 30. Analysis and Design of Algorithms What is the best case? The best case occurs when the first character of the pattern is not present in text at all. txt[] = "AABCCAADDEE" pat[] = "FAA"
  31. 31. Analysis and Design of Algorithms What is the worst case ? 1) When all characters of the text and pattern are same. txt[] = "AAAAAAAAAAAAAAAAAA" pat[] = "AAAAA"
  32. 32. Analysis and Design of Algorithms 2) Worst case also occurs when only the last character is different. txt[] = "AAAAAAAAAAAAAAAAAB" pat[] = "AAAAB"
  33. 33. Analysis and Design of Algorithms The worst case is O(m*(n-m+1))
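
To make the best and worst cases concrete, the sketch below counts character comparisons for the slide examples. `count_comparisons` is a hypothetical helper, not part of the lecture; it assumes the naive algorithm stops comparing at the first mismatch within each shift.

```python
def count_comparisons(pat, txt):
    """Count character comparisons made by naive search (stop at first mismatch)."""
    m, n = len(pat), len(txt)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if txt[i + j] != pat[j]:
                break
    return comparisons


# Best case: the first pattern character never appears, one comparison per shift.
print(count_comparisons("FAA", "AABCCAADDEE"))

# Worst cases: every shift compares all m characters, m * (n - m + 1) in total.
print(count_comparisons("AAAAA", "AAAAAAAAAAAAAAAAAA"))
print(count_comparisons("AAAAB", "AAAAAAAAAAAAAAAAAB"))
```
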
  34. Regular Expression
  35. Regular expressions are a powerful language for matching text patterns.
  36. re.match() checks for a match only at the beginning of the string. re.search() checks for a match anywhere in the string.
  37. Import Regular Expression (Python): import re
  38. In Python a regular expression search is typically written as: match = re.search(pat, str)
  39. Simple match
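
The difference between re.match() and re.search() can be sketched as follows. The sample text is borrowed from Example 1. This is an illustration under that assumption, not the example from the original slide.

```python
import re

text = "THIS IS A TEST TEXT"

# re.match() only succeeds if the pattern matches at position 0.
print(re.match("TEST", text))          # None: the text does not start with "TEST"
print(re.match("THIS", text).group())  # THIS

# re.search() scans the whole string and returns the first match found.
match = re.search("TEST", text)
if match:
    print("found", match.group(), "at index", match.start())  # found TEST at index 10
else:
    print("did not find")
```
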
  40. 40. Analysis and Design of Algorithms w (lowercase w): matches a "word" character: a letter or digit or underbar [a-z A-Z 0-9 _].
  41. 41. Analysis and Design of Algorithms W (upper case W): matches any non-word character.
  42. 42. Analysis and Design of Algorithms d (lowercase d): decimal digit [0-9]
  43. 43. Analysis and Design of Algorithms s (lowercase s): matches a single whitespace character -- space, newline, return, tab, form [ nrtf].
  44. 44. Analysis and Design of Algorithms  . (dot) : matches any single character except newline 'n'
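
A short sketch of these character classes in use. The sample string is made up for illustration.

```python
import re

sample = "Room 42,\tfloor_3\n"

print(re.findall(r"\w+", sample))  # ['Room', '42', 'floor_3']
print(re.findall(r"\d", sample))   # ['4', '2', '3']
print(re.findall(r"\s", sample))   # [' ', '\t', '\n']
print(re.findall(r"\W", sample))   # [' ', ',', '\t', '\n']
print(re.findall(r"4.", sample))   # ['42']
print(re.search(r".", "\n"))       # None: dot does not match a newline
```

Raw strings (the r"..." prefix) keep the backslashes intact so that they reach the regular expression engine unchanged.
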
  45. 45. Analysis and Design of Algorithms  ^ = start : match the start Match is empty
  46. 46. Analysis and Design of Algorithms  $ = end : match the end of the string Match is empty
  47. 47. Analysis and Design of Algorithms  [ ] : a range of characters can be indicated by giving two characters and separating them by a “-”.
  48. 48. Analysis and Design of Algorithms  [ ]
  49. 49. Analysis and Design of Algorithms  [^ ] : will match any character except this.
  50. 50. Analysis and Design of Algorithms  {n} : match exactly with any number of n.
  51. 51. Analysis and Design of Algorithms  {n,m} : match exactly with any number of n to m.
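
The anchors, character sets, and repetition counts above can be tried out as follows. The sample strings are assumptions chosen for illustration.

```python
import re

# ^ and $ anchor the pattern to the start and end of the string.
print(bool(re.search(r"^AAB", "AABAAC")))   # True
print(bool(re.search(r"^AAC", "AABAAC")))   # False
print(bool(re.search(r"AAC$", "AABAAC")))   # True

# ^ on its own matches at position 0 and consumes nothing.
print(repr(re.search(r"^", "AABAAC").group()))  # ''

# [ ] matches one character from a set or range; [^ ] negates the set.
print(re.findall(r"[A-C]", "AXBYCZ"))    # ['A', 'B', 'C']
print(re.findall(r"[^A-C]", "AXBYCZ"))   # ['X', 'Y', 'Z']

# {n} and {n,m} bound how many times the preceding item repeats.
print(re.findall(r"\d{4}", "12-05-2007"))     # ['2007']
print(re.findall(r"\d{2,4}", "12-05-2007"))   # ['12', '05', '2007']
```
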
  52. Syntax  Description                              Equivalent
      \d      Matches any decimal digit                [0-9]
      \D      Matches any non-digit character          [^0-9]
      \s      Matches any whitespace character         [ \t\n\r\f\v]
      \S      Matches any non-whitespace character     [^ \t\n\r\f\v]
      \w      Matches any alphanumeric character       [a-zA-Z0-9_]
      \W      Matches any non-alphanumeric character   [^a-zA-Z0-9_]
  53. 53. Analysis and Design of Algorithms + : 1 or more occurrences of the pattern
  54. 54. Analysis and Design of Algorithms * : 0 or more occurrences of the pattern
  55. 55. Analysis and Design of Algorithms ? : match 0 or 1 occurrences of the pattern
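
A small sketch contrasting the three repetition operators on one made-up string:

```python
import re

text = "A AA AAB B"

print(re.findall(r"A+", text))    # ['A', 'AA', 'AA']
print(re.findall(r"A+B", text))   # ['AAB']
print(re.findall(r"A*B", text))   # ['AAB', 'B']
print(re.findall(r"A?B", text))   # ['AB', 'B']
```

With `A*B` and `A?B` the lone "B" still matches, because both operators allow zero occurrences of "A".
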
  56. 56. Analysis and Design of Algorithms  Emails:
  57. 57. Analysis and Design of Algorithms  Emails:  Square brackets can be used to indicate a set of chars, so [abc] matches 'a' or 'b' or 'c'.
  58. 58. Analysis and Design of Algorithms  Emails:  findall() finds all the matches and returns them as a list of strings.
  59. 59. Analysis and Design of Algorithms ( ): match group of pattern
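
The email slides can be illustrated with the sketch below. The pattern `[\w.-]+@[\w.-]+` is an assumed, deliberately simple pattern (it does not validate addresses), and the sample text and addresses are made up.

```python
import re

# Hypothetical sample text; the addresses are made up for illustration.
text = "Contact alice@example.com or bob.smith@mail.example.org for details."

# [\w.-]+ means one or more word characters, dots, or hyphens.
pattern = r"[\w.-]+@[\w.-]+"

# findall() returns every non-overlapping match as a list of strings.
emails = re.findall(pattern, text)
print(emails)  # ['alice@example.com', 'bob.smith@mail.example.org']

# Parentheses create groups, so the user and host parts can be read separately.
match = re.search(r"([\w.-]+)@([\w.-]+)", text)
if match:
    print(match.group(1))  # alice
    print(match.group(2))  # example.com
```
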
  60. Operators   Description
      .           Matches any single character except newline '\n'
      ?           Matches 0 or 1 occurrence of the pattern to its left
      +           Matches 1 or more occurrences of the pattern to its left
      *           Matches 0 or more occurrences of the pattern to its left
      \w          Matches an alphanumeric character
      \W          Matches a non-alphanumeric character
      \d          Matches a digit [0-9]
      \D          Matches a non-digit
  61. Operators   Description
      \s          Matches a single whitespace character (space, newline, tab)
      \S          Matches any non-whitespace character
      [..]        Matches any single character in the square brackets
      [^..]       Matches any single character not in the square brackets
      \           Used to escape characters that have a special meaning
      ^ and $     Match the start and end of the string, respectively
      {n,m}       Matches at least n and at most m occurrences of the preceding expression
      a|b         Matches either a or b
  62. Operators   Description
      ( )         Groups regular expressions and returns the matched text
      \t, \n, \r  Match tab, newline, return
  63. 63. Analysis and Design of Algorithms Validate a phone number (phone number must be of 10 digits and starts with 8 or 9)
  64. 64. Analysis and Design of Algorithms  Solution
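
The slide's solution is not reproduced in this transcript. One possible approach, assuming the number arrives as a plain string of digits with no spaces or separators, is sketched below. The helper name `is_valid_phone` is hypothetical.

```python
import re


def is_valid_phone(number):
    """Return True if number is exactly 10 digits and starts with 8 or 9."""
    # [89] matches the first digit; [0-9]{9} requires exactly nine more digits.
    # re.fullmatch() only succeeds if the whole string matches the pattern.
    return re.fullmatch(r"[89][0-9]{9}", number) is not None


print(is_valid_phone("9876543210"))   # True
print(is_valid_phone("7876543210"))   # False: starts with 7
print(is_valid_phone("98765432101"))  # False: 11 digits
```
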
  65. 65. Analysis and Design of Algorithms Return date from given string str= Amit 34-3456 12-05-2007, XYZ 56-4532 11- 11-2011
  66. 66. Analysis and Design of Algorithms  Solution
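
The slide's solution is not reproduced in this transcript. A minimal sketch, assuming every date is written as two digits, two digits, and four digits separated by hyphens:

```python
import re

text = "Amit 34-3456 12-05-2007, XYZ 56-4532 11-11-2011"

# Two digits, a hyphen, two digits, a hyphen, four digits.
dates = re.findall(r"\d{2}-\d{2}-\d{4}", text)
print(dates)  # ['12-05-2007', '11-11-2011']
```

The other numbers in the string (34-3456 and 56-4532) are skipped because they do not have two hyphen-separated groups followed by four digits.
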
  67. facebook.com/mloey, mohamedloey@gmail.com, twitter.com/mloey, linkedin.com/in/mloey, mloey@fci.bu.edu.eg, mloey.github.io
  68. THANKS FOR YOUR TIME