Python Programming - XI. String Manipulation and Regular Expressions


  1. 1. PYTHON PROGRAMMING Text Processing XI. String Manipulation and Regular Expressions Engr. Ranel O. Padon
  2. 2. PYTHON PROGRAMMING TOPICS I • Introduction to Python Programming II • Python Basics III • Controlling the Program Flow IV • Program Components: Functions, Classes, Packages, and Modules V • Sequences (List and Tuples), and Dictionaries VI • Object-Based Programming: Classes and Objects VII • Customizing Classes and Operator Overloading VIII • Object-Oriented Programming: Inheritance and Polymorphism IX • Randomization Algorithms X • Exception Handling and Assertions XI • String Manipulation and Regular Expressions XII • File Handling and Processing XIII • GUI Programming Using Tkinter
  3. 3. Text Processing String Manipulation Regular Expressions
  4. 4. TEXT PROCESSING * used to develop text editors, word processors, page-layout software, computerized typesetting systems, and other text-processing software * used to search for patterns in text * used to validate user input * used to process the contents of text files
  5. 5. STRING MANIPULATION Strings are made up of Characters. Characters are made up of: Digits (0, 1, 2, …, 9) Letters (a, b, c, …, z) Symbols (@, *, #, $, %, &, …)
  6. 6. String Methods
  7. 7. String Methods
  8. 8. String Methods
  9. 9. String Methods
  10. 10. String Methods
  11. 11. STRING MANIPULATION | Samples
  12. 12. STRING MANIPULATION | Samples
  13. 13. STRING MANIPULATION | Samples
  14. 14. STRING MANIPULATION | Samples
  15. 15. STRING MANIPULATION | Samples
  16. 16. STRING MANIPULATION | Samples
  17. 17. STRING MANIPULATION | Samples
  18. 18. STRING MANIPULATION | Samples
  19. 19. STRING MANIPULATION | Samples
  20. 20. STRING MANIPULATION | Samples
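The bodies of the String Methods and Samples slides were images and did not survive in this transcript. As a stand-in, here is a minimal sketch of a few common built-in `str` methods. It assumes Python 3, and the sample text is made up for illustration; it is not the author's original sample code.

```python
# Python 3 assumed; the sample text is illustrative only.
text = "  Python Programming: Text Processing  "

stripped = text.strip()                    # drop leading/trailing whitespace
print(stripped.upper())                    # all uppercase
print(stripped.lower())                    # all lowercase
print(stripped.find("Text"))               # index of first match, or -1 if absent
print(stripped.count("P"))                 # number of non-overlapping occurrences
print(stripped.replace("Text", "String"))  # returns a new string

words = stripped.split()                   # split on runs of whitespace
print(words)
print("-".join(words))                     # glue the pieces back together
print(stripped.startswith("Python"), stripped.endswith("ing"))
```

Strings are immutable, so every method above returns a new string (or a list, or a number) rather than changing `text` in place.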
  21. 21. REGULAR EXPRESSIONS To test whether a string contains a day of the week using string methods alone, a program has to check for “Monday”, “Tuesday”, and so on, which means calling the find() method seven times. A regular expression solves the same problem with a single pattern.
  22. 22. REGULAR EXPRESSIONS * use string methods for simple text processing * string methods are more readable and simpler than regular expressions
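The day-of-the-week comparison from the two slides above can be sketched as follows. The function names and the `DAYS` list are hypothetical helpers introduced for illustration; Python 3 is assumed.

```python
import re

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def has_day_with_find(text):
    """String-method version: try find() once per day name."""
    for day in DAYS:
        if text.find(day) != -1:
            return True
    return False

# Regex version: one pattern built from the same names with the | operator.
DAY_PATTERN = re.compile("|".join(DAYS))

def has_day_with_regex(text):
    """Regex version: a single search over the text."""
    return DAY_PATTERN.search(text) is not None

print(has_day_with_find("The exam is on Friday."))
print(has_day_with_regex("The exam is on Friday."))
print(has_day_with_regex("No date has been set."))
```

For a single fixed substring, a plain `in` test or `find()` is usually easier to read; the pattern starts to pay off once there are several alternatives.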
  23. 23. REGULAR EXPRESSION * a text pattern that a program uses to find substrings that match it * an expression that specifies a set of strings * a pattern-matching mechanism * also known as regex * introduced in the 1950s as part of formal language theory
  24. 24. REGULAR EXPRESSIONS * very powerful: hundreds of lines of code can sometimes be reduced to a single regular expression * used to construct compilers, interpreters, text editors, … * used to search for and match text patterns * used to validate text data formats, especially input data
  25. 25. REGULAR EXPRESSIONS Popular programming languages have RegEx capabilities: Perl, JavaScript, PHP, Python, Ruby, Tcl, Java, C, C++, C#, .NET, …
  26. 26. REGEX Popular programming languages have RegEx capabilities: Perl, JavaScript, PHP, Python, Ruby, Tcl, Java, C, C++, C#, .NET, …
  27. 27. REGEX | General Concepts • Alternative • Grouping • Quantification • Anchors • Meta-characters • Character Classes
  28. 28. REGEX | General Concepts • Alternative: | • Grouping: () • Quantification: ? + * {m,n} • Anchors: ^ $ • Meta-characters: . [ ] [-] [^ ] • Character Classes: \w \d \s \W …
  29. 29. REGEX | Alternative “ranel|ranilio” == “ranel” or “ranilio” “gray|grey” == “gray” or “grey”
  30. 30. REGEX | Grouping “ran(el|ilio)” == “ranel” or “ranilio” “gr(a|e)y” == “gray” or “grey” “ra(mil|n(ny|el))” == “ramil” or “ranny” or “ranel”
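The alternative and grouping examples above can be checked with Python's `re` module. This is a minimal sketch assuming Python 3; `matches` is a hypothetical helper that uses `re.fullmatch` so the whole string has to fit the pattern.

```python
import re

def matches(pattern, text):
    """Return True only if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# Alternative: either side of | may match.
print(matches("gray|grey", "grey"))           # True
print(matches("ranel|ranilio", "ranel"))      # True

# Grouping: parentheses limit how far the | reaches.
print(matches("gr(a|e)y", "gray"))            # True
print(matches("ran(el|ilio)", "ranilio"))     # True

# Groups can nest.
print(matches("ra(mil|n(ny|el))", "ranny"))   # True
print(matches("ra(mil|n(ny|el))", "ramel"))   # False
```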
  31. 31. REGEX | Quantification | ? ? == zero or one of the preceding element “rani?el” == “raniel” or “ranel” “colou?r” == “colour” or “color”
  32. 32. REGEX | Quantification | * * == zero or more of the preceding element “goo*gle” == “gogle” or “google” or “gooooogle” “(ha)*” == “” or “ha” or “haha” or “hahahahaha” “12*3” == “13” or “1223” or “12223”
  33. 33. REGEX | Quantification | + + == one or more of the preceding element “goo+gle” == “google” or “gooogle” or “gooooogle” “(ha)+” == “ha” or “haha” or “hahahahaha” “12+3” == “123” or “1223” or “12223”
  34. 34. REGEX | Quantification | {m,n} {m,n} == m to n times of the preceding element (written without spaces inside the braces) “go{2,3}gle” == “google” or “gooogle” “6{3,6}” == “666” or “6666” or “66666” or “666666” “5{3}” == “555” “a{2,}” == “aa” or “aaa” or “aaaa” or “aaaaa” …
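The four quantifiers above can be tried out side by side. This is a minimal sketch assuming Python 3; `full` is a hypothetical helper around `re.fullmatch`.

```python
import re

def full(pattern, text):
    """Return True only if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# ?  zero or one
print(full("colou?r", "color"), full("colou?r", "colour"))    # True True

# *  zero or more
print(full("goo*gle", "gogle"), full("(ha)*", ""))            # True True

# +  one or more
print(full("goo+gle", "gogle"), full("(ha)+", "haha"))        # False True

# {m,n}  between m and n repetitions; {m} exactly m; {m,} at least m
print(full("go{2,3}gle", "gooogle"), full("go{2,3}gle", "gogle"))  # True False
print(full("5{3}", "555"), full("a{2,}", "a"))                     # True False

# In Python's re module, a space inside the braces turns the
# quantifier into literal text, so this does not match.
print(full("go{2, 3}gle", "google"))                          # False
```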
  35. 35. REGEX | Anchors | ^ ^ == matches the starting position within the string “^laman” == “lamang” or “lamang-loob” or “lamang-lupa” “^2013” == “2013”, “2013-12345”, “2013/1320”
  36. 36. REGEX | Anchors | $ $ == matches the ending position within the string “laman$” == “halaman” or “kaalaman” “2013$” == “2013”, “777-2013”, “0933-445-2013”
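The two anchors can be sketched with `re.search`, which scans the whole string unless an anchor pins the match to one end. Python 3 is assumed; `found` is a hypothetical helper.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches anywhere the anchors allow."""
    return re.search(pattern, text) is not None

# ^ pins the match to the start of the string.
print(found("^laman", "lamang-loob"))     # True
print(found("^laman", "halaman"))         # False: "laman" is not at the start
print(found("^2013", "777-2013"))         # False

# $ pins the match to the end of the string.
print(found("laman$", "halaman"))         # True
print(found("laman$", "lamang"))          # False: a "g" follows
print(found("2013$", "0933-445-2013"))    # True
```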
  37. 37. REGEX | Meta-characters | . . == matches any single character (exactly one, so “ala” alone does not match “ala.”) “ala.” == “alat” or “alas” or “ala2” “1.3” == “123” or “143” or “1s3”
  38. 38. REGEX | Meta-characters | [ ] [ ] == matches a single character that is contained within the brackets. “[abc]” == “a” or “b” or “c” “[aoieu]” == any vowel “[0123456789]” == any digit
  39. 39. REGEX | Meta-characters | [ - ] [ - ] == matches a single character that is contained within the brackets and the specified range. “[a-c]” == “a” or “b” or “c” “[a-z]” == all alphabet letters (lowercase only) “[a-zA-Z]” == all letters (lowercase & uppercase) “[0-9]” == all digits
  40. 40. REGEX | Meta-characters | [^ ] [^ ] == matches a single character that is not contained within the brackets. “[^aeiou]” == any non-vowel “[^0-9]” == any non-digit “[^abc]” == any character, but not “a”, “b”, or “c”
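The dot and the three bracket forms above can be sketched as follows, assuming Python 3. The sample strings are made up for illustration.

```python
import re

# .  exactly one character of any kind (newline excluded by default)
print(re.fullmatch("1.3", "1s3") is not None)    # True
print(re.fullmatch("ala.", "ala") is not None)   # False: nothing left for the dot

# [ ]  one character from the listed set
print(re.findall("[aeiou]", "regular expression"))

# [ - ]  one character from a range
print(re.findall("[a-c]", "abcdef"))             # ['a', 'b', 'c']
print(re.findall("[0-9]", "Room 42B"))           # ['4', '2']

# [^ ]  one character NOT in the set; here, strip every non-digit
print(re.sub("[^0-9]", "", "0933-445-2013"))     # 09334452013
```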
  41. 41. REGEX | Character Classes Character classes specify a group of characters to match in a string
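The table on this slide did not survive in the transcript. As a stand-in, here is a short sketch of the four classes named in the summary slides (`\d`, `\w`, `\s`, `\W`) as Python 3's `re` module treats them. The sample strings are made up for illustration.

```python
import re

# \d  a digit
print(re.findall(r"\d+", "Slide 44 of 62"))            # ['44', '62']

# \w  a "word" character: letter, digit, or underscore
print(re.findall(r"\w+", "ranel_padon rr2000!"))       # ['ranel_padon', 'rr2000']

# \s  a whitespace character (space, tab, newline, ...)
print(re.split(r"\s+", "text   processing\twith regex"))

# \W  the opposite of \w (uppercase negates the class)
print(re.findall(r"\W", "a-b c!"))                     # ['-', ' ', '!']
```

In Python 3 these classes are Unicode-aware for `str` patterns, so `\d` and `\w` also match digits and letters outside ASCII unless the `re.ASCII` flag is set.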
  42. 42. REGEX | Summary • Alternative: | • Grouping: () • Quantification: ? + * {m,n} • Anchors: ^ $ • Meta-characters: . [ ] [-] [^ ] • Character Classes: \w \d \s \W …
  43. 43. REGEX | Combo
  44. 44. REGEX | Date Validation “1/3/2013” or “24/2/2020” (\d{1,2}/\d{1,2}/\d{4})
  45. 45. REGEX | Alphanumeric, -, & _ “rr2000” or “ranel_padon” or “Oblan-Padon” ([a-zA-Z0-9_-]+)
  46. 46. REGEX | Numbers in 1 to 50 “1” or “50” or “14” (^[1-9]{1}$|^[1-4]{1}[0-9]{1}$|^50$)
  47. 47. REGEX | HTML Tags “<title>” or “<strong>” or “</body>” (<(/?[^>]+)>)
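The four combined patterns above can be exercised in Python 3 as sketched below. The constant names are hypothetical, and the backslashes in the date pattern are restored on the assumption that `\d` was intended. The outer parentheses and the redundant `{1}` quantifiers are dropped because they do not change what matches.

```python
import re

DATE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")
NAME = re.compile(r"[a-zA-Z0-9_-]+")            # hyphen last, so it is literal
ONE_TO_FIFTY = re.compile(r"^[1-9]$|^[1-4][0-9]$|^50$")
TAG = re.compile(r"<(/?[^>]+)>")

# The date pattern checks the shape only, not whether the date exists.
print(DATE.fullmatch("24/2/2020") is not None)      # True
print(DATE.fullmatch("2020-02-24") is not None)     # False
print(DATE.fullmatch("99/99/2020") is not None)     # True: shape is fine

print(NAME.fullmatch("Oblan-Padon") is not None)    # True
print(NAME.fullmatch("ranel padon") is not None)    # False: contains a space

print([n for n in ["0", "1", "14", "50", "51"] if ONE_TO_FIFTY.search(n)])

# findall returns the captured group: the tag name, with "/" for closing tags.
print(TAG.findall("<title>Regex</title><body>"))
```

For range checks such as 1 to 50, converting with `int()` and comparing is often simpler; the regex form is mainly useful when validation has to stay inside a pattern.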
  48. 48. PYTHON REGEX | Raw String
  49. 49. PYTHON REGEX | Raw String r Two Solutions:
  50. 50. PYTHON REGEX | Raw String r Raw Strings are used for enhancing readability.
  51. 51. PYTHON REGEX | Raw String
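The code on the Raw String slides was lost in this transcript, so the following is only a guess at what the "two solutions" were: doubling each backslash, or prefixing the literal with `r`. The sketch assumes Python 3 and uses the word-boundary escape `\b` as the example.

```python
import re

# In a normal string literal, "\b" is a single backspace character,
# so the regex engine never sees a word-boundary escape.
plain = "\bcat\b"
print(len(plain))                                   # 5

# Solution 1: double every backslash.
escaped = "\\bcat\\b"

# Solution 2: use a raw string, which leaves backslashes alone.
raw = r"\bcat\b"

print(len(raw))                                     # 7
print(escaped == raw)                               # True: same pattern text

print(re.search(raw, "a cat sat") is not None)      # True
print(re.search(plain, "a cat sat") is not None)    # False
print(re.search(raw, "concatenate") is not None)    # False: no word boundary
```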
  52. 52. PYTHON REGEX | The re Module
  53. 53. PYTHON REGEX | Samples
  54. 54. PYTHON REGEX | Samples
  55. 55. PYTHON REGEX | Samples
  56. 56. PYTHON REGEX | Samples
  57. 57. PYTHON REGEX | Samples
  58. 58. PYTHON REGEX | Samples
  59. 59. PYTHON REGEX | Samples
  60. 60. PYTHON REGEX | Samples
  61. 61. PYTHON REGEX | Samples
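The sample code on the preceding slides was lost in this transcript. As a stand-in, here is a minimal sketch of the most commonly used functions of Python's `re` module, assuming Python 3. The text and the `phone` pattern are made up for illustration and are not the author's original samples.

```python
import re

text = "Call 0933-445-2013 or 777-2013 before 24/2/2020."

# compile() builds a reusable pattern object.
phone = re.compile(r"\d+(?:-\d+)+")       # digit runs joined by hyphens

# match() only looks at the start of the string; search() scans all of it.
print(re.match(r"Call", text) is not None)     # True
print(re.match(r"\d", text) is not None)       # False
print(phone.search(text).group())              # 0933-445-2013

# findall() returns every non-overlapping match.
print(phone.findall(text))                     # ['0933-445-2013', '777-2013']

# sub() replaces matches; split() cuts the string at matches.
print(phone.sub("[hidden]", text))
print(re.split(r"\s+or\s+", "gray or grey"))   # ['gray', 'grey']

# Parentheses capture pieces of a match for later use.
m = re.search(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)
print(m.groups())                              # ('24', '2', '2020')
```

`search()` and `match()` return `None` when nothing matches, so real code should check the result before calling `.group()`.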
  62. 62. REFERENCES • Deitel, Deitel, Liperi, and Wiedermann - Python: How to Program (2001). • Disclaimer: Most of the images/information used here have no proper source citation, and I do not claim ownership of these either. I don’t want to reinvent the wheel, and I just want to reuse and reintegrate materials that I think are useful or cool, then present them in another light, form, or perspective. Moreover, the images/information here are mainly used for illustration/educational purposes only, in the spirit of openness of data, spreading light, and empowering people with knowledge.
