maXbox Regex
THE DELPHI SYSTEM
Regular Expressions
REGULAR EXPRESSION
 A pattern of special characters used to match
 strings in a search
 Typically made up from special characters called
 metacharacters

 Regular expressions are used throughout Linux:
   Utilities: ed, vi, grep, egrep, sed, and awk
   You can validate e-mail addresses, extract phone numbers or ZIP
   codes from web pages or documents, search for complex patterns
   in log files, and anything else you can imagine. Rules (templates)
   can be changed without recompiling your program (e.g. an IBAN
   template).
REGULAR EXPRESSION IN DELPHI
 The regular expression engine in Delphi XE is PCRE (Perl
 Compatible Regular Expressions). It's a fast engine, compliant with
 generally accepted regex syntax, which has been around
 for many years. Users of earlier versions of Delphi can use it with
 TPerlRegEx, a Delphi class wrapper around it.

 maXbox Delphi System
 www.softwareschule.ch/maxbox.htm

 Regular expressions are used in editors:
    Search for empty lines: '^$'
    Utilities: ed, vi, grep, egrep, sed, TRex and awk
    The main type in RegularExpressions.pas is TRegEx.
    TRegEx is a record with a bunch of methods and static class
    methods for matching with regular expressions.
REGEX CORE UNIT IN DELPHI
 I hope it helps! I've always used the RegularExpressionsCore unit
 rather than the higher level stuff because the core unit is
 compatible with the unit that Jan Goyvaerts has provided for free
 for years. That was my introduction to regular expressions. So I
 forgot about the other unit. I guess there's either a bug or it just
 doesn't work the way one might expect.

    Record: G:= TRegEx.Match(...)
    Object: regEx:= TPerlRegEx.Create;
      Ex. 309_regex_powertester2.txt
    RegExStudio: TRegExpr.Create
METACHARACTERS


RE Metacharacter   Matches…
        .          Any one character, except newline
      [a-z]        Any one of the enclosed characters (e.g. a-z)
        *          Zero or more of the preceding character
     \? or ?       Zero or one of the preceding character
     \+ or +       One or more of the preceding character

 Any non-metacharacter matches itself.
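The table above can be tried out with a small sketch. It uses Python's standard `re` module as a stand-in (an assumption, not the Delphi API); like PCRE, Python writes `?` and `+` without a backslash. The names `CASES` and `check` are made up for this illustration.

```python
import re

# Each entry: (pattern, text that should match in full, text that should not).
CASES = [
    (r"c.t",     "cat",    "ct"),       # . = any one character except newline
    (r"[a-z]",   "q",      "Q"),        # class: any one of the enclosed characters
    (r"ab*c",    "abbbc",  "adc"),      # * = zero or more of the preceding character
    (r"colou?r", "color",  "colouur"),  # ? = zero or one of the preceding character
    (r"go+gle",  "google", "ggle"),     # + = one or more of the preceding character
]

def check(cases):
    """Return True if every pattern fully matches its good sample and not its bad one."""
    return all(
        re.fullmatch(pattern, good) is not None
        and re.fullmatch(pattern, bad) is None
        for pattern, good, bad in cases
    )

print(check(CASES))  # True
```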
THE PURPOSE
 There are 3 main operators that use regular
 expressions:
     1. matching, which returns TRUE if a match
 is found and FALSE if no match is found
     2. substitution, which substitutes one pattern
 of characters for another within a string
     3. split, which separates a string into a series
 of substrings
 Regular expressions are composed of characters,
 character classes, metacharacters, quantifiers,
 and assertions.
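A minimal sketch of the three operations, again in Python's `re` module rather than Delphi (an assumption made so the example is runnable anywhere). The sample line is taken from the grep data file used later in the deck; the `<amount>` placeholder is invented.

```python
import re

line = "northwest NW Charles Main 300000.00"

# 1. Matching: a match object means TRUE, None means FALSE.
found = re.search(r"north", line) is not None

# 2. Substitution: replace one pattern of characters with another.
replaced = re.sub(r"[0-9]+\.[0-9]+", "<amount>", line)

# 3. Split: separate the string into substrings at each run of spaces.
fields = re.split(r" +", line)

print(found)     # True
print(replaced)  # northwest NW Charles Main <amount>
print(fields)    # ['northwest', 'NW', 'Charles', 'Main', '300000.00']
```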
MORE METACHARACTERS
RE Metacharacter    Matches…
          ^         Beginning of line
          $         End of line
        \char       Escape the meaning of the char following it
         [^]        One character not in the set
         \<         Beginning-of-word anchor
         \>         End-of-word anchor
     ( ) or \( \)   Tags matched characters to be used later (max = 9)
       | or \|      Or grouping
       x\{m\}       Repetition of character x, m times (x, m = integer)
       x\{m,\}      Repetition of character x, at least m times
       x\{m,n\}     Repetition of character x, between m and n times
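As a rough illustration of the anchors and counted repetition, here is a sketch in Python's `re` module (an assumption; it is not the grep or Delphi syntax). Python, like PCRE, writes braces without backslashes and uses `\b` at both ends of a word where grep uses the word anchors from the table.

```python
import re

text = "north northeast 455000.50"

# ^ and $ anchor the pattern to the beginning and end of the line.
starts_with_north = re.search(r"^north", text) is not None
ends_with_two_digits = re.search(r"[0-9]{2}$", text) is not None

# \b is a word boundary: "north" matches as a word, "northeast" does not.
whole_words = re.findall(r"\bnorth\b", text)

# x{m}, x{m,}, x{m,n}: counted repetition of the preceding item.
exactly_six = re.findall(r"\b[0-9]{6}\b", text)
two_or_more_zeros = re.findall(r"0{2,}", text)
one_or_two_fives = re.findall(r"5{1,2}", text)

print(starts_with_north, ends_with_two_digits)  # True True
print(whole_words)        # ['north']
print(exactly_six)        # ['455000']
print(two_or_more_zeros)  # ['000']
print(one_or_two_fives)   # ['55', '5']
```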
Regular Expression




An atom specifies what text is to be matched and where it is to be found.

An operator combines regular expression atoms.
Atoms
An atom specifies what text is to be matched and where it is to be found.
Single-Character Atom
A single character matches itself.
Dot Atom
Matches any single character except for a newline character (\n).
Class Atom
    Matches a single character that can be any of
    the characters defined in a set:
           Example: [ABC] matches either A, B, or C.

                          Notes:
1) A range of characters is indicated by a dash, e.g. [A-Q]
2) Characters can be excluded from the set, e.g.
     [^0-9] matches any character other than a digit.
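The class atom and both notes can be checked with a short sketch in Python's `re` module (used here as an assumption; the class syntax is the same in PCRE). The sample words are invented.

```python
import re

# [ABC] matches exactly one character: A, B, or C.
set_hits = re.findall(r"[ABC]", "CABLE")

# [A-Q] is a range; U, R and T fall outside it.
range_hits = re.findall(r"[A-Q]", "QUART")

# [^0-9] excludes digits: any one character that is not 0-9.
non_digits = re.findall(r"[^0-9]", "a1b22c")

print(set_hits)    # ['C', 'A', 'B']
print(range_hits)  # ['Q', 'A']
print(non_digits)  # ['a', 'b', 'c']
```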
Example: Classes




REPLACEMENT
 procedure DelphiPerlRegex;
 begin
  with TPerlRegEx.Create do try
   Options:= Options + [preUnGreedy];
   Subject:= 'I like to sing out at Foo bar';
   RegEx:= '([A-Za-z]+) bar';
   // \1 inserts the text captured by the first group
   Replacement:= '\1 is the name of the bar I like';
   if Match then ShowMessageBig(ComputeReplacement);
  finally
   Free;
  end;
 end;
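For readers without maXbox at hand, the same replacement idea can be sketched in Python's `re` module (an assumption, not the TPerlRegEx API). `Match.expand` plays roughly the role of `ComputeReplacement` here: it fills the replacement template from the current match.

```python
import re

subject = "I like to sing out at Foo bar"

# ([A-Za-z]+) captures the word before " bar".
match = re.search(r"([A-Za-z]+) bar", subject)

# \1 in the template is replaced by the text of the first group.
replacement = match.expand(r"\1 is the name of the bar I like")

print(match.group(1))  # Foo
print(replacement)     # Foo is the name of the bar I like
```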
Anchors
Anchors tell where the next character in the pattern must be located in the text data.
BACK REFERENCES: \N
 used to retrieve saved text in one of nine buffers
 can refer to the text in a saved buffer by using a
 back reference:
 ex.: \1 \2 \3 ... \9

 more details on this later
 rex:= '.*(es).*\1.*'; //subpattern   Ex.
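The back reference pattern from the slide can be tried in Python's `re` module (an assumption; `\1` means the same thing there as in PCRE). The two sample sentences are invented.

```python
import re

# Group 1 saves "es"; \1 requires the same text to appear again later.
rex = r".*(es).*\1.*"

twice = re.fullmatch(rex, "yes it does") is not None
once = re.fullmatch(rex, "yes it is") is not None

print(twice)  # True: "es" occurs in "yes" and in "does"
print(once)   # False: "es" occurs only once
```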
Operators




Sequence Operator

The sequence operator is implicit: when a series of atoms appears in
a regular expression, no operator is written between them.
Alternation Operator: | or \|
    The alternation operator (| or \|) is used to define one
    or more alternatives.

    Note: depends on version of
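A one-line check of alternation, sketched in Python's `re` module (an assumption; Python and PCRE both use the unescaped `|`). The sample text is invented.

```python
import re

# UNIX|unix accepts either alternative, but not the mixed-case "Unix".
pattern = r"UNIX|unix"
hits = re.findall(pattern, "UNIX and unix, not Unix")

print(hits)  # ['UNIX', 'unix']
```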
Repetition Operator: {…}
The repetition operator specifies that the atom or expression
immediately before the repetition may be repeated.

Basic Repetition Forms

Short Form Repetition Operators: * + ?
Group Operator
When a group of characters is enclosed in parentheses, the next
operator applies to the whole group, not only to the previous character.

     Note: depends on version of “unit”
            use \( and \) instead
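The effect of grouping on the following operator can be seen in a small sketch, written in Python's `re` module as an assumption (plain parentheses, as in PCRE).

```python
import re

# Without a group, + applies only to the preceding character "b".
plain_abab = re.fullmatch(r"ab+", "abab") is not None
plain_abbb = re.fullmatch(r"ab+", "abbb") is not None

# With a group, + applies to the whole sequence "ab".
group_abab = re.fullmatch(r"(ab)+", "abab") is not None

print(plain_abab)  # False
print(plain_abbb)  # True
print(group_abab)  # True
```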
DELPHI DETAIL AND EXAMPLES

 For new code written in Delphi XE, you should
 definitely use the RegularExpressions unit that
 is part of Delphi rather than one of the many 3rd
 party units that may be available. But if you're
 dealing with UTF-8 data, use the
 RegularExpressionsCore unit to avoid needless
 UTF-8 to UTF-16 to UTF-8 conversions.
COMMONLY USED “D” OPTIONS:
 -c   CL.AddTypeS('TPerlRegExOption',
      '( preCaseLess, preMultiLine, preSingleLine, preExtended, '
     +'preAnchored, preUnGreedy, preNoAutoCapture )');
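Four of these options correspond to standard PCRE flags that also exist in Python's `re` module. The sketch below uses the Python flags as an approximate illustration only (an assumption; it does not call TPerlRegEx), and the sample text is invented.

```python
import re

text = "North\nnorth pole"

# preCaseLess ~ re.IGNORECASE: letters match regardless of case.
caseless = re.findall(r"north", text, re.IGNORECASE)

# preMultiLine ~ re.MULTILINE: ^ and $ also match at line breaks.
multiline = re.findall(r"^north", text, re.MULTILINE)

# preSingleLine ~ re.DOTALL: the dot also matches the newline.
singleline = re.search(r"North.north", text, re.DOTALL) is not None

# preExtended ~ re.VERBOSE: whitespace and comments in the pattern are ignored.
extended = re.findall(r"n o r t h  # spelled out", text, re.VERBOSE)

print(caseless)    # ['North', 'north']
print(multiline)   # ['north']
print(singleline)  # True
print(extended)    # ['north']
```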
EXAMPLE: WITH \<north\>
% cat grep-datafile
northwest       NW      Charles Main                          300000.00
western         WE      Sharon Gray                           53000.89
southwest       SW      Lewis Dalsass                         290000.73
southern        SO      Suan Chin                             54500.10
southeast       SE      Patricia Hemenway                     400000.00
eastern         EA      TB Savage                             440500.45
northeast       NE      AM Main Jr.                           57800.10
north           NO      Ann Stephens                          455000.50
central         CT      KRush                                 575500.70
Extra [A-Z]****[0-9]..$5.00

            Print the line if it contains the word “north”.

% grep '\<north\>' grep-datafile
north           NO      Ann Stephens                          455000.50

C309_regex_powertester2.txt
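The same search can be reproduced without grep. The sketch below is a Python stand-in (an assumption, not the maXbox script): the helper `grep` is made up for this example, the data file is embedded with single spaces between columns, and `\b` replaces grep's word anchors.

```python
import re

DATAFILE = """\
northwest NW Charles Main 300000.00
western WE Sharon Gray 53000.89
southwest SW Lewis Dalsass 290000.73
southern SO Suan Chin 54500.10
southeast SE Patricia Hemenway 400000.00
eastern EA TB Savage 440500.45
northeast NE AM Main Jr. 57800.10
north NO Ann Stephens 455000.50
central CT KRush 575500.70
Extra [A-Z]****[0-9]..$5.00
"""

def grep(pattern, text):
    """Return the lines of text in which pattern is found."""
    return [line for line in text.splitlines() if re.search(pattern, line)]

print(grep(r"\bnorth\b", DATAFILE))  # ['north NO Ann Stephens 455000.50']
```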
EXAMPLE: EGREP WITH +
% cat grep-datafile
northwest       NW      Charles Main                       300000.00
western         WE      Sharon Gray                        53000.89
southwest       SW      Lewis Dalsass                      290000.73
southern        SO      Suan Chin                          54500.10
southeast       SE      Patricia Hemenway                  400000.00
eastern         EA      TB Savage                          440500.45
northeast       NE      AM Main Jr.                        57800.10
north           NO      Ann Stephens                       455000.50
central         CT      KRush                              575500.70
Extra [A-Z]****[0-9]..$5.00

             Print all lines containing one or more 3's.

% egrep '3+' grep-datafile
northwest       NW      Charles Main                       300000.00
western         WE      Sharon Gray                        53000.89
southwest       SW      Lewis Dalsass                      290000.73
                    Note: line works with .*
EXAMPLE: EGREP WITH RE: ?
% cat grep-datafile
northwest       NW      Charles Main                                   300000.00
western         WE      Sharon Gray                                    53000.89
southwest       SW      Lewis Dalsass                                  290000.73
southern        SO      Suan Chin                                      54500.10
southeast       SE      Patricia Hemenway                              400000.00
eastern         EA      TB Savage                                      440500.45
northeast       NE      AM Main Jr.                                    57800.10
north           NO      Ann Stephens                                   455000.50
central         CT      KRush                                          575500.70
Extra [A-Z]****[0-9]..$5.00

  Print all lines containing a 2, followed by zero or one period, followed by a digit.

% egrep '2\.?[0-9]' grep-datafile
southwest       SW      Lewis Dalsass                                   290000.73
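A quick way to confirm this result is the following sketch in Python's `re` module (an assumption standing in for egrep). The data file is embedded with single spaces between columns, and only the first column of each hit is printed.

```python
import re

DATAFILE = """\
northwest NW Charles Main 300000.00
western WE Sharon Gray 53000.89
southwest SW Lewis Dalsass 290000.73
southern SO Suan Chin 54500.10
southeast SE Patricia Hemenway 400000.00
eastern EA TB Savage 440500.45
northeast NE AM Main Jr. 57800.10
north NO Ann Stephens 455000.50
central CT KRush 575500.70
Extra [A-Z]****[0-9]..$5.00
"""

# A 2, then an optional literal period, then one digit.
pattern = re.compile(r"2\.?[0-9]")

hits = [line.split()[0] for line in DATAFILE.splitlines() if pattern.search(line)]
print(hits)  # ['southwest']

# The optional period lets both forms match.
print(pattern.search("29") is not None, pattern.search("2.9") is not None)  # True True
```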
                        Note: Delphi PCRE grep Test
EXAMPLE: FGREP
% cat grep-datafile
northwest       NW      Charles Main                                    300000.00
western         WE      Sharon Gray                                     53000.89
southwest       SW      Lewis Dalsass                                   290000.73
southern        SO      Suan Chin                                       54500.10
southeast       SE      Patricia Hemenway                               400000.00
eastern         EA      TB Savage                                       440500.45
northeast       NE      AM Main Jr.                                     57800.10
north           NO      Ann Stephens                                    455000.50
central         CT      KRush                                           575500.70
Extra [A-Z]****[0-9]..$5.00

 Find all lines in the file containing the literal string “[A-Z]****[0-9]..$5.00”. All
       characters are treated as themselves. There are no special characters.


% fgrep '[A-Z]****[0-9]..$5.00' grep-datafile
Extra [A-Z]****[0-9]..$5.00
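Regex engines have no fgrep mode of their own, but a literal search can be approximated by escaping every metacharacter first. The sketch below uses Python's `re.escape` (an assumption; it is not how fgrep works internally) on two lines of the data file.

```python
import re

LINES = [
    "north NO Ann Stephens 455000.50",
    "Extra [A-Z]****[0-9]..$5.00",
]

literal = "[A-Z]****[0-9]..$5.00"

# re.escape puts a backslash before every character with a special meaning,
# so the whole string is treated as itself.
pattern = re.compile(re.escape(literal))

hits = [line for line in LINES if pattern.search(line)]
print(hits)  # ['Extra [A-Z]****[0-9]..$5.00']
```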
EXAMPLE: Mail Finder
procedure delphiRegexMailfinder;
begin
  // Initialize a test string to include some email addresses. This
  // would normally be your eMail.
  TestString:= '<one@server.domain.xy>, another@otherserver.xyz';
  PR:= TPerlRegEx.Create;
  try
    PR.RegEx:= '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b';
    PR.Options:= PR.Options + [preCaseLess];
    PR.Compile;
    PR.Subject:= TestString; // <-- tell PR where to look for matches
    if PR.Match then begin
       WriteLn(PR.MatchedText); // Extract first address
       while PR.MatchAgain do
         WriteLn(PR.MatchedText); // Extract subsequent addresses
    end;
  finally
    PR.Free;
  end;
  //Readln;
end;
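The pattern itself can be checked outside Delphi. This is a sketch in Python's `re` module (an assumption, not the TPerlRegEx API), with `re.IGNORECASE` standing in for `preCaseLess` and `findall` replacing the Match/MatchAgain loop.

```python
import re

test_string = "<one@server.domain.xy>, another@otherserver.xyz"

# Same pattern as on the slide, with the backslashes in place.
pattern = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)

addresses = pattern.findall(test_string)
print(addresses)  # ['one@server.domain.xy', 'another@otherserver.xyz']
```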
EXAMPLE: Songfinder

with TRegExpr.Create do try
  gstr:= 'Deep Purple';
  ModifierS:= false; // '.' does not match line separators
  Expression:= '#EXTINF:\d{3},'+gstr+' - ([^\n].*)';
  if Exec(fstr) then
    repeat
      writeln(Format('Songs of ' +gstr+': %s', [Match[1]]));
      (*if AnsiCompareText(Match[1], 'Woman') > 0 then begin
          closeMP3;
          PlayMP3('..EKON_13_14_15EKON1606_Woman_From_Tokyo.mp3');
        end;*)
    until not ExecNext;
finally Free;
end;
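To see what the expression extracts, here is a sketch in Python's `re` module (an assumption, not TRegExpr). The playlist text is hypothetical, since the slide does not show the contents of `fstr`; the durations and titles are invented.

```python
import re

# Hypothetical playlist text in extended M3U style.
playlist = (
    "#EXTINF:348,Deep Purple - Woman From Tokyo\n"
    "#EXTINF:215,Other Band - Some Song\n"
    "#EXTINF:340,Deep Purple - Smoke on the Water\n"
)

artist = "Deep Purple"

# Three digits of duration, the artist, " - ", then the title up to the line end.
pattern = r"#EXTINF:\d{3}," + re.escape(artist) + r" - ([^\n].*)"

songs = re.findall(pattern, playlist)
print(songs)  # ['Woman From Tokyo', 'Smoke on the Water']
```

Because the dot does not match a newline by default, each title stops at the end of its own line.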
EXAMPLE: GREP WITH \CHAR
% cat grep-datafile
northwest       NW      Charles Main                                   300000.00
western         WE      Sharon Gray                                    53000.89
southwest       SW      Lewis Dalsass                                  290000.73
southern        SO      Suan Chin                                      54500.10
southeast       SE      Patricia Hemenway                              400000.00
eastern         EA      TB Savage                                      440500.45
northeast       NE      AM Main Jr.                                    57800.10
north           NO      Ann Stephens                                   455000.50
central         CT      KRush                                          575500.70
Extra [A-Z]****[0-9]..$5.00

  Print all lines containing the number 5, followed by a literal period and any
  single character.

% grep '5\..' grep-datafile
Extra [A-Z]****[0-9]..$5.00
EXAMPLE: HTTP RegEx[ ]
% cat get russian rouble rate - datafile

procedure getHttpREGEX(Sender: TObject);
var http1: TIDHTTP;
    htret: string;
begin
  http1:= TIDHTTP.Create(self);
  htret:= http1.Get('http://win.www.citycat.ru/finance/finmarket/_CBR/');
  //writeln(htret);
  with TRegExpr.Create do try
    Expression:= russTemplate;
    if Exec(htret) then begin
      //if output success
      writeln(Format('Russian rouble rate at %s.%s.%s: %s',
                 [Match[2], Match[1], Match[3], Match[4]]));
    end;
    //writeln(dump)
  finally Free;
  end;
  //text2html
  //writeln('deco: '+#13+#10+DecorateURLs(htret,[durlAddr, durlPath]))
end;
EXAMPLE: GREP WITH X\{M\}
% cat grep-datafile
northwest       NW      Charles Main                                   300000.00
western         WE      Sharon Gray                                    53000.89
southwest       SW      Lewis Dalsass                                  290000.73
southern        SO      Suan Chin                                      54500.10
southeast       SE      Patricia Hemenway                              400000.00
eastern         EA      TB Savage                                      440500.45
northeast       NE      AM Main Jr.                                    57800.10
north           NO      Ann Stephens                                   455000.50
central         CT      KRush                                          575500.70
Extra [A-Z]****[0-9]..$5.00
   Print all lines where there are at least six consecutive digits followed by a period.

% grep '[0-9]\{6\}\.' grep-datafile
northwest       NW      Charles Main                                    300000.00
southwest       SW      Lewis Dalsass                                   290000.73
southeast       SE      Patricia Hemenway                               400000.00
eastern         EA      TB Savage                                       440500.45
north           NO      Ann Stephens                                    455000.50
central         CT      KRush                                           575500.70
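The six lines listed above can be confirmed with a sketch in Python's `re` module (an assumption standing in for grep; Python writes the braces without backslashes). The data file is embedded with single spaces between columns, and only the first column of each hit is printed.

```python
import re

DATAFILE = """\
northwest NW Charles Main 300000.00
western WE Sharon Gray 53000.89
southwest SW Lewis Dalsass 290000.73
southern SO Suan Chin 54500.10
southeast SE Patricia Hemenway 400000.00
eastern EA TB Savage 440500.45
northeast NE AM Main Jr. 57800.10
north NO Ann Stephens 455000.50
central CT KRush 575500.70
Extra [A-Z]****[0-9]..$5.00
"""

# Six digits in a row, then a literal period.
pattern = re.compile(r"[0-9]{6}\.")

hits = [line.split()[0] for line in DATAFILE.splitlines() if pattern.search(line)]
print(hits)
# ['northwest', 'southwest', 'southeast', 'eastern', 'north', 'central']
```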
EXAMPLE: Extract Phones (city code 812)

% cat grep-delphi-maXbox_datafile

procedure ExtractPhones(const AText: string; APhones: TStrings);
begin
  with TRegExpr.Create do try
    Expression := '(\+\d*)?(\((\d+)\)*)?(\d+(-\d*)*)';
    if Exec(AText) then
      repeat
        if Match[3] = '812'
          then APhones.Add(Match[4]);
      until not ExecNext;
  finally Free;
  end;
end;

writeln('Formula Gauss : '+
     floatToStr(maXcalc('1/SQRT(2*PI*3^2)*EXP((-0.0014^2)/(2*3^2))')));
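A sketch of the same extraction in Python's `re` module (an assumption, not TRegExpr). The pattern is the slide's expression with the backslashes written out, which is itself a reconstruction; the function name, the sample text, and the phone numbers are invented for illustration.

```python
import re

# Group 3 holds the city code inside parentheses, group 4 the local number.
PHONE = re.compile(r"(\+\d*)?(\((\d+)\)*)?(\d+(-\d*)*)")

def extract_phones(text, city_code="812"):
    """Return the local numbers whose city code equals city_code."""
    return [m.group(4) for m in PHONE.finditer(text) if m.group(3) == city_code]

sample = "call +7(812)123-45-67 or (495)765-43-21"
print(extract_phones(sample))  # ['123-45-67']
```

`finditer` here plays the role of the Exec/ExecNext loop: it walks through every match in the text from left to right.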
SUMMARY
regular expressions
Delphi Implementation
for PCRE family of commands
Examples to train your brain

A regex ekon16

  • 1. maXbox Regex THE DELPHI SYSTEM Regular Expressions
  • 2. REGULAR EXPRESSION A pattern of special characters used to match strings in a search Typically made up from special characters called metacharacters Regular expressions are used thoughout Linux: Utilities: ed, vi, grep, egrep, sed, and awk You can validate e-mail addresses, extract phone numbers or ZIP- codes from web-pages or documents, search for complex patterns in log files and all You can imagine! Rules (templates) can be changed without Your program recompilation! (add IBAN as ex.) 2
  • 3. REGULAR EXPRESSION IN DELPHI The regular expression engine in Delphi XE is PCRE (Perl Compatible Regular Expression). It's a fast and compliant (with generally accepted regex syntax) engine which has been around maXbox Delphi System for many years. Users of earlier versions of delphi can use it with TPerlRegEx, a delphi class wrapper around it. www.softwareschule.ch/maxbox.htm Regular expressions are used in Editors: Search for empty lines: ‘^$’ Utilities: ed, vi, grep, egrep, sed, TRex and awk The main type in RegularExpressions.pas is TRegEx. TRegEx is a record with a bunch of methods and static class methods for matching with regular expressions. 3
  • 4. REGEX CORE UNIT IN DELPHI I hope it helps! I've always used the RegularExpressionsCore unit rather than the higher-level stuff because the core unit is compatible with the unit that Jan Goyvaerts has provided for free for years. That was my introduction to regular expressions. So I forgot about the other unit. I guess there's either a bug or it just doesn't work the way one might expect.
      Record: G:= TRegex.Match(...)
      Object: regEx:= TPerlRegEx.Create;  Ex. 309_regex_powertester2.txt
      RegExStudio: TRegExpr.Create
  • 5. METACHARACTERS
      RE Metacharacter   Matches…
      .                  Any one character, except newline
      [a-z]              Any one of the enclosed characters (e.g. a-z)
      *                  Zero or more of the preceding character
      ? or \?            Zero or one of the preceding character
      + or \+            One or more of the preceding character
      Any non-metacharacter matches itself.
  • 6. THE PURPOSE There are 3 main operators that use regular expressions:
      1. matching, which returns TRUE if a match is found and FALSE if no match is found
      2. substitution, which substitutes one pattern of characters for another within a string
      3. split, which separates a string into a series of substrings
      Regular expressions are composed of characters, character classes, metacharacters, quantifiers, and assertions.
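
The three operations can be tried from a shell with utilities the deck lists (grep, sed, awk). This is only a sketch, not the Delphi API: the sample line is a hypothetical string modelled on the grep-datafile rows, and the variable names are illustrative.

```shell
#!/bin/sh
# Hypothetical sample line, modelled on the grep-datafile rows in this deck.
line='north NO Ann Stephens 455000.50'

# 1. Matching: grep -q exits with status 0 on a match and 1 otherwise.
if printf '%s\n' "$line" | grep -q 'Ann'; then matched=TRUE; else matched=FALSE; fi

# 2. Substitution: sed replaces the text matched by the pattern.
replaced=$(printf '%s\n' "$line" | sed 's/Ann/Anna/')

# 3. Split: awk splits the string on a regex separator (one or more spaces)
#    and reports the field count plus the first and last field.
fields=$(printf '%s\n' "$line" |
  awk '{ n = split($0, parts, / +/); print n ": " parts[1] "|" parts[5] }')

echo "$matched"
echo "$replaced"
echo "$fields"
```

In Delphi the same three operations map onto matching, replacing and splitting with the regex classes the deck names; the shell version is used here only because it needs no compiler.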
  • 7. MORE METACHARACTERS
      RE Metacharacter   Matches…
      ^                  Beginning of line
      $                  End of line
      \char              Escape the meaning of the char following it
      [^]                One character not in the set
      \<                 Beginning-of-word anchor
      \>                 End-of-word anchor
      \( \) or ( )       Tags matched characters to be used later (max = 9)
      | or \|            Or grouping
      x\{m\}             Repetition of character x, m times (x,m = integer)
      x\{m,\}            Repetition of character x, at least m times
      x\{m,n\}           Repetition of character x, between m and n times
  • 8. Regular Expression An atom specifies what text is to be matched and where it is to be found. An operator combines regular expression atoms.
  • 9. Atoms An atom specifies what text is to be matched and where it is to be found.
  • 10. Single-Character Atom A single character matches itself.
  • 11. Dot Atom Matches any single character except for a newline character (\n).
  • 12. Class Atom Matches only a single character, which can be any of the characters defined in a set. Example: [ABC] matches either A, B, or C. Notes: 1) A range of characters is indicated by a dash, e.g. [A-Q]. 2) Characters can be excluded from the set, e.g. [^0-9] matches any character other than a digit.
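
A small grep sketch of the class atoms above. The four sample lines are hypothetical, and LC_ALL=C is set as an assumption so that the range A-Q follows plain ASCII order regardless of the local locale.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Hypothetical sample lines, one item per line.
items=$(printf '%s\n' Bob dan Q7 42)

# [ABC]: lines containing an A, B, or C.
set_abc=$(printf '%s\n' "$items" | grep '[ABC]' | paste -sd ' ' -)

# [A-Q]: lines containing any character in the range A through Q.
range_aq=$(printf '%s\n' "$items" | grep '[A-Q]' | paste -sd ' ' -)

# [^0-9]: lines containing at least one character that is not a digit.
# Q7 still qualifies because of the Q; only the all-digit line drops out.
non_digit=$(printf '%s\n' "$items" | grep '[^0-9]' | paste -sd ' ' -)

echo "$set_abc"
echo "$range_aq"
echo "$non_digit"
```

Note that a negated class still has to match one character; it does not mean "the line contains no digits".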
  • 14. REPLACEMENT
      procedure DelphiPerlRegex;
      begin
        with TPerlRegEx.Create do
        try
          Options:= Options + [preUnGreedy];
          Subject:= 'I like to sing out at Foo bar';
          RegEx:= '([A-Za-z]+) bar';
          // \1 inserts the text captured by the first group
          Replacement:= '\1 is the name of the bar I like';
          if Match then
            ShowMessageBig(ComputeReplacement);
        finally
          Free;
        end;
      end;
  • 15. Anchors Anchors tell where the next character in the pattern must be located in the text data.
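
A brief grep sketch of the line anchors ^ and $, including the empty-line search '^$' mentioned on slide 3. The three-line sample text is hypothetical; its middle line is empty.

```shell
#!/bin/sh
# Hypothetical three-line sample; the second line is empty.
text=$(printf '%s\n' 'north NO' '' 'far north')

# grep -c prints the number of matching lines.
starts=$(printf '%s\n' "$text" | grep -c '^north')   # lines beginning with "north"
ends=$(printf '%s\n' "$text" | grep -c 'north$')     # lines ending with "north"
empty=$(printf '%s\n' "$text" | grep -c '^$')        # empty lines

echo "starts=$starts ends=$ends empty=$empty"
```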
  • 16. BACK REFERENCES: \N Used to retrieve saved text in one of nine buffers. The text in a saved buffer can be referred to with a back reference, e.g. \1 \2 \3 ... \9 (more details on this later). Ex.: rex:= '.*(es).*\1.*'; //subpattern
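
The back reference idea can be sketched with grep and sed, which use the \( \) form for saved buffers in their basic syntax. The sample words are hypothetical; the first pattern follows the slide's '(es) ... \1' example.

```shell
#!/bin/sh
# Hypothetical sample words, one per line.
words=$(printf '%s\n' processes esteem test)

# \(es\) saves "es" in buffer 1; \1 requires the same text to occur again
# later in the line. Only "processes" contains "es" twice.
repeated=$(printf '%s\n' "$words" | grep '\(es\).*\1' | paste -sd ' ' -)

# Saved buffers can also be reused in a replacement, here to swap two words.
swapped=$(echo 'Foo bar' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2 \1/')

echo "$repeated"
echo "$swapped"
```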
  • 17. Operators
  • 18. Sequence Operator When a series of atoms appears in a regular expression with no operator between them, the atoms are matched in sequence.
  • 19. Alternation Operator: | or \| The operator (| or \|) is used to define one or more alternatives. Note: depends on version of
  • 20. Repetition Operator: \{…\} The repetition operator specifies that the atom or expression immediately before the repetition may be repeated.
  • 22. Short Form Repetition Operators: * + ?
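
A short grep sketch of the repetition forms, assuming basic syntax for \{m\}, \{m,\} and \{m,n\} and extended syntax (grep -E) for the short form ?. The sample strings are hypothetical; ^ and $ pin each pattern to the whole line so the counts are visible.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Hypothetical digit strings of length 2, 4 and 6.
codes=$(printf '%s\n' 12 1234 123456)

exactly4=$(printf '%s\n' "$codes" | grep '^[0-9]\{4\}$' | paste -sd ' ' -)
at_least4=$(printf '%s\n' "$codes" | grep '^[0-9]\{4,\}$' | paste -sd ' ' -)
two_to_4=$(printf '%s\n' "$codes" | grep '^[0-9]\{2,4\}$' | paste -sd ' ' -)

# Short forms: ? behaves like {0,1}, + like {1,}, * like {0,}.
optional_u=$(printf '%s\n' color colour colouur | grep -E '^colou?r$' | paste -sd ' ' -)

echo "$exactly4"
echo "$at_least4"
echo "$two_to_4"
echo "$optional_u"
```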
  • 23. Group Operator When a group of characters is enclosed in parentheses, the next operator applies to the whole group, not only to the previous character. Note: depends on version of “unit”; use \( and \) instead.
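
The effect of grouping can be sketched with grep -E (extended syntax, where plain parentheses group). The sample strings are hypothetical; the last pattern combines grouping with the alternation operator from slide 19.

```shell
#!/bin/sh
# Hypothetical sample strings.
samples=$(printf '%s\n' ab abab abbb)

# Without a group, + applies only to the preceding "b".
no_group=$(printf '%s\n' "$samples" | grep -E '^ab+$' | paste -sd ' ' -)

# With a group, + applies to the whole sequence "ab".
with_group=$(printf '%s\n' "$samples" | grep -E '^(ab)+$' | paste -sd ' ' -)

# Alternation inside groups: one of north/south followed by one of west/east.
regions=$(printf '%s\n' northwest southeast central |
  grep -E '^(north|south)(west|east)$' | paste -sd ' ' -)

echo "$no_group"
echo "$with_group"
echo "$regions"
```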
  • 24. DELPHI DETAIL AND EXAMPLES For new code written in Delphi XE, you should definitely use the RegularExpressions unit that is part of Delphi rather than one of the many 3rd party units that may be available. But if you're dealing with UTF-8 data, use the RegularExpressionsCore unit to avoid needless UTF-8 to UTF-16 to UTF-8 conversions.
  • 25. COMMONLY USED “D” OPTIONS: -c
      CL.AddTypeS('TPerlRegExOption',
        '( preCaseLess, preMultiLine, preSingleLine, preExtended, preAnchored, preUnGreedy, preNoAutoCapture )');
  • 26. EXAMPLE: GREP WITH \<north\>
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Print the line if it contains the word “north”.
      % grep '\<north\>' grep-datafile
      north       NO   Ann Stephens        455000.50
      Ex. 309_regex_powertester2.txt
  • 27. EXAMPLE: EGREP WITH +
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Print all lines containing one or more 3's.
      % egrep '3+' grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      Note: line works with .*
      Ex. 309_regex_powertester2.txt
  • 28. EXAMPLE: EGREP WITH RE: ?
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Print all lines containing a 2, followed by zero or one period, followed by a number.
      % egrep '2\.?[0-9]' grep-datafile
      southwest   SW   Lewis Dalsass       290000.73
      Note: Delphi PCRE grep Test
      Ex. 309_regex_powertester2.txt
  • 29. EXAMPLE: FGREP
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Find all lines in the file containing the literal string “[A-Z]****[0-9]..$5.00”. All characters are treated as themselves. There are no special characters.
      % fgrep '[A-Z]****[0-9]..$5.00' grep-datafile
      Extra [A-Z]****[0-9]..$5.00
      Ex. 309_regex_powertester2.txt
  • 30. EXAMPLE: Mail Finder
      procedure delphiRegexMailfinder;
      var
        TestString: string;
        PR: TPerlRegEx;
      begin
        // Initialize a test string to include some email addresses.
        // This would normally be your eMail.
        TestString:= '<one@server.domain.xy>, another@otherserver.xyz';
        PR:= TPerlRegEx.Create;
        try
          PR.RegEx:= '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b';
          PR.Options:= PR.Options + [preCaseLess];
          PR.Compile;
          PR.Subject:= TestString;  // <-- tell PR where to look for matches
          if PR.Match then begin
            WriteLn(PR.MatchedText);  // Extract first address
            while PR.MatchAgain do
              WriteLn(PR.MatchedText);  // Extract subsequent addresses
          end;
        finally
          PR.Free;
        end;
        //Readln;
      end;
  • 31. EXAMPLE: Songfinder
      with TRegExpr.Create do
      try
        gstr:= 'Deep Purple';
        modifierS:= false;  // single-line mode off: '.' does not match line breaks
        Expression:= '#EXTINF:\d{3},'+gstr+' - ([^\n].*)';
        if Exec(fstr) then
          repeat
            writeln(Format('Songs of '+gstr+': %s', [Match[1]]));
            (*if AnsiCompareText(Match[1], 'Woman') > 0 then begin
              closeMP3;
              PlayMP3('..EKON_13_14_15EKON1606_Woman_From_Tokyo.mp3');
            end;*)
          until not ExecNext;
      finally
        Free;
      end;
  • 32. EXAMPLE: GREP WITH \CHAR
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Print all lines containing the number 5, followed by a literal period and any single character.
      % grep '5\..' grep-datafile
      Extra [A-Z]****[0-9]..$5.00
  • 33. EXAMPLE: HTTP RegEx[ ]
      % cat get russian rouble rate - datafile
      procedure getHttpREGEX(Sender: TObject);
      var
        http1: TIDHTTP;
        htret: string;
      begin
        http1:= TIDHTTP.Create(self);
        htret:= HTTP1.Get('http://win.www.citycat.ru/finance/finmarket/_CBR/');
        //writeln(htret);
        with TRegExpr.Create do
        try
          Expression:= russTemplate;
          if Exec(htret) then begin
            //if output success
            writeln(Format('Russian rouble rate at %s.%s.%s: %s',
              [Match[2], Match[1], Match[3], Match[4]]));
          end;
          //writeln(dump)
        finally
          Free;
        end;
        //text2html
        //writeln('deco: '+#13+#10+DecorateURLs(htret,[durlAddr, durlPath]))
      end;
  • 34. EXAMPLE: GREP WITH X\{M\}
      % cat grep-datafile
      northwest   NW   Charles Main        300000.00
      western     WE   Sharon Gray         53000.89
      southwest   SW   Lewis Dalsass       290000.73
      southern    SO   Suan Chin           54500.10
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      northeast   NE   AM Main Jr.         57800.10
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
      Extra [A-Z]****[0-9]..$5.00
      Print all lines where there are at least six consecutive numbers followed by a period.
      % grep '[0-9]\{6\}\.' grep-datafile
      northwest   NW   Charles Main        300000.00
      southwest   SW   Lewis Dalsass       290000.73
      southeast   SE   Patricia Hemenway   400000.00
      eastern     EA   TB Savage           440500.45
      north       NO   Ann Stephens        455000.50
      central     CT   KRush               575500.70
  • 35. EXAMPLE: Extract Phones (city code 812)
      % cat grep-delphi-maXbox_datafile
      procedure ExtractPhones(const AText: string; APhones: TStrings);
      begin
        with TRegExpr.Create do
        try
          // group 3 = city code in parentheses, group 4 = local number
          Expression := '(\+\d*)?(\((\d+)\)*)?(\d+(-\d*)*)';
          if Exec(AText) then
            repeat
              if Match[3] = '812' then
                APhones.Add(Match[4]);
            until not ExecNext;
        finally
          Free;
        end;
      end;
      writeln('Formula Gauss : '+ floatToSTr(maXcalc('1/SQRT(2*PI*3^2)*EXP((- 0.0014^2)/(2*3^2))')));
  • 36. SUMMARY regular expressions Delphi Implementation for PCRE family of commands Examples to train your brain