1. SQL Regular Expression
A regular expression (regex or regexp for short) is a
special text string for describing a search pattern. You
can think of regular expressions as wildcards on
steroids. You are probably familiar with wildcard
notations such as *.txt to find all text files in a file
manager. The regex equivalent is ^.*\.txt$. (Definition
from http://www.regular-expressions.info/).
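To make the wildcard comparison concrete, here is a small Python sketch (Python is used only because it runs anywhere; the file names are made up for illustration). It also shows why the dot in the regex needs a backslash: an unescaped dot matches any character, not just a literal '.'.

```python
import fnmatch
import re

# Hypothetical file names, chosen only to illustrate the comparison.
names = ["notes.txt", "report.txt.bak", "readmetxt", "a.txt"]

# Wildcard notation, as a file manager would apply it.
wildcard_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.txt")]

# Regex equivalent: the dot is escaped so it matches a literal '.'.
strict = re.compile(r"^.*\.txt$")
strict_hits = [n for n in names if strict.search(n)]

# Without the backslash, '.' matches any character,
# so a name such as "readmetxt" is accepted as well.
loose = re.compile(r"^.*.txt$")
loose_hits = [n for n in names if loose.search(n)]

print(wildcard_hits)
print(strict_hits)
print(loose_hits)
```

The escaped form selects the same names as the wildcard, while the unescaped form lets one extra name through.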
2. SQL Regular Expression
There are many ways to use regular expressions in SQL, as examples on the internet show. Here I will use
a problem I was challenged with to illustrate how useful regular expressions in SQL can be.
I was asked to get all records from a table in which a C(50) field
contains any of the patterns below, anywhere in the field.
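Pattern 1: I##### 'I[0-9]{5}' character 'I' followed by 5 digits
Pattern 2: I ##### 'I [0-9]{5}' character 'I' followed by a space and then by
5 digits
Pattern 3: IMP##### 'IMP[0-9]{5}' characters 'IMP' followed by 5 digits.
Pattern 4: IMP ##### 'IMP [0-9]{5}' characters 'IMP' followed by a space and then by
5 digits.
Pattern 5: ###-#######-# '[0-9]{3}-[0-9]{7}-[0-9]{1,}' 3 digits, then a dash, then 7 digits, then a dash, then
exactly one digit at the end, no more than one (the {1,} form on its own allows one or more digits; the Where clause of the query adds the single-digit check).
Pattern 6: I-##### 'I-[0-9]{5}' character 'I' followed by a dash and then by 5 digits (this pattern also appears in the query).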
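As a rough illustration of how these patterns behave, the Python sketch below checks a few made-up description strings against them. It covers the patterns listed above plus the 'I-[0-9]{5}' variant that appears in the query. The sample strings and the helper name `matching_patterns` are hypothetical, and Python's `re` module is not the database's regex engine, so this only approximates what the SQL functions would do.

```python
import re

# Patterns from the slides, keyed by their '#' notation.
PATTERNS = {
    "I#####": r"I[0-9]{5}",
    "I #####": r"I [0-9]{5}",
    "IMP#####": r"IMP[0-9]{5}",
    "IMP #####": r"IMP [0-9]{5}",
    "I-#####": r"I-[0-9]{5}",
    "###-#######-#": r"[0-9]{3}-[0-9]{7}-[0-9]{1,}",
}

def matching_patterns(text):
    """Return the names of all patterns found anywhere in text."""
    return [name for name, rx in PATTERNS.items() if re.search(rx, text)]

# Hypothetical description values, not taken from any real table.
samples = [
    "INVOICE I12345 PAID",
    "REF IMP 54321",
    "IMP00123 shipment",
    "ENTRY 123-4567890-1",
    "NO CODE HERE",
]

for s in samples:
    print(s, "->", matching_patterns(s))
```

Because `re.search` scans the whole string, a pattern is found wherever it sits in the text, which is the "anywhere in the field" requirement.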
3. SQL Regular Expression
• 1). Select
• 2). PLDESC,
• 3). IFNULL(REGEXP_SUBSTR(PLDESC,'I [0-9]{5}'),'N'),
• 4). IFNULL(REGEXP_SUBSTR(PLDESC,'IMP [0-9]{5}'),'N'),
• 5). IFNULL(REGEXP_SUBSTR(PLDESC,'I-[0-9]{5}'),'N'),
• 6). IFNULL(REGEXP_SUBSTR(PLDESC,'IMP[0-9]{5}'),'N'),
• 7). IFNULL(REGEXP_SUBSTR(PLDESC,'I[0-9]{5}'),'N'),
• 8). IFNULL(REGEXP_SUBSTR(PLDESC,'[0-9]{3}-[0-9]{7}-[0-9]{1,}'),'N')
• 9). From apl
• 10). Where
• 11). REGEXP_LIKE(PLDESC,'[0-9]{3}-[0-9]{7}-[0-9]{1}[^0-9]') Or
• 12). REGEXP_LIKE(PLDESC,'I [0-9]{5}') Or
• 13). REGEXP_LIKE(PLDESC,'IMP [0-9]{5}') Or
• 14). REGEXP_LIKE(PLDESC,'I-[0-9]{5}') Or
• 15). REGEXP_LIKE(PLDESC,'IMP[0-9]{5}') Or
• 16). REGEXP_LIKE(PLDESC,'I[0-9]{5}')
4. SQL Regular Expression
1). SQL Select statement.
2). Select the field PLDESC.
3). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern
I ##### (character 'I', followed by a space and then by 5 digits), 'I [0-9]{5}'.
If the result is null, the string does not contain the pattern, so instead
of showing null ('-') show 'N'.
4). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern
IMP ##### (characters 'IMP', followed by a space and then by 5 digits),
'IMP [0-9]{5}'. If the result is null, the string does not contain the
pattern, so instead of showing null ('-') show 'N'.
5). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern
I-##### (character 'I', followed by a dash and then by 5 digits), 'I-[0-9]{5}'.
If the result is null, the string does not contain the pattern, so instead
of showing null ('-') show 'N'.
6). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern
IMP##### (characters 'IMP', followed directly by 5 digits), 'IMP[0-9]{5}'.
If the result is null, the string does not contain the pattern, so instead
of showing null ('-') show 'N'.
7). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern
I##### (character 'I', followed directly by 5 digits), 'I[0-9]{5}'.
If the result is null, the string does not contain the pattern, so instead
of showing null ('-') show 'N'.
5. SQL Regular Expression
8). Extract from the string field PLDESC (REGEXP_SUBSTR) the pattern ###-#######-# (3 digits, a dash, 7 digits, a dash,
and then 1 or more digits at the end, no other characters), '[0-9]{3}-[0-9]{7}-[0-9]{1,}'. If the result is null,
the string does not contain the pattern, so instead of showing null ('-') show 'N'.
9). From table APL.
10). Where clause.
11). The string contains the pattern '[0-9]{3}-[0-9]{7}-[0-9]{1}[^0-9]' (###-#######-#): 3 digits, a dash, 7 digits,
a dash, and one digit at the end, no more than one. Pay attention to the part [0-9]{1}[^0-9]: it
means exactly 1 digit ([0-9]{1}) followed by a character that is not a digit ([^0-9]). Or
12). The string contains the pattern 'I [0-9]{5}', which means character 'I' followed by a space and then 5 digits. Or
13). The string contains the pattern 'IMP [0-9]{5}', which means characters 'IMP' followed by a space and then 5 digits. Or
14). The string contains the pattern 'I-[0-9]{5}', which means character 'I' followed by a dash and then 5 digits. Or
15). The string contains the pattern 'IMP[0-9]{5}', which means characters 'IMP' followed by 5 digits. Or
16). The string contains the pattern 'I[0-9]{5}', which means character 'I' followed by 5 digits.
Note: the substring is extracted no matter where it sits in the string field PLDESC, which is
C(50). It is extracted from anywhere it exists in the string field.
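The [0-9]{1}[^0-9] part of line 11 has a side effect worth seeing in isolation: [^0-9] must consume a real character, so the pattern cannot match when the single digit is the very last character of the value. The Python sketch below compares it with a negative-lookahead variant. That variant is my own illustration, not part of the original query, and whether the database's regex engine accepts it would need to be checked. If C(50) is a fixed-length, blank-padded field, a trailing blank normally supplies the extra character, so the issue would only arise for values that fill the whole field.

```python
import re

# Pattern as used in the Where clause (line 11).
WHERE_STYLE = re.compile(r"[0-9]{3}-[0-9]{7}-[0-9]{1}[^0-9]")

# Illustrative alternative (an assumption, not from the slides):
# "one digit NOT followed by another digit", without consuming a character.
LOOKAHEAD = re.compile(r"[0-9]{3}-[0-9]{7}-[0-9](?![0-9])")

def check(text):
    """Return (where_style_matches, lookahead_matches) for text."""
    return bool(WHERE_STYLE.search(text)), bool(LOOKAHEAD.search(text))

print(check("123-4567890-1 PAID"))          # digit followed by a space
print(check("123-4567890-12"))              # two digits at the end
print(check("123-4567890-1"))               # digit is the last character
print(check("123-4567890-1".ljust(50)))     # same value, blank-padded to 50
```

Both forms reject the two-digit ending. They differ only when nothing follows the final digit.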
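The extract-or-'N' behaviour of the Select list can be mimicked in a few lines of Python. This is only a sketch: `substr_or_n` and `extract_row` are hypothetical helper names, the sample values are invented, and `re.search` stands in for REGEXP_SUBSTR with IFNULL.

```python
import re

# Patterns in the same order as the Select list (lines 3 to 8 of the query).
COLUMNS = [
    r"I [0-9]{5}",
    r"IMP [0-9]{5}",
    r"I-[0-9]{5}",
    r"IMP[0-9]{5}",
    r"I[0-9]{5}",
    r"[0-9]{3}-[0-9]{7}-[0-9]{1,}",
]

def substr_or_n(text, pattern):
    """Return the first match of pattern in text, or 'N' when there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else "N"

def extract_row(pldesc):
    """Build one result row: one extracted value (or 'N') per pattern."""
    return [substr_or_n(pldesc, p) for p in COLUMNS]

# Hypothetical PLDESC values.
print(extract_row("PAID I-12345 REF 123-4567890-12"))
print(extract_row("IMP 54321 shipment"))
```

Note that the last pattern uses {1,}, so the extracted value keeps every trailing digit, even when the Where clause would have rejected the row on that pattern alone.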
There are other SQL REGEXP functions that may be helpful in case you need them, such as:
REGEXP_REPLACE (replaces occurrences of a pattern in a string)
REGEXP_INSTR (gives the position where a pattern starts in a string)
REGEXP_MATCH_COUNT (counts the occurrences of a pattern in a string)
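For readers who want to try these three operations without a database, the Python sketch below shows rough equivalents using the `re` module. The sample text is invented. The 1-based position with 0 for "not found" is my assumption about how REGEXP_INSTR reports its result, so check it against your database's documentation.

```python
import re

text = "REF I12345 AND I67890"   # hypothetical value
pattern = r"I[0-9]{5}"

# Roughly what REGEXP_REPLACE does: replace every occurrence of the pattern.
replaced = re.sub(pattern, "I#####", text)

# Roughly what REGEXP_INSTR does: position of the first occurrence.
# Python positions are 0-based, so 1 is added (assumed SQL convention).
m = re.search(pattern, text)
position = m.start() + 1 if m else 0

# Roughly what REGEXP_MATCH_COUNT does: number of occurrences.
count = len(re.findall(pattern, text))

print(replaced)
print(position)
print(count)
```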