RegEx on Medical Data
MD ABDUL HASIB (SAZZAD)
What is Regular Expression?
A regular expression is a special sequence of
characters that helps you match or find other
strings or sets of strings, using a specialized
syntax held in a pattern. Regular expressions can be
used to search, edit, or manipulate text and data.
Why Use It?
It is an important tool in
every developer’s arsenal!
You are NOT a developer if you don’t know
regular expressions
- Dr. Zunaid Kazi
And
You will need it everywhere
Reformatting dirty data
Validating input
Command line work
Text processing
Before RegEx Mastery
After RegEx Mastery
Readily Available
Support in programming languages: JavaScript,
Java, PHP, Perl, C/C++, etc.
Command-line: grep, awk, sed
Text-editors: VIM, emacs, Notepad++
IDEs: Eclipse, Netbeans, Visual Studio .NET
Literal Characters
letters : A to Z, a to z
numbers : 0 to 9
symbols : ! @ # % &
Matched literally!
Matched anywhere, even middle of words
Is case sensitive
Examples
Regex    Target String    Matches
is       Jack is a boy    "is"
a        Jack is a boy    the "a" in "Jack" and the word "a"
j        Jack is a boy    nothing (matching is case sensitive)
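The three examples above can be tried out with a short script. This is a minimal sketch using Python's `re` module; the slides do not name a language, so Python and the variable names are assumptions made for illustration.

```python
import re

target = "Jack is a boy"

# A literal pattern matches wherever it occurs, even inside a word.
is_match = re.search(r"is", target)
a_matches = re.findall(r"a", target)   # the "a" in "Jack" and the word "a"

# Matching is case sensitive by default, so lowercase "j" finds nothing.
j_match = re.search(r"j", target)

print(is_match.span(), a_matches, j_match)
```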
Meta Characters
Special characters
[ \ ^ $ . | ? * + ( ) ]
The Dot Character
The Dot (.) character matches any single character except the
newline
Synonymous with [^\n] (UNIX/Linux/Mac)
as well as [^\r\n] (Windows)
Use it sparingly - it’s expensive!!
Regex    Target String              Matches
a.boy    Jack is a boy              "a boy"
.a       aac bac cac dac eac fac    "aa", "ba", "ca", "da", "ea", "fa"
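To see the dot in action, here is a small sketch in Python's `re` module (Python is an assumption, as are the variable names). It also shows the newline exception and one way Python lifts it, the `re.DOTALL` flag.

```python
import re

# "a.boy": the dot stands in for the space between "a" and "boy".
a_boy = re.search(r"a.boy", "Jack is a boy").group(0)

# ".a": any one character followed by "a".
dot_a = re.findall(r".a", "aac bac cac dac eac fac")

# By default the dot does not match a newline; re.DOTALL lifts that limit.
no_newline = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"a.b", "a\nb", re.DOTALL)

print(a_boy, dot_a, no_newline, with_dotall is not None)
```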
Character Classes
Indicated by [ ] and matches one and ONLY one character in
a set of characters
[Aa] : matches either ‘A’ or ‘a’
Regex         Target String                   Matches
[Gg]r[ae]y    Grayson drives a grey sedan.    "Gray" (in "Grayson"), "grey"
Character Classes
Caret (^) inside a character class negates the match.
[^Aa] : matches any single character except ‘A’ and ‘a’
Regex    Target String                                                                       Matches
q[^u]    Qatar is home to quite a lot of Iraqui/Iraqi citizens, but is not a city in Iraq    "qi" in "Iraqi" (not the final "Iraq": [^u] still needs a character to match)
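Both character-class examples can be checked with a few lines of Python (an assumed language; the slides do not prescribe one). The sketch shows that a negated class still consumes one character, which is why the trailing "Iraq" is not matched.

```python
import re

# [Gg]r[ae]y: one character from each set, so "Gray" and "grey" both match.
colors = re.findall(r"[Gg]r[ae]y", "Grayson drives a grey sedan.")

text = ("Qatar is home to quite a lot of Iraqui/Iraqi citizens, "
        "but is not a city in Iraq")

# q[^u]: a lowercase "q" followed by one character that is not "u".
# "Qatar" has an uppercase Q, and the final "Iraq" has nothing after the q.
q_not_u = re.findall(r"q[^u]", text)

print(colors, q_not_u)
```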
Shorthand classes
Shortcut    Name                 Equivalent Class
\d          Digit                [0-9]
\D          Not digit            [^0-9]
\w          Word                 [a-zA-Z0-9_]
\W          Not word             [^a-zA-Z0-9_]
\s          Space (separator)    [ \t\n\r\f\v]
\S          Not space            [^ \t\n\r\f\v]
.           Everything           [^\n] (depends on mode)
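A brief sketch of the shorthand classes in Python's `re` module follows. The sample text is made up for this illustration, and Python itself is an assumption. Note that for `str` patterns Python treats `\d`, `\w` and `\s` as Unicode-aware; passing `re.ASCII` restricts them to the ranges in the table.

```python
import re

# Hypothetical sample text, made up for this illustration.
note = "chol_total = 256 on 11/28/2067"

digits = re.findall(r"\d+", note)    # runs of digits
words = re.findall(r"\w+", note)     # runs of word characters; "_" counts
fields = re.split(r"\s+", note)      # split on runs of whitespace

# The underscore belongs to \w, which is why "chol_total" stays in one piece.
underscore_is_word = re.fullmatch(r"\w", "_") is not None

print(digits, words, fields, underscore_is_word)
```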
Repeaters
Symbols indicating that the preceding element of the
pattern can repeat
Repeater Count
? Zero or one
+ One or more
* Zero or more
Quantifiers
{n} : matches exactly n times
{n,} : matches n or more times
{n,m} : matches between n and m times
* : same as {0,}
+ : same as {1,}
? : same as {0,1}
Regex      Target String          Matches
\d{2,4}    1 11 111 1111 11111    11, 111, 1111, and the first four digits of 11111
\d{4}      1 11 111 1111 11111    1111, and the first four digits of 11111
\d{2,}     1 11 111 1111 11111    11, 111, 1111, 11111
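The quantifier examples can be reproduced with `re.findall`. This is a sketch in Python (an assumed language); the variable names are illustrative only.

```python
import re

target = "1 11 111 1111 11111"

# {2,4}: two to four digits; the five-digit run yields only its first four.
two_to_four = re.findall(r"\d{2,4}", target)

# {4}: exactly four digits.
exactly_four = re.findall(r"\d{4}", target)

# {2,}: two or more digits, with no upper bound.
two_or_more = re.findall(r"\d{2,}", target)

print(two_to_four, exactly_four, two_or_more)
```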
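Quantifiers are greedy
<.+>
<div>holy RegEx, Batman!</div><div>holy RegEx, Batman!</div>
The greedy .+ grabs as much as it can, so the match runs from the first < to the last >.
Making quantifiers lazy
Append ? to a quantifier to make it lazy
<.+?>
<div>holy RegEx, Batman!</div><div>holy RegEx, Batman!</div>
The lazy .+? stops at the first >, so each tag is matched on its own.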
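The difference between the greedy and lazy forms is easy to see side by side. A minimal Python sketch (Python is an assumption; the variable names are illustrative):

```python
import re

html = "<div>holy RegEx, Batman!</div><div>holy RegEx, Batman!</div>"

# Greedy: .+ runs to the end of the line and backs off to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can, so each tag is a separate match.
lazy = re.findall(r"<.+?>", html)

print(greedy)
print(lazy)
```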
Groupings 1
Everything within ( … ) is grouped into a single element for the
purpose of repetition and alternation
Regex          Target String                      Matches
(la)+?         la lala lalala all ala             each "la" separately (the lazy +? stops after one repetition)
schema(ta)?    schema or schemata or schematic    "schema", "schemata", "schemata" (in "schematic")
Back references
Groupings bind part of the regex together so that repetition applies to the whole group.
(abc){3} matches abcabcabc. The first group matches abc.
Grouping also creates a back reference that can be referred to later in the pattern.
Regex           Target String                     Matches
([ai]).\1.\1    The magician said abracadabra!    "acada" (in "abracadabra")
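A short Python sketch of back references (Python is an assumption, as are the variable names). `\1` must repeat exactly what group 1 captured, so the pattern below needs the same vowel three times with one arbitrary character between each.

```python
import re

m = re.search(r"([ai]).\1.\1", "The magician said abracadabra!")
matched_text = m.group(0)    # the whole match
captured = m.group(1)        # what \1 refers back to

# A repeated group: (abc){3} matches three copies of "abc" in a row.
triple = re.fullmatch(r"(abc){3}", "abcabcabc")
too_short = re.fullmatch(r"(abc){3}", "abcabc")

print(matched_text, captured, triple.group(1), too_short)
```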
Capture
During searches, patterns in ( … ) groups can be
captured for replacement.
Special variables $1, $2, $3 etc. or \1, \2, \3 etc.
contain the capture.
Regex                  Target String
(\d\d\d)-(\d\d\d\d)    123-4567 is my number
$1 contains 123
$2 contains 4567
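In Python (an assumed language for this sketch) the captures are read from the match object with `group(1)`, `group(2)` and so on, which play the role of `$1` and `$2` above.

```python
import re

m = re.search(r"(\d\d\d)-(\d\d\d\d)", "123-4567 is my number")

first = m.group(1)     # what the slide calls $1
second = m.group(2)    # what the slide calls $2
both = m.groups()      # all captures as a tuple

print(first, second, both)
```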
Replacement
Regexes are most often used for search/replace
Syntax varies; tools such as sed, Perl, and Vim use
s/pattern/replacement/
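Python has no `s/.../.../` operator; its rough counterpart is `re.sub`. The sketch below (Python is an assumption) reuses strings from elsewhere in these slides and shows captures being used inside the replacement.

```python
import re

# Rough equivalent of s/(\d\d\d)-(\d\d\d\d)/\2-\1/ : swap the two captures.
swapped = re.sub(r"(\d\d\d)-(\d\d\d\d)", r"\2-\1", "123-4567 is my number")

# Reformat a date; re.sub replaces every match unless told otherwise.
iso = re.sub(r"(\d{1,2})/(\d{1,2})/(\d{4})", r"\3-\1-\2", "A1C 11/28/2067 6.40")

# count=1 limits the substitution to the first match only.
first_only = re.sub(r"\d", "#", "123-4567", count=1)

print(swapped, iso, first_only, sep="\n")
```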
Lookahead
• Positive Lookahead
– Iron(?=man) : matches “Iron” only if it is followed by
“man”
• Negative Lookahead
– Iron(?!man) : matches “Iron” only if it is not followed
by “man”
Lookbehind
• Positive Lookbehind
– (?<=Iron)man : matches “man” only if it is preceded by
“Iron”
• Negative Lookbehind
– (?<!Iron)man : matches “man” only if it is not
preceded by “Iron”
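The four lookaround forms can be checked with a small Python sketch (an assumed language). The words "Ironing" and "Batman" are extra test strings chosen for illustration. Lookarounds assert context without consuming it, so the matched text is only "Iron" or "man".

```python
import re

# Lookahead: inspect what follows "Iron" without including it in the match.
pos_ahead = re.search(r"Iron(?=man)", "Ironman")
pos_ahead_miss = re.search(r"Iron(?=man)", "Ironing")
neg_ahead = re.search(r"Iron(?!man)", "Ironing")
neg_ahead_miss = re.search(r"Iron(?!man)", "Ironman")

# Lookbehind: inspect what precedes "man".
pos_behind = re.search(r"(?<=Iron)man", "Ironman")
pos_behind_miss = re.search(r"(?<=Iron)man", "Batman")
neg_behind = re.search(r"(?<!Iron)man", "Batman")
neg_behind_miss = re.search(r"(?<!Iron)man", "Ironman")

print(pos_ahead.group(0), neg_ahead.group(0))
print(pos_behind.span(), neg_behind.span())
```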
Modifiers
alter behavior of the matching mode (differs
between tools)
/i : case-insensitive match
/m : Multi-line mode
/g : affects all possible matches, not just the first
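As one example of how modifiers differ between tools, Python's `re` module (an assumed language for this sketch) spells /i and /m as flags and has no /g at all: `re.search` returns the first match, while `re.findall` returns every match. The two-line note is made up for illustration.

```python
import re

text = "Chol 256\nHDL 55"   # hypothetical two-line note

# /i : case-insensitive matching
case_sensitive = re.findall(r"chol", text)
ignore_case = re.findall(r"chol", text, re.IGNORECASE)

# /m : in multi-line mode ^ also matches at the start of each line
line_start_default = re.findall(r"^HDL", text)
line_start_multi = re.findall(r"^HDL", text, re.MULTILINE)

# /g : first match only versus all matches
first = re.search(r"\d+", text).group(0)
every = re.findall(r"\d+", text)

print(ignore_case, line_start_multi, first, every)
```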
Clinical Context
Cholesterol
Pattern : ((cholesterol|chol)\s*((?<gr1>\d{4}-\d{1,2}-\d{1,2})|(?<gr2>\d{1,2}-\d{1,2}-\d{2,4})|(?<gr3>\d{1,2}/\d{1,2}/\d{2,4}))?\s*(-?\d+)(\s*(desirable:\s*<\d{3}))?)|((cholesterol|chol)\s+reading.*?\d+)|((cholesterol|chol).?\s*((reading|test).*?).?\s+(is|was)?\s*\d+)
Should match
latest result cholesterol test is 256
chol reading was 256
cholesterol reading was 256
chol. reading found to be 256
GERD/H-pylori, (on meds, GI consult) HTN (changed from nitrates to Lisinopril) cholesterol
Hypercholesterolemia
high cholesterol who was + n;
Should not match
Cholesterol-HDL 1/10/2089 165
Cholesterol-LDL 10/10/2069 61 DESIRABLE: <130
Cholesterol-HDL 10/10/2069 61 DESIRABLE: <130
Cholesterol-HDL 10/10/2069 65
Cholesterol-HDL 10/10/2089 65
CHOL/HDL 5.1 [3][1] RESULT
Cholesterol-HDL 10/2/2069 -65
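The cholesterol pattern can be exercised against some of the lines above. This is a sketch only, under three assumptions: Python as the language, the named groups rewritten from `(?<gr1>...)` to Python's `(?P<gr1>...)` spelling, and case-insensitive matching (the slides do not say which flags were used). Only lines that contain a numeric reading are tested here.

```python
import re

# Date alternatives from the slide, with (?<name>...) rewritten as (?P<name>...).
DATE = (
    r"((?P<gr1>\d{4}-\d{1,2}-\d{1,2})"
    r"|(?P<gr2>\d{1,2}-\d{1,2}-\d{2,4})"
    r"|(?P<gr3>\d{1,2}/\d{1,2}/\d{2,4}))?"
)

CHOLESTEROL = re.compile(
    r"((cholesterol|chol)\s*" + DATE + r"\s*(-?\d+)(\s*(desirable:\s*<\d{3}))?)"
    r"|((cholesterol|chol)\s+reading.*?\d+)"
    r"|((cholesterol|chol).?\s*((reading|test).*?).?\s+(is|was)?\s*\d+)",
    re.IGNORECASE,  # assumption: the flags are not stated on the slide
)

should_match = [
    "latest result cholesterol test is 256",
    "chol reading was 256",
    "cholesterol reading was 256",
    "chol. reading found to be 256",
]
should_not_match = [
    "Cholesterol-HDL 1/10/2089 165",
    "Cholesterol-LDL 10/10/2069 61 DESIRABLE: <130",
    "CHOL/HDL 5.1 [3][1] RESULT",
    "Cholesterol-HDL 10/2/2069 -65",
]

matched = [CHOLESTEROL.search(s) is not None for s in should_match]
rejected = [CHOLESTEROL.search(s) is None for s in should_not_match]

print(matched, rejected)
```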
HDL/LDL
Pattern : (([^/](HDL|LDL)|(^HDL|^LDL))(\s*)((?<gr1>\d{4}-\d{1,2}-\d{1,2})|(?<gr2>\d{1,2}-\d{1,2}-\d{2,4})|(?<gr3>\d{1,2}/\d{1,2}/\d{2,4}))?\s*(-?\d+)(\s*(desirable:\s*<\d{3}))?)|([hl]dl.*?\d+)
/* Covered Pattern*/
// chol -500 220 lipitor chol 300 hdl 55 cholesterol reading was 256 hdl reading was 256
// hdl123
// chol 250 hdl-36
// chol 250 hdl136 (low hdl annotation)
// asa hdl is greater than 40
// hdl is less than 40
// hdl > 40
// hdl < 40
// latest hdl 181
// latest hdl 21
// hdl 11 23
//
// hdl 11/11/11 21
//
// *hdl/ldl/chol ...... relevant values --- relevant dates - desired value*
//
// patient has high chol
// ldl/hdl/chol is less than/ </>/greater than value
// latest hdl/cho./ldl
/* ********************************** end ******************************** */
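A few of the covered cases can be checked with the HDL/LDL pattern in Python. As before this is a sketch under assumptions: Python as the language, named groups written as `(?P<name>...)`, and no flags, since none are stated. The group number 10 used for the reading is counted from the pattern as written here.

```python
import re

# Date alternatives from the slide, with (?<name>...) rewritten as (?P<name>...).
DATE = (
    r"((?P<gr1>\d{4}-\d{1,2}-\d{1,2})"
    r"|(?P<gr2>\d{1,2}-\d{1,2}-\d{2,4})"
    r"|(?P<gr3>\d{1,2}/\d{1,2}/\d{2,4}))?"
)

# No flags are passed: an assumption, the slides do not state any.
HDL_LDL = re.compile(
    r"(([^/](HDL|LDL)|(^HDL|^LDL))(\s*)" + DATE
    + r"\s*(-?\d+)(\s*(desirable:\s*<\d{3}))?)"
    r"|([hl]dl.*?\d+)"
)

dated = HDL_LDL.search("Cholesterol-HDL 10/10/2069 65")
plain = HDL_LDL.search("hdl123")
compare = HDL_LDL.search("hdl < 40")
no_value = HDL_LDL.search("patient has high chol")

# gr3 holds the slash-separated date; group 10 is the (-?\d+) reading.
print(dated.group("gr3"), dated.group(10))
print(plain.group(0), compare.group(0), no_value)
```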
A1C
Pattern : (hba1c|a1c).*?((\d+(\.|,)\d+)|(\s\d+(\s|$)))
/* Covered Pattern*/
something A1C excellent at 5.9%
something else hbA1c = 5.6 in 2/80)
A1c 6.4 11/28/67
A1C 11/28/2067 6.40
/* ********************************** end ******************************** */
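The A1C pattern can be run over the four covered lines to pull out the reading. This sketch assumes Python and case-insensitive matching (the lines mix "A1C", "A1c" and "hbA1c", but the flag itself is not stated on the slide). Group 2 is the outer value group of the pattern.

```python
import re

A1C = re.compile(
    r"(hba1c|a1c).*?((\d+(\.|,)\d+)|(\s\d+(\s|$)))",
    re.IGNORECASE,  # assumption: needed for "A1C" and "hbA1c" to match
)

lines = [
    "something A1C excellent at 5.9%",
    "something else hbA1c = 5.6 in 2/80)",
    "A1c 6.4 11/28/67",
    "A1C 11/28/2067 6.40",
]

# The lazy .*? skips forward until the first decimal number, so the date in
# the last line is passed over and 6.40 is captured.
values = [A1C.search(line).group(2) for line in lines]

print(values)
```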
Hypertension
Pattern : hypertension|htn|high blood\s*pressure|high bp
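A quick check of the hypertension pattern in Python follows. This is a sketch assuming Python and case-insensitive matching; neither is stated on the slide. The first test line comes from the cholesterol examples in these slides; "High blood pressure noted" is a made-up phrase.

```python
import re

HYPERTENSION = re.compile(
    r"hypertension|htn|high blood\s*pressure|high bp",
    re.IGNORECASE,  # assumption: lets "HTN" match the lowercase pattern
)

htn_line = ("GERD/H-pylori, (on meds, GI consult) HTN "
            "(changed from nitrates to Lisinopril) cholesterol")

htn_hit = HYPERTENSION.search(htn_line)
phrase_hit = HYPERTENSION.search("High blood pressure noted")  # made-up phrase
a1c_hit = HYPERTENSION.search("A1c 6.4 11/28/67")

print(htn_hit.group(0), phrase_hit.group(0), a1c_hit)
```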
Questions?
Thank You
Resources
• https://docs.google.com/presentation/d/1mI99VtFZ1ubg0_wkaA5auRJBGkjnhL-m7jyRX-0Wqq4/edit#slide=id.gbd8f3b927_1_0
• https://docs.google.com/presentation/d/1I_5U6Cnvz91UR7KpiavcEo6NAbskRrRks4BEqjcc2kA/edit#slide=id.p26
