Lecture 11: Representing medical knowledge
Dr. Martin Chapman
Principles of Health Informatics (7MPE1000). https://martinchapman.co.uk/teaching
Lecture structure
1. Terminologies: terms, groups, hierarchies and composition
2. Clinical terminologies and coding
3. Terminologies as models
4. Natural language processing
Learning outcomes
1. Be able to define terminologies, and related concepts such as
composition
2. Understand the concept of a clinical code
3. Understand how the limitations of models pass to terminologies
4. Be able to list and define the steps in named entity recognition
To understand how terminologies and natural language processing
support interventions.
It all comes back to interventions…
Terminologies: terms, groups, hierarchies
and composition
Terms
Previously, we saw how a terminology (a language), and the terms it
contains, provides us with a set of labels that we can apply to
symbols in the world.
In turn this allowed us to represent the state of the world.
Term(s): Plane, Propellers
State: Plane (has) Propellers
Terms vs concepts
In certain cases, there may be a single term commonly used for a symbol.
For example, propellers are mostly just known as propellers.
However, in other cases, there may be multiple terms for a given symbol.
For example, a plane can be referred to as one of a number of different
things. This flexibility is to be expected, but, in a terminology, we must
anchor these terms to a single concept for consistency.
Concept: Plane. Term(s): Plane, aeroplane, airliner, aircraft.
Concept: Propellers. Term(s): Propellers.
We may just choose one of the terms for our concept.
Sometimes these two ideas will be conflated (including by me!), but they are distinct.
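As a rough sketch of anchoring several terms to a single concept, the mapping below uses the plane and propellers terms from this slide. The dictionary layout and the function name concept_for are assumptions made for illustration, not part of any particular terminology.

```python
# Hypothetical mini-terminology: each concept lists the terms that refer to it.
CONCEPT_TERMS = {
    "plane": ["plane", "aeroplane", "airliner", "aircraft"],
    "propellers": ["propellers"],
}

# Invert the mapping so that any term can be anchored to its single concept.
TERM_TO_CONCEPT = {
    term: concept
    for concept, terms in CONCEPT_TERMS.items()
    for term in terms
}

def concept_for(term):
    """Return the concept a term is anchored to, or None if it is unknown."""
    return TERM_TO_CONCEPT.get(term.lower())

print(concept_for("Airliner"))   # plane
print(concept_for("aeroplane"))  # plane
```

Whichever term appears in the text, the same concept comes back, which is the consistency the slide asks for.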
Aside: The Great British Bread Debate
For more evidence that
there can be many terms
for the same concept…
Groups
If we need to reference our terms and concepts collectively, we can
also group them.
For example, the concepts of propellers and planes may be collected
together in an aviation group.
Group: Aviation
Concept: Propellers. Term(s): Propellers.
Concept: Plane. Term(s): Plane, aeroplane, airliner, aircraft.
Problem: Too many terms
If we think about all the terms we’ve met so far in relation to a plane, there are lots:
Vehicles, Air, Land, Sea, Plane, Seaplane, Water landing, Sinks
Recall: Search space
In Lecture 4, we saw how we can use a hierarchical structure to
organise a search space. We can use a similar structure to organise
and connect our terms:
Root node (e.g. Patient Encounter)
Leaf node (e.g. Individual diagnosis)
Hierarchy: Organisation
Vehicles
  Air
    Plane
      Seaplane
  Land
  Sea
Hierarchy: Organisation
This structure is neat.
It permits concept-driven exploration, which is akin to the heuristic
search techniques we saw previously. If I am looking for a plane, I
know to explore the descendants of air, for example.
Conversely, if we did not know what planes were, this organisation
would tell us that planes were a type of air vehicle (and not a type of
land vehicle) simply from the structure of the terms themselves.
In other words, our hierarchy helps to add meaning to our terms.
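A minimal sketch of this kind of exploration, assuming the hierarchy is stored as child-to-parent links. The placement of each term (for example, seaplane under plane) follows the vehicle example, and the function names are illustrative.

```python
# Child -> parent links for the example hierarchy (an assumed encoding).
PARENT = {
    "air": "vehicles",
    "land": "vehicles",
    "sea": "vehicles",
    "plane": "air",
    "seaplane": "plane",
}

def ancestors(term):
    """Walk up the hierarchy from a term towards the root."""
    path = []
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path

def descendants(term):
    """Collect every term that sits below the given term."""
    found = []
    for child, parent in PARENT.items():
        if parent == term:
            found.append(child)
            found.extend(descendants(child))
    return found

print(ancestors("plane"))   # ['air', 'vehicles']
print(descendants("air"))   # ['plane', 'seaplane']
```

The first call shows how the structure alone tells us a plane is a type of air vehicle; the second shows where to look when searching for a plane.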
Hierarchy: Connections
We can further add meaning to our terms by dictating that each link
(edge) in our hierarchy expresses a particular type of relationship
between the two terms on each end of the connection.
Is-A (e.g. Seaplane, Plane): useful if we’re interested in understanding the type of a given term in our hierarchy.
Part-whole (e.g. Propeller, Plane): useful if we’re interested in understanding dependencies between terms.
Causal (e.g. Sinks, Water landing): useful if we’re interested in understanding what else is true if we observe a given term.
The type of link between the nodes in our hierarchy needs to be clearly defined to avoid confusion.
If a terminology mixes link types, we call it multi-axial.
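One possible way to record link types is to store each link with an explicit kind, as sketched below. The Link structure and both function names are assumptions, and the direction of each pair simply follows the order in which the terms are listed on the slide.

```python
from collections import namedtuple

# Each link records its two terms and the type of relationship between them.
Link = namedtuple("Link", ["source", "target", "kind"])

LINKS = [
    Link("seaplane", "plane", "is-a"),
    Link("propeller", "plane", "part-whole"),
    Link("sinks", "water landing", "causal"),
]

def links_of_kind(links, kind):
    """Return only the links that express the given relationship type."""
    return [link for link in links if link.kind == kind]

def is_multi_axial(links):
    """A terminology that mixes link types is multi-axial."""
    return len({link.kind for link in links}) > 1

print(links_of_kind(LINKS, "is-a"))
print(is_multi_axial(LINKS))  # True
```

Making the kind explicit on every link is one way to keep the meaning of each connection clearly defined.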
Properties
We may not just have terms in
our terminology, but also
properties.
Properties provide more
information about the term.
When we organise our terms
into a hierarchy with links,
properties are inherited across
these links (i.e. pass from one
term to another).
[Diagram: the Vehicles hierarchy (Air, Land, Sea; Plane; Seaplane), with the property "Wings = 1" shown on two terms to illustrate inheritance]
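Inheritance across links can be sketched as a lookup that checks a term and then walks up its is-a links. Attaching the wings property to plane, and the names IS_A, PROPERTIES and lookup, are assumptions made for this illustration.

```python
# Assumed is-a links and an assumed placement of the 'wings' property.
IS_A = {"seaplane": "plane", "plane": "air", "air": "vehicles"}
PROPERTIES = {"plane": {"wings": 1}}

def lookup(term, prop):
    """Return a property value, searching the term and then its ancestors."""
    while term is not None:
        if prop in PROPERTIES.get(term, {}):
            return PROPERTIES[term][prop]
        term = IS_A.get(term)  # move one link up; None once past the root
    return None

print(lookup("seaplane", "wings"))  # 1 (inherited from plane)
print(lookup("air", "wings"))       # None (not defined at or above air)
```

The property is stored once and every term below it in the hierarchy receives it.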
Problem: Too many terms (again)
As terminologies grow and more
terms are added, we once again –
even with our hierarchical structure
– end up with an unwieldy number
of terms and, in some cases, even
potential term overlap (in a way
that can’t be reconciled using a
unifying concept).
[Diagram: the Vehicles hierarchy (Air, Land, Sea) with a growing number of plane terms: Plane, Seaplane, Biplane, Cargo plane, Toy plane, Model plane]
Composition
Another school of
thought is to, instead,
agree on a fixed set of
basic terms, and to
compose newer, more
complex terms from
these basic terms.
But we need rules for
how terms can be
composed…
[Diagram: the basic terms Vehicles, Air, Land, Sea and Plane, with Seaplane composed from Sea and Plane]
Recall: Ontology
We saw previously how ontologies provide us with such restrictions.
We can also introduce
restrictions using our ‘is-a’
relationships.
Composition and cost
Constructing a terminology such that its terms can be composed
(post-coordinated), including enforcing rules via something like an ontology,
has a higher initial cost.
But as a terminology grows, this cost is ultimately lower than that of a
pre-coordinated terminology (where new terms are simply added to the
terminology). In a pre-coordinated terminology, each addition may require
hierarchically related terms to be updated, and error checking may be complex.
We saw this idea in Lecture 3, when we observed that task-oriented and
placeholder-oriented structures have a high initial effort associated with
them, but can ultimately be beneficial in the long run.
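A small sketch of post-coordination, assuming a fixed set of basic terms and a rule table that says which compositions are permitted. The rule table stands in for the restrictions an ontology would provide, and every name here is hypothetical.

```python
BASIC_TERMS = {"vehicles", "air", "land", "sea", "plane"}

# Hypothetical rules: which basic term may qualify which other basic term.
ALLOWED = {("sea", "plane")}

def compose(qualifier, base):
    """Build a composed term from two basic terms, if the rules permit it."""
    if qualifier not in BASIC_TERMS or base not in BASIC_TERMS:
        raise ValueError("only basic terms can be composed")
    if (qualifier, base) not in ALLOWED:
        raise ValueError(f"{qualifier!r} cannot qualify {base!r}")
    return (qualifier, base)

print(compose("sea", "plane"))  # ('sea', 'plane'), i.e. a seaplane
```

The up-front cost is in defining the basic terms and rules; after that, new complex terms need no new entries in the terminology.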
Clinical terminologies and coding
What does this all have to do with clinical practice?
Clinical terminologies
Much like the aviation domain, in the clinical domain we need a
language that provides us with a consistent way to represent, for
example, the state of a patient.
Clinical terminologies provide us with this, and have all the same
properties of general terminologies that we have just seen.
The terms included typically relate to patient diagnosis and
procedures performed.
Note: While clinical terminologies are technically distinct from medical terminologies
(which have a broader remit), you may find the two used interchangeably.
Similarly, you may also see terminologies referred to as classification systems.
Clinical terminologies
Group: Aviation
Concept: Propellers. Term(s): Propellers.
Concept: Plane. Term(s): Plane, aeroplane, airliner, aircraft.
Group: Cardio-metabolic
Concept: Type 2 Diabetes. Term(s): Type 2 Diabetes, Diabetes, T2DM.
Clinical codes
We saw earlier that concepts allow us to remain consistent in the
face of varying terms.
Clinical terminologies take this idea further by assigning a unique
code to each concept, which is often used instead.
Group: Cardio-metabolic
Concept: Type 2 Diabetes
Term(s): Type 2 Diabetes, Diabetes, T2DM
Code: 44054006
Note: Again, you may see references to medical
codes (technically a superset), or indeed diagnosis
codes (technically a subset), but these may also be
used interchangeably.
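The chain from term to concept to code can be sketched with two lookups, using the Type 2 Diabetes example and the code 44054006 from the slide. The table and function names are assumptions, and a real terminology would hold far more entries.

```python
# Terms anchored to a concept, as in the slide's example.
TERM_TO_CONCEPT = {
    "type 2 diabetes": "Type 2 Diabetes",
    "diabetes": "Type 2 Diabetes",
    "t2dm": "Type 2 Diabetes",
}

# Each concept carries one unique code.
CONCEPT_TO_CODE = {"Type 2 Diabetes": "44054006"}

def code_for(term):
    """Resolve a term to its concept, then the concept to its code."""
    concept = TERM_TO_CONCEPT.get(term.strip().lower())
    return CONCEPT_TO_CODE.get(concept)

print(code_for("T2DM"))             # 44054006
print(code_for("Type 2 Diabetes"))  # 44054006
```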
Clinical codes
[Image annotated with: Term(s), Concept, Code, Hierarchy]
We’ll come back to this in Lecture 12.
Same concept,
different terms.
Coding
Given what we’ve seen, the process of coding is thus labelling, for
example, the condition(s) a patient has.
Labels are chosen from the terminology (rather than at random) to
improve consistency, and are stored somewhere, usually a patient’s
Electronic Health Record (EHR).
Good clinicians (😊) will code directly, but often coding involves
interpreting existing medical text from the EHR.
Once codes are derived they are useful for the future interpretation of a
patient’s state, and for activities like auditing.
Coding errors
In Lecture 3, we saw the concept of false positives and false
negatives.
The same is true of the coding process, when codes are incorrectly
assigned, or not assigned, to patients.
Coding errors may occur when an EHR is incorrect or incomplete; or
when the coder themselves does not correctly interpret the primary
reason for an encounter, does not have the correct expertise or
makes an entry error.
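To make the link with false positives and false negatives concrete, the sketch below compares the codes assigned to a patient with the codes that actually apply. The function name and the placeholder code strings are hypothetical.

```python
def coding_errors(assigned, actual):
    """Compare the codes recorded for a patient with the codes that apply."""
    assigned, actual = set(assigned), set(actual)
    return {
        "false_positives": assigned - actual,  # recorded, but not true
        "false_negatives": actual - assigned,  # true, but not recorded
    }

# Placeholder code strings, not taken from any real terminology.
errors = coding_errors(
    assigned=["CODE-A", "CODE-B"],
    actual=["CODE-B", "CODE-C"],
)
print(errors)
```

Here CODE-A was incorrectly assigned (a false positive) and CODE-C was missed (a false negative).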
Computers and coding
One possible way to address
these issues is by introducing
computers into the coding
process.
For example, a computer may
help search free text, it may
assist during the actual entry of
the original information
(restricting inputs, for example),
or it may run the whole process
of coding from free text itself
(more later).
Terminologies as models
Terminologies as models
Recall that a terminology is (or forms part of) a data model.
As such, terminologies have the limitations associated with
(information) models like these.
Terminology limitations: Simplification
Models are always a simplification of the phenomenon they abstract.
There is, therefore, no such thing as a single, universal terminology
for a given domain, which allows us to adequately label all the
symbols we encounter.
This is compounded by the fact that not everyone will apply the
same label to a given symbol (the symbol grounding problem).
Terminology limitations: Simplification
Another outcome of the simplification process is that the nuances of
concepts are often lost when represented in a terminology:
1. Concepts rarely have a pure definition: one would typically
consider someone over 70 years old as elderly, but ‘elderly’ can
also technically include pregnant women over a certain age.
2. As a result of the above, the meaning of concepts is often
context-dependent. The term elderly in a maternal context will
mean something other than it might do typically.
Terminology limitations: Snapshot and
Purposive
A snapshot: Models are always a snapshot of the things they
represent, and this snapshot becomes less relevant over time.
Therefore, terminologies cannot capture how concepts change over
time.
Purposive: Models are built for a particular purpose, so it
follows that a terminology exists for a particular purpose (e.g.
labelling certain types of vehicles), and cannot necessarily be used for
other purposes.
Multiple clinical terminologies
As there is no such thing as a universal terminology, there is not just
a single clinical terminology (like SNOMED CT, the Systematized
Nomenclature of Medicine Clinical Terms, as we’ve seen), but
instead multiple clinical terminologies.
Often, these terminologies will try to cover different aspects of the
domain. For example, ICD-10 (International Classification of
Diseases, 10th Revision) focuses more on disease, while CPT (Current
Procedural Terminology) focuses more on interventions taken.
However, in many cases the same concept will be covered in multiple
terminologies.
Terminology Mapping
As such, if one institution, which adopts a particular terminology,
wants to interpret the data from another institution, which adopts a
different terminology, there will be a need to map from one set of
codes to another.
Composition and mapping
While feasible in practice, mapping can, due to the issues with
terminologies we’ve seen, be a difficult process.
However if two terminologies have been created by composing terms
from a single terminology, this mapping is more straightforward, as
they have a common core.
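A minimal sketch of mapping, assuming a hand-written table from one terminology's codes to another's. The single entry pairs the SNOMED CT code for Type 2 Diabetes seen earlier with the ICD-10 category E11; the table is illustrative only and is not an official map.

```python
# Illustrative mapping table (one entry only); not an official cross-map.
SNOMED_TO_ICD10 = {"44054006": "E11"}

def map_codes(codes, table):
    """Translate codes via the table, keeping track of those with no mapping."""
    mapped, unmapped = [], []
    for code in codes:
        if code in table:
            mapped.append(table[code])
        else:
            unmapped.append(code)
    return mapped, unmapped

# 'NO-SUCH-CODE' is a placeholder standing in for a code with no known mapping.
print(map_codes(["44054006", "NO-SUCH-CODE"], SNOMED_TO_ICD10))
```

The unmapped list is where the difficulty shows up in practice: concepts in one terminology do not always have a counterpart in the other.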
Pre-coordinated Post-coordinated
Natural Language Processing
Note: This is a huge topic, so
we will just explore some of
the key concepts.
Automated coding
We saw earlier that computers can help solve issues with the coding
process.
One way in which they can do this is to take over the coding process
entirely.
We can generalise this to a computer applying labels from any
terminology to any text in order to make that text computable.
Let’s return to our plane terminology…
Named entity recognition
We call this process named entity recognition, a subfield of natural
language processing.
A seaplane is a powered fixed-wing
aircraft capable of taking off from
water. It can also land on water.
There is often an air of adventure
about those who board seaplanes.
How do we apply
labels from our
terminology to this
text?
Aside: Programming code
Over the next few slides, I shall reinforce some of the ideas I show
using programming code.
Programming code represents a series of steps to solve a problem.
If this is ultimately more confusing, you are welcome to skip these
slides.
Pre-pass: Stop word removal
Before we do anything else, we need to remove words we aren’t
interested in (e.g. articles). We refer to these words as stop words.
A seaplane is a powered fixed-wing
aircraft capable of taking off from
water. It can also land on water.
There is often an air of adventure
about those who board seaplanes.
In code: Stop word removal
import nltk
from nltk.corpus import stopwords

# the stop word list must be downloaded once before first use
nltk.download('stopwords')

with open('text.txt', 'r') as file:
    text = file.read()

# split on any whitespace, strip full stops and lower-case each word
text = [word.replace(".", "").lower() for word in text.split()]

# get a list of stop words
stop = stopwords.words('english')
text_no_stop = [word for word in text if word not in stop]
print(text_no_stop)
Pre-pass: Stop word removal
'seaplane', 'powered', 'fixed-wing',
'aircraft', 'capable', 'taking', 'water',
'also', 'land', 'water', 'often', 'air',
'adventure', 'board', 'seaplanes'
What we’re
ultimately left with
after this process is
a ‘bag of words’.
First pass: ‘Bag of words’ (basic)
To start, we might simply find words in the text that are an exact
match for those from our terminology.
'seaplane', 'powered', 'fixed-wing',
'aircraft', 'capable', 'taking', 'water',
'also', 'land', 'water', 'often', 'air',
'adventure', 'board', 'seaplanes'
First pass: ‘Bag of words’ (basic) - Problem
But we’ve immediately hit a snag: what about plurals?
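The exact-match pass, and the plural problem it runs into, can be sketched as follows. The terminology set is an assumption based on the vehicle hierarchy used earlier; the bag of words is the one produced by stop word removal.

```python
# Assumed terms from the example vehicle terminology.
TERMINOLOGY = {"vehicles", "air", "land", "sea", "plane", "seaplane"}

bag = ['seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking',
       'water', 'also', 'land', 'water', 'often', 'air', 'adventure',
       'board', 'seaplanes']

# Keep only the words that are an exact match for a term.
matches = [word for word in bag if word in TERMINOLOGY]
print(matches)  # ['seaplane', 'land', 'air']
```

The plural 'seaplanes' is missed because it is not an exact match for 'seaplane'.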
Morphology
We often find groups of words that all refer to the same idea, only with
slight grammatical differences.
For example, fly, flies, flew, flown and flying all refer to the idea of
moving through the air.
These sets of words are known as lexemes, and we often choose a
single word, or a lemma, to represent the whole set when, for
example, listing words in a dictionary, e.g. fly.
Morphological analysis is, in part, the process of finding this
canonical (or accepted base) form of a word.
Stemming vs. Lemmatization
When conducting named entity recognition, it is often important not
to look at words directly, but to instead look at their lemmas.
This is because it is more likely to be the lemma that is listed in a
terminology.
There are two approaches to deriving lemmas. The first, stemming,
is simple, and involves removing suffixes from a word, while the
second, lemmatization, follows more complex processes to derive a
lemma.
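The difference between the two approaches can be sketched without any library. Both functions below are deliberately crude stand-ins: the suffix list and the lookup table are assumptions, and real stemmers and lemmatizers are considerably more involved.

```python
def naive_stem(word):
    """Strip the first matching suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup table standing in for a lemmatizer's dictionary.
LEMMAS = {
    "seaplanes": "seaplane",
    "taking": "take",
    "flies": "fly",
    "flew": "fly",
    "flown": "fly",
    "flying": "fly",
}

def naive_lemma(word):
    """Look the word up; fall back to the word itself if it is unknown."""
    return LEMMAS.get(word, word)

for word in ("seaplanes", "taking", "flew"):
    print(word, naive_stem(word), naive_lemma(word))
```

Suffix stripping is cheap but can produce non-words (such as 'seaplan' or 'tak') and cannot handle irregular forms like 'flew', whereas a dictionary-based approach returns the lemma itself.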
In code: Stemming vs. Lemmatization
Due to its simplicity, the process of stemming can often cause
issues. As such lemmatization may be preferred.
# Porter is a particular algorithm for stemming
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()
print(stemmer.stem("seaplanes"))  # prints: seaplan

# the WordNet lemmatizer needs the 'wordnet' corpus (nltk.download('wordnet'))
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("seaplanes"))  # prints: seaplane
'seaplane', 'powered', 'fixed-wing',
'aircraft', 'capable', 'take', 'water',
'also', 'land', 'water', 'often', 'air',
'adventure', 'board', 'seaplane'
Second pass: ‘Bag of words’ (Stemming)
With our words reduced to their lemmas, we can now properly identify both
instances of seaplane.
Exercise: Uses of word-level processing
Remove stop words and create lemmas from the following text:
Patient has a pain in her left arm. This issue has
been present for several days. A treatment has been
prescribed accordingly.
Note: For stop words, there is no perfect answer here. Different people/systems have different
interpretations of what constitutes a stop word. And this isn’t always based on grammar.
'patient', 'pain', 'left', 'arm', 'issue', 'present',
'several', 'day', 'treatment', 'prescribe', 'accordingly'
Named entity recognition process
There are several different steps to the named entity recognition
process:
Text -> Word-level processing -> Label(s)
Second pass: ‘Bag of words’ (Stemming) - problem
We might decide to label land in the text from our terminology.
However this would be incorrect: in the text, land refers to the action
of landing, whereas our terminology refers to a type of vehicle.
'seaplane', 'powered', 'fixed-wing',
'aircraft', 'capable', 'take', 'water',
'also', 'land', 'water', 'often', 'air',
'adventure', 'board', 'seaplane'
Part-of-speech (POS) tagging
Our bag of words approach doesn’t allow us to appreciate that
words appear in sentences, and that each has a different grammatical
role (e.g. verbs and nouns).
The process of determining the role each word plays in a sentence is
known as part-of-speech (POS) tagging.
It is important to understand whether a word is a noun or a verb, for
example, so we can correctly label entities from our terminology.
In code: Part-of-speech (POS) tagging
# tag each token with its part of speech (expects a list of tokens)
tagged = dict(nltk.pos_tag(text))
print(tagged)
'A': 'DT', 'seaplane': 'NN', 'is': 'VBZ', 'a': 'DT',
'powered': 'JJ', 'fixed-wing': 'NN', 'aircraft': 'NN',
'capable': 'JJ', 'of': 'IN', 'taking': 'VBG', 'off': 'RP',
'from': 'IN', 'water': 'NN', 'It': 'PRP', 'can': 'MD',
'also': 'RB', 'land': 'VB', 'on': 'IN', 'There': 'EX',
'often': 'RB', 'an': 'DT', 'air': 'NN', 'adventure':
'NN', 'about': 'IN', 'those': 'DT', 'who': 'WP',
'board': 'NN', 'seaplanes': 'NNS'
Aside: Markov models
Under the hood, POS tagging is often supported by something known as
a (Hidden) Markov Model.
Because there are multiple grammatical roles a word can have depending
on the sentence, this approach combines rules (e.g. nouns typically follow
adjectives) and frequency information (e.g. how often one type of word is
followed by another) to assign roles probabilistically.
Rules
Frequency
Markov models were also
looked at in the context of
decision support systems.
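As a rough illustration of tagging with a hidden Markov model, the sketch below uses Viterbi decoding, a standard way to find the most probable tag sequence (the slides do not name a specific algorithm). All probabilities are made-up numbers for a toy model, not values estimated from a corpus.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for the words under the model."""
    # best[i][tag] = (probability of the best path ending in tag, previous tag)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, 0.0), p)
                for p in tags
            )
        best.append(column)
    # Trace back from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))

TAGS = ["MD", "VB", "NN"]  # modal, verb, noun
START = {"MD": 0.4, "VB": 0.2, "NN": 0.4}
# How often one type of word is followed by another (toy frequencies).
TRANS = {
    "MD": {"MD": 0.1, "VB": 0.8, "NN": 0.1},
    "VB": {"MD": 0.3, "VB": 0.1, "NN": 0.6},
    "NN": {"MD": 0.4, "VB": 0.3, "NN": 0.3},
}
# How likely each tag is to produce each word (toy values).
EMIT = {
    "MD": {"can": 0.9},
    "VB": {"can": 0.05, "land": 0.3},
    "NN": {"can": 0.05, "land": 0.3},
}

print(viterbi(["can", "land"], TAGS, START, TRANS, EMIT))  # ['MD', 'VB']
```

In this toy model 'land' is equally likely to be a noun or a verb on its own; it is the high chance of a verb following a modal such as 'can' that tips the decision.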
Third pass: POS tagging
Now we know not to label land in our text (it is a verb, whereas our
terminology uses it as a noun).
'seaplane', 'powered', 'fixed-wing',
'aircraft', 'capable', 'take', 'water',
'also', 'land', 'water', 'often', 'air',
'adventure', 'board', 'seaplane'
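The use of tags to filter matches can be sketched as below. The tags are a subset of those shown in the tagging output earlier; treating every term in the terminology as a noun, and the name label_nouns, are assumptions for illustration.

```python
# Terms from the example terminology, all treated as nouns here.
NOUN_TERMS = {"vehicles", "air", "land", "sea", "plane", "seaplane"}

# A subset of the tags shown in the tagging output.
tagged = {"seaplane": "NN", "water": "NN", "land": "VB", "air": "NN", "board": "NN"}

def label_nouns(tagged, terms):
    """Keep words that match a term and are tagged as a noun (NN, NNS, ...)."""
    return [word for word, tag in tagged.items()
            if word in terms and tag.startswith("NN")]

print(label_nouns(tagged, NOUN_TERMS))  # ['seaplane', 'air']
```

The verb 'land' is no longer labelled, but 'air' still is, because it is a noun in both the text and the terminology.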
Exercise: Use of syntax analysis
Identify any potential sources of syntactic ambiguity in our text.
Patient has a pain in her left arm. This issue has
been present for several days. A treatment has been
prescribed accordingly.
left (noun; also: the left, Left, the Left): the left-hand part, side, or direction. "turn to the left"
left (adjective): on, towards, or relating to the side of a human body or of a thing that is to the west when the person or thing is facing north. "her left eye"
Named entity recognition
There are several different steps to the named entity recognition
process:
Text -> Word-level processing -> Analysis of syntactic structures -> Label(s)
Third pass: POS tagging – problem
The word ‘air’ in our text matches a term in our terminology, and
has a matching grammatical form (noun), but means something
different in the text.
A seaplane is a powered fixed-wing
aircraft capable of taking off from
water. It can also land on water.
There is often an air of adventure
about those who board seaplanes.
Final pass: Ontology application
Even with tools like stemming and POS tagging, our named entity
recognition process likely isn’t perfect.
It is at this point that the use of ontologies, which provide us with
more semantic context, can come into play to distinguish, for example,
the two different meanings of the word air.
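One very simple way to picture this is to give each sense of a word a set of related concepts and pick the sense that best matches the neighbouring words. The sense names, the related-concept sets and the one-word window are all assumptions; a real ontology would supply far richer context.

```python
# Hypothetical senses of 'air', each with a few related concepts.
SENSES = {
    "air": {
        "air (type of vehicle)": {"vehicle", "plane", "seaplane", "aircraft"},
        "air (impression)": {"adventure", "mystery", "confidence"},
    }
}

def disambiguate(words, index, window=1):
    """Pick the sense whose related concepts overlap most with nearby words."""
    before = words[max(0, index - window):index]
    after = words[index + 1:index + 1 + window]
    context = set(before + after)
    senses = SENSES[words[index]]
    return max(senses, key=lambda sense: len(senses[sense] & context))

words = ['also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplane']
print(disambiguate(words, words.index('air')))  # air (impression)
```

Because the chosen sense is not the one in the vehicle terminology, this occurrence of 'air' would not be labelled.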
Named entity recognition
There are several different steps to the named entity recognition
process:
Text -> Word-level processing -> Analysis of syntactic structures -> Use of ontological knowledge -> Label(s)
Back to coding…
Hopefully it’s clear how this same procedure can be applied to
medical text.
We can automatically identify labels, and in turn this can tell us
something about the state of the patient being described in a
computable way.
Let’s look at an example of this process…
Example: MedCAT
Terminology
Label
It all comes back to interventions…
If we can automatically interpret (i.e. apply labels to) a patient’s
EHR using natural language processing – which relies on a
terminology and all the concepts that come with it – then the
appearance of specific words can trigger alerts, and in turn inform a
clinician that an intervention is required, or administer an
intervention automatically.
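The last step, from labels to alerts, can be sketched as a simple rule lookup. The rule table and the alert wording are hypothetical; the code is the Type 2 Diabetes code used earlier.

```python
# Hypothetical rules: a code found in a record maps to an alert message.
ALERT_RULES = {
    "44054006": "Type 2 Diabetes recorded: consider whether an intervention is required",
}

def alerts_for(labels):
    """Return the alert message for every label that has a rule."""
    return [ALERT_RULES[label] for label in labels if label in ALERT_RULES]

# 'UNMAPPED' is a placeholder label with no rule attached.
print(alerts_for(["44054006", "UNMAPPED"]))
```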
Summary
Terminologies are languages that allow us to represent the state of
the world.
Terminologies in a clinical context allow us to attribute codes to
patients based upon things such as observed conditions, and record
these in their EHR.
There is no such thing as a universal clinical terminology, so different
terminologies exist to cover different aspects of the clinical domain.
Natural language processing operates in different stages, and levels
of complexity, to assign labels to text.
References and Images
Enrico Coiera. Guide to Health Informatics (3rd ed.). CRC Press, 2015.
https://medcat.rosalind.kcl.ac.uk/
https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/index.html
https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
https://etn-sas.eu/2020/09/23/part-of-speech-tagging-using-hidden-markov-models/
https://www.lego.com/en-gb/service/buildinginstructions/3178
http://angalmond.blogspot.com/2018/03/in-which-i-feel-little-barmy.html
https://www.healthline.com/health/ozone-therapy
https://termbrowser.nhs.uk/
https://www.riomed.com/electronic-patient-records-impact-on-healthcare-industry/
http://www.storagetwo.com/blog/2019/1/greenwich-kids-learn-to-code

Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
Martin Chapman
 
Using Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research SoftwareUsing Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research Software
Martin Chapman
 
Using CWL to support EHR-based phenotyping
Using CWL to support EHR-based phenotypingUsing CWL to support EHR-based phenotyping
Using CWL to support EHR-based phenotyping
Martin Chapman
 
Phenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable PhenotypesPhenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable Phenotypes
Martin Chapman
 
Phenoflow 2021
Phenoflow 2021Phenoflow 2021
Phenoflow 2021
Martin Chapman
 

More from Martin Chapman (20)

Principles of Health Informatics: Artificial intelligence and machine learning
Principles of Health Informatics: Artificial intelligence and machine learningPrinciples of Health Informatics: Artificial intelligence and machine learning
Principles of Health Informatics: Artificial intelligence and machine learning
 
Principles of Health Informatics: Clinical decision support systems
Principles of Health Informatics: Clinical decision support systemsPrinciples of Health Informatics: Clinical decision support systems
Principles of Health Informatics: Clinical decision support systems
 
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
 
Technical Validation through Automated Testing
Technical Validation through Automated TestingTechnical Validation through Automated Testing
Technical Validation through Automated Testing
 
Scalable architectures for phenotype libraries
Scalable architectures for phenotype librariesScalable architectures for phenotype libraries
Scalable architectures for phenotype libraries
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Using AI to autonomously identify diseases within groups of patients
Using AI to autonomously identify diseases within groups of patientsUsing AI to autonomously identify diseases within groups of patients
Using AI to autonomously identify diseases within groups of patients
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Principles of Health Informatics: Evaluating medical software
Principles of Health Informatics: Evaluating medical softwarePrinciples of Health Informatics: Evaluating medical software
Principles of Health Informatics: Evaluating medical software
 
Principles of Health Informatics: Usability of medical software
Principles of Health Informatics: Usability of medical softwarePrinciples of Health Informatics: Usability of medical software
Principles of Health Informatics: Usability of medical software
 
Principles of Health Informatics: Social networks, telehealth, and mobile health
Principles of Health Informatics: Social networks, telehealth, and mobile healthPrinciples of Health Informatics: Social networks, telehealth, and mobile health
Principles of Health Informatics: Social networks, telehealth, and mobile health
 
Principles of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcarePrinciples of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcare
 
Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...
 
Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...
 
Principles of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systemsPrinciples of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systems
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Using Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research SoftwareUsing Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research Software
 
Using CWL to support EHR-based phenotyping
Using CWL to support EHR-based phenotypingUsing CWL to support EHR-based phenotyping
Using CWL to support EHR-based phenotyping
 
Phenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable PhenotypesPhenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable Phenotypes
 
Phenoflow 2021
Phenoflow 2021Phenoflow 2021
Phenoflow 2021
 

Recently uploaded

How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
Celine George
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
MysoreMuleSoftMeetup
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Standardized tool for Intelligence test.
Standardized tool for Intelligence test.Standardized tool for Intelligence test.
Standardized tool for Intelligence test.
deepaannamalai16
 
B. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdfB. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdf
BoudhayanBhattachari
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
RidwanHassanYusuf
 
Leveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit InnovationLeveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit Innovation
TechSoup
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Henry Hollis
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
Celine George
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
S. Raj Kumar
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
Steve Thomason
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
سمير بسيوني
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
nitinpv4ai
 
Stack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 MicroprocessorStack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 Microprocessor
JomonJoseph58
 
Pharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brubPharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brub
danielkiash986
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Fajar Baskoro
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 

Recently uploaded (20)

How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Standardized tool for Intelligence test.
Standardized tool for Intelligence test.Standardized tool for Intelligence test.
Standardized tool for Intelligence test.
 
B. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdfB. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdf
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
 
Leveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit InnovationLeveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit Innovation
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
 
Stack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 MicroprocessorStack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 Microprocessor
 
Pharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brubPharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brub
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 

Principles of Health Informatics: Representing medical knowledge

  • 1. Lecture 11: Representing medical knowledge Dr. Martin Chapman Principles of Health Informatics (7MPE1000). https://martinchapman.co.uk/teaching
  • 2. Lecture structure 1. Terminologies: terms, groups, hierarchies and composition 2. Clinical terminologies and coding 3. Terminologies as models 4. Natural language processing
  • 3. Learning outcomes 1. Be able to define terminologies, and related concepts such as composition 2. Understand the concept of a clinical code 3. Understand how the limitations of models pass to terminologies 4. Be able to list and define the steps in named entity recognition To understand how terminologies and natural language processing support interventions. It all comes back to interventions…
  • 4. Terminologies: terms, groups, hierarchies and composition
  • 5. Terms Previously, we saw how a terminology (a language), and the terms it contains, provides us with a set of labels that we can apply to symbols in the world. In turn this allowed us to represent the state of the world. Term(s) Propellers Plane (has) State
  • 6. Terms vs concepts In certain cases, there may be a single term commonly used for a symbol. For example, propellers are mostly just known as propellers. However, in other cases, there may be multiple terms for a given symbol. For example, a plane can be referred to as one of a number of different things. This flexibility is to be expected, but, in a terminology, we must anchor these terms to a single concept for consistency. Concept Propellers Sometimes these two ideas will be conflated (including by me!), but they are distinct. Plane We may just choose one of the terms for our concept. Term(s) Plane, aeroplane, airliner, aircraft Propellers
  • 7. Aside: The Great British Bread Debate For more evidence that there can be many terms for the same concept…
  • 8. Groups Group Aviation If we need to reference our terms and concepts collectively, we can also group them. For example, the concept of propellers and planes may be collected together in an aviation group. Concept Propellers Plane Term(s) Propellers Plane, aeroplane, airliner, aircraft
  • 9. Problem: Too many terms If we think about all the terms we’ve met so far in relation to a plane… there are lots Vehicles Air Land Plane Seaplane Sinks Water landing Sea
  • 10. Recall: Search space In Lecture 4, we saw how we can use a hierarchical structure to organise a search space. We can use a similar structure to organise and connect our terms: Root node Leaf node e.g. Patient Encounter e.g. Individual diagnosis
  • 12. Hierarchy: Organisation This structure is neat. It permits concept-driven exploration, which is akin to the heuristic search techniques we saw previously. If I am to look for a plane I know I am to explore the descendants of air, for example. Conversely, if we did not know what planes were, this organisation would tell us that planes were a type of air vehicle (and not a type of land vehicle) simply from the structure of the terms themselves. In other words, our hierarchy helps to add meaning to our terms.
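The concept-driven exploration described here can be sketched in a few lines of Python. This is only an illustration: the `HIERARCHY` dictionary and the `path_to` helper are hypothetical names, and the shape of the tree is an assumption based on the vehicle example.

```python
# Hypothetical parent -> children hierarchy, assumed from the vehicle example.
HIERARCHY = {
    "Vehicles": ["Air", "Land", "Sea"],
    "Air": ["Plane"],
    "Plane": ["Seaplane"],
    "Land": [],
    "Sea": [],
    "Seaplane": [],
}


def path_to(term, node="Vehicles"):
    """Return the terms from `node` down to `term`, or None if `term` is absent."""
    if node == term:
        return [node]
    for child in HIERARCHY.get(node, []):
        sub_path = path_to(term, child)
        if sub_path is not None:
            return [node] + sub_path
    return None


# The path itself tells us that a plane is a type of air vehicle.
print(path_to("Plane"))
```

Reading the returned path from left to right gives the meaning that the structure adds to the term.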
  • 13. Hierarchy: Connections We can further add meaning to our terms by dictating that each link (edge) in our hierarchy expresses a particular type of relationship between the two terms on each end of the connection. Is-A (Seaplane, Plane): useful if we’re interested in understanding the type of a given term in our hierarchy. Part-whole (Propeller, Plane): useful if we’re interested in understanding dependencies between terms. Causal (Sinks, Water landing): useful if we’re interested in understanding what else is true if we observe a given term. The type of link between the nodes in our hierarchy needs to be clearly defined to avoid confusion. If a terminology mixes link types, we call it multi-axial.
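One way to make link types explicit is to store each connection together with its type. The sketch below is an assumption about how this might be represented, not a description of any real terminology; `LINKS` and `is_multi_axial` are hypothetical names.

```python
# Each link is stored as (term, link type, term), using the slide's three examples.
LINKS = [
    ("Seaplane", "is-a", "Plane"),
    ("Propeller", "part-whole", "Plane"),
    ("Sinks", "causal", "Water landing"),
]


def is_multi_axial(links):
    """A terminology that mixes more than one link type is multi-axial."""
    link_types = {link_type for _, link_type, _ in links}
    return len(link_types) > 1


print(is_multi_axial(LINKS))      # three different link types are mixed
print(is_multi_axial(LINKS[:1]))  # only 'is-a' links
```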
  • 14. Properties We may not just have terms in our terminology, but also properties. Properties provide more information about the term. When we organise our terms into a hierarchy with links, properties are inherited across these links (i.e. pass from one term to another). Vehicles Air Land Plane Seaplane Sea Wings = 1 Wings = 1
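Inheritance of properties can be sketched by walking from a term up through its ancestors. The names `PARENT`, `OWN_PROPERTIES` and `properties_of` are hypothetical, and the placement of the `wings` property on Plane is an assumption based on the example.

```python
# Hypothetical child -> parent links for the vehicle hierarchy.
PARENT = {"Seaplane": "Plane", "Plane": "Air", "Air": "Vehicles", "Vehicles": None}

# Properties declared directly on a term (assumed: Plane declares wings = 1).
OWN_PROPERTIES = {"Plane": {"wings": 1}}


def properties_of(term):
    """Collect properties from a term and all of its ancestors."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    inherited = {}
    # Apply the most distant ancestor first so that nearer terms can override it.
    for ancestor in reversed(chain):
        inherited.update(OWN_PROPERTIES.get(ancestor, {}))
    return inherited


print(properties_of("Seaplane"))  # inherits wings from Plane
print(properties_of("Air"))       # nothing declared at or above Air
```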
  • 15. Problem: Too many terms (again) As terminologies grow and more terms are added, we once again – even with our hierarchical structure – end up with an unwieldy number of terms and, in some cases, even potential term overlap (in a way that can’t be reconciled using a unifying concept). Vehicles Air Land Plane Seaplane Sea Toy plane Model plane Biplane Cargo plane
  • 16. Composition Another school of thought is to, instead, agree on a fixed set of basic terms, and to compose newer, more complex terms from these basic terms. But we need rules for how terms can be composed… Vehicles Air Land Plane Sea Seaplane
  • 17. Recall: Ontology We saw previously how ontologies provide us with such restrictions. We can also introduce restrictions using our ‘is-a’ relationships.
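As a rough sketch of composition under rules, the code below builds a complex term from two basic terms only when an explicit rule permits the combination. The rule set `ALLOWED` and the `compose` helper are hypothetical and stand in for the restrictions an ontology would provide.

```python
# A fixed set of basic terms (assumed from the vehicle example).
BASIC_TERMS = {"Air", "Land", "Sea", "Plane"}

# Hypothetical rules: which (modifier, base) pairs may be composed.
ALLOWED = {("Sea", "Plane")}


def compose(modifier, base):
    """Compose a new term from two basic terms, enforcing the rules."""
    if modifier not in BASIC_TERMS or base not in BASIC_TERMS:
        raise ValueError("both parts must be basic terms")
    if (modifier, base) not in ALLOWED:
        raise ValueError(f"composition of {modifier} and {base} is not permitted")
    return modifier + base.lower()


print(compose("Sea", "Plane"))
```

A call such as `compose("Land", "Sea")` raises an error, which is the kind of checking that makes composition more costly to set up.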
  • 18. Composition and cost Constructing a terminology such that its terms can be composed (post-coordinated), including enforcing rules via something like an ontology, obviously has a higher initial cost. But as terminologies grow, this cost is ultimately less than a pre-coordinated terminology (where terms are simply added to the terminology), owing to the fact that hierarchically related terms may need to be updated, and error checking may be complex. We saw this idea in Lecture 3, when we observed that task-oriented and placeholder-oriented structures have a high initial effort associated with them, but can ultimately be beneficial in the long run.
  • 19. Clinical terminologies and coding What does this all have to do with clinical practice?
  • 20. Clinical terminologies Much like the aviation domain, in the clinical domain we need a language that provides us with a consistent way to represent, for example, the state of a patient. Clinical terminologies provide us with this, and have all the same properties of general terminologies that we have just seen. The terms included typically relate to patient diagnosis and procedures performed. Note: While clinical terminologies are technically distinct from medical terminologies (which have a broader remit), you may find the two used interchangeably. Similarly, you may also see terminologies referred to as classification systems.
  • 21. Clinical terminologies Group: Aviation; Cardio-metabolic. Concept: Propellers; Plane; Type 2 Diabetes. Term(s): Propellers; Plane, aeroplane, airliner, aircraft; Type 2 Diabetes, Diabetes, T2DM.
  • 22. Clinical codes We saw earlier that concepts allow us to remain consistent in the face of varying terms. Clinical terminologies take this idea further by assigning a unique code to each concept, which is often used instead. Group: Cardio-metabolic. Concept: Type 2 Diabetes. Term(s): Type 2 Diabetes, Diabetes, T2DM. Code: 44054006. Note: Again, you may see references to medical codes (technically a superset), or indeed diagnosis codes (technically a subset), but these may also be used interchangeably.
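The chain from term to concept to code can be sketched as two lookups. The dictionaries and the `code_for` helper are hypothetical names; the terms, concept and code are the ones shown in the example.

```python
# Many terms anchor to a single concept.
TERM_TO_CONCEPT = {
    "type 2 diabetes": "Type 2 Diabetes",
    "diabetes": "Type 2 Diabetes",
    "t2dm": "Type 2 Diabetes",
}

# Each concept is assigned a unique code.
CONCEPT_TO_CODE = {"Type 2 Diabetes": "44054006"}


def code_for(term):
    """Return the code for a term, or None if the term is not in the terminology."""
    concept = TERM_TO_CONCEPT.get(term.strip().lower())
    return CONCEPT_TO_CODE.get(concept)


print(code_for("T2DM"))
print(code_for("Diabetes"))
```

Both terms resolve to the same code, which is what keeps records consistent when clinicians use different words.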
  • 23. Clinical codes Term(s) Concept Code Hierarchy Hierarchy We’ll come back to this in Lecture 12. Same concept, different terms.
  • 24. Coding Given what we’ve seen, the process of coding is thus labelling, for example, the condition(s) a patient has. Labels are chosen from the terminology (rather than at random) to improve consistency, and are stored somewhere, usually a patient’s Electronic Health Record (EHR). Good clinicians (😊) will code directly, but often coding involves interpreting existing medical text from the EHR. Once codes are derived they are useful for the future interpretation of a patient’s state, and for activities like auditing.
  • 25. Coding errors In Lecture 3, we saw the concept of false positives and false negatives. The same is true of the coding process, when codes are incorrectly assigned, or not assigned, to patients. Coding errors may occur when an EHR is incorrect or incomplete; or when the coder themselves does not correctly interpret the primary reason for an encounter, does not have the correct expertise or makes an entry error.
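If we assume we know the codes a patient should have, coding errors can be found by comparing sets. This is a minimal sketch: `coding_errors` is a hypothetical helper, and `CODE-A` and `CODE-B` are placeholder codes, not real ones.

```python
def coding_errors(true_codes, assigned_codes):
    """Compare the codes a patient should have with the codes actually assigned."""
    true_codes = set(true_codes)
    assigned_codes = set(assigned_codes)
    return {
        # Assigned, but should not have been.
        "false_positives": sorted(assigned_codes - true_codes),
        # Should have been assigned, but were not.
        "false_negatives": sorted(true_codes - assigned_codes),
    }


errors = coding_errors(
    true_codes={"44054006", "CODE-A"},
    assigned_codes={"44054006", "CODE-B"},
)
print(errors)
```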
  • 26. Computers and coding One possible way to address these issues is by introducing computers into the coding process. For example, a computer may help search free text, it may assist during the actual entry of the original information (restricting inputs, for example) or it may run the whole process of deriving codes from free text itself (more later).
  • 28. Terminologies as models Recall that a terminology is (or forms part of) a data model. As such, terminologies have the limitations associated with (information) models like these.
  • 29. Terminology limitations: Simplification Models are always a simplification of the phenomenon they abstract. There is, therefore, no such thing as a single, universal terminology for a given domain, which allows us to adequately label all the symbols we encounter. This is compounded by the fact that not everyone will apply the same label to a given symbol (the symbol grounding problem).
  • 30. Terminology limitations: Simplification Another outcome of the simplification process is that the nuances of concepts are often lost when represented in a terminology: 1. Concepts rarely have a pure definition: one would typically consider someone over 70 years old as elderly, but ‘elderly’ can also technically include pregnant women over a certain age. 2. As a result of the above, the meaning of concepts is often context-dependent. The term elderly in a maternal context will mean something other than it might do typically.
  • 31. Terminology limitations: Snapshot and Purposive A snapshot: Models are always a snapshot of the things they represent, and this snapshot becomes less relevant over time. Therefore, terminologies cannot capture how concepts change over time. Purposive: Models are built for a particular purpose, so it follows that a terminology exists for a particular purpose (e.g. labelling certain types of vehicles), and cannot necessarily be used for other purposes.
  • 32. Multiple clinical terminologies As there is no such thing as a universal terminology, there is not just a single clinical terminology (like SNOMED CT (Systematized Nomenclature of Medicine, Clinical Terms), as we’ve seen), but instead multiple clinical terminologies. Often, these terminologies will try to cover different aspects of the domain. For example, ICD-10 (International Classification of Diseases, 10th Revision) focuses more on disease, while CPT (Current Procedural Terminology) focuses more on interventions taken. However, in many cases the same concept will be covered in multiple terminologies.
  • 33. Terminology Mapping As such, if one institution, which adopts a particular terminology, wants to interpret the data from another institution, which adopts a different terminology, there will be a need to map from one set of codes to another.
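At its simplest, a mapping is a lookup table from the codes of one terminology to the codes of another. The sketch below assumes a single illustrative entry (the SNOMED CT code for type 2 diabetes to the ICD-10 category E11) and is not an official map; `map_codes` and `UNKNOWN-1` are hypothetical.

```python
# Illustrative, single-entry map from SNOMED CT codes to ICD-10 codes.
SNOMED_TO_ICD10 = {"44054006": "E11"}


def map_codes(codes, mapping):
    """Translate codes using the mapping, keeping track of codes with no equivalent."""
    mapped = {}
    unmapped = []
    for code in codes:
        if code in mapping:
            mapped[code] = mapping[code]
        else:
            unmapped.append(code)
    return mapped, unmapped


mapped, unmapped = map_codes(["44054006", "UNKNOWN-1"], SNOMED_TO_ICD10)
print(mapped)
print(unmapped)
```

The list of unmapped codes hints at why mapping can be difficult in practice: not every concept in one terminology has a counterpart in the other.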
  • 34. Composition and mapping While feasible in practice, mapping can, due to the issues with terminologies we’ve seen, be a difficult process. However, if two terminologies have been created by composing terms from a single terminology, this mapping is more straightforward, as they have a common core. Pre-coordinated Post-coordinated
  • 35. Natural Language Processing Note: This is a huge topic, so we will just explore some of the key concepts.
  • 36. Automated coding We saw earlier that computers can help solve issues with the coding process. One way in which they can do this is to take over the coding process entirely. We can generalise this to a computer applying labels from any terminology to any text in order to make that text computable. Let’s return to our plane terminology…
  • 37. Named entity recognition We call this process named entity recognition, a subfield of natural language processing. A seaplane is a powered fixed-wing aircraft capable of taking off from water. It can also land on water. There is often an air of adventure about those who board these vehicles. How do we apply labels from our terminology to this text?
  • 38. Aside: Programming code Over the next few slides, I shall reinforce some of the ideas I show using programming code. Programming code represents a series of steps to solve a problem. If this is ultimately more confusing, you are welcome to skip these slides.
  • 39. Pre-pass: Stop word removal Before we do anything else, we need to remove words we aren’t interested in (e.g. articles). We refer to these words as stop words. A seaplane is a powered fixed-wing aircraft capable of taking off from water. It can also land on water. There is often an air of adventure about those who board seaplanes.
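  • 40. In code: Stop word removal
        import nltk
        from nltk.corpus import stopwords

        # fetch the stop word list if it is not already installed
        nltk.download('stopwords')

        # read the raw text
        with open('text.txt', 'r') as file:
            text = file.read()

        # strip full stops and newlines, lowercase, and split into words
        text = [word.replace(".", "").replace("\n", "").lower() for word in text.split(" ")]

        # get a list of stop words
        stop = stopwords.words('english')

        # keep only the words that are not stop words
        text_no_stop = list(filter(lambda word: word not in stop, text))
        print(text_no_stop)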
  • 41. Pre-pass: Stop word removal 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking', 'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplanes' What we’re ultimately left with after this process is a ‘bag of words’.
  • 42. First pass: ‘Bag of words’ (basic) To start, we might simply find words in the text that are an exact match for those from our terminology. 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking', 'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplanes'
  • 43. First pass: ‘Bag of words’ (basic) - Problem But we’ve immediately hit a snag: what about plurals? 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking', 'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplanes'
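The exact-match first pass, and the snag it hits, can be sketched as follows. The contents of `TERMINOLOGY` are an assumption (lowercased terms from the vehicle example); the bag of words is the one shown above.

```python
# Assumed terminology: lowercased terms from the vehicle example.
TERMINOLOGY = {"vehicles", "air", "land", "sea", "plane", "seaplane"}

bag = ['seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking',
       'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board',
       'seaplanes']

# Keep only the words that exactly match a term in the terminology.
matches = [word for word in bag if word in TERMINOLOGY]
print(matches)

# The plural form is not an exact match, so the second seaplane is missed.
print('seaplanes' in matches)
```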
  • 44. Morphology We often find groups of words that all refer to the same idea, only with slight grammatical differences. For example, fly, flies, flew, flown and flying all refer to the idea of moving through the air. These sets of words are known as lexemes, and we often choose a single word, or a lemma, to represent the whole set when, for example, listing words in a dictionary, e.g. fly. Morphological analysis is, in part, the process of finding this canonical (or accepted base) form of a word.
  • 45. Stemming vs. Lemmatization When conducting named entity recognition, it is often important not to look at words directly, but to instead look at their lemmas. This is because it is more likely to be the lemma that is listed in a terminology. There are two approaches to deriving lemmas. The first, stemming, is simple, and involves removing suffixes from a word, while the second, lemmatization, follows more complex processes to derive a lemma.
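To see why stemming is the simpler of the two approaches, here is a deliberately crude sketch in plain Python. Neither function is a real algorithm: `naive_stem` just strips a few suffixes (it is not the Porter stemmer), and `lookup_lemma` uses a tiny hand-written table, standing in for the more complex processes a real lemmatizer follows.

```python
# Suffixes to strip, checked in this order (a toy rule set).
SUFFIXES = ("ing", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Hand-written lemma table (hypothetical; a real lemmatizer uses a full lexicon).
LEMMAS = {"seaplanes": "seaplane", "flies": "fly", "taking": "take"}


def lookup_lemma(word):
    """Return the lemma if the word is in the table, else the word itself."""
    return LEMMAS.get(word, word)


for word in ("seaplanes", "taking"):
    print(word, naive_stem(word), lookup_lemma(word))
```

The stems produced are not real words, so they are unlikely to appear in a terminology, whereas the lemmas are.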
  • 46. In code: Stemming vs. Lemmatization Due to its simplicity, the process of stemming can often cause issues. As such, lemmatization may be preferred.
      # Porter is a particular algorithm for stemming
      from nltk.stem.porter import PorterStemmer
      stemmer = PorterStemmer()
      print(stemmer.stem("seaplanes"))          # seaplan

      # Lemmatization using WordNet (requires the NLTK 'wordnet' corpus)
      from nltk.stem import WordNetLemmatizer
      lemmatizer = WordNetLemmatizer()
      print(lemmatizer.lemmatize("seaplanes"))  # seaplane
  • 47. Second pass: ‘Bag of words’ (Stemming) With our words stemmed, we can now properly identify both instances of the seaplane. 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'take', 'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplane'
  • 48. Exercise: Uses of word-level processing Remove stop words and create lemmas from the following text: Patient has a pain in her left arm. This issue has been present for several days. A treatment has been prescribed accordingly. Note: For stop words, there is no perfect answer here. Different people/systems have different interpretations of what constitutes a stop word. And this isn’t always based on grammar.
  • 49. Exercise: Uses of word-level processing Answer: 'patient', 'pain', 'left', 'arm', 'issue', 'present', 'several', 'day', 'treatment', 'prescribe', 'accordingly'
  • 50. Named entity recognition process There are several different steps to the named entity recognition process: Text → Word-level processing → Label(s)
  • 51. Second pass: ‘Bag of words’ (Stemming) - problem We might decide to label land in the text from our terminology. However, this would be incorrect: in the text, land refers to the action of landing, whereas our terminology refers to a type of vehicle. 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'take', 'water', 'also', 'land', 'water', 'often', 'air', 'adventure', 'board', 'seaplane'
  • 52. Part-of-speech (POS) tagging Our bag of words approach doesn’t allow us to appreciate that words appear in sentences, and that each has a different grammatical role (e.g. verb or noun). The process of determining the role each word plays in a sentence is known as part-of-speech (POS) tagging. It is important to understand whether a word is a noun or a verb, for example, so we can correctly label entities from our terminology.
  • 53. In code: Part-of-speech (POS) tagging
      # text is a list of word tokens; nltk.pos_tag requires an NLTK tagger model to be downloaded
      tagged = dict(nltk.pos_tag(text))
      print(tagged)
    Output:
      {'A': 'DT', 'seaplane': 'NN', 'is': 'VBZ', 'a': 'DT', 'powered': 'JJ', 'fixed-wing': 'NN', 'aircraft': 'NN', 'capable': 'JJ', 'of': 'IN', 'taking': 'VBG', 'off': 'RP', 'from': 'IN', 'water': 'NN', 'It': 'PRP', 'can': 'MD', 'also': 'RB', 'land': 'VB', 'on': 'IN', 'There': 'EX', 'often': 'RB', 'an': 'DT', 'air': 'NN', 'adventure': 'NN', 'about': 'IN', 'those': 'DT', 'who': 'WP', 'board': 'NN', 'seaplanes': 'NNS'}
  • 54. Aside: Markov models Under the hood, POS tagging is often supported by something known as a (Hidden) Markov Model. Because there are multiple grammatical roles a word can have depending on the sentence, this approach combines rules (e.g. nouns typically follow adjectives) and frequency information (e.g. how often one type of word is followed by another) to assign tags probabilistically. Markov models were also looked at in the context of decision support systems.
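To make the idea concrete, here is a minimal sketch of Viterbi decoding over a toy Hidden Markov Model. All of the probabilities, the coarse tag names (DET, PRON, MODAL, NOUN, VERB) and the `viterbi` function are assumptions invented for illustration; they are not taken from the lecture or from any real tagger.

```python
STATES = ["DET", "PRON", "MODAL", "NOUN", "VERB"]

# Toy probabilities (assumed values, not estimated from any corpus).
START_P = {"DET": 0.4, "PRON": 0.4, "NOUN": 0.1, "VERB": 0.05, "MODAL": 0.05}
TRANS_P = {  # how often one kind of word follows another
    "DET":   {"NOUN": 0.9, "VERB": 0.05, "MODAL": 0.05},
    "PRON":  {"MODAL": 0.5, "VERB": 0.4, "NOUN": 0.1},
    "MODAL": {"VERB": 0.8, "NOUN": 0.1, "PRON": 0.1},
    "NOUN":  {"VERB": 0.5, "MODAL": 0.3, "NOUN": 0.2},
    "VERB":  {"DET": 0.5, "NOUN": 0.3, "PRON": 0.2},
}
EMIT_P = {  # how often a tag is realised as a given word
    "DET":   {"the": 1.0},
    "PRON":  {"it": 1.0},
    "MODAL": {"can": 1.0},
    "NOUN":  {"land": 0.5, "can": 0.5},
    "VERB":  {"land": 0.7, "can": 0.3},
}

def viterbi(words, states, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p.get(s, 0.0) * emit_p[s].get(words[0], 0.0), None)
             for s in states}]
    for word in words[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p].get(s, 0.0), p)
                             for p in states)
            column[s] = (prob * emit_p[s].get(word, 0.0), prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]

print(viterbi(["it", "can", "land"], STATES, START_P, TRANS_P, EMIT_P))
print(viterbi(["the", "land"], STATES, START_P, TRANS_P, EMIT_P))
```

With these toy numbers, land is tagged as a verb after "it can" and as a noun after "the", which shows how the surrounding words change the most probable role of the same word.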
  • 55. Third pass: POS tagging Now we know not to label land in our text (it is a verb, whereas our terminology uses it as a noun). 'seaplane', 'powered', 'fixed-wing', 'aircraft', 'capable', 'taking', 'water', 'also', 'land', 'water', 'seaplanes', 'divided', 'different', 'categories', 'based', 'technological', 'characteristics'
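A sketch of how the tags might be used to filter labels follows. The tags are copied from the tagger output shown earlier in the lecture. The contents of `TERMINOLOGY` (each term paired with its grammatical role) and the helper names are assumptions for illustration.

```python
# Penn Treebank tags copied from the tagger output shown earlier.
tagged = {'seaplane': 'NN', 'aircraft': 'NN', 'water': 'NN',
          'land': 'VB', 'air': 'NN', 'seaplanes': 'NNS'}

# Hypothetical terminology: term -> grammatical role the term has there.
TERMINOLOGY = {'seaplane': 'noun', 'aircraft': 'noun',
               'land': 'noun', 'air': 'noun'}

def coarse_role(tag):
    """Map a Penn Treebank tag to a coarse grammatical role."""
    if tag.startswith('NN'):
        return 'noun'
    if tag.startswith('VB'):
        return 'verb'
    return 'other'

def label(tagged_words, terminology):
    """Keep words that match a term AND share its grammatical role."""
    return [word for word, tag in tagged_words.items()
            if terminology.get(word) == coarse_role(tag)]

labels = label(tagged, TERMINOLOGY)
print(labels)
```

Here land is rejected because it is tagged as a verb, whereas air is still labelled because it is a noun in both the text and the assumed terminology.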
  • 56. Exercise: Use of syntax analysis Identify any potential sources of syntactic ambiguity in our text. Patient has a pain in her left arm. This issue has been present for several days. A treatment has been prescribed accordingly.
  • 57. Exercise: Use of syntax analysis Answer: the word left. As a noun, left means the left-hand part, side, or direction ("turn to the left"). As an adjective, left means on, towards, or relating to the side of a human body or of a thing that is to the west when the person or thing is facing north ("her left eye").
  • 58. Named entity recognition There are several different steps to the named entity recognition process: Text → Word-level processing → Analysis of syntactic structures → Label(s)
  • 59. Third pass: POS tagging – problem The word ‘air’ in our text matches a term in our terminology, and has a matching grammatical form (noun), but means something different in the text. A seaplane is a powered fixed-wing aircraft capable of taking off from water. It can also land on water. There is often an air of adventure about those who board these vehicles.
  • 60. Final pass: Ontology application Even with tools like stemming and POS tagging, our named entity recognition process likely isn’t perfect. It is at this point that the use of ontologies, which provide us with more semantic context, can come into play to tell us, for example, the two different meanings of the word air.
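One simple way to picture this step is to pick the meaning whose related concepts overlap most with the surrounding words. This is only a sketch of that overlap heuristic, not necessarily how any particular tool applies an ontology. The mini-ontology `SENSES` and the `disambiguate` helper are hypothetical.

```python
# Hypothetical mini-ontology: each sense of a word lists related concepts.
SENSES = {
    "air": {
        "air (gas / atmosphere)": {"atmosphere", "wind", "breathe", "sky", "fly"},
        "air (impression / manner)": {"adventure", "mystery", "confidence", "manner"},
    }
}

def disambiguate(word, context_words, senses):
    """Return the sense with the largest context overlap, or None if there is none."""
    context = set(context_words)
    scores = {sense: len(related & context)
              for sense, related in senses[word].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Words surrounding 'air' in the bag of words from the seaplane text.
context = ['often', 'air', 'adventure', 'board', 'seaplanes']
sense = disambiguate("air", context, SENSES)
print(sense)
```

Because adventure appears in the context, the impression sense is chosen, so a term for air in the atmospheric sense would not be applied as a label here.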
  • 61. Named entity recognition There are several different steps to the named entity recognition process: Text → Word-level processing → Analysis of syntactic structures → Use of ontological knowledge → Label(s)
  • 62. Back to coding… Hopefully it’s clear how this same procedure can be applied to medical text. We can automatically identify labels, and in turn this can tell us something about the state of the patient being described in a computable way. Let’s look at an example of this process…
  • 64. It all comes back to interventions… If we can automatically interpret (i.e. apply labels to) a patient’s EHR using natural language processing – which relies on a terminology and all the concepts that come with it – then the appearance of specific words can trigger alerts, and in turn inform a clinician that an intervention is required, or administer an intervention automatically.
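The final link from labels to alerts can be sketched as a simple rule lookup. The rule table and the `alerts_for` helper are hypothetical and purely illustrative; they are not clinical guidance. The labels are the ones derived in the earlier word-level processing exercise.

```python
# Hypothetical rule table: label -> alert message (illustrative only).
ALERT_RULES = {
    "pain": "Notify clinician: pain reported",
}

def alerts_for(labels, rules):
    """Return the alert messages triggered by the given labels, in order."""
    return [rules[label] for label in labels if label in rules]

# Labels taken from the word-level processing exercise earlier in the lecture.
labels = ['patient', 'pain', 'left', 'arm', 'issue', 'present',
          'several', 'day', 'treatment', 'prescribe', 'accordingly']

alerts = alerts_for(labels, ALERT_RULES)
print(alerts)
```

In practice the keys would more plausibly be clinical codes from a terminology rather than raw words, so that different terms for the same concept trigger the same alert.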
  • 65. Summary Terminologies are languages that allow us to represent the state of the world. Terminologies in a clinical context allow us to attribute codes to patients based upon things such as observed conditions, and record these in their EHR. There is no such thing as a universal clinical terminology, so different terminologies exist for different domains. Natural language processing operates in different stages, and levels of complexity, to assign labels to text.