Mining Electronic Health Records
Go Beyond Ontology Based Text Mining
October 15th 2015
Mining Electronic Health Records #110/16/2015
Ontotext
• Information management company providing text analysis, data management and state-of-the-art semantic technology
• 70 software developers in Sofia, Bulgaria
• Presence in London and New York
• Clients include BBC, FT, AstraZeneca, DoD, Wiley & Sons
• Over 400 person-years in R&D to create a one-stop shop for:
– Content enrichment
– Data management
– Graph database engine
Technology Portfolio
Clients
Healthcare Insights
Ontology Based IE
• An ontology models a discrete knowledge domain
• All ontology concepts have a definition
• All ontology concepts have alternative labels
• Where appropriate, ontology concepts have additional labels
• Inference can be applied
[Diagram: example ontology fragment]
COPD: skos:prefLabel "Chronic Obstructive Pulmonary Disease"; skos:altLabel "COLD", "Chronic Airflow Obstruction"
COPD hasSymptom Shortness of Breath
Shortness of Breath rdf:type Symptom
Other nodes: Respiratory Disease, Disease (linked via rdf:type)
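A minimal Python sketch of how the labels and inference in this fragment could drive extraction. The `ex:` identifiers, the `rdfs:subClassOf` edge between Respiratory Disease and Disease, and both helper functions are assumptions made for illustration; they are not taken from the slides or from any Ontotext API.

```python
from collections import defaultdict

# Hypothetical triples modelled on the COPD fragment.
TRIPLES = [
    ("ex:COPD", "skos:prefLabel", "Chronic Obstructive Pulmonary Disease"),
    ("ex:COPD", "skos:altLabel", "COLD"),
    ("ex:COPD", "skos:altLabel", "Chronic Airflow Obstruction"),
    ("ex:COPD", "rdf:type", "ex:RespiratoryDisease"),
    # Assumed edge: the exact link between these two nodes is not recoverable.
    ("ex:RespiratoryDisease", "rdfs:subClassOf", "ex:Disease"),
    ("ex:COPD", "ex:hasSymptom", "ex:ShortnessOfBreath"),
    ("ex:ShortnessOfBreath", "skos:prefLabel", "Shortness of Breath"),
    ("ex:ShortnessOfBreath", "rdf:type", "ex:Symptom"),
]


def build_gazetteer(triples):
    """Map every lower-cased preferred or alternative label to its concept."""
    gazetteer = {}
    for subject, predicate, obj in triples:
        if predicate in ("skos:prefLabel", "skos:altLabel"):
            gazetteer[obj.lower()] = subject
    return gazetteer


def inferred_types(concept, triples):
    """Return asserted types plus superclasses reachable via rdfs:subClassOf."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rdfs:subClassOf":
            parents[subject].add(obj)
    types = {o for s, p, o in triples if s == concept and p == "rdf:type"}
    frontier = list(types)
    while frontier:
        cls = frontier.pop()
        for parent in parents[cls]:
            if parent not in types:
                types.add(parent)
                frontier.append(parent)
    return types


gazetteer = build_gazetteer(TRIPLES)
concept = gazetteer["cold"]  # an alternative label resolves to the concept
print(concept, sorted(inferred_types(concept, TRIPLES)))
```

A mention of "COLD" in text resolves to the COPD concept, and the subclass walk lets a search for any Disease find it as well.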
Ontology Based IE - problems
• Ontologies do not model a domain completely (at both the instance level and the label level)
→ Extend ontologies
→ Ontology enrichment via instance mappings
• Labels contain additional qualifying information
→ Definition of rewrite and ignore rules for literals
• Labels do not reflect natural language
→ Apply “flexible” gazetteers
• Ambiguity in terminology
→ Pre-filtering
→ Ranking
→ Semantic instance mappings
Vocabulary Enrichment – Semantic Mappings
ICD 10 CM:
– Chronic obstructive airway disease NOS
– Chronic obstructive lung disease NOS
– Chronic obstructive pulmonary disease, unspecified
skos:closeMatch
SNOMED CT US:
– Chronic obstructive lung disease
– Chronic obstructive airways disease NOS
– Chronic obstructive lung disease (disorder)
– CAFL - Chronic airflow limitation
– Chronic irreversible airway obstruction
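One way to read this slide: once two concepts are linked with skos:closeMatch, each can borrow the other's labels for matching. The sketch below illustrates that idea only. The concept keys are made-up placeholders, the split of labels between the two vocabularies is assumed from the slide layout, and `enrich_labels` is a hypothetical helper.

```python
# Placeholder concept keys; the labels are the ones listed on the slide.
LABELS = {
    "icd10cm:copd": [
        "Chronic obstructive airway disease NOS",
        "Chronic obstructive lung disease NOS",
        "Chronic obstructive pulmonary disease, unspecified",
    ],
    "snomedct:copd": [
        "Chronic obstructive lung disease",
        "Chronic obstructive airways disease NOS",
        "Chronic obstructive lung disease (disorder)",
        "CAFL - Chronic airflow limitation",
        "Chronic irreversible airway obstruction",
    ],
}

# Each pair stands for one skos:closeMatch mapping between vocabularies.
CLOSE_MATCHES = [("icd10cm:copd", "snomedct:copd")]


def enrich_labels(labels, close_matches):
    """Give each mapped concept the labels of its close match (one hop only)."""
    enriched = {concept: set(names) for concept, names in labels.items()}
    for left, right in close_matches:
        enriched[left] |= set(labels[right])
        enriched[right] |= set(labels[left])
    return enriched


enriched = enrich_labels(LABELS, CLOSE_MATCHES)
print(len(enriched["icd10cm:copd"]), "labels after enrichment")
```

The merge is deliberately one hop: skos:closeMatch is not transitive, so labels are not propagated along chains of mappings.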
Vocabulary Enrichment – Synonym Enrichment
Tumor
Tumour
Abdomen
Abd
Tumor of abdomen
Tumor of abd
Tumour of abdomen
Tumour of abd
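The four label variants above are the cross product of the token-level synonyms. A minimal sketch, assuming a small hand-written synonym table and whitespace tokenization (both illustrative, not the deck's actual resources):

```python
from itertools import product

# Hypothetical token-level synonym table (spelling variant and abbreviation).
SYNONYMS = {
    "tumor": ["tumor", "tumour"],
    "abdomen": ["abdomen", "abd"],
}


def expand_label(label, synonyms):
    """Generate every label variant by substituting token synonyms."""
    options = [synonyms.get(token.lower(), [token]) for token in label.split()]
    return [" ".join(combination) for combination in product(*options)]


variants = expand_label("Tumor of abdomen", SYNONYMS)
for variant in variants:
    print(variant)
```

Substituted tokens come out lower-cased because the table is keyed on lower-case forms; tokens without an entry are passed through unchanged.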
Ontology Based IE – example
Flexible Gazetteers
• Pre-coordinated terms cannot match all natural language terms, especially those used in narrative medical text!
– Inversions: concept “knee injury” vs. “injury of knee” in text
– Gaps due to additional qualifiers: concept “periorbital swelling” vs. “periorbital soft tissue swelling” in text
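A minimal sketch of order-insensitive, gap-tolerant matching that covers both cases. The stop-word list, the `max_gap` parameter and the function names are assumptions for illustration; a production gazetteer would also need stemming and smarter tokenization.

```python
import re

STOPWORDS = {"of", "the", "in"}  # illustrative list, ignored in concept labels


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def flexible_match(label, text, max_gap=2):
    """Find the label's content words in any order within a short window.

    Up to `max_gap` extra tokens (including stop words) may sit between them.
    Returns the matched text span, or None if there is no match.
    """
    wanted = set(tokenize(label)) - STOPWORDS
    tokens = tokenize(text)
    max_len = len(wanted) + max_gap
    for start in range(len(tokens)):
        if tokens[start] not in wanted:
            continue
        last_end = min(start + max_len, len(tokens))
        for end in range(start + len(wanted), last_end + 1):
            if wanted <= set(tokens[start:end]):
                return " ".join(tokens[start:end])
    return None


print(flexible_match("knee injury", "Patient has injury of knee after fall"))
print(flexible_match("periorbital swelling",
                     "Mild periorbital soft tissue swelling noted"))
```

Bounding the window keeps unrelated mentions apart: "knee pain and a later head injury" does not match "knee injury" with the default gap.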
Detection of negations
• The ability to reliably identify negated medical statements in text may significantly affect the quality of the extracted information.
– Adverbial negation
– Negation in noun phrases
– Prepositional negation
– Adjective negation
– Verb negation
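A simplified trigger-based sketch of negation detection, in the spirit of rule-based approaches such as NegEx rather than the method used in the deck. The trigger words, their assignment to the five classes and the `window` size are illustrative assumptions.

```python
import re

# Illustrative triggers per negation class; real systems use far larger lists.
NEGATION_TRIGGERS = {
    "adverbial": ["not", "never"],
    "noun phrase": ["no", "absence of"],
    "prepositional": ["without"],
    "adjective": ["negative for"],
    "verb": ["denies", "denied"],
}


def find_negation(sentence, term, window=5):
    """Return the negation class if a trigger precedes `term`, else None.

    Only the `window` tokens before the term are inspected.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    term_tokens = term.lower().split()
    for i in range(len(tokens) - len(term_tokens) + 1):
        if tokens[i:i + len(term_tokens)] != term_tokens:
            continue
        # Pad with spaces so triggers only match on whole-token boundaries.
        preceding = " " + " ".join(tokens[max(0, i - window):i]) + " "
        for negation_class, triggers in NEGATION_TRIGGERS.items():
            for trigger in triggers:
                if f" {trigger} " in preceding:
                    return negation_class
    return None


print(find_negation("Patient denies chest pain.", "chest pain"))
print(find_negation("No evidence of pneumonia.", "pneumonia"))
print(find_negation("Patient has chest pain.", "chest pain"))
```

This sketch ignores scope terminators such as "but" and triggers that follow the term, both of which matter in real clinical text.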
Temporality Identification
• Temporal resolution for events in clinical notes is crucial for an accurate definition of patient history, current medical condition and assigned treatment.
• Identified temporality classes are:
– Historical
– Hypothetical (“Not particular”)
– Recent
• Temporal expressions must be normalized against the medical document's metadata (date of report/visit)!
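A minimal sketch of normalizing a relative expression against the document date and then assigning a class. The accepted pattern ("N units ago"), the coarse unit lengths and the 30-day threshold for Recent are all assumptions made for illustration; the Hypothetical class is not handled here.

```python
import re
from datetime import date, timedelta

# Coarse approximations; calendar-aware arithmetic is out of scope here.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

RELATIVE_RE = re.compile(r"(\d+) (day|week|month|year)s? ago")


def normalize(expression, document_date):
    """Turn 'N units ago' into an absolute date relative to the document date."""
    match = RELATIVE_RE.fullmatch(expression.strip().lower())
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    return document_date - timedelta(days=amount * UNIT_DAYS[unit])


def temporality_class(event_date, document_date, recent_days=30):
    """Classify an event as Recent or Historical (threshold is an assumption)."""
    age_in_days = (document_date - event_date).days
    return "Recent" if age_in_days <= recent_days else "Historical"


report_date = date(2015, 10, 16)  # hypothetical date of report/visit
onset = normalize("2 weeks ago", report_date)
print(onset, temporality_class(onset, report_date))
```

Without the report date, "2 weeks ago" cannot be placed on the patient timeline at all, which is why the metadata matters.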
Temporality Identification - Example
Post-coordination Patterns
• It is impossible to fully describe medical knowledge in terms of fully qualified concepts!
• Natural language does not follow the standardized descriptions defined by domain ontologies!
• Concepts must describe basic entities
• Entity properties can be described by different qualifier classes
• Patterns can generate new concepts by combining specific instance and qualifier classes
Post-coordination Patterns - Examples
• Example pattern:
<disease> or <morphologic abnormality> as the rightmost concept in a noun phrase, preceded by <qualifier> and <body structure>
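A minimal sketch of applying this pattern to a noun phrase that has already been annotated with semantic classes. The input format, the class names as plain strings and the output dictionary are assumptions for illustration; the order of qualifier and body structure is not enforced here.

```python
HEAD_CLASSES = {"disease", "morphologic abnormality"}


def post_coordinate(noun_phrase):
    """Compose a new concept from an annotated noun phrase.

    `noun_phrase` is a list of (text, semantic_class) pairs in text order.
    Returns a dict describing the composed concept, or None if the pattern
    does not apply.
    """
    if len(noun_phrase) < 3:
        return None
    *modifiers, (head_text, head_class) = noun_phrase
    if head_class not in HEAD_CLASSES:
        return None
    qualifiers = [text for text, cls in modifiers if cls == "qualifier"]
    structures = [text for text, cls in modifiers if cls == "body structure"]
    if not qualifiers or not structures:
        return None
    return {
        "head": head_text,
        "body_structure": structures,
        "qualifier": qualifiers,
    }


# Hypothetical annotations for the phrase "left kidney cyst".
phrase = [
    ("left", "qualifier"),
    ("kidney", "body structure"),
    ("cyst", "morphologic abnormality"),
]
print(post_coordinate(phrase))
```

The composed concept does not need to exist in the ontology; it is assembled from basic entities and qualifier classes that do.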
Data Modeling
• Based on normalized data
• … but allowing extension with free text
• Allow data fusion with background knowledge
• Capture all aspects of the extracted information
• Tightly coupled with the context
• Provide provenance and confidence score
• Explorable! Not just searchable
Data Modeling
Data provenance: graph <http://linkedlifedata.com/resource/document/CD8672>
Patient XYZ
– rdf:type: Patient
– hasGender: male
– hasBirthDate: 1956/09/20 (xsd:date)
– hasDiagnose: http://linkedlifedata.com/resource/icd9cm/157.9
http://linkedlifedata.com/resource/icd9cm/157.9
– rdf:type: Disease
– skos:prefLabel: Malignant neoplasm of pancreas
– hasStatus: current
Data provenance: graph <http://linkedlifedata.com/resource/document/CN127753>
Patient XYZ
– hasTreatment: http://linkedlifedata.com/resource/treatment/DT127753
http://linkedlifedata.com/resource/treatment/DT127753
– rdf:type: Treatment
– hasDrug: http://linkedlifedata.com/resource/drug/irinotecan
– hasDosage: 180 mg/ 1 m2 for 80 min
Data Modeling – KB
Data provenance: graph <http://linkedlifedata.com/resource/drugBroshure/CAMPTOSAR>
Maximum Daily Dosage
http://linkedlifedata.com/resource/drugDosage/DD127753
– rdf:type: Dosage
– hasMedication: http://linkedlifedata.com/resource/drug/irinotecan
– hasPopulationGroup: Adult
– hasAdministration Route: http://linkedlifedata.com/resource/route/subcutaneus
– hasAdministration Form: http://linkedlifedata.com/resource/form/injection
– hasIndication: http://linkedlifedata.com/resource/icd9cm/157.9
– hasDosageValue: 180
– hasDosageUnit: mg
– hasDenominatorValue: 1
– hasDenominatorUnit: m2
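A minimal sketch of the kind of data fusion this model enables: comparing a dosage extracted from a clinical note with the maximum daily dosage held in the knowledge base. The dictionary layout, the regular expression and both function names are assumptions for illustration; the values are the ones shown on the Data Modeling slides, and this is not clinical guidance.

```python
import re

# Knowledge-base record, flattened into a dict for this sketch.
KB_MAX_DOSAGE = {
    "medication": "http://linkedlifedata.com/resource/drug/irinotecan",
    "value": 180.0,
    "unit": "mg",
    "denominator_value": 1.0,
    "denominator_unit": "m2",
    "graph": "http://linkedlifedata.com/resource/drugBroshure/CAMPTOSAR",
}

DOSAGE_RE = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+\d*)\s*/\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+\d*)"
)


def parse_dosage(text):
    """Parse strings such as '180 mg/ 1 m2 for 80 min' into numbers and units."""
    match = DOSAGE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized dosage: {text!r}")
    value, unit, denominator_value, denominator_unit = match.groups()
    return float(value), unit, float(denominator_value), denominator_unit


def exceeds_maximum(prescribed, kb_record):
    """Return True if the prescribed dosage is above the knowledge-base maximum."""
    value, unit, denominator_value, denominator_unit = parse_dosage(prescribed)
    if (unit, denominator_unit) != (kb_record["unit"], kb_record["denominator_unit"]):
        raise ValueError("unit mismatch; conversion is out of scope for this sketch")
    prescribed_rate = value / denominator_value
    maximum_rate = kb_record["value"] / kb_record["denominator_value"]
    return prescribed_rate > maximum_rate


print(exceeds_maximum("180 mg/ 1 m2 for 80 min", KB_MAX_DOSAGE))
```

Keeping the `graph` value next to each fact is what lets a reviewer trace a flagged dosage back to the document or drug brochure it came from.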
Semantic Data Exploration and Mining
• Build Linked Data out of extracted facts and background knowledge
• Semantic Faceted Search
• Cross Entity Search & Exploration
• Expert Text Mining Search in pre-annotated documents
– Combine semantic annotations with PoS elements
– Identify post-coordination patterns
– Identify relation patterns
– Query expansion using background knowledge
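A minimal sketch of query expansion using background knowledge: a search for a broad concept is widened to the labels of everything beneath it. The hierarchy, the `ex:` identifiers and the label lists are hypothetical and only echo the COPD example from earlier in the deck.

```python
# Hypothetical background knowledge: narrower concepts and their labels.
NARROWER = {
    "ex:RespiratoryDisease": ["ex:COPD"],
    "ex:COPD": [],
}
LABELS = {
    "ex:RespiratoryDisease": ["respiratory disease"],
    "ex:COPD": [
        "chronic obstructive pulmonary disease",
        "COLD",
        "chronic airflow obstruction",
    ],
}


def expand_query(concept, narrower, labels):
    """Collect the labels of a concept and of all its narrower concepts."""
    seen, stack, terms = set(), [concept], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        terms.extend(labels.get(current, []))
        stack.extend(narrower.get(current, []))
    return terms


expanded_terms = expand_query("ex:RespiratoryDisease", NARROWER, LABELS)
print(" OR ".join(f'"{term}"' for term in expanded_terms))
```

A document that only mentions "COLD" is then returned for a respiratory disease query, even though the query string never appears in it.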
• Information Extraction from EHRs is still a challenge!
• Making use of the extracted data is even more challenging ☺
• Ontotext provides the technology stack to make it work!
life-sciences@ontotext.com
Thank you!
