Keynote presentation for the Mobilizing Computable Biomedical Knowledge (MCBK) Conference 2021, looking in particular at the emerging trends of cognitive assistants, personal health knowledge graphs, and meta descriptions for knowledge resources. Examples are taken from the RPI-IBM project on Health Empowerment by Analytics, Learning, and Semantics (HEALS) and the NIEHS project with RPI, MSSM, and Columbia on the Human Health Exposure Analysis Resource (HHEAR) Data Center.
2. Towards (More) Computable Knowledge
Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer, Cognitive, and Web Sciences
Rensselaer Polytechnic Institute, Troy, NY, USA
July 20, 2021
For Mobilizing Computable Biomedical Knowledge (MCBK) 2021
3. Themes
• Recommender Systems / Cognitive Assistants
– Here today … improved impact with enhanced explanation and usability
• Personal Knowledge Stores
– Emerging … improved impact with enhanced access control, interoperability, tooling infrastructure, …
• Meta descriptions for use/reuse, embedded assumptions, use cases, building for longevity
– Emerging … improved impact with increased (de facto) standards for meta descriptions, methodologies, …
McGuinness 7/20/21
4. Health Empowerment by Analytics, Learning and Semantics (HEALS)
Using AI to relate behaviors to medical-condition knowledge to produce personalized disease-prevention and disease-management insights and explainable data-driven recommendations
5. Rensselaer-IBM HEALS Current Thrusts
• Personal Health Recommendations: Empowering diabetic patients to improve their health behaviors by tailoring health information and recommendations to their lifestyles and preferences.
• Semantics for Health and Clinical Reasoning: Enabling provider trust in clinical decision support system recommendations.
• Improving health equity through Machine Learning, Fairness, and Semantics *
6. HEALS Personalized Diabetes Advisor
Combines personal health information, a personal food diary, the Food Knowledge Graph, and ADA diabetes guidelines to support lifestyle change.
[Architecture diagram components: User; Dietary Needs & Preferences; CPG & Dietary Guidelines; Time Series Summarizer; Personal Food Log; Food KG; Personal Health Knowledge Graph; Semantic Reasoner; Personal Health Records; Semantic Data Dictionary; User Data; Recipe Recommendations; User Question; Expanded User Question; Knowledge Base Q&A]
7. Diabetes Advisor Capabilities
• Personal health profile, food likes and dislikes, dietary restrictions, etc.
• Timely alerts and reminders
• Specific food suggestions
• Time series analysis of food diary
• Specific guidelines that apply, with performance evaluation
8. Diabetes Advisor Explainability
Beyond helping users trust specific recommendations, explanations promote health literacy and patient understanding of conditions and guidelines that apply to them, contributing to positive lifestyle change. Informed by interviews with dieticians.
• Rationales for specific recommendations (future)
• Conversational interface to request information and practical suggestions
• Explanations for why specific guidelines apply, with attribution to source material
User query: “suggest a good lunch dish without meat”
System response: [shown as a screenshot on the original slide]
9. FoodKG: A Semantics-Driven KG for Recommendations
• We create a single, cohesive food knowledge graph:
– Brings together recipes, nutrition, food vocabularies, etc.
– Links into existing ontologies / vocabularies
– Straightforward to use: modular and reusable
– Maintains provenance for every assertion
– Addresses use cases concerning personalized recommendations, including potential substitutions and explanations
• Primary Content:
– Recipes: Recipe1M (Marin et al. 2018), > 1 million recipes (includes steps, sources, preparation time, serving sizes, …)
– Nutrients: United States Department of Agriculture (USDA) Nutrient Database for Standard Reference, ~8k food types with nutrient information
– Food Ontology: FoodOn (Griffiths et al. 2016) https://foodon.org/ "Field to Fork" ontology; covers food-related products and processes, including treatment processes, packaging, …
– Resource paper at ISWC: https://foodkg.github.io/
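The "provenance for every assertion" point above can be illustrated with a minimal sketch in which each statement carries the name of the resource it came from. This is not how FoodKG is implemented; the `Assertion` class, the predicate names, and the sample statements are all hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    obj: str
    source: str  # label of the resource the statement came from (hypothetical)


# Hypothetical sample statements; identifiers and values are illustrative only.
ASSERTIONS = [
    Assertion("recipe:veggie_chili", "hasIngredient", "food:kidney_bean", "Recipe1M"),
    Assertion("recipe:veggie_chili", "hasIngredient", "food:tomato", "Recipe1M"),
    Assertion("food:kidney_bean", "hasNutrient", "nutrient:fiber", "USDA"),
    Assertion("food:kidney_bean", "subClassOf", "food:legume", "FoodOn"),
]


def facts_about(subject, assertions=ASSERTIONS):
    """Return (predicate, object, source) for every statement about subject."""
    return [(a.predicate, a.obj, a.source) for a in assertions if a.subject == subject]


def sources_used(subject, assertions=ASSERTIONS):
    """Return the sorted, de-duplicated sources behind statements about subject."""
    return sorted({a.source for a in assertions if a.subject == subject})
```

Because every fact keeps its source, an explanation shown to a user can cite where each supporting statement originated.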
AMIA 2019 | amia.org
10. Food Knowledge Graph Construction: Nutrients
• Example USDA data with a few food items and nutrients
• Semantic structural representation of a subset of the USDA data
• Extract food products from FoodOn using OntoFox
• The resulting knowledge graph, pruned to display context-relevant features
[Figure labels: Semantic Data Dictionary; Ontology linkages]
Rashid, McCusker, Pinheiro, Bax, Santos, Stingone, Das, McGuinness. The Semantic Data Dictionary–An Approach for Describing and Annotating Data. Data Intelligence, MIT Press pp. 443-486. April 2021 DOI: https://doi.org/10.1162/dint_a_00058
11. Food Explanation Ontology (FEO)
• Post-hoc system for food explainability
• Leverages the Explanation Ontology [1] and the WhatToMake Ontology [2] to create a formalization for food-related explanations
• Focused on three explanation types
– Contextual, Contrastive, Counterfactual
[Diagram: Black Box Recommender System → Recommendation → FEO → Explanation]
[1] Explanation Ontology: https://tetherless-world.github.io/explanation-ontology/index
[2] WhatToMake Ontology: https://foodkg.github.io/whattomake.html
Chari, Seneviratne, Gruen, Foreman, Das, McGuinness; Explanation Ontology: A Model of Explanations for User-Centered AI; Resource Track, 19th International Semantic Web Conference 2020
12. Explanation Types
Explanation Type | Prototypical Question | Example
Contextual | What broader information about the current situation prompted you to suggest this recommendation now? | What if non-food-specific factors contributed to the recommendation?
Contrastive | Why recommend Food A over Food B? | Why should I eat Butternut Squash Soup over Broccoli Cheddar Soup?
Counterfactual | What if we change an input variable in our system? | What if I was pregnant? Would I still be recommended Sushi?
• Task-based evaluation with three competency questions, built around the three explanation types.
• We used a subset of literature-derived explanations from the Explanation Ontology.
Padhiar, Seneviratne, Chari, Gruen, McGuinness. Semantic Modeling for Food Recommendation Explanations. Data Engineering Meets Intelligent Food Cooking Recipes (DÉCOR) Workshop, co-located with IEEE ICDE, April 2021.
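To make the contrastive and counterfactual question types concrete, here is a minimal sketch built on a toy rule-based recommender. FEO itself is an ontology, not this code; the foods, tags, rules, and function names below are hypothetical and are not dietary guidance.

```python
# Toy recommender: hypothetical foods, tags, and rules for illustration only.
FOODS = {
    "Sushi": {"raw_fish"},
    "Butternut Squash Soup": {"vegetable"},
    "Broccoli Cheddar Soup": {"vegetable", "dairy"},
}


def violated_rules(food, user):
    """Return human-readable reasons why food is unsuitable for user."""
    tags = FOODS[food]
    reasons = []
    if user.get("pregnant") and "raw_fish" in tags:
        reasons.append("raw fish is avoided during pregnancy")
    if user.get("lactose_intolerant") and "dairy" in tags:
        reasons.append("contains dairy")
    return reasons


def recommended(food, user):
    """A food is recommended when it violates no rule for this user."""
    return not violated_rules(food, user)


def counterfactual(food, user, change):
    """Answer 'what if <change>?' by re-running with one input altered."""
    altered = {**user, **change}
    return recommended(food, user), recommended(food, altered), violated_rules(food, altered)


def contrastive(food_a, food_b, user):
    """Answer 'why A over B?' with the rules B violates and A does not."""
    a_reasons = set(violated_rules(food_a, user))
    return [r for r in violated_rules(food_b, user) if r not in a_reasons]
```

The counterfactual answer compares the outcome before and after changing a single input, while the contrastive answer reports only the reasons that distinguish the two foods.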
14. Variety of Challenges: Varying Levels of Abstraction / Different Purposes
Granularity and use case mismatch, missing common usage, e.g., “pinch of salt” … we will see this again
15. Starting with the Use Case
Use cases help scope and prioritize
Key Components
• Summary
• Usage Scenario
• Flow of Events
• Activity Diagram
• Competency Questions
• Resources
See examples at Ontology Engineering:
• https://tw.rpi.edu/web/Courses/Ontologies/2020
• https://docs.google.com/document/d/1A2w-xoN5aRwlSoCTEtDsWjs2caYDRD5bANif6icDS6k/edit?usp=sharing
Publish use cases
16. Methodology
[Workflow diagram; components listed below]
• Use Cases
• Existing Ontologies & Vocabularies
• Expert Interviews
• Labkey*, Ontology Fragments
• Ontology Curation (ongoing)
• Reviewers & Curators
– Ontology Development Team
– Domain collaborators
– Invited experts
– "Consumers" (data analysts)
• Knowledge Graph Integration
– Linking data and metadata content to domain terms
– Linking workflows based on semantic descriptions
• Repository Integration
– Source Datasets
– Analytics source code
– Results
– Publications
• Knowledge-Enhanced Search: finding what is there that might be of use
• Semantic Extract, Transform, Load (SETLr)
• Expert Guidance
• Sources
• Data Reporting Templates
• Data Dictionaries / Codebooks
• Foundational Ontologies/Vocabularies
• Human Aware Data Acquisition Framework / Whyis
• Generated Ontology
– domain concepts
– authoritative vocabularies
– vetted definitions
– supporting citations
17. User-Centered Context- & Knowledge-Enabled Explanations
Chakraborty, P., Kwon, B. C., Dey, S., Dhurandhar, A., Gruen, D., Ng, K., ... & Varshney, K. R. (2020, August). Tutorial on Human-Centered Explainability for Healthcare. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3547-3548).
18. Explanation Types and Needs
Explanation Type | Example
Case-based | What other situations with complex patients have had this recommendation applied?
Contextual | What broader information about the current situation prompted you to suggest this recommendation now?
Contrastive | Why administer this new drug over the one I would typically prescribe?
Counterfactual | What if the patient had a high risk for cardiovascular disease? Would you still recommend the same treatment plan?
Everyday | What are the signs I should be careful to check for in this case?
Scientific | What is the biological basis, particularly the evidence, for this recommendation?
Simulation-based | What would happen if we prescribe this drug to the patient?
Statistical | What percentage of similar patients who received this treatment recovered?
Trace-based | What steps were taken (rules were fired) by the system to generate this recommendation?
Different question types address:
• Why
• Why not
• How might that be
• What if
• How did that happen
• Data
• Output
• Performance
• Others
Chari, Seneviratne, Gruen, McGuinness. “Directions for Explainable Knowledge-Enabled Systems.” In Tiddi, Lecue, Hitzler (eds.), Knowledge Graphs for eXplainable AI: Foundations, Applications and Challenges. Studies on the Semantic Web, 2020.
Liao, Gruen, Miller. “Questioning the AI: Informing Design Practices for Explainable AI User Experiences.” CHI Conference on Human Factors in Computing Systems, 2020.
Chari, Chakraborty, Seneviratne, Ghalwash, Gruen, Sow, McGuinness. “Towards Clinically Relevant Explanations for Type-2 Diabetes Risk Prediction with the Explanation Ontology.” AMIA 2021.
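As an illustration of the trace-based type in the table above ("what rules were fired"), the following minimal sketch runs a toy forward-chaining loop and records every rule firing. The rule names, fact names, and `infer_with_trace` function are hypothetical and do not represent clinical guidance or any system described in the talk.

```python
# Toy rules: (name, required facts, concluded fact).
# Hypothetical placeholders, not clinical guidance.
RULES = [
    ("R1", {"a1c_above_target"}, "review_medication"),
    ("R2", {"review_medication", "on_metformin"}, "consider_second_agent"),
]


def infer_with_trace(facts, rules=RULES):
    """Forward-chain to a fixed point, recording each rule firing in order."""
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, needs, concludes in rules:
            # Fire a rule only when its premises hold and it adds something new.
            if needs <= known and concludes not in known:
                known.add(concludes)
                trace.append((name, sorted(needs), concludes))
                changed = True
    return known, trace
```

The returned trace lists the fired rules in order, which is the raw material a trace-based explanation would present to the user.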
19. Context-Aware Explanations for T2D Comorbidity Risk Prediction
Approach: Use a multi-method approach to provide relevant context, making actionable, patient-specific domain knowledge available as answers to a set of pre-canned, clinically relevant questions.
Methods:
• Risk prediction models
• Post-hoc explainers
• Natural Language Processing Modules
[Diagram nodes: Domain Knowledge Sources; Guidelines; Ontologies; Patient’s Data; Contextualized Entities; Patient’s comorbidity risk; Patient Post-hoc Explanations; Clinician-friendly Dashboard; Clinicians. Edge labels: contextualize; utilize; presented on; interact with; validate and inform]
S Chari, P Chakraborty, M Ghalwash, O Seneviratne, DM Gruen, FS Saiz, CH Chen, PM Rojas, DL McGuinness; “Leveraging Clinical Context for User-Centered Explainability: A Diabetes Use Case”; KDD Data Science in Healthcare Workshop; 2021; Available Online: https://arxiv.org/abs/2107.02359
20. User Centered KG for Health
[Diagram labels: PERSONAL; DISEASE; DIET; ACTIVITIES; Web sources & online forums; Personal health data]
• Data-mining / KDD
• NLP
• Semantic Search
• Semantic Data Integration
21. Personal Health KG
Still early; needs solutions for access control, storage, context modeling, just-in-time content, …
22. Human Health Exposure Analysis Resource (NIEHS)
• Provide access to exposure and health outcome data
• Enable construction of customized, harmonizable datasets
• Facilitate harmonization of HHEAR data with external sources
• Support pooling of data for analysis
Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
Joint with MSSM, Columbia
23. From Datasets to Knowledge Graphs
We move study content, including datasets, into knowledge graphs (KGs). Because variable data in a given KG is normalized according to vocabulary from a collection of standardized ontologies, new datasets combining studies can be generated on demand and made available for pooled analysis.
[Diagram: Dataset study A + Dataset study B → Dataset generated on demand; Dataset study A + Dataset study B → Knowledge Graph → Dataset generated on demand]
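The idea of generating pooled datasets from variables normalized to shared vocabulary can be sketched as follows. This is a minimal illustration, not the HHEAR pipeline: the study names, column names, sample values, and `ex:` term identifiers are hypothetical placeholders, and the sketch does not attempt unit or value harmonization.

```python
# Hypothetical per-study mappings from local variable names to shared terms.
# The ex: identifiers are placeholders, not real ontology terms.
MAPPINGS = {
    "study_A": {"educ": "ex:MaternalEducation", "bwt": "ex:BirthWeight"},
    "study_B": {"Educ_m": "ex:MaternalEducation", "birth_wt_g": "ex:BirthWeight"},
}

# Hypothetical sample records, one per study.
STUDY_DATA = {
    "study_A": [{"educ": "college", "bwt": 3200}],
    "study_B": [{"Educ_m": "high school", "birth_wt_g": 2950}],
}


def to_graph(study_data, mappings):
    """Flatten all studies into (study, row_index, term, value) facts."""
    facts = []
    for study, rows in study_data.items():
        for i, row in enumerate(rows):
            for column, value in row.items():
                term = mappings[study].get(column)
                if term is not None:  # unmapped columns are left out
                    facts.append((study, i, term, value))
    return facts


def pooled_dataset(facts, terms):
    """Build one row per study record containing only the requested terms."""
    rows = {}
    for study, i, term, value in facts:
        if term in terms:
            rows.setdefault((study, i), {"study": study})[term] = value
    return list(rows.values())
```

Once both studies are expressed against the same terms, a request such as `pooled_dataset(to_graph(STUDY_DATA, MAPPINGS), {"ex:MaternalEducation"})` returns rows from both studies under one column name, regardless of what each study originally called the variable.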
24. Data Dictionaries Often Do Not Provide Enough Metadata
Example 1: Pregnancy Study
Example 2: Childhood Asthma Study
Example courtesy J. Stingone
25. Lack of Standard Terminologies
Example variable names: educ, Educ_m, Var11
Stingone JA, Mervish N, Kovatch P, McGuinness DL, Gennings C, Teitelbaum SL. Curr Opin Pediatr 2017; 29:231-239.
26. CHEAR/HHEAR Ontology Foundations and Reuse
Imported Ontologies:
• Semanticscience Integrated Ontology (SIO)
• PROV-O
• Units Ontology
• Human-Aware Science Ontology (HAScO)
• Environment Ontology (ENVO)
Annotations:
• Simple Knowledge Organization System (SKOS)
• Dublin Core (DC) Terms
Minimum Information to Reference an External Ontology Term (MIREOT)-ed Ontologies:
• Chemical Entities of Biological Interest (ChEBI)
• Statistics Ontology (STAT-O)
• UBERON (Anatomy)
• Disease Ontology (DO)
• Clinical Measurement (CMO)
• Cogat (Cognitive Measures)
• UniProt (Proteins)
• ExO (Exposure Ontology)
• …
https://bioportal.bioontology.org/ontologies/HHEAR
28. Mapping Data to Meaning: Semantic Data Dictionaries
Rashid, McCusker, Pinheiro, Bax, Santos, Stingone, Das, McGuinness. The Semantic Data Dictionary–An Approach for Describing and Annotating Data. Data Intelligence, MIT Press pp. 443-486. April 2021 DOI: https://doi.org/10.1162/dint_a_00058
29. Knowledge Modelling using Semantic Data Dictionaries: Transforming human-readable data dictionaries into machine-readable
Rashid SM et al. The semantic data dictionary: an approach for describing and annotating data. Data Intelligence 2020; 2:443-486 doi: https://doi.org/10.1162/dint_a_00058 PMC7583433
Column | Label | Attribute | attributeOf | Unit | Time | Entity | Role | Relation | inRelationTo
(Rows list only their non-empty cells, in column order.)
Child_PID | CHEAR PID | hasco:originalID | ??child | ??study
??child | sio:SIO_000485 | hhear:00492 | ??mother
??mother | sio:SIO_000485 | hhear:00502 | ??child
COI_Gender | Sex of Child | sio:SIO_010029 | ??child
COI_YOB | Year of child’s Birth | sio:SIO_000428 | ??birth
??birth | sio:SIO_000582 | ??child
??birth_timepoint | hhear:00458 | ??birth
COI_Parity | Parity of COI | PATO:0002370 | ??mother
MomRace | Race/eth of biological mother of COI | sio:SIO_001015 | ??mother
DadRace | Race/eth of biological father of COI | sio:SIO_001015 | ??father
??father | sio:SIO_000485 | hhear:00501 | ??child
[Callouts: Features and human descriptions from Data Dictionary; What?; About Who?; What is the implicit object?]
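How a Semantic Data Dictionary row can drive the annotation of a data record is sketched below using two rows from the table above. The assignment of cells to columns (Attribute and attributeOf for `COI_Gender`; Entity, Role, and inRelationTo for `??child`) is an assumption made for illustration, and the `ex:` predicates and node-naming scheme are hypothetical placeholders rather than the vocabulary used by the actual SDD tooling.

```python
# Two rows from the table above, with cells assigned to columns by assumption.
SDD_ROWS = [
    {"Column": "COI_Gender", "Label": "Sex of Child",
     "Attribute": "sio:SIO_010029", "attributeOf": "??child"},
    {"Column": "??child", "Entity": "sio:SIO_000485",
     "Role": "hhear:00492", "inRelationTo": "??mother"},
]


def node_for(name, record_id):
    """Name a per-record node for a data column or an implicit (??) entry."""
    return f"ex:{name.lstrip('?')}-{record_id}"


def annotate(record_id, record, sdd_rows):
    """Turn one data record into triples guided by the SDD rows."""
    triples = []
    for row in sdd_rows:
        node = node_for(row["Column"], record_id)
        if "Attribute" in row:  # explicit column: a typed value about an entity
            triples.append((node, "rdf:type", row["Attribute"]))
            triples.append((node, "ex:attributeOf", node_for(row["attributeOf"], record_id)))
            triples.append((node, "ex:hasValue", record[row["Column"]]))
        if "Entity" in row:  # implicit entry: a typed entity with a role
            triples.append((node, "rdf:type", row["Entity"]))
            triples.append((node, "ex:hasRole", row["Role"]))
            triples.append((node, "ex:inRelationTo", node_for(row["inRelationTo"], record_id)))
    return triples
```

Calling `annotate("r1", {"COI_Gender": "F"}, SDD_ROWS)` links the recorded value to an implicit child entity that never appears as a column in the data, which is the kind of "implicit object" the callouts on this slide point to.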
31. Ontology-Enabled HHEAR Human Aware Data Acquisition Framework
• Automatic ingest, access control, data governance, precision download, …
• Supports search: study, data sample, subject, ...
• Enables smart queries, e.g., find:
– Child: BirthWt, Gender, Gestational Age at Birth
– Mother: Age, BMI “early in pregnancy based on inclusion criterion for the particular study”, Parity, Education
– Metals: As, Cd, Mn, Mo, Pb
Sample Question: Gennings
Pinheiro, Santos, Liang, Liu, Rashid, McGuinness, Bax. HADatAc: A Framework for Scientific Data Integration using Ontologies. Intl Semantic Web Conference, Monterey, CA, 2018.
33. Questions?
• Cognitive Assistants are here and growing AND can benefit from more explainability/usability
• User-centered knowledge stores (PHKGs) are emerging AND can benefit from enhanced access control / interoperability
• Meta descriptions for resources are useful to enhance reuse / longevity / interoperability AND need more attention
Contact: Deborah L. McGuinness: dlm@cs.rpi.edu
Posters at MCBK:
• Towards Providing Clinical Context for a Diabetes Risk-Prediction Use Case via User-centered Explainability (poster 15)
• A Hybrid Clinical Reasoning Approach that includes Abduction (poster 4)
Thanks to many: Kristin Bennett, Prithwish Chakraborty, Shruthi Chari, Ching-Hua Chen, Amar Das, John Erickson, Morgan Foreman, Dan Gruen, Jay Franklin, Jim Hendler, Matt Johnson, Chip Masters, Jamie McCusker, Paulo Pinheiro, Sabbir Rashid, Henrique Santos, Oshani Seneviratne, Sola Shirai, Jeanette Stingone, Susan Teitelbaum, Mohammed Zaki, …
HEALS, HHEAR, HADatAc, Whyis