AI & Scientific Discovery in Oncology:
Opportunities, Challenges & Trends
André Freitas & dECMT AI Team
DART Meeting
Barcelona, May, 2023
Outline
Which recent developments in AI are relevant for oncology research?
Three perspectives:
• Explainable AI (XAI): building associations over heterogeneous data
• Variational Autoencoders (VAEs): unified evidence spaces for multi-omics
• Large Language Models (LLMs): interpreting textual evidence (at scale)
New Infrastructures
for Scientific Discovery
New Models of Inference
Disclaimer:
AI-centered
Near-future(-istic) perspective (emerging trends)
Building associations over
heterogeneous data
Explainable AI (XAI)
Motivation:
• Certain aspects of tumor pathology need to be studied within tissue
context.
• Interaction of the neoplastic cell with the surrounding microenvironment
including the immune system.
Goal:
• Models which bridge the gap between microscopic imaging and high-dimensional “omics” technologies.
• Facilitate the discovery of localised molecular features that drive the
spatially heterogeneous phenotypes of a tumour.
Breast cancer profiling by explainable AI
Binder et al. (Nature MI, 2021)
Morpho-molecular integration
Computationally generated “fluorescence microscopy”
Correlation of spatio-morphological and molecular features
Image analysis components: bag of keypoints, SVM, ‘LRP’
Molecular data: somatic mutations, copy number variation, RNA-seq, DNA methylation, protein profiles
Binder et al. (Nature MI, 2021)
over 200,000 individually
annotated cells
positive label indicating presence of at least one
cell of the respective type per patch
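The patch labelling rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `patch_labels`, the patch size, and the type indices (0 = cancer, 1 = lymphocyte, 2 = stroma) are assumptions made for the example.

```python
import numpy as np

def patch_labels(cell_xy, cell_type, image_shape, patch_size, n_types):
    """Return a (rows, cols, n_types) boolean array that is True where a
    patch contains at least one annotated cell of the given type."""
    rows = -(-image_shape[0] // patch_size)  # ceiling division
    cols = -(-image_shape[1] // patch_size)
    labels = np.zeros((rows, cols, n_types), dtype=bool)
    for (y, x), t in zip(cell_xy, cell_type):
        labels[y // patch_size, x // patch_size, t] = True
    return labels

# Toy annotations (hypothetical): pixel position (row, col) and type per cell.
cells = [(5, 5), (7, 60), (70, 10), (72, 12)]
types = [0, 1, 2, 2]
labels = patch_labels(cells, types, image_shape=(128, 128),
                      patch_size=64, n_types=3)
```

Two stroma cells in the same patch still give a single positive label, which is why only coarse annotations are needed at training time.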
prediction for each
protein/gene separately
Components: bag of keypoints, SVM, ‘LRP’
Cell types: cancer, lymphocytes, stroma
Heatmap: relevance of a pixel is the sum of the
relevance scores over all local features which
cover that pixel.
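A minimal sketch of this heatmap rule, assuming each local feature covers a rectangular pixel region and already carries a single relevance score. The tuple layout and the name `relevance_heatmap` are illustrative assumptions, not taken from Binder et al.

```python
import numpy as np

def relevance_heatmap(image_shape, features):
    """features: iterable of (top, left, height, width, relevance).
    Each pixel receives the sum of the scores of all features covering it."""
    heat = np.zeros(image_shape, dtype=float)
    for top, left, h, w, r in features:
        heat[top:top + h, left:left + w] += r
    return heat

# Three overlapping toy features on a 4x4 image (hypothetical values).
features = [(0, 0, 2, 2, 1.0), (1, 1, 2, 2, 0.5), (2, 2, 2, 2, -0.25)]
heat = relevance_heatmap((4, 4), features)
```

Pixels covered by several features accumulate their scores, and negative relevance lowers the total.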
Layer-wise Relevance Propagation (LRP):
yields high-resolution (pixel-wise) classification
results that make it possible to identify individual cells
while requiring only coarse-grained training data (region
annotations).
TCGA: over 500
A kind of abductive reasoning:
‘Inference to the best explanation’
Additive/monotonic explanatory data integration
• Motivation:
• Molecular heterogeneity of cancer cells.
• Relapse of disease due to the escape of resistant cell populations.
• Goal:
• A model that obtains characteristic network patterns for tumor cells and normal
epithelial cells.
• Approach:
• Coping with sampling heterogeneities (unique rare molecular properties).
• Capturing complex global correlations, which are inherent to biological networks.
• Leveraging single cell data which can deliver large training datasets (10k cells per
patient).
Single-cell gene regulatory network prediction
by explainable AI
Keyl et al. (Nuc. Ac. Res, 2023)
Pipeline components: scRNA-seq, NN, LRP, inferred interaction strength graph
Cell types: Tumor, Ciliated, AT1, AT2, Club
Data: 10 patients with non-small cell lung cancer
(1) A target gene is predicted based on a set of other genes (predicted masked gene).
(2) LRP is used to infer the relevance of every gene for this prediction.
(3) Aggregate interaction strength.
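The three steps can be sketched on toy data. To keep the example short, a linear least-squares model is swapped in for the neural network; for a linear model the basic LRP rule reduces to relevance = weight × input. The function name, the aggregation by mean absolute relevance, and the synthetic data are all assumptions for illustration, not the method of Keyl et al.

```python
import numpy as np

def interaction_strengths(X):
    """X: (cells, genes) expression matrix.
    Returns S of shape (genes, genes), where S[i, j] is the mean absolute
    relevance of source gene i when predicting masked target gene j."""
    n_cells, n_genes = X.shape
    S = np.zeros((n_genes, n_genes))
    for j in range(n_genes):
        others = [i for i in range(n_genes) if i != j]
        # (1) Predict the masked target gene j from the other genes.
        A = np.column_stack([X[:, others], np.ones(n_cells)])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        w = coef[:-1]  # drop the intercept
        # (2) Per-cell relevance of each source gene: w_i * x_i.
        R = X[:, others] * w
        # (3) Aggregate relevance over cells into an interaction strength.
        S[others, j] = np.abs(R).mean(axis=0)
    return S

# Synthetic data (hypothetical): gene 1 depends on gene 0, gene 2 is independent.
rng = np.random.default_rng(0)
g0 = rng.normal(size=500)
g1 = 2.0 * g0 + 0.1 * rng.normal(size=500)
g2 = rng.normal(size=500)
S = interaction_strengths(np.column_stack([g0, g1, g2]))
```

On this toy data the strength from gene 0 to gene 1 dominates the strength from gene 2 to gene 1. As the take-away of this section stresses, such a score is associational, not causal.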
UMAP: inter- and intra-tumoral distribution of tumor-specific network activity
(dots represent tumor cells)
pathogenic networks
Keyl et al. (Nuc. Ac. Res, 2023)
Similarity/clustering ‘view’
(NN)
Linked associations
(expl.)
qualify
Certain patients show different active
networks in the same cells (e.g. T1 and T6 in
patient p032).
Some network modules are distinctly active
only in a minority of tumour cells (e.g. T2 in
patient p024), indicating a functional
heterogeneity.
Explainable AI (XAI): Take-away
• ML:
• Computes the relevant correlations in a predictive setting.
• Flexibility for operating over multi-modal, heterogeneous data.
• Explainability:
• Shift the focus from prediction to association.
• Allows for transparency, verifiability and control.
• Discovered relations are associational, not causal.
• Exploratory: useful to inform hypotheses for new
interventional studies.
Unified evidence spaces
for multi-omics
Variational Autoencoders (VAEs)
Motivation:
• Integration of transcriptomics from diverse/multicenter data sources.
• Accounting for batch effects.
Example:
• Transcriptomics profiles from
• 932 CCLE cell lines
• 434 patient-derived tumor xenografts
• 10,550 patient tumors from TCGA
• 406 metastatic tumors from MET500
• 203 breast tumors from Count Me In (CMI)
Integration of transcriptomics profiles from
different datasets
Dimitrieva et al. (BioRxiv, 2022)
MOBER architecture
Input: RNA-seq from TCGA, CCLE, PTX, MET500, CMI
Components: encoder (Enc), latent parameters μ and σ, latent sample z, decoder (Dec)
Dimitrieva et al. (BioRxiv, 2022)
decoder takes a sample from the latent space
and reconstructs the gene expression profile
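A minimal sketch of this sampling-and-decoding step in a generic VAE, using the μ, σ and z symbols from the diagram. The one-layer linear decoder, the dimensions, and the function names are placeholders chosen for illustration; this is not MOBER's implementation.

```python
import numpy as np

def sample_latent(mu, sigma, rng):
    """Reparameterised sample: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

def decode(z, W, b):
    """Hypothetical linear decoder standing in for a trained network."""
    return z @ W + b

rng = np.random.default_rng(0)
mu = np.zeros((4, 2))        # 4 samples, 2 latent dimensions (toy sizes)
sigma = np.ones((4, 2))
z = sample_latent(mu, sigma, rng)

W = rng.normal(size=(2, 5))  # maps the latent space to 5 toy "genes"
b = np.zeros(5)
x_hat = decode(z, W, b)      # reconstructed expression profiles
```

Writing the sample as μ plus scaled noise keeps the path from the encoder outputs to the reconstruction differentiable, which is what allows encoder and decoder to be trained jointly.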
RNA-seq
aNN
Source discriminator
Self-supervision
Variational autoencoders (VAE)
• Compensates for batch effects.
• Alignment preserves biological subtype relationships.
• Information transfer between cell line and patient tumor datasets.
Dimitrieva et al. (BioRxiv, 2022)
On the Principles
Style transfer
https://genekogan.com/works/style-transfer/
Stable diffusion
InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs
VAEs induce a lower-dimensional smooth space.
Disentangling latent factors.
Observations are organised within a lower-dimensional (perceptual-semantic) manifold.
Implication: ‘extrapolations’ are possible.
Zhu et al. (IJCV, 2022)
Style transfer in the gene expression space
Jain et al. (ArXiv:2106.15456, 2021)
Uhler (Talk at ICML, 2020)
Similarity/clustering ‘view’
Qualified changes as aligned trajectories (style
transfer)
Predict, project, extrapolate via style transfer
Lotfollahi et al. (2018): Generative modelling and
latent space arithmetics predict single-cell response
across cell types, studies and species.
• Foundation for a digital twin framework.
• Patient, cell states, etc.
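Latent space arithmetic of the kind named above can be sketched as follows: estimate a perturbation direction in one cell type and add it to the latent codes of another. The latent codes here are synthetic, and the function name and shapes are assumptions; a real pipeline would obtain the codes from a trained encoder and pass the result through a decoder.

```python
import numpy as np

def predict_perturbed(z_ctrl_a, z_stim_a, z_ctrl_b):
    """Shift the control cells of type B by the mean control-to-stimulated
    difference observed in type A."""
    delta = z_stim_a.mean(axis=0) - z_ctrl_a.mean(axis=0)
    return z_ctrl_b + delta

# Synthetic latent codes (hypothetical), 2 latent dimensions.
rng = np.random.default_rng(0)
shift = np.array([1.0, -2.0])
z_ctrl_a = rng.normal(size=(100, 2))
z_stim_a = z_ctrl_a + shift                   # perturbation seen in type A
z_ctrl_b = rng.normal(loc=5.0, size=(50, 2))  # type B, never perturbed
z_pred_b = predict_perturbed(z_ctrl_a, z_stim_a, z_ctrl_b)
```

The sketch assumes the perturbation acts as the same translation in latent space for both cell types, which is the premise behind the "aligned trajectories" view.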
Better mechanistic grounding
We need to:
• Account for the fact that gene expression is governed by a gene regulatory network.
• Move closer to the real biochemical processes.
• Gain better control (move the system to a desirable state).
• Work out how to ground the model on the actual mechanistic model.
Uhler et al. (ICML, 2022)
In order to:
• Identify optimal interventions.
• Transport drug intervention to a new cell-type.
METABRIC Integrating mechanistic knowledge via sparse connections
Large-scale perturbation models
• Systematic perturbations:
• Genome knockout (CRISPR) and compound interventions.
• Arrayed CRISPR knockouts of 17k genes.
• De novo pathway reconstruction (no background knowledge).
Phenomics
Standardised assay: staining six common
molecular substructures
Rudnick et al. (AACR, 2023)
Haque (Talk at RS, 2023)
Interpreting Textual
Evidence (at scale)
Large Language Models (LLMs)
Accumulated
Knowledge
Hypotheses
Questions
Natural Language
Inference (NLI)
NLI
Models
Adapted from: https://human-centered.ai/project/explainable-ai-fwf-32554/
Automating meta-analyses
Cytokine release syndrome (CRS):
Significant adverse event of T cell-engaging
therapies.
Need: Predictive models for CRS
Problem: Lack of patient-level datasets.
Can one explore relevant evidence in the
literature?
Bogatu et al. (JBI, 2023)
~460 papers, 17 highly aligned papers
Parameter extraction, meta-review
19 hrs, 38 hrs, 7 mins
LLM
context window
chain of prompts
GPT 3.5
Table builder
Layout Extractor
Not possible one year ago!
Demo Wysocki & Wysocka
64-year-old woman with:
• multiple myeloma,
• s/p allogeneic transplant with recurrent disease and with systemic amyloidosis (involvement of lungs, tongue, bladder, heart),
• on hemodialysis for ESRD who represents for malaise, weakness, and generalized body aching x 2 days.
• she was admitted with hypercalcemia and treated with pamidronate 30mg, calcitonin, and dialysis.
• patient was initially treated with melphalan and prednisone, followed by VAD regimen, and autologous stem cell transplant.
• with relapse of her myeloma, she received thalidomide velcade and thalidomide, which were eventually also held due to
worsening edema and kidney function.
~ 375,600
CTs
Clinical trial matching
~ 375,600
CT reports
Coarse-grained
ranking
LLM
context window
chain of prompts
GPT 3.5
Layout Extractor
Neural-indexing
eligibility criteria
Neural-search
ranked list
(relevant trials)
Patient description
Fine-grained
inference
LM LM
Demo from Bogatu, Jullien
Jullien et al. (Semeval 2021)
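The coarse-grained ranking stage (index eligibility criteria, then search with the patient description) can be sketched with a toy retrieval loop. Word counts stand in for a neural encoder, and the trial identifiers and criteria texts are invented for the example; the fine-grained inference stage is not covered here.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a neural encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_trials(patient, trials, top_k=2):
    # Indexing: embed the eligibility criteria of every trial once.
    index = {tid: embed(criteria) for tid, criteria in trials.items()}
    # Search: score each trial against the patient description.
    query = embed(patient)
    ranked = sorted(index, key=lambda tid: cosine(query, index[tid]), reverse=True)
    return ranked[:top_k]

# Hypothetical trials and patient description (not real trial records).
trials = {
    "trial-A": "adults with relapsed multiple myeloma after stem cell transplant",
    "trial-B": "children with asthma and seasonal allergies",
    "trial-C": "patients with multiple myeloma and renal impairment on hemodialysis",
}
patient = "woman with multiple myeloma, prior stem cell transplant, on hemodialysis"
shortlist = rank_trials(patient, trials)
```

Only the shortlist would then go to the more expensive fine-grained inference step, which is what keeps a collection of this size tractable.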
Take-away
Universal framework for integrating and organising heterogeneous evidence
Emerging foundations for industrial-scale scientific inference
Explainable ML
• Flexibility for operating over multi-modal, heterogeneous data.
• Allows for associations, transparency and verifiability.
Variational Autoencoders
• Foundation for digital twins.
• Integrating and extrapolating over multicentric, heterogeneous data.
• Integrating mechanistic knowledge.
Large Language Models
• Allows for semantic interpretation of text at scale.
• Extracting and structuring complex textual evidence.

WordPress Websites for Engineers: Elevate Your Brand
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends

  • 1. AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends André Freitas & dECMT AI Team DART Meeting Barcelona, May, 2023
  • 2. Outline What are the recent developments in AI which are relevant for oncology research? Three perspectives: • Explainable AI (XAI) • Building associations over heterogeneous data • Variational Autoencoders (VAEs) • Unified evidence spaces for multi-omics • Large Language Models (LLMs) • Interpreting textual evidence (at scale) New Infrastructures for Scientific Discovery New Models of Inference Disclaimer: AI-centered Near-future(-istic) perspective (emerging trends)
  • 3. Building associations over heterogeneous data Explainable AI (XAI)
  • 4. Motivation: • Certain aspects of tumor pathology need to be studied within tissue context. • Interaction of the neoplastic cell with the surrounding microenvironment including the immune system. Goal: • Models which bridge the gap between microscopic imaging and high-dimensional “omics” technologies. • Facilitate the discovery of localised molecular features that drive the spatially heterogeneous phenotypes of a tumour. Breast cancer profiling by explainable AI Binder et al. (Nature MI, 2021)
  • 5. Breast cancer profiling by explainable AI Binder et al. (Nature MI, 2021) Morpho-molecular integration Computationally generated “fluorescence microscopy” Correlation of spatio-morphological and molecular features
  • 6. Bag of keypoints, SVM, ‘LRP’. Molecular: som. mut., copy num. var., RNA-seq, DNA meth., prot. profiles; prediction for each protein/gene separately. Cancer, lymphocytes, stroma. Over 200,000 individually annotated cells; a positive label indicates the presence of at least one cell of the respective type per patch. Heatmap: the relevance of a pixel is the sum of the relevance scores over all local features which cover that pixel. Layer-wise Relevance Propagation (LRP): facilitates high-resolution (pixel-wise) classification results, making it possible to identify individual cells while requiring only coarse-grained training data (region annotations). TCGA: over 500. A kind of abductive reasoning: ‘Inference to the best explanation’. Additive/monotonic explanatory data integration. Binder et al. (Nature MI, 2021)
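The heatmap rule on slide 6 (a pixel's relevance is the sum of the relevance scores of all local features covering it) can be sketched as follows. This is a minimal illustration, not the implementation of Binder et al.: the square support regions, the `(row, col, size, relevance)` tuple layout, and the function name `relevance_heatmap` are assumptions made for the example.

```python
def relevance_heatmap(height, width, features):
    """Sum local-feature relevance scores into a pixel-wise heatmap.

    features: list of (row, col, size, relevance) tuples. Each feature is
    assumed (for illustration) to cover a size x size square whose top-left
    corner is at (row, col).
    """
    heat = [[0.0] * width for _ in range(height)]
    for row, col, size, relevance in features:
        # Clip the feature's support region to the image bounds.
        for r in range(max(row, 0), min(row + size, height)):
            for c in range(max(col, 0), min(col + size, width)):
                heat[r][c] += relevance
    return heat


# Three hypothetical overlapping local features on a 4 x 4 patch.
features = [(0, 0, 2, 1.0), (1, 1, 2, 0.5), (2, 2, 2, -0.25)]
heat = relevance_heatmap(4, 4, features)
```

Pixel (1, 1) is covered by the first two features, so it accumulates both scores, while a pixel covered by no feature stays at zero.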
  • 7. • Motivation: • Molecular heterogeneity of cancer cells. • Relapse of disease due to the escape of resistant cell populations. • Goal: • Model which: obtains characteristic network patterns for tumor cells and normal epithelial cells. • Approach: • Coping with sampling heterogeneities (unique rare molecular properties). • Capturing complex global correlations, which are inherent to biological networks. • Leveraging single cell data which can deliver large training datasets (10k cells per patient). Single-cell gene regulatory network prediction by explainable AI Keyl et al. (Nuc. Ac. Res, 2023)
  • 8. NN scRNA-seq Cell-type Tumor Ciliated AT1 AT2 Club Ciliated 10 patients with non-small cell lung cancer (1) A target gene is predicted based on a set of other genes. Predicted masked gene Single-cell gene regulatory network prediction by explainable AI Keyl et al. (Nuc. Ac. Res, 2023)
  • 9. NN LRP scRNA-seq Cell-type Tumor Ciliated AT1 AT2 Club Ciliated 10 patients with non-small cell lung cancer (1) A target gene is predicted based on a set of other genes. (2) LRP is used to infer the relevance of every gene for this prediction. Predicted masked gene Single-cell gene regulatory network prediction by explainable AI Keyl et al. (Nuc. Ac. Res, 2023)
  • 10. NN LRP scRNA-seq Infer interaction strength graph Cell-type Tumor Ciliated AT1 AT2 Club Ciliated 10 patients with non-small cell lung cancer (1) A target gene is predicted based on a set of other genes. (2) LRP is used to infer the relevance of every gene for this prediction. Predicted masked gene (3) Aggregate interaction strength Single-cell gene regulatory network prediction by explainable AI Keyl et al. (Nuc. Ac. Res, 2023)
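Steps (1) to (3) on slide 10 can be sketched under strong simplifying assumptions. Keyl et al. use a neural network; here each target gene is predicted by a single bias-free linear layer, for which LRP reduces to `weight * input` per source gene. The pre-set weights, the mean-absolute-relevance aggregation, and all function names are hypothetical choices for this illustration.

```python
def lrp_linear(weights, inputs):
    """LRP for one bias-free linear output: relevance_i = w_i * x_i.

    The relevances sum to the prediction (relevance conservation).
    """
    return [w * x for w, x in zip(weights, inputs)]


def interaction_strengths(cells, models):
    """Aggregate per-cell relevances into a source -> target strength matrix.

    cells:  list of expression vectors, one per cell.
    models: {target_gene_index: weights over all genes}; the weight at the
            target's own index is 0 because the target gene is masked.
    """
    n_genes = len(cells[0])
    strength = [[0.0] * n_genes for _ in range(n_genes)]
    for target, weights in models.items():
        for cell in cells:
            relevance = lrp_linear(weights, cell)
            for source in range(n_genes):
                if source != target:
                    # Assumed aggregation: mean absolute relevance over cells.
                    strength[source][target] += abs(relevance[source]) / len(cells)
    return strength


# Two hypothetical cells, three genes; gene 0 is predicted from genes 1 and 2.
cells = [[1.0, 2.0, 0.0], [3.0, 0.0, 1.0]]
models = {0: [0.0, 0.5, -1.0]}
strength = interaction_strengths(cells, models)
```

The resulting matrix can be read as a weighted directed graph in which entry `strength[source][target]` is the aggregated influence of the source gene on the predicted target gene.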
  • 11. NN scRNA-seq Cell-type Tumor Ciliated AT1 AT2 Club Ciliated 10 patients with non-small cell lung cancer Predicted masked gene Single-cell gene regulatory network prediction by explainable AI Keyl et al. (Nuc. Ac. Res, 2023) UMAP Inter- and intra-tumoral distribution of tumor-specific network activity (dots represent tumor cells)
  • 12. pathogenic networks Keyl et al. (Nuc. Ac. Res, 2023) Similarity/clustering ‘view’ (NN) Linked associations (expl.) qualify
  • 13. Certain patients show different active networks in the same cells (e.g. T1 and T6 in patient p032). pathogenic networks Keyl et al. (Nuc. Ac. Res, 2023) Similarity/clustering ‘view’ (NN) Linked associations (expl.) qualify
  • 14. Some network modules are distinctly active only in a minority of tumour cells (e.g. T2 in patient p024), indicating a functional heterogeneity. pathogenic networks Keyl et al. (Nuc. Ac. Res, 2023) Similarity/clustering ‘view’ (NN) Linked associations (expl.) qualify
  • 15. Explainable AI (XAI): Take-away • ML: • Computes the relevant correlations in a predictive setting. • Flexibility for operating over multi-modal, heterogeneous data. • Explainability: • Shift the focus from prediction to association. • Allows for transparency, verifiability and control. • Discovered relations are associational, not causal. • Exploratory: useful to inform hypotheses for new interventional studies.
  • 16. Unified evidence spaces for multi-omics Variational Autoencoders (VAEs)
  • 17. Motivation: • Integration of transcriptomics from diverse/multicenter data sources. • Accounting for batch effects. Example: • Transcriptomics profiles from • 932 CCLE cell lines • 434 patient-derived tumor xenografts • 10,550 patient tumors from TCGA • 406 metastatic tumors from MET500 • 203 breast tumors from Count Me In (CMI) Integration of transcriptomics profiles from different datasets Dimitrieva et al. (BioRxiv, 2022)
  • 18. Enc RNA-seq TCGA CCLE PTX MET500 CMI MOBER architecture Dec μ σ z Dimitrieva et al. (BioRxiv, 2022) decoder takes a sample from the latent space and reconstructs the gene expression profile RNA-seq aNN Source discriminator Integration of transcriptomics profiles from different datasets Self-supervision Variational autoencoders (VAE)
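The μ, σ, z elements of the architecture on slide 18 correspond to the standard VAE sampling step. The sketch below shows only the generic reparameterisation and the KL regulariser towards a standard normal prior. It does not model MOBER's source discriminator, encoder, or decoder, and the function names are assumptions for illustration.

```python
import math
import random


def sample_latent(mu, sigma, rng):
    """Reparameterisation: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return sum(0.5 * (m * m + s * s - 1.0) - math.log(s) for m, s in zip(mu, sigma))


rng = random.Random(0)
mu, sigma = [0.0, 1.0], [1.0, 1.0]   # hypothetical encoder outputs for one sample
z = sample_latent(mu, sigma, rng)     # the decoder would reconstruct expression from z
kl = kl_to_standard_normal(mu, sigma)
```

The KL term is zero only when the encoder output matches the prior exactly, which is what pulls samples from different sources towards a shared, smooth latent space.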
  • 19. • Compensates for batch effects. • Alignment preserves biological subtype relationships. • Information transfer between cell line and patient tumor datasets. Dimitrieva et al. (BioRxiv, 2022)
  • 22. InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs VAEs induce a lower-dimensional smooth space. Disentangling latent factors. Observations are organised within a lower dimensional (perceptual-semantic) manifold. Implication: ‘Extrapolations’ are possible.
  • 23. Zhu et al. (IJCV, 2022)
  • 24. Style transfer in the gene expression space Jain et al. (ArXiv:2106.15456, 2021) Uhler (Talk at ICML, 2020) Similarity/clustering ‘view’ Qualified changes as aligned trajectories (style transfer) Predict, project, extrapolate via style transfer Lotfollahi et al. (2018): Generative modelling and latent space arithmetics predict single-cell response across cell types, studies and species. • Foundation for a digital twin framework. • Patient, cell states, etc.
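The latent space arithmetic mentioned on slide 24 can be illustrated with a few lines of vector algebra. This is a sketch of the general idea only (a perturbation is represented as the difference between mean latent codes and then added to an unseen sample); the toy two-dimensional latent codes and the helper names are assumptions, and a real pipeline would obtain the codes from a trained encoder and decode the shifted vector.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def perturbation_delta(control_latents, perturbed_latents):
    """Direction in latent space that moves control cells to perturbed cells."""
    control_mean = mean_vector(control_latents)
    perturbed_mean = mean_vector(perturbed_latents)
    return [p - c for p, c in zip(perturbed_mean, control_mean)]


def apply_delta(latent, delta):
    """Predict the perturbed latent code of an unseen cell by shifting it."""
    return [z + d for z, d in zip(latent, delta)]


# Hypothetical 2-D latent codes for one cell type, before and after treatment.
control = [[0.0, 1.0], [2.0, 1.0]]
perturbed = [[1.0, 3.0], [3.0, 3.0]]
delta = perturbation_delta(control, perturbed)

# Transfer the same shift to a cell of a different, untreated cell type.
predicted = apply_delta([5.0, -1.0], delta)
```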
  • 25. Better mechanistic grounding We need: • Gene expression is governed by a gene regulatory network. • Closer dialogue with the real biochemical processes. • Allows for better control (move the system to a desirable state). • How to ground on the actual mechanistic model? In order to: • Identify optimal interventions. • Transport drug intervention to a new cell-type. Uhler et al. (ICML, 2022)
  • 26. METABRIC Integrating mechanistic knowledge via sparse connections
  • 27. Large-scale perturbation models • Systematic perturbations: • Genome knockout (CRISPR) and compound interventions. • Arrayed CRISPR knockouts of 17k genes. • De novo pathway reconstruction (no background knowledge). Phenomics Standardised assay: staining six common molecular substructures Rudnick et al. (AACR, 2023) Haque (Talk at RS, 2023)
  • 29. Interpreting Textual Evidence (at scale) Large Language Models (LLMs)
  • 30. Accumulated Knowledge Hypotheses Questions Natural Language Inference (NLI) NLI Models Adapted from: https://human-centered.ai/project/explainable-ai-fwf-32554/ Automating meta-analyses Cytokine release syndrome (CRS): Significant adverse event of T cell-engaging therapies. Need: Predictive models for CRS Problem: Lack of patient-level datasets. Can one explore relevant evidence in the literature? Bogatu et al. (JBI, 2023)
  • 31. ~ 460 papers 17 highly aligned papers Parameter extraction Meta-review 19 h 38 h Bogatu et al. (JBI, 2023)
  • 32. ~ 460 papers 17 highly aligned papers Parameter extraction Meta-review 19 h 38 h 7 min Bogatu et al. (JBI, 2023)
  • 34. ~ 460 papers 17 highly aligned papers Parameter extraction Meta-review LLM context window chain of prompts GPT 3.5 Table builder Layout Extractor Not possible one year ago! Demo Wysocki & Wysocka
  • 35. 64-year-old woman with: • multiple myeloma, • s/p allogeneic transplant with recurrent disease and with systemic amyloidosis (involvement of lungs, tongue, bladder, heart), • on hemodialysis for ESRD who represents for malaise, weakness, and generalized body aching x 2 days. • she was admitted with hypercalcemia and treated with pamidronate 30mg, calcitonin, and dialysis. • patient was initially treated with melphalan and prednisone, followed by VAD regimen, and autologous stem cell transplant. • with relapse of her myeloma, she received thalidomide velcade and thalidomide, which were eventually also held due to worsening edema and kidney function. ~ 375,600 CTs Clinical trial matching
  • 37. ~ 375,600 CT reports Coarse-grained ranking LLM context window chain of prompts GPT 3.5 Layout Extractor Neural-indexing eligibility criteria Neural-search ranked list (relevant trials) Patient description Fine-grained inference LM LM Demo from Bogatu, Jullien. Jullien et al. (Semeval 2021)
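The two-stage design on slide 37 (coarse-grained ranking over indexed eligibility criteria, then fine-grained inference on the top candidates) can be sketched as below. This is a toy stand-in, not the demo's implementation: a bag-of-words cosine similarity replaces the neural index, a hand-written predicate replaces the language-model inference step, and the trial identifiers, texts, and function names are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a neural encoder: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match_trials(patient, trials, fine_check, top_k=2):
    """Stage 1: rank all trials by similarity. Stage 2: filter the top_k."""
    query = embed(patient)
    ranked = sorted(trials, key=lambda tid: cosine(query, embed(trials[tid])), reverse=True)
    return [tid for tid in ranked[:top_k] if fine_check(patient, trials[tid])]


# Hypothetical trial records keyed by made-up identifiers.
trials = {
    "T1": "relapsed multiple myeloma adults",
    "T2": "breast cancer stage II",
    "T3": "multiple myeloma excluding patients on hemodialysis",
}
patient = "woman with multiple myeloma on hemodialysis"


def fine_check(patient_text, criteria):
    """Placeholder for LM-based inference: reject an explicit exclusion."""
    return not ("excluding" in criteria and "hemodialysis" in patient_text)


eligible = match_trials(patient, trials, fine_check)
```

The example shows why the second stage matters: the trial with the highest lexical similarity is the one that explicitly excludes the patient, and only the finer-grained check removes it.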
  • 38. Take-away Universal framework for integrating and organising heterogeneous evidence. Emerging foundations for industrial-scale scientific inference. Explainable ML: • Flexibility for operating over multi-modal, heterogeneous data. • Allows for associations, transparency and verifiability. Variational Autoencoders: • Foundation for digital twins. • Integrating and extrapolating over multicentric, heterogeneous data. • Integrating mechanistic knowledge. Large Language Models: • Allows for semantic interpretation of text at scale. • Extracting and structuring complex textual evidence.