PSN for Precision Medicine

Patient Similarity Networks for Precision Medicine
Thi Nguyen, Ph.D. Candidate
Graduate Biomedical Sciences | Immunology Theme
University of Alabama at Birmingham (UAB)
kimthi@uab.edu
Clinical Informatics Journal Club
October 23rd, 2018

Outline
• Current landscape in building a predictive risk model
• Patient similarity network (PSN) – emerging paradigm for clinical prediction
• Advantages of PSN
• Two PSN examples: Similarity Network Fusion and netDx
• Challenges of PSN analytics
• Vision for PSN-based tool for future clinic

Disease risk calculator
http://www.cvriskcalculator.com/
• ASCVD calculator = 10-year risk of heart disease/stroke
• 13 pieces of information: gender, age, blood lipid levels,
blood pressure, history.
• result of 50 years of development and refinement
• still being adjusted
Risk calculator = set of risk factors -> calculated disease risk, to support monitoring,
diagnosis and treatment.

Fig. 1. Developing risk calculators
Ideal model:
• accurate
• generalizable
• reasonable time
• interpretable by clinicians

Methods used in clinical risk models

Genomics in clinical risk models
The rise of the genomic era

Predictive risk models – current needs
• Integrate diverse data types (genomics, metabolomics, imaging, EHR ...)
• Interpretable
• Handle sparse/ missing data
• Protect patient privacy
• Scale up: keep pace with the scale and complexity of the data

Network science
• New scientific discipline, broadly interdisciplinary approach
to study complex systems
• Developed its formalism from graph theory and uses statistical
physics as conceptual framework.
• Key concept: Regardless of the domain (computer, social, biological),
all networks are driven by the same fundamental organizing principles.
• Common set of mathematical tools to explore these systems.
http://networksciencebook.com/
by A.-L. Barabási.

Why network science for new predictive risk models?
• Handles heterogeneous data
• Missing data is naturally handled
• Easy visualization: when data are presented as a network, groupings and decision
boundaries can be seen directly
• Intuitive: analogous to clinical diagnosis, where physicians relate a patient's case to
previous patients they have seen (a mental database)
• PSNs work on patient similarities rather than raw patient data -> patient privacy is
preserved -> easier to scale up
• Many existing network science methods can integrate data by fusing networks
• netDx: uses biological pathway-based features to improve accuracy and
generalization, and to make genomic data more interpretable

Patient similarity networks
• each node = individual
• edge = pairwise similarity for a given feature
• Labelled patients can be grouped (clustering / unsupervised classification), and a
patient with unknown status can be assigned to the group they are most
similar to.
• each feature (=view) is represented as a network of pairwise patient similarities
• views can be integrated/fused to identify subgroups / predict outcome.
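
A minimal sketch of the idea above, assuming cosine similarity as the pairwise measure and a "highest mean similarity" rule for assigning an unlabelled patient. The patient IDs, values, labels and function names are hypothetical toy choices, not taken from the talk.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_view(profiles):
    """One view: {(patient_a, patient_b): similarity} over all patient pairs."""
    return {
        (p, q): cosine_similarity(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }

def assign_group(view, labels, patient):
    """Assign a patient to the labelled group with the highest mean similarity."""
    sims = {}
    for (p, q), s in view.items():
        if patient in (p, q):
            other = q if p == patient else p
            if other in labels:
                sims.setdefault(labels[other], []).append(s)
    if not sims:
        return None  # no labelled neighbours in this view
    return max(sims, key=lambda g: sum(sims[g]) / len(sims[g]))

# Hypothetical toy data: one feature (view) measured on five patients.
expression = {
    "P1": [5.0, 1.0, 0.5],
    "P2": [4.5, 1.2, 0.4],
    "P3": [0.3, 4.0, 5.0],
    "P4": [0.5, 3.8, 5.2],
    "P5": [4.8, 0.9, 0.6],  # status unknown
}
labels = {"P1": "A", "P2": "A", "P3": "B", "P4": "B"}

view = build_view(expression)
predicted = assign_group(view, labels, "P5")
```

Here each node is a patient and each dictionary entry is a weighted edge; P5 lands in group "A" because its profile is closest to P1 and P2.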

Similarity Network Fusion
“Similarity network fusion for aggregating data types on a genomic scale.” Nature Methods 2014
1. Construct a similarity network for each data type
2. Fuse these networks into a single network using a nonlinear combination method
• Data types: mRNA, DNA methylation and miRNA
• Singular value decomposition -> cosine similarity -> fuse network by iterative boosting
• This method has been applied to subtype medulloblastoma and pancreatic ductal
adenocarcinoma tumors, and to identify subtypes of diabetes.
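
A simplified sketch of what iteratively fusing networks can look like. It is an assumption-laden illustration, not the published SNF procedure: each view keeps a row-normalized status matrix and a sparse k-nearest-neighbour kernel, and each status matrix is repeatedly updated by diffusing the other views' status through its own kernel. The toy matrices, the function names and the values of k and iterations are hypothetical; the sketch needs at least two views.

```python
def row_normalize(m):
    """Scale every row so that it sums to 1."""
    out = []
    for row in m:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out

def knn_kernel(w, k):
    """Keep each patient's k most similar other patients, then row-normalize."""
    n = len(w)
    sparse = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: w[i][j], reverse=True)
        keep = set(ranked[:k])
        sparse.append([w[i][j] if j in keep else 0.0 for j in range(n)])
    return row_normalize(sparse)

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mean_matrices(ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) / len(ms) for j in range(n)] for i in range(n)]

def fuse(views, k=2, iterations=5):
    """Cross-network diffusion over two or more views; returns one fused matrix."""
    status = [row_normalize(w) for w in views]
    local = [knn_kernel(w, k) for w in views]
    for _ in range(iterations):
        updated = []
        for v in range(len(views)):
            others = mean_matrices([status[u] for u in range(len(views)) if u != v])
            updated.append(matmul(matmul(local[v], others), transpose(local[v])))
        status = updated
    fused = mean_matrices(status)
    return mean_matrices([fused, transpose(fused)])  # symmetrize

# Hypothetical similarity matrices for 6 patients (two groups of 3), two data types.
mrna = [
    [1.0, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.85, 0.2, 0.1, 0.1],
    [0.8, 0.85, 1.0, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9, 0.8],
    [0.2, 0.1, 0.1, 0.9, 1.0, 0.85],
    [0.1, 0.1, 0.2, 0.8, 0.85, 1.0],
]
methylation = [
    [1.0, 0.6, 0.5, 0.3, 0.2, 0.2],
    [0.6, 1.0, 0.55, 0.2, 0.3, 0.2],
    [0.5, 0.55, 1.0, 0.2, 0.2, 0.3],
    [0.3, 0.2, 0.2, 1.0, 0.6, 0.5],
    [0.2, 0.3, 0.2, 0.6, 1.0, 0.55],
    [0.2, 0.2, 0.3, 0.5, 0.55, 1.0],
]

fused = fuse([mrna, methylation], k=2, iterations=5)
```

In this toy case the fused matrix keeps the two patient groups apart: every within-group entry stays larger than every between-group entry, even though the methylation view alone is noisier.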

Similarity Network Fusion
“Similarity network fusion for aggregating data types on a genomic scale.” Nature Methods 2014
Figure: patient networks for mRNA, miRNA, DNA methylation and the SNF-combined
network (n = 215 patients with GBM)
node = patient
node size = survival
edge thickness = similarity

netDx: a supervised patient classification framework
WORKFLOW
https://www.biorxiv.org/content/early/2018/05/25/084418

Network integration:
• uses GeneMANIA, a network integration algorithm that reduces redundant networks
and weights each network according to its discriminatory power -> linear combination
-> composite network
Input data design:
• any kind of data, as long as a measure of patient similarity can be defined
(Pearson correlation, cosine similarity, normalized age difference)
• to address the curse of omics data (too many features / overfitting), measurements
are grouped into biological pathways (~2000) -> this also increases interpretability.
Feature selection:
• cross-validation to measure sensitivity and specificity
Class prediction:
• a patient is assigned to the class in which they rank highest, i.e. the class whose
members they are most similar to.
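
A rough sketch of the composite-network and class-prediction steps, under stated assumptions: the network weights are fixed by hand here (in netDx they come from GeneMANIA, which is not reproduced), and "highest rank" is approximated by the highest mean similarity to a class's members. All patient IDs, ages, edge values and function names are hypothetical.

```python
from itertools import combinations

def age_similarity(age_a, age_b, age_range):
    """Normalized age difference turned into a similarity in [0, 1]."""
    return 1.0 - abs(age_a - age_b) / age_range

def combine(networks, weights):
    """Weighted linear combination of networks -> composite network."""
    total = sum(weights.values())
    composite = {}
    for name, edges in networks.items():
        for pair, s in edges.items():
            composite[pair] = composite.get(pair, 0.0) + weights[name] / total * s
    return composite

def class_score(composite, members, patient):
    """Mean composite similarity between a patient and the members of one class."""
    sims = [composite.get(tuple(sorted((patient, m))), 0.0) for m in members]
    return sum(sims) / len(sims)

def predict(composite, classes, patient):
    """Return the best-scoring class and the score of every class."""
    scores = {label: class_score(composite, members, patient)
              for label, members in classes.items()}
    return max(scores, key=scores.get), scores

# Hypothetical inputs: a clinical (age) network and one pathway-level network.
ages = {"P1": 4, "P2": 6, "P3": 40, "P4": 45, "P5": 5}
age_range = max(ages.values()) - min(ages.values())
age_net = {(p, q): age_similarity(ages[p], ages[q], age_range)
           for p, q in combinations(sorted(ages), 2)}
pathway_net = {
    ("P1", "P2"): 0.9, ("P3", "P4"): 0.85,
    ("P1", "P5"): 0.8, ("P2", "P5"): 0.7,
    ("P3", "P5"): 0.2, ("P4", "P5"): 0.1,
}

networks = {"age": age_net, "pathway": pathway_net}
weights = {"age": 1.0, "pathway": 3.0}  # hand-picked, for illustration only
classes = {"subtype_A": ["P1", "P2"], "subtype_B": ["P3", "P4"]}

composite = combine(networks, weights)
label, scores = predict(composite, classes, "P5")
```

With these toy numbers P5 is assigned to subtype_A, since both the age network and the pathway network place it close to P1 and P2.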

netDx to predict ependymoma subtypes
• microarray data + clinical data
• Pearson correlation = similarity
• regression to correct batch effects
• Lasso regression in cross validation to
prefilter genes
Pathway-based design:
• genes were grouped into 2118 networks, one per pathway
• pathway information was aggregated from HumanCyc, IOB’s NetPath, Reactome, NCI
curated pathways, MSigDB and Panther.
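
A small sketch of the pathway-based design, assuming each pathway contributes one patient network whose edges are Pearson correlations over that pathway's genes. Gene names, pathway names and expression values are hypothetical placeholders; a real run would use the curated pathway sets listed above.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def pathway_networks(expression, pathways):
    """One patient similarity network per pathway: {pathway: {(p, q): r}}."""
    networks = {}
    for name, genes in pathways.items():
        edges = {}
        for p, q in combinations(sorted(expression), 2):
            x = [expression[p][g] for g in genes]
            y = [expression[q][g] for g in genes]
            edges[(p, q)] = pearson(x, y)
        networks[name] = edges
    return networks

# Hypothetical expression values (patient -> gene -> value) and gene sets.
expression = {
    "P1": {"G1": 2.0, "G2": 4.0, "G3": 6.0, "G4": 1.0, "G5": 5.0, "G6": 3.0},
    "P2": {"G1": 1.0, "G2": 2.0, "G3": 3.0, "G4": 3.0, "G5": 1.0, "G6": 5.0},
    "P3": {"G1": 6.0, "G2": 4.0, "G3": 2.0, "G4": 1.5, "G5": 5.5, "G6": 3.5},
}
pathways = {
    "pathway_X": ["G1", "G2", "G3"],
    "pathway_Y": ["G4", "G5", "G6"],
}

nets = pathway_networks(expression, pathways)
```

The same pair of patients can be similar in one pathway and dissimilar in another (P1 and P2 here), which is what lets pathway-level networks act as interpretable features.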

Challenges for PSN analytics
• large data sizes (thousands of genomes)
• improve feature selection
• improve signal-to-noise ratio automatically
• characterize patient heterogeneity (disease subtypes)
• make best use of complex genomics layers (tissue-specific
variants)
• tuning parameters
• build on prior knowledge/ data, e.g. known gene-gene
interaction, epigenetic information.

Vision for network-based classification tool
for precision medicine

Conclusions
• Patient similarity networks are an emerging method for building predictive risk models
• Many advantages over other approaches: they integrate heterogeneous data types,
tolerate missing data, protect patient privacy, and are easy to interpret.
• Since it is a new paradigm, many implementation challenges remain
• Similarity Network Fusion and netDx are two frameworks that have implemented PSNs
successfully
• Opportunities

Questions/ Thoughts/ Comments
• Can pairwise comparison capture all the complexity of gene expression in each
patient? Is it a valid question for PSN?
• To what extent should we reduce the dimensions to make sense of the data without
stripping it of its important nuances?
• Does combining (fusing) the networks smooth out or preserve the heterogeneity
underlying the structure of each type of data?
• Does the PSN actually group patients the way a clinician
would?
• Are there data types that cannot be integrated?
