PSN for Precision Medicine
1. Patient Similarity Networks for Precision Medicine
Thi Nguyen, Ph.D. Candidate
Graduate Biomedical Sciences | Immunology Theme
University of Alabama at Birmingham (UAB)
kimthi@uab.edu
Clinical Informatics Journal Club
October 23rd, 2018
2. Outline
• Current landscape in building a predictive risk model
• Patient similarity network (PSN) – emerging paradigm for clinical prediction
• Advantages of PSN
• Two PSN examples: Similarity Network Fusion and netDx
• Challenges of PSN analytics
• Vision for a PSN-based tool in the future clinic
3. Disease risk calculator
http://www.cvriskcalculator.com/
• ASCVD calculator = 10-year risk of heart disease/stroke
• 13 pieces of information: gender, age, blood lipid levels,
blood pressure, history.
• result of 50 years of development and refinement
• continues to be adjusted
Risk calculator = set of risk factors -> calculated disease risk, used to support monitoring,
diagnosis and treatment.
4. Fig. 1. Developing risk calculators
Ideal model:
• accurate
• generalizable
• reasonable time
• interpretable by clinicians
7. Predictive risk models – current needs
• Integrate diverse data types (genomics, metabolomics, imaging, EHR ...)
• Interpretable
• Handle sparse/ missing data
• Maintain patient privacy
• Scale up: keep pace with the scale and complexity of the data
8. Network science
• New scientific discipline, broadly interdisciplinary approach
to study complex systems
• Developed its formalism from graph theory and uses statistical
physics as conceptual framework.
• Key concept: regardless of the domain (computer, social,
biological), all networks are driven by the same fundamental
organizing principles.
• Common set of mathematical tools to explore these systems.
http://networksciencebook.com/
by A.-L. Barabási.
9. Why network science for new predictive risk models?
• Handle heterogeneous data
• Missing data is handled naturally
• Easy visualization: when data are presented as a network, groupings and decision boundaries
can be seen directly
• Intuitive: analogous to clinical diagnosis, where physicians relate a patient’s case to
previous patients they have seen (a mental database)
• A PSN does not use patient data directly -> patient privacy is preserved -> easier to scale up
• Many existing network science methods allow data integration by fusing
networks.
• netDx: uses biological pathway-based features to improve accuracy
and generalization and to increase the interpretability of genomic data.
10. Patient similarity networks
• each node = individual
• edge = pairwise similarity for a given feature
• Labelled patients can be grouped (clustering/ unsupervised classification), and a
patient with unknown status can be assigned to a group based on their
similarity to the members of that group.
• each feature (=view) is represented as a network of pairwise patient similarities
• views can be integrated/fused to identify subgroups / predict outcome.
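The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the method of either framework discussed later: the patient IDs, feature values and labels are hypothetical toy data, cosine similarity stands in for whatever similarity measure suits the feature, and "highest mean similarity to a group" is an assumed assignment rule.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def build_psn(profiles):
    """One view: an edge {(p, q): similarity} for every pair of patients."""
    ids = sorted(profiles)
    return {
        (p, q): cosine_similarity(profiles[p], profiles[q])
        for i, p in enumerate(ids)
        for q in ids[i + 1:]
    }

def assign_group(new_profile, profiles, labels):
    """Assign an unlabelled patient to the group it is most similar to on average."""
    sims_by_group = {}
    for pid, profile in profiles.items():
        sims_by_group.setdefault(labels[pid], []).append(
            cosine_similarity(new_profile, profile))
    return max(sims_by_group,
               key=lambda g: sum(sims_by_group[g]) / len(sims_by_group[g]))

# Hypothetical toy data: one feature view, three measurements per patient.
profiles = {
    "P1": [5.0, 1.0, 0.2],
    "P2": [4.8, 1.2, 0.1],
    "P3": [0.3, 4.0, 3.9],
    "P4": [0.2, 4.2, 4.1],
}
labels = {"P1": "A", "P2": "A", "P3": "B", "P4": "B"}

network = build_psn(profiles)                                # nodes = patients, edges = similarity
predicted = assign_group([4.5, 0.9, 0.3], profiles, labels)  # patient with unknown status
print(predicted)
```

Each additional feature (view) would get its own `build_psn` call, giving one network per view.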
11. Similarity Network Fusion
“Similarity network fusion for aggregating data types on a genomic scale.” Nature Methods 2014
1. Construct a similarity network for each data type
2. Fuse these networks into a single network using a nonlinear combination method
• Data types: mRNA, DNA methylation and miRNA
• Singular value decomposition -> cosine similarity -> fuse network by iterative boosting
• This method has been applied to subtype medulloblastoma + pancreatic ductal
adenocarcinoma tumors + subtypes of diabetes.
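The two steps above can be illustrated with a simplified fusion loop. This is a sketch under stated assumptions, not the published implementation: it uses a cross-diffusion style update (each view's network is repeatedly diffused through the other views' networks via its own nearest-neighbour graph), plain row normalization, and cosine similarity on raw features with no decomposition step. The cited paper's exact kernel and normalization are not reproduced here, and the two feature views are hypothetical toy data.

```python
import math

def cosine_affinity(features):
    """Pairwise cosine similarity between patients (rows of `features`)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm
    return [[cos(a, b) for b in features] for a in features]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def row_normalize(m):
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total for v in row])
    return out

def knn_kernel(w, k):
    """Keep each patient's k most similar other patients, then row-normalize."""
    n = len(w)
    sparse = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: w[i][j], reverse=True)
        keep = set(ranked[:k])
        sparse.append([w[i][j] if j in keep else 0.0 for j in range(n)])
    return row_normalize(sparse)

def fuse(views, k=2, iterations=5):
    """Fuse two or more affinity matrices into one row-normalized matrix."""
    n = len(views[0])
    status = [row_normalize(w) for w in views]   # full similarity per view
    local = [knn_kernel(w, k) for w in views]    # local (kNN) similarity per view
    for _ in range(iterations):
        updated = []
        for v in range(len(views)):
            others = [status[u] for u in range(len(views)) if u != v]
            mean_other = [[sum(p[i][j] for p in others) / len(others)
                           for j in range(n)] for i in range(n)]
            diffused = matmul(matmul(local[v], mean_other), transpose(local[v]))
            updated.append(row_normalize(diffused))
        status = updated
    return [[sum(p[i][j] for p in status) / len(status) for j in range(n)]
            for i in range(n)]

# Hypothetical toy data: 6 patients (0-2 and 3-5 form two groups), two data types.
view_a = [[5.0, 1.0], [4.5, 1.2], [5.2, 0.8], [1.0, 5.0], [1.3, 4.6], [0.8, 5.1]]
view_b = [[3.0, 1.5], [2.8, 1.8], [3.2, 1.2], [1.5, 3.0], [1.8, 2.9], [1.2, 2.8]]

fused = fuse([cosine_affinity(view_a), cosine_affinity(view_b)], k=2, iterations=5)
for row in fused:
    print([round(v, 3) for v in row])
```

The returned matrix is row-normalized rather than symmetric; averaging it with its transpose is one way to obtain a symmetric network before clustering.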
12. Similarity Network Fusion
“Similarity network fusion for aggregating data types on a genomic scale.” Nature Methods 2014
Figure panels: mRNA, miRNA, DNA methylation, SNF-combined (n = 215 patients with GBM)
• node = patient
• node size = survival
• edge thickness = similarity
13. netDx: a supervised patient classification framework
WORKFLOW
https://www.biorxiv.org/content/early/2018/05/25/084418
14. netDx: a supervised patient classification framework
Network integration:
• uses GeneMANIA, a network integration algorithm that reduces redundant networks and
weights each network according to its discriminatory power -> linear combination
-> composite network
Input data design:
• any kind of data, as long as the measure of patient similarity can be defined
(Pearson correlation, cosine similarity, normalized age difference)
• to address the curse of omics data (too many features/ overfitting), measurements are
grouped into biological pathways (~2000), which also increases interpretability.
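A minimal sketch of this input design: one similarity network per pathway, using Pearson correlation between patients over the genes in that pathway, plus a normalized age difference for a clinical variable. The gene names, pathway names, expression values and the exact form of `age_similarity` are hypothetical and chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length measurement vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def age_similarity(age_a, age_b, age_range):
    """Normalized age difference turned into a similarity in [0, 1] (assumed form)."""
    return 1.0 - abs(age_a - age_b) / age_range

def pathway_networks(expression, pathways):
    """Build one patient similarity network per pathway: {pathway: {(p, q): r}}."""
    patients = sorted(expression)
    networks = {}
    for name, genes in pathways.items():
        edges = {}
        for i, p in enumerate(patients):
            for q in patients[i + 1:]:
                x = [expression[p][g] for g in genes]
                y = [expression[q][g] for g in genes]
                edges[(p, q)] = pearson(x, y)
        networks[name] = edges
    return networks

# Hypothetical toy data: 3 patients, 6 genes, 2 pathways of 3 genes each.
expression = {
    "P1": {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 5.0, "g5": 3.0, "g6": 1.0},
    "P2": {"g1": 2.0, "g2": 4.0, "g3": 6.5, "g4": 1.0, "g5": 2.0, "g6": 4.0},
    "P3": {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": 4.0, "g5": 3.0, "g6": 1.5},
}
pathways = {"pathway_X": ["g1", "g2", "g3"], "pathway_Y": ["g4", "g5", "g6"]}

networks = pathway_networks(expression, pathways)
for name, edges in networks.items():
    print(name, {pair: round(r, 3) for pair, r in edges.items()})
print(age_similarity(40, 50, age_range=40))
```

Grouping by pathway turns thousands of gene-level features into a few thousand networks, each of which can be named and interpreted.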
Feature selection:
• cross-validation to measure sensitivity and specificity
Class prediction:
• a patient is assigned to the class in which they rank highest, i.e. the class whose
members they are most similar to.
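The integration and prediction steps can be sketched as a weighted sum of networks followed by a per-class similarity score. This is a simplified stand-in, not GeneMANIA's algorithm: the network weights are fixed by hand here (netDx obtains them from GeneMANIA), the score is a plain mean similarity to each class's labelled members, and all names and values are hypothetical.

```python
def composite_network(networks, weights):
    """Weighted linear combination of per-feature similarity networks."""
    combined = {}
    for name, edges in networks.items():
        for pair, sim in edges.items():
            combined[pair] = combined.get(pair, 0.0) + weights[name] * sim
    return combined

def similarity(network, p, q):
    """Look up an undirected edge; missing edges count as zero similarity."""
    return network.get((p, q), network.get((q, p), 0.0))

def predict_class(patient, network, labels):
    """Score each class by mean similarity to its members; return the best class."""
    scores = {}
    for cls in set(labels.values()):
        members = [m for m, c in labels.items() if c == cls]
        scores[cls] = sum(similarity(network, patient, m) for m in members) / len(members)
    return max(scores, key=scores.get), scores

# Hypothetical toy networks: one pathway view and one clinical (age) view.
networks = {
    "pathway_X": {("P1", "P2"): 0.9, ("P1", "P3"): 0.1, ("P2", "P3"): 0.2,
                  ("P1", "new"): 0.8, ("P2", "new"): 0.7, ("P3", "new"): 0.3},
    "age":       {("P1", "P2"): 0.8, ("P1", "P3"): 0.4, ("P2", "P3"): 0.5,
                  ("P1", "new"): 0.5, ("P2", "new"): 0.6, ("P3", "new"): 0.9},
}
weights = {"pathway_X": 0.7, "age": 0.3}   # assumed weights, set by hand
labels = {"P1": "subtype_1", "P2": "subtype_1", "P3": "subtype_2"}

composite = composite_network(networks, weights)
predicted, scores = predict_class("new", composite, labels)
print(predicted, scores)
```

Because the composite is a linear combination, each network's weight shows directly how much that feature contributes to the prediction.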
15. netDx to predict ependymoma subtypes
• microarray data + clinical data
• Pearson correlation = similarity
• regression to correct batch effects
• Lasso regression within cross-validation to
prefilter genes
Pathway-based design:
• genes were grouped into 2118 networks, one per pathway
• pathway information was aggregated from HumanCyc, IOB’s NetPath, Reactome, NCI
curated pathways, MSigDB and Panther.
16. Challenges for PSN analytics
• large data sizes (thousands of genomes)
• improve feature selection
• improve signal-to-noise ratio automatically
• characterize patient heterogeneity (disease subtypes)
• make best use of complex genomics layers (tissue-specific
variants)
• tuning parameters
• build on prior knowledge/ data, e.g. known gene-gene
interaction, epigenetic information.
18. Conclusions
• The Patient Similarity Network is an emerging method for building predictive risk models
• Many advantages over other approaches: integrates heterogeneous data types,
tolerates missing data, maintains patient privacy, and is readily interpretable.
• Since it is a new paradigm, many implementation challenges remain
• Similarity Network Fusion and netDx are two frameworks that have implemented PSN
successfully
• Opportunities
19. Questions/ Thoughts/ Comments
• Can pairwise comparison capture all the complexity of gene expression in each
patient? Is it a valid question for PSN?
• To what extent should we reduce the dimensions to make sense of the data without
stripping it of its important nuances?
• Does combining the networks (fusing them) smooth out or preserve the heterogeneity
underlying the structure of each type of data?
• Does a PSN actually group patients the way a clinician
would?
• Are there data types that cannot be integrated?