1. Network analysis of biomedical data
Artem Ryblov, Alexey Zaikin, Oleg Blyuss, John Timms
Lobachevsky State University of Nizhniy Novgorod
Institute for Women’s Health, UCL
19.05.15
2. Goal
Analyze biomedical data in order to predict diseases
(pancreatic cancer, diseases of the digestive system) at
various stages with the help of machine learning techniques
4. • The biggest program in the field of women's cancer
research
• Screening study: 200 out of 200 000 women – 100
cases/controls, 100 oncomarkers
• Data available up to 12 years prior (!) to diagnosis
• Data stored in biobank and available for later
research
Pancreatic cancer
5. Biomedical data
…
An oncomarker is a biomarker found in the blood, urine, or body
tissues that can be elevated in cancer. There are many different
tumor markers, each indicative of a particular disease process,
and they are used in oncology to help detect the presence of cancer.
93 markers
7. In search of a network oncomarker,
or a multimarker assay
8. • Computer networks, e.g. WWW (the Internet)
• Functional networks, e.g. part of the genome, human brain
• What to do if we have many variables for cases/controls?
Pancreatic cancer: 100 markers, 14 cases, 36 controls
14. Topological indexes: 32 well-established indexes (centrality
scores), mean/max/min degree of nodes, betweenness,
closeness, PageRank, ...
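To make the index list concrete, here is a minimal sketch of two of the named quantities, mean/max/min node degree and closeness, computed on a small undirected network. The four-node graph and the marker names `m1` to `m4` are hypothetical and are not taken from the study; the closeness definition used (reachable nodes divided by the sum of shortest-path lengths) is one common convention and is assumed here.

```python
from collections import deque

def degree_stats(adj):
    """Return (mean, max, min) node degree of an undirected graph."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    return sum(degrees) / len(degrees), max(degrees), min(degrees)

def closeness(adj, source):
    """Closeness of `source`: reachable nodes / sum of shortest-path lengths."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in unweighted graphs
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

# Hypothetical network: nodes are markers, edges link markers whose
# joint behaviour deviates from the reference group (an assumption).
adj = {
    "m1": {"m2", "m3"},
    "m2": {"m1", "m3"},
    "m3": {"m1", "m2", "m4"},
    "m4": {"m3"},
}

print(degree_stats(adj))     # (mean, max, min) degree
print(closeness(adj, "m3"))  # hub node
print(closeness(adj, "m4"))  # peripheral node
```

Each subject's network would yield one such vector of indexes, which then serves as the feature vector for classification.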
Multivariate forward-backward selection in a logistic model:
11 indexes: 97% AUC vs 95% AUC with logistic regression only
For proteomics data the improvement is more impressive:
87% AUC vs 76% AUC with logistic regression only
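The AUC figures above compare classifiers by how well their scores separate cases from controls. As a reminder of what the number measures, the sketch below computes AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The scores are made up for illustration and are not study data.

```python
def auc(case_scores, control_scores):
    """Fraction of (case, control) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical classifier outputs (e.g. predicted probabilities).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]

print(auc(cases, controls))  # 11 of 12 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.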
Conclusion
16. • Network analysis of oncomarkers is a way to early
diagnosis of pancreatic cancer
• The parenclitic network approach can be used in
multimarker analysis where the number of markers is
large
• Open questions: Which indices can we use? Which
data can we analyze?
• Extend the parenclitic network approach to categorical
data (smoking, taking medicines, hormone therapy)
• Use cross-validation techniques
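For the cross-validation point, a minimal sketch of a stratified k-fold split is given below. It assumes the 14 cases / 36 controls quoted earlier for pancreatic cancer; the function name `stratified_folds`, the choice of k = 5, and the seed are illustrative assumptions, and model fitting is left out.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with a similar case/control ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

labels = [1] * 14 + [0] * 36  # 1 = case, 0 = control
folds = stratified_folds(labels, k=5)

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # Fit the model on train_idx, then score test_idx (fitting omitted here).
    print(len(train_idx), len(test_idx))
```

With so few cases, stratification keeps at least a couple of cases in every held-out fold, so a per-fold AUC remains defined.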