This project was completed during the Lviv Data Science Summer School 2016 (http://cs.ucu.edu.ua/en/summerschool). The project supervisor was Elena Sügis. The goal was to apply machine learning and bioinformatics methods, including normalization and classifier training, to find the proteins that differ most significantly between Alzheimer’s patients and healthy controls.
8. WHAT WE DO
Preprocess data:
Take only significant proteins
Exclude highly correlated proteins
Take top 10 proteins (by p-value)
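The three preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `select_proteins`, the Welch t-test, the 0.05 significance level, the 0.9 correlation cutoff and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from scipy import stats


def select_proteins(X, y, names, alpha=0.05, max_corr=0.9, top_k=10):
    """Return up to top_k significant, mutually uncorrelated protein names.

    Hypothetical helper: thresholds and the choice of test are assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Welch's t-test per protein (column): patients (1) vs. controls (0).
    _, pvals = stats.ttest_ind(X[y == 1], X[y == 0], axis=0, equal_var=False)
    # Absolute Pearson correlation between every pair of proteins.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    # Visit proteins from the smallest to the largest p-value.
    for j in np.argsort(pvals):
        # Stop once enough proteins are kept or the rest are not significant.
        if len(kept) == top_k or not (pvals[j] < alpha):
            break
        # Skip a protein that is highly correlated with one already kept.
        if all(corr[j, k] < max_corr for k in kept):
            kept.append(j)
    return [names[j] for j in kept]


# Synthetic stand-in data: 20 controls and 20 patients, 4 made-up proteins.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 4))
X[:, 0] += 3.0 * y                               # "A": shifted in patients
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=40)   # "B": near copy of "A"
X[:, 2] += 2.0 * y                               # "C": shifted in patients
selected = select_proteins(X, y, ["A", "B", "C", "D"])
print(selected)
```

Visiting proteins in p-value order means that, of two highly correlated proteins, the one with the smaller p-value is the one that survives.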
10. WHAT WE DO NEXT
Choose the model
Train the model
Get results
RANDOM FOREST
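The choose, train and evaluate steps could look something like the sketch below, assuming a random forest classifier from scikit-learn. The synthetic data, the 70/30 split and the number of trees are assumptions chosen for illustration; they are not taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 200 samples x 10 selected proteins.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)                    # 1 = patient, 0 = control
X = rng.normal(size=(200, 10)) + 1.5 * y[:, None]   # patients shifted upward

# Hold out part of the data so the model is scored on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Choose and train the model.
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X_train, y_train)

# Get results: share of correctly predicted samples in each set.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Reporting accuracy on both sets, as the slides do, shows how much of the training performance carries over to data the model has not seen.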
12. 98.7% correctly predicted results in the training set
92.6% correctly predicted results in the test set
p = 2.2e-16
14. WHAT DOES IT MEAN?
● PCBD2 + PTCD2 + CENTA2 + ANKHD1
● Support for previous research
● (Nagele et al., 2011)
● PCBD2 + PTCD2 as the heroes
● (Han et al., 2012, Acharya et al., 2012)
18. References:
Acharya, N.K., Nagele, E.P., Han, M., Coretti, N.J., DeMarshall, C., Kosciuk, M.C., Boulos, P.A. & Nagele,
R.G. (2012). Neuronal PAD4 expression and protein citrullination: possible role in production of
autoantibodies associated with neurodegenerative disease. Journal of Autoimmunity, 38(4):369-80.
Han, M., Nagele, E., DeMarshall, C., Acharya, N., & Nagele, R. (2012). Diagnosis of Parkinson's Disease
Based on Disease-Specific Autoantibody Profiles in Human Sera. PLoS ONE, 7(2): e32383.
Nagele, E., Han, M., DeMarshall, C., Belinka, B., & Nagele, R. (2011). Diagnosis of Alzheimer's Disease
Based on Disease-Specific Autoantibody Profiles in Human Sera. PLoS ONE, 6(8): e23112.