Patient subtypes: real or not?

Presented at AI & Machine Learning for Computational Medicine (Imperial College London, March 2018)

Published in: Science

  1. Patient subtypes: real or not? (Clustering, you’re doing it wrong)
     Paul Agapow, Data Science Institute <p.agapow@ic.ac.uk> @agapow, March 2018
  2. Precision medicine
  3. Patient assignment is a clustering problem
     • Precision medicine (assign patients to clusters)
     • Translational medicine (infer clusters from patients)
  4. Clustering requires many decisions
     • Similarity
     • Group boundaries
     • Spectra
     • Ground truth
     • Noise / non-clustered
     • Resolution / cut-offs
     • Uninteresting clusters
     • Stable & robust
  5. So are clusters real?
     Every dataset contains clusters, with different sets of clusters being revealed by different methods, but not all of these clusters are real or interesting or meaningful. (Paraphrased from Christian Hennig)
  6. Example: Diabetes
     • 6 clinical vars produce 5 clusters
     • Validated in other datasets
  7. But! After van Smeden, Harrell, Dahly:
     • Dimension reduce 6 to 5 only?
     • 2 related random vars can generate 6 “clusters”
     • Not validated in other data types
  8. Example: Asthma
     • Complex & heterogeneous
     • Many attempts to stratify: 3, 4, 6+ clusters
  9. Are asthma clusters real?
     • Use multiple methods & multiple datasets to validate
     • Compare nested clusters with homogeneity & completeness
  10. Are all asthma clusters equally real?
     • Requires many more genes to identify TAC3a & b than 1 or 2
     • Because different clusters cluster differently
  11. Biclustering
     • Better approach?
     • Simultaneously group samples & features
     • Relieves assumption of clustering everything
  12. Take home
     • There are many ways to cluster & thus many clusters
     • Lots of different ways to be a cluster, even in the same dataset
     • Too easy to fish
     • Validate with other data types
  13. Or ...
     Clustering methods are hypothesis generators; cluster partitions are hypotheses and need to be validated or proven to be useful.
  14. Thanks
     • Nazanin Kermani
     • Mansoor Saqi
     • Axel Oehmichen
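
Two of the points above (slide 7: two related random variables can yield "clusters"; slide 9: comparing nested clusterings with homogeneity and completeness) can be sketched in a few lines. This is a minimal illustration and not the analysis from the talk. It assumes k-means as the clustering method (the slides do not name one), uses made-up simulation settings (300 points, the 0.8 and 0.6 coefficients, the seeds), and implements the standard entropy-based definitions of homogeneity and completeness by hand so that only the Python standard library is needed.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its old centre
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def entropy(a):
    """Shannon entropy H(a) in nats of a list of labels."""
    n = len(a)
    return -sum(cnt / n * math.log(cnt / n) for cnt in Counter(a).values())


def cond_entropy(a, b):
    """Conditional entropy H(a | b) in nats for two equal-length label lists."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum(nab / n * math.log(nab / b_counts[bv])
                for (_, bv), nab in joint.items())


def homogeneity(reference, clusters):
    """1 - H(reference | clusters) / H(reference): is each cluster pure?"""
    h_ref = entropy(reference)
    return 1.0 if h_ref == 0 else 1.0 - cond_entropy(reference, clusters) / h_ref


def completeness(reference, clusters):
    """1 - H(clusters | reference) / H(clusters): is each class kept together?"""
    return homogeneity(clusters, reference)


# Assumed toy data: two correlated Gaussian variables, no subgroups built in.
data_rng = random.Random(42)
points = []
for _ in range(300):
    x = data_rng.gauss(0, 1)
    y = 0.8 * x + 0.6 * data_rng.gauss(0, 1)
    points.append((x, y))

# k-means partitions the data into however many groups it is asked for.
six = kmeans(points, k=6, seed=1)
three = kmeans(points, k=3, seed=1)

# Treat the coarse partition as the reference and score the fine one against it.
homog = homogeneity(three, six)
comp = completeness(three, six)
print(f"k=6 returned {len(set(six))} clusters from structureless data")
print(f"k=6 vs k=3: homogeneity={homog:.2f}, completeness={comp:.2f}")
```

If the fine partition were perfectly nested inside the coarse one, homogeneity would be 1 while completeness would fall below 1, because each coarse group is split across several fine clusters. Neither score says whether the clusters correspond to real patient subtypes; the data here contain none by construction.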
