Integration and analysis of high throughput data types for insights into complex disease - Sarah-Jane Schramm

Integrative ‘-omics’ – wherein multiple types of high-throughput data are combined and analysed together – continues to grow in popularity for its potential to illuminate the basis of complex diseases. Our work explores different ways of combining such data to reveal insights into cancer biology.

Published in: Science, Technology


  1. 1. Integration and analysis of high-throughput data types for insights into complex disease. National Council of Women NSW; Olena Pchilka Branch of the Ukrainian Women’s Association NSW. SARAH-JANE SCHRAMM
  2. 2. Multiple approaches
  3. 3. Melanoma › High and rising incidence › Aggressive and therapy resistant, surgical resection is key › Same stage disease can have markedly different survival outcomes › Patient outcome predicted using clinical and histological features › Limited predictive power for individual patients. Stages I & II, primary melanoma; Stage III, lymphatic drainage from primary (nodal metastases); Stage IV, further dissemination (distant metastases). Image adapted and reproduced from LANCET ONCOLOGY|Vol 8|2007
  4. 4. Research aims › New prognostic markers - To determine whether there are significant biomarker and pathway differences between melanomas of good and bad prognosis after resection of nodal metastatic disease; › New therapeutic targets - To identify and validate the principal regulatory pathway abnormalities that characterise metastatic (stage III and IV) melanomas; - To investigate novel genomic drivers of melanoma tumour progression and outcome.
  5. 5. What is Cancer Systems Biology? “Recent findings point to daunting heterogeneity within individuals, and even within tumours over time… …rummaging through that complexity is exactly what systems biologists do… …Rather than focusing on one molecular pathway, this integrative approach blends many contexts, including DNA, RNA, proteins, signalling networks, cells, organs, whole organisms and even environmental factors. This varied data mix requires scientists to build complex mathematical models of cancer, which in turn drive new research questions…” Reprinted from NATURE|Vol 464|2010
  6. 6. How does one do Cancer Systems Biology? 1. Collect and prepare different data types such as, › Gene expression microarray data › MicroRNA expression array data › Proteomic data › Clinical data e.g., survival data › Pathologic data e.g., subtypes › Mutation data e.g., RNA/DNA-seq 2. Combine and interpret data with mathematical models 3. Validate the models Slide adapted from Los Alamos q-bio Summer School, 2009
  7. 7. 1. Collection and preparation of data P Natl Acad Sci USA|Nov. 13|2009 Clin Cancer Res.|Vol. 14|2008 Clin Cancer Res.|Vol. 16|2010 JID|Vol 133|2013 Gene expression microarray data Thank you to Drs Anna Campain, Vivek Jayaswal and Yee Hwa Yang, School of Mathematics & Statistics, The University of Sydney
  8. 8. 1. Collection and preparation of data CLINICAL DATA Tumour_DateBanked Person_Sex Person_DateBirth Person_NumPrim Person_DateLastFUDeath Person_FUStatus Person_StageatBank Person_DateRelapse Age_Analysis Prognosis_TimeSinceLNMet GENOTYPE Tum_BRAFmut Tum_NRASmut Tum_FLT3mut Tum_METmut Tum_PIK3CAmut Tum_PDGFRAmut Tum_EGFRmut PATHOLOGY - PRECEDING PRIMARY Person_NumPrim Prim_Worst Prim_BestGuess Prim_Date_Diag Prognosis_TimeOverall Prim_Site Prim_Site_SunExp Prim_Stage Prim_TStage Prim_NStage Prim_Naevus Prim_Breslow Prim_Mitos Prim_Clark Prim_Histol Prim_Regress Prim_Ulc Prim_Vasc Prim_LymphInv Prim_Satell Sun_Damage_Score NM SSM PATHOLOGY - METASTASES Tum_NumNodesInv Tum_MetSize Tum_Extranodal Tum_CellType Tum_CellSize Tum_Necrosis Tum_Pigment Tum_NonTumour% Clinical, pathological, and mutation type data Thank you to Prof. Richard Scolyer and his team at the Royal Prince Alfred Hospital, The University of Sydney
  9. 9. 1. Collection and preparation of data › Human Protein Reference Database - Keshava Prasad et al. 2009 › iRefWeb - Turner et al. 2010 › BioGRID - Chatr-aryamontri et al. 2013 › MetaCore - From GeneGo Inc. Hairball image generated using Cytoscape (Smoot et al. 2011) Protein-protein interaction data Thanks to Simone Li and Drs Igy Pang and David Fung at the Systems Biology Initiative, the University of New South Wales
  10. 10. 2. Mathematical modeling and interpretation NATURE BIOTECH.|Vol 27|2009
  11. 11. 2. Mathematical modeling and interpretation Results – gene co-expression networks are significantly disturbed among patients with good and poor clinical outcomes › A: › Patients surviving >4yr post resection of metastatic disease › B: › Patients surviving <1yr post resection of metastatic disease › C & D: › Enlarged view (HDAC) PIG. CELL & MEL. RES.|26(5):708-22|2013
  12. 12. 2. Mathematical modeling and interpretation Results – hubs are reproducibly ‘disturbed’ among good and poor outcomes Gene symbol ID Known drug target Causally implicated in cancer(s) Number of interaction partners (k) = 6-38 Previously prognosis- associated Previously progression- associated Previously tumor thickness- associated Protein type1 AKT1 P P Protein Kinase APPL1 P Protein CCNA2 P P Protein CDC25A P Phosphatase CIITA P P Protein CREBBP P Enzyme CSNK2A1 Protein Kinase FANCG P P Protein GATA4 P Transcription Factor GRAP2 P Protein GRB2 Protein HDAC1 P Enzyme PIG. CELL & MEL. RES.|26(5):708-22|2013
  13. 13. 2. Mathematical modeling and interpretation Results – hubs are reproducibly ‘disturbed’ among good and poor outcomes Gene symbol ID Known drug target Causally implicated in cancer(s) Number of interaction partners (k) = 6-38 Previously prognosis- associated Previously progression- associated Previously tumor thickness- associated Protein type1 HIF1A P P P Transcription Factor IKBKB P P Protein Kinase IL16 Receptor Ligand JAK1 P P Protein Kinase KHDRBS1 P Protein MYBL2 P Transcription Factor NF2 P P Protein PDZK1 P Protein PIM1 P P P Protein Kinase PSTPIP1 P Protein PTPN11 P P Phosphatase RAPGEF1 P Regulator PIG. CELL & MEL. RES.|26(5):708-22|2013
  14. 14. 2. Mathematical modeling and interpretation Results – hubs are reproducibly ‘disturbed’ among good and poor outcomes Gene symbol ID Known drug target Causally implicated in cancer(s) Number of interaction partners (k) = 6-38 Previously prognosis- associated Previously progression- associated Previously tumor thickness- associated Protein type1 RBL1 P Protein RBX1 P Enzyme SMAD2 P Transcription Factor SMAD7 P Protein STAMBP P Metalloprotease TGM2 P P Enzyme TLE1 Protein TNF P P P Receptor Ligand › 9 are already known drug targets (although not in melanoma) › 8 already causally implicated in other cancers › 5 previously associated with melanoma progression or prognosis or indirectly associated via correlation with tumor thickness (more than would be expected by chance) PIG. CELL & MEL. RES.|26(5):708-22|2013
  15. 15. 2. Mathematical modeling and interpretation Results – top ranking hubs are cancer-associated both individually (below) and as a gene set (data not shown) PIG. CELL & MEL. RES.|26(5):708-22|2013
  16. 16. 2. Mathematical modeling and interpretation Results - top ranking hubs can be used together to predict patient outcome. For each cohort: sample size (n good outcome; n poor outcome), classes compared, and class prediction error rate (LOOCV under KNN). Mann: 47 (23;25), survival >4yr with no sign of relapse or <1yr after surgical resection of stage III disease, 0.33; Bogunovic: 33 (23;10), survival ≥1.5yr or <1.5yr since metastasis, 0.24; Jönsson: 54 (7;47), overall survival, 0.20; John: 24 (10;14), time taken to tumor progression from stage III to stage IV disease ≥2yr or <2yr, 0.29. • Comparison with standard-of-care prognostic markers • Novel proposed prognostic biomarkers should be tested for improved performance relative to current biomarkers (McShane, Altman et al. 2005) • We compared the prediction accuracy of our 32-hub classifier with the prediction accuracy of the four most statistically significant clinico-pathologic prognostic parameters in stage III melanomas: i.e., number of tumor-positive lymph nodes, tumor burden at the time of staging (microscopic v. macroscopic), presence or absence of primary tumor ulceration, and thickness of the primary melanoma (Balch, Gershenwald et al. 2009). • The clinico-pathologic parameters gave a misclassification rate of 56% for our set of 48 patients, which is less accurate than the misclassification rate of 33% obtained for this cohort using the hub-based classifier. PIG. CELL & MEL. RES.|26(5):708-22|2013
  17. 17. 3. Validation with a view to a mechanism…? › Exome sequencing data (Hodis et al. 2012) › Calculation of functional mutation burden for ~16,000 genes (Broad Institute software) › Functional mutation burden is significantly (P<0.05) higher in protein interaction partners of top-ranking hubs than would be expected by chance › So, is functional mutation burden a pathogenic mechanism behind the differential network behaviour we observe between patients with good and poor clinical outcome? › If so, can differential network behaviour act as a compass by indicating genomic areas (i.e., members of disturbed networks) that should be carefully scrutinized (including non-coding regions) for undiscovered and potentially targetable mutations? › More work is needed! An association between network-type and functional mutation burden PIG. CELL & MEL. RES.|26(5):708-22|2013
  18. 18. Summary and conclusions • Used a large-scale, ‘systems biology’ approach to identify features of intracellular networks that are perturbed in poor-prognosis metastatic melanoma. • Showed this to be consistent in a number of independent patient cohorts identifying: • A portfolio of high priority potential targets for therapy, characterised by enrichment for cancer pathways, existing cancer drug targets, and functional mutation burden • Gene expression of the 32 hubs forms a new, a priori-selected prognostic gene expression signature in the setting of metastatic melanoma: a critical turning point for many patients for which therapeutic options are very limited (but further validation needed). • Present work is focussed on investigating our preliminary observation that network disturbances are associated with higher functional mutation burden • Modelling and integration of different data types to answer clinically relevant questions is ongoing
  19. 19. Square One. › Perform equivalent experiments using data from larger cohorts as well as other cancer types to see whether the observation can be repeated › So, - 1. Collect and prepare data: breast (TCGA), ovarian (Metabric), and melanoma (in-house/TCGA), lung (TCGA) • Permissions and applications…yikes! - 2. Mathematical modelling and interpretation • In collaboration with The USYD Maths and Stats team (Yee Hwa Yang, Shila Ghazanfar, and John Ormerod) • Software generated in-house and available externally (VAN, Jayaswal et al. 2013; MuText and InVex – Broad Institute, Hodis et al. 2012) - 3. Validation… An association between network phenotype and functional mutation burden? Sincere thanks to Dr Yee Hwa Yang and Shila Ghazanfar for their essential collaboration in this work
  20. 20. Software spruik. VAN: identifying biologically perturbed networks using differential variability analysis BMC RES. NOTES.|6(430)|2013, special thanks to Dr Vivek Jayaswal for his invaluable collaboration
  21. 21. Issues common among different integration approaches • Power • Handling of prior knowledge biases • Visualisation • Maintaining clinical relevance • Computational search space
  22. 22. Acknowledgements › UNSW › Marc Wilkins - Simone Li - Chi Nam Ignatius Pang - David Fung - Apurv Goel - Natalie Twine › USYD › Graham Mann - Gulietta Pupo & Varsha Tembe › Swetlana Mactier › Richard Scolyer (RPA) › Yee Hwa Yang - Anna Campain - Vivek Jayaswal - Kaushala Jayawardana - Shila Ghazanfar My contact details: ssch2971@uni.sydney.edu.au p. 0408 260 588
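
Slide 16 reports class prediction error rates estimated by leave-one-out cross-validation (LOOCV) under k-nearest neighbours (KNN). The following is a minimal sketch of that general procedure, not the authors' code. The function names, the choice of Euclidean distance, the value of k, and the toy two-gene expression values are all assumptions made for illustration; the slides do not specify them.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to `query`.

    Euclidean distance is an assumption; the slides do not name a metric.
    """
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_error(x, y, k=3):
    """Leave-one-out error rate: hold out each sample once, predict it from the rest."""
    errors = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_x, train_y, x[i], k) != y[i]:
            errors += 1
    return errors / len(x)


# Hypothetical toy data: expression of two hub genes in six patients.
toy_x = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]]
toy_y = ["good", "good", "good", "poor", "poor", "poor"]

if __name__ == "__main__":
    print("LOOCV error rate:", loocv_error(toy_x, toy_y, k=3))
```

In a real analysis each row would hold one patient's expression values for all hub genes in the classifier, and any feature selection would need to be repeated inside each cross-validation fold to avoid an optimistic error estimate.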

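Slide 17 states that functional mutation burden is higher in the protein interaction partners of top-ranking hubs than would be expected by chance. The slides do not say which statistical test was used, so the sketch below assumes one common option: a one-sided permutation test that compares the mean burden of the partner genes with the mean burden of randomly drawn gene sets of the same size. The function name, gene names, and burden values are hypothetical.

```python
import random


def permutation_p_value(burden, partners, n_perm=2000, seed=0):
    """One-sided permutation p-value for 'partners have higher mean burden than chance'.

    burden   : dict mapping gene name -> functional mutation burden score
    partners : list of gene names (interaction partners of the hubs)
    Random gene sets of the same size are drawn without replacement.
    """
    rng = random.Random(seed)
    genes = list(burden)
    size = len(partners)
    observed = sum(burden[g] for g in partners) / size
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, size)
        if sum(burden[g] for g in sample) / size >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 20 genes whose burden score equals their index.
toy_burden = {f"gene{i}": float(i) for i in range(20)}
high_partners = ["gene17", "gene18", "gene19"]
low_partners = ["gene0", "gene1", "gene2"]

if __name__ == "__main__":
    print("p (high-burden partners):", permutation_p_value(toy_burden, high_partners))
    print("p (low-burden partners):", permutation_p_value(toy_burden, low_partners))
```

A simple random draw ignores confounders such as gene length or network degree; a more careful null would sample gene sets matched on those properties.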