1. Bioinformatic Prioritisation of Synthetic Lethal Targets
for Drug Activity Against E-cadherin Deficient Cancers
S. Thomas Kelly, Parry J. Guilford, and Michael A. Black
Department of Biochemistry, University of Otago, Dunedin, New Zealand
Te Aho Matatū, Centre for Translational Cancer Research
INTRODUCTION
Expression of the tumour suppressor gene E-cadherin (CDH1) is lost in a range of
cancer types through multiple mechanisms [1]. Traditionally, tumour
suppressors have been unattractive drug targets, despite their importance in tumour
growth and development. More recently, however, the concept of “synthetic lethality”
has provided a mechanism by which tumour cells that have lost a specific tumour
suppressor can be precisely targeted through an essential partner gene [2]. A synthetic
lethal drug design approach that indirectly targets CDH1 deficient cells could therefore
be used to develop effective chemopreventatives and treatments with fewer adverse
effects than existing anti-cancer regimens.
Experimental studies of synthetic lethality have been developed in cancer cell lines
and model organisms. However, high-throughput RNAi and drug screens are costly,
labour-intensive, and conducted in experimental models which may not reflect the
genetic background or variation of tumours in patients. We have developed a
bioinformatics methodology to both overcome some of the limitations of experimental
models, and to augment experimental data. Here we present an example relating to
breast cancers with low E-cadherin expression.
METHODOLOGY
Microarray and RNASeq gene expression data for breast and stomach cancers were
sourced from public databases: TCGA, GEO, ArrayExpress, and caArray [3,4]. A statistical
methodology was then developed to predict synthetic lethal partners of a pre-specified
target gene. For ease of use by non-statisticians, this methodology was also implemented in a
web-accessible framework using the R Shiny package.
RESULTS
The known synthetic lethal interaction between the BRCA genes and PARP1 was
predicted by our methodology. Predicted synthetic lethal partners of CDH1 are enriched
for chromatin remodelling, cytoskeleton, and cell signalling functions. As the heatmaps
show, expression of significant CDH1 partners varies considerably across CDH1 deficient
tumours. Low candidate target expression suggests therapeutic resistance in basal or
ER negative tumours. High expression clustered the genes into 3 closely correlated
groups corresponding to distinct enriched biological pathways. Considering each group
separately provides biological insights and an incentive to develop combination therapies.
                           Candidate Gene (e.g., SVIL)
Query Gene (e.g., CDH1)    Low                            Medium    High
Low                        Observed less than expected    -         Observed more than expected
Medium                     -                              -         -
High                       -                              -         -
A strategy for synthetic lethal detection using 3-quantile gene expression data
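The 3-quantile strategy can be illustrated with a short sketch. This is not the authors' R implementation. It is a minimal Python illustration that assumes each gene is split into low, medium, and high thirds, that a chi-square test of independence is applied to the 3x3 table of counts, and that the direction check requires fewer low/low samples and more low/high samples than expected. The function names (tertile_labels, sl_test) and the exact direction rule are assumptions made for illustration only.

```python
import math
from bisect import bisect_right


def tertile_labels(values):
    """Label each value 0 (low), 1 (medium) or 2 (high) using the
    1/3 and 2/3 cut points of the sorted values (ties may unbalance bins)."""
    ranked = sorted(values)
    n = len(ranked)
    cuts = (ranked[n // 3], ranked[(2 * n) // 3])
    return [bisect_right(cuts, v) for v in values]


def sl_test(query, candidate):
    """Chi-square test on the 3x3 low/medium/high table plus a direction check.

    Returns (chi2, p_value, direction_ok). The direction rule is an assumption:
    low/low observed below expected and low/high observed above expected.
    """
    q = tertile_labels(query)
    c = tertile_labels(candidate)
    n = len(q)
    observed = [[0] * 3 for _ in range(3)]
    for qi, ci in zip(q, c):
        observed[qi][ci] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(3)) for j in range(3)]
    expected = [[row[i] * col[j] / n for j in range(3)] for i in range(3)]
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(3)
        for j in range(3)
        if expected[i][j] > 0
    )
    # Upper tail of the chi-square distribution with 4 degrees of freedom
    # (a 3x3 table) has the closed form exp(-x/2) * (1 + x/2).
    p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)
    direction_ok = (observed[0][0] < expected[0][0]
                    and observed[0][2] > expected[0][2])
    return chi2, p_value, direction_ok


# Hypothetical expression values: the candidate is high when the query is low.
query_expr = [float(v) for v in range(30)]
candidate_expr = [29.0 - v for v in query_expr]
print(sl_test(query_expr, candidate_expr))
```

Running the test once per candidate gene against a fixed query gene such as CDH1 would give one row of a synthetic lethal table. The p-values would then need correction for multiple testing.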
[Workflow diagram; box labels recovered from the figure]
Processed by TCGA Database: TCGA RNA-Seq Transcriptome Profiling; Raw .FASTQ Sequence Files; Aligned .BAM Sequence Files; Gene/Exon Quantification; Normalised by TCGA.
Other box labels: Extract Files in R and Combine into Data Matrix; Combine GAII and Hi-Seq Data; Exclude Genes with Q3=0; Quantile Normalise Samples on Log-Scale; L/M/H Matrix; Query SL Gene; Chi-Sq Test + Direction; SL Table; Rounding and Truncate Table; loop: Query all Genes for SL; Save Gene Files; TS_SL.
Post Prediction Analysis: Summary Plots; Function Gene Set Analysis; Most SL Candidate Gene Analysis; Pairwise SL Interactome Test; Correlation Gene Co-expression matrix.
FUTURE DIRECTIONS
Future directions for this project include replication of findings across datasets from
different sources and microarray platforms. The project is expanding to encompass RNA-Seq
datasets and the comparison of synthetic lethal predictions across cancers and
tissue types. The procedure can be scaled up to test multiple query genes in parallel
with high performance computing [5]. Other research directions, such as functional analysis
of gene sets and gene network analysis, are being developed concurrently.
CONCLUSIONS
We have developed a bioinformatics tool which detects known and potentially novel
synthetic lethal interactions. Synthetic lethal interactions are detectable in
heterogeneous tumours and may occur frequently in the human genome. They
could be exploited for anti-cancer therapy, with the advantages of reduced
adverse effects and specific activity against loss of tumour suppressor gene function.
E-cadherin mutations are an ideal case for developing synthetic lethal treatments against
sporadic and hereditary cancers in multiple tissues. The example of CDH1 synthetic
lethal partners demonstrates the value of integrating bioinformatics analysis to facilitate
drug design against tumour suppressor mutations.
DISCUSSION
Compared with an experimental screen, a bioinformatics approach has the benefits of
reduced cost and the potential for automation, scaling up, and replication of the same
gene across populations and cell types. Analysis of public genomic data captures real
tumour variation, and predictions can be made despite tumour heterogeneity and genomic
instability. Compared with a cell line or xenograft experimental model, however, the
approach is limited by the difficulty of establishing the validity of a novel method, and
by the lack of a mechanism or of the potential to test drug activity in the same system.
The method may also miss useful therapeutic candidates because of variable genetic
background, and it is limited by the population sampled.
We therefore intend to apply this method, integrated with laboratory screening
data, to triage drug targets as part of an ongoing collaboration. This proof of concept
analysis with CDH1 synthetic lethal partners shows that we can detect potential synthetic
lethal interactions, even if they occur in only a subset of patients, and that the predicted
gene targets fall into functional groups. This approach can increase the efficiency of
experimental testing and integrate into a pipeline to develop personalised medicine against
tumour suppressor mutations. Clinical applications include prevention of hereditary cancers
and treatment of sporadic cancers.
REFERENCES
1. Guilford, P.J., et al., E-cadherin germline mutations in familial gastric cancer. Nature, 1998. 392: p. 402-5.
2. Kaelin, W.G., Jr., The concept of synthetic lethality in the context of anticancer therapy. Nat Rev Cancer, 2005. 5: p. 689-98.
3. Soon, W.W., et al., Combined genomic and phenotype screening reveals secretory factor SPINK1 as an invasion and survival factor associated with patient prognosis in breast cancer. EMBO Mol Med, 2011. 3: p. 451-64.
4. Cancer Genome Atlas Research Network, Comprehensive molecular portraits of human breast tumours. Nature, 2012. 490: p. 61-70.
5. Kelly, S.T., et al., Bioinformatic analysis of synthetic lethal genetic interactions in breast cancer. Proceedings of eResearch NZ HPC Applications Workshop; 2014 Jun 30-Jul 2; Hamilton, NZ.
[Heatmap annotation tracks: Normal/Tumour/Metastasis; Ductal/Lobular; Stage; Estrogen Receptor; Progesterone Receptor; HER2 Status; Subtype (PAM50); CDH1 levels; CDH1 Status; Cluster; Significance]
A workflow summary of the procedures involved in bioinformatic prediction of synthetic
lethality from a public database such as The Cancer Genome Atlas (TCGA).
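Two steps of the workflow lend themselves to a brief sketch: excluding genes whose third quartile of expression is zero, and correcting the per-gene p-values for multiple testing so that "FDR significant" partners can be reported. The Python below is an illustration under stated assumptions, not the authors' R code. The log2(x + 1) transform stands in for the unspecified log scale, the Benjamini-Hochberg procedure is assumed as the FDR method, quantile normalisation is omitted, and the function names are hypothetical.

```python
import math
import statistics


def filter_and_log(expression):
    """Drop genes whose third quartile (Q3) is zero, then log-transform.

    `expression` maps a gene name to a list of non-negative values, one per
    sample. The log2(x + 1) transform is an assumption for illustration.
    """
    kept = {}
    for gene, values in expression.items():
        q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
        if q3 > 0:
            kept[gene] = [math.log2(v + 1) for v in values]
    return kept


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: GENE_A is mostly unexpressed and is filtered out.
counts = {"GENE_A": [0, 0, 0, 0, 5], "GENE_B": [1, 3, 7, 15, 31]}
print(filter_and_log(counts))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Filtering before testing reduces the number of hypotheses, which in turn makes the FDR correction less severe for the genes that remain.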
WikiPathways Gene Set                                SL genes in Set   Total Genes in Set   p-value      FDR p-value
GPCRs Class B Secretin-like (WP334)                  8                 24                   0.00017      0.022
Eicosanoid Synthesis (WP167)                         5                 19                   0.0091       0.058
G Protein Signaling Pathways (WP35)                  13                95                   0.017        0.58
Endochondral Ossification (WP474)                    10                66                   0.018        0.58
Steroid Biosynthesis (WP496)                         3                 10                   0.03         0.67
Arrhythmogenic right ventricular cardiomyopathy      10                74                   0.036        0.67
ErbB signaling pathway (WP673)                       8                 55                   0.04         0.67
Small Ligand GPCRs (WP247)                           4                 19                   0.042        0.67
Nucleotide GPCRs (WP80)                              3                 12                   0.049        0.67
AMPK Signaling (WP1403)                              9                 68                   0.051        0.67
Selenium Pathway (WP15)                              8                 60                   0.062        0.73
Prostaglandin Synthesis and Regulation (WP98)        5                 31                   0.066        0.73

WikiPathways Gene Set                                SL genes in Set   Total Genes in Set   p-value      FDR p-value
Adipogenesis (WP236)                                 35                133                  0.00000046   0.00067
Cytochrome P450 (WP43)                               18                61                   0.00022      0.016
Metapathway biotransformation (WP702)                36                176                  0.00094      0.046
Complement and Coagulation Cascades (WP558)          13                50                   0.0055       0.2
Mitochondrial LC-Fatty Acid Beta-Oxidation (WP368)   6                 16                   0.0087       0.23
Prostaglandin Synthesis and Regulation (WP98)        9                 31                   0.0094       0.23
(name not recovered)                                 19                91                   0.012        0.25
Ovarian Infertility Genes (WP34)                     8                 31                   0.028        0.5
Focal Adhesion (WP306)                               32                191                  0.035        0.5
Monoamine GPCRs (WP58)                               8                 33                   0.04         0.5
Calcium Regulation in the Cardiac Cell (WP536)       26                151                  0.04         0.5
Vitamin A and carotenoid metabolism (WP716)          9                 39                   0.041        0.5

WikiPathways Gene Set                                SL genes in Set   Total Genes in Set   p-value      FDR p-value
Focal Adhesion (WP306)                               41                191                  0.000079     0.011
TGF Beta Signaling Pathway (WP560)                   15                54                   0.0011       0.064
Adipogenesis (WP236)                                 28                133                  0.0015       0.064
(name not recovered)                                 29                141                  0.0018       0.064
(name not recovered)                                 24                117                  0.0046       0.093
Arrhythmogenic right ventricular cardiomyopathy      17                74                   0.005        0.093
(name not recovered)                                 22                105                  0.005        0.093
(name not recovered)                                 11                40                   0.0055       0.093
(name not recovered)                                 10                35                   0.0059       0.093
(name not recovered)                                 11                42                   0.0081       0.11
(name not recovered)                                 30                163                  0.0086       0.11
Endochondral Ossification (WP474)                    15                66                   0.009        0.11

Gene Set Analysis for WikiPathways (GeneSetDB) of Subgroups of CDH1 SL partners

Pathway labels from the tables whose rows could not be recovered:
Neural Crest Differentiation (WP2064)
EGF/EGFR Signaling Pathway (WP437)
Integrin-mediated cell adhesion (WP185)
IL-5 signaling pathway (WP127)
Signaling of Hepatocyte Growth Factor Receptor (WP313)
IL-2 signaling pathway (WP49)
MAPK signaling pathway (WP382)
TGF beta Signaling Pathway (WP366)
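The gene set tables report, for each pathway, how many predicted synthetic lethal (SL) genes fall in the set, the set size, and a p-value. The poster does not state which enrichment test was used. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, sketched here in Python as an assumption rather than as the method actually used. The function name and the background and SL gene counts in the usage line are hypothetical.

```python
from math import comb


def enrichment_pvalue(overlap, set_size, n_selected, n_background):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `overlap` set members among `n_selected` genes drawn without replacement
    from `n_background` genes, of which `set_size` belong to the gene set."""
    upper = min(set_size, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(set_size, k) * comb(n_background - set_size, n_selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 8 SL genes in a set of 24 (first table row); the number of SL genes (500)
# and of background genes (10000) are hypothetical placeholders.
print(enrichment_pvalue(8, 24, 500, 10000))
```

The p-values from all tested gene sets would then be adjusted for multiple testing to give the FDR column.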
[Figure labels. Workflow diagram: Download Sample Files. Heatmap axes: Expression Profile; Sample; Gene; Significance]
A gene correlation heatmap of FDR Significant SL partners of CDH1 in CDH1 low samples
A gene expression heatmap of FDR Significant SL partners of CDH1 in CDH1 low samples
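The correlation heatmap is built from pairwise correlations between partner genes, computed only in CDH1 low samples. A minimal Python sketch of that calculation follows. It assumes Pearson correlation, which the poster does not specify, and the gene names, data, and function names are hypothetical. The authors' own analysis was carried out in R.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(expression, sample_mask):
    """Gene-by-gene correlation matrix restricted to the masked samples.

    `expression` maps gene name to per-sample values; `sample_mask` holds one
    boolean per sample (True for samples to keep, e.g. CDH1 low samples).
    Returns the sorted gene names and the matrix as a list of rows.
    """
    genes = sorted(expression)
    subset = {
        g: [v for v, keep in zip(expression[g], sample_mask) if keep]
        for g in genes
    }
    matrix = [[pearson(subset[a], subset[b]) for b in genes] for a in genes]
    return genes, matrix


# Hypothetical data: the last sample is not CDH1 low and is ignored.
expr = {"G1": [1, 2, 3, 100], "G2": [2, 4, 6, -5], "G3": [3, 2, 1, 0]}
cdh1_low = [True, True, True, False]
print(correlation_matrix(expr, cdh1_low))
```

Hierarchical clustering of the rows and columns of such a matrix is what groups the genes into correlated blocks in a heatmap.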
ACKNOWLEDGEMENTS
I thank my supervisors, Mik and Parry, for their incredible support throughout this
project. I also thank the University of Otago and the Postgraduate Tassell Scholarship in
Cancer Research for course support and funding. The New Zealand eScience Infrastructure
(NeSI) support team gave helpful advice on scaling up the computational
methodology. Thanks to James Boocock, Murray Cadzow, Augustine Chen, and the
Cancer Genetics Laboratory for assistance with R and biological relevance.
[Heatmap panel titles, each with a colour key and histogram]
Gene Expression for FDR significant partners of CDH1 in TCGA Breast RNASeq data
Gene Correlation for FDR significant partners of CDH1 in TCGA Breast RNASeq data