2. Colon Cancer Background: Incidence
• Deaths and incidence have decreased by 2-3% over the last 10 years, but the population burden remains high due to the disease's prevalence (2)
• ‡ Rates are age-adjusted to the 2000 U.S. standard population (19 age groups – Census P25–1130).
• *Age-adjusted based on 2009-2013 cases and deaths, men and women
• Sources
• 1. CDC 2013 Top Ten Cancers Datasheets: https://nccd.cdc.gov/uscs/toptencancers.aspx#Footnotes (inc. Incidence chart)
• 2. NIH SEER Stat Fact Sheets: http://seer.cancer.gov/statfacts/html/colorect.html (inc. Survival chart)
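The age adjustment named in the footnotes is direct standardization: each age-specific rate is weighted by that age group's share of a standard population. A minimal sketch, assuming three made-up age groups and hypothetical weights rather than the 19 groups of the 2000 U.S. standard population (the function name and all numbers are illustrative only):

```python
def age_adjusted_rate(cases, population, std_weights, per=100_000):
    """Directly standardized rate: age-specific rates weighted by a standard population."""
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs need one entry per age group")
    total_weight = sum(std_weights)
    rate = 0.0
    for c, n, w in zip(cases, population, std_weights):
        # age-specific rate times that group's share of the standard population
        rate += (c / n) * (w / total_weight)
    return rate * per


# Hypothetical toy inputs (NOT real CDC or SEER figures).
cases = [10, 50, 200]                       # cases per age group
population = [100_000, 100_000, 100_000]    # people at risk per age group
std_weights = [0.5, 0.3, 0.2]               # standard population shares

adjusted = age_adjusted_rate(cases, population, std_weights)
crude = sum(cases) / sum(population) * 100_000
```

In this toy case the adjusted rate comes out lower than the crude rate because the hypothetical standard population is younger than the observed one.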
3. Methylation Varies with Location
1. Anatomic Image via CDC Colorectal Cancer Information:
https://www.cdc.gov/cancer/colorectal/basic_info/what-is-colorectalcancer.htm
2. Kaz et al. (2014) “Patterns of DNA methylation in the normal colon vary by anatomical location, gender, and age.” Epigenetics
4. iEVORA method
• iEVORA is a sensitive and stringent method for detecting hypervariant
probes
• Used to identify significant differentially variable probes of potential
etiologic and clinical importance in breast cancer.
Sources:
• 1. Teschendorff et al. (2016) “Stochastic epigenetic outliers can define field defects in cancer”
• 2. Teschendorff et al. (2016) “DNA methylation outliers in normal breast tissue identify field defects that are enriched in cancer”
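A rough sketch of the two-step idea behind iEVORA as described in the sources above: a differential-variance test followed by ranking on a t-statistic. It assumes Bartlett's test for step one, a Welch t-statistic for step two, and a raw p-value cutoff in place of a multiple-testing correction. The function names, the cutoff, and the beta values are all illustrative, not the published implementation:

```python
import math
from statistics import mean, variance


def bartlett_two_groups(a, b):
    """Bartlett's test for equal variances in two groups; returns (statistic, p-value)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)
    if v1 == 0 or v2 == 0:
        raise ValueError("Bartlett's test is undefined for a zero-variance group")
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    num = df * math.log(pooled) - (n1 - 1) * math.log(v1) - (n2 - 1) * math.log(v2)
    corr = 1 + (1 / (n1 - 1) + 1 / (n2 - 1) - 1 / df) / 3
    stat = max(num / corr, 0.0)  # clamp tiny negative rounding error
    # With two groups the statistic is chi-square with 1 df,
    # whose upper tail probability is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def welch_t(a, b):
    """Welch t-statistic for the difference in means (b minus a)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def ievora_like(probes, alpha_var=0.001):
    """probes: {probe_id: (healthy_values, at_risk_values)}.
    Keep probes whose variance differs significantly, then rank by |t|."""
    hits = []
    for probe_id, (healthy, at_risk) in probes.items():
        _, p_var = bartlett_two_groups(healthy, at_risk)
        if p_var < alpha_var:
            hits.append((probe_id, p_var, welch_t(healthy, at_risk)))
    return sorted(hits, key=lambda h: abs(h[2]), reverse=True)


# Hypothetical beta values for two made-up probes.
toy = {
    "cg_hypervariable": ([0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10],
                         [0.10, 0.11, 0.45, 0.12, 0.60, 0.10, 0.11, 0.50]),
    "cg_flat": ([0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10],
                [0.11, 0.10, 0.12, 0.09, 0.10, 0.12, 0.11, 0.10]),
}
hits = ievora_like(toy)
```

Only the probe with outlier samples in the at-risk group passes the variance filter here; the flat probe has identical variance in both groups and is dropped.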
5. Experimental Goals Overview
• Compare normal tissue from healthy/cancer-free individuals
to normal tissue likely to contain a premalignant field
• Determine field-specific methylation signatures
• Derive highly sensitive and specific marker panels for
detection of fields in normal colon tissue
6. Premalignant Fields in Colon: Hypothesis and
Progression Model
• Definition: Normal tissue at heightened risk of progression to
cancer
• Under a clonal model, signatures of clonal cells may be
detectable in normal tissue
• Early clonal signatures may yield heightened methylation
variance in a field due to heterogeneous mixture of normal
and clonal cells
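The variance argument above can be illustrated with a toy mixture model. This is a sketch under strong assumptions: each bulk sample is a linear mix of normal and clonal cells, there is no measurement noise, and the clonal fractions and beta values are invented. Under that model, the variance across samples equals the squared methylation difference times the variance of the clonal fraction:

```python
from statistics import pvariance


def bulk_beta(clonal_fraction, beta_normal, beta_clonal):
    """Methylation of a bulk sample that mixes normal and clonal cells."""
    return (1 - clonal_fraction) * beta_normal + clonal_fraction * beta_clonal


# Hypothetical values for one CpG: low in normal cells, high in the clone.
BETA_NORMAL, BETA_CLONAL = 0.10, 0.90

healthy_fractions = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]     # no clonal cells
field_fractions = [0.0, 0.05, 0.30, 0.0, 0.50, 0.10]   # variable clonal content

healthy = [bulk_beta(f, BETA_NORMAL, BETA_CLONAL) for f in healthy_fractions]
field = [bulk_beta(f, BETA_NORMAL, BETA_CLONAL) for f in field_fractions]

var_healthy = pvariance(healthy)
var_field = pvariance(field)
# Closed form for the noise-free mixture model
expected = (BETA_CLONAL - BETA_NORMAL) ** 2 * pvariance(field_fractions)
```

Samples with uneven clonal content spread out in methylation even though most of them still look normal on average, which is the kind of signal a variance-based test can pick up.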
7. iEVORA Results in Colon
• 1. Discovery/Identification Set
• Normal-healthy (36) vs. Normal-high risk (14)
• 2. Optimization Set – TCGA
• Normal-matched (14) vs. tumor (107)
• 3. Marker Test Set
• Normal-healthy (17) vs. Normal-matched/high risk (91)
8. iEVORA Results: Discovery Set
– Normal-healthy (36) vs. Normal-high risk (14)
199 significant iEVORA probes (DVMCs) identified
9. TCGA Optimization Set: Batch Detection Workflow
MDS and PCA are commonly used in batch detection:
1. Multidimensional Scaling/Principal Coordinates Analysis:
Ordination to show relatedness of individuals
2. Principal Components Analysis: Orthogonal variable transformation
Components are numbered according to the amount of variability in the dataset
explained.
R and JMPG show similar results, where normal-matched and cancer
tissues cluster distinctly for the most part.
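To make the PCA step concrete, here is a small dependency-free sketch that finds the first principal component by power iteration and reports the share of variance it explains. The data are invented beta values for six samples and four probes. It is not the R or JMPG workflow used for the slides, and a real analysis would use an established PCA routine:

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (scores on PC1, fraction of total variance explained by PC1)."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix between probes (p x p)
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    total = sum(cov[a][a] for a in range(p))
    if total == 0.0:
        raise ValueError("data have no variance")
    # Power iteration: repeated multiplication converges to the top eigenvector
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return scores, eigenvalue / total


# Hypothetical beta values: rows are samples, columns are probes.
samples = [
    [0.20, 0.30, 0.50, 0.10],  # normal-matched
    [0.22, 0.28, 0.52, 0.12],  # normal-matched
    [0.18, 0.31, 0.49, 0.11],  # normal-matched
    [0.70, 0.80, 0.50, 0.15],  # tumor
    [0.68, 0.82, 0.51, 0.14],  # tumor
    [0.72, 0.79, 0.48, 0.16],  # tumor
]
scores, explained = first_principal_component(samples)
normal_scores, tumor_scores = scores[:3], scores[3:]
```

In this toy set the two tissue groups fall on opposite sides of PC1. A batch effect would show up the same way if samples separated by plate rather than by tissue type.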
11. TCGA Optimization Set: Evaluating Within vs.
Between-sample Variance
• Some datasets contain individuals represented
multiple times (e.g., TCGA includes tumor and normal
tissues from the same patients).
• Want to know whether patient/Participant ID
will bias our analysis
• Test whether more variation is explained by
primary predictor (Tissue type) or confounder
(ID)
• “Residual” variable explains variance not
described by characterized variables.
• When only Tissue Type, Plate, ID, and
Residuals are included, the Residual component is higher and
Tissue Type explains less variance than Participant ID
overall
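One simple way to ask which factor explains more variation is to compare the between-group sum of squares for each factor with the total sum of squares. The sketch below does this for a single made-up probe measured in normal and tumor tissue from three hypothetical participants. It is a simplification for illustration, not the variance-components analysis behind the slide:

```python
from collections import defaultdict


def fraction_explained(values, labels):
    """Between-group sum of squares divided by total sum of squares."""
    grand = sum(values) / len(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    groups = defaultdict(list)
    for v, lab in zip(values, labels):
        groups[lab].append(v)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups.values())
    return ss_between / ss_total


# Hypothetical beta values for one probe: each participant contributes
# one normal and one tumor sample.
values = [0.20, 0.26, 0.50, 0.54, 0.80, 0.85]
tissue = ["normal", "tumor", "normal", "tumor", "normal", "tumor"]
participant = ["P1", "P1", "P2", "P2", "P3", "P3"]

frac_tissue = fraction_explained(values, tissue)
frac_id = fraction_explained(values, participant)
frac_residual = 1 - frac_tissue - frac_id  # valid here because the design is balanced
```

For this toy probe, participant ID accounts for far more of the variation than tissue type, which is the pattern that would raise concern about ID acting as a confounder.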
13. Next Steps
• Compare iEVORA probes in R/L colon and colon/rectum
• Assess functional role; determine inter/intragenic and
inter/intra-island status
• Assess overlap with VELs, TF binding sites as available,
histone marks, etc.
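The overlap step could start from something as simple as the sketch below, which checks each probe position against a list of annotated intervals. The probe IDs, coordinates, and region names are invented, and intervals are assumed to be half-open (start included, end excluded); a real analysis would more likely use a dedicated genomic-interval tool:

```python
def annotate_overlaps(probes, regions):
    """probes: {probe_id: (chrom, position)}
    regions: list of (name, chrom, start, end), half-open [start, end).
    Returns {probe_id: [names of regions containing the probe]}."""
    hits = {}
    for probe_id, (chrom, pos) in probes.items():
        hits[probe_id] = [name for name, r_chrom, start, end in regions
                          if r_chrom == chrom and start <= pos < end]
    return hits


# Hypothetical probes and annotation intervals.
probes = {
    "cg_example_1": ("chr1", 1_500),
    "cg_example_2": ("chr1", 9_000),
    "cg_example_3": ("chr2", 1_500),
}
regions = [
    ("VEL_a", "chr1", 1_000, 2_000),
    ("TF_site_b", "chr1", 1_400, 1_600),
    ("island_c", "chr2", 5_000, 6_000),
]
overlaps = annotate_overlaps(probes, regions)
```

Matching on chromosome as well as position keeps a probe on one chromosome from being assigned to a region with the same coordinates on another.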