Introduction to
Epigenetic Epidemiology
Day 1
Integration of Epidemiology and
Epigenetics
Requires:
• Subject matter knowledge
• Pressing biologic questions
• Determinants of epigenetic variation
• Possible confounders and effect modifiers
• Appropriate Study Design and Analysis
• Focus today: study design
• Can increase efficiency with study design
• Can avoid or reduce inherent bias
Epigenome-Wide Association
Studies (EWAS)
• Burgeoning field, exponential increase in publications
• Vast majority study DNA methylation
• Relatively stable mark
• Methods to multiplex samples
• Established cohorts often do not have the correct samples for
other epigenetic modifications
• Focus of this workshop will be DNA methylation
• Will discuss other modifications today
• Primarily working with microarray data
• Limited time and so much to cover!
Publication of EWAS
Michels et al. (2013) Nature Methods
Publication of EWAS
Burris and Baccarelli (2014) J Appl Toxicol
Funding of EWAS
Percentage of Institute/Center Total Costs for FY 2012 Used for Epigenetic
Studies Overall, and by Mechanism
Burris and Baccarelli (2014) J Appl Toxicol
The National Institutes of Health (NIH) spent over $700 million (2.8% of
its total costs) on epigenetics in 2012
Variation across the Life-Course
Mill and Heijmans (2013) Nature Reviews Genetics
Introduction to
Epigenetics
Epigenetic marks and how we measure them
• The study of stable, heritable
changes in gene function that
are not due to changes in the
primary sequence of DNA
• DNA methylation, histone
modification, and non-coding
RNAs (such as microRNAs)
• Cells can exhibit different
phenotypes and possess the
same genotype
Epigenetics
Alcohol Research: Current Reviews, Volume 35, Issue Number 1
DNA Methylation
DNA methylation: enzymatic methylation of cytosine
bases
DNA Methylation
Lister et al. (2009) Nature
• DNA methylation typically
occurs at CpG sites in
differentiated cells
• Non-CpG methylation is
prevalent in embryonic stem
cells
• H1 human embryonic stem
cells and IMR90 fetal lung
fibroblasts
• 25% of methylation in stem
cells was in non-CpG
contexts (mCHG and mCHH,
where H = A, C or T)
Distribution of CpG Loci
CpGs are not randomly distributed across the genome
• Strongly enriched in repetitive element sequences
• e.g. LINE-1
• CpG frequency is ~5 times less than expected
• About 70% of all human gene promoters have CpG
sequences
• 5mC (5-methylcytosine) accounts for <1% of nucleotides
Distribution of CpG Loci
Saxonov et al. (2005)
PNAS
See CpG enrichment
closer to the
transcription start
site (TSS)
Mutagenic CpG Loci
• Most common base substitution in the human genome
is C>T (65% of all single nucleotide polymorphisms in
dbSNP)
• Occur most frequently in CpG contexts
• Spontaneous deamination of 5-methyl cytosine to thymine
CpG Islands
CpG islands: regions of high CpG content that often occur near
the transcription start site
• Almost all housekeeping genes are associated with at least one CpG
island
Definitions:
• Gardiner-Garden & Frommer (1987)
• Definition used in Genome Browser
• At least 200 bases long
• G+C content: > 50%
• observed CpG/expected CpG ratio: >= 0.6
• Takai & Jones (2002)
• Longer than 500 bp
• G+C content: > 55%
• observed CpG/expected CpG ratio: >= 0.65
• CpG islands are more likely to be associated with the 5’ regions of
genes and exclude most Alu’s with this definition
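The two definitions above can be checked mechanically for a candidate sequence. The following is a minimal sketch, not the Genome Browser implementation: the function names are hypothetical, the expected CpG count is taken as (number of C × number of G) / length, and the whole sequence is scored as one window rather than scanned with a sliding window.

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G occurred independently: (#C * #G) / length
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.50, min_obs_exp=0.60):
    """Defaults follow the Gardiner-Garden & Frommer thresholds listed above.

    Assumption: the length threshold is treated as inclusive in both
    definitions, which simplifies the "longer than 500 bp" wording.
    """
    n, gc, oe = cpg_stats(seq)
    return n >= min_len and gc > min_gc and oe >= min_obs_exp


toy = "CG" * 150  # artificial 300 bp sequence, for illustration only
print(cpg_stats(toy))      # (300, 1.0, 2.0)
print(is_cpg_island(toy))  # True under the Gardiner-Garden & Frommer thresholds
# Takai & Jones thresholds: fails on length alone
print(is_cpg_island(toy, min_len=500, min_gc=0.55, min_obs_exp=0.65))  # False
```

The stricter Takai & Jones thresholds are passed as arguments rather than hard-coded, so the same function can compare how many candidate regions each definition retains.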
Transcriptional Silencing by DNA
Methylation
http://missinglink.ucsf.edu/
CpG Island “Resorts”
• CpG island surrounded by shores and shelves
• Shores 2kb out from a CpG island
• Shelves 2kb out from a CpG shore
• Methylation at CpG island shores has been
suggested to be more tissue-specific
TSS1000 TSS100 5’ UTR 1st Exon Gene Body 3’ UTR
CpG Island Shore
Tissue-Specificity of Shores
Irizarry et al. (2009) Nature Genetics
Tissue-Specificity of Shores
Doi et al. (2009) Nature Genetics
• Differentially methylated regions between fibroblasts
and induced pluripotent stem cells are enriched for shores
Ways to Interrogate Methylation
Laird (2010) Nature Reviews Genetics
Ways to Interrogate Methylation
Rivera and Ren (2013) Cell
Chromatin
Role: packing the DNA into the cell
and controlling transcription and
replication
Basic unit: Nucleosome
• DNA wrapped around 8 histones
• Euchromatin:
• Partially decondensed
• Transcribed genes
• Heterochromatin:
• Hypercondensed in interphase
• Transcriptionally inert
• Formation of chromosomal
structures
Histone Code
• Histone Code Hypothesis: transcription of genetic
information encoded in DNA is in part regulated by
chemical modifications to histone proteins
• Post-translational modifications:
• Acetylation - Lys
• Methylation (mono-, di- and tri-) - Lys and Arg
• Phosphorylation - Ser and Thr
• Ubiquitination (mono- and poly-) - Lys
• Sumoylation (Lys); ADP-ribosylation; glycosylation;
biotinylation; carbonylation
Histone Code
• How modifications may impact
transcription
• Structural role for modifications
• Based on charge density of histone
tails
• Possible steric inhibition
• Example: Acetylation (generally
associated with transcriptionally
active genes)
• Modifications as recognition sites
• Marks read by other proteins to
control expression and these
correlations can be used to aid
identification of regulatory elements
Histone Code
• Ernst and Kellis (2010) Nature Biotechnology
• Discovering `chromatin states' in a systematic de novo way
across a complete genome based on a multivariate Hidden
Markov Model
• Distinguished six broad classes of chromatin states: promoter,
enhancer, insulator, transcribed, repressed and inactive states
Tissue-specificity of Histone Code
• Ernst et al. (2011) Nature - Mapping nine chromatin
marks across nine cell types
• Functional enrichment among sites associated with promoter
& enhancer states
Promoter states Enhancer states
• Promoter clusters
showed activity in
multiple cell types
• Enhancer clusters
are more cell type
specific
MicroRNAs
• MicroRNAs (miRNAs) are
noncoding RNAs that are
about 22 nucleotides in
length
• Originate from capped &
polyadenylated full length
precursors (pri-miRNA)
• Important in development
and post-transcriptional
gene regulation
• In animals, miRNAs can
regulate gene expression
post-transcriptionally by
imperfect complementarity
with a target mRNA, thereby
inhibiting protein synthesis
Nature Reviews Genetics 5, 522-531 (July 2004)
Why Very Few Histone
Modification or miRNA EWAS?
• Samples must be collected and stored correctly
• Stored in RNAlater, need high quality RNA
• Not treated with proteases (histones are proteins)
• May be sensitive to freeze thaw cycles or the storage
temperature
• Can be very expensive
• To estimate chromatin states, may need to sequence multiple
marks per sample
• Harder to integrate studies across labs
• Difficult to find validation cohort
• Trend towards meta-analyses
Cross Talk between Epigenetic
Pathways
Gene
Expression
DNA
methylation
Histone
Modifications
miRNA
Introduction to
Epidemiology
Study Design and Bias
Study Designs for EWAS
Adapted from: Michels (ed) Epigenetic Epidemiology
Types of Studies: Experimental
Studies/Randomized Control Trials
• Design: exposure is randomized
• Strengths:
• There is no confounding by design, so even an unadjusted analysis
should estimate the causal effect (potentially the intention-to-
treat effect)
• Weaknesses:
• Very expensive
• Time consuming
• Must consider equipoise
Types of Studies: Cohort Studies
• Design: enroll a group of people who do not have an
outcome yet, collect exposure information, follow
individuals over time
• Strengths:
• Efficient design when the exposure of interest is rare
• Information collected on multiple exposures and multiple outcomes
• If exposure ascertained before outcome, eliminate recall bias
• Weaknesses:
• Very time consuming
• Very expensive
• Difficult if outcome is rare
Types of Studies: Case-Control
Studies
• Design: enroll cases (individuals with incident or prevalent
disease) and controls (individuals who have never had the
disease who are at risk for disease) from the population
from which the cases arose
• Strengths:
• Likely faster and at lower cost if outcome has happened already
• Efficient design when the outcome is rare
• Weaknesses:
• Only one disease/outcome studied (can use data again but must
account for the selection)
• May not be feasible if the exposure is rare
• Selection of controls may be difficult
• Complete exposure information may be difficult to ascertain
retrospectively
Types of Studies: Twin Studies
Mill and Heijmans (2013) Nature Reviews Genetics
Monozygotic (MZ) twin studies
help to discern the impact of
genetic sequence on epigenetic
variation and disease risk
MZ twins share their DNA
sequence, parents, birth date
and sex, and experienced a
very similar prenatal
environment
Types of Studies: Family-Based
• Can use great-grandparent, grandparent, maternal
and paternal data to investigate possible
transgenerational inheritance
• If interest is intrauterine exposure, can use paternal
exposures as a negative control to disentangle
impact of home environment
• Associations may reflect shared familial confounding
factors or parental genotypes transmitted to the
offspring
• Impact of only maternal exposure would suggest
intrauterine effect
Transgenerational
Inheritance
In the case of an exposed
female mouse, if she is
pregnant, the fetus can be
affected in utero (F1), as can
the germline of the fetus
(the future F2)
• considered to be parental
effects, leading to
intergenerational
epigenetic inheritance
• Only F3 individuals can be
considered as true
transgenerational
inheritance
Does it exist in humans?
Heard and Martienssen (2014)
Cell
Example of Negative Controls
• The effect of maternal smoking during pregnancy on offspring
birthweight is considerably greater than that of paternal smoking
during pregnancy
• Adjustment for maternal smoking attenuates the paternal effect to zero
• In line with evidence that maternal smoking has a causal effect on
offspring birthweight
Smith (2012) Epidemiology
Types of Studies: Family-Based
Odds ratios in meta-analyses of association between maternal smoking during
pregnancy (vs no maternal smoking during pregnancy) and paternal smoking any time
(vs no paternal smoking any time) and overweight or obesity in childhood
Riedel et al. (2014) International Journal of Epidemiology
Defining Exposure
• Possible definitions of exposure
• Maximum intensity of exposure experienced
• Average intensity over a period of time
• Cumulative amount of exposure
• Other important variables that are not accounted
for by duration and intensity:
• Age that exposure started
• Age at cessation of exposure
• Timing of exposure relative to disease onset (lag or
induction period)
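The exposure definitions above can all be derived from one exposure history per participant. A minimal sketch, assuming a hypothetical record format of one (age, intensity) pair per year and a hypothetical helper name; real cohorts will need their own handling of gaps and units:

```python
def exposure_summary(history, onset_age=None):
    """Summarize a yearly exposure history.

    history: list of (age_in_years, intensity) pairs, one per year;
             intensity 0 means unexposed in that year.
    onset_age: optional age at disease onset.
    """
    exposed = [(age, level) for age, level in history if level > 0]
    if not exposed:
        return None
    ages = [age for age, _ in exposed]
    levels = [level for _, level in exposed]
    summary = {
        "max_intensity": max(levels),
        "mean_intensity": sum(levels) / len(levels),  # average over exposed years
        "cumulative": sum(levels),                    # intensity-years
        "age_started": min(ages),
        "age_stopped": max(ages),
    }
    if onset_age is not None:
        # Time from first exposure to onset, one simple way to express a lag
        summary["years_first_exposure_to_onset"] = onset_age - min(ages)
    return summary


# Invented example: packs of cigarettes per day by age
smoking = [(18, 0), (19, 0.5), (20, 1.0), (21, 1.0), (22, 0.5), (23, 0)]
print(exposure_summary(smoking, onset_age=40))
```

Two participants with the same cumulative exposure can differ in maximum intensity, starting age, and lag, which is why the slide lists them as separate variables.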
Validity of Results
• Internal validity – the extent to which the analysis
captures the true causal association
• Threats to internal validity
• Confounding
• Selection Bias
• Information Bias
• Can be addressed in both study design and analysis
• External validity – generalizability of the results
Which do we care about more?
Confounding
• Standard definition: a common cause of the
outcome and exposure
• Can result in the detection of an association between
exposure and outcome even if there is no direct effect
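A small numeric illustration of this point, using invented counts: within each level of the confounder the exposure-outcome odds ratio is exactly 1, yet the crude table shows a strong association. The Mantel-Haenszel summary odds ratio used here is a standard stratified estimator that is not named on the slide; it stands in for "adjusting for the confounder".

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed non-cases, c: unexposed cases, d: unexposed non-cases."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Invented counts. The confounder raises both the chance of exposure and the risk.
strata = [
    (40, 60, 8, 12),   # confounder present: OR = (40*12)/(60*8) = 1
    (2, 18, 10, 90),   # confounder absent:  OR = (2*90)/(18*10) = 1
]
crude = tuple(sum(cell) for cell in zip(*strata))  # collapse over the confounder

print(odds_ratio(*crude))          # about 3.05: a crude association
print(mantel_haenszel_or(strata))  # 1.0: no association within strata
```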
Confounding
• Surrogate confounders: variables correlated with a
confounder that have no direct association with
exposure or outcome
• Adjustment for these variables may reduce but not
completely remove confounding
• Useful when true confounder cannot be measured
Selection Bias
• Selection bias: biases that arise from conditioning
on a common effect of two variables, one of which
is either the exposure or a cause of the exposure,
and the other is the outcome or a cause of the
outcome
• Often we think about selection bias in case-control
studies, but can arise in cohort studies as well
Selection Bias in Case-Control
Studies
Occurs due to inappropriate selection of controls
• Cases in the cohort are more
likely to be selected than non-
cases
• Investigators selected controls
preferentially among women with
hip fracture and estrogen is
protective against hip fracture
Hernán MA, Robins JM (2016). Causal Inference
Selection Bias in Case-Control
Studies: Berkson’s Bias
Occurs due to inappropriate selection of controls
• Diseases 1 and 2 are not associated with each other,
but both affect the probability of
hospital admission
• Hospital-based controls: cases had
Disease 1 and controls had Disease 2
that is affected by the exposure A.
• Risk factor A for Disease 2 would
appear to also be a risk factor for
Disease 1 even if A does not cause
Disease 1
Hernán MA, Robins JM (2016). Causal Inference
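The mechanism can be made concrete by exact enumeration rather than simulation. This is a sketch under invented probabilities: exposure A raises the risk of Disease 2 only, Disease 1 is independent of A, and each disease independently raises the chance of hospital admission. All names and numbers are hypothetical.

```python
from itertools import product


def admit(d1, d2, h1=0.5, h2=0.5):
    """Probability of hospital admission given disease status (invented values)."""
    return 1 - (1 - h1 * d1) * (1 - h2 * d2)


def table_a_by_d1(p_a, p_d1, p_d2_given_a, hospital_only):
    """Probability mass for each (A, Disease 1) cell, summed over Disease 2."""
    cells = {(a, d1): 0.0 for a in (0, 1) for d1 in (0, 1)}
    for a, d1, d2 in product((0, 1), repeat=3):
        p = p_a if a else 1 - p_a
        p *= p_d1 if d1 else 1 - p_d1                      # Disease 1 ignores A
        p *= p_d2_given_a[a] if d2 else 1 - p_d2_given_a[a]
        if hospital_only:
            p *= admit(d1, d2)                             # condition on admission
        cells[(a, d1)] += p
    return cells


def cell_odds_ratio(cells):
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


params = dict(p_a=0.3, p_d1=0.1, p_d2_given_a={0: 0.1, 1: 0.4})
population = cell_odds_ratio(table_a_by_d1(hospital_only=False, **params))
hospital = cell_odds_ratio(table_a_by_d1(hospital_only=True, **params))
print(population)  # 1 up to rounding: A is unrelated to Disease 1
print(hospital)    # about 0.29: an association appears among admitted patients
```

The direction and size of the distortion depend entirely on the chosen probabilities; the only general point is that restricting to admitted patients moves the odds ratio away from its population value.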
Selection Bias in Cohort Studies
Loss to follow-up
• exposure has side effects that increase the
probability of dropping out and certain symptoms
of disease increase the probability of dropping out
Hernán MA, Robins JM
(2016). Causal Inference
Selection Bias in Cohort Studies
Healthy worker bias
• The unmeasured health status U is a determinant of
both death Y and of being at work C
• L may be the result of some blood test or physical exam
Hernán MA, Robins JM
(2016). Causal Inference
Selection Bias in Cohort Studies
Volunteer Bias
• Bias may be present if the study is restricted to those
who volunteered – may be related to lifestyle
• Cannot occur in a randomized study – exposure
randomization happens after they elect to participate
• Can impact generalizability
Hernán MA, Robins JM
(2016). Causal Inference
Information Bias
• Two important properties of measurement error
• Independence
• Non-differentiality
• Y* and A* are the measured outcome and exposure, Y
and A are the true values
Independent Dependent
Hernán MA, Robins JM
(2016). Causal Inference
Information Bias: Recall Bias
• Recall Bias: outcome affects the measurement of
the exposure
• Independent but differential measurement error
• If the outcome is birth defects Y and ask mother to
recall alcohol use during pregnancy A after delivery
• Recall may be affected by the outcome of the pregnancy
Hernán MA, Robins JM
(2016). Causal Inference
Information Bias: Detection Bias
• Detection Bias: exposure affects the measurement of
the outcome
• Independent but differential measurement error
• Smokers concerned about health impacts of smoking
may seek medical attention more than nonsmokers
• Leads to emphysema being diagnosed more frequently among
smokers than among nonsmokers
Hernán MA, Robins JM
(2016). Causal Inference
Increasing Efficiency by Matching
• Due to limited resources (money, number of case
samples) studies are limited in size
• Can increase our power to detect a change by
matching on confounders
• We must still adjust for confounders in the analysis; by
matching we are trying to ensure that we do not have
sparse strata
• The impact of matching on internal validity
depends on the type of study that is being
conducted
• Case-control vs cohort
Matching in Cohort Studies
• If oversampling exposure groups, can match on
confounders to increase efficiency and control for
bias
• Effect of matching in analysis
• Matching prevents confounding even in crude analysis
• Assuming no other confounding
• Don’t have to adjust for the matching factor
• To improve precision, matching should be accounted for
in the analysis
• Effect modification of matching factors can be evaluated
in follow-up studies
Matching in Case-Control Studies
• Matching in case-control studies helps to increase precision
but it does not remove confounding in the crude analysis
• Bias introduced by matching
• If the matching factor(s) are associated with the exposure of
interest, matching will cause the exposure distribution among the
control group to be more similar to the cases than the true
distribution of exposure in the study base
• Bias towards the null
• Must adjust for the matching factors in the analysis
• If you match in a case-control study:
• You cannot study the main effects of matching factors
• You can evaluate if the matching factors are effect modifiers
Inappropriate Matching in Case-
Control Studies
• Appropriate matching: matching on a confounder
• Inappropriate matching:
• Unnecessary matching: match on variable that is not
associated with the exposure
• Do not need to adjust for matching factors in analysis
• Over-matching: match on variable that is only
associated with the exposure
• Results are biased if you do not adjust for the matching factors
• Have altered the exposure distribution among the controls
• Matching on an intermediate
• Have removed any possible association between exposure and
outcome
• Impossible to rectify this error in the analysis
Mediation of effects of exposures
on disease outcomes
• Utility of Mediation analysis
• If blood pressure partially mediates the influence of BMI on CHD,
could therapeutically modifying blood pressure help break the link
between BMI and CHD?
BMI → Blood pressure → CHD
May be interested in looking at the impact of an exposure
on an outcome through methylation
Mediation of effects of exposures
on disease outcomes
• Identifying direct and indirect effects requires additional modeling
assumptions:
• $Y_{am} \perp\!\!\!\perp A \mid C$: Y is independent of A adjusting for C
[Diagram: causal graph with nodes A, M, Y and confounder C1]
Mediation of effects of exposures
on disease outcomes
• Identifying direct and indirect effects requires additional modeling
assumptions:
• $Y_{am} \perp\!\!\!\perp M \mid A, C$: Y is independent of M adjusting for C & A
[Diagram: causal graph with nodes A, M, Y and confounders C1, C2]
Mediation of effects of exposures
on disease outcomes
• Identifying direct and indirect effects requires additional modeling
assumptions:
• $M_{a} \perp\!\!\!\perp A \mid C$: M is independent of A adjusting for C
[Diagram: causal graph with nodes A, M, Y and confounders C1, C2, C3]
Mediation of effects of exposures
on disease outcomes
• Identifying direct and indirect effects requires additional modeling
assumptions:
• $Y_{am} \perp\!\!\!\perp M_{a^*} \mid C$: No effect of exposure that confounds the
mediator-outcome relationship
[Diagram: causal graph with nodes A, M, Y and confounders C1, C2, C3]
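For reference, the quantities these assumptions are meant to identify can be written out. The decomposition below is the standard counterfactual one and is not spelled out on the slides; $a$ and $a^*$ denote two exposure levels being compared, and $Y_{a M_{a^*}}$ is the outcome under exposure $a$ with the mediator set to the value it would take under $a^*$.

```latex
\begin{aligned}
\underbrace{E[Y_{a M_{a}}] - E[Y_{a^* M_{a^*}}]}_{\text{total effect}}
&= \underbrace{E[Y_{a M_{a^*}}] - E[Y_{a^* M_{a^*}}]}_{\text{natural direct effect}}
 + \underbrace{E[Y_{a M_{a}}] - E[Y_{a M_{a^*}}]}_{\text{natural indirect effect}}
\end{aligned}
```

The middle term $E[Y_{a M_{a^*}}]$ cancels when the two effects are added, which is why the direct and indirect effects sum to the total effect.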
An Example of When Mediation
Analysis Can Introduce Bias
• Birthweight Paradox: among low birth weight
(LBW) infants, infant mortality is lower among
infants born to smokers
Is maternal smoking
beneficial to low
birth weight infants?
Lines cross around
2kg
Hernández-Díaz et al. (2006) Am J
Epidemiology
An Example of When Mediation
Analysis Can Introduce Bias
• Possible explanation: there
is a common cause of LBW
and mortality that has a
greater impact on mortality
than smoking
• Therefore if infant is LBW and
mother is not a smoker, more
likely to have other condition
which is associated with
higher mortality rate
• Results in an apparent
decreased risk of mortality
among LBW infants from
smokers
Hernández-Díaz et al. (2006) Am J
Epidemiology
Study Design
Considerations Specific
to Epigenetic Studies
Batch Effects
Lazar C et al. Brief Bioinform 2012
Many potential sources:
Impact of Batch Effects
• One of the largest
determinants of variation
tends to be batch
• Even after data preprocessing
(normalizing the signal
intensities across the array),
can still see the impact of
batch at the gene-level
Batch Effect. Nature Reviews Genetics 2010
Batch Effects
• Can try to reduce the impact of batch effects in the:
• Design of the study (discuss today)
• In the analysis (discuss tomorrow)
• One way to assess the impact of batch is through the
judicious use of technical replicates
• Possible to completely confound the
association between epigenetic mark and exposure
with batch
• No way to fix this in the analysis
• Examples?
Addressing Batch Effects in Design
• The appropriate distribution of unique
biospecimens across batches depends on:
• The study design
• The question of interest
• Our ability to estimate the batch effects
• Need to think about potential sources of batch
effects
• Storage
• When samples were processed
• Chip or sequencing lane on which samples were processed
Addressing Batch Effects in Design
Study design | Question of interest | Samples to be assayed in same batch | Groups to be balanced within batches
Randomized trial | Within-person changes over time | Samples from same participant | Intervention groups
Crossover intervention trial | Within-person differences between interventions | Samples from same participant | Order of interventions
Cohort study | Comparison of exposed and non-exposed persons | NA | Exposure categories if categorical exposure
Case–cohort study | Comparison of diseased and disease-free persons | NA | Proportion of cases and subcohort members (a)
Matched case–control study | Comparison of cases and controls | Cases and their matched control(s) | NA (b)
Frequency-matched case–control study | Comparison of cases and controls | NA | Cases and controls (b); frequency matching characteristics if categorical
Case-series | Comparison of different case groups | NA | Case groups of interest
Cross-sectional study | Comparison of exposed and non-exposed | NA | Exposure categories if categorical exposure
Reviewed by: Tworoger and Hankinson (2006) Cancer Causes Control
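The "groups to be balanced within batches" column can be implemented as a stratified random allocation. A minimal sketch for the unmatched case: the function name, the sample identifiers, and the fixed seed are assumptions, and matched sets that must share a batch would need to be allocated as units instead.

```python
import random
from collections import defaultdict


def assign_batches(samples, n_batches, seed=0):
    """Randomly assign samples to batches while balancing groups across batches.

    samples: list of (sample_id, group) pairs, e.g. group = "case" or "control".
    Returns a dict mapping sample_id to a batch index in range(n_batches).
    """
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    assignment = {}
    offset = 0
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)  # random order within each group
        for i, sample_id in enumerate(ids):
            # Deal samples round-robin; the offset keeps total batch sizes even
            assignment[sample_id] = (i + offset) % n_batches
        offset += len(ids)
    return assignment


# Invented example: 12 cases and 12 controls over 4 chips
samples = [(f"case{i}", "case") for i in range(12)]
samples += [(f"ctrl{i}", "control") for i in range(12)]
layout = assign_batches(samples, n_batches=4)
```

With these inputs each batch receives three cases and three controls, so case status cannot be completely confounded with batch.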
Tissue Specificity
of Epigenetic
Change
Same genome
different
epigenome
2001 Terese Winslow, Caitlin Duckwall
Choice of Tissue
• Tissue of interest may not be readily accessible (e.g. brain)
• Can use reference epigenome project to inform choice of surrogate
tissue (discuss later today)
• Use available samples or establish a new study population?
• Using established study
• Nested case-control or cohort study
• Often only blood available – may not be tissue of interest
• Extensive covariate data available
• Long term outcomes
• Starting a new study
• Identify samples necessary, use correct storage
• Time consuming and more expensive
• Not possible to assess long term outcomes
Choice of Tissue
NIH
interested in
identifying
appropriate
surrogate
tissues
Bias Due to Cell Composition
When we have a
heterogeneous
tissue, the cell mixture in
that tissue may
impact the results
Possible impacts of
cell composition:
• Confounding
• Mediation
• Reverse causation
Houseman (2015) Current
Environmental Health Reports
Estimating Cellular Composition
Houseman (2015) Current
Environmental Health Reports
Can estimate cellular composition in heterogeneous tissue
using a reference data set:
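The idea behind reference-based estimation can be sketched as a constrained regression: the methylation profile of the mixed tissue is modeled as a non-negative weighted sum of cell-type-specific reference profiles. The code below is a simplified stand-in, not the published algorithm: it uses plain projected gradient descent for non-negative least squares, and the reference values, function name, and CpG selection are invented for illustration.

```python
def estimate_proportions(reference, mixture, n_iter=2000):
    """Estimate non-negative cell-type weights w minimizing ||reference @ w - mixture||^2.

    reference: list of rows, one per CpG, each row holding the mean methylation
               (0 to 1) of that CpG in every reference cell type.
    mixture:   observed methylation of the same CpGs in the mixed sample.
    """
    n_cpg = len(reference)
    n_types = len(reference[0])
    w = [1.0 / n_types] * n_types
    # Step size 1 / (sum of squared entries) is no larger than 1 / Lipschitz
    # constant of the gradient, so the iteration cannot diverge.
    step = 1.0 / sum(v * v for row in reference for v in row)
    for _ in range(n_iter):
        residual = [
            sum(r * wk for r, wk in zip(row, w)) - y
            for row, y in zip(reference, mixture)
        ]
        grad = [
            sum(reference[i][k] * residual[i] for i in range(n_cpg))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto w >= 0
        w = [max(0.0, wk - step * g) for wk, g in zip(w, grad)]
    return w


# Invented reference: 4 CpGs x 2 cell types, chosen to differ between types
reference = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
# Synthetic mixed sample built as 70% of type 1 and 30% of type 2
mixture = [0.7 * a + 0.3 * b for a, b in reference]
print(estimate_proportions(reference, mixture))  # close to [0.7, 0.3]
```

In practice the reference CpGs are chosen because they discriminate between cell types; with poorly separated reference profiles the problem becomes ill-conditioned and the estimates unstable.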
Steps toward
a Successful
EWAS
Recommendations
for the design and
analysis of
epigenome-wide
association studies.
Michels et al (2013)
Nature Methods
Verification
• Definition: replicate the findings in the same
cohort using a different technology
• Original approach may not be capturing change in
percent methylation
• Verify that the findings were not technical error
• Some previously unrecognized batch effects – e.g. find out that
cases and controls were processed by different technicians
• Might be some inherent bias associated with the platform
• If results cannot be verified with a different technology
(at least as precise as the original technology) it suggests
the original results were a false positive
Validation
• Definition: replication of results in an independent
cohort
• Ideally using a different technology, to ensure the finding is not
the result of some inherent bias of the platform
• Similar to the purpose of validation for other types of
epidemiologic studies
• Identify potentially important effect modification
• Possible important residual unmeasured confounding
Steps to Identifying True Positives
• Steps and possible issues:
Next Steps: Paper Discussion

More Related Content

What's hot

Unit Presentation Dec 08 Short
Unit Presentation Dec 08 ShortUnit Presentation Dec 08 Short
Unit Presentation Dec 08 ShortGilesOElliott
 
DNA Methylation Data Analysis
DNA Methylation Data AnalysisDNA Methylation Data Analysis
DNA Methylation Data AnalysisYi-Feng Chang
 
Personalized Medicine and the Omics Revolution by Professor Mike Snyder
Personalized Medicine and the Omics Revolution by Professor Mike SnyderPersonalized Medicine and the Omics Revolution by Professor Mike Snyder
Personalized Medicine and the Omics Revolution by Professor Mike SnyderThe Hive
 
Protein-protein interaction
Protein-protein interactionProtein-protein interaction
Protein-protein interactionsigma-tau
 
Sexual selection and genetic colour polymorphisms in animals
Sexual selection and genetic colour polymorphisms in animalsSexual selection and genetic colour polymorphisms in animals
Sexual selection and genetic colour polymorphisms in animalsUniversity Of Wuerzburg,Germany
 
Epigenetics importance in livestock breeding and production
Epigenetics importance in livestock breeding and productionEpigenetics importance in livestock breeding and production
Epigenetics importance in livestock breeding and productionDr Satheesha G M
 
Structural genomics
Structural genomicsStructural genomics
Structural genomicsAshfaq Ahmad
 
Data analysis
Data analysisData analysis
Data analysisamlbinder
 
Poster NIH summer 2013
Poster NIH summer 2013Poster NIH summer 2013
Poster NIH summer 2013Nestor Orozco
 
7 nucleic acids syllabus statements
7 nucleic acids syllabus statements7 nucleic acids syllabus statements
7 nucleic acids syllabus statementscartlidge
 
Nucleic acid based therapeutic drug delivery system
Nucleic acid based therapeutic  drug delivery systemNucleic acid based therapeutic  drug delivery system
Nucleic acid based therapeutic drug delivery systemtadisriteja9
 
Molecular basis of evolution and softwares used in phylogenetic tree contruction
Molecular basis of evolution and softwares used in phylogenetic tree contructionMolecular basis of evolution and softwares used in phylogenetic tree contruction
Molecular basis of evolution and softwares used in phylogenetic tree contructionUdayBhanushali111
 
Gene therapy ex vivo method
Gene therapy  ex vivo methodGene therapy  ex vivo method
Gene therapy ex vivo methodakash mahadev
 
Plant system biology
Plant system biologyPlant system biology
Plant system biologySubaParanie
 
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...Prasenjit Mitra
 

What's hot (20)

Unit Presentation Dec 08 Short
Unit Presentation Dec 08 ShortUnit Presentation Dec 08 Short
Unit Presentation Dec 08 Short
 
Cancer genome
Cancer genomeCancer genome
Cancer genome
 
DNA Methylation Data Analysis
DNA Methylation Data AnalysisDNA Methylation Data Analysis
DNA Methylation Data Analysis
 
Personalized Medicine and the Omics Revolution by Professor Mike Snyder
Personalized Medicine and the Omics Revolution by Professor Mike SnyderPersonalized Medicine and the Omics Revolution by Professor Mike Snyder
Personalized Medicine and the Omics Revolution by Professor Mike Snyder
 
Protein-protein interaction
Protein-protein interactionProtein-protein interaction
Protein-protein interaction
 
Sexual selection and genetic colour polymorphisms in animals
Sexual selection and genetic colour polymorphisms in animalsSexual selection and genetic colour polymorphisms in animals
Sexual selection and genetic colour polymorphisms in animals
 
DNA Methylation
DNA MethylationDNA Methylation
DNA Methylation
 
Epigenetics importance in livestock breeding and production
Epigenetics importance in livestock breeding and productionEpigenetics importance in livestock breeding and production
Epigenetics importance in livestock breeding and production
 
Pharmacogenomics
PharmacogenomicsPharmacogenomics
Pharmacogenomics
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
Data analysis
Data analysisData analysis
Data analysis
 
Genomics,proteomics and comparative genomics
Genomics,proteomics and comparative genomicsGenomics,proteomics and comparative genomics
Genomics,proteomics and comparative genomics
 
Poster NIH summer 2013
Poster NIH summer 2013Poster NIH summer 2013
Poster NIH summer 2013
 
7 nucleic acids syllabus statements
7 nucleic acids syllabus statements7 nucleic acids syllabus statements
7 nucleic acids syllabus statements
 
Nucleic acid based therapeutic drug delivery system
Nucleic acid based therapeutic  drug delivery systemNucleic acid based therapeutic  drug delivery system
Nucleic acid based therapeutic drug delivery system
 
Molecular basis of evolution and softwares used in phylogenetic tree contruction
Molecular basis of evolution and softwares used in phylogenetic tree contructionMolecular basis of evolution and softwares used in phylogenetic tree contruction
Molecular basis of evolution and softwares used in phylogenetic tree contruction
 
epigenetics of kidney disease
epigenetics of kidney diseaseepigenetics of kidney disease
epigenetics of kidney disease
 
Gene therapy ex vivo method
Gene therapy  ex vivo methodGene therapy  ex vivo method
Gene therapy ex vivo method
 
Plant system biology
Plant system biologyPlant system biology
Plant system biology
 
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...
Genomics, Transcriptomics, Proteomics, Metabolomics - Basic concepts for clin...
 

Similar to Introduction to epigenetics and study design

Concept of genomics, proteomics and metabolomics
Concept of genomics, proteomics and metabolomicsConcept of genomics, proteomics and metabolomics
Concept of genomics, proteomics and metabolomicsMuragendraswami Astagimath
 
GENETICS IN ORTHODONTICS WORD.docx
GENETICS IN ORTHODONTICS WORD.docxGENETICS IN ORTHODONTICS WORD.docx
GENETICS IN ORTHODONTICS WORD.docxsidharth779721
 
DNA methylation: from array to sequencing
DNA methylation: from array to sequencingDNA methylation: from array to sequencing
DNA methylation: from array to sequencingjyotirmoy211
 
IGNTU-eContent-328712472244-M.Sc-EnvironmentalScience-2-ManojkumarRai-Environ...
IGNTU-eContent-328712472244-M.Sc-EnvironmentalScience-2-ManojkumarRai-Environ...IGNTU-eContent-328712472244-M.Sc-EnvironmentalScience-2-ManojkumarRai-Environ...
IGNTU-eContent-328712472244-M.Sc-EnvironmentalScience-2-ManojkumarRai-Environ...sumitraDas14
 
20170209 ngs for_cancer_genomics_101
Introduction to epigenetics and study design

  • 2. Integration of Epidemiology and Epigenetics Requires: • Subject matter knowledge • Pressing biologic questions • Determinants of epigenetic variation • Possible confounders and effect modifiers • Appropriate Study Design and Analysis • Focus today: study design • Can increase efficiency with study design • Decrease bias, avoid or reduce inherent bias
  • 3. Epigenome-Wide Association Studies (EWAS) • Burgeoning field, exponential increase in publications • Vast majority study DNA methylation • Relatively stable mark • Methods to multiplex samples • Established cohorts often do not have the correct samples for other epigenetic modifications • Focus of this workshop will be DNA methylation • Will discuss other modifications today • Primarily working with microarray data • Limited time and so much to cover!
  • 4. Publication of EWAS Michels et al. (2013) Nature Methods
  • 5. Publication of EWAS Burris and Baccarelli (2014) J Appl Toxicol
  • 6. Funding of EWAS Percentage of Institute/Center Total Costs for FY 2012 Used for Epigenetic Studies Overall, and by Mechanism Burris and Baccarelli (2014) J Appl Toxicol National Institutes of Health (NIH) spent over $700 million (2.8% of their total costs) on epigenetics in 2012
  • 7. Variation across the Life-Course Mill and Heijmans (2013) Nature Reviews Genetics
  • 9. • The study of stable, heritable changes in gene function that are not due to changes in the primary sequence of DNA • DNA methylation, histone modification, and non-coding RNAs (such as microRNAs) • Cells can exhibit different phenotypes and possess the same genotype Epigenetics
  • 10. Alcohol Research: Current Reviews, Volume 35, Issue Number 1 DNA Methylation DNA methylation: enzymatic methylation of cytosine bases
  • 11. DNA Methylation Lister et al. (2009) Nature • DNA methylation typically occurs at CpG sites in differentiated cells • Non-CpG methylation is prevalent in embryonic stem cells • H1 human embryonic stem cells and IMR90 fetal lung fibroblasts • 25% of methylation in stem cells was in non-CpG contexts (mCHG and mCHH, where H = A, C or T)
  • 12. Distribution of CpG Loci CpGs are not randomly distributed across the genome • Strongly enriched in repetitive element sequences • e.g. LINE-1 • CpG frequency is ~5 times lower than expected • About 70% of all human gene promoters have CpG sequences • 5mC accounts for <1% of nucleotides
  • 13. Distribution of CpG Loci Saxonov et al. (2005) PNAS See CpG enrichment closer to the transcription start site (TSS)
  • 14. Mutagenic CpG Loci • Most common base substitution in the human genome is C>T (65% of all single nucleotide polymorphisms in dbSNP) • Occur most frequently in CpG contexts • Spontaneous deamination of 5-methyl cytosine to thymine
  • 15. CpG Islands CpG islands: regions of high CpG content that often occur near the transcription start site • Almost all housekeeping genes are associated with at least one CpG island Definitions: • Gardiner-Garden & Frommer (1987) • Definition used in Genome Browser • At least 200 bases long • G+C content: > 50% • observed CpG/expected CpG ratio: >= 0.6 • Takai & Jones (2002) • Longer than 500 bp • G+C content: > 55% • observed CpG/expected CpG ratio: >= 0.65 • With this definition, CpG islands are more likely to be associated with the 5’ regions of genes and most Alu elements are excluded
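The Gardiner-Garden & Frommer thresholds listed on this slide can be checked on a single sequence window. The following is a minimal sketch, not the Genome Browser's implementation: the function names `cpg_stats` and `is_cpg_island` are hypothetical, and the code evaluates one fixed window rather than scanning a genome. The expected CpG count is taken as (number of C × number of G) / window length.

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides cannot overlap, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (#C * #G) / length
    expected = (c * g) / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return n, gc_fraction, ratio


def is_cpg_island(seq, min_len=200, min_gc=0.50, min_ratio=0.60):
    """Apply length, G+C content, and observed/expected CpG thresholds to one window."""
    n, gc_fraction, ratio = cpg_stats(seq)
    return n >= min_len and gc_fraction > min_gc and ratio >= min_ratio


if __name__ == "__main__":
    window = "CG" * 100  # toy 200-base window, purely illustrative
    print(cpg_stats(window))
    print(is_cpg_island(window))
```

To apply the stricter Takai & Jones thresholds from the same slide, the defaults can be overridden, for example `is_cpg_island(window, min_len=500, min_gc=0.55, min_ratio=0.65)`.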
  • 16. Transcriptional Silencing by DNA Methylation http://missinglink.ucsf.edu/
  • 17. CpG Island “Resorts” • CpG island surrounded by shores and shelves • Shores extend 2 kb out from a CpG island • Shelves extend 2 kb out from a CpG shore • Methylation at CpG island shores has been suggested to be more tissue-specific (Figure labels: TSS1000, TSS100, 5’ UTR, 1st Exon, Gene Body, 3’ UTR, CpG Island, Shore)
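The shore and shelf distances on this slide translate into a simple positional rule. The sketch below is illustrative only: `classify_cpg` is a hypothetical helper, it considers a single island given as inclusive start and end coordinates, and the label "open sea" for positions beyond the shelves is a commonly used term that does not appear on the slide.

```python
def classify_cpg(position, island_start, island_end, flank=2000):
    """Label a genomic position relative to one CpG island [island_start, island_end]."""
    if island_start <= position <= island_end:
        return "island"
    # Distance from the nearest island boundary
    if position < island_start:
        distance = island_start - position
    else:
        distance = position - island_end
    if distance <= flank:
        return "shore"      # within 2 kb of the island
    if distance <= 2 * flank:
        return "shelf"      # within 2 kb of the shore
    return "open sea"       # assumed label for everything farther away


if __name__ == "__main__":
    for pos in (1500, 500, 4500, 7000):
        print(pos, classify_cpg(pos, island_start=1000, island_end=2000))
```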
  • 18. Tissue-Specificity of Shores Irizarry et al. (2009) Nature Genetics
  • 19. Tissue-Specificity of Shores Doi et al. (2009) Nature Genetics • Differentially methylated regions between fibroblasts and induced pluripotent stem cells are enriched for shores
  • 20. Ways to Interrogate Methylation Laird (2010) Nature Reviews Genetics
  • 21. Ways to Interrogate Methylation Rivera and Ren (2013) Cell
  • 22. Chromatin Role: packing the DNA into the cell and controlling transcription and replication Basic unit: Nucleosome • DNA wrapped around 8 histones • Euchromatin: • Partially decondensed • Transcribed genes • Heterochromatin: • Hypercondensed in interphase • Transcriptionally inert • Formation of chromosomal structures
  • 23. Histone Code • Histone Code Hypothesis: transcription of genetic information encoded in DNA is in part regulated by chemical modifications to histone proteins • Post-translational modifications: • Acetylation - Lys • Methylation (mono-, di- and tri-) - Lys and Arg • Phosphorylation - Ser and Thr • Ubiquitination (mono- and poly-) - Lys • Sumoylation - Lys • ADP-ribosylation; glycosylation; biotinylation; carbonylation
  • 24. Histone Code • How modifications may impact transcription • Structural role for modifications • Based on charge density of histone tails • Possible steric inhibition • Example: Acetylation (generally associated with transcriptionally active genes) • Modifications as recognition sites • Marks are read by other proteins to control expression; these correlations can be used to aid identification of regulatory elements
  • 25. Histone Code • Ernst and Kellis (2010) Nature Biotechnology • Discovering 'chromatin states' in a systematic de novo way across a complete genome based on a multivariate Hidden Markov Model • Distinguished six broad classes of chromatin states: promoter, enhancer, insulator, transcribed, repressed and inactive states
  • 26. Tissue-specificity of Histone Code • Ernst et al. (2011) Nature - Mapping nine chromatin marks across nine cell types • Functional enrichment among sites associated with promoter & enhancer states (Figure panels: Promoter states, Enhancer states) • Promoter clusters showed activity in multiple cell types • Enhancer clusters are more cell type specific
  • 27. MicroRNAs • MicroRNAs (miRNAs) are noncoding RNAs that are about 22 nucleotides in length • Originate from capped & polyadenylated full length precursors (pri-miRNA) • Important in development and post-transcriptional gene regulation • In animals, miRNAs can regulate gene expression post-transcriptionally by imperfect complementarity with a target mRNA, thereby inhibiting protein synthesis Nature Reviews Genetics 5, 522-531 (July 2004)
  • 28. Why Very Few Histone Modification or miRNA EWAS? • Samples must be collected and stored correctly • Stored in RNAlater, need high quality RNA • Not treated with proteases (histones are proteins) • May be sensitive to freeze thaw cycles or the storage temperature • Can be very expensive • To estimate chromatin states, may need to sequence multiple marks per sample • Harder to integrate studies across labs • Difficult to find validation cohort • Trend towards meta-analyses
  • 29. Cross Talk between Epigenetic Pathways (Diagram nodes: Gene Expression, DNA methylation, Histone Modifications, miRNA)
  • 31. Study Designs for EWAS Adapted from: Michels (ed) Epigenetic Epidemiology
  • 32. Types of Studies: Experimental Studies/Randomized Controlled Trials • Design: exposure is randomized • Strengths: • There is no confounding by design, adjusted analysis should estimate causal effect (potentially the intention-to-treat effect) • Weaknesses: • Very expensive • Time consuming • Must consider equipoise
  • 33. Types of Studies: Cohort Studies • Design: enroll a group of people who do not have an outcome yet, collect exposure information, follow individuals over time • Strengths: • Efficient design when the exposure of interest is rare • Information collected on multiple exposures and multiple outcomes • If exposure ascertained before outcome, eliminate recall bias • Weaknesses: • Very time consuming • Very expensive • Difficult if outcome is rare
  • 34. Types of Studies: Case-Control Studies • Design: enroll cases (individuals with incident or prevalent disease) and controls (individuals who have never had the disease who are at risk for disease) from the population from which the cases arose • Strengths: • Likely faster and at lower cost if outcome has happened already • Efficient design when the outcome is rare • Weaknesses: • Only one disease/outcome studied (can use data again but must account for the selection) • May not be efficient if the exposure is rare • Selection of controls may be difficult • Complete exposure information may be difficult to ascertain retrospectively
  • 35. Types of Studies: Twin Studies Mill and Heijmans (2013) Nature Reviews Genetics Monozygotic (MZ) twin studies help to discern the impact of genetic sequence on epigenetic variation and disease risk MZ twins share their DNA sequence, parents, birth date and sex, and experienced a very similar prenatal environment
  • 36. Types of Studies: Family-Based • Can use great-grandparent, grandparent, maternal and paternal data to investigate possible transgenerational inheritance • If the interest is an intrauterine exposure, can use paternal exposures as a negative control to disentangle impact of home environment • Associations may reflect shared familial confounding factors or parental genotypes transmitted to the offspring • Impact of only maternal exposure would suggest intrauterine effect
  • 37. Transgenerational Inheritance In the case of an exposed female mouse, if she is pregnant, the fetus can be affected in utero (F1), as can the germline of the fetus (the future F2) • These are considered to be parental effects, leading to intergenerational epigenetic inheritance • Only effects in F3 individuals can be considered true transgenerational inheritance Does it exist in humans? Heard and Martienssen (2014) Cell
  • 38. Example of Negative Controls • The effect of maternal smoking during pregnancy on offspring birthweight is considerably greater than that of paternal smoking during pregnancy • Adjustment for maternal smoking attenuates the paternal effect to zero • In line with evidence that maternal smoking has a causal effect on offspring birthweight Smith (2012) Epidemiology
  • 39. Type of Studies: Family-Based Odds ratios in meta-analyses of association between maternal smoking during pregnancy (vs no maternal smoking during pregnancy) and paternal smoking any time (vs no paternal smoking any time) and overweight or obesity in childhood Riedel et al. (2014) International Journal of Epi
  • 40. Defining Exposure • Possible definitions of exposure • Maximum intensity of exposure experienced • Average intensity over a period of time • Cumulative amount of exposure • Other important variables that are not accounted for by duration and intensity: • Age that exposure started • Age at cessation of exposure • Timing of exposure relative to disease onset (lag or induction period)
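The three exposure definitions on this slide (maximum, average, cumulative) can be computed from the same exposure history. This is a minimal sketch under stated assumptions: `exposure_summaries` is a hypothetical helper, the history is a list of intensities over consecutive, equal-length intervals, and the example values (packs per day in each of four years) are invented for illustration.

```python
def exposure_summaries(intensities, interval_years=1.0):
    """Summarize an exposure history given as intensity per consecutive interval."""
    if not intensities:
        raise ValueError("exposure history is empty")
    return {
        # Maximum intensity of exposure experienced
        "maximum": max(intensities),
        # Average intensity over the whole period
        "average": sum(intensities) / len(intensities),
        # Cumulative exposure: intensity multiplied by duration, summed over intervals
        "cumulative": sum(x * interval_years for x in intensities),
    }


if __name__ == "__main__":
    packs_per_day_by_year = [0.5, 1.0, 1.0, 2.0]  # hypothetical smoking history
    print(exposure_summaries(packs_per_day_by_year))
```

Two histories can share the same cumulative exposure while differing in maximum intensity or timing, which is why the slide lists age at start, age at cessation, and lag as separate variables.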
  • 41. Validity of Results • Internal validity – the extent to which the analysis captures the true causal association • Threats to internal validity • Confounding • Selection Bias • Information Bias • Can be addressed in both study design and analysis • External validity – generalizability of the results Which do we care about more?
  • 42. Confounding • Standard definition: a common cause of the outcome and exposure • Can result in the detection of an association between exposure and outcome even if there is no direct effect
  • 43. Confounding • Surrogate confounders: variables correlated with a confounder that have no direct association with exposure or outcome • Adjustment for these variables may reduce but not completely remove confounding • Useful when true confounder cannot be measured
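The definition of a confounder as a common cause of exposure and outcome can be illustrated with an exact calculation. In the sketch below all probabilities are invented for illustration and `risk_ratios` is a hypothetical helper: a binary confounder C raises both the probability of exposure A and the risk of outcome Y, while A itself has no effect on Y. The crude risk ratio departs from 1 even though the risk ratio within each stratum of C is exactly 1.

```python
def risk_ratios(p_c=0.5, p_a_given_c=(0.2, 0.8), p_y_given_c=(0.1, 0.3)):
    """Exact crude and stratum-specific risk ratios when A has no effect on Y.

    p_a_given_c[c] is P(A=1 | C=c); p_y_given_c[c] is P(Y=1 | C=c), which by
    construction does not depend on A.
    """
    # Joint probabilities P(C=c, A=a)
    joint = {}
    for c in (0, 1):
        pc = p_c if c == 1 else 1 - p_c
        for a in (0, 1):
            pa = p_a_given_c[c] if a == 1 else 1 - p_a_given_c[c]
            joint[(c, a)] = pc * pa

    def crude_risk(a):
        # P(Y=1 | A=a), averaging the outcome risk over C within exposure group a
        total = sum(joint[(c, a)] for c in (0, 1))
        return sum(joint[(c, a)] * p_y_given_c[c] for c in (0, 1)) / total

    crude_rr = crude_risk(1) / crude_risk(0)
    # Within a stratum of C the risk is the same for exposed and unexposed
    stratum_rr = {c: p_y_given_c[c] / p_y_given_c[c] for c in (0, 1)}
    return crude_rr, stratum_rr


if __name__ == "__main__":
    crude, by_stratum = risk_ratios()
    print(f"crude RR = {crude:.3f}, stratum-specific RRs = {by_stratum}")
```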
  • 44. Selection Bias • Selection bias: biases that arise from conditioning on a common effect of two variables, one of which is either the exposure or a cause of the exposure, and the other is the outcome or a cause of the outcome • Often we think about selection bias in case-control studies, but can arise in cohort studies as well
  • 45. Selection Bias in Case-Control Studies Occurs due to inappropriate selection of controls • Cases in the cohort are more likely to be selected than non-cases • Example: investigators selected controls preferentially among women with hip fracture, and estrogen is protective against hip fracture Hernán MA, Robins JM (2016). Causal Inference
  • 46. Selection Bias in Case-Control Studies: Berkson’s Bias Occurs due to inappropriate selection of controls • Diseases 1 and 2 are unassociated, but both affect the probability of hospital admission • Hospital-based controls: cases had Disease 1 and controls had Disease 2, which is affected by the exposure A • Risk factor A for Disease 2 would appear to also be a risk factor for Disease 1 even if A does not cause Disease 1 Hernán MA, Robins JM (2016). Causal Inference
  • 47. Selection Bias in Cohort Studies Loss to follow-up • exposure has side effects that increase the probability of dropping out and certain symptoms of disease increase the probability of dropping out Hernán MA, Robins JM (2016). Causal Inference
  • 48. Selection Bias in Cohort Studies Healthy worker bias • The unmeasured health status U is a determinant of both death Y and of being at work C • L may be the result of some blood test or physical exam Hernán MA, Robins JM (2016). Causal Inference
  • 49. Selection Bias in Cohort Studies Volunteer Bias • Bias may be present if the study is restricted to those who volunteered – may be related to lifestyle • Cannot occur in a randomized study – exposure randomization happens after they elect to participate • Can impact generalizability Hernán MA, Robins JM (2016). Causal Inference
  • 50. Information Bias • Two important properties of measurement error • Independence • Non-differentiality • Y* and A* are the measured outcome and exposure, Y and A are the true values Independent Dependent Hernán MA, Robins JM (2016). Causal Inference
  • 51. Information Bias: Recall Bias • Recall Bias: outcome affects the measurement of the exposure • Independent but differential measurement error • If the outcome is birth defects Y and ask mother to recall alcohol use during pregnancy A after delivery • Recall may be affected by the outcome of the pregnancy Hernán MA, Robins JM (2016). Causal Inference
  • 52. Information Bias: Detection Bias • Detection Bias: exposure affects the measurement of the outcome • Independent but differential measurement error • Smokers concerned about health impacts of smoking may seek medical attention more than nonsmokers • Leads to emphysema being diagnosed more frequently among smokers than among nonsmokers Hernán MA, Robins JM (2016). Causal Inference
  • 53. Increasing Efficiency by Matching • Due to limited resources (money, number of case samples) studies are limited in size • Can increase our power to detect a change by matching on confounders • We must adjust for confounders in the analysis, by matching we are trying to ensure that we do not have certain sparse strata • The impact of matching on internal validity depends on the type of study that is being conducted • Case-control vs cohort
  • 54. Matching in Cohort Studies • If oversampling exposure groups, can match on confounders to increase efficiency and control for bias • Effect of matching in analysis • Matching prevents confounding even in crude analysis • Assuming no other confounding • Don’t have to adjust for the matching factor • To improve precision, matching should be accounted for in the analysis • Effect modification of matching factors can be evaluated in follow-up studies
  • 55. Matching in Case-Control Studies • Matching in case-control studies helps to increase precision but it does not remove confounding in the crude analysis • Bias introduced by matching • If the matching factor(s) are associated with the exposure of interest, matching will cause the exposure distribution among the control group to be more similar to the cases than the true distribution of exposure in the study base • Bias towards the null • Must adjust for the matching factors in the analysis • If you match in a case-control study: • You cannot study the main effects of matching factors • You can evaluate if the matching factors are effect modifiers
  • 56. Inappropriate Matching in Case-Control Studies • Appropriate matching: matching on a confounder • Inappropriate matching: • Unnecessary matching: match on variable that is not associated with the exposure • Do not need to adjust for matching factors in analysis • Over-matching: match on variable that is only associated with the exposure • Results are biased if you do not adjust for the matching factors • Have altered the exposure distribution among the controls • Matching on an intermediate • Have removed any possible association between exposure and outcome • Impossible to rectify this error in the analysis
  • 57. Mediation of effects of exposures on disease outcomes • Utility of Mediation analysis • If blood pressure partially mediates the influence of BMI on CHD, could therapeutically modifying blood pressure help break the link between BMI and CHD? (Diagram nodes: BMI, Blood pressure, CHD) May be interested in looking at the impact of an exposure on an outcome through methylation
  • 58. Mediation of effects of exposures on disease outcomes • Identifying direct and indirect effects requires additional modeling assumptions: • $Y_{am} \perp\!\!\!\perp A \mid C$: Y is independent of A adjusting for C (Diagram nodes: A, M, Y, C1)
  • 59. Mediation of effects of exposures on disease outcomes • Identifying direct and indirect effects requires additional modeling assumptions: • $Y_{am} \perp\!\!\!\perp M \mid A, C$: Y is independent of M adjusting for C & A (Diagram nodes: A, M, Y, C1, C2)
  • 60. Mediation of effects of exposures on disease outcomes • Identifying direct and indirect effects requires additional modeling assumptions: • $M_{a} \perp\!\!\!\perp A \mid C$: M is independent of A adjusting for C (Diagram nodes: A, M, Y, C1, C2, C3)
  • 61. Mediation of effects of exposures on disease outcomes • Identifying direct and indirect effects requires additional modeling assumptions: • $Y_{am} \perp\!\!\!\perp M_{a^*} \mid C$: No mediator-outcome confounder that is itself affected by the exposure (Diagram nodes: A, M, Y, C1, C2, C3)
  • 62. An Example of When Mediation Analysis Can Introduce Bias • Birthweight Paradox: among low birth weight (LBW) infants, infant mortality is lower among infants born to smokers Is maternal smoking beneficial to low birth weight infants? Lines cross around 2kg Hernández-Díaz et al. (2006) Am J Epidemiology
  • 63. An Example of When Mediation Analysis Can Introduce Bias • Possible explanation, there is a common cause of LBW and mortality that has a greater impact on mortality than smoking • Therefore if infant is LBW and mother is not a smoker, more likely to have other condition which is associated with higher mortality rate • Results in an apparent decreased risk of mortality among LBW infants from smokers Hernández-Díaz et al. (2006) Am J Epidemiology
  • 65. Batch Effects • Many potential sources (Lazar C et al. Brief Bioinform 2012)
  • 66. Impact of Batch Effects • One of the largest determinants of variation tends to be batch • Even after data preprocessing (normalizing the signal intensities across the array), can still see the impact of batch at the gene level (Figure: batch effect. Nature Reviews Genetics 2010)
  • 67. Batch Effects • Can try to reduce the impact of batch effects in the: • Design of the study (discuss today) • Analysis (discuss tomorrow) • One way to assess the impact of batch is through the judicious use of technical replicates • It is possible to completely confound the association between epigenetic mark and exposure with batch • No way to fix this in the analysis • Examples?
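One way to spot complete confounding before running any assay is to cross-tabulate batch against exposure. The sketch below is a hypothetical helper, not part of any package. It flags the case where every batch contains a single exposure level, so the exposure effect cannot be separated from batch.

```python
from collections import defaultdict

def batch_exposure_table(batches, exposures):
    """Count samples per (batch, exposure level)."""
    table = defaultdict(lambda: defaultdict(int))
    for batch, exposure in zip(batches, exposures):
        table[batch][exposure] += 1
    return {batch: dict(counts) for batch, counts in table.items()}

def completely_confounded(batches, exposures):
    """True if exposure varies overall but never varies within a batch."""
    table = batch_exposure_table(batches, exposures)
    varies_overall = len(set(exposures)) > 1
    return varies_overall and all(len(counts) == 1 for counts in table.values())

# Hypothetical layout: all cases on plate 1, all controls on plate 2.
bad = completely_confounded(["p1", "p1", "p2", "p2"],
                            ["case", "case", "control", "control"])
# Hypothetical layout: each plate holds one case and one control.
ok = completely_confounded(["p1", "p1", "p2", "p2"],
                           ["case", "control", "case", "control"])
print(bad, ok)
```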
  • 68. Addressing Batch Effects in Design • The appropriate distribution of unique biospecimens across batches depends on: • The study design • The question of interest • Our ability to estimate the batch effects • Need to think about potential sources of batch effects • Storage • When samples were processed • Chip processed on or sequencing lane
  • 69. Addressing Batch Effects in Design
    Study design | Question of interest | Samples to be assayed in same batch | Groups to be balanced within batches
    Randomized trial | Within-person changes over time | Samples from same participant | Intervention groups
    Crossover intervention trial | Within-person differences between interventions | Samples from same participant | Order of interventions
    Cohort study | Comparison of exposed and non-exposed persons | NA | Exposure categories if categorical exposure
    Case–cohort study | Comparison of diseased and disease-free persons | NA | Proportion of cases and subcohort members [a]
    Matched case–control study | Comparison of cases and controls | Cases and their matched control(s) | NA [b]
    Frequency-matched case–control study | Comparison of cases and controls | NA | Cases and controls [b]; frequency matching characteristics if categorical
    Case-series | Comparison of different case groups | NA | Case groups of interest
    Cross-sectional study | Comparison of exposed and non-exposed | NA | Exposure categories if categorical exposure
    Reviewed by: Tworoger and Hankinson (2006) Cancer Causes Control
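For designs where groups should be balanced within batches (for example exposure categories in a cohort study), one simple approach is to shuffle samples within each group and then deal them across batches in turn. The sketch below is illustrative only; the function name, sample IDs and group labels are hypothetical. Designs that keep samples together, such as matched case-control sets, would need the set rather than the sample as the unit of assignment.

```python
import random
from collections import Counter, defaultdict

def assign_batches(samples, n_batches, seed=0):
    """Assign samples to batches, balancing groups within batches.

    samples: list of (sample_id, group) pairs.
    Returns a dict mapping sample_id to a batch index in 0..n_batches-1.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    assignment = {}
    position = 0  # running counter so batch sizes stay even across groups
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)  # random order within the group
        for sample_id in ids:
            assignment[sample_id] = position % n_batches
            position += 1
    return assignment

# Hypothetical cohort: 12 exposed and 12 unexposed samples, 3 batches.
samples = [(f"E{i}", "exposed") for i in range(12)] + \
          [(f"U{i}", "unexposed") for i in range(12)]
assignment = assign_batches(samples, n_batches=3)
group_of = dict(samples)
counts = Counter((batch, group_of[sid]) for sid, batch in assignment.items())
print(sorted(counts.items()))
```

With group sizes that divide evenly, each batch receives the same number of samples from each group. Otherwise the counts differ by at most one.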
  • 70. Tissue Specificity of Epigenetic Change Same genome different epigenome 2001 Terese Winslow, Caitlin Duckwall
  • 71. Choice of Tissue • Tissue of interest may not be readily accessible (e.g. brain) • Can use reference epigenome project to inform choice of surrogate tissue (discuss later today) • Use available samples or establish a new study population? • Using established study • Nested case-control or cohort study • Often only blood available – may not be tissue of interest • Extensive covariate data available • Long term outcomes • Starting a new study • Identify samples necessary, use correct storage • Time consuming and more expensive • Not possible to assess long term outcomes
  • 72. Choice of Tissue NIH interested in identifying appropriate surrogate tissues
  • 73. Bias Due to Cell Composition • When we have a heterogeneous tissue, the cell mixture in that tissue may impact the results • Possible impacts of cell composition: • Confounding • Mediation • Reverse causation Houseman (2015) Current Environmental Health Reports
  • 74. Estimating Cellular Composition Houseman (2015) Current Environmental Health Reports Can estimate cellular composition in heterogeneous tissue using a reference data set:
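The idea behind reference-based estimation can be sketched as a constrained least-squares problem: methylation in the mixed sample is modeled as a weighted sum of cell-type-specific reference profiles, with non-negative weights. The code below is a simplified stand-in using projected gradient descent. It is not Houseman's implementation, and the reference values and function name are hypothetical.

```python
def estimate_cell_fractions(reference, mixture, n_iter=500):
    """Estimate non-negative cell-type weights by projected gradient descent.

    reference: list of rows, one per CpG, each with one beta value per cell type.
    mixture:   list of beta values for the same CpGs in the mixed sample.
    """
    n_cpg, n_types = len(reference), len(reference[0])
    # Step size from the squared Frobenius norm, an upper bound on the
    # curvature of the least-squares objective.
    step = 1.0 / sum(v * v for row in reference for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        resid = [sum(r * wk for r, wk in zip(row, w)) - y
                 for row, y in zip(reference, mixture)]
        grad = [sum(reference[i][k] * resid[i] for i in range(n_cpg))
                for k in range(n_types)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wk - step * g) for wk, g in zip(w, grad)]
    return w

# Hypothetical reference profiles: 6 CpGs by 3 cell types.
reference = [
    [0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.2], [0.2, 0.8, 0.2], [0.2, 0.2, 0.8],
]
true_w = [0.6, 0.3, 0.1]
mixture = [sum(r * w for r, w in zip(row, true_w)) for row in reference]
estimated = estimate_cell_fractions(reference, mixture)
print([round(w, 3) for w in estimated])
```

This noiseless toy example recovers the weights used to build the mixture. Real data are noisy, and the choice of informative CpGs for the reference matters.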
  • 75. Steps toward a Successful EWAS Recommendations for the design and analysis of epigenome-wide association studies. Michels et al (2013) Nature Methods
  • 76. Verification • Definition: replicate the findings in the same cohort using a different technology • Original approach may not be capturing change in percent methylation • Verify that the findings were not technical error • Some previously unrecognized batch effects – e.g. find out that cases and controls were processed by different technicians • Might be some inherent bias associated with the platform • If results cannot be verified with a different technology (at least as precise as the original technology) it suggests the original results were a false positive
  • 77. Validation • Definition: replication of results in an independent cohort • Ideally using a different technology, ensure not the result of some inherent bias of the platform • Similar to the purpose of validation for other types of epidemiologic studies • Identify potentially important effect modification • Possible important residual unmeasured confounding
  • 78. Steps to Identifying True Positives • Steps and possible issues:
  • 79. Next Steps: Paper Discussion