1. 5-hemi-hydroxymethylcytosine in E-Box motifs
ACAT|GTG and ACAC|GTG increases DNA-
binding of the B-HLH transcription factor TCF4
but not USF1
Shriyash Upadhyay
3. Background: Cytosine Methylation
Cytosine Methylation primarily occurs in CG Islands on both strands of DNA
Methyl groups are enzymatically added to DNA and oxidized in order to regulate gene
expression
Recent research suggests that 5-methylcytosine (5mC), 5-
hydroxymethylcytosine (5hmC), and further oxidized states, regularly occur
on single strands of DNA, outside of CG dinucleotides
Particularly common in mammalian stem cells and the brain
4. Background: TCF4 and USF1
The B-HLH family of transcription factors is primarily involved in
housekeeping
USF1 regulates many aspects of cell growth, including glucose and lipid metabolism
A minority of B-HLHs are involved in cell differentiation
TCF4 forms transcriptional networks that regulate cellular differentiation of many cell types
Aberrant expression of and/or mutations in TCF4 can cause abnormal brain development,
leading to neurodevelopmental disorders such as Pitt-Hopkins syndrome and schizophrenia,
and are also linked to Fuchs’ corneal endothelial dystrophy
5. Hypothesis
Hypothesis: Single-stranded cytosine modifications may modulate B-HLH TF
binding
To test this, I examined whether single-stranded 5mC and 5hmC change the DNA
binding of TCF4 and USF1
Perhaps these modifications are what distinguish regulatory from developmental
B-HLHs
6. Experimental Design
Independent Variable: Type of cytosine on complementary strand
Levels: Cytosine [positive control], 5mC [negative control], 5hmC
Dependent Variable: Fluorescence intensity of fluorescently marked proteins
Indicative of the level of binding
Fluorescence intensity of all possible 8-mers was measured using a Protein Binding
Microarray (PBM)
Controlled Variables: Temperature, salinity, pH, exposure times, and concentrations of proteins
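To make "all possible 8-mers" concrete, the sketch below enumerates them and sorts each into the three groups used in the binding plots (E-box, cytosine without an E-box, no cytosine). It assumes the canonical E-box consensus CANNTG and treats the two strands as distinct, since only one strand is modified; the names `classify_8mer` and `all_kmers` are illustrative and not from the original analysis.

```python
import re
from itertools import product

# Canonical E-box consensus CANNTG (an assumption; the slides do not spell it out).
EBOX = re.compile(r"CA[ACGT]{2}TG")


def all_kmers(k=8):
    """Return every DNA k-mer over the alphabet A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def classify_8mer(kmer):
    """Assign an 8-mer to one of the three groups shown in the binding plots."""
    if EBOX.search(kmer):
        return "ebox"          # contains CANNTG
    if "C" in kmer:
        return "cytosine"      # has a cytosine but no E-box
    return "no_cytosine"       # cannot carry a 5mC or 5hmC on this strand


kmers = all_kmers()
counts = {}
for kmer in kmers:
    label = classify_8mer(kmer)
    counts[label] = counts.get(label, 0) + 1
```

The 8-mers without a cytosine are a useful internal reference: their sequence is identical in the cytosine, 5mC, and 5hmC arrays, so their binding should not change with the modification.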
8. Double Stranding Reaction
Prior methods of double stranding necessarily modify both strands
The enzymatic modifications occur only after double stranding
Instead, the DNA was double stranded with a modified dNTP mixture
All cytosines on the complementary strand were modified
Bases were fluorescently labeled to quantify the success of the double
stranding reaction
10. Diagram of Procedures (cont.)
Add GST-tagged protein to PBM
Quantify bound protein with fluorescently labeled anti-GST antibody
11. Structural analysis
Molecular models were developed with the CHARMM software package
The X-ray crystal structure of USF1 bound to DNA is PDB: 1AN4
The structure of TCF4 is not available, so the similar TCF3 was analyzed instead
Alternate structures were obtained by energy minimization and molecular
dynamics
Only the sidechains of the binding glutamic acid and arginine residues and the DNA
modification groups were allowed to move
13. Fluorescent intensities
Double-stranding efficiency of the
PBMs with cytosine, 5mC, and
5hmC. Fluorescence intensities,
from lowest to highest values, of the
spiked Cy3-dCTP across 40k
features of the HK array (blue),
divided by the number of cytosines
in the 35-mer variable sequence
(red) for double stranding with
Cytosine, 5mC and 5hmC.
The double stranding was
successful
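The normalization described in this caption (Cy3 signal divided by the number of cytosines in the variable sequence) can be sketched as follows. This is a minimal illustration, not the original analysis code: the function names are hypothetical, and which strand's cytosines are counted is an assumption left to the caller through the sequence passed in.

```python
import math


def per_cytosine_intensity(intensity, sequence):
    """Divide a feature's Cy3 intensity by the cytosine count of its sequence.

    Returns NaN when the sequence has no cytosine, since the ratio is undefined.
    """
    n_c = sequence.upper().count("C")
    if n_c == 0:
        return math.nan
    return intensity / n_c


def sorted_profiles(features):
    """Return (raw, normalized) intensities, each sorted from lowest to highest.

    `features` is a list of (intensity, sequence) pairs, one per array feature.
    Features without a cytosine are dropped from the normalized profile.
    """
    raw = sorted(intensity for intensity, _ in features)
    normalized = sorted(
        value
        for value in (per_cytosine_intensity(i, s) for i, s in features)
        if not math.isnan(value)
    )
    return raw, normalized
```

If incorporation is uniform, the normalized curve should be much flatter than the raw one, because most of the feature-to-feature variation comes from how many labeled cytosines each sequence can take up.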
14. TCF4 Binding
TCF4-GST B-HLH domain binding to DNA 8-mers containing
cytosine, 5mC or 5hmC on one strand. DNA 8-mers containing
E-boxes are labeled as red spots, 8-mers with a cytosine are
black, and 8-mers without a cytosine are grey. (A) TCF4-GST
binding to 8-mers containing cytosine (X-axis) or 5mC on one
DNA strand (Y-axis). The Z-score values for cytosine and 5hmC
are written in [x-axis: y-axis] format. (B) TCF4-GST binding to 8-
mers containing cytosine (X-axis) or 5hmC (Y-axis). (C) TCF4-
GST binding to 8-mers containing 5mC (X-axis) or 5hmC (Y-
axis). 8-mers shown are from modified strand.
TCF4 was well bound to cytosine and 5hmC, but poorly bound
to 5mC
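The scatter plots compare per-8-mer Z-scores between two array conditions. The slides do not say how the Z-scores were computed, so the sketch below assumes plain standardization (subtract the mean, divide by the population standard deviation) of per-8-mer intensities; `z_scores`, `preferentially_bound`, and the `min_gain` threshold are hypothetical names chosen for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def z_scores(intensities: Dict[str, float]) -> Dict[str, float]:
    """Standardize per-8-mer intensities within one array condition."""
    values = list(intensities.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all intensities are identical; Z-scores are undefined")
    return {kmer: (value - mu) / sigma for kmer, value in intensities.items()}


def preferentially_bound(z_x: Dict[str, float],
                         z_y: Dict[str, float],
                         min_gain: float = 1.0) -> List[str]:
    """List 8-mers whose Z-score on the Y-axis condition exceeds the X-axis
    condition by at least `min_gain` (points well above the diagonal)."""
    shared = z_x.keys() & z_y.keys()
    return sorted(k for k in shared if z_y[k] - z_x[k] >= min_gain)
```

With cytosine on the X-axis and 5hmC on the Y-axis, 8-mers returned by `preferentially_bound` would be those that sit above the diagonal in panel (B).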
15. USF1 Binding
USF1-GST B-HLH domain binding to DNA 8-mers containing
cytosine, 5mC or 5hmC on one strand. (A) USF1-GST binding
to 8-mers containing cytosine (X-axis) or 5mC (Y-axis). (B)
USF1-GST binding to 8-mers containing cytosine (X-axis) or
5hmC (Y-axis). (C) USF1-GST binding to 8-mers containing
5mC (X-axis) or 5hmC (Y-axis). DNA 8-mers containing E-
boxes are labeled as red spots, 8-mers with a cytosine are
black, and 8-mers without a cytosine are grey.
USF1 was well bound to cytosine, but poorly bound to 5mC
and 5hmC
16. Molecular Modeling of TCF4
Structural modeling of TCF3 with cytosine, 5mC and 5hmC.
Crystal structure of TCF3 (E47) homodimer bound to E-box
DNA. One protein monomer is represented as a grey surface,
and the other monomer as a blue surface. Highlighted amino
acid sidechains are shown as van der Waals spheres, and
DNA is shown as sticks. Atom color code: protein carbon –
grey, DNA carbon – magenta, oxygen – red, nitrogen – blue,
phosphorus – yellow, hydrogen – white. (A) Invariant
glutamic acid interacting with the CA dinucleotide. Focus on
the interface of the protein with E-box DNA bases. (B) Steric
clash of 5mC modification with E345 and R348. The added
methyl carbon is shown as a transparent VDW sphere. (C)
Alternate structure with 5hmC modification.
17. Molecular Modeling of USF1
Structural modeling of USF1 with 5mC. Crystal structure of
USF homodimer bound to E-box DNA (Ferre-D’Amare et al.,
1994). Representations are similar to TCF4 model. (A)
Shown is the interface of the protein with the E-box DNA
bases, illustrating 5mC modification overlap with R211. (B)
Alternate view showing steric conflict of 5mC modification
and deoxyribose of previous nucleotide. The 5mC methyl
carbon and sugar-phosphate backbone are shown as VDW
spheres.
This highlights the structural differences between TCF4
and USF1
19. Conclusion
Hypothesis confirmed
5mC uniformly decreased DNA binding of both TCF4 and USF1. The bulkier
5hmC also inhibited USF1 binding but enhanced TCF4 binding to E-boxes
containing ACAT|GTG and ACAC|GTG, which were bound better than any 8-mer
containing cytosine.
Hydroxymethylation increased the binding of TCF4, which is unusual, as
most modifications decrease binding
20. Conclusion (cont.)
USF1 has four bulky arginines following the glutamic acid (ERRRR), while
TCF4 has only two (ERLRV), suggesting that the TCF4 structure may be
more amenable to conformational changes when it preferentially binds
5hmC. This conformational flexibility is seen in the two forms of the
TCF3–DNA complex in the X-ray structure
21. Conclusion (cont.)
I developed a new protein binding microarray method, in which single-
stranded oligonucleotide arrays were double-stranded with either 5-
methylcytosine or 5-hydroxymethylcytosine
Creates an asymmetric distribution of cytosine modifications, mimicking what occurs in
mammalian stem cells and brain tissues
All combinations of modifications on each strand of DNA can now be tested, increasing
the number of testable modifications approximately 2.6-fold
Genome scale high throughput projects can be carried out with this method
to further study pathways of biological importance
22. Future Research
Thermodynamic cycling experiments to determine the effects of these amino
acids on the stability of the DNA-protein complex
Extending the same experiment to other B-HLH proteins
Designing alleles of the TCF4 gene that do not bind to cytosine but do bind
to 5hmC
In vivo testing of the biological significance of particular modifications using
the above-mentioned alleles