Subtypes of Associated Protein-DNA (Transcription Factor-Transcription Factor Binding Site) Patterns

http://www.cse.cuhk.edu.hk/~tmchan/subtypes/

  1. Subtypes of Associated Protein-DNA (TF-TFBS) Patterns. Prepared by: Cyrus Tak-Ming Chan (tmchan@cse.cuhk.edu.hk). Reference: Tak-Ming Chan, Kwong-Sak Leung, Kin-Hong Lee, Man-Hon Wong, Chi-Kong Lau, Stephen Kwok-Wing Tsui, "Subtypes of Associated Protein-DNA (Transcription Factor-Transcription Factor Binding Site) Patterns", Nucleic Acids Research, 2012, doi: 10.1093/nar/gks749. 17/Sep/2012, Version 1.2 (typos corrected on P12).
  2. 2. Introduction Proteins bind to DNA fragments to regulate genes  i.e. Transcription Factors (TFs) bind to Transcription Factor Binding Sites (TFBSs) Finding the binding cores (several residues only) is fundamental and important 2
  3. 3. Motivations Finding patterns/motifs one-sided is challenging and difficult  e.g. TFBS Motif Discovery: Noises, variations through mutations, unknown locations—weak signals to be recovered ? —Prediction —True TFBSTak-Ming Chan et al,IEEE Transactions on Evolutionary Computation, 2012 / 3BMC Bioinformatics, 2009, 10: 321 / Bioinformatics, 2007, 24(3)
  4. Introduction: Finding associated patterns on both sides is shown to be promising when many diverse binding sequences are available (e.g. TRANSFAC). Associated TF-TFBS patterns are found from sequences by associated pattern discovery. TF side: 7664 sequences in TRANSFAC, 408 AAs on average. TFBS side: 26786 bound TFBSs, 1225 matrices in TRANSFAC, 25 bp on average. Example associated patterns: …NRIAA… with …TGACA…; …NRAAA… with …TGACA…; …NREAA… with …TGTGA…. Reference: Tak-Ming Chan et al., Discovering approximate-associated sequence patterns for protein-DNA interactions. Bioinformatics, 2011, 27(4).
  5. Introduction: Finding associated patterns on both sides is shown to be promising when many diverse binding sequences are available (e.g. TRANSFAC). Associated TF-TFBS patterns found from sequences are verified on 3D structures to be binding cores: 40222 binding pairs from 1290 PDB protein-DNA complexes (binding cores <3.5Å). Example associated patterns: …NRIAA… with …TGACA…; …NRAAA… with …TGACA…; …NREAA… with …TGTGA…. Reference: Tak-Ming Chan et al., Discovering approximate-associated sequence patterns for protein-DNA interactions. Bioinformatics, 2011, 27(4).
  6. Introduction (Motivations): We can go further with these promising associated TF-TFBS patterns by discovering and analyzing the binding variances (subtypes). Subtypes may: lead to changed binding preferences; distinguish conserved from flexible binding residues; reveal novel binding mechanisms. Example associated patterns: …NRIAA… with …TGACA…; …NRAAA… with …TGACA…; …NREAA… with …TGTGA….
  7. Methods & Materials
  8. Methods & Materials: Both the L2 distance and the p-value of the chi-squared test are used to shortlist subtypes (3rd: G-C; 4th: G/C-G).
  9. 9. Results Sample results from  http://www.cse.cuhk.edu.hk/~tmchan/subtypes/ 9
  10. 10. Results Subtypes with evidence of changed binding preferences >70% of subtypes (& pairs) reflect changed binding preferences according to PDB structure evidence. 10
  11. 11. Results Subtype clusters show more conserved (invariant) residues are important for protein- DNA interactions; variant residues show specific properties 11
  12. 12. Results Case study shows subtypes that are potentially critical for regulation through dimerization and thus TF-TFBS binding PKVEIL-CAGCTG PKVVIL-CACGTG myogenic regulatory factor (MRF) Myc family (Oncogene): PDB 1NKP family: PDB 1MDY PKVEIL appears in TFs of MRF4, PKVVIL appears in TFs of c/L/v-Myc Myf-5, Myf-6, MyoD… in TRANSFAC in TRANSFAC • The subtypes are discovered without family information while reflecting strong familial specificity • Literatures on wet-labs support that if V is mutated to AA (MycV394D) similar to E, the dimerization of Myc-Max will be abolished (Miz1 binding deficient) 12
  13. 13. Discussion Further applications  Applications on TFBS (motif) matching by adding TF associated subtype information  Extension of the method on high-throughput sequencing data (e.g. ChIP-Seq, Protein Binding Microarrays)  Integration of other information to enhance the TF-TFBS prediction  Incorporation of 3D homology modeling to better model protein- DNA interactions  Analysis of regulatory mechanisms with other data, e.g. allele- specific mRNA data, to reveal more detailed regulatory mechanisms 13
