A functional and evolutionary perspective on transcription factor binding in Arabidopsis thaliana

Klaas Vandepoele
Comparative & Integrative Genomics group
Department of Plant Biotechnology and Bioinformatics, Ghent University
Department of Plant Systems Biology, VIB - Belgium

Published in: Science

  1. A functional and evolutionary perspective on transcription factor binding in Arabidopsis thaliana
     Potsdam, October 2014
     Comparative & Integrative Genomics group
     Department of Plant Biotechnology and Bioinformatics, Ghent University
     Department of Plant Systems Biology, VIB - Belgium
     plaza_genomics
  2. OVERVIEW
     1. Transcriptional gene regulation in plants
     2. Inference of transcriptional networks using an ensemble framework for phylogenetic footprinting in plants
     3. An integrated gene regulatory network using experimental ChIP data of 27 transcription factors in Arabidopsis
     4. Conclusions
     Jan Van de Velde, Ken Heyndrickx
  3. PART 1. TRANSCRIPTIONAL GENE REGULATION
     Arabidopsis:
     • 1,700-2,500 transcription factors
     • 180-791 miRNAs
     • 2,708 expressed lncRNAs
     • 49 Mb non-coding DNA
     • 11,000 regulatory interactions (AtRegNet)
     Mejia-Guerra et al., 2012
  4. EXPERIMENTAL CHARACTERIZATION OF REGULATORY INTERACTIONS
     EMSA, Y1H, SELEX, PBM, ChIP-Seq
     Mejia-Guerra et al., 2012
  5. COMPUTATIONAL ANALYSIS OF CIS-REGULATORY ELEMENTS
     • Mapping of known TF binding sites on promoter sequences
     • False positives
     • Low-quality motifs (PWMs); many motifs lack information about the binding factor
     • Motif redundancy and multi-gene transcription factor families

     Database     # CRE   Species
     PLACE        469     Vascular plants
     AGRIS        99      Arabidopsis thaliana
     AtProbe      172     Arabidopsis thaliana
     PlantCARE    435     Monocots and dicots
  6. PART 2. PHYLOGENETIC FOOTPRINTING: DETECTION OF CONSERVED NON-CODING SEQUENCES (CNS)
     • Comparative analysis of non-coding DNA sequences of orthologous genes to identify candidate regulatory elements
     • Regulatory elements are conserved during evolution due to functional constraint (vs. neutral carry-over)
     • The power of phylogenetic footprinting increases significantly when data from several related, sufficiently diverged species are available
  7. DEVELOPING AN ENSEMBLE FRAMEWORK FOR PHYLOGENETIC FOOTPRINTING IN PLANTS
     • Application of motif mapping and different pairwise alignment tools
     • Aggregate alignments into a multi-species footprint using 11 comparator dicot genomes
     • Evaluate statistical significance, incl. FDR analysis
     Figure labels: AtProbe; Feature map @ RSAT; 144 regulatory elements (63 genes); 774 DNA motifs
  8. FROM PAIRWISE ALIGNMENTS TO MULTI-SPECIES FOOTPRINTS
     • Generate all pairwise alignments between the Arabidopsis query gene and its orthologs
     • Map all pairwise alignments back to the reference promoter
     • Count, per position, the number of species that support a footprint
     • Significance estimation
     Van Bel, M., Proost, S., Wischnitzki, E., Movahedi, S., Scheerlinck, C., Van de Peer, Y., Vandepoele, K. (2012) Dissecting plant genomes with the PLAZA comparative genomics platform. Plant Physiology 158:590-600
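The mapping and counting steps on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(species, start, end)` interval format (0-based, half-open, in reference-promoter coordinates), the species names, and the `min_species` cutoff are all assumptions made for illustration, and the significance estimation step is not covered.

```python
def species_support(promoter_length, alignments):
    """Count, per reference position, how many distinct species align there.

    alignments: iterable of (species, start, end) tuples, already mapped back
    to reference promoter coordinates (hypothetical format, 0-based, half-open).
    """
    per_position = [set() for _ in range(promoter_length)]
    for species, start, end in alignments:
        # Clip each aligned block to the promoter boundaries.
        for pos in range(max(0, start), min(promoter_length, end)):
            per_position[pos].add(species)  # a set, so each species counts once
    return [len(species_set) for species_set in per_position]


def footprints(support, min_species):
    """Return maximal runs [start, end) where support >= min_species."""
    regions = []
    run_start = None
    for pos, count in enumerate(support):
        if count >= min_species:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(support)))
    return regions


# Toy example with made-up alignments on a 10 bp promoter.
toy_alignments = [
    ("poplar", 2, 6),
    ("grape", 4, 8),
    ("poplar", 5, 7),   # second block from the same species
    ("papaya", 4, 6),
]
support = species_support(10, toy_alignments)
candidate_regions = footprints(support, min_species=2)
```

Storing a set of species per position keeps overlapping blocks from the same ortholog from inflating the count, which matches the slide's "number of species" wording.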
  9. EVALUATION: ATPROBE EXPERIMENTAL CIS-REGULATORY ELEMENTS

     Experimental motif   Sequence     Scmm   P value
     G-box                ACGTGGC      0.54   < 0.001
     GA motif             ATAGATAA     0.09   0.48
     I-box                GATAAGATT    0.36   < 0.001
     (not labelled)       TATATATA     0.7    < 0.001

     Other figure labels: RBCS1A; GAPA; ACA motif; C-motif
  10. PROPERTIES OF CNSs
     • 69,361 CNSs associated with 17,895 genes
     • Protein-coding genes (99%), miRNA genes (1%)
     • Median length: 11 nt (min-max: 5-514 nt)
     • CNSs cover 1,070 kb of the non-coding Arabidopsis genome
  11. DETECTION OF EXPERIMENTAL REGULATORY ELEMENTS
     • Black boxes: percentage of recovered elements
     • White boxes: percentage of uniquely recovered elements in this study
  12. RECOVERY OF IN VIVO FUNCTIONAL TARGETS USING CNS INFORMATION
     • White boxes: fold enrichment for CNSs
     • Black boxes: fold enrichment for naïve motif mapping
     High-quality dataset, 15 TFs: ChIP-Seq binding + TF binding site + differentially expressed upon TF perturbation (n=2708)
  13. GENOME-WIDE REGULATORY ANNOTATION
     Collapsed TF-target module network
     • 40,758 TF-target interactions (157 TFs)
     • 9/13 TFs show significant overlap with experimentally confirmed targets (AtRegNet / Hussey et al., 2013)
     • Various functional genomics metrics confirm the quality of the predicted GRN
     Van de Velde, J.*, Heyndrickx, K.S.*, and Vandepoele, K. (2014). Inference of Transcriptional Networks in Arabidopsis through Conserved Noncoding Sequence Analysis. Plant Cell.
  14. A CONDITION-SPECIFIC SECONDARY CELL WALL GENE REGULATORY NETWORK
  15. PART 3. AN INTEGRATED GENE REGULATORY NETWORK USING CHIP DATA OF 27 TRANSCRIPTION FACTORS
     • How is TF binding organised across different target genes?
     • Do Highly Occupied Target (HOT) genes in plants have the same distinct regulatory features as in other organisms?
     • To what extent is binding linked to differential expression and TF binding site presence?
  16. * Heyndrickx KS, Vandepoele K (2012) Systematic Identification of Functional Plant Modules through the Integration of Complementary Data Sources. Plant Physiology 159: 884-901
  17. TF PROTEIN-PROTEIN INTERACTION NETWORK
     Figure labels: Flowering; Light Response
     De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inze D (2012) CORNET 2.0: integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. The New Phytologist 195: 707-720
  18. GENOMIC REGIONS BOUND BY TFS
  19. GENOMIC REGIONS BOUND BY TFS
  20. GENE LOCATION ANALYSIS OF TF BINDING EVENTS
     • 89% of upstream sites lie < 2 kb from the gene (23,891 / 26,717)
     • 91% of downstream sites lie < 2 kb from the gene (11,687 / 12,828)
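As a quick arithmetic check, the percentages on this slide follow directly from the reported counts. The short Python sketch below uses only the numbers given above; the function and variable names are hypothetical.

```python
def percent_within(n_within, n_total):
    """Percentage of binding sites lying within the distance cutoff."""
    return 100.0 * n_within / n_total


# Counts as reported on the slide (sites within 2 kb / all sites).
upstream_pct = percent_within(23_891, 26_717)
downstream_pct = percent_within(11_687, 12_828)

print(f"upstream:   {upstream_pct:.1f}%")
print(f"downstream: {downstream_pct:.1f}%")
```

Both values round to the whole percentages quoted on the slide.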
  21. COORDINATED REGULATION BY DIFFERENT TFS
  22. BINDING SITE ORGANISATION
     Figure panel labels: A, B, C, D, R
     DH I sites: Zhang W, Zhang T, Wu Y, Jiang J (2012) Genome-Wide Identification of Regulatory DNA Elements and Protein-Binding Footprints Using Signatures of Open Chromatin in Arabidopsis. The Plant Cell
  23. COOPERATIVE TF BINDING: HUB & HOT
     Figure axis labels: # Target genes; # Bound TFs; Gene complexity A-D
  24. COOPERATIVE TF BINDING: HUB & HOT
     • Hub genes
       • 1,170 genes bound by ≥ 8 TFs
       • Significantly enriched for TFs and miRNAs
     • Highly Occupied Target (HOT) regions
       • 1,179 regions bound by ≥ 7 TFs
     Enrichment for regulatory genes (TFs, kinases), response to stimuli & developmental genes
     Figure label: A-D
  25. CHROMATIN STATES AND NUCLEOTIDE DIVERSITY OF TF-BOUND REGIONS
     Population sequence diversity based on 369 Arabidopsis strains (Weigel lab)
     Sequeira-Mendes et al., … Gutierrez, C. (2014). The Functional Topography of the Arabidopsis Genome Is Organized in a Reduced Number of Linear Motifs of Chromatin States. Plant Cell.
  26. HOT-ASSOCIATED GENES ARE FACTOR-RESPONSIVE
  27. HOT-ASSOCIATED GENES ARE FACTOR-RESPONSIVE
  28. EXPRESSION LEVELS ARE CORRELATED WITH THE TOTAL NUMBER OF BOUND TFS
     Low: < 3 TFs; intermediate: >= 3 TFs and < 8 TFs; hub: >= 8 TFs (n=406 flowering-related genes)
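The binning on this slide can be sketched in a few lines of Python. The thresholds (low: < 3 TFs, intermediate: 3-7 TFs, hub: >= 8 TFs) come from the slide; the function names, the `(tf, gene)` input format, and the toy gene and TF identifiers are assumptions for illustration, not the authors' pipeline.

```python
from collections import defaultdict


def count_bound_tfs(binding_events):
    """Count distinct TFs bound per gene.

    binding_events: iterable of (tf, gene) pairs (hypothetical format).
    """
    tfs_per_gene = defaultdict(set)
    for tf, gene in binding_events:
        tfs_per_gene[gene].add(tf)  # repeated peaks of one TF count once
    return {gene: len(tfs) for gene, tfs in tfs_per_gene.items()}


def complexity_class(n_tfs):
    """Bin a gene by its number of bound TFs, using the slide's thresholds."""
    if n_tfs >= 8:
        return "hub"
    if n_tfs >= 3:
        return "intermediate"
    return "low"


# Toy data: geneA is bound by 8 distinct TFs, geneB by 3, geneC by 1.
toy_events = (
    [(f"TF{i}", "geneA") for i in range(8)]
    + [(f"TF{i}", "geneB") for i in range(3)]
    + [("TF0", "geneC"), ("TF0", "geneC")]  # same TF twice
)
classes = {
    gene: complexity_class(n)
    for gene, n in count_bound_tfs(toy_events).items()
}
```

With per-gene classes in hand, expression levels could then be compared across the three bins, which is the comparison the slide title summarises.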
  29. A SINGLE MOTIF DOES NOT EXPLAIN BINDING
  30. A SINGLE MOTIF DOES NOT EXPLAIN BINDING (II)
  31. A SINGLE MOTIF DOES NOT EXPLAIN BINDING (II)
  32. SEC. MOTIFS ARE ENRICHED FOR DE AS WELL
     PIF5: ChIP-Seq
  33. SEC. MOTIFS ARE ENRICHED FOR DE AS WELL
  34. NEW HYPOTHESES ON CO-BINDING AND TETHERING
     Figure labels: Co-binding; Tethering
     TFs shown: AP1, PIF3, PRR7, FHY3, AP1, SEP3, PRR5, PRR7
  35. CONCLUSIONS & PERSPECTIVES
     • Integration of CNS with complementary experimental data sources offers new possibilities for regulatory gene annotation in plants
       • High specificity to predict TF-target interactions
       • Complementary to experimental TF-target detection methods
       • Study GRN conservation and rewiring across species
     • Integrated 27 TF ChIP-Seq gene regulatory network reveals
       • Complexly regulated genes are enriched for regulatory genes
       • HOT-associated regions represent functional binding events
         • Open chromatin
         • Sequence constraint
         • TF binding sites
         • Enrichment for regulated target genes
       • Co-binding and tethering patterns could explain the apparent discrepancy between binding and regulation in ChIP-chip/Seq studies
  36. FURTHER READING
     • Van de Velde, J.*, Heyndrickx, K.S.*, and Vandepoele, K. (2014). Inference of Transcriptional Networks in Arabidopsis through Conserved Noncoding Sequence Analysis. The Plant Cell 26(7):2729-2745
     • Proost, S., Van Bel, M. … and Vandepoele, K. (2015). PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Research (accepted)
     • Heyndrickx, K.S.*, Van de Velde, J.*, and Vandepoele, K. (2014). A functional and evolutionary perspective on transcription factor binding in Arabidopsis thaliana. The Plant Cell (accepted)