Multimodal texts convey meaning by combining verbal and non-verbal modes of communication: written, spoken, visual, audio, gestural, and spatial. They can take many forms, including digital media, print, and live performance, and what defines them is the integration of different modes. In composition, understanding a multimodal text means recognizing how its linguistic (written and spoken), visual, spatial, audio, and gestural modes interrelate and contribute to the overall meaning.