This document discusses multimodal text analysis, which examines how communication draws on multiple modes or semiotic resources, such as language, images, gesture, and sound. It outlines two major approaches: (1) exploring theory, with text analyses serving as examples that illustrate general principles; and (2) closely examining actual texts to build detailed descriptions from which generalizations are derived. Kress and van Leeuwen (1996) and O'Toole (1994) are cited as exemplars of these two approaches: Kress and van Leeuwen focus more on theoretical discussion, whereas O'Toole emphasizes close analysis of specific texts. The document also notes that multimodal analysis faces challenges in accessing, annotating, and reproducing dynamic media such as video.