The document proposes a scheme to enrich textual answers from community question answering (cQA) forums with appropriate multimedia data like images and videos. It consists of three components: 1) determining what type of media to add, 2) generating queries to search for multimedia, and 3) selecting and presenting relevant images/videos. This allows cQA to provide more informative answers by supplementing text with visual media. The approach leverages existing human-generated textual answers and focuses on linking those to multimedia, addressing complex questions without requiring deep understanding. It was tested on a multi-source QA dataset and showed effectiveness.