The document discusses image annotation. It begins by defining image annotation and the motivations for it: summarization, video search and retrieval, reduced storage requirements, and video reconstruction. It then outlines the general annotation steps: image capture and pre-processing, feature extraction, and inference of scene-level semantic concepts from the extracted objects and features. It discusses challenges such as data inaccuracy and time consumption, along with potential solutions such as ontology-directed annotation. Finally, it reviews recent research that applies ontologies, sensor data, and fuzzy models to semantic image and video annotation.