This document summarizes the progress and planned work of a video search application that relies on automatic annotation. The application analyzes video structure by fragmenting each video into frames and discarding duplicate frames. A deep learning model, trained on categorized image data, then detects objects in the remaining frames. Semantic textual search tokenizes user queries and matches the tokens against an ontology and a database to return relevant videos. Planned work includes optimizing the code, improving frame-analysis techniques, serializing models to protocol buffers, categorizing videos automatically, and analyzing relationships between queries.
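The frame fragmentation and duplicate-identification step could be sketched as follows. This is a minimal illustration, not the application's actual method: it assumes frames arrive as grayscale NumPy arrays and treats a frame as a duplicate when its mean absolute pixel difference from the last kept frame falls below a threshold. The function name and threshold value are hypothetical.

```python
import numpy as np

def deduplicate_frames(frames, threshold=5.0):
    """Keep a frame only if it differs enough from the last kept frame.

    frames: iterable of 2-D NumPy arrays (grayscale frames).
    threshold: mean absolute pixel difference below which a frame
        counts as a duplicate of the previously kept frame.
    """
    kept = []
    last = None
    for frame in frames:
        if last is None or np.abs(frame.astype(float) - last.astype(float)).mean() > threshold:
            kept.append(frame)
            last = frame
    return kept

# Synthetic demo: three identical blank frames followed by one changed frame.
a = np.zeros((4, 4), dtype=np.uint8)
b = np.full((4, 4), 200, dtype=np.uint8)
print(len(deduplicate_frames([a, a, a, b])))  # → 2
```

In a real pipeline the frames would come from a decoder such as OpenCV's `VideoCapture`, and a perceptual hash would be more robust than raw pixel differences, but the dedup logic stays the same shape.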
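The semantic textual search step (tokenize the query, resolve tokens through an ontology, look up annotated videos) could be sketched like this. The ontology entries, video IDs, and index structure here are hypothetical placeholders for the application's real data, intended only to show the token-to-concept-to-result flow.

```python
def tokenize(query):
    """Lowercase and split a user query into alphabetic word tokens."""
    return [t for t in query.lower().split() if t.isalpha()]

# Hypothetical toy ontology: maps a token to a canonical concept,
# so synonyms in the query resolve to the same database label.
ONTOLOGY = {
    "dog": "animal.dog",
    "puppy": "animal.dog",
    "car": "vehicle.car",
    "automobile": "vehicle.car",
}

# Hypothetical annotation database: concept -> IDs of videos whose
# frames were annotated with that concept.
VIDEO_INDEX = {
    "animal.dog": ["vid_01", "vid_07"],
    "vehicle.car": ["vid_02"],
}

def search(query):
    """Resolve query tokens to concepts, then collect matching videos."""
    concepts = {ONTOLOGY[t] for t in tokenize(query) if t in ONTOLOGY}
    results = set()
    for concept in concepts:
        results.update(VIDEO_INDEX.get(concept, []))
    return sorted(results)

print(search("puppy in a car"))  # → ['vid_01', 'vid_02', 'vid_07']
```

Note that "puppy" and "dog" return the same videos because both map to the `animal.dog` concept; that canonicalization is the point of routing queries through an ontology rather than matching raw keywords.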