This document describes MMRetrieval.net, a multimodal search engine that retrieves information across multiple modalities, including text and images. It discusses key challenges in multimodal retrieval, such as defining what constitutes a modality, fusing relevance scores across modalities, and improving efficiency through two-stage retrieval. MMRetrieval.net was developed by researchers at the Democritus University of Thrace (DUTH) to participate in the ImageCLEF 2010 Wikipedia retrieval task, where it achieved the second-best performance overall by combining text and visual features through various fusion techniques. The system demonstrates the promise of multimodal search but also identifies open challenges around modality definitions, fusion methods, and scaling to large databases.
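To make the score-fusion idea concrete, the sketch below shows one common approach: min-max normalizing each modality's scores and combining them with a weighted linear sum (CombSUM-style). This is an illustrative assumption, not the system's actual method; the function names, the weights, and the document IDs are all hypothetical.

```python
def minmax_normalize(scores):
    """Map one modality's raw scores to [0, 1] so modalities are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def linear_fusion(text_scores, image_scores, w_text=0.7, w_image=0.3):
    """Weighted linear fusion of two modalities (hypothetical weights).
    A document missing from one modality contributes 0.0 for it."""
    t = minmax_normalize(text_scores)
    v = minmax_normalize(image_scores)
    docs = set(t) | set(v)
    fused = {d: w_text * t.get(d, 0.0) + w_image * v.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Example with made-up scores: text retrieval ranks d1 highest,
# image retrieval only returned d2 and d3.
ranked = linear_fusion({"d1": 2.0, "d2": 1.0, "d3": 0.5},
                       {"d2": 0.9, "d3": 0.4})
```

Normalization matters here because raw scores from a text engine and a visual-feature engine typically live on different scales; without it, one modality can dominate the fused ranking regardless of the weights.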