This document presents a method for scene text recognition in mobile applications using character descriptors and structure configuration. It proposes using a character descriptor that combines feature detectors and descriptors to extract text features effectively. It also models character structure using stroke configuration maps derived from character boundaries and skeletons. The method was tested on various datasets and achieved accuracy rates above 70%, outperforming existing methods. It can detect text regions and recognize text information for applications like text understanding and retrieval on mobile devices.