The document describes a proposed extension to a visual information narrator application built on neural networks. The researchers developed an Android application that generates image captions in real time on inexpensive mobile hardware for visually impaired users. They trained an image-captioning model with TensorFlow on the Flickr dataset and deployed it in an Android app to demonstrate real-world applicability, applying optimizations so that it performs well on low-cost hardware. The system uses an encoder-decoder neural network architecture: a convolutional neural network encodes the input image into a fixed-length vector, and a decoder then translates that vector into a natural-language description.
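A minimal sketch of such an encoder-decoder captioning model in TensorFlow/Keras is shown below. The specific backbone (MobileNetV2, a plausible choice for low-cost hardware), vector size, vocabulary size, and decoder type (an LSTM) are illustrative assumptions; the source only specifies a CNN encoder producing a fixed-length vector and a decoder that generates a natural-language description.

```python
# Sketch of a CNN-encoder / RNN-decoder captioning model.
# Dimensions and the MobileNetV2 backbone are assumptions, not the
# authors' confirmed configuration.
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 10_000   # assumed vocabulary size
EMBED_DIM = 256       # assumed embedding / image-vector dimension
MAX_LEN = 20          # assumed maximum caption length

# Encoder: a pretrained CNN maps the image to a fixed-length vector.
backbone = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", pooling="avg")
backbone.trainable = False

image_in = layers.Input(shape=(224, 224, 3), name="image")
img_vec = layers.Dense(EMBED_DIM, activation="relu")(backbone(image_in))

# Decoder: an LSTM reads the caption-so-far; its state is merged with
# the image vector to predict the next word.
tokens_in = layers.Input(shape=(MAX_LEN,), name="caption_tokens")
tok_emb = layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(tokens_in)
lstm_out = layers.LSTM(EMBED_DIM)(tok_emb)

merged = layers.add([img_vec, lstm_out])
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(merged)

model = Model(inputs=[image_in, tokens_in], outputs=next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

For on-device deployment of the kind the document describes, a model like this would typically be converted with TensorFlow Lite (e.g., tf.lite.TFLiteConverter.from_keras_model), one plausible route to the low-cost-hardware optimizations mentioned, though the source does not state which optimizations were used.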