This document describes a deep neural network model that predicts emotions in images from high-level semantic features such as objects and backgrounds. The model was trained on a new image-emotion dataset of over 10,000 images labeled with valence and arousal values obtained through crowdsourcing. It feeds features such as object-detection outputs, background semantics, and color statistics into a feedforward neural network that predicts continuous valence and arousal values, giving a more nuanced representation of emotion than discrete categories. Experiments showed that the model predicts these values effectively.
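
The pipeline described above can be sketched as a small feedforward network over concatenated feature groups. The feature dimensions, layer sizes, and weights below are illustrative assumptions, not the paper's actual configuration; trained parameters are stood in for by random initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature-group sizes (assumptions, not from the paper).
N_OBJECT = 80      # e.g. per-class object-detection scores
N_BACKGROUND = 20  # background / scene semantics
N_COLOR = 12       # color statistics (e.g. channel means, histogram bins)
D_IN = N_OBJECT + N_BACKGROUND + N_COLOR
D_HIDDEN = 64

# Randomly initialized weights stand in for trained parameters.
W1 = rng.normal(0.0, 0.1, (D_IN, D_HIDDEN))
b1 = np.zeros(D_HIDDEN)
W2 = rng.normal(0.0, 0.1, (D_HIDDEN, 2))  # two outputs: valence, arousal
b2 = np.zeros(2)

def predict_emotion(features: np.ndarray) -> np.ndarray:
    """One-hidden-layer feedforward pass mapping semantic features
    to continuous (valence, arousal) values."""
    h = np.maximum(0.0, features @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                       # linear regression head

# Example: concatenate the three feature groups for one image.
object_scores = rng.random(N_OBJECT)
background_sem = rng.random(N_BACKGROUND)
color_stats = rng.random(N_COLOR)
x = np.concatenate([object_scores, background_sem, color_stats])

valence, arousal = predict_emotion(x)
```

Regressing two continuous values, rather than classifying into discrete emotion categories, is what lets the model place an image anywhere in the valence-arousal plane; training such a head would typically minimize a mean-squared-error loss against the crowdsourced labels.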