This document proposes a convolutional neural network (CNN) to automatically classify aerial and remote sensing images. The CNN has six layers - three convolutional layers to extract visual features from the images at different levels of abstraction, two fully-connected layers to integrate the extracted features, and a final softmax classifier layer to classify the images. The CNN is evaluated on two datasets and is shown to outperform state-of-the-art baselines in terms of classification accuracy, demonstrating its ability to learn spatial features directly from images without relying on handcrafted features or descriptors.