The detection and reconstruction of 3D buildings in urban areas have been active research topics because of their many applications, including 3D population density studies, emergency planning, and building value estimation. Standard approaches to extracting building footprints and measuring building heights rely on aerial or spaceborne point cloud data, which are unavailable in many areas. In contrast, high-resolution satellite imagery has become more readily available in recent years and could provide enough information to estimate a building’s height. Recent successes of deep learning in semantic segmentation have shown that convolutional neural networks can be effective tools for extracting 2D building footprints. Using a digital surface model derived from LiDAR data with free and open source software (FOSS) as ground truth, this study goes a step further by employing state-of-the-art deep learning architectures such as U-Net to infer both building footprints and estimated building heights in one pass from a single satellite image. This application of open deep learning frameworks can bring the benefits of 3D cities to a larger portion of the world.
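As a rough illustration of what training a single network on both outputs could involve, the sketch below combines a per-pixel footprint loss with a height loss evaluated only on building pixels. It is not taken from the study: the choice of binary cross-entropy and mean absolute error, the function name `joint_loss`, and the `height_weight` parameter are all assumptions made for the example.

```python
import math


def joint_loss(mask_prob, height_pred, mask_true, height_true, height_weight=1.0):
    """Hypothetical joint loss for one image (assumed, not the study's method).

    mask_prob, height_pred, mask_true, and height_true are flat lists with
    one entry per pixel. mask_true holds 0 or 1; heights are in metres.
    """
    eps = 1e-7
    n = len(mask_true)

    # Footprint term: binary cross-entropy averaged over every pixel.
    bce = 0.0
    for p, t in zip(mask_prob, mask_true):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= n

    # Height term: mean absolute error, counted only on true building pixels
    # so that ground pixels do not dominate the average.
    building = [i for i, t in enumerate(mask_true) if t == 1]
    if building:
        mae = sum(abs(height_pred[i] - height_true[i]) for i in building) / len(building)
    else:
        mae = 0.0

    # Weighted sum of the two terms; height_weight is an assumed tuning knob.
    return bce + height_weight * mae, bce, mae


# Toy four-pixel example: two building pixels, two ground pixels.
total, bce, mae = joint_loss(
    mask_prob=[0.9, 0.8, 0.2, 0.1],
    height_pred=[10.0, 9.5, 1.0, 0.0],
    mask_true=[1, 1, 0, 0],
    height_true=[12.0, 9.0, 0.0, 0.0],
)
print(total, bce, mae)
```

Restricting the height term to building pixels is one possible design choice; it keeps the large number of ground pixels in a typical scene from swamping the height error.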