This document summarizes VNect, a system for real-time 3D human pose estimation from a single RGB camera. A fully convolutional neural network regresses 2D and 3D joint positions at 30 Hz. The predicted joint positions are then fitted to a kinematic skeleton model, which yields a full global 3D pose without requiring tightly cropped input frames or bounding boxes. The only input the system needs is video from an off-the-shelf RGB camera.
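
The idea of fitting per-joint predictions to a kinematic skeleton can be illustrated with a deliberately simplified sketch. This is not VNect's actual fitting procedure; it only shows one way to make raw 3D joint predictions consistent with a skeleton of fixed bone lengths, by keeping each predicted bone direction and replacing its length with the skeleton's. The joint hierarchy `PARENTS`, the values in `BONE_LENGTHS`, and the function `fit_to_skeleton` are all hypothetical names and numbers chosen for illustration.

```python
import numpy as np

# Hypothetical 5-joint chain: parent index per joint, -1 marks the root.
# Parents are listed before their children so one forward pass suffices.
PARENTS = [-1, 0, 1, 2, 3]

# Assumed fixed bone lengths in metres (joint -> its parent); root entry unused.
BONE_LENGTHS = [0.0, 0.25, 0.25, 0.10, 0.20]


def fit_to_skeleton(pred, parents=PARENTS, bone_lengths=BONE_LENGTHS):
    """Keep predicted bone directions, but enforce the skeleton's bone lengths.

    pred: array of shape (num_joints, 3) with predicted 3D joint positions.
    Returns an array of the same shape whose bones have the given lengths.
    """
    pred = np.asarray(pred, dtype=float)
    fitted = np.zeros_like(pred)
    for j, p in enumerate(parents):
        if p < 0:
            # The root joint is copied unchanged.
            fitted[j] = pred[j]
            continue
        direction = pred[j] - pred[p]
        norm = np.linalg.norm(direction)
        if norm < 1e-9:
            # Degenerate prediction: fall back to an arbitrary unit direction.
            direction, norm = np.array([0.0, 1.0, 0.0]), 1.0
        # Place the joint at the correct distance from its already-fitted parent.
        fitted[j] = fitted[p] + bone_lengths[j] * direction / norm
    return fitted


if __name__ == "__main__":
    # Synthetic "network output": random joint positions, for demonstration only.
    rng = np.random.default_rng(0)
    noisy_prediction = rng.normal(scale=0.3, size=(len(PARENTS), 3))
    print(fit_to_skeleton(noisy_prediction))
```

A per-frame projection like this ignores joint-angle limits, temporal smoothness, and the 2D predictions, so it should be read as a toy model of the fitting step rather than a substitute for it.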