SSD is a single shot detector model that uses multiple feature maps from different layers to detect objects at different scales. It directly predicts bounding boxes and class probabilities using convolutional layers, unlike previous models that separated classification and regression. SSD achieves accuracy comparable to state-of-the-art models while running in real-time by using default bounding boxes of different aspect ratios on feature maps to predict offsets for object detection.