This paper revisits self-supervised visual representation learning through a large-scale study that crosses several CNN architectures (ResNet, RevNet, VGG) with several self-supervised pretext tasks (rotation prediction, exemplar, jigsaw, relative patch location). The study finds that modern architectures such as ResNet significantly outperform the older AlexNet-style models used in much prior self-supervised work, and that increasing network width further boosts the quality of the learned representations. Evaluating the representations on a held-out dataset shows that the learned features generalize well.
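To make the rotation pretext task concrete, here is a minimal NumPy sketch of its data-generation step: each image is rotated by 0, 90, 180, or 270 degrees, and the rotation index serves as a free supervisory label for a 4-way classifier. This is an illustrative sketch only, not the paper's actual training pipeline; the function name and array shapes are assumptions for the example.

```python
import numpy as np

def make_rotation_batch(images):
    """Rotation pretext task (sketch): rotate each image by k * 90 degrees
    for k in {0, 1, 2, 3} and use k as the self-supervised label.
    `images` is assumed to be an array of shape (N, H, W, C)."""
    rotated, labels = [], []
    for img in images:
        for k in range(4):
            # np.rot90 rotates the first two (spatial) axes by k * 90 degrees
            rotated.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(rotated), np.array(labels)

# Two dummy 32x32 RGB images expand into 8 (image, rotation-label) pairs.
batch = np.random.rand(2, 32, 32, 3)
x, y = make_rotation_batch(batch)
print(x.shape)  # (8, 32, 32, 3)
print(y)        # [0 1 2 3 0 1 2 3]
```

A network trained to predict `y` from `x` must learn object orientation and structure, which is why the resulting features transfer to downstream recognition tasks.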