The document discusses 'BoxInst', a high-performance instance segmentation model that uses only bounding box annotations for training, achieving a significant increase in average precision from 21.1% to 31.6% on the COCO benchmark. It introduces two proposed loss terms, projection loss and pairwise affinity loss, which work together to facilitate mask learning without the need for pixel-level mask annotations. The results indicate that BoxInst outperforms previous state-of-the-art models, including both weakly and fully supervised methods.