This document summarizes a research paper that proposes Physion, a new dataset for evaluating how well machine learning models can predict physical interactions from vision, as humans do. The dataset contains videos of common physical phenomena. The paper evaluates several state-of-the-art models on the dataset, including particle-based simulators and vision-based models. Particle-based simulators performed on par with humans, while vision-based models performed poorly. The document gives background on the motivation for the dataset and describes the models and their approaches.