This document summarizes the process of building a machine learning model to identify duplicate listings using data from Avito, a Russian online marketplace. It discusses the dataset containing over 3 million training pairs and 1 million test pairs of listings with images and text data. Many features were computed from the text, images, attributes, locations and other fields to quantify similarities and differences between listings. These features were then used to train models like XGBoost and ensembles of models to generate the final predictions. Key lessons learned included the importance of cross validation and managing a large number of computed features.