Presenter: Amaia Salvador
E. Mohedano, Salvador, A., McGuinness, K., Giró-i-Nieto, X., O'Connor, N., and Marqués, F., “Bags of Local Convolutional Features for Scalable Instance Search”, in ACM International Conference on Multimedia Retrieval (ICMR), New York City, NY, USA. 2016.
A. Salvador, Giró-i-Nieto, X., Marqués, F., and Satoh, S., “Faster R-CNN Features for Instance Search”, in CVPR Workshop Deep Vision, Las Vegas, NV, USA. 2016.
Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work proposes a simple pipeline for encoding the local activations of a convolutional layer of a pre-trained CNN using the well-known bag of words (BoW) aggregation scheme. Assigning each local array of activations in a convolutional layer to a visual word produces an assignment map, a compact representation that relates each region of an image to a visual word. We use the assignment map for fast spatial reranking, obtaining object localizations that are then used for query expansion. We further investigate the potential of using convolutional features from an object detection network such as Faster R-CNN, which makes it possible to obtain image-wise and region-wise features in a single forward pass. We demonstrate the suitability of such representations for image retrieval on the Oxford Buildings 5k and Paris Buildings 6k datasets and on a subset of TRECVid Instance Search 2013, achieving competitive results. This talk will review the two publications related to this work, which were recently accepted at ICMR 2016 and at the DeepVision workshop at CVPR 2016.
Barcelona, 3 May 2016.