ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning
Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris
Given two arbitrary videos, calculate their similarity based on their visual content.
• Video Retrieval
Z. Gao et al. “ER3: A unified framework for event retrieval, recognition and recounting”. CVPR, 2017.
G. Kordopatis-Zilos et al. “Near-duplicate video retrieval with deep metric learning”. ICCVW, 2017.
Video similarity calculation disregards the spatio-temporal information of videos
Y. Jiang and J. Wang. “Partial copy detection in videos: A benchmark and an evaluation of popular methods”. IEEE Trans. on Big Data, 2016.
L. Baraldi et al. “LAMV: Learning to align and match videos with kernelized temporal layers”. CVPR, 2018.
Frame similarity calculation disregards the spatial structure of frames
Fine-grained similarity calculation
• Learn a video similarity function that respects:
• Spatial structure of video frames (intra-frame relations)
• Temporal structure of videos (inter-frame relations)
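The two requirements above can be illustrated with a minimal sketch. It assumes each frame is described by a list of region-level feature vectors (an assumption about the input, not something stated on the slide), and it uses fixed, hand-written aggregation rather than a learned function. The helper names `chamfer`, `frame_similarity` and `video_similarity` are hypothetical. Frame similarity is computed from region-to-region similarities (intra-frame relations), and video similarity is computed from the full frame-to-frame matrix (inter-frame relations). The plain Chamfer aggregation used at the video level ignores frame order, so it is only a stand-in for a learned temporal component.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def chamfer(matrix):
    """Chamfer-style aggregation: mean over rows of each row's maximum."""
    return sum(max(row) for row in matrix) / len(matrix)


def frame_similarity(regions_a, regions_b):
    """Compare two frames region by region instead of via one global vector."""
    region_sims = [[dot(ra, rb) for rb in regions_b] for ra in regions_a]
    return chamfer(region_sims)


def video_similarity(video_a, video_b):
    """Compare two videos via their full frame-to-frame similarity matrix.

    Each video is a list of frames; each frame is a list of region vectors.
    """
    a = [[l2_normalize(r) for r in frame] for frame in video_a]
    b = [[l2_normalize(r) for r in frame] for frame in video_b]
    # One entry per frame pair; a learned model could exploit temporal
    # patterns in this matrix. Here it is simply aggregated (assumption).
    frame_sims = [[frame_similarity(fa, fb) for fb in b] for fa in a]
    return chamfer(frame_sims)


# Toy inputs: two frames per video, two 2-D region vectors per frame.
video_x = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 0.0]]]
video_y = [[[0.0, 1.0], [0.0, 2.0]], [[0.0, 3.0], [0.0, 1.0]]]
score_same = video_similarity(video_x, video_x)
score_diff = video_similarity(video_x, video_y)
```

In this toy setup a video compared with itself scores close to 1, while `video_y`, whose regions only partly overlap with those of `video_x`, scores lower.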