Vasileios Mezaris

MLLM Frame Subset Ensembling for Audio-Visual Video QA
MLLM-based Reranking for Ad-hoc Video Search
TSalV360: A Method and Dataset for Text-driven Saliency Detection in 360-Degrees Videos
An Experimental Study on Generating Plausible Textual Explanations for Video Summarization
SD-VSum: A Method and Dataset for Script-Driven Video Summarization
Cross-modal Image Recommendation for News Articles by Multimodal Foundation Models-based Retrieval-Reranking
Combatting video-borne disinformation and increasing trust in AI methods
An LLM Framework for Long-form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5
Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples
B-FPGM: Lightweight Face Detection via Bayesian-Optimized Soft FPGM Pruning
LMM-Regularized CLIP Embeddings for Image Classification
Disturbing Image Detection Using LMM-Elicited Emotion Embeddings
Exploiting LMM-based knowledge for image classification tasks
Detecting visual-media-borne disinformation: a summary of latest advances at the IDT Lab of CERTH-ITI
Dataset and methods for 360-degree video summarization