Automatic and Semi-automatic Augmentation of Migration Related Semantic Concepts for Visual Media Retrieval


Understanding the factors related to migration, such as perceptions about routes and target countries, is critical for border agencies and society altogether. A systematic analysis of communication and news channels, such as social media, can improve our understanding of such factors. Videos and images play a critical role in social media as they have significant impact on perception manipulation and misinformation campaigns. However, more research is needed in the identification of semantically relevant visual content for specific queried concepts. Furthermore, an important problem to overcome in this area is the lack of annotated datasets that could be used to create and test accurate models. A recent study proposed a novel video representation and retrieval approach that effectively bridges the gap between a substantiated domain understanding - encapsulated into textual descriptions of Migration Related Semantic Concepts (MRSCs) - and the expression of such concepts in a video. In this work, we build on this approach and propose an improved procedure for the crucial step of the concept labels’ textual augmentation, which contributes towards the full automation of the pipeline. We assemble the first, to the best of our knowledge, migration-related videos and images dataset and we experimentally assess our method on it.


  1. The MIRROR project has received funding from the European Union’s Horizon 2020 research and innovation action program under grant agreement № 832921.
     Automatic and Semi-automatic Augmentation of Migration Related Semantic Concepts for Visual Media Retrieval
     Damianos Galanopoulos, Erick Elejalde, Alexandros Pournaras, Claudia Niederée, Vasileios Mezaris
     Workshop on Open Challenges in Online Social Networks @ 32nd ACM Conference on Hypertext and Social Media, 30 Aug. - 2 Sep. 2021
  2. Introduction
     ● Migration is a complex process and a critical issue
     ● A plethora of factors lead to migration decisions
     ● Social media impact these decisions
     ● Social media are used to manipulate perception and lead to misperceptions
     ● A better understanding of migration-decision factors, and of how they are expressed in visual content, is critical
     ● Predict and prevent possible misperception-related risks
  3. Introduction
     ● How are semantic concepts expressed in social media, and how are they related to visual content?
     ● Based on a previously developed approach for bridging the gap between Migration Related Semantic Concepts and their expression in a video
     ● Automatic analysis of migration-related social media items
     ● Automatic and semi-automatic augmentation of semantic concepts for improved performance
  4. Migration-Related Semantic Concepts (MRSCs)
     What are MRSCs?
     ● Semantic concepts relevant in the context of migration
     How are MRSCs defined?
     ● In-depth study of migration theories
     ● Discussions with domain experts
     ● The collection of semantic concepts expresses the aspects of migration
     ● Based on three popular theoretical approaches
       ○ Neo-classical economic equilibrium
       ○ Historical-structural approach
       ○ Migration systems theory
  5. MRSC - Factor Classification
     ● Semantic concepts are combined to form meaningful templates
       ○ For example, “family” and “war” combined as “Families in war”
     ● Based on such patterns, a hierarchical structure is constructed
     ● 106 MRSCs are grouped into five categories
       ○ Economic
       ○ Social
       ○ Demographic
       ○ Environmental
       ○ Political
     ● 20 MRSCs on the first level and 86 under them
  6. MRSC - Factor Classification
     106 MRSCs organized into five categories and two levels (partial listing)
  7. MRSC - Augmentation
     ● MRSCs are typically high-level abstractions of semantic concepts
     ● Correlating videos with MRSCs is challenging, even for human experts
     ● What should be shown in a video to make it relevant to “Ethnicity”?
     ● Need to enrich MRSC labels with sentences that are explanatory and complementary to the original label
     ● Manual augmentation by human experts is time-consuming and does not scale to dynamically-added new concepts
     ● We propose two approaches:
       ○ Automatic
       ○ Semi-automatic
  8. MRSC - Augmentation
     Automatic augmentation
     ● Pre-defined large-scale pool of available sentences
     ● For every MRSC, a set of related sentences is identified
     ● Sentence-BERT is used to generate a vector representation for every sentence and for the MRSC
     ● The cosine similarity of every MRSC-sentence pair is calculated
     ● Finally, a list of the 𝑘 most related sentences is created for every MRSC
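The selection step above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the toy bag-of-words `embed` function stands in for Sentence-BERT (which the real pipeline would use), and all function and variable names are illustrative.

```python
import numpy as np

def embed(texts, dim=16, seed=0):
    """Toy bag-of-words embedding; a stand-in for Sentence-BERT in this
    sketch (the real pipeline would encode texts with a trained model)."""
    rng = np.random.default_rng(seed)
    vocab = {}
    vecs = []
    for t in texts:
        v = np.zeros(dim)
        for w in t.lower().split():
            if w not in vocab:
                vocab[w] = rng.standard_normal(dim)
            v += vocab[w]
        vecs.append(v)
    return np.vstack(vecs)

def top_k_sentences(mrsc_label, sentence_pool, k=3):
    """Return the k pool sentences most cosine-similar to the MRSC label."""
    vecs = embed([mrsc_label] + sentence_pool)
    c, s = vecs[0], vecs[1:]
    sims = (s @ c) / (np.linalg.norm(s, axis=1) * np.linalg.norm(c) + 1e-9)
    return [sentence_pool[i] for i in np.argsort(-sims)[:k]]
```

With a real sentence encoder, only `embed` changes; the ranking logic stays the same.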
  9. MRSC - Augmentation
     Semi-automatic augmentation
     ● Human expert judges evaluate and modify the automatically generated lists
     ● We aim to minimize the manual intervention
     ● Experts are not allowed to add new sentences or to modify the existing ones
     ● Only sentence deletions are allowed
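Because only deletions are allowed, the expert review step amounts to a subset filter over the automatic output. A minimal sketch (function and argument names are illustrative):

```python
def semi_automatic_refine(auto_sentences, rejected):
    """Deletion-only expert review: keep the automatically selected
    sentences in their original order, minus those the expert rejected.
    The result is always a subset of the automatic output."""
    rejected = set(rejected)
    return [s for s in auto_sentences if s not in rejected]
```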
  10. MRSC - Augmentation
      Automatically and semi-automatically generated augmentations for the MRSC “Labour market”
  11. MRSC-based Video Retrieval
      ● Ad-Hoc Video Search (AVS) is a similar cross-modal retrieval problem
      ● Based on a state-of-the-art method for the AVS problem
      ● Video shots and free text are encoded into a joint feature space
      ● The most relevant video shots are retrieved by inputting an MRSC
      ● Trained with no migration-related video-caption pairs
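Once shots and query text live in the joint feature space, retrieval reduces to a similarity ranking. A minimal sketch, assuming the joint-space vectors are already computed by the trained encoders (all names are illustrative):

```python
import numpy as np

def rank_shots(query_vec, shot_vecs, shot_ids, top=5):
    """Rank video shots by cosine similarity to the (augmented) MRSC
    query vector in the joint embedding space."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-9)
    s = shot_vecs / (np.linalg.norm(shot_vecs, axis=1, keepdims=True) + 1e-9)
    scores = s @ q
    order = np.argsort(-scores)[:top]
    return [(shot_ids[i], float(scores[i])) for i in order]
```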
  12. MRSC-based Video Retrieval
      Method overview
  13. MIGRATION-VISUAL Dataset
      ● No domain-specific dataset is publicly available
      ● We created a dataset of migration-related videos and images
      ● Images and videos come from many traditional online sources, e.g., newspapers, magazines, journals, think-tanks, scientific and policy-making institutions, as well as social media platforms
      ● The dataset covers the events during which migrants tried to cross the Greek-Turkish border in February and March 2020
  14. MIGRATION-VISUAL Dataset
      ● The original collection contained a considerable amount of noise (e.g., banners and logos)
      ● Noisy data were removed
      ● 161,165 total images or keyframes
        ○ 110,209 still images
        ○ 50,956 keyframes from 19,232 video shots from 482 videos
  15. MIGRATION-VISUAL Dataset
      ● Ground-truth generation by human expert annotators, similar to TRECVID AVS
      ● For each MRSC, a ranked list of 1000 video shots or images is retrieved
      ● For each list, the human expert annotates:
        ○ 100% of the first 200 retrieved images or video shots
        ○ 20% of the remaining (randomly selected)
      ● Three different options:
        ○ positive (the image is relevant to the specified MRSC)
        ○ negative (the image is not relevant to this MRSC)
        ○ skipped (in case the annotator is uncertain)
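The annotation sampling scheme above can be sketched as follows (a minimal sketch; function and parameter names are illustrative, not from the authors' tooling):

```python
import random

def select_for_annotation(ranked_ids, head=200, tail_fraction=0.2, seed=0):
    """Pick the items a human expert annotates from one ranked list:
    all of the first `head` results, plus a random `tail_fraction`
    of the remaining ones."""
    head_ids = list(ranked_ids[:head])
    tail = list(ranked_ids[head:])
    rng = random.Random(seed)
    sampled = rng.sample(tail, round(len(tail) * tail_fraction))
    return head_ids + sampled
```

For a 1000-item list this yields 200 + 160 = 360 annotated items per MRSC.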
  16. Experiments and Results
      Experimental set-up
      ● Training datasets
        ○ TGIF & MSR-VTT
      ● Keyframe representation
        ○ ResNet-152 trained on ImageNet-11K
      ● Word embeddings
        ○ Word2Vec
        ○ BERT
      ● Evaluation datasets
        ○ TRECVID SIN 2015
        ○ MIGRATION-VISUAL
      ● Evaluation metric
        ○ Mean extended inferred average precision (MXinfAP)
  17. Experiments and Results
      Results of video shot retrieval on the SIN’15 dataset for 30 visual concepts, in terms of XinfAP
  18. Experiments and Results
      Results of video shot retrieval on the MIGRATION-VISUAL dataset for 10 MRSCs, in terms of XinfAP
  19. Experiments and Results
      Example results for five selected MRSCs (Politics, Urbanisation, War, Education, Work). The top-5 retrieved images or video shots are shown.
  20. Conclusion
      ● Video analysis to bridge the gap between MRSCs and visual content
      ● No need for manually-generated augmentations
      ● No need for a training corpus with MRSC ground-truth annotations
      ● Evaluation on a migration-related dataset and on a generic benchmark dataset
      ● Effective retrieval of MRSC-related visual content under real-world conditions
      ● The applicability of the method is not limited to migration-related content
      ● Understanding how multimedia content in social media relates to complex semantic concepts
      ● Better understanding of several social phenomena
  21. Contact details
      Dr. Vasileios Mezaris
      Information Technologies Institute - CERTH
      bmezaris@iti.gr
      www.iti.gr/~bmezaris
