MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015

•

0 likes•202 views

We describe the participation of the CERTH/CEA LIST team in the Placing Task of MediaEval 2015. We submitted five runs in total to the Locale-based placing sub-task, providing the estimated locations for the test set released by the organisers. Out of five runs, two are based solely on textual information, using feature selection and weighting methods over an existing language model-based approach. One is based on visual content, using geo-spatial clustering over the most visually similar images, and two runs are based on hybrid approaches, using both visual and textual cues from the images. The best results (median error 22km, 27.5% at 1km) were obtained when both visual and textual features are combined, using external data for training. http://ceur-ws.org/Vol-1436/ http://www.multimediaeval.org

Education

CERTH/CEA LIST at MediaEval Placing Task 2015
Giorgos Kordopatis-Zilos1, Adrian Popescu2, Symeon Papadopoulos1 and
Yiannis Kompatsiaris1
1 Information Technologies Institute (ITI), CERTH, Greece
2 CEA LIST, 91190 Gif-sur-Yvette, France
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

Summary
#2
Tag-based location estimation (2 runs)
• Based on a geographic Language Model
• Built upon the scheme of our 2014 participation [2] (Kordopatis-Zilos et
al., MediaEval 2014)
• Extensions from [3]: improved feature selection and weighting
(Kordopatis-Zilos et al., PAISI 2015)
Visual-based location estimation (1 run)
• Geospatial clustering scheme of the most visually similar images
Hybrid location estimation (2 run)
• Combination of the textual and visual approaches
Training sets
• Training set released by the organisers (≈4.7M geotagged items)
• YFCC dataset, excl. images from users in test set (≈40M geotagged items)

Tag-based location estimation
#3
• Processing steps of the approach
– Offline: language model construction
– Online: location estimation

Feature Selection and Weighting
#5
accuracy locality

Locality – value distribution
#8
london (6975), paris (5452), nyc (3917)
luminancehdr (0.0035), dsc6362 (0.003), air photo (0.002)

Extensions
• Spatial Entropy (SE) function
– calculate entropy values applying the Shannon entropy formula in the tag-cell
probabilities
– build a Gaussian weight function based on the values of the tag SE
#9
• Internal Grid
– Built an additional LM using a finer grid, cell side length of 0.001°
– combine the MLC of the individual language models

Runs and Results
#13
measure RUN-1 RUN-2 RUN-3 RUN-4 RUN-5
acc(1m) 0.15 0.01 0.15 0.16 0.16
acc(10m) 0.61 0.08 0.62 0.75 0.76
acc(100m) 6.40 1.76 6.52 7.73 7.83
acc(1km) 24.33 5.19 24.61 27.30 27.54
acc(10km) 43.07 7.43 43.41 46.48 46.77
m. error (km) 69 5663 61 24 22
RUN-1: Tag-based location estimation + released training set
RUN-2: Visual-based location estimation + released training set
RUN-3: Hybrid location estimation + released training set
RUN-4: Tag-based location estimation + YFCC dataset
RUN-5: Hybrid location estimation + YFCC dataset

Thank you!
• Code:
https://github.com/MKLab-ITI/multimedia-geotagging
• Get in touch:
@sympapadopoulos / papadop@iti.gr
@georgekordopatis / georgekordopatis@iti.gr
#14

References
#15
[1] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama,
and T. Darrell. Caffe: Convolutional architecture for fast feature embedding.
arXiv preprint arXiv:1408.5093, 2014.
[2] G. Kordopatis-Zilos, G. Orfanidis, S. Papadopoulos, and Y. Kompatsiaris.
Socialsensor at mediaeval placing task 2014. In MediaEval 2014 Placing Task,
2014.
[3] G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social
media content with a refined language modelling approach. In Intelligence and
Security Informatics, pages 21–40, 2015.
[4] A. Popescu. CEA LIST's participation at mediaeval 2013 placing task. In
MediaEval 2013 Placing Task, 2013.
[5] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-
scale image recognition. In International Conference on Learning
Representations, 2015.
[6] O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources
using language models and similarity search. ICMR ’11, pages 48:1–48:8, New
York, NY, USA, 2011. ACM.

Similar to MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015

CERTH/CEA LIST at MediaEval Placing Task 2015

Symeon Papadopoulos

Presenter: Giorgos Kordopatis-Zilos Placing Images with Refined Language Models and Similarity Search with PCA-reduced VGG Features In Working Notes Proceedings of the MediaEval 2016 Workshop, Hilversum, Netherlands, October 20-21, CEUR-WS.org (2016) by Giorgos Kordopatis-Zilos, Adrian Popescu, Symeon Papadopoulos, Yiannis Kompatsiaris Paper: http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_13.pdf Video: https://youtu.be/WR4I3CWjcR4 Abstract: We describe the participation of the CERTH/CEA-LIST team in the MediaEval 2016 Placing Task. We submitted five runs to the estimation-based sub-task: one based only on text by employing a Language Model-based approach with several refinements, one based on visual content, using geospatial clustering over the most visually similar images, and three based on a hybrid scheme exploiting both visual and textual cues from the multimedia items, trained on datasets of different size and origin. The best results were obtained by a hybrid approach trained with external training data and using two publicly available gazetteers.

MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...

multimediaeval

Placing Images with Refined Language Models and Similarity Search with PCA-re...

Symeon Papadopoulos

Presenter: Jaeyoung Choi The Placing Task at MediaEval 2016 In Working Notes Proceedings of the MediaEval 2016 Workshop, Hilversum, Netherlands, October 20-21, CEUR-WS.org (2016) by Jaeyoung Choi, Claudia Hauff, Olivier Van Laere, Bart Thomee Paper: http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_7.pdf Video: https://youtu.be/8646sUGqMbg Abstract: The seventh edition of the Placing Task at MediaEval focuses on two challenges: (1) estimation-based placing, which addresses estimating the geographic location where a photo or video was taken, and (2) verification-based placing, which addresses verifying whether a photo or video was indeed taken at a pre-specified geographic location. Like the previous edition, we made the organizer baselines for both subtasks available as open source code, and published a live leaderboard that allows the participants to gain insights into the effectiveness of their approaches compared to the official baselines and in relation to each other at an early stage, before the actual run submissions are due.

MediaEval 2016 - Placing Task Overview

multimediaeval

The sixth edition of the Placing Task at MediaEval introduces two new sub-tasks: (1) locale-based placing, which emphasizes the need to move away from an evaluation purely based on latitude and longitude towards an entity-centered evaluation, and (2) mobility-based placing, which addresses predicting missing locations within a sequence of movements; the latter is a specific real-world use case that so far has received little attention within the research community. Two additional changes over the previous years are the introduction of open source organizer baselines for both sub-tasks shortly after the official data release, and the implementation of a live leaderboard, which allows the participants to gain insights into the effectiveness of their approaches compared to the official baselines and in relation to each other at an early stage, before the actual run submissions are due. http://ceur-ws.org/Vol-1436/ http://www.multimediaeval.org

MediaEval 2015 - The Placing Task at MediaEval 2015

multimediaeval

ITSC2015 http://www.itsc2015.org/ The paper presents a fine-grained walking activity recognition toward an inferring pedestrian intention which is an important topic to predict and avoid a pedestrian’s dangerous activity. The fine-grained activity recognition is to distinguish different activities between subtle changes such as walking with different directions. We believe a change of pedestrian’s activity is significant to grab a pedestrian intention. However, the task is challenging since a couple of reasons, namely (i) in-vehicle mounted camera is always moving (ii) a pedestrian area is too small to capture a motion and shape features (iii) change of pedestrian activity (e.g. walking straight into turning) has only small feature difference. To tackle these problems, we apply vision-based approach in order to classify pedestrian activities. The dense trajectories (DT) method is employed for high-level recognition to capture a detailed difference. Moreover, we additionally extract detection-based region-of-interest (ROI) for higher performance in fine-grained activity recognition. Here, we evaluated our proposed approach on “self-collected dataset” and “near-miss driving recorder (DR) dataset” by dividing several activities– crossing, walking straight, turning, standing and riding a bicycle. Our proposal achieved 93.7% on the self-collected NTSEL traffic dataset and 77.9% on the near-miss DR dataset.

【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset

Hirokatsu Kataoka

Can the Multimodal Real Time Information Systems Induce a More Sustainable Mo...

IEA-ETSAP

MEILI: a travel diary collection, annotation and automation system

Adrian C. Prelipcean

Icl gnss helsinki-2014_lisi_v02

Marco Lisi

Precision Viticulture R&D by Charlotte Wyatt

COGS Presentations

visaap apresentacao.pptx

Federal University of Pernambuco

Flickr Geotagged and Publicly Available Photos: Preliminary Study of Its Adeq...

Jacinto Estima

IFPRI-Role of technology in PMFBY-SS Ray

International Food Policy Research Institute- South Asia Office

ISVC2015 paper http://www.hirokatsukataoka.net/pdf/isvc15_kataoka_dt13feature.pdf Activity recognition has been an active research topic in computer vision. Recently, the most successful approaches use dense trajectories that extract a large number of trajectories and features on the trajectories into a codeword. In this paper, we evaluate various features in the framework of dense trajectories on several types of datasets. We implement 13 features in total by including five different types of descriptor, namely motion-, shape-, texture- trajectory- and co-occurrence-based feature descriptors. The experimental results show a relationship between feature descriptors and performance rate at each dataset. Different scenes of traffic, surgery, daily living and sports are used to analyze the feature characteristics. Moreover, we test how much the performance rate of concatenated vectors depends on the type, top-ranked in experiment and all 13 feature descriptors on fine-grained datasets. Feature evaluation is beneficial not only in the activity recognition problem, but also in other domains in spatio-temporal recognition.

【ISVC2015】Evaluation of Vision-based Human Activity Recognition in Dense Traj...

Hirokatsu Kataoka

Despite being around for almost two decades, footmounted inertial navigation only has gotten a limited spread. Contributing factors to this are lack of suitable hardware platforms and difficult system integration. As a solution to this, we present an open-source wireless foot-mounted inertial navigation module with an intuitive and significantly simplified dead reckoning interface. The interface is motivated from statistical properties of the underlying aided inertial navigation and argued to give negligible information loss. The module consists of both a hardware platform and embedded software. Details of the platform and the software are described, and a summarizing description of how to reproduce the module are given. System integration of the module is outlined and finally, we provide a basic performance assessment of the module. In summary, the module provides a modularization of the foot-mounted inertial navigation and makes the technology significantly easier to use.

IPIN'14: Foot-Mounted Inertial Navigation Made Easy

oblu.io

mO4Rivers-CitizenScience for Rivers

Nuno Charneca

PCA and Classification

Fatwa Ramdani

Horizon DTC Integrator Event

Jeremy Morley

Applied GIS

Dr Zubairul Islam

سمینار آموزشی ارزیابی قابلیت پیاده روی در شهر لیسبون پرتقال

Hamid Moradi

Similar to MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015 (20)

CERTH/CEA LIST at MediaEval Placing Task 2015

MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...

Placing Images with Refined Language Models and Similarity Search with PCA-re...

MediaEval 2016 - Placing Task Overview

MediaEval 2015 - The Placing Task at MediaEval 2015

【ITSC2015】Fine-grained Walking Activity Recognition via Driving Recorder Dataset

Can the Multimodal Real Time Information Systems Induce a More Sustainable Mo...

MEILI: a travel diary collection, annotation and automation system

Icl gnss helsinki-2014_lisi_v02

Precision Viticulture R&D by Charlotte Wyatt

visaap apresentacao.pptx

Flickr Geotagged and Publicly Available Photos: Preliminary Study of Its Adeq...

IFPRI-Role of technology in PMFBY-SS Ray

【ISVC2015】Evaluation of Vision-based Human Activity Recognition in Dense Traj...

IPIN'14: Foot-Mounted Inertial Navigation Made Easy

mO4Rivers-CitizenScience for Rivers

PCA and Classification

Horizon DTC Integrator Event

Applied GIS

سمینار آموزشی ارزیابی قابلیت پیاده روی در شهر لیسبون پرتقال

More from multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper62.pdf YouTube: https://youtu.be/gV-rvV3iFDA Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri and Julien Morlier : Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal CNN for MediaEval 2020. Proc. of MediaEval 2020, 14-15 December 2020, Online. This work presents a method for classifying table tennis strokes using spatio-temporal convolutional neural networks. The fine-grained classification is performed on trimmed video segments recorded at 120 fps with different players performing in natural conditions. From those segments, the frames are extracted, their optical flow is computed and the pose of the player is estimated. From the optical flow amplitude, a region of interest is inferred. A three stream spatio-temporal convolutional neural network using combination of those modalities and 3D attention mechanisms is presented in order to perform classification. Presented by: Pierre-Etienne Martin

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper50.pdf Hai Nguyen-Truong, San Cao, N. A. Khoa Nguyen, Bang-Dang Pham, Hieu Dao, Minh-Quan Le, Hoang-Phuc Nguyen-Dinh, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table Tennis Strokes Classification Task. Proc. of MediaEval 2020, 14-15 December 2020, Online. The Sports Video Classification Tasks in the Multimedia Evaluation 2020 Challenge focuses on classifying different types of table tennis strokes in video segments. In this task, we - the HCMUS Team - perform multiple experiments, which includes a combination of models such as SlowFast, Optical Flow, DensePose, R2+1, Channel-Separated Convolutional Networks, to classify 21 types of table tennis strokes from video segments. In total, we submit eight runs corresponding to five different models with different sets of hyper-parameters in each of our models. In addition, we apply some pre-processing techniques on the dataset in order for our model to learn and classify more accurately. According to the evaluation results, one of our team's methods out-performs the other team's. In particular, our best run achieves 31.35\% global accuracy, and all of our methods show potential results in terms of local and global accuracy for action recognition tasks.

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper2.pdf YouTube: https://youtu.be/-bRL868b8ys Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri, Laurent Mascarilla, Jordan Calandre and Julien Morlier : Sports Video Classification: Classification of Strokes in Table Tennis for MediaEval 2020. Proc. of MediaEval 2020, 14-15 December 2020, Online. Fine-grained action classification has raised new challenges compared to classical action classification problems. Sport video analysis is a very popular research topic, due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests, up to analysis of athletes' performances. Running since 2019 as a part of MediaEval, we offer a task which consists in classifying table tennis strokes from videos recorded in natural conditions at the University of Bordeaux. The aim is to build tools for teachers, coaches and players to analyse table tennis games. Such tools could lead to an automatic profiling of the player and adaptation of his training for improving his/her sport skills more efficiently. Presented by: Pierre-Etienne Martin

Sports Video Classification: Classification of Strokes in Table Tennis for Me...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper61.pdf YouTube: https://youtu.be/brmI4g3jLS4 Ricardo Kleinlein, Cristina Luna-Jiménez, Fernando Fernández-Martínez and Zoraida Callejas : Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention and LSTM Models. Proc. of MediaEval 2020, 14-15 December 2020, Online. This paper reports on the GTH-UPM team experience in the Predicting Media Memorability task at MediaEval 2020. Teams were requested to predict memorability scores at both short-term and long-term, understanding such score as a measure of whether a video was perdurable in a viewer's memory or not. Our proposed system relies on a late fusion of the scores predicted by three sequential models, each trained over a different modality: video captions, aural embeddings and visual optical flow-based vectors. Whereas single-modality models show a low or zero Spearman correlation coefficient value, their combination considerably boosts performance over development data up to 0.2 in the short-term memorability prediction subtask and 0.19 in the long-term subtask. However, performance over test data drops to 0.016 and -0.041, respectively. Presented by: Ricardo Kleinlein

Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper52.pdf Janadhip Jacutprakart, Rukiye Savran Kiziltepe, John Q. Gan, Giorgos Papanastasiou and Alba G. Seco de Herrera : Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task. Proc. of MediaEval 2020, 14-15 December 2020, Online. In this paper, we present the methods of approach and the main results from the Essex NLIP Team’s participation in the MediEval 2020 Predicting Media Memorability task. The task requires participants to build systems that can predict short-term and long-term memorability scores on real-world video samples provided. The focus of our approach is on the use of colour-based visual features as well as the use of the video annotation meta-data. In addition, hyper-parameter tuning was explored. Besides the simplicity of the methodology, our approach achieves competitive results. We investigated the use of different visual features. We assessed the performance of memorability scores through various regression models where Random Forest regression is our final model, to predict the memorability of videos.

Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper6.pdf YouTube: https://youtu.be/ySGGu_4vaxs Alba García Seco De Herrera, Rukiye Savran Kiziltepe, Jon Chamberlain, Mihai Gabriel Constantin, Claire-Hélène Demarty, Faiyaz Doctor, Bogdan Ionescu and Alan F. Smeaton : Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a Video Memorable? Proc. of MediaEval 2020, 14-15 December 2020, Online. This paper describes the MediaEval 2020 Predicting Media Memorability task. After first being proposed at MediaEval 2018, the Predicting Media Memorability task is in its 3rd edition this year, as the prediction of short-term and long-term video memorability (VM) remains a challenging task. In 2020, the format remained the same as in previous editions. This year the videos are a subset of the TRECVid 2019 Video to Text dataset, containing more action rich video content as compare with the 2019 task. In this paper a description of some aspects of this task is provided, including its main characteristics, a description of the collection, the ground truth dataset, evaluation metrics and the requirements for the run submission. Presented by: Rukiye Savran Kiziltepe

Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a V...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper45.pdf Benoit Bonnet, Teddy Furon and Patrick Bas : Fooling an Automatic Image Quality Estimator. Proc. of MediaEval 2020, 14-15 December 2020, Online. In this paper we present our work on the 2020 MediaEval task: Pixel "Privacy: Quality Camouflage for Social Images". Blind Image Quality Assessment (BIQA) is a classifier that for any given image will return a quality score. Our task is to modify an image to decrease its BIQA score while maintaining a good perceived quality. Since BIQA is a deep neural network, we worked on an adversarial attack approach of the problem.

Fooling an Automatic Image Quality Estimator

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper16.pdf YouTube: https://youtu.be/ix_b9K7j72w Zhengyu Zhao : Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable Color Filter. Proc. of MediaEval 2020, 14-15 December 2020, Online. This paper presents the submission of our RU-DS team to the Pixel Privacy Task 2020. We propose to fool the blind image quality assessment model by transforming images based on optimizing a human-understandable color filter. In contrast to the common work that relies on small, $L_p$-bounded additive pixel perturbations, our approach yields large yet smooth perturbations. Experimental results demonstrate that in the specific context of this task, our approach is able to achieve strong adversarial effects, but has to sacrifice the image appeal. Presented by: Zhengyu Zhao

Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable C...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper77.pdf YouTube: https://youtu.be/8Rr4KknGSac Zhuoran Liu, Zhengyu Zhao, Martha Larson and Laurent Amsaleg : Pixel Privacy: Quality Camouflage for Social Images. Proc. of MediaEval 2020, 14-15 December 2020, Online. High-quality social images shared online can be misappropriated for unauthorized goals, where the quality filtering step is commonly carried out by automatic Blind Image Quality Assessment (BIQA) algorithms. Pixel Privacy benchmarks privacy-protective approaches that protect privacy-sensitive images against unethical computer vision algorithms. In the 2020 task, participants are encouraged to develop camouflage methods that can effectively decrease the BIQA quality score of high-quality images and maintain image appeal. The camouflaged images need to be either imperceptible to the human eye, or it can be a visible enhancement. Presented by: Zhuoran Liu

Pixel Privacy: Quality Camouflage for Social Images

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper73.pdf YouTube: https://youtu.be/TadJ6y7xZeA Thuc Nguyen-Quang, Tuan-Duy Nguyen, Thang-Long Nguyen-Ho, Anh-Kiet Duong, Xuan-Nhat Hoang, Vinh-Thuyen Nguyen-Truong, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matching. Proc. of MediaEval 2020, 14-15 December 2020, Online. Matching text and images based on their semantics has an important role in cross-media retrieval. However, text and images in articles have a complex connection. In the context of MediaEval 2020 Challenge, we propose three multi-modal methods for mapping text and images of news articles to the shared space in order to perform efficient cross-retrieval. Our methods show systemic improvement and validate our hypotheses, while the best-performed method reaches a recall@100 score of 0.2064. Presented by: Thuc Nguyen-Quang

HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matching

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper72.pdf Sabarinathan D and Suganya Ramamoorthy : Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attention Unit. Proc. of MediaEval 2020, 14-15 December 2020, Online. Colorectal cancer is the third most common cause of cancer worldwide. In the era of medical Industry, identifying colorectal cancer in its early stages has been a challenging problem. Inspired by these issues, the main objective of this paper is to develop a Multi supervision net algorithm for segmenting polys on a comprehensive dataset. The risk of colorectal cancer could be reduced by early diagnosis of poly during a colonoscopy. The disease and their symptoms are highly varying and always a need for a continuous update of knowledge for the doctors and medical analyst. The diseases fall into different categories and a small variation of symptoms may lead to higher rate of risk. We have taken Medico polyp challenge dataset, which consists of 1000 segmented polyp images from gastrointestinal track. We proposed an efficient Net B4 as a pre-trained architecture in multi-supervision net. The model is trained with multiple output layers. We present quantitative results on colorectal dataset to evaluate the performance and achieved good results in all the performance metrics. The experimental results proved that the proposed model is robust and provides a good level of accuracy in segmenting polyps on a comprehensive dataset for different metrics such as Dice coefficient, Recall, Precision and F2.

Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attentio...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper47.pdf YouTube: https://youtu.be/vMsM4zg2-JY Tien-Phat Nguyen, Tan-Cong Nguyen, Gia-Han Diep, Minh-Quan Le, Hoang-Phuc Nguyen-Dinh, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ for Polyps Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online. The Medico task, MediaEval 2020, explores the challenge of building accurate and high-performance algorithms to detect all types of polyps in endoscopic images. We proposed different approaches leveraging the advantages of either ResUnet++ or PraNet model to efficiently segment polyps in colonoscopy images, with modifications on the network structure, parameters, and training strategies to tackle various observed characteristics of the given dataset. Our methods outperform the other teams' methods, for both accuracy and efficiency. After the evaluation, we are at top 2 for task 1 (with Jaccard index of 0.777, best Precision and Accuracy scores) and top 1 for task 2 (with 67.52 FPS and Jaccard index of 0.658).

HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ ...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper31.pdf Syed Muhammad Faraz Ali, Muhammad Taha Khan, Syed Unaiz Haider, Talha Ahmed, Zeshan Khan and Muhammad Atif Tahir : Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Intestinal Tract. Proc. of MediaEval 2020, 14-15 December 2020, Online. Identification of polyps in endoscopic images is critical for the diagnosis of colon cancer. Finding the exact shape and size of polyps requires the segmentation of endoscopic images. This research explores the advantage of using depth-wise separable convolution in the atrous convolution of the ResUNet++ architecture. Deep atrous spatial pyramid pooling was also implemented on the ResUNet++ architecture. The results show that architecture with separable convolution has a smaller size and fewer GFLOPs without degrading the performance too much.

Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Int...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper22.pdf Debapriya Banik and Debotosh Bhattacharjee : Deep Conditional Adversarial learning for polyp Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online. This approach has addressed the Medico automatic polyp segmentation challenge which is a part of Mediaeval 2020. We have proposed a deep conditional adversarial learning based network for the automatic polyp segmentation task. The network comprises of two interdependent models namely a generator and a discriminator. The generator network is a FCN employed for the prediction of the polyp mask while the discriminator enforces the segmentation to be as similar as the real segmented mask (ground truth). Our proposed model achieved a comparative result on the test dataset provided by the organizers of the challenge.

Deep Conditional Adversarial learning for polyp Segmentation

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper21.pdf Hwang Maxwell, Wu Cai, Hwang Kao-Shing, Xu Yong Si and Wu Chien-Hsing : A Temporal-Spatial Attention Model for Medical Image Detection. Proc. of MediaEval 2020, 14-15 December 2020, Online. A local region model with attentive temporal-spatial pathways is proposed for automatically learning various target structures. The attentive spatial pathway highlights the salient region to generate bounding boxes and ignores irrelevant regions in an input image. The proposed attention mechanism allows efficient object localization and the overall predictive performance is increased because there are fewer false positives for the object detection task for medical images with manual annotations. The experimental results show that proposed models consistently increase the base architectures' predictive performance for different datasets and training sizes without undue computational efficiency.

A Temporal-Spatial Attention Model for Medical Image Detection

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper20.pdf YouTube: https://youtu.be/CVelQl5Luf0 Quoc-Huy Trinh, Minh-Van Nguyen, Thiet-Gia Huynh and Minh-Triet Tran : HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Network and UNet for Polyps Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online. The Medico: Multimedia Task focuses on developing an efficient and accurate framework to computer-aided diagnosis systems for automatic polyp segmentation to detect all types of polyps in endoscopic images of the gastrointestinal (GI) tract. We are HCMUS-team approach a solution, which includes combination Residual module, Inception module, Adaptive Convolutional neural network with Unet model and PraNet to semantic segmentation all types of polyps in endoscopic images. We submit multiple runs with different architecture and parameters in our model. Our methods show potential results in accuracy and efficiency through multiple experiments.

HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Netw...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper15.pdf Rabindra Khadka : Transfer of Knowledge: Fine-tuning for Polyp Segmentation with Attention. Proc. of MediaEval 2020, 14-15 December 2020, Online. This paper describes how the transfer of prior knowledge can effectively take on segmentation tasks with the help of attention mechanisms. The UNet model pretrained on brain MRI dataset was fine-tuned with the polyp dataset. Attention mechanism was integrated to focus on relevant regions in the input images. The implemented architecture is evaluated on 200 validation images based on intersection over union and dice score between groundtruth and predicted region. The model demonstrates a promising result with computational efciency.

Fine-tuning for Polyp Segmentation with Attention

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper12.pdf Adrian Krenzer and Frank Puppe : Bigger Networks are not Always Better: Deep Convolutional Neural Networks for Automated Polyp Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online. This paper presents our team's (AI-JMU) approach to the Medico automated polyp segmentation challenge. We consider deep convolutional neural networks to be well suited for this task. To determine the best architecture we test and compare state of the art backbones and two different heads. Finally we achieve a Jaccard index of 73.74\% on the challenge test set. We further demonstrate that bigger networks do not always perform better. However the growing network size always increases the computational complexity.

Bigger Networks are not Always Better: Deep Convolutional Neural Networks for...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper51.pdf Amel Ksibi, Amina Salhi, Ala Alluhaidan and Sahar A. El-Rahman : Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach. Proc. of MediaEval 2020, 14-15 December 2020, Online. Providing air pollution information to individuals enables them to understand the air quality of their living environments. Thus, the association between people’s wellbeing and the properties of the surrounding environment is an essential area of investigation. This paper proposes Air Quality Prediction through harvesting public/open data and leveraging them to get the Personal Air Quality index. These are usually incomplete. To cope with the problem of missing data, we applied the KNN imputation method. To predict Personal Air Quality Index, we apply a voting regression approach based on three base regressors which are Gradient Boosting regressor, Random Forest regressor, and linear regressor. Evaluating the experimental results using the RMSE metric, we got an average score of 35.39 for Walker and 51.16 for Car.

Insights for wellbeing: Predicting Personal Air Quality Index using Regressio...

multimediaeval

Paper: http://ceur-ws.org/Vol-2882/paper40.pdf YouTube: https://youtu.be/SL5Hvu1mARY Trung-Quan Nguyen, Dang-Hieu Nguyen and Loc Tai Tan Nguyen : Use Visual Features From Surrounding Scenes to Improve Personal Air Quality Data Prediction Performance. Proc. of MediaEval 2020, 14-15 December 2020, Online. In this paper, we propose a method to predict the personal air quality index in an area by using the combination of the levels of the following pollutants: PM2.5, NO2, and O3, measured from the nearby weather stations of that area, and the photos of surrounding scenes taken at that area. Our approach uses the Inverse Distance Weighted (IDW) technique to estimate the missing air pollutant levels and then use regression to integrate visual features from taken photos to optimize the predicted values. After that, we can use those values to calculate the Air Quality Index (AQI). The results show that the proposed method may not improve the performance of the prediction in some cases.

Use Visual Features From Surrounding Scenes to Improve Personal Air Quality ...

multimediaeval

More from multimediaeval (20)

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal...

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table...

Sports Video Classification: Classification of Strokes in Table Tennis for Me...

Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention...

Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task

Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a V...

Fooling an Automatic Image Quality Estimator

Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable C...

Pixel Privacy: Quality Camouflage for Social Images

HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matching

Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attentio...

HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ ...

Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Int...

Deep Conditional Adversarial learning for polyp Segmentation

A Temporal-Spatial Attention Model for Medical Image Detection

HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Netw...

Fine-tuning for Polyp Segmentation with Attention

Bigger Networks are not Always Better: Deep Convolutional Neural Networks for...

Insights for wellbeing: Predicting Personal Air Quality Index using Regressio...

Use Visual Features From Surrounding Scenes to Improve Personal Air Quality ...

Recently uploaded

Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...

EADTU

Wellbeing inclusion and digital dystopias.pptx

Jisc

Jamworks pilot and AI at Jisc (20/03/2024)

Jisc

COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx

annathomasp01

Introduction to TechSoup’s Digital Marketing Services and Use Cases

TechSoup

Observing-Correct-Grammar-in-Making-Definitions.pptx

AdelaideRefugio

21st_Century_Skills_Framework_Final_Presentation_2.pptx

JoelynRubio1

REMIFENTANIL: An Ultra short acting opioid.pptx

Dr. Ravikiran H M Gowda

HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx

marlenawright1

Towards a code of practice for AI in AT.pptx

Jisc

Play hard learn harder: The Serious Business of Play

Pooky Knightsmith

dusjagr & nano talk on open tools for agriculture research and learning

Marc Dusseiller Dusjagr

TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...

Nguyen Thanh Tu Collection

AIM of Education-Teachers Training-2024.ppt

Nishitharanjan Rout

A procedure when multiple vendors are asked to submit proposals for a supply contract or project. It helps businesses to assess offers and select the best vendor based on factors like cost, quality, and other relevant factors. In Odoo 17, the Call for Tenders feature has undergone significant improvements, making procurement process management more efficient and seamless. Accessing and managing this feature can still be done through the same interface as Blanket Orders, with the addition of creating an exclusive agreement type for Call for Tenders.

How to Manage Call for Tendor in Odoo 17

Celine George

OS-operating systems- ch05 (CPU Scheduling) ...

Dr. Mazin Mohamed alkathiri

NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...

Amil baba

Program code examples (known also as worked examples) play a crucial role in learning how to program. Instructors use examples extensively to demonstrate the semantics of the programming language being taught and to highlight the fundamental coding patterns. Programming textbooks allocate considerable space to present and explain code examples. To make the process of studying code examples more interactive, CS education researchers developed a range of tools to engage students in the study of code examples. These tools include codecasts (codemotion,codecast,elicasts), interactive example explorers (WebEx, PCEX), and tutoring systems (DeepTutor). An important component in all types of worked examples is code explanations associated with specific code lines or code chunks of an example. The explanations connect examples with general programming knowledge explaining the role and function of code fragments or their behavior. In textbooks, these explanations are usually presented as comments in the code or as explanations on the margins. The example explorer tools allow students to examine these explanations interactively. Tutoring systems, which engage students in explaining the code, use these model explanations to check student responses and provide scaffolding. In all these cases, to make a worked example re-usable beyond its presentation in a lecture, the explanations have to be authored by instructors or domain experts i.e., produced and integrated into a specific system. As the experience of the last 10 years demonstrated, these explanations are hard to obtain. Those already collected are usually “locked” in a specific example-focused system and can’t be reused. The purpose of this working group is to support broader re-used of worked examples augmented with explanations. Our current plan is to develop а standard approach to represent explained examples. This approach will enable an example created for any of the existing systems to be explored in a standard format and imported into any other example-focused system. We plan to follow a successful experience of the PEML working group focused on re-using programming exercises.

SPLICE Working Group:Reusable Code Examples

Peter Brusilovsky

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx

Esquimalt MFRC

APM webinar hosted by the Scotland Network on 14 May 2024. Speakers: Chris Drysdale and Peter Huggett An interactive session discussing how Project Managers can identify mental health symptoms, provide tools to help themselves and others, plus also increase the capabilities of the Project Management function. This webinar was held on 14 May 2024. The covid-19 pandemic led to concerns about a worsening of mental health & wellbeing across the world and increased awareness in both society and the workplace. This webinar looks to advise the benefits of having a Mental Health First Aid function in the workplace whilst also providing tools and techniques that can be readily used and applied to yourself and colleagues. Additionally, there are wider benefits to Project Management which will be proposed and discussed.

Including Mental Health Support in Project Delivery, 14 May.pdf

Association for Project Management

Recently uploaded (20)

Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...

Wellbeing inclusion and digital dystopias.pptx

Jamworks pilot and AI at Jisc (20/03/2024)

COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx

Introduction to TechSoup’s Digital Marketing Services and Use Cases

Observing-Correct-Grammar-in-Making-Definitions.pptx

21st_Century_Skills_Framework_Final_Presentation_2.pptx

REMIFENTANIL: An Ultra short acting opioid.pptx

HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx

Towards a code of practice for AI in AT.pptx

Play hard learn harder: The Serious Business of Play

dusjagr & nano talk on open tools for agriculture research and learning

TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...

AIM of Education-Teachers Training-2024.ppt

How to Manage Call for Tendor in Odoo 17

OS-operating systems- ch05 (CPU Scheduling) ...

NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...

SPLICE Working Group:Reusable Code Examples

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx

Including Mental Health Support in Project Delivery, 14 May.pdf

MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015

1. CERTH/CEA LIST at MediaEval Placing Task 2015 Giorgos Kordopatis-Zilos1, Adrian Popescu2, Symeon Papadopoulos1 and Yiannis Kompatsiaris1 1 Information Technologies Institute (ITI), CERTH, Greece 2 CEA LIST, 91190 Gif-sur-Yvette, France MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

2. Summary #2 Tag-based location estimation (2 runs) • Based on a geographic Language Model • Built upon the scheme of our 2014 participation [2] (Kordopatis-Zilos et al., MediaEval 2014) • Extensions from [3]: improved feature selection and weighting (Kordopatis-Zilos et al., PAISI 2015) Visual-based location estimation (1 run) • Geospatial clustering scheme of the most visually similar images Hybrid location estimation (2 run) • Combination of the textual and visual approaches Training sets • Training set released by the organisers (≈4.7M geotagged items) • YFCC dataset, excl. images from users in test set (≈40M geotagged items)

3. Tag-based location estimation #3 • Processing steps of the approach – Offline: language model construction – Online: location estimation

4. Language Model (LM) #4

5. Feature Selection and Weighting #5 accuracy locality

6. Accuracy #6 Estimated Locations

7. Locality #7

8. Locality – value distribution #8 london (6975), paris (5452), nyc (3917) luminancehdr (0.0035), dsc6362 (0.003), air photo (0.002)

9. Extensions • Spatial Entropy (SE) function – calculate entropy values applying the Shannon entropy formula in the tag-cell probabilities – build a Gaussian weight function based on the values of the tag SE #9 • Internal Grid – Built an additional LM using a finer grid, cell side length of 0.001° – combine the MLC of the individual language models

10. Visual-based location estimation #10

11. Hybrid-based location estimation #11

12. Confidence #12

13. Runs and Results #13 measure RUN-1 RUN-2 RUN-3 RUN-4 RUN-5 acc(1m) 0.15 0.01 0.15 0.16 0.16 acc(10m) 0.61 0.08 0.62 0.75 0.76 acc(100m) 6.40 1.76 6.52 7.73 7.83 acc(1km) 24.33 5.19 24.61 27.30 27.54 acc(10km) 43.07 7.43 43.41 46.48 46.77 m. error (km) 69 5663 61 24 22 RUN-1: Tag-based location estimation + released training set RUN-2: Visual-based location estimation + released training set RUN-3: Hybrid location estimation + released training set RUN-4: Tag-based location estimation + YFCC dataset RUN-5: Hybrid location estimation + YFCC dataset

14. Thank you! • Code: https://github.com/MKLab-ITI/multimedia-geotagging • Get in touch: @sympapadopoulos / papadop@iti.gr @georgekordopatis / georgekordopatis@iti.gr #14

15. References #15 [1] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014. [2] G. Kordopatis-Zilos, G. Orfanidis, S. Papadopoulos, and Y. Kompatsiaris. Socialsensor at mediaeval placing task 2014. In MediaEval 2014 Placing Task, 2014. [3] G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social media content with a refined language modelling approach. In Intelligence and Security Informatics, pages 21–40, 2015. [4] A. Popescu. CEA LIST's participation at mediaeval 2013 placing task. In MediaEval 2013 Placing Task, 2013. [5] K. Simonyan and A. Zisserman. Very deep convolutional networks for large- scale image recognition. In International Conference on Learning Representations, 2015. [6] O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources using language models and similarity search. ICMR ’11, pages 48:1–48:8, New York, NY, USA, 2011. ACM.

MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015

Recommended

Recommended

More Related Content

Similar to MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015

Similar to MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015 (20)

More from multimediaeval

More from multimediaeval (20)

Recently uploaded

Recently uploaded (20)

MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015