Paper: http://ceur-ws.org/Vol-2882/paper6.pdf
YouTube: https://youtu.be/ySGGu_4vaxs
Alba García Seco De Herrera, Rukiye Savran Kiziltepe, Jon Chamberlain, Mihai Gabriel Constantin, Claire-Hélène Demarty, Faiyaz Doctor, Bogdan Ionescu and Alan F. Smeaton : Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a Video Memorable? Proc. of MediaEval 2020, 14-15 December 2020, Online.
This paper describes the MediaEval 2020 Predicting Media Memorability task. After first being proposed at MediaEval 2018, the Predicting Media Memorability task is in its 3rd edition this year, as the prediction of short-term and long-term video memorability (VM) remains a challenging task. In 2020, the format remained the same as in previous editions. This year the videos are a subset of the TRECVid 2019 Video to Text dataset, containing more action-rich video content compared with the 2019 task. This paper describes the main aspects of the task, including its main characteristics, the video collection, the ground-truth dataset, the evaluation metrics and the requirements for the run submission.
Presented by: Rukiye Savran Kiziltepe
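The task's runs have traditionally been ranked with Spearman's rank correlation between predicted and ground-truth memorability scores (the GTH-UPM abstract further down also reports this coefficient). A minimal scoring sketch with SciPy, using made-up scores for illustration:

```python
# Minimal sketch: scoring a memorability run with Spearman's rank correlation.
# The toy score lists below are illustrative, not real task data.
from scipy.stats import spearmanr

def score_run(predicted, ground_truth):
    """Return Spearman's rho between predicted and true memorability scores."""
    rho, _p_value = spearmanr(predicted, ground_truth)
    return rho

pred = [0.91, 0.73, 0.88, 0.60, 0.79]   # system output (e.g. short-term scores)
true = [0.95, 0.70, 0.85, 0.55, 0.80]   # ground-truth annotations
print(f"Spearman rho = {score_run(pred, true):.3f}")
```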
The VTT Task has been running at TRECVID to benchmark systems that automatically describe short videos in one sentence (i.e., video captioning). Videos come from the Vimeo Creative Commons dataset (V3C).
The Search and Hyperlinking Task at MediaEval 2014 (multimediaeval)
The Search and Hyperlinking Task at MediaEval 2014 is the third edition of this task. As in previous editions, it consisted of two sub-tasks: (i) answering search queries from a collection of roughly 2700 hours of BBC broadcast TV material, and (ii) linking anchor segments from within the videos to other target segments within the video collection. For MediaEval 2014, both sub-tasks were based on an ad-hoc retrieval scenario and were evaluated using a pooling procedure across participants' submissions, with crowdsourced relevance assessment using Amazon Mechanical Turk.
Peter Muschick MSc thesis
Universitat Politècnica de Catalunya, 2020
Sign language recognition and translation has been an active research field in recent years, with most approaches using deep neural networks to extract information from sign language data. This work investigates the mostly disregarded approach of using human keypoint estimation from image and video data with OpenPose in combination with a transformer network architecture. It was shown that it is possible to recognize individual signs (4.5% word error rate (WER)). Continuous sign language recognition, however, was more error prone (77.3% WER), and sign language translation was not possible using the proposed methods, which might be due to the low accuracy of human keypoint estimation by OpenPose and the accompanying loss of information, or insufficient capacity of the transformer model used. Results may improve with datasets containing higher repetition rates of individual signs or by focusing more precisely on keypoint extraction of the hands.
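For reference, the word error rate quoted above is the standard word-level edit distance normalised by the reference length; a small self-contained sketch (not the thesis code):

```python
# Sketch: word error rate (WER) via edit distance over words.
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the sign was recognized", "the sign is recognized"))  # 0.25
```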
Alex Tellez's slides on Deep Learning Applications, including using auto-encoders, finding better Bordeaux wine, and fighting crime in Chicago, from the 3/11/15 Meetup at H2O.ai HQ and the 3/12/15 Meetup at Mills College.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Quality of Experience of Web-based Adaptive HTTP Streaming Clients in Real-World Environments (Alpen-Adria-Universität)
Multimedia streaming over HTTP has gained momentum with the approval of the MPEG-DASH standard, and many research papers have evaluated various aspects of it, though mainly within controlled environments. However, the actual behaviour of a DASH client within real-world environments has not yet been evaluated. The aim of this paper is to compare the QoE performance of existing DASH-based Web clients within real-world environments using crowdsourcing. Therefore, we select Google’s YouTube player and two open source implementations of the MPEG-DASH standard, namely DASH-JS from Alpen-Adria-Universitaet Klagenfurt and dash.js, the official reference client of the DASH Industry Forum. Based on a predefined content configuration, which is comparable among the clients, we run a crowdsourcing campaign to determine the QoE of each implementation in order to determine the current state of the art for MPEG-DASH systems within real-world environments. The gathered data and its analysis are presented in the paper, providing insights with respect to the QoE performance of current Web-based adaptive HTTP streaming systems.
Google Glass, The META and Co. - How to calibrate your Optical See-Through Head Mounted Display (Jens Grubert)
Slides from our ISMAR 2014 tutorial http://stctutorial.icg.tugraz.at/
Abstract:
Head Mounted Displays such as Google Glass and the META have the potential to spur consumer-oriented Optical See-Through Augmented Reality applications. A correct spatial registration of those displays relative to a user’s eye(s) is an essential problem for any HMD-based AR application.
At our ISMAR 2014 tutorial we provide an overview of established and novel approaches for the calibration of those displays (OST calibration), including hands-on experience in which participants calibrate such head mounted displays.
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2021/01/mlperf-an-industry-standard-performance-benchmark-suite-for-machine-learning-a-presentation-from-facebook-and-arizona-state-university/
Carole-Jean Wu, Research Scientist at Facebook AI Research and an Associate Professor at Arizona State University, presents the “MLPerf: An Industry Standard Performance Benchmark Suite for Machine Learning” tutorial at the September 2020 Embedded Vision Summit.
The rapid growth in the use of DNNs has spurred the development of numerous specialized processor architectures and software frameworks. System and application developers need reliable performance metrics to help them select processors and frameworks. Processor and framework developers need reliable performance metrics so that they can improve their products.
This talk presents MLPerf, an industry-standard performance benchmark suite, and the design philosophies behind the benchmark suite and associated benchmarking methodologies.
Authors: Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto and Ferran Marqués
Details: https://imatge.upc.edu/web/publications/rich-internet-application-semi-automatic-annotation-semantic-shots-keyframes
This paper describes a system developed for the semi-automatic annotation of keyframes in a broadcasting company. The tool aims at assisting archivists, who traditionally label every keyframe manually, by suggesting an automatic annotation that they can intuitively edit and validate. The system is valid for any domain as it uses generic MPEG-7 visual descriptors and binary SVM classifiers. The classification engine has been tested on the multiclass problem of semantic shot detection, a type of metadata used in the company to index new content ingested into the system. The detection performance has been tested in two different domains: soccer and parliament. The core engine is accessed by a Rich Internet Application via a web service. The graphical user interface allows editing of the suggested labels with an intuitive drag-and-drop mechanism between rows of thumbnails, each row representing a different semantic shot class. The system has been described as complete and easy to use by the professional archivists at the company.
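To make the classification engine concrete, a multiclass semantic-shot classifier built from binary SVMs is commonly arranged one-vs-rest; the hedged sketch below uses random placeholder arrays rather than the MPEG-7 descriptors from the paper:

```python
# Sketch: multiclass semantic-shot classification from binary SVMs (one-vs-rest).
# X and y are random placeholders standing in for MPEG-7 descriptors and shot classes.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X = np.random.rand(200, 64)             # stand-in for per-keyframe visual descriptors
y = np.random.randint(0, 5, size=200)   # stand-in for semantic shot classes

clf = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)
print(clf.predict(X[:3]))               # suggested labels for the first key-frames
```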
Digital video watermarking using modified LSB and DCT technique (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
MIPI DevCon 2016: How to Use the VESA Display Stream Compression (DSC) Standard (MIPI Alliance)
The VESA Display Stream Compression (DSC) standard is a visually lossless video compression algorithm that decreases transmission bandwidth by up to 3X, while lowering power and reducing EMI. The standard has been adopted by leading suppliers of semiconductors for use in mobiles, tablets, in-car video, and DTV applications in order to achieve higher resolution displays. This presentation by Hardent's Alain Legault provides background information about DSC and the role it plays in today’s interface IP ecosystem when combined with MIPI® DSI, USB Type-C™, DisplayPort™ and Embedded DisplayPort™, and HDMI™ IPs. Several use cases are discussed, and practical information on how to successfully integrate DSC in semiconductor designs is also provided.
This presentation was given at the doctoral days at ENSIAS, Morocco. The goal was to show how the innovation process works, with a particular example of what Cisco is doing for media networks.
In this Dagstuhl talk, I presented my current research on cloud auto-scaling and component connector self-adaptation and how I employed type-2 fuzzy control to tame the uncertainty regarding knowledge specification.
Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal CNN for MediaEval 2020 (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper62.pdf
YouTube: https://youtu.be/gV-rvV3iFDA
Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri and Julien Morlier : Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal CNN for MediaEval 2020. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This work presents a method for classifying table tennis strokes using spatio-temporal convolutional neural networks. The fine-grained classification is performed on trimmed video segments recorded at 120 fps with different players performing in natural conditions. From those segments, the frames are extracted, their optical flow is computed and the pose of the player is estimated. From the optical flow amplitude, a region of interest is inferred. A three stream spatio-temporal convolutional neural network using a combination of those modalities and 3D attention mechanisms is presented in order to perform the classification.
Presented by: Pierre-Etienne Martin
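As a rough illustration of the region-of-interest step described in the abstract (not the authors' implementation), one can threshold the optical-flow magnitude between consecutive frames and take the bounding box of the most strongly moving pixels; the quantile below is an arbitrary illustrative choice:

```python
# Sketch: infer a region of interest from the optical-flow amplitude of two frames.
# Farneback parameters and the quantile threshold are illustrative values.
import cv2
import numpy as np

def roi_from_flow(prev_gray, next_gray, quantile=0.95):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)           # per-pixel flow amplitude
    mask = magnitude >= np.quantile(magnitude, quantile)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:                                   # no motion: fall back to full frame
        h, w = prev_gray.shape
        return 0, 0, w, h
    return xs.min(), ys.min(), xs.max(), ys.max()      # x0, y0, x1, y1
```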
HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table Tennis Strokes Classification Task (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper50.pdf
Hai Nguyen-Truong, San Cao, N. A. Khoa Nguyen, Bang-Dang Pham, Hieu Dao, Minh-Quan Le, Hoang-Phuc Nguyen-Dinh, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table Tennis Strokes Classification Task. Proc. of MediaEval 2020, 14-15 December 2020, Online.
The Sports Video Classification task in the Multimedia Evaluation 2020 Challenge focuses on classifying different types of table tennis strokes in video segments. In this task, we - the HCMUS team - perform multiple experiments with a combination of models such as SlowFast, optical flow, DensePose, R2+1 and Channel-Separated Convolutional Networks to classify 21 types of table tennis strokes from video segments. In total, we submit eight runs corresponding to five different models with different sets of hyper-parameters for each of our models. In addition, we apply some pre-processing techniques to the dataset so that our models learn and classify more accurately. According to the evaluation results, one of our methods outperforms the other teams' methods. In particular, our best run achieves 31.35% global accuracy, and all of our methods show promising results in terms of local and global accuracy for action recognition tasks.
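The abstract does not detail how the runs are fused; one common way to ensemble several stroke classifiers is simply to average their per-class probabilities, as in this hedged sketch (toy arrays, 21 classes as in the task):

```python
# Sketch: late-fusion ensemble of stroke classifiers by averaging class probabilities.
# The probability arrays are random toys; 21 classes follow the task definition.
import numpy as np

def ensemble_predict(prob_list):
    """prob_list: list of (n_videos, n_classes) probability arrays, one per model."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)          # predicted stroke class per video

rng = np.random.default_rng(0)
model_a = rng.dirichlet(np.ones(21), size=3)  # two models, three videos, 21 strokes
model_b = rng.dirichlet(np.ones(21), size=3)
print(ensemble_predict([model_a, model_b]))
```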
Similar to Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a Video Memorable?
Sports Video Classification: Classification of Strokes in Table Tennis for MediaEval 2020 (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper2.pdf
YouTube: https://youtu.be/-bRL868b8ys
Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri, Laurent Mascarilla, Jordan Calandre and Julien Morlier : Sports Video Classification: Classification of Strokes in Table Tennis for MediaEval 2020. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Fine-grained action classification raises new challenges compared to classical action classification problems. Sports video analysis is a very popular research topic, due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests up to analysis of athletes' performances. Running since 2019 as part of MediaEval, we offer a task which consists of classifying table tennis strokes from videos recorded in natural conditions at the University of Bordeaux. The aim is to build tools for teachers, coaches and players to analyse table tennis games. Such tools could lead to automatic profiling of players and adaptation of their training to improve their sport skills more efficiently.
Presented by: Pierre-Etienne Martin
Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention and LSTM Models (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper61.pdf
YouTube: https://youtu.be/brmI4g3jLS4
Ricardo Kleinlein, Cristina Luna-Jiménez, Fernando Fernández-Martínez and Zoraida Callejas : Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention and LSTM Models. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This paper reports on the GTH-UPM team's experience in the Predicting Media Memorability task at MediaEval 2020. Teams were requested to predict memorability scores at both short term and long term, understanding this score as a measure of whether a video endures in a viewer's memory or not. Our proposed system relies on a late fusion of the scores predicted by three sequential models, each trained over a different modality: video captions, aural embeddings and visual optical flow-based vectors. Whereas single-modality models show a low or zero Spearman correlation coefficient, their combination considerably boosts performance over development data, up to 0.2 in the short-term memorability prediction subtask and 0.19 in the long-term subtask. However, performance over test data drops to 0.016 and -0.041, respectively.
Presented by: Ricardo Kleinlein
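A minimal sketch of the late-fusion idea described above, with illustrative equal weights rather than the exact GTH-UPM combination: each modality model outputs a memorability score per video, and the fused prediction is their weighted average.

```python
# Sketch: late fusion of per-modality memorability predictions by weighted average.
# The weights and toy scores are illustrative; the paper's fusion may differ.
import numpy as np

def late_fusion(captions_pred, audio_pred, flow_pred, weights=(1/3, 1/3, 1/3)):
    preds = np.stack([captions_pred, audio_pred, flow_pred], axis=0)
    w = np.asarray(weights).reshape(-1, 1)
    return (w * preds).sum(axis=0)

fused = late_fusion(np.array([0.8, 0.6]), np.array([0.7, 0.5]), np.array([0.9, 0.4]))
print(fused)   # fused memorability scores for two videos
```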
Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper52.pdf
Janadhip Jacutprakart, Rukiye Savran Kiziltepe, John Q. Gan, Giorgos Papanastasiou and Alba G. Seco de Herrera : Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task. Proc. of MediaEval 2020, 14-15 December 2020, Online.
In this paper, we present our approach and the main results from the Essex NLIP Team’s participation in the MediaEval 2020 Predicting Media Memorability task. The task requires participants to build systems that can predict short-term and long-term memorability scores on the real-world video samples provided. The focus of our approach is on the use of colour-based visual features as well as the video annotation meta-data. In addition, hyper-parameter tuning was explored. Despite the simplicity of the methodology, our approach achieves competitive results. We investigated the use of different visual features and assessed the performance of memorability score prediction with various regression models, with Random Forest regression as our final model to predict the memorability of videos.
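To make the colour-feature idea concrete, here is a hedged sketch (not the Essex-NLIP code): an HSV colour histogram is extracted per key-frame and a Random Forest regressor is fitted to the memorability scores; all parameter values are illustrative.

```python
# Sketch: colour-histogram features + Random Forest regression for memorability.
# The HSV histogram follows the colour-feature idea in the abstract; parameters are illustrative.
import cv2
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def hsv_histogram(image_bgr, bins=(8, 8, 8)):
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, list(bins), [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def train(frames, scores):
    """frames: list of BGR key-frames; scores: memorability score per video."""
    X = np.stack([hsv_histogram(f) for f in frames])
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X, scores)
    return model
```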
Fooling an Automatic Image Quality Estimator (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper45.pdf
Benoit Bonnet, Teddy Furon and Patrick Bas : Fooling an Automatic Image Quality Estimator. Proc. of MediaEval 2020, 14-15 December 2020, Online.
In this paper we present our work on the 2020 MediaEval task "Pixel Privacy: Quality Camouflage for Social Images". Blind Image Quality Assessment (BIQA) is a classifier that returns a quality score for any given image. Our task is to modify an image to decrease its BIQA score while maintaining a good perceived quality. Since BIQA is a deep neural network, we approached the problem as an adversarial attack.
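Because the BIQA model is a differentiable network, the simplest form of such an attack is a single signed-gradient step that pushes the image in the direction that lowers the predicted score. The sketch below is a generic FGSM-style step in PyTorch; `biqa_model` and `epsilon` are placeholders, and the paper's actual attack is more elaborate.

```python
# Sketch: one gradient step that lowers a differentiable BIQA model's quality score.
# `biqa_model` and `epsilon` are placeholders; this is not the authors' attack.
import torch

def lower_quality_score(biqa_model, image, epsilon=2 / 255):
    image = image.clone().detach().requires_grad_(True)   # (1, 3, H, W) in [0, 1]
    score = biqa_model(image).mean()                       # predicted quality score
    score.backward()
    # Step against the gradient sign to decrease the score (FGSM-style)
    adversarial = image - epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```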
Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable Color Filter (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper16.pdf
YouTube: https://youtu.be/ix_b9K7j72w
Zhengyu Zhao : Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable Color Filter. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This paper presents the submission of our RU-DS team to the Pixel Privacy Task 2020. We propose to fool the blind image quality assessment model by transforming images based on optimizing a human-understandable color filter. In contrast to common work that relies on small, $L_p$-bounded additive pixel perturbations, our approach yields large yet smooth perturbations. Experimental results demonstrate that, in the specific context of this task, our approach is able to achieve strong adversarial effects, but has to sacrifice some image appeal.
Presented by: Zhengyu Zhao
Pixel Privacy: Quality Camouflage for Social Images (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper77.pdf
YouTube: https://youtu.be/8Rr4KknGSac
Zhuoran Liu, Zhengyu Zhao, Martha Larson and Laurent Amsaleg : Pixel Privacy: Quality Camouflage for Social Images. Proc. of MediaEval 2020, 14-15 December 2020, Online.
High-quality social images shared online can be misappropriated for unauthorized goals, where the quality filtering step is commonly carried out by automatic Blind Image Quality Assessment (BIQA) algorithms. Pixel Privacy benchmarks privacy-protective approaches that protect privacy-sensitive images against unethical computer vision algorithms. In the 2020 task, participants are encouraged to develop camouflage methods that can effectively decrease the BIQA quality score of high-quality images while maintaining image appeal. The camouflage needs to be either imperceptible to the human eye or a visible enhancement.
Presented by: Zhuoran Liu
HCMUS at MediaEval 2020: Image-Text Fusion for Automatic News-Images Re-Matching (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper73.pdf
YouTube: https://youtu.be/TadJ6y7xZeA
Thuc Nguyen-Quang, Tuan-Duy Nguyen, Thang-Long Nguyen-Ho, Anh-Kiet Duong, Xuan-Nhat Hoang, Vinh-Thuyen Nguyen-Truong, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matching. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Matching text and images based on their semantics plays an important role in cross-media retrieval. However, text and images in articles have a complex connection. In the context of the MediaEval 2020 Challenge, we propose three multi-modal methods for mapping text and images of news articles to a shared space in order to perform efficient cross-retrieval. Our methods show systematic improvement and validate our hypotheses, while the best-performing method reaches a recall@100 score of 0.2064.
Presented by: Thuc Nguyen-Quang
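For reference, the recall@K figure quoted above can be computed as in the short sketch below, under the simplifying assumption of one relevant image per article (the example data are made up):

```python
# Sketch: recall@K for article-to-image retrieval, assuming one relevant image per article.
def recall_at_k(rankings, relevant, k=100):
    """rankings: article_id -> ranked list of image ids; relevant: article_id -> correct image id."""
    hits = sum(1 for a, ranked in rankings.items() if relevant[a] in ranked[:k])
    return hits / len(rankings)

example = {"article1": ["img3", "img7", "img1"], "article2": ["img2", "img9"]}
truth = {"article1": "img7", "article2": "img5"}
print(recall_at_k(example, truth, k=2))   # 0.5
```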
Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attention Unit (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper72.pdf
Sabarinathan D and Suganya Ramamoorthy : Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attention Unit. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Colorectal cancer is the third most common cause of cancer worldwide, and identifying it in its early stages remains a challenging problem for the medical industry. Motivated by this, the main objective of this paper is to develop a multi-supervision net algorithm for segmenting polyps on a comprehensive dataset. The risk of colorectal cancer can be reduced by early diagnosis of polyps during a colonoscopy. Diseases and their symptoms vary widely, so doctors and medical analysts constantly need to update their knowledge; diseases fall into different categories, and a small variation in symptoms may indicate a much higher risk. We use the Medico polyp challenge dataset, which consists of 1000 segmented polyp images from the gastrointestinal tract. We use EfficientNet-B4 as the pre-trained architecture in the multi-supervision net, and the model is trained with multiple output layers. We present quantitative results on the colorectal dataset to evaluate the performance and achieve good results on all the performance metrics. The experimental results show that the proposed model is robust and provides a good level of accuracy in segmenting polyps on a comprehensive dataset for different metrics such as Dice coefficient, Recall, Precision and F2.
HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ for Polyps Segmentation (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper47.pdf
YouTube: https://youtu.be/vMsM4zg2-JY
Tien-Phat Nguyen, Tan-Cong Nguyen, Gia-Han Diep, Minh-Quan Le, Hoang-Phuc Nguyen-Dinh, Hai-Dang Nguyen and Minh-Triet Tran : HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ for Polyps Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online.
The Medico task, MediaEval 2020, explores the challenge of building accurate and high-performance algorithms to detect all types of polyps in endoscopic images. We proposed different approaches leveraging the advantages of either ResUnet++ or PraNet model to efficiently segment polyps in colonoscopy images, with modifications on the network structure, parameters, and training strategies to tackle various observed characteristics of the given dataset. Our methods outperform the other teams' methods, for both accuracy and efficiency. After the evaluation, we are at top 2 for task 1 (with Jaccard index of 0.777, best Precision and Accuracy scores) and top 1 for task 2 (with 67.52 FPS and Jaccard index of 0.658).
Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Intestinal Tract (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper31.pdf
Syed Muhammad Faraz Ali, Muhammad Taha Khan, Syed Unaiz Haider, Talha Ahmed, Zeshan Khan and Muhammad Atif Tahir : Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Intestinal Tract. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Identification of polyps in endoscopic images is critical for the diagnosis of colon cancer. Finding the exact shape and size of polyps requires the segmentation of endoscopic images. This research explores the advantage of using depth-wise separable convolution in the atrous convolution of the ResUNet++ architecture. Deep atrous spatial pyramid pooling was also implemented on the ResUNet++ architecture. The results show that architecture with separable convolution has a smaller size and fewer GFLOPs without degrading the performance too much.
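To illustrate the building block discussed here (not the exact ResUNet++ modification), a depth-wise separable atrous convolution can be written in PyTorch as a dilated depth-wise convolution followed by a point-wise 1x1 convolution; channel sizes and the dilation rate are illustrative.

```python
# Sketch: depth-wise separable atrous (dilated) convolution block in PyTorch.
# Channel sizes and the dilation rate are illustrative choices.
import torch
import torch.nn as nn

class DepthwiseSeparableAtrousConv(nn.Module):
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        # Depth-wise: one dilated 3x3 filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        # Point-wise: 1x1 convolution mixes channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 64, 32, 32)
print(DepthwiseSeparableAtrousConv(64, 128)(x).shape)   # torch.Size([1, 128, 32, 32])
```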
Deep Conditional Adversarial learning for polyp Segmentation (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper22.pdf
Debapriya Banik and Debotosh Bhattacharjee : Deep Conditional Adversarial learning for polyp Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This approach addresses the Medico automatic polyp segmentation challenge, which is part of MediaEval 2020. We have proposed a deep conditional adversarial learning based network for the automatic polyp segmentation task. The network comprises two interdependent models, namely a generator and a discriminator. The generator network is an FCN employed for the prediction of the polyp mask, while the discriminator enforces the segmentation to be as similar as possible to the real segmented mask (ground truth). Our proposed model achieved a comparable result on the test dataset provided by the organizers of the challenge.
A Temporal-Spatial Attention Model for Medical Image Detection (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper21.pdf
Hwang Maxwell, Wu Cai, Hwang Kao-Shing, Xu Yong Si and Wu Chien-Hsing : A Temporal-Spatial Attention Model for Medical Image Detection. Proc. of MediaEval 2020, 14-15 December 2020, Online.
A local region model with attentive temporal-spatial pathways is proposed for automatically learning various target structures. The attentive spatial pathway highlights the salient region to generate bounding boxes and ignores irrelevant regions in an input image. The proposed attention mechanism allows efficient object localization, and the overall predictive performance is increased because there are fewer false positives for the object detection task on medical images with manual annotations. The experimental results show that the proposed models consistently increase the base architectures' predictive performance for different datasets and training sizes without undue computational cost.
HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Network and UNet for Polyps Segmentation (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper20.pdf
YouTube: https://youtu.be/CVelQl5Luf0
Quoc-Huy Trinh, Minh-Van Nguyen, Thiet-Gia Huynh and Minh-Triet Tran : HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Network and UNet for Polyps Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online.
The Medico Multimedia Task focuses on developing an efficient and accurate framework for computer-aided diagnosis systems that automatically segment all types of polyps in endoscopic images of the gastrointestinal (GI) tract. We, the HCMUS-Juniors team, propose a solution that combines Residual modules, Inception modules and an adaptive convolutional neural network with the UNet model and PraNet to semantically segment all types of polyps in endoscopic images. We submit multiple runs with different architectures and parameters for our model. Our methods show promising results in accuracy and efficiency across multiple experiments.
Fine-tuning for Polyp Segmentation with Attention (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper15.pdf
Rabindra Khadka : Transfer of Knowledge: Fine-tuning for Polyp Segmentation with Attention. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This paper describes how the transfer of prior knowledge can effectively tackle segmentation tasks with the help of attention mechanisms. A UNet model pretrained on a brain MRI dataset was fine-tuned on the polyp dataset. An attention mechanism was integrated to focus on relevant regions in the input images. The implemented architecture is evaluated on 200 validation images based on intersection over union and Dice score between the ground truth and the predicted region. The model demonstrates a promising result with computational efficiency.
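The two evaluation measures mentioned above, intersection over union and the Dice score between ground truth and prediction, can be computed for binary masks as in this small sketch:

```python
# Sketch: Dice score and intersection over union (Jaccard) for binary polyp masks.
import numpy as np

def dice_and_iou(pred_mask, true_mask, eps=1e-7):
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    dice = (2 * intersection + eps) / (pred.sum() + true.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou
```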
Bigger Networks are not Always Better: Deep Convolutional Neural Networks for Automated Polyp Segmentation (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper12.pdf
Adrian Krenzer and Frank Puppe : Bigger Networks are not Always Better: Deep Convolutional Neural Networks for Automated Polyp Segmentation. Proc. of MediaEval 2020, 14-15 December 2020, Online.
This paper presents our team's (AI-JMU) approach to the Medico automated polyp segmentation challenge. We consider deep convolutional neural networks to be well suited for this task. To determine the best architecture we test and compare state-of-the-art backbones and two different heads. Finally, we achieve a Jaccard index of 73.74% on the challenge test set. We further demonstrate that bigger networks do not always perform better; however, growing the network size always increases the computational complexity.
Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper51.pdf
Amel Ksibi, Amina Salhi, Ala Alluhaidan and Sahar A. El-Rahman : Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Providing air pollution information to individuals enables them to understand the air quality of their living environments. Thus, the association between people’s wellbeing and the properties of the surrounding environment is an essential area of investigation. This paper proposes predicting air quality by harvesting public/open data, which are usually incomplete, and leveraging them to obtain a personal air quality index. To cope with the problem of missing data, we apply the KNN imputation method. To predict the personal air quality index, we apply a voting regression approach based on three base regressors: a Gradient Boosting regressor, a Random Forest regressor and a linear regressor. Evaluating the experimental results using the RMSE metric, we obtained an average score of 35.39 for Walker and 51.16 for Car.
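A compact, hedged sketch of the pipeline described above using scikit-learn: KNN imputation of missing readings followed by a voting regressor over gradient boosting, random forest and linear regression. The hyper-parameters and feature handling are illustrative, not the paper's settings.

```python
# Sketch: KNN imputation of missing sensor values + voting regression of the AQI.
# Hyper-parameters are illustrative, not the paper's settings.
from sklearn.impute import KNNImputer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

model = make_pipeline(
    KNNImputer(n_neighbors=5),
    VotingRegressor([
        ("gbr", GradientBoostingRegressor()),
        ("rf", RandomForestRegressor()),
        ("lr", LinearRegression()),
    ]),
)
# Usage: model.fit(X_train, y_train); predictions = model.predict(X_test)
```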
Use Visual Features From Surrounding Scenes to Improve Personal Air Quality Data Prediction Performance (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper40.pdf
YouTube: https://youtu.be/SL5Hvu1mARY
Trung-Quan Nguyen, Dang-Hieu Nguyen and Loc Tai Tan Nguyen : Use Visual Features From Surrounding Scenes to Improve Personal Air Quality Data Prediction Performance. Proc. of MediaEval 2020, 14-15 December 2020, Online.
In this paper, we propose a method to predict the personal air quality index in an area by using the combination of the levels of the following pollutants: PM2.5, NO2 and O3, measured at the weather stations near that area, and photos of the surrounding scenes taken in that area. Our approach uses the Inverse Distance Weighted (IDW) technique to estimate the missing air pollutant levels and then uses regression to integrate visual features from the photos to optimize the predicted values. After that, we can use those values to calculate the Air Quality Index (AQI). The results show that the proposed method may not improve the performance of the prediction in some cases.
Personal Air Quality Index Prediction Using Inverse Distance Weighting Method (multimediaeval)
Paper: http://ceur-ws.org/Vol-2882/paper39.pdf
YouTube: https://youtu.be/3r_oSguFPVM
Trung-Quan Nguyen, Dang-Hieu Nguyen and Loc Tai Tan Nguyen : Personal Air Quality Index Prediction Using Inverse Distance Weighting Method. Proc. of MediaEval 2020, 14-15 December 2020, Online.
In this paper, we propose a method to predict the personal air quality index in an area by only using the levels of the following pollutants: PM2.5, NO2, O3. All of them are measured from the nearby weather stations of that area. Our approach uses one of the most well-known interpolation methods in spatial analysis, the Inverse Distance Weighted (IDW) technique, to estimate the missing air pollutant levels. After that, we can use those levels to calculate the Air Quality Index (AQI). The results show that the proposed method is suitable for the prediction of those air pollutant levels.
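Inverse Distance Weighting, used in both abstracts above, estimates a missing pollutant level at a target location as a distance-weighted average of nearby station readings; a small illustrative sketch (the power parameter p=2 is an arbitrary choice):

```python
# Sketch: Inverse Distance Weighting (IDW) of pollutant readings from nearby stations.
# The power parameter p and the toy coordinates/values are illustrative.
import numpy as np

def idw(target_xy, station_xy, station_values, p=2.0, eps=1e-12):
    d = np.linalg.norm(np.asarray(station_xy, dtype=float) - np.asarray(target_xy, dtype=float), axis=1)
    if np.any(d < eps):                        # target coincides with a station
        return float(np.asarray(station_values)[int(np.argmin(d))])
    w = 1.0 / d**p
    return float(np.sum(w * np.asarray(station_values)) / np.sum(w))

# PM2.5 estimate at (0, 0) from three stations
print(idw((0, 0), [(1, 0), (0, 2), (3, 3)], [35.0, 20.0, 50.0]))
```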
Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a Video Memorable?
1. MediaEval2020
Predicting Media Memorability
Task Overview
Alba García Seco de Herrera, Rukiye Savran Kiziltepe, Jon Chamberlain, Mihai Gabriel Constantin,
Claire-Hélène Demarty, Faiyaz Doctor, Bogdan Ionescu, Alan Smeaton
Presentation Video
2. Task Description
Goal: predicting how memorable a video is to viewers
• Automatically predicting short-term and
long-term memorability
• TRECVid 2019 Video to Text dataset1
• Sound and more action
1. Awad, G., Butt, A.A., Lee, Y., Fiscus, J., Godil, A., Delgado, A., Smeaton, A.F. and Graham, Y., Trecvid 2019:
An evaluation campaign to benchmark video activity detection, video captioning and matching, and video
search & retrieval. 2019.
3. Annotation Tool
• Short-term memorability: a few minutes after memorization
• Long-term memorability: 24 – 72 hours later
Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, and Martin Engilberge. VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability. Proceedings of the IEEE
International Conference on Computer Vision. 2019.
Video Memorability Game
4. Annotation Protocol
Step 1 (180 videos)
• 40 targets – repeated after a few minutes
• 60 fillers – non target videos
• 20 vigilance fillers – repeated quickly to monitor the attention
Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, and Martin Engilberge. VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability. Proceedings of the IEEE
International Conference on Computer Vision. 2019.
Step 2 (120 videos)
• 40 targets – randomly chosen from non-vigilance fillers
• 80 fillers – randomly chosen new videos
5. Dataset Description
• TRECVid 2019
(Video to Text)
• 1500 videos
• 1000 training set
• 500 test set
6. Dataset Description
• AlexNetFC7
• HOG
• HSVHist
• RGBHist
• LBP
• VGGFC7
• C3D
• Text descriptions
• Annotations
• Response time
• Key press
• Video position
Short-term memorability score
Long-term memorability score
7. Examples (Low Short-term and Long-term Memorability)
• At football game, the ball is kicked past end zone and
woman is knocked down from her knees
• football player are playing at a football field.
• At a college football game, during a kickoff, the kicker
kicks the ball over the endzone and hits a spectator
in the face while they are trying to catch it.
• a person is injured when the football player kicked a
ball across a field during a game
• Football kicks football during a day game and a
cheerleader tries to catch it and ball hits her in the
head.
8. Examples (High Short-term and Long-term Memorability)
• Two boys wearing white shirts on playground swings
• Two young men, are on a swing and yell, outdoors.