Erickson Nascimento, Federal University of Minas Gerais - "On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos"
Siamese-rPPG Network: Remote Photoplethysmography Signal Estimation from Face... - ssuserbd51ec
The document presents the Siamese-rPPG Network for estimating remote photoplethysmography (rPPG) signals from face videos. The network uses a Siamese architecture with 3D convolutional layers and weight sharing to learn rPPG signals from two facial regions simultaneously. Evaluation on three datasets shows the network outperforms existing methods for contactless heart rate estimation from video in terms of correlation and error metrics.
The document discusses the development of the Perceptual Quantizer (PQ) tone mapping curve. PQ was designed to efficiently encode high dynamic range content for delivery and display based on properties of human vision. It uses a "worst case engineering" approach where quantization steps are set just below the threshold of perceptible differences over the luminance range. Through modeling contrast sensitivity and testing, the PQ curve was developed to retain image quality while using 12-bits of data. PQ has been adopted as a standard through rigorous evaluation.
Segmenting Medical MRI via Recurrent Decoding Cell - Seunghyun Hwang
Review : Segmenting Medical MRI via Recurrent Decoding Cell
- by Seunghyun Hwang (Yonsei University, Severance Hospital, Center for Clinical Data Science)
Video Classification: Human Action Recognition on HMDB-51 dataset - Giorgio Carbone
Two-stream CNNs for video action recognition using stacked optical flow, implemented in Keras, on the HMDB-51 dataset.
We use a spatial stream CNN (a fine-tuned ResNet-50) and a temporal stream CNN (fed stacked optical flows) under the Keras framework to perform video-based human action recognition on the HMDB-51 dataset.
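As a rough illustration of how the two streams are typically combined, the sketch below averages the per-class softmax scores of the spatial and temporal networks (so-called late fusion). The weights and the three-class toy scores are made up for illustration; the slides do not specify the fusion scheme.

```python
import numpy as np

def fuse_two_stream(spatial_probs, temporal_probs, w_spatial=0.5):
    """Late fusion of per-class softmax scores from the two streams.

    spatial_probs, temporal_probs: arrays of shape (n_classes,).
    w_spatial: weight given to the spatial (RGB) stream; the remainder
    goes to the temporal (stacked optical flow) stream.
    """
    fused = w_spatial * spatial_probs + (1.0 - w_spatial) * temporal_probs
    return int(np.argmax(fused))

# Toy example with 3 hypothetical HMDB-51 classes.
spatial = np.array([0.2, 0.5, 0.3])   # RGB stream favours class 1
temporal = np.array([0.6, 0.1, 0.3])  # flow stream favours class 0
print(fuse_two_stream(spatial, temporal))  # equal weights -> class 0
```

With equal weights the flow stream's stronger evidence for class 0 wins; setting `w_spatial=1.0` would fall back to the spatial stream's prediction.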
The document discusses methods for objective and subjective video quality assessment and speech enhancement. It covers four parts: (1) a classification and review of no-reference visual quality assessment methods, (2) no-reference and reduced-reference methods for video quality assessment including neural network and support vector machine approaches, (3) subjective methods for video quality assessment including studies on low resolution videos and crowdsourcing, and (4) speech enhancement techniques including spectral center-of-gravity based demodulation and convex optimization based demodulation. The document evaluates various computational models and machine learning techniques for video and speech quality assessment.
This document summarizes Juan Pedro López Velasco's thesis work on developing visual attention and perception models for assessing video quality. The work has two main objectives: 1) Predicting visual discomfort in 3D stereoscopic video by analyzing factors like motion, disparity, and parallax changes. 2) Improving 2D video quality metrics by applying visual attention models that weight regions of interest to better correspond to human perception. The work involves conducting subjective testing to determine important quality factors, developing computational models of visual attention, and incorporating these models into new objective metrics to provide more accurate quality assessment.
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality - Wen-Chih Lo
Published in NOSSDAV'17, June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the FoVs in the future, or extrapolate future FoVs using the historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a 360° video streaming testbed to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
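The dead-reckoning baseline the paper compares against can be sketched as a constant-angular-velocity extrapolation of the HMD orientation history. The function below is a minimal illustration for a single yaw angle; the sampling interval and horizon values are made up, and the paper's actual networks additionally use saliency and motion maps.

```python
def dead_reckon_yaw(history, dt, horizon):
    """Extrapolate future yaw from the last two HMD orientation samples.

    history: yaw angles in degrees, sampled every `dt` seconds.
    horizon: how far ahead (in seconds) to predict.
    Assumes constant angular velocity, as in the dead-reckoning
    baseline described above.
    """
    velocity = (history[-1] - history[-2]) / dt   # degrees per second
    predicted = history[-1] + velocity * horizon
    return predicted % 360.0                      # wrap to [0, 360)

yaw_samples = [10.0, 12.0, 14.0]  # head turning at a steady rate
print(dead_reckon_yaw(yaw_samples, dt=0.1, horizon=0.5))  # -> 24.0
```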
This paper proposes a blind quality algorithm to analyze streaming video content in 5G networks. The algorithm detects common streaming errors like color degradation, frozen frames, and packet loss. It is included in a "Quality Probe" application that operates as a virtual network function and sends quality reports. The algorithm was tested on sequences with different impairments from a video quality database. It successfully detected packet loss, color errors, and frozen frames. The results validate the algorithm and show the need for intelligent network nodes to monitor quality and adapt transmissions to improve users' experience in 5G networks. Future work includes additional metrics, processing time analysis, and testing in a real network.
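One of the impairments the algorithm detects, frozen frames, can be flagged by thresholding the mean absolute difference between consecutive frames. The sketch below is a generic illustration of that idea; the threshold and minimum run length are our guesses, not values from the paper.

```python
import numpy as np

def frozen_frame_runs(frames, diff_threshold=1.0, min_run=3):
    """Flag runs of (near-)identical consecutive frames.

    frames: list of 2-D grayscale arrays.
    diff_threshold: mean absolute pixel difference below which two
    consecutive frames count as "frozen" (an assumed value).
    min_run: minimum number of consecutive frozen frames to report.
    Returns a list of (start_index, end_index) runs.
    """
    runs, start = [], None
    for i in range(1, len(frames)):
        mad = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if mad < diff_threshold:
            start = i - 1 if start is None else start
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(frames) - start >= min_run:
        runs.append((start, len(frames) - 1))
    return runs
```

Feeding it four identical frames followed by two changing ones reports a single frozen run covering the first four frames.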
Portal Imaging used to clear setup uncertainty - MajoVJJose
Title: Portal Imaging in Radiotherapy: A Comprehensive Exploration of Techniques, Applications, and Advancements
Introduction
Portal imaging is a critical component of modern radiotherapy, playing a pivotal role in the verification and precision of radiation treatment delivery. This technique involves the acquisition of X-ray images during or immediately after a patient's radiotherapy session, providing valuable information on the alignment of the treatment field with the intended target and surrounding critical structures. In this comprehensive exploration, we delve into the principles, techniques, clinical applications, challenges, and future prospects of portal imaging in the context of radiotherapy.
1. Principles of Portal Imaging
Portal imaging is rooted in the principles of verifying and ensuring the accuracy of radiation therapy delivery. Before each treatment fraction, the patient's position is verified to ensure it aligns precisely with the treatment plan. Portal images are acquired using specialized imaging devices, usually in the form of electronic portal imaging devices (EPIDs) or film-based systems. These images serve as a real-time snapshot of the radiation field, allowing clinicians to assess the actual treatment setup against the planned position.
2. Techniques of Portal Imaging
2.1 Electronic Portal Imaging Devices (EPIDs)
Electronic portal imaging devices, or EPIDs, have become a standard tool in portal imaging due to their real-time imaging capabilities and digital nature. EPIDs consist of a detector panel that captures the transmitted radiation through the patient during treatment. The resulting electronic images are immediately available for review, facilitating prompt decision-making regarding the need for adjustments in patient positioning or treatment parameters.
2.2 Film-Based Portal Imaging
Film-based portal imaging, while less commonly used today, has historical significance and is still employed in certain clinical settings. It involves exposing X-ray film positioned behind the patient during treatment. The film is then developed, and the resulting image is analyzed to verify the alignment of the treatment field. Though the process is not as immediate as with EPIDs, film-based systems may still offer advantages in certain situations.
3. Clinical Applications of Portal Imaging
Portal imaging is integral to the success of radiotherapy across various cancer types and treatment modalities.
3.1 Treatment Verification and Positioning
The primary application of portal imaging is to verify the accuracy of patient positioning and the alignment of the treatment field with the intended target volume. Any discrepancies detected through portal images allow for immediate adjustments to be made, ensuring that the radiation is delivered precisely to the targeted area while minimizing exposure to adjacent healthy tissues.
3.2 Tumor Localization and Changes in Anatomy
Portal imaging aids in localizing tumors, particularly
Presentation of my senior project, "A real-time automatic eye tracking system for ophthalmology".
The presentation briefly explains the conventional object-tracking method of template matching based on sum-of-squared differences. We then present a more powerful matching technique, Gradient Orientation Pattern Matching (GOPM), proposed by T. Kondo, and propose an improved version, time-varying GOPM, to address illumination and noise problems.
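The conventional baseline mentioned above, template matching by sum of squared differences (SSD), can be written in a few lines: slide the template over every position and keep the location with the smallest squared error. This is only the baseline method, not the GOPM variants; the toy "pupil" image is invented for illustration.

```python
import numpy as np

def ssd_match(image, template):
    """Exhaustive template matching by sum of squared differences.

    Returns the (row, col) of the top-left corner of the best match.
    Brute-force and O(H*W*h*w); real systems restrict the search window.
    """
    H, W = image.shape
    h, w = template.shape
    best, best_pos = None, None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = image[r:r + h, c:c + w].astype(float)
            ssd = np.sum((patch - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

img = np.zeros((8, 8))
img[3:5, 4:6] = 255           # bright 2x2 "pupil"
tpl = np.full((2, 2), 255.0)
print(ssd_match(img, tpl))    # -> (3, 4)
```

SSD's sensitivity to brightness changes is exactly the illumination problem that motivates gradient-orientation-based matching.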
Application of machine learning and cognitive computing in intrusion detectio... - Mahdi Hosseini Moghaddam
This document describes a proposed hardware-based machine learning intrusion detection system using cognitive processors. It discusses the need for new intrusion detection approaches due to limitations of signature-based methods. The proposed system collects network packet data using a Raspberry Pi and classifies it using a Cognimem CM1K cognitive processor chip, which implements Restricted Coulomb Energy and k-nearest neighbor algorithms. The document outlines the system architecture, data collection and normalization methodology, and analysis of results from testing the CM1K chip on both custom and NSL-KDD network datasets, finding accuracy levels around 70-80% but slower processing times than a software simulation of the chip's algorithms. Future work areas include adding more packet features, using
Quality Assessment for Recognition and Task-based multimedia applications (QART) - Mikołaj Leszczuk
Users who perform tasks with video require sufficient video quality to recognize the information needed for their application. The fundamental measure of video quality in these applications is therefore the success rate of these tasks (such as recognition), referred to as visual intelligibility or acuity. One of the major causes of reduced visual intelligibility is loss of data through various forms of compression. Additionally, the characteristics of the scene being captured have a direct effect on visual intelligibility and on the performance of a compression operation: specifically, the size of the target of interest, the lighting conditions, and the temporal complexity of the scene. The QART project is performing a series of tests to study the effects and interactions of compression and scene characteristics. An additional goal is to test existing objective measurements, or develop new ones, that predict the results of the subjective tests of visual intelligibility.
Can Exposure, Noise and Compression affect Image Recognition? An Assessment o... - Cristiano Rafael Steffens
1) The document evaluates how state-of-the-art convolutional neural networks (CNNs) perform on image recognition tasks when images are exposed to different types of noise, distortions and compression.
2) It finds that while CNN models are robust to mild exposure issues and noise, performance decreases significantly under moderate to severe exposure problems and salt and pepper noise.
3) Larger CNN models like NASNet Large perform best, while smaller mobile models are most affected by distortions. The study aims to improve CNN robustness and build image processing pipelines to handle faulty data.
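Salt-and-pepper noise, one of the corruptions the study found most damaging, is easy to synthesize for robustness probes of this kind: a fraction of pixels is flipped to pure black or white. The sketch below is a generic implementation; the 5% default corruption level is our choice, not a figure from the study.

```python
import numpy as np

def salt_and_pepper(image, amount=0.05, rng=None):
    """Corrupt a grayscale image with salt-and-pepper noise.

    amount: fraction of pixels flipped to pure white ("salt") or pure
    black ("pepper"), split evenly. The default level is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    n = int(amount * image.size)
    ys = rng.integers(0, image.shape[0], size=n)
    xs = rng.integers(0, image.shape[1], size=n)
    noisy[ys[: n // 2], xs[: n // 2]] = 255   # salt
    noisy[ys[n // 2:], xs[n // 2:]] = 0       # pepper
    return noisy
```

Running a classifier on increasingly corrupted copies of a validation set reproduces the kind of degradation curve the study reports.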
These are the slides for the tutorial I gave at the International Conference on Image Processing Theory, Tools & Applications (IPTA 2022) on April 19, 2022.
Current developments in video quality: From the emerging HEVC standard to tem... - Harilaos Koumaras
This document discusses current developments in video quality and the emerging HEVC video coding standard. It provides an overview of HEVC, including its key features such as flexible block structures, larger transform units, and new intra-coding and inter-coding prediction methods. Experimental results show that HEVC can achieve a 32-62% improvement in compression ratio over H.264/AVC while maintaining the same video quality. The document also discusses advances in video quality prediction through enhanced content classification of uncompressed video and improved prediction of quality for compressed video.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates. It constructs a logarithmic linear QoE model. Part III outlines an
The document discusses digital image upscaling techniques from traditional methods to deep learning methods. It covers classical super-resolution methods for images and videos, including interpolation-based, edge-directed, frequency-domain, and example-based methods. It also explains the challenges of super-resolution such as information loss during the digital conversion process.
The document describes a method for glaucoma screening using retinal fundus images. Glaucoma is an irreversible eye disease that can cause vision loss if not detected early. The proposed method uses a novel sparse dissimilarity-constrained coding approach to segment and reconstruct the optic disc from fundus images. Reconstruction coefficients are used to calculate the cup to disc ratio, a metric for detecting glaucoma. The method was tested on 650 images and achieved better accuracy than other methods, with an average error of 0.064 compared to manual measurements. It also achieved good performance in glaucoma screening tests on two datasets. The method shows potential for large-scale population-based glaucoma screening using low-cost retinal imaging.
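The screening metric mentioned above, the cup-to-disc ratio (CDR), can be computed directly once the optic cup and disc have been segmented. The sketch below takes binary masks and compares their vertical extents; this is the generic CDR computation, not the sparse dissimilarity-constrained coding the paper proposes, and the toy masks are invented.

```python
import numpy as np

def cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio (CDR) from binary segmentation masks.

    cup_mask, disc_mask: boolean arrays where True marks the optic
    cup / optic disc. A larger CDR suggests greater glaucoma risk.
    """
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]
        return 0 if rows.size == 0 else rows[-1] - rows[0] + 1
    return vertical_extent(cup_mask) / vertical_extent(disc_mask)

disc = np.zeros((10, 10), bool); disc[1:9, 2:8] = True   # 8 rows tall
cup = np.zeros((10, 10), bool); cup[3:7, 3:7] = True     # 4 rows tall
print(cup_to_disc_ratio(cup, disc))  # -> 0.5
```

In the paper, the reconstruction coefficients of the sparse coding step drive this measurement instead of explicit pixel masks.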
Target Detection and Classification Performance Enhancement using Super-Resol... - sipij
Long-range infrared videos, such as the Defense Systems Information Analysis Center (DSIAC) videos, usually do not have high resolution. In recent years, there have been significant advances in video super-resolution algorithms. Here, we summarize our study on the use of super-resolution videos for target detection and classification. We observed that super-resolution videos can significantly improve detection and classification performance. For example, for 3000 m range videos, we were able to improve the average precision of target detection from 11% (without super-resolution) to 44% (with 4x super-resolution) and the overall accuracy of target classification from 10% (without super-resolution) to 44% (with 2x super-resolution).
Biometric Recognition using Deep Learning - SahithiKotha2
This document discusses biometric recognition using deep learning. It provides an overview of traditional biometric recognition processes and how deep learning has improved biometric recognition. Some key deep learning models for biometric recognition are convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks. Face recognition is discussed as an example application, outlining implementation steps and the use of OpenCV for face recognition. Challenges in biometric recognition using deep learning are also presented.
Video processing involves manipulating and analyzing digital video sequences. Common techniques include trimming, resizing, adjusting brightness/contrast, and analysis using machine learning. Key concepts include compression, frames, frame rate, resolution, and aspect ratio. Compression reduces file sizes while maintaining quality; frames are the still images that make up a video sequence; frame rate determines the smoothness of motion; resolution is the pixel count, which determines detail; and aspect ratio is the width-to-height ratio. Video can be compressed using intra-frame or inter-frame techniques. Enhancement improves quality through techniques such as noise reduction and color correction, while analysis extracts information from the video.
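The way resolution, frame rate, and bit depth multiply together, and hence why compression is essential, can be seen from the raw bitrate of uncompressed video. A small worked example (the 1080p/30 fps/24-bit figures are just common values chosen for illustration):

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bitrate in megabits per second.

    Every pixel of every frame costs bits_per_pixel bits, so the
    bitrate is simply the product of resolution, bit depth and
    frame rate.
    """
    return width * height * bits_per_pixel * fps / 1e6

# 1080p at 30 fps with 24-bit colour:
print(raw_bitrate_mbps(1920, 1080, 30))  # -> 1492.992 Mbps uncompressed
```

Nearly 1.5 Gbps for plain 1080p is why intra-frame and inter-frame compression routinely achieve reductions of two orders of magnitude.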
Finding interesting patterns in data can lead to uncovering new knowledge. New patterns that haven’t occurred before can signify events of interest. Depending on context, these can be called novelties, anomalies, outliers or events. Whatever they are called, they are interesting because they tell a story different from the norm. In this talk, we will call them anomalies. Two diverse applications of anomaly detection are detecting fraudulent credit card transactions and identifying astronomical anomalies such as solar flares.
However, there are many challenges in anomaly detection including high false positive rates and low predictive accuracy. Ensemble learning is a way of combining many algorithms or models to obtain better predictive performance. Anomaly detection is generally an unsupervised task, that is, we do not train models using labelled data. Constructing an unsupervised anomaly detection ensemble is challenging because we do not know the labels. In this talk we discuss two topics in anomaly detection. First, we introduce an anomaly detection ensemble using Item Response Theory (IRT) – a class of models used in educational psychometrics. Using IRT we construct an ensemble that can downplay noisy, non-discriminatory methods and accentuate sharper methods.
Then we explore anomaly detection in computer network security. With cyber incidents and data breaches becoming increasingly common, we have seen a massive increase in computer network attacks over the years. Anomaly detection methods, even though used to detect suspicious behaviour, are criticized for high false positive rates. In addition, computer networks produce a large amount of complex data. We go through the end-to-end process of detecting anomalies in this scenario and show how we can minimize false positives and visualise anomalies developing over time.
Video Compression, Part 4 Section 2, Video Quality Assessment - Dr. Mohieddin Moradi
This document provides information on conducting subjective video quality assessments. It discusses different subjective assessment methods like double stimulus impairment scale (DSIS) and single stimulus continuous quality evaluation (SSCQE). It describes test parameters like number of observers, viewing conditions, grading scales and how to present the results. Guidelines are provided for tasks like screening observers, conducting test sessions, introducing impairments and collecting opinion scores to evaluate video coding standards and compression artifacts.
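When presenting subjective test results, the collected opinion scores for each clip are typically reduced to a mean opinion score (MOS) with a confidence interval. The sketch below uses the normal approximation (1.96 times the standard error) common in such reports; the eight example scores are invented.

```python
import numpy as np

def mos_with_ci(scores):
    """Mean opinion score and 95% confidence half-width for one clip.

    scores: raw opinion scores (e.g. on a 1-5 scale) from the
    screened observers. Uses the normal approximation with the
    sample standard deviation.
    """
    scores = np.asarray(scores, float)
    mos = scores.mean()
    se = scores.std(ddof=1) / np.sqrt(scores.size)
    return mos, 1.96 * se

mos, ci = mos_with_ci([4, 5, 4, 3, 4, 5, 4, 4])
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With few observers the interval is wide, which is why the guidelines call for a minimum number of screened observers per test session.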
Software Defined Networking in the ATMOSPHERE project - ATMOSPHERE
The ATMOSPHERE project aims to develop a federated cloud platform and associated tools to enable trustworthy distributed data processing and management across international borders. Key expected results include a development framework, mechanisms for evaluating and monitoring trustworthiness, and a pilot use case involving medical imaging processing in Brazil. The platform will provide various services while addressing challenges like sensitive data access, privacy, and infrastructure management across multiple cloud providers and regions.
Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE P... - ATMOSPHERE
In this webinar, Francisco Brasileiro and Ignacio Blanquer will discuss the trustworthiness requirements of big-data applications deployed atop cloud infrastructures, and how the ATMOSPHERE platform can be used to handle them. This will be explained using as example a medical application developed in the context of the ATMOSPHERE project, and deployed over a transatlantic federated cloud infrastructure.
More Related Content
Similar to On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual RealityWen-Chih Lo
Published in NOSSDAV'17 on June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the FoVs in the future, or extrapolate future FoVs using the historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a 360° video streaming testbed to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
This paper proposes a blind quality algorithm to analyze streaming video content in 5G networks. The algorithm detects common streaming errors like color degradation, frozen frames, and packet loss. It is included in a "Quality Probe" application that operates as a virtual network function and sends quality reports. The algorithm was tested on sequences with different impairments from a video quality database. It successfully detected packet loss, color errors, and frozen frames. The results validate the algorithm and show the need for intelligent network nodes to monitor quality and adapt transmissions to improve users' experience in 5G networks. Future work includes additional metrics, processing time analysis, and testing in a real network.
Portal Imaging used to clear setup uncertaintyMajoVJJose
Title: Portal Imaging in Radiotherapy: A Comprehensive Exploration of Techniques, Applications, and Advancements
Introduction
Portal imaging is a critical component of modern radiotherapy, playing a pivotal role in the verification and precision of radiation treatment delivery. This technique involves the acquisition of X-ray images during or immediately after a patient's radiotherapy session, providing valuable information on the alignment of the treatment field with the intended target and surrounding critical structures. In this comprehensive exploration, we delve into the principles, techniques, clinical applications, challenges, and future prospects of portal imaging in the context of radiotherapy.
1. Principles of Portal Imaging
Portal imaging is rooted in the principles of verifying and ensuring the accuracy of radiation therapy delivery. Before each treatment fraction, the patient's position is verified to ensure it aligns precisely with the treatment plan. Portal images are acquired using specialized imaging devices, usually in the form of electronic portal imaging devices (EPIDs) or film-based systems. These images serve as a real-time snapshot of the radiation field, allowing clinicians to assess the actual treatment setup against the planned position.
2. Techniques of Portal Imaging
2.1 Electronic Portal Imaging Devices (EPIDs)
Electronic portal imaging devices, or EPIDs, have become a standard tool in portal imaging due to their real-time imaging capabilities and digital nature. EPIDs consist of a detector panel that captures the transmitted radiation through the patient during treatment. The resulting electronic images are immediately available for review, facilitating prompt decision-making regarding the need for adjustments in patient positioning or treatment parameters.
2.2 Film-Based Portal Imaging
Film-based portal imaging, while less commonly used today, has historical significance and is still employed in certain clinical settings. It involves exposing X-ray film positioned behind the patient during treatment. The film is then developed, and the resulting image is analyzed to verify the alignment of the treatment field. Though the process is not as immediate as with EPIDs, film-based systems may still offer advantages in certain situations.
3. Clinical Applications of Portal Imaging
Portal imaging is integral to the success of radiotherapy across various cancer types and treatment modalities.
3.1 Treatment Verification and Positioning
The primary application of portal imaging is to verify the accuracy of patient positioning and the alignment of the treatment field with the intended target volume. Any discrepancies detected through portal images allow for immediate adjustments to be made, ensuring that the radiation is delivered precisely to the targeted area while minimizing exposure to adjacent healthy tissues.
3.2 Tumor Localization and Changes in Anatomy
Portal imaging aids in localizing tumors, particularly
Presentation of my senior Project about "A real time automatic eye tracking system for ophthalmology"
In the presentation, it briefly explains about conventional object tracking method "template matching" based on Sum-of-Square difference. Therefore we also present the powerful matching technique called Gradient Orientation Pattern Matching (GOPM) proposed by T.Kondo and we proposed an improved version of GOPM called time-vary GOPM to solve a illumination and noise problem.
Application of machine learning and cognitive computing in intrusion detectio...Mahdi Hosseini Moghaddam
This document describes a proposed hardware-based machine learning intrusion detection system using cognitive processors. It discusses the need for new intrusion detection approaches due to limitations of signature-based methods. The proposed system collects network packet data using a Raspberry Pi and classifies it using a Cognimem CM1K cognitive processor chip, which implements restricted coulomb energy and k-nearest neighbor algorithms. The document outlines the system architecture, data collection and normalization methodology, and analysis of results from testing the CM1K chip on both custom and NSL-KDD network datasets, finding accuracy levels around 70-80% but slower processing times than a software simulation of the chip's algorithms. Future work areas include adding more packet features, using
Quality Assessment for Recognition and Task-based multimedia applications (QART)Mikołaj Leszczuk
Users of video to perform tasks require sufficient video quality to recognize the information needed for their application. Therefore, the fundamental measure of video quality in these applications is the success rate of these tasks (such as recognition), which is referred to as visual intelligibility or acuity. One of the major causes of reduction of visual intelligibility is loss of data, through various forms of compression. Additionally, the characteristics of the scene being captured have a direct effect on visual intelligibility and on the performance of a compression operation-specifically, the size of the target of interest, the lighting conditions, and the temporal complexity of the scene. The QART project is performing a series of tests to study the effects and interactions of compression and scene characteristics. An additional goal is to test existing or develop new objective measurements that will predict the results of the subjective tests of visual intelligibility.
Can Exposure, Noise and Compression affect Image Recognition? An Assessment o...Cristiano Rafael Steffens
1) The document evaluates how state-of-the-art convolutional neural networks (CNNs) perform on image recognition tasks when images are exposed to different types of noise, distortions and compression.
2) It finds that while CNN models are robust to mild exposure issues and noise, performance decreases significantly under moderate to severe exposure problems and salt and pepper noise.
3) Larger CNN models like NASNet Large perform best, while smaller mobile models are most affected by distortions. The study aims to improve CNN robustness and build image processing pipelines to handle faulty data.
These are the slides for the tutorial I gave at the International Conference on Image Processing Theory, Tools and Applications (IPTA 2022) on April 19, 2022.
Current developments in video quality: From the emerging HEVC standard to tem...Harilaos Koumaras
This document discusses current developments in video quality and the emerging HEVC video coding standard. It provides an overview of HEVC, including its key features such as flexible block structures, larger transform units, and new intra-coding and inter-coding prediction methods. Experimental results show that HEVC can achieve a 32-62% improvement in compression ratio over H.264/AVC while maintaining the same video quality. The document also discusses advances in video quality prediction through enhanced content classification of uncompressed video and improved prediction of quality for compressed video.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates. It constructs a logarithmic linear QoE model. Part III outlines an
The document discusses digital image upscaling techniques from traditional methods to deep learning methods. It covers classical super-resolution methods for images and videos, including interpolation-based, edge-directed, frequency-domain, and example-based methods. It also explains the challenges of super-resolution such as information loss during the digital conversion process.
The document describes a method for glaucoma screening using retinal fundus images. Glaucoma is an irreversible eye disease that can cause vision loss if not detected early. The proposed method uses a novel sparse dissimilarity-constrained coding approach to segment and reconstruct the optic disc from fundus images. Reconstruction coefficients are used to calculate the cup to disc ratio, a metric for detecting glaucoma. The method was tested on 650 images and achieved better accuracy than other methods, with an average error of 0.064 compared to manual measurements. It also achieved good performance in glaucoma screening tests on two datasets. The method shows potential for large-scale population-based glaucoma screening using low-cost retinal imaging.
Target Detection and Classification Performance Enhancement using Super-Resol...sipij
Long-range infrared videos, such as the Defense Systems Information Analysis Center (DSIAC) videos, usually do not have high resolution. In recent years there have been significant advances in video super-resolution algorithms. Here, we summarize our study on the use of super-resolution videos for target detection and classification. We observed that super-resolution videos can significantly improve detection and classification performance. For example, for 3000 m range videos, we were able to improve the average precision of target detection from 11% (without super-resolution) to 44% (with 4x super-resolution) and the overall accuracy of target classification from 10% (without super-resolution) to 44% (with 2x super-resolution).
Biometric Recognition using Deep LearningSahithiKotha2
This document discusses biometric recognition using deep learning. It provides an overview of traditional biometric recognition processes and how deep learning has improved biometric recognition. Some key deep learning models for biometric recognition are convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks. Face recognition is discussed as an example application, outlining implementation steps and the use of OpenCV for face recognition. Challenges in biometric recognition using deep learning are also presented.
Video processing involves manipulating and analyzing digital video sequences. Common techniques include trimming, resizing, adjusting brightness/contrast, and analysis using machine learning. Key concepts include compression, frames, frame rate, resolution, and aspect ratio. Compression reduces file sizes while maintaining quality; frames are the still images that make up a video sequence; frame rate determines smoothness; resolution is the pixel dimensions and determines quality; aspect ratio is the width-to-height ratio. Video can be compressed using intra-frame or inter-frame techniques. Enhancement improves quality using techniques such as noise reduction and color correction, while analysis extracts information from the video.
Finding interesting patterns in data can lead to uncovering new knowledge. New patterns that haven’t occurred before can signify events of interest. Depending on context, these can be called novelties, anomalies, outliers or events. Whatever they are called, they are interesting because they tell a story different from the norm. In this talk, we will call them anomalies. Two diverse applications of anomaly detection are detecting fraudulent credit card transactions and identifying astronomical anomalies such as solar flares.
However, there are many challenges in anomaly detection including high false positive rates and low predictive accuracy. Ensemble learning is a way of combining many algorithms or models to obtain better predictive performance. Anomaly detection is generally an unsupervised task, that is, we do not train models using labelled data. Constructing an unsupervised anomaly detection ensemble is challenging because we do not know the labels. In this talk we discuss two topics in anomaly detection. First, we introduce an anomaly detection ensemble using Item Response Theory (IRT) – a class of models used in educational psychometrics. Using IRT we construct an ensemble that can downplay noisy, non-discriminatory methods and accentuate sharper methods.
Then we explore anomaly detection in computer network security. With cyber incidents and data breaches becoming increasingly common, we have seen a massive increase in computer network attacks over the years. Anomaly detection methods, even though used to detect suspicious behaviour, are criticized for high false positive rates. In addition, computer networks produce a large amount of complex data. We go through the end-to-end process of detecting anomalies in this scenario and show how we can minimize false positives and visualise anomalies developing over time.
Video Compression, Part 4 Section 2, Video Quality Assessment Dr. Mohieddin Moradi
This document provides information on conducting subjective video quality assessments. It discusses different subjective assessment methods like double stimulus impairment scale (DSIS) and single stimulus continuous quality evaluation (SSCQE). It describes test parameters like number of observers, viewing conditions, grading scales and how to present the results. Guidelines are provided for tasks like screening observers, conducting test sessions, introducing impairments and collecting opinion scores to evaluate video coding standards and compression artifacts.
Similar to On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
Software Defined Networking in the ATMOSPHERE projectATMOSPHERE .
The ATMOSPHERE project aims to develop a federated cloud platform and associated tools to enable trustworthy distributed data processing and management across international borders. Key expected results include a development framework, mechanisms for evaluating and monitoring trustworthiness, and a pilot use case involving medical imaging processing in Brazil. The platform will provide various services while addressing challenges like sensitive data access, privacy, and infrastructure management across multiple cloud providers and regions.
Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE P...ATMOSPHERE .
In this webinar, Francisco Brasileiro and Ignacio Blanquer will discuss the trustworthiness requirements of big-data applications deployed atop cloud infrastructures, and how the ATMOSPHERE platform can be used to handle them. This will be explained using as example a medical application developed in the context of the ATMOSPHERE project, and deployed over a transatlantic federated cloud infrastructure.
The document proposes designing an open IoT ecosystem to provide interoperability among existing and new IoT systems. Currently, developers must build all components of an IoT application from end to end. In the future, sensing and actuation systems will already exist. The open ecosystem would allow new systems to utilize existing components. The SWAMP project provides an example of an open IoT ecosystem for smart irrigation applications. Open source code, platforms, services, data, and knowledge are key enablers of such an ecosystem by allowing components and information to be shared.
Cloud Robotics: Cognitive Augmentation for Robots via the CloudATMOSPHERE .
Robot software development is difficult due to the complexity of robot components and low computational intelligence from slow processors. Developing even simple applications requires entire development teams but results are often not what was expected. Robots cannot efficiently run AI services due to these design challenges. Cloud robotics offers a solution by allowing robots to leverage remote computing resources for more advanced capabilities.
Optimization Models for on-demand GPUs in the CloudATMOSPHERE .
This document discusses optimization models for scheduling deep learning jobs on demand GPUs in the cloud. It aims to jointly plan VM capacity and schedule DL training jobs to minimize costs. The proposed model reduces total costs by over 90% compared to FIFO, priority, and EDF scheduling based on preliminary results for multiple node and job simulations. Performance models for predicting GPU-based deep learning applications are described in a referenced paper. The work is co-funded by the European Commission Horizon 2020 program.
The document summarizes the structure of Thematic Groups within the Brazilian Computer Society (SBC). SBC has three types of Thematic Groups organized hierarchically: 1) Major Areas which represent groups of Special Commissions in a thematic area, 2) Special Commissions which group members in a computing subarea, and 3) Interest Groups which are the smallest groups that can be formed with at least 10 members from 3 institutions. Special Commissions evolve from Interest Groups after 3 years and 50 members from 10 institutions. Interest Groups are linked to Special Commissions and require approval to be formed. This structure allows SBC members to connect through common computing interests.
This document outlines the Cloud Computing Interest Group which includes representatives from regulation writing, publicity and interaction, and financing. It discusses statute/regulation, publicizing and interacting with special committees, planned activities for 2019-2020 including WCN and an Interest Group meeting at CSBC 2020, and financing the group's activities.
5G-Range - 5G networks for remote areasATMOSPHERE .
5G-RANGE receives funding from the European Union and Brazil to provide mobile broadband connectivity in remote areas using 5G networks. The project aims to overcome limitations in range for 4G and 5G standards and reduce operational costs by using TV white space in remote areas. 5G-RANGE seeks to increase data rates at cell edges and bring 5G services like mobile broadband and IoT to rural and underserved areas, with a target cell radius of 50 km and data rate of 100 Mbps. It utilizes technologies like MIMO diversity, cognitive radio and software-defined radio to achieve its goals.
NECOS Project: Lightweight Slicing of CloudFederated InfrastructuresATMOSPHERE .
The document discusses a project called NECOS that aims to address limitations of current cloud computing infrastructures. It introduces a new service model called "Slice-as-a-Service" that allows configuration of slices over both network and cloud infrastructure resources. The goal of NECOS is to automate cloud and network configuration by providing uniform management of computing, connectivity, and storage resources based on the Lightweight Slice Defined Cloud concept. Current work includes developing prototypes and defining demonstrations involving IoT and tourism use cases.
SWAMP: Smart Water Management PlatformATMOSPHERE .
The SWAMP project develops IoT approaches for smart water management and precision irrigation. It pilots these approaches in Italy, Spain, and Brazil with the objectives of saving energy in the MATOPIBA region of Brazil, improving wine and grape quality in Guaspari, Brazil, saving water in Intercrop in Spain, and optimizing water distribution in CBEC, Italy. The SWAMP platform utilizes an IoT computing continuum and infrastructure to estimate water needs based on soil measurements, crop health, weather forecasts, climate data, and rain levels to plan and operate irrigation.
This document summarizes a project that received funding from the European Union and Brazil to address childhood obesity through an IoT-based solution. The project aims to promote healthy habits in children using a gamified mobile application supported by sensors and algorithms. It involves a multidisciplinary team that developed and validated the solution with children in schools. The document outlines the business model and assets created, as well as dissemination of results through publications and conferences.
The ATMOSPHERE project is a 24-month European Commission-funded project aiming to develop a platform to support the execution of trustworthy cloud services across multiple cloud providers. The platform will assess the trustworthiness of services and applications based on properties like security, privacy, and fairness. It will also monitor applications at runtime to ensure trustworthiness goals are maintained. The project builds on previous work to provide tools for secure data processing, analytics services, and hybrid cloud resource management. A sustainability plan is being developed to continue using and developing the main assets of the ATMOSPHERE platform after the project's completion.
Trustworthy cloud services for Medical Imaging BiomarkersATMOSPHERE .
This document discusses imaging biomarkers, which are quantifiable parameters extracted from medical images using computational models or AI. It describes how imaging biomarkers can provide information about conditions affecting the neurology, musculoskeletal system, and abdomen. It then discusses the QUIBIM precision platform for anonymizing, viewing, and automatically analyzing medical images to extract these biomarkers. Current infrastructure limitations for high workload are noted. A new high-performance computing infrastructure using containerization, orchestration, and collaboration is proposed to improve performance for large, complex analyses of medical images and biomarkers.
ATMOSPHERE: An architecture for trustworthy cloud servicesATMOSPHERE .
Francisco Brasileiro, ATMOSPHERE Brazilian Coordinator & Federal University of Campina Grande - "ATMOSPHERE: An architecture for trustworthy cloud services”
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features that provide convenience and capability also sacrifice security. This best practices guide outlines steps users can take to better protect personal devices and information.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Building Production Ready Search Pipelines with Spark and Milvus
On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
1. • Rheumatic Heart Disease (RHD) is a heart condition caused by an abnormal immune
response to a streptococcal infection,
• streptococcus: a bacterium normally associated with poor sanitation and
hygiene conditions.
• The burden of RHD is concentrated in low-income countries,
• where health resources are scarce.
• Echocardiographic (echo) screening is the gold standard for diagnosis of latent
RHD;
• but personnel shortages limit broad implementation.
• To address this issue, we aimed to develop a machine-learning model for automatic
RHD identification, to be used in later steps of our screening solution to
prioritize follow-up.
2. Preprocessing phase
• Videos clipped to 16 frames
• Rotation and resizing to 128x171 pixels (the input size required by the chosen DNN)
• Whitening (subtracting from each pixel the mean computed over the videos in the
original training data)
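The three pre-processing steps above can be sketched as follows. This is a minimal NumPy-only sketch: the resize uses nearest-neighbor indexing as a stand-in for whatever interpolation was actually used, and `mean_frame` is assumed to be precomputed over the original training videos.

```python
import numpy as np

def preprocess_video(frames, mean_frame, num_frames=16, out_h=128, out_w=171):
    """Clip, resize (nearest-neighbor), and whiten a video.

    frames: (T, H, W, C) uint8 array of decoded frames.
    mean_frame: (out_h, out_w, C) float32 mean over the training videos.
    """
    clip = frames[:num_frames]                      # keep the first 16 frames
    t, h, w, c = clip.shape
    # Nearest-neighbor resize to 128x171 via integer index maps
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    resized = clip[:, rows][:, :, cols].astype(np.float32)
    # Whitening: subtract the per-pixel mean of the training videos
    return resized - mean_frame
```

Any real pipeline would also handle videos shorter than 16 frames and use proper interpolation; both are omitted here for brevity.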
Video pre-processing
[Figure: frames of a video with and without Doppler, shown before and after whitening]
3. Methodology
• Videos with and without Doppler were considered separately.
• Undersampling to the size of the borderline-RHD class
• Exams are classified directly, i.e., there is no prior view-classification step
• Use of the C3D neural network proposed by Tran et al. [2015], originally
trained on the Sports-1M dataset
• Replaced the classification layer to match the problem modeling we followed
• Fine-tuned the parameters on the training set
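The undersampling step can be sketched as below. This is a pure-Python sketch under stated assumptions: class labels, the fixed seed, and the function name are illustrative, not from the original work.

```python
import random
from collections import defaultdict

def undersample(items, labels, anchor_label, seed=0):
    """Randomly undersample every class down to the size of the
    anchor class (here, the borderline-RHD class)."""
    by_label = defaultdict(list)
    for item, lab in zip(items, labels):
        by_label[lab].append(item)
    n = len(by_label[anchor_label])          # target size per class
    rng = random.Random(seed)
    out = []
    for lab, group in by_label.items():
        rng.shuffle(group)                   # pick a random subset
        for item in group[:n]:
            out.append((item, lab))
    return out
```

This balances the training set so the majority classes cannot dominate the loss, at the cost of discarding data.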
D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks,
ICCV 2015
4. Modified version of the C3D architecture (as shown below)
• Input: 16 frames from a video of an exam;
• 50 epochs with early stopping;
• Batch size of 16;
• Learning rate of 0.001 and a random crop strategy.
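The random-crop strategy can be sketched as below, assuming spatial crops of 112x112 from the 128x171 resized frames, as in the original C3D training recipe of Tran et al.; the exact crop size used in this work is an assumption.

```python
import random
import numpy as np

def random_crop(clip, crop_h=112, crop_w=112, rng=None):
    """Random spatial crop applied identically to all frames of a
    (T, H, W, C) clip; 112x112 is the input size C3D was trained on."""
    rng = rng or random.Random(0)
    t, h, w, c = clip.shape
    top = rng.randint(0, h - crop_h)         # same offsets for every frame,
    left = rng.randint(0, w - crop_w)        # so motion stays consistent
    return clip[:, top:top + crop_h, left:left + crop_w]
```

Cropping the whole clip with one offset (rather than per frame) preserves the temporal coherence the 3D convolutions rely on.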
Network architecture
[Figure: modified C3D pipeline: visual feature extraction followed by a classifier that outputs normal or RHD positive]
5. Preliminary experiments to understand the network's capability to extract visual
features and separate the two classes of interest.
We biased the training to maximize the Borderline accuracy.
Per-video confusion-matrix results for two classes, RHD positive and RHD
negative:
• accuracy: 0.628 (95% CI, 0.573 – 0.682)
• specificity: 0.615 (95% CI, 0.435 – 0.795)
• sensitivity: 0.641 (95% CI, 0.432 – 0.850)
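The point estimates and 95% confidence intervals above are the standard form produced by a normal-approximation (Wald) interval on a proportion, sketched below; the counts in the example are hypothetical, not the study's.

```python
import math

def rate_with_ci(successes, total, z=1.96):
    """Point estimate and 95% Wald confidence interval for a rate
    such as accuracy, sensitivity, or specificity."""
    p = successes / total
    half = z * math.sqrt(p * (1 - p) / total)   # z * standard error
    return p, (p - half, p + half)

# Hypothetical counts: 50 correct out of 100 videos
p, (lo, hi) = rate_with_ci(50, 100)
```

For small counts (as in the sensitivity and specificity cells here, whose intervals are wide), a Wilson or exact interval would be more reliable than the Wald approximation.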
Results per video and two classes
6. • Hyperparameter tuning (Hyperband)
• Take advantage of visual features from the Doppler images;
• Analyze the visual features the networks use to classify the exams (interpretability)
and compare them with those used by doctors;
• Build a network architecture with two arms (see figure below), considering both
Doppler images and raw images from the exams.
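Hyperband's inner loop, successive halving, can be sketched as below. Here `evaluate` stands in for training a model configuration for `budget` epochs and returning a validation score; all names and defaults are illustrative.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """One successive-halving bracket: evaluate all configurations at a
    small budget, keep the best 1/eta fraction, grow the budget by eta,
    and repeat until one configuration remains."""
    budget = min_budget
    while len(configs) > 1:
        scores = [(evaluate(cfg, budget), cfg) for cfg in configs]
        scores.sort(key=lambda s: s[0], reverse=True)   # higher is better
        keep = max(1, len(configs) // eta)
        configs = [cfg for _, cfg in scores[:keep]]
        budget *= eta
    return configs[0]
```

Full Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget.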
Ongoing work
[Figure: proposed two-arm architecture: a Doppler-image arm and a raw-image arm feeding a shared classifier that outputs normal or RHD positive]
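A late-fusion reading of the two-arm design can be sketched as below. This is a NumPy sketch under stated assumptions: each arm is reduced to a precomputed feature vector standing in for a C3D-style extractor, and the shapes, names, and fusion choice are illustrative, not the authors' implementation.

```python
import numpy as np

def two_arm_predict(raw_feat, doppler_feat, W, b):
    """Concatenate the raw-image and Doppler-image arm features and
    apply one linear layer + softmax over {normal, RHD positive}."""
    fused = np.concatenate([raw_feat, doppler_feat])
    logits = W @ fused + b
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()
```

Concatenation is the simplest fusion point; the same skeleton accommodates alternatives such as averaging per-arm logits.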