Erickson Nascimento, Federal University of Minas Gerais - "On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos"
Siamese-rPPG Network: Remote Photoplethysmography Signal Estimation from Face... - ssuserbd51ec
The document presents the Siamese-rPPG Network for estimating remote photoplethysmography (rPPG) signals from face videos. The network uses a Siamese architecture with 3D convolutional layers and weight sharing to learn rPPG signals from two facial regions simultaneously. Evaluation on three datasets shows the network outperforms existing methods for contactless heart rate estimation from video in terms of correlation and error metrics.
The document discusses the development of the Perceptual Quantizer (PQ) tone mapping curve. PQ was designed to efficiently encode high dynamic range content for delivery and display based on properties of human vision. It uses a "worst case engineering" approach where quantization steps are set just below the threshold of perceptible differences over the luminance range. Through modeling contrast sensitivity and testing, the PQ curve was developed to retain image quality while using 12-bits of data. PQ has been adopted as a standard through rigorous evaluation.
Segmenting Medical MRI via Recurrent Decoding Cell - Seunghyun Hwang
Review : Segmenting Medical MRI via Recurrent Decoding Cell
- by Seunghyun Hwang (Yonsei University, Severance Hospital, Center for Clinical Data Science)
Video Classification: Human Action Recognition on HMDB-51 dataset - Giorgio Carbone
Two-stream CNNs for video action recognition using stacked optical flow, implemented in Keras, on the HMDB-51 dataset.
We use a spatial stream CNN (a fine-tuned ResNet-50) and a temporal stream CNN (fed stacked optical flows) under the Keras framework to perform video-based human action recognition on the HMDB-51 dataset.
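As a rough illustration of how the two streams are typically combined, the sketch below averages the per-class softmax scores of the spatial and temporal networks (so-called late fusion). The weights and the three-class toy scores are made up for illustration; the slides do not specify the fusion scheme.

```python
import numpy as np

def fuse_two_stream(spatial_probs, temporal_probs, w_spatial=0.5):
    """Late fusion of per-class softmax scores from the two streams.

    spatial_probs, temporal_probs: arrays of shape (n_classes,).
    w_spatial: weight given to the spatial (RGB) stream; the remainder
    goes to the temporal (stacked optical flow) stream.
    """
    fused = w_spatial * spatial_probs + (1.0 - w_spatial) * temporal_probs
    return int(np.argmax(fused))

# Toy example with 3 hypothetical HMDB-51 classes.
spatial = np.array([0.2, 0.5, 0.3])   # RGB stream favours class 1
temporal = np.array([0.6, 0.1, 0.3])  # flow stream favours class 0
print(fuse_two_stream(spatial, temporal))  # equal weights -> class 0
```

With equal weights the flow stream's stronger evidence for class 0 wins; setting `w_spatial=1.0` would fall back to the spatial stream's prediction.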
The document discusses methods for objective and subjective video quality assessment and speech enhancement. It covers four parts: (1) a classification and review of no-reference visual quality assessment methods, (2) no-reference and reduced-reference methods for video quality assessment including neural network and support vector machine approaches, (3) subjective methods for video quality assessment including studies on low resolution videos and crowdsourcing, and (4) speech enhancement techniques including spectral center-of-gravity based demodulation and convex optimization based demodulation. The document evaluates various computational models and machine learning techniques for video and speech quality assessment.
This document summarizes Juan Pedro López Velasco's thesis work on developing visual attention and perception models for assessing video quality. The work has two main objectives: 1) Predicting visual discomfort in 3D stereoscopic video by analyzing factors like motion, disparity, and parallax changes. 2) Improving 2D video quality metrics by applying visual attention models that weight regions of interest to better correspond to human perception. The work involves conducting subjective testing to determine important quality factors, developing computational models of visual attention, and incorporating these models into new objective metrics to provide more accurate quality assessment.
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality - Wen-Chih Lo
Published in NOSSDAV'17, June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the FoVs in the future, or extrapolate future FoVs using the historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a 360° video streaming testbed to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
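The dead-reckoning baseline the paper compares against can be sketched as a constant-angular-velocity extrapolation of the HMD orientation history. The function below is a minimal illustration for a single yaw angle; the sampling interval and horizon values are made up, and the paper's actual networks additionally use saliency and motion maps.

```python
def dead_reckon_yaw(history, dt, horizon):
    """Extrapolate future yaw from the last two HMD orientation samples.

    history: yaw angles in degrees, sampled every `dt` seconds.
    horizon: how far ahead (in seconds) to predict.
    Assumes constant angular velocity, as in the dead-reckoning
    baseline described above.
    """
    velocity = (history[-1] - history[-2]) / dt   # degrees per second
    predicted = history[-1] + velocity * horizon
    return predicted % 360.0                      # wrap to [0, 360)

yaw_samples = [10.0, 12.0, 14.0]  # head turning at a steady rate
print(dead_reckon_yaw(yaw_samples, dt=0.1, horizon=0.5))  # -> 24.0
```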
This paper proposes a blind quality algorithm to analyze streaming video content in 5G networks. The algorithm detects common streaming errors like color degradation, frozen frames, and packet loss. It is included in a "Quality Probe" application that operates as a virtual network function and sends quality reports. The algorithm was tested on sequences with different impairments from a video quality database. It successfully detected packet loss, color errors, and frozen frames. The results validate the algorithm and show the need for intelligent network nodes to monitor quality and adapt transmissions to improve users' experience in 5G networks. Future work includes additional metrics, processing time analysis, and testing in a real network.
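One of the impairments the algorithm detects, frozen frames, can be flagged by thresholding the mean absolute difference between consecutive frames. The sketch below is a generic illustration of that idea; the threshold and minimum run length are our guesses, not values from the paper.

```python
import numpy as np

def frozen_frame_runs(frames, diff_threshold=1.0, min_run=3):
    """Flag runs of (near-)identical consecutive frames.

    frames: list of 2-D grayscale arrays.
    diff_threshold: mean absolute pixel difference below which two
    consecutive frames count as "frozen" (an assumed value).
    min_run: minimum number of consecutive frozen frames to report.
    Returns a list of (start_index, end_index) runs.
    """
    runs, start = [], None
    for i in range(1, len(frames)):
        mad = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if mad < diff_threshold:
            start = i - 1 if start is None else start
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(frames) - start >= min_run:
        runs.append((start, len(frames) - 1))
    return runs
```

Feeding it four identical frames followed by two changing ones reports a single frozen run covering the first four frames.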
Portal Imaging used to clear setup uncertainty - MajoVJJose
Title: Portal Imaging in Radiotherapy: A Comprehensive Exploration of Techniques, Applications, and Advancements
Introduction
Portal imaging is a critical component of modern radiotherapy, playing a pivotal role in the verification and precision of radiation treatment delivery. This technique involves the acquisition of X-ray images during or immediately after a patient's radiotherapy session, providing valuable information on the alignment of the treatment field with the intended target and surrounding critical structures. In this comprehensive exploration, we delve into the principles, techniques, clinical applications, challenges, and future prospects of portal imaging in the context of radiotherapy.
1. Principles of Portal Imaging
Portal imaging is rooted in the principles of verifying and ensuring the accuracy of radiation therapy delivery. Before each treatment fraction, the patient's position is verified to ensure it aligns precisely with the treatment plan. Portal images are acquired using specialized imaging devices, usually in the form of electronic portal imaging devices (EPIDs) or film-based systems. These images serve as a real-time snapshot of the radiation field, allowing clinicians to assess the actual treatment setup against the planned position.
2. Techniques of Portal Imaging
2.1 Electronic Portal Imaging Devices (EPIDs)
Electronic portal imaging devices, or EPIDs, have become a standard tool in portal imaging due to their real-time imaging capabilities and digital nature. EPIDs consist of a detector panel that captures the transmitted radiation through the patient during treatment. The resulting electronic images are immediately available for review, facilitating prompt decision-making regarding the need for adjustments in patient positioning or treatment parameters.
2.2 Film-Based Portal Imaging
Film-based portal imaging, while less commonly used today, has historical significance and is still employed in certain clinical settings. It involves exposing X-ray film positioned behind the patient during treatment. The film is then developed, and the resulting image is analyzed to verify the alignment of the treatment field. Though the process is not as immediate as with EPIDs, film-based systems may still offer advantages in certain situations.
3. Clinical Applications of Portal Imaging
Portal imaging is integral to the success of radiotherapy across various cancer types and treatment modalities.
3.1 Treatment Verification and Positioning
The primary application of portal imaging is to verify the accuracy of patient positioning and the alignment of the treatment field with the intended target volume. Any discrepancies detected through portal images allow for immediate adjustments to be made, ensuring that the radiation is delivered precisely to the targeted area while minimizing exposure to adjacent healthy tissues.
3.2 Tumor Localization and Changes in Anatomy
Portal imaging aids in localizing tumors, particularly
Presentation of my senior project, "A real-time automatic eye tracking system for ophthalmology".
The presentation briefly explains the conventional object-tracking method of template matching based on sum-of-squared differences. We then present a more powerful matching technique, Gradient Orientation Pattern Matching (GOPM), proposed by T. Kondo, and propose an improved version, time-varying GOPM, to address illumination and noise problems.
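The conventional baseline mentioned above, template matching by sum of squared differences (SSD), can be written in a few lines: slide the template over every position and keep the location with the smallest squared error. This is only the baseline method, not the GOPM variants; the toy "pupil" image is invented for illustration.

```python
import numpy as np

def ssd_match(image, template):
    """Exhaustive template matching by sum of squared differences.

    Returns the (row, col) of the top-left corner of the best match.
    Brute-force and O(H*W*h*w); real systems restrict the search window.
    """
    H, W = image.shape
    h, w = template.shape
    best, best_pos = None, None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = image[r:r + h, c:c + w].astype(float)
            ssd = np.sum((patch - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

img = np.zeros((8, 8))
img[3:5, 4:6] = 255           # bright 2x2 "pupil"
tpl = np.full((2, 2), 255.0)
print(ssd_match(img, tpl))    # -> (3, 4)
```

SSD's sensitivity to brightness changes is exactly the illumination problem that motivates gradient-orientation-based matching.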
Application of machine learning and cognitive computing in intrusion detectio... - Mahdi Hosseini Moghaddam
This document describes a proposed hardware-based machine learning intrusion detection system using cognitive processors. It discusses the need for new intrusion detection approaches due to limitations of signature-based methods. The proposed system collects network packet data using a Raspberry Pi and classifies it using a Cognimem CM1K cognitive processor chip, which implements Restricted Coulomb Energy and k-nearest neighbor algorithms. The document outlines the system architecture, data collection and normalization methodology, and analysis of results from testing the CM1K chip on both custom and NSL-KDD network datasets, finding accuracy levels around 70-80% but slower processing times than a software simulation of the chip's algorithms. Future work areas include adding more packet features, using
Quality Assessment for Recognition and Task-based multimedia applications (QART) - Mikołaj Leszczuk
Users who perform tasks with video require sufficient video quality to recognize the information needed for their application. The fundamental measure of video quality in these applications is therefore the success rate of these tasks (such as recognition), referred to as visual intelligibility or acuity. One of the major causes of reduced visual intelligibility is loss of data through various forms of compression. Additionally, the characteristics of the scene being captured have a direct effect on visual intelligibility and on the performance of a compression operation: specifically, the size of the target of interest, the lighting conditions, and the temporal complexity of the scene. The QART project is performing a series of tests to study the effects and interactions of compression and scene characteristics. An additional goal is to test existing objective measurements, or develop new ones, that predict the results of the subjective tests of visual intelligibility.
Can Exposure, Noise and Compression affect Image Recognition? An Assessment o... - Cristiano Rafael Steffens
1) The document evaluates how state-of-the-art convolutional neural networks (CNNs) perform on image recognition tasks when images are exposed to different types of noise, distortions and compression.
2) It finds that while CNN models are robust to mild exposure issues and noise, performance decreases significantly under moderate to severe exposure problems and salt and pepper noise.
3) Larger CNN models like NASNet Large perform best, while smaller mobile models are most affected by distortions. The study aims to improve CNN robustness and build image processing pipelines to handle faulty data.
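Salt-and-pepper noise, one of the corruptions the study found most damaging, is easy to synthesize for robustness probes of this kind: a fraction of pixels is flipped to pure black or white. The sketch below is a generic implementation; the 5% default corruption level is our choice, not a figure from the study.

```python
import numpy as np

def salt_and_pepper(image, amount=0.05, rng=None):
    """Corrupt a grayscale image with salt-and-pepper noise.

    amount: fraction of pixels flipped to pure white ("salt") or pure
    black ("pepper"), split evenly. The default level is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    n = int(amount * image.size)
    ys = rng.integers(0, image.shape[0], size=n)
    xs = rng.integers(0, image.shape[1], size=n)
    noisy[ys[: n // 2], xs[: n // 2]] = 255   # salt
    noisy[ys[n // 2:], xs[n // 2:]] = 0       # pepper
    return noisy
```

Running a classifier on increasingly corrupted copies of a validation set reproduces the kind of degradation curve the study reports.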
These are the slides for the tutorial I gave at the International Conference on Image Processing Theory, Tools & Applications (IPTA 2022) on April 19, 2022.
Current developments in video quality: From the emerging HEVC standard to tem... - Harilaos Koumaras
This document discusses current developments in video quality and the emerging HEVC video coding standard. It provides an overview of HEVC, including its key features such as flexible block structures, larger transform units, and new intra-coding and inter-coding prediction methods. Experimental results show that HEVC can achieve a 32-62% improvement in compression ratio over H.264/AVC while maintaining the same video quality. The document also discusses advances in video quality prediction through enhanced content classification of uncompressed video and improved prediction of quality for compressed video.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates. It constructs a logarithmic linear QoE model. Part III outlines an
The document discusses digital image upscaling techniques from traditional methods to deep learning methods. It covers classical super-resolution methods for images and videos, including interpolation-based, edge-directed, frequency-domain, and example-based methods. It also explains the challenges of super-resolution such as information loss during the digital conversion process.
The document describes a method for glaucoma screening using retinal fundus images. Glaucoma is an irreversible eye disease that can cause vision loss if not detected early. The proposed method uses a novel sparse dissimilarity-constrained coding approach to segment and reconstruct the optic disc from fundus images. Reconstruction coefficients are used to calculate the cup to disc ratio, a metric for detecting glaucoma. The method was tested on 650 images and achieved better accuracy than other methods, with an average error of 0.064 compared to manual measurements. It also achieved good performance in glaucoma screening tests on two datasets. The method shows potential for large-scale population-based glaucoma screening using low-cost retinal imaging.
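The screening metric mentioned above, the cup-to-disc ratio (CDR), can be computed directly once the optic cup and disc have been segmented. The sketch below takes binary masks and compares their vertical extents; this is the generic CDR computation, not the sparse dissimilarity-constrained coding the paper proposes, and the toy masks are invented.

```python
import numpy as np

def cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio (CDR) from binary segmentation masks.

    cup_mask, disc_mask: boolean arrays where True marks the optic
    cup / optic disc. A larger CDR suggests greater glaucoma risk.
    """
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]
        return 0 if rows.size == 0 else rows[-1] - rows[0] + 1
    return vertical_extent(cup_mask) / vertical_extent(disc_mask)

disc = np.zeros((10, 10), bool); disc[1:9, 2:8] = True   # 8 rows tall
cup = np.zeros((10, 10), bool); cup[3:7, 3:7] = True     # 4 rows tall
print(cup_to_disc_ratio(cup, disc))  # -> 0.5
```

In the paper, the reconstruction coefficients of the sparse coding step drive this measurement instead of explicit pixel masks.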
Target Detection and Classification Performance Enhancement using Super-Resol... - sipij
Long-range infrared videos, such as the Defense Systems Information Analysis Center (DSIAC) videos, usually do not have high resolution. In recent years, there have been significant advances in video super-resolution algorithms. Here, we summarize our study on the use of super-resolution videos for target detection and classification. We observed that super-resolution videos can significantly improve detection and classification performance. For example, for 3000 m range videos, we were able to improve the average precision of target detection from 11% (without super-resolution) to 44% (with 4x super-resolution) and the overall accuracy of target classification from 10% (without super-resolution) to 44% (with 2x super-resolution).
Biometric Recognition using Deep Learning - SahithiKotha2
This document discusses biometric recognition using deep learning. It provides an overview of traditional biometric recognition processes and how deep learning has improved biometric recognition. Some key deep learning models for biometric recognition are convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks. Face recognition is discussed as an example application, outlining implementation steps and the use of OpenCV for face recognition. Challenges in biometric recognition using deep learning are also presented.
Video processing involves manipulating and analyzing digital video sequences. Common techniques include trimming, resizing, adjusting brightness/contrast, and analysis using machine learning. Key concepts include compression, frames, frame rate, resolution, and aspect ratio. Compression reduces file sizes while maintaining quality; frames are the still images that make up a video sequence; frame rate determines the smoothness of motion; resolution is the pixel count, which determines detail; and aspect ratio is the width-to-height ratio. Video can be compressed using intra-frame or inter-frame techniques. Enhancement improves quality through techniques such as noise reduction and color correction, while analysis extracts information from the video.
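The way resolution, frame rate, and bit depth multiply together, and hence why compression is essential, can be seen from the raw bitrate of uncompressed video. A small worked example (the 1080p/30 fps/24-bit figures are just common values chosen for illustration):

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bitrate in megabits per second.

    Every pixel of every frame costs bits_per_pixel bits, so the
    bitrate is simply the product of resolution, bit depth and
    frame rate.
    """
    return width * height * bits_per_pixel * fps / 1e6

# 1080p at 30 fps with 24-bit colour:
print(raw_bitrate_mbps(1920, 1080, 30))  # -> 1492.992 Mbps uncompressed
```

Nearly 1.5 Gbps for plain 1080p is why intra-frame and inter-frame compression routinely achieve reductions of two orders of magnitude.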
Finding interesting patterns in data can lead to uncovering new knowledge. New patterns that haven’t occurred before can signify events of interest. Depending on context, these can be called novelties, anomalies, outliers or events. Whatever they are called, they are interesting because they tell a story different from the norm. In this talk, we will call them anomalies. Two diverse applications of anomaly detection are detecting fraudulent credit card transactions and identifying astronomical anomalies such as solar flares.
However, there are many challenges in anomaly detection including high false positive rates and low predictive accuracy. Ensemble learning is a way of combining many algorithms or models to obtain better predictive performance. Anomaly detection is generally an unsupervised task, that is, we do not train models using labelled data. Constructing an unsupervised anomaly detection ensemble is challenging because we do not know the labels. In this talk we discuss two topics in anomaly detection. First, we introduce an anomaly detection ensemble using Item Response Theory (IRT) – a class of models used in educational psychometrics. Using IRT we construct an ensemble that can downplay noisy, non-discriminatory methods and accentuate sharper methods.
Then we explore anomaly detection in computer network security. With cyber incidents and data breaches becoming increasingly common, we have seen a massive increase in computer network attacks over the years. Anomaly detection methods, even though used to detect suspicious behaviour, are criticized for high false positive rates. In addition, computer networks produce a large amount of complex data. We go through the end-to-end process of detecting anomalies in this scenario and show how we can minimize false positives and visualise anomalies developing over time.
Video Compression, Part 4 Section 2, Video Quality Assessment - Dr. Mohieddin Moradi
This document provides information on conducting subjective video quality assessments. It discusses different subjective assessment methods like double stimulus impairment scale (DSIS) and single stimulus continuous quality evaluation (SSCQE). It describes test parameters like number of observers, viewing conditions, grading scales and how to present the results. Guidelines are provided for tasks like screening observers, conducting test sessions, introducing impairments and collecting opinion scores to evaluate video coding standards and compression artifacts.
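When presenting subjective test results, the collected opinion scores for each clip are typically reduced to a mean opinion score (MOS) with a confidence interval. The sketch below uses the normal approximation (1.96 times the standard error) common in such reports; the eight example scores are invented.

```python
import numpy as np

def mos_with_ci(scores):
    """Mean opinion score and 95% confidence half-width for one clip.

    scores: raw opinion scores (e.g. on a 1-5 scale) from the
    screened observers. Uses the normal approximation with the
    sample standard deviation.
    """
    scores = np.asarray(scores, float)
    mos = scores.mean()
    se = scores.std(ddof=1) / np.sqrt(scores.size)
    return mos, 1.96 * se

mos, ci = mos_with_ci([4, 5, 4, 3, 4, 5, 4, 4])
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With few observers the interval is wide, which is why the guidelines call for a minimum number of screened observers per test session.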
Software Defined Networking in the ATMOSPHERE project - ATMOSPHERE
The ATMOSPHERE project aims to develop a federated cloud platform and associated tools to enable trustworthy distributed data processing and management across international borders. Key expected results include a development framework, mechanisms for evaluating and monitoring trustworthiness, and a pilot use case involving medical imaging processing in Brazil. The platform will provide various services while addressing challenges like sensitive data access, privacy, and infrastructure management across multiple cloud providers and regions.
Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE P... - ATMOSPHERE
In this webinar, Francisco Brasileiro and Ignacio Blanquer will discuss the trustworthiness requirements of big-data applications deployed atop cloud infrastructures, and how the ATMOSPHERE platform can be used to handle them. This will be explained using as example a medical application developed in the context of the ATMOSPHERE project, and deployed over a transatlantic federated cloud infrastructure.
More Related Content
Similar to On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual RealityWen-Chih Lo
Published in NOSSDAV'17 on June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the FoVs in the future, or extrapolate future FoVs using the historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a 360° video streaming testbed to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
This paper proposes a blind quality algorithm to analyze streaming video content in 5G networks. The algorithm detects common streaming errors like color degradation, frozen frames, and packet loss. It is included in a "Quality Probe" application that operates as a virtual network function and sends quality reports. The algorithm was tested on sequences with different impairments from a video quality database. It successfully detected packet loss, color errors, and frozen frames. The results validate the algorithm and show the need for intelligent network nodes to monitor quality and adapt transmissions to improve users' experience in 5G networks. Future work includes additional metrics, processing time analysis, and testing in a real network.
Portal Imaging used to clear setup uncertaintyMajoVJJose
Title: Portal Imaging in Radiotherapy: A Comprehensive Exploration of Techniques, Applications, and Advancements
Introduction
Portal imaging is a critical component of modern radiotherapy, playing a pivotal role in the verification and precision of radiation treatment delivery. This technique involves the acquisition of X-ray images during or immediately after a patient's radiotherapy session, providing valuable information on the alignment of the treatment field with the intended target and surrounding critical structures. In this comprehensive exploration, we delve into the principles, techniques, clinical applications, challenges, and future prospects of portal imaging in the context of radiotherapy.
1. Principles of Portal Imaging
Portal imaging is rooted in the principles of verifying and ensuring the accuracy of radiation therapy delivery. Before each treatment fraction, the patient's position is verified to ensure it aligns precisely with the treatment plan. Portal images are acquired using specialized imaging devices, usually in the form of electronic portal imaging devices (EPIDs) or film-based systems. These images serve as a real-time snapshot of the radiation field, allowing clinicians to assess the actual treatment setup against the planned position.
2. Techniques of Portal Imaging
2.1 Electronic Portal Imaging Devices (EPIDs)
Electronic portal imaging devices, or EPIDs, have become a standard tool in portal imaging due to their real-time imaging capabilities and digital nature. EPIDs consist of a detector panel that captures the transmitted radiation through the patient during treatment. The resulting electronic images are immediately available for review, facilitating prompt decision-making regarding the need for adjustments in patient positioning or treatment parameters.
2.2 Film-Based Portal Imaging
Film-based portal imaging, while less commonly used today, has historical significance and is still employed in certain clinical settings. It involves exposing X-ray film positioned behind the patient during treatment. The film is then developed, and the resulting image is analyzed to verify the alignment of the treatment field. Though the process is not as immediate as with EPIDs, film-based systems may still offer advantages in certain situations.
3. Clinical Applications of Portal Imaging
Portal imaging is integral to the success of radiotherapy across various cancer types and treatment modalities.
3.1 Treatment Verification and Positioning
The primary application of portal imaging is to verify the accuracy of patient positioning and the alignment of the treatment field with the intended target volume. Any discrepancies detected through portal images allow for immediate adjustments to be made, ensuring that the radiation is delivered precisely to the targeted area while minimizing exposure to adjacent healthy tissues.
3.2 Tumor Localization and Changes in Anatomy
Portal imaging aids in localizing tumors, particularly
Presentation of my senior Project about "A real time automatic eye tracking system for ophthalmology"
In the presentation, it briefly explains about conventional object tracking method "template matching" based on Sum-of-Square difference. Therefore we also present the powerful matching technique called Gradient Orientation Pattern Matching (GOPM) proposed by T.Kondo and we proposed an improved version of GOPM called time-vary GOPM to solve a illumination and noise problem.
Application of machine learning and cognitive computing in intrusion detectio...Mahdi Hosseini Moghaddam
This document describes a proposed hardware-based machine learning intrusion detection system using cognitive processors. It discusses the need for new intrusion detection approaches due to limitations of signature-based methods. The proposed system collects network packet data using a Raspberry Pi and classifies it using a Cognimem CM1K cognitive processor chip, which implements restricted coulomb energy and k-nearest neighbor algorithms. The document outlines the system architecture, data collection and normalization methodology, and analysis of results from testing the CM1K chip on both custom and NSL-KDD network datasets, finding accuracy levels around 70-80% but slower processing times than a software simulation of the chip's algorithms. Future work areas include adding more packet features, using
Quality Assessment for Recognition and Task-based multimedia applications (QART)Mikołaj Leszczuk
Users of video to perform tasks require sufficient video quality to recognize the information needed for their application. Therefore, the fundamental measure of video quality in these applications is the success rate of these tasks (such as recognition), which is referred to as visual intelligibility or acuity. One of the major causes of reduction of visual intelligibility is loss of data, through various forms of compression. Additionally, the characteristics of the scene being captured have a direct effect on visual intelligibility and on the performance of a compression operation-specifically, the size of the target of interest, the lighting conditions, and the temporal complexity of the scene. The QART project is performing a series of tests to study the effects and interactions of compression and scene characteristics. An additional goal is to test existing or develop new objective measurements that will predict the results of the subjective tests of visual intelligibility.
Can Exposure, Noise and Compression affect Image Recognition? An Assessment o...Cristiano Rafael Steffens
1) The document evaluates how state-of-the-art convolutional neural networks (CNNs) perform on image recognition tasks when images are exposed to different types of noise, distortions and compression.
2) It finds that while CNN models are robust to mild exposure issues and noise, performance decreases significantly under moderate to severe exposure problems and salt and pepper noise.
3) Larger CNN models like NASNet Large perform best, while smaller mobile models are most affected by distortions. The study aims to improve CNN robustness and build image processing pipelines to handle faulty data.
These are the slides for the tutorial I gave at the International Conference on Image Processing Theory, Tools and Applications (IPTA 2022) on April 19, 2022.
Current developments in video quality: From the emerging HEVC standard to tem...Harilaos Koumaras
This document discusses current developments in video quality and the emerging HEVC video coding standard. It provides an overview of HEVC, including its key features such as flexible block structures, larger transform units, and new intra-coding and inter-coding prediction methods. Experimental results show that HEVC can achieve a 32-62% improvement in compression ratio over H.264/AVC while maintaining the same video quality. The document also discusses advances in video quality prediction through enhanced content classification of uncompressed video and improved prediction of quality for compressed video.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates. It constructs a logarithmic linear QoE model. Part III outlines an
The document discusses digital image upscaling techniques from traditional methods to deep learning methods. It covers classical super-resolution methods for images and videos, including interpolation-based, edge-directed, frequency-domain, and example-based methods. It also explains the challenges of super-resolution such as information loss during the digital conversion process.
The document describes a method for glaucoma screening using retinal fundus images. Glaucoma is an irreversible eye disease that can cause vision loss if not detected early. The proposed method uses a novel sparse dissimilarity-constrained coding approach to segment and reconstruct the optic disc from fundus images. Reconstruction coefficients are used to calculate the cup to disc ratio, a metric for detecting glaucoma. The method was tested on 650 images and achieved better accuracy than other methods, with an average error of 0.064 compared to manual measurements. It also achieved good performance in glaucoma screening tests on two datasets. The method shows potential for large-scale population-based glaucoma screening using low-cost retinal imaging.
Target Detection and Classification Performance Enhancement using Super-Resol...sipij
Long-range infrared videos, such as the Defense Systems Information Analysis Center (DSIAC) videos, usually do not have high resolution. In recent years there have been significant advances in video super-resolution algorithms. Here, we summarize our study on the use of super-resolution videos for target detection and classification. We observed that super-resolution videos can significantly improve detection and classification performance. For example, for 3000 m range videos, we were able to improve the average precision of target detection from 11% (without super-resolution) to 44% (with 4x super-resolution) and the overall accuracy of target classification from 10% (without super-resolution) to 44% (with 2x super-resolution).
Biometric Recognition using Deep LearningSahithiKotha2
This document discusses biometric recognition using deep learning. It provides an overview of traditional biometric recognition processes and how deep learning has improved biometric recognition. Some key deep learning models for biometric recognition are convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks. Face recognition is discussed as an example application, outlining implementation steps and the use of OpenCV for face recognition. Challenges in biometric recognition using deep learning are also presented.
Video processing involves manipulating and analyzing digital video sequences. Common techniques include trimming, resizing, adjusting brightness/contrast, and analysis using machine learning. Key concepts include compression, frames, frame rate, resolution, and aspect ratio. Compression reduces file sizes while maintaining quality; frames are the still images that make up a video sequence; frame rate determines smoothness; resolution is the pixel dimensions and determines quality; aspect ratio is the width-to-height ratio. Video can be compressed using intra-frame or inter-frame techniques. Enhancement improves quality using techniques such as noise reduction and color correction, while analysis extracts information from the video.
Finding interesting patterns in data can lead to uncovering new knowledge. New patterns that haven’t occurred before can signify events of interest. Depending on context, these can be called novelties, anomalies, outliers or events. Whatever they are called, they are interesting because they tell a story different from the norm. In this talk, we will call them anomalies. Two diverse applications of anomaly detection are detecting fraudulent credit card transactions and identifying astronomical anomalies such as solar flares.
However, there are many challenges in anomaly detection including high false positive rates and low predictive accuracy. Ensemble learning is a way of combining many algorithms or models to obtain better predictive performance. Anomaly detection is generally an unsupervised task, that is, we do not train models using labelled data. Constructing an unsupervised anomaly detection ensemble is challenging because we do not know the labels. In this talk we discuss two topics in anomaly detection. First, we introduce an anomaly detection ensemble using Item Response Theory (IRT) – a class of models used in educational psychometrics. Using IRT we construct an ensemble that can downplay noisy, non-discriminatory methods and accentuate sharper methods.
Then we explore anomaly detection in computer network security. With cyber incidents and data breaches becoming increasingly common, we have seen a massive increase in computer network attacks over the years. Anomaly detection methods, even though used to detect suspicious behaviour, are criticized for high false positive rates. In addition, computer networks produce a large amount of complex data. We go through the end-to-end process of detecting anomalies in this scenario and show how we can minimize false positives and visualise anomalies developing over time.
Video Compression, Part 4 Section 2, Video Quality Assessment Dr. Mohieddin Moradi
This document provides information on conducting subjective video quality assessments. It discusses different subjective assessment methods like double stimulus impairment scale (DSIS) and single stimulus continuous quality evaluation (SSCQE). It describes test parameters like number of observers, viewing conditions, grading scales and how to present the results. Guidelines are provided for tasks like screening observers, conducting test sessions, introducing impairments and collecting opinion scores to evaluate video coding standards and compression artifacts.
Similar to On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
Software Defined Networking in the ATMOSPHERE projectATMOSPHERE .
The ATMOSPHERE project aims to develop a federated cloud platform and associated tools to enable trustworthy distributed data processing and management across international borders. Key expected results include a development framework, mechanisms for evaluating and monitoring trustworthiness, and a pilot use case involving medical imaging processing in Brazil. The platform will provide various services while addressing challenges like sensitive data access, privacy, and infrastructure management across multiple cloud providers and regions.
Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE P...ATMOSPHERE .
In this webinar, Francisco Brasileiro and Ignacio Blanquer will discuss the trustworthiness requirements of big-data applications deployed atop cloud infrastructures, and how the ATMOSPHERE platform can be used to handle them. This will be explained using as example a medical application developed in the context of the ATMOSPHERE project, and deployed over a transatlantic federated cloud infrastructure.
The document proposes designing an open IoT ecosystem to provide interoperability among existing and new IoT systems. Currently, developers must build all components of an IoT application from end to end. In the future, sensing and actuation systems will already exist. The open ecosystem would allow new systems to utilize existing components. The SWAMP project provides an example of an open IoT ecosystem for smart irrigation applications. Open source code, platforms, services, data, and knowledge are key enablers of such an ecosystem by allowing components and information to be shared.
Cloud Robotics: Cognitive Augmentation for Robots via the CloudATMOSPHERE .
Robot software development is difficult due to the complexity of robot components and low computational intelligence from slow processors. Developing even simple applications requires entire development teams but results are often not what was expected. Robots cannot efficiently run AI services due to these design challenges. Cloud robotics offers a solution by allowing robots to leverage remote computing resources for more advanced capabilities.
Optimization Models for on-demand GPUs in the CloudATMOSPHERE .
This document discusses optimization models for scheduling deep learning jobs on demand GPUs in the cloud. It aims to jointly plan VM capacity and schedule DL training jobs to minimize costs. The proposed model reduces total costs by over 90% compared to FIFO, priority, and EDF scheduling based on preliminary results for multiple node and job simulations. Performance models for predicting GPU-based deep learning applications are described in a referenced paper. The work is co-funded by the European Commission Horizon 2020 program.
The document summarizes the structure of Thematic Groups within the Brazilian Computer Society (SBC). SBC has three types of Thematic Groups organized hierarchically: 1) Major Areas which represent groups of Special Commissions in a thematic area, 2) Special Commissions which group members in a computing subarea, and 3) Interest Groups which are the smallest groups that can be formed with at least 10 members from 3 institutions. Special Commissions evolve from Interest Groups after 3 years and 50 members from 10 institutions. Interest Groups are linked to Special Commissions and require approval to be formed. This structure allows SBC members to connect through common computing interests.
This document outlines the Cloud Computing Interest Group which includes representatives from regulation writing, publicity and interaction, and financing. It discusses statute/regulation, publicizing and interacting with special committees, planned activities for 2019-2020 including WCN and an Interest Group meeting at CSBC 2020, and financing the group's activities.
5G-Range - 5G networks for remote areasATMOSPHERE .
5G-RANGE receives funding from the European Union and Brazil to provide mobile broadband connectivity in remote areas using 5G networks. The project aims to overcome limitations in range for 4G and 5G standards and reduce operational costs by using TV white space in remote areas. 5G-RANGE seeks to increase data rates at cell edges and bring 5G services like mobile broadband and IoT to rural and underserved areas, with a target cell radius of 50 km and data rate of 100 Mbps. It utilizes technologies like MIMO diversity, cognitive radio and software-defined radio to achieve its goals.
NECOS Project: Lightweight Slicing of CloudFederated InfrastructuresATMOSPHERE .
The document discusses a project called NECOS that aims to address limitations of current cloud computing infrastructures. It introduces a new service model called "Slice-as-a-Service" that allows configuration of slices over both network and cloud infrastructure resources. The goal of NECOS is to automate cloud and network configuration by providing uniform management of computing, connectivity, and storage resources based on the Lightweight Slice Defined Cloud concept. Current work includes developing prototypes and defining demonstrations involving IoT and tourism use cases.
SWAMP: Smart Water Management PlatformATMOSPHERE .
The SWAMP project develops IoT approaches for smart water management and precision irrigation. It pilots these approaches in Italy, Spain, and Brazil with the objectives of saving energy in the MATOPIBA region of Brazil, improving wine and grape quality in Guaspari, Brazil, saving water in Intercrop in Spain, and optimizing water distribution in CBEC, Italy. The SWAMP platform utilizes an IoT computing continuum and infrastructure to estimate water needs based on soil measurements, crop health, weather forecasts, climate data, and rain levels to plan and operate irrigation.
This document summarizes a project that received funding from the European Union and Brazil to address childhood obesity through an IoT-based solution. The project aims to promote healthy habits in children using a gamified mobile application supported by sensors and algorithms. It involves a multidisciplinary team that developed and validated the solution with children in schools. The document outlines the business model and assets created, as well as dissemination of results through publications and conferences.
The ATMOSPHERE project is a 24-month European Commission-funded project aiming to develop a platform to support the execution of trustworthy cloud services across multiple cloud providers. The platform will assess the trustworthiness of services and applications based on properties like security, privacy, and fairness. It will also monitor applications at runtime to ensure trustworthiness goals are maintained. The project builds on previous work to provide tools for secure data processing, analytics services, and hybrid cloud resource management. A sustainability plan is being developed to continue using and developing the main assets of the ATMOSPHERE platform after the project's completion.
Trustworthy cloud services for Medical Imaging BiomarkersATMOSPHERE .
This document discusses imaging biomarkers, which are quantifiable parameters extracted from medical images using computational models or AI. It describes how imaging biomarkers can provide information about conditions affecting the neurology, musculoskeletal system, and abdomen. It then discusses the QUIBIM precision platform for anonymizing, viewing, and automatically analyzing medical images to extract these biomarkers. Current infrastructure limitations for high workload are noted. A new high-performance computing infrastructure using containerization, orchestration, and collaboration is proposed to improve performance for large, complex analyses of medical images and biomarkers.
ATMOSPHERE: An architecture for trustworthy cloud servicesATMOSPHERE .
Francisco Brasileiro, ATMOSPHERE Brazilian Coordinator & Federal University of Campina Grande - "ATMOSPHERE: An architecture for trustworthy cloud services”
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features that provide convenience and capability also sacrifice security. This best practices guide outlines steps users can take to better protect personal devices and information.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Building Production Ready Search Pipelines with Spark and Milvus
On the development of a Visual-Temporal-awareness Rheumatic Heart Disease classifier for Echocardiographic Videos
1. • Rheumatic Heart Disease (RHD) is a heart condition caused by an abnormal immune
response to a streptococcal infection,
• streptococcus: a bacterium normally associated with poor sanitation and
hygiene conditions.
• The burden of RHD is concentrated in low-income countries,
• where health resources are scarce.
• Echocardiographic (echo) screening is the gold standard for diagnosis of latent
RHD;
• but personnel shortages limit broad implementation.
• To address this issue, we aimed to develop a machine-learning model for automatic
RHD identification, to be used in later steps of our screening solution to
prioritize follow-up.
2. Preprocessing phase
• Videos clipped to 16 frames
• Rotation and resizing to 128x171 pixels (the input size required by the chosen DNN)
• Whitening (subtracting from each pixel the mean computed over the videos in the
original training data)
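The three pre-processing steps above can be sketched as follows. This is a minimal NumPy-only sketch: the resize uses nearest-neighbor indexing as a stand-in for whatever interpolation was actually used, and `mean_frame` is assumed to be precomputed over the original training videos.

```python
import numpy as np

def preprocess_video(frames, mean_frame, num_frames=16, out_h=128, out_w=171):
    """Clip, resize (nearest-neighbor), and whiten a video.

    frames: (T, H, W, C) uint8 array of decoded frames.
    mean_frame: (out_h, out_w, C) float32 mean over the training videos.
    """
    clip = frames[:num_frames]                      # keep the first 16 frames
    t, h, w, c = clip.shape
    # Nearest-neighbor resize to 128x171 via integer index maps
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    resized = clip[:, rows][:, :, cols].astype(np.float32)
    # Whitening: subtract the per-pixel mean of the training videos
    return resized - mean_frame
```

Any real pipeline would also handle videos shorter than 16 frames and use proper interpolation; both are omitted here for brevity.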
Video pre-processing
[Figure: frames of a video with and without Doppler, shown before and after whitening]
3. Methodology
• Videos with and without Doppler were considered separately.
• Undersampling to the size of the borderline-RHD class
• Exams are classified directly, i.e., there is no prior view-classification step
• Use of the C3D neural network proposed by Tran et al. [2015], originally
trained on the Sports-1M dataset
• Replaced the classification layer to match the problem modeling we followed
• Fine-tuned the parameters on the training set
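The undersampling step can be sketched as below. This is a pure-Python sketch under stated assumptions: class labels, the fixed seed, and the function name are illustrative, not from the original work.

```python
import random
from collections import defaultdict

def undersample(items, labels, anchor_label, seed=0):
    """Randomly undersample every class down to the size of the
    anchor class (here, the borderline-RHD class)."""
    by_label = defaultdict(list)
    for item, lab in zip(items, labels):
        by_label[lab].append(item)
    n = len(by_label[anchor_label])          # target size per class
    rng = random.Random(seed)
    out = []
    for lab, group in by_label.items():
        rng.shuffle(group)                   # pick a random subset
        for item in group[:n]:
            out.append((item, lab))
    return out
```

This balances the training set so the majority classes cannot dominate the loss, at the cost of discarding data.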
D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks,
ICCV 2015
4. Modified version of the C3D architecture (as shown below)
• Input: 16 frames from a video of an exam;
• 50 epochs with early stopping;
• Batch size of 16;
• Learning rate of 0.001 and a random crop strategy.
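The random-crop strategy can be sketched as below, assuming spatial crops of 112x112 from the 128x171 resized frames, as in the original C3D training recipe of Tran et al.; the exact crop size used in this work is an assumption.

```python
import random
import numpy as np

def random_crop(clip, crop_h=112, crop_w=112, rng=None):
    """Random spatial crop applied identically to all frames of a
    (T, H, W, C) clip; 112x112 is the input size C3D was trained on."""
    rng = rng or random.Random(0)
    t, h, w, c = clip.shape
    top = rng.randint(0, h - crop_h)         # same offsets for every frame,
    left = rng.randint(0, w - crop_w)        # so motion stays consistent
    return clip[:, top:top + crop_h, left:left + crop_w]
```

Cropping the whole clip with one offset (rather than per frame) preserves the temporal coherence the 3D convolutions rely on.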
Network architecture
[Figure: modified C3D pipeline: visual feature extraction followed by a classifier that outputs normal or RHD positive]
5. Preliminary experiments to understand the network's capability to extract visual
features and separate the two classes of interest.
We biased the training to maximize the Borderline accuracy.
Per-video confusion-matrix results for two classes, RHD positive and RHD
negative:
• accuracy: 0.628 (95% CI, 0.573 – 0.682)
• specificity: 0.615 (95% CI, 0.435 – 0.795)
• sensitivity: 0.641 (95% CI, 0.432 – 0.850)
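The point estimates and 95% confidence intervals above are the standard form produced by a normal-approximation (Wald) interval on a proportion, sketched below; the counts in the example are hypothetical, not the study's.

```python
import math

def rate_with_ci(successes, total, z=1.96):
    """Point estimate and 95% Wald confidence interval for a rate
    such as accuracy, sensitivity, or specificity."""
    p = successes / total
    half = z * math.sqrt(p * (1 - p) / total)   # z * standard error
    return p, (p - half, p + half)

# Hypothetical counts: 50 correct out of 100 videos
p, (lo, hi) = rate_with_ci(50, 100)
```

For small counts (as in the sensitivity and specificity cells here, whose intervals are wide), a Wilson or exact interval would be more reliable than the Wald approximation.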
Results per video and two classes
6. • Hyperparameter tuning (Hyperband)
• Take advantage of visual features from the Doppler images;
• Analyze the visual features the networks use to classify the exams (interpretability)
and compare them with those used by doctors;
• Build a network architecture with two arms (see figure below), considering both
Doppler images and raw images from the exams.
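Hyperband's inner loop, successive halving, can be sketched as below. Here `evaluate` stands in for training a model configuration for `budget` epochs and returning a validation score; all names and defaults are illustrative.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """One successive-halving bracket: evaluate all configurations at a
    small budget, keep the best 1/eta fraction, grow the budget by eta,
    and repeat until one configuration remains."""
    budget = min_budget
    while len(configs) > 1:
        scores = [(evaluate(cfg, budget), cfg) for cfg in configs]
        scores.sort(key=lambda s: s[0], reverse=True)   # higher is better
        keep = max(1, len(configs) // eta)
        configs = [cfg for _, cfg in scores[:keep]]
        budget *= eta
    return configs[0]
```

Full Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget.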
Ongoing work
[Figure: proposed two-arm architecture: a Doppler-image arm and a raw-image arm feeding a shared classifier that outputs normal or RHD positive]
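A late-fusion reading of the two-arm design can be sketched as below. This is a NumPy sketch under stated assumptions: each arm is reduced to a precomputed feature vector standing in for a C3D-style extractor, and the shapes, names, and fusion choice are illustrative, not the authors' implementation.

```python
import numpy as np

def two_arm_predict(raw_feat, doppler_feat, W, b):
    """Concatenate the raw-image and Doppler-image arm features and
    apply one linear layer + softmax over {normal, RHD positive}."""
    fused = np.concatenate([raw_feat, doppler_feat])
    logits = W @ fused + b
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()
```

Concatenation is the simplest fusion point; the same skeleton accommodates alternatives such as averaging per-arm logits.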