This thesis aims to develop deep learning models to detect COVID-19 pneumonia in chest X-ray images. The author trains two models: 1) a binary classifier to distinguish COVID-19 pneumonia from non-COVID cases, which classifies all test cases correctly, and 2) a four-class classifier to identify COVID-19, viral pneumonia, bacterial pneumonia, and normal cases, which achieves an average accuracy of 93% on the test set. Gradient-weighted Class Activation Mapping (Grad-CAM) is used to interpret the four-class model, showing that it focuses on the patchy areas characteristic of COVID-19 pneumonia when making accurate predictions.
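Grad-CAM weights each convolutional feature map by the spatially averaged gradient of the class score with respect to that map, sums the weighted maps, and rectifies. A minimal NumPy sketch of that core computation (illustrative only, not the thesis's own code; the toy activation and gradient arrays are made up):

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap from one conv layer's activations and the
    gradients of the class score w.r.t. those activations.
    Both inputs have shape (channels, height, width)."""
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))              # shape (channels,)
    # Weighted sum of activation maps, then ReLU keeps positive evidence only.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()                               # normalize to [0, 1]
    return cam

# Hypothetical 4-channel, 8x8 activations from some conv layer.
rng = np.random.default_rng(0)
acts = rng.random((4, 8, 8))
grads = rng.standard_normal((4, 8, 8))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)  # (8, 8)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the X-ray to show which regions drove the prediction.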
The document provides background information on machine learning and discusses its application to predicting COVID-19. It outlines the objectives of developing a machine learning model to predict whether a patient has COVID-19 based on their clinical information and identifying influential features. The document describes conducting a literature review and experiment to determine the most suitable machine learning techniques and influential features. It also defines the scope of the thesis and provides an outline of the following chapters.
This document is a 36-page bachelor's thesis written by Duc Minh Luong Nguyen titled "Detect COVID-19 from Chest X-Ray images using Deep Learning". The thesis was submitted to Metropolia University of Applied Sciences in May 2020. It aims to build a deep convolutional neural network to detect COVID-19 using only chest X-ray images. The model achieves an accuracy of 93% at detecting COVID-19 patients versus healthy patients, despite being trained on a small dataset of 115 images for each class.
The document proposes a secure personal identification system based on human retina recognition. The system uses retinal vascular patterns, which are unique to each individual, for identification. It consists of three stages: 1) preprocessing to extract the vascular pattern from retinal images, 2) feature extraction to identify feature points like endings and bifurcations, and represent them as vectors, and 3) matching input images to templates by calculating distances between feature vectors. Experimental results on two retinal image databases achieved over 97% accuracy, demonstrating the potential of the proposed system for high-security identification.
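The matching stage reduces to comparing two sets of feature-point vectors by distance. The paper's exact distance measure is not reproduced here; this is a hypothetical minimal version using mean nearest-neighbor Euclidean distance between (x, y) feature points, with made-up coordinates:

```python
import numpy as np

def match_score(probe: np.ndarray, template: np.ndarray) -> float:
    """Mean distance from each probe feature point (vessel ending or
    bifurcation) to its nearest template point. Lower means better match.
    probe, template: (n, 2) arrays of (x, y) coordinates."""
    # Pairwise Euclidean distances between all probe/template point pairs.
    d = np.linalg.norm(probe[:, None, :] - template[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

template = np.array([[10, 12], [40, 7], [25, 30]], dtype=float)
probe = template + 0.5          # slightly shifted scan of the same eye
imposter = np.array([[3, 50], [60, 60], [5, 5]], dtype=float)

print(round(match_score(probe, template), 2))  # 0.71
```

A genuine scan scores far below a threshold while an imposter scores far above it, which is the basis of the accept/reject decision.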
Agenda:
Introduction
Supercomputers for Scientific Research
COVID-19 Tracking and Prediction
COVID-19 Research and Diagnosis
Use Case 1: NLP and BERT to answer scientific questions
Use Case 2: COVID-19 Data Lake and Platform
COVID-19 detection from scarce chest X-Ray image data using few-shot deep learning, by Shruti Jadon
In the current COVID-19 pandemic situation, there is an urgent need to screen infected patients quickly and accurately. Using deep learning models trained on chest X-ray images can become an efficient method for screening COVID-19 patients in these situations. Deep learning approaches are already widely used in the medical community. However, they require a large amount of data to be accurate.
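Few-shot methods address exactly this data-scarcity problem. One common recipe (prototypical networks) classifies a query by its distance to class prototypes, i.e. the mean embedding of each class's few labelled examples. A sketch with hypothetical 2-D embeddings standing in for CNN features of X-ray images:

```python
import numpy as np

def nearest_prototype(query, support, labels):
    """Prototype-based few-shot classification: each class is represented
    by the mean embedding of its labelled examples; the query takes the
    label of the nearest prototype."""
    classes = sorted(set(labels))
    protos = {c: np.mean([e for e, l in zip(support, labels) if l == c], axis=0)
              for c in classes}
    return min(classes, key=lambda c: float(np.linalg.norm(query - protos[c])))

# Made-up embeddings: two examples per class.
support = [np.array([1.0, 1.0]), np.array([1.2, 0.8]),
           np.array([-1.0, -1.0]), np.array([-0.8, -1.2])]
labels = ["covid", "covid", "normal", "normal"]
print(nearest_prototype(np.array([0.9, 1.1]), support, labels))  # covid
```

Because only the embedding network is learned, a handful of labelled X-rays per class can suffice at classification time.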
The document compares the performance of different machine learning models for detecting COVID-19 from CT scans, including single models like SVM, NB, MLP, CNN and ensemble models like AdaBoost and GBDT. Based on accuracy, precision, recall, F1-score and MCC metrics, the SVM model achieved the best performance with an accuracy of 99.2%, followed by CNN and AdaBoost. While MLP, NB and GBDT showed lower performance, CNN had the advantage of automatically detecting important image features.
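The five metrics in that comparison all derive from the 2x2 confusion matrix. A self-contained reference implementation (the counts below are illustrative, not the paper's results):

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1, and Matthews correlation
    coefficient from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    # MCC balances all four cells, so it is robust to class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {"accuracy": acc, "precision": prec, "recall": rec,
            "f1": f1, "mcc": mcc}

m = binary_metrics(tp=48, fp=2, fn=3, tn=47)
print(round(m["accuracy"], 3))  # 0.95
```

MCC is worth reporting alongside accuracy because an imbalanced test set can make accuracy alone look misleadingly high.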
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
https://imatge.upc.edu/web/publications/video-retrieval-specific-persons-specific-locations
This thesis explores good practices for improving the detection of specific people in specific places. An approach combining recurrent and convolutional neural networks was considered for face detection; however, other more conventional methods were also tested, with the best results obtained by exploiting a deformable part model approach. A CNN is also used to obtain the face feature vectors and, to help with face recognition, an approach to perform query expansion has also been developed. Furthermore, to evaluate the different configurations on our non-labelled dataset, a user interface was used to annotate the images and obtain a precision measure for the system. Finally, different fusion and normalization strategies have been explored with the aim of combining the scores obtained from face recognition with those obtained from place recognition.
The adaptive mechanisms include the following AI paradigms that exhibit an ability to learn or adapt to new environments:
Swarm Intelligence (SI),
Artificial Neural Networks (ANN),
Evolutionary Computation (EC),
Artificial Immune Systems (AIS), and
Fuzzy Systems (FS).
Chest X-ray Pneumonia Classification with Deep Learning, by BaoTramDuong2
This document discusses using deep learning models to classify chest x-ray images as either normal or pneumonia. It obtained a dataset of over 5,800 pediatric chest x-rays from a Chinese hospital. Various deep learning models were explored, including multilayer perceptrons, convolutional neural networks, and transfer learning with VGG16, which achieved 92% validation accuracy. The document recommends future work such as distinguishing between viral and bacterial pneumonia and combining models with SVM. It also discusses recommendations to reduce childhood pneumonia prevalence.
This document is a research project submission for a Master's in Data Analytics. It proposes using a convolutional neural network to classify multiple types of skin lesions from dermoscopic images. Specifically, it will use a modified VGG19 network, replacing the classification layers with global average pooling, dropout, and two fully connected layers with softmax activation. The model will be tested on the ISIC 2018 dataset containing over 10,000 images across 7 lesion classes, after preprocessing the images. The goal is to assist dermatologists in identifying lesion types.
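The replacement head described there (global average pooling followed by fully connected layers with softmax) is easy to state precisely. A NumPy sketch of the GAP-plus-softmax part, dropout omitted for brevity; the feature-map and weight shapes are hypothetical, not taken from the project:

```python
import numpy as np

def gap_softmax_head(feature_maps, W, b):
    """Classification head: global average pooling over the spatial
    dimensions, then one fully connected layer with softmax.
    feature_maps: (channels, h, w); W: (channels, n_classes); b: (n_classes,)."""
    pooled = feature_maps.mean(axis=(1, 2))      # GAP -> (channels,)
    logits = pooled @ W + b
    z = np.exp(logits - logits.max())            # numerically stable softmax
    return z / z.sum()

# Hypothetical 512-channel 7x7 feature maps feeding a 7-class output.
rng = np.random.default_rng(1)
probs = gap_softmax_head(rng.random((512, 7, 7)),
                         rng.standard_normal((512, 7)), np.zeros(7))
print(probs.shape)  # (7,)
```

GAP removes the large flatten-plus-dense parameter block of the stock VGG19 classifier, which is one reason it is popular for small medical datasets.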
The document lists Shamik Tiwari's research publications and academic activities. It includes 5 published journal articles, 1 book chapter, and 2 accepted journal articles. It also lists that he coordinates academic monitoring, curriculum development, guides Ph.D. students, delivered lectures, and serves as a reviewer for several journals. He has also completed many online courses and achieved high student feedback for his online teaching.
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia, by IRJET Journal
This document summarizes research on using machine learning and deep learning techniques to interpret medical images and predict pneumonia. It first discusses how medical image analysis is an active field for machine learning. It then reviews several related studies on using convolutional neural networks (CNNs) and transfer learning to classify chest x-rays and detect pneumonia. Specifically, it examines research on developing CNN models for pneumonia classification and using pre-trained CNN architectures like VGG16, VGG19, and ResNet with transfer learning. The document concludes that computer-aided diagnosis systems using deep learning can provide accurate predictions to assist radiologists in pneumonia diagnosis from chest x-rays.
IRJET- Classifying Chest Pathology Images using Deep Learning Techniques, by IRJET Journal
This document discusses classifying chest pathology images using deep learning techniques. It explores using pre-trained convolutional neural networks (CNNs) to classify chest radiograph images as either healthy or pathological, and to identify specific pathologies. The document reviews previous work on applying deep learning to medical image analysis. It then proposes using features extracted from pre-trained CNN models to classify chest radiographs, focusing on classifying images as healthy vs. pathological as an important screening task. The strengths of deep learning approaches for analyzing various chest diseases are explored.
The document discusses the potential applications of deep learning in healthcare. It begins by explaining that deep learning models can improve accuracy of diagnosis, prognosis, and risk prediction by analyzing large datasets. It then discusses how deep learning can optimize hospital processes like resource allocation and patient flow by early and accurate prediction of diseases. Finally, it mentions that deep learning can help identify patient subgroups for personalized and precision medicine approaches.
This document summarizes a student project on stroke prediction using machine learning algorithms. The students collected two datasets on stroke from Kaggle, one benchmark and one non-benchmark. They preprocessed the data, addressed imbalance, and performed feature engineering. Various classification algorithms were tested on the data, including KNN, decision trees, SVM, and Naive Bayes. The results were evaluated to determine the most accurate model for predicting stroke risk based on attributes like age, gender, medical history, and lifestyle factors. The project aims to help identify individuals at high risk of stroke so preventative actions can be taken.
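Addressing imbalance, as the students did, is usually the step that makes or breaks a stroke classifier, since positive cases are rare. A toy sketch of the simplest approach, random oversampling of the minority class (the rows and label key here are made up; SMOTE-style interpolation is the usual refinement):

```python
import random

def oversample(rows, label_key="stroke", seed=0):
    """Duplicate minority-class rows at random until both classes
    have equal counts. Simple, but effective as a baseline."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra

data = [{"age": 70, "stroke": 1}] + [{"age": a, "stroke": 0}
                                     for a in (30, 40, 50, 60)]
balanced = oversample(data)
print(sum(r["stroke"] for r in balanced))  # 4
```

Crucially, oversampling must be applied only to the training split, never before the train/test split, or the test metrics are inflated by duplicated rows.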
The document discusses using a U-Net convolutional neural network to automatically segment brain tumors in MRI images. It aims to eliminate the need for domain expertise by using deep learning to extract hierarchical features. The U-Net model is trained on the BRATS 2017 dataset and is able to segment tumors with 5% higher accuracy than previous methods, as measured by the Dice similarity coefficient. The system could be expanded to analyze additional MRI modalities and further improve automated tumor detection.
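The Dice similarity coefficient used to score that segmentation is defined as twice the overlap divided by the total size of the two masks. A minimal implementation with a toy example (the masks below are illustrative, not BRATS data):

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks:
    2|A∩B| / (|A| + |B|). Ranges from 0 (no overlap) to 1 (identical)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 1.0 if denom == 0 else 2.0 * (pred & truth).sum() / denom

a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1   # 4-pixel ground truth
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1   # 6-pixel prediction
print(dice(b, a))  # 0.8
```

Dice is preferred over plain pixel accuracy for tumors because the tumor typically occupies a tiny fraction of the volume, so an all-background prediction would still score high accuracy.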
Techniques of Brain Cancer Detection from MRI using Machine Learning, by IRJET Journal
The document discusses techniques for detecting brain cancer from MRI scans using machine learning. It first provides background on brain tumors and MRI. It then outlines the cancer detection process, including pre-processing the MRI data, segmenting the images, extracting features, and classifying tumors using techniques like CNNs, SVMs, MLP, and Naive Bayes. The document reviews related work applying these techniques and compares their results, finding accuracy can be improved with larger, higher resolution datasets.
Prospects of Deep Learning in Medical Imaging, by Godswll Egegwu
A SEMINAR Presentation on the Prospects of Deep Learning in Medical Imaging Presented to the Department of Computer Science, Nasarawa State Polytechnic, Lafia.
BY:
EGEGWU, GODSWILL
08166643792
http://facebook.com/godswill.egegwu
http://egegwugodswill.name.ng
This document describes a study that used deep learning models to diagnose Alzheimer's disease using MRI scans. Specifically, it used convolutional neural networks (CNNs) trained on MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The study achieved an F1-score of 89% for classifying Alzheimer's vs normal cognition. It also used a CycleGAN for data augmentation, which generated additional synthetic MRI images and improved the CNN classification to a 95% F1-score. The document provides background on Alzheimer's diagnosis, CNNs, GANs including CycleGAN, related work applying machine learning to Alzheimer's diagnosis, and the methodology used in this study for data preprocessing, model training, and evaluating the impact of GAN-based augmentation.
Towards A More Secure Web Based Tele Radiology System: A Steganographic Approach, by CSCJournals
While it is possible to make a patient's medical images available to a practicing radiologist online, e.g. through open network interconnectivity and email attachments, these methods do not guarantee the security, confidentiality, and tamper-free reliability required in a medical information system infrastructure. The possibility of securely and covertly transmitting such medical images remotely for clinical interpretation and diagnosis through a secure steganographic technique was the focus of this study.
We propose a method that uses an Enhanced Least Significant Bit (ELSB) steganographic insertion method to embed a patient's Medical Image (MI) in the spatial domain of a cover digital image and his/her health records in the frequency domain of the same cover image as a watermark to ensure tamper detection and nonrepudiation. The ELSB method uses the Mersenne Twister (MT) Pseudo Random Number Generator (PRNG) to randomly embed and conceal the patient's data in the cover image. This technique significantly increases the imperceptibility of the hidden information to steganalysis, thereby enhancing the security of the embedded patient's data.
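The key idea of PRNG-driven embedding can be sketched in a few lines. This is a toy 1-D illustration, not the paper's ELSB algorithm (the pixel list, bit string, and key are made up); conveniently, CPython's `random` module is itself an MT19937 Mersenne Twister:

```python
import random

def embed(pixels, message_bits, key):
    """Hide bits in the least significant bit of pixels chosen by a
    Mersenne Twister PRNG seeded with a shared secret key, so the
    embedding positions look random to an attacker without the key."""
    out = list(pixels)
    positions = random.Random(key).sample(range(len(out)), len(message_bits))
    for pos, bit in zip(positions, message_bits):
        out[pos] = (out[pos] & ~1) | bit   # overwrite only the LSB
    return out

def extract(pixels, n_bits, key):
    """Re-derive the same positions from the key and read the LSBs."""
    positions = random.Random(key).sample(range(len(pixels)), n_bits)
    return [pixels[pos] & 1 for pos in positions]

cover = [200, 13, 77, 91, 154, 38, 240, 66]
bits = [1, 0, 1, 1]
stego = embed(cover, bits, key=42)
print(extract(stego, 4, key=42))  # [1, 0, 1, 1]
```

Each pixel changes by at most one intensity level, which is why LSB embedding is nearly imperceptible; the MT-seeded position selection is what hardens it against sequential steganalysis.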
In measuring the effectiveness of the proposed method, the study adopted the Design Science Research (DSR) methodology, a paradigm for problem solving in computing and Information Systems (IS) that involves design and implementation of artefacts and methods considered novel and the analytical testing of the performance of such artefacts in pursuit of understanding and enhancing an existing method, artefact or practice.
The fidelity measures of the stego images from the proposed method were compared with those from the traditional Least Significant Bit (LSB) method in order to establish the imperceptibility of the embedded information. The results demonstrated improvements of between 1 and 2.6 decibels (dB) in Peak Signal-to-Noise Ratio (PSNR), and MSE ratio improvements of up to 0.4, for the proposed method.
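PSNR, the fidelity measure compared above, is computed from the mean squared error between cover and stego images. A minimal implementation with a toy 8-bit image (the arrays are illustrative, not the study's data):

```python
import numpy as np

def psnr(cover: np.ndarray, stego: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB: 10·log10(peak² / MSE).
    Higher PSNR means the stego image is closer to the cover."""
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

cover = np.full((8, 8), 128, dtype=np.uint8)
stego = cover.copy()
stego[0, 0] += 1          # a single flipped LSB
print(round(psnr(cover, stego), 1))  # 66.2
```

Values above roughly 40 dB are generally considered visually indistinguishable, which puts the reported 1 to 2.6 dB gains in context.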
A transfer learning with deep neural network approach for diabetic retinopathy..., by IJECEIAES
Diabetic retinopathy is an eye disease in which high blood sugar and blood pressure damage the blood vessels in the eye; it is the root cause of more than 1% of blindness worldwide. Early detection of this disease is crucial, as it prevents the disease from progressing to a more severe level. However, current machine learning-based approaches for detecting the severity level of diabetic retinopathy either i) rely on manually extracted features, which makes them impractical, or ii) are trained on small datasets and thus cannot be generalized. In this study, we propose a transfer learning-based approach for detecting the severity level of diabetic retinopathy with high accuracy. Our model is a deep learning model based on the global average pooling (GAP) technique with various pre-trained convolutional neural network (CNN) models. The experimental results of our approach, in which our best model achieved 82.4% quadratic weighted kappa (QWK), corroborate the ability of our model to detect the severity level of diabetic retinopathy efficiently.
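Quadratic weighted kappa, the metric reported above, rewards predictions that are close on the ordinal severity scale and penalizes distant errors quadratically. A compact implementation (the label lists are made up; severity grades assumed to run 0 through 4):

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """QWK for ordinal labels: 1 - sum(W*O) / sum(W*E), where O is the
    observed confusion matrix, E the expected matrix under independent
    marginals, and W[i,j] = (i-j)² / (n-1)² the disagreement weight."""
    O = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        O[t, p] += 1
    i, j = np.meshgrid(np.arange(n_classes), np.arange(n_classes), indexing="ij")
    W = (i - j) ** 2 / (n_classes - 1) ** 2
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()

print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4]))  # 1.0
```

Predicting grade 1 when the truth is grade 0 costs little, while predicting grade 4 costs sixteen times as much, which matches the clinical reality of severity grading.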
This talk will cover various medical applications of deep learning including tumor segmentation in histology slides, MRI, CT, and X-Ray data. Also, more complicated tasks such as cell counting where the challenge is to count how many objects are in an image. It will also cover generative adversarial networks and how they can be used for medical applications. This presentation is accessible to non-doctors and non-computer scientists.
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
In this project, we propose a novel DNN-based automatic detection of diabetic retinopathy, in which deep neural networks are used to classify images that indicate the disease. The main aim of this project is to find a suitable way to detect these problems and classify them. The proposed radial basis function neural network (RBFNN) classifier gives high precision in grouping this disease through spatial examination. The RBFNN classifier does not require a long training time, so model production can be expedited. On our dataset of 80,000 images, the proposed RBFNN achieves a sensitivity of 95% and an accuracy of 75% on 5,000 validation images. Fuzzy c-means clustering is used to store the information from the processed images. Finally, the proposed system is developed using MATLAB simulation.
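Fuzzy c-means, used above for the processed-image information, differs from k-means in that every point receives a graded membership in every cluster. A small NumPy sketch of the standard alternating update (the 2-D toy points are made up; the project itself works in MATLAB):

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=50, seed=0):
    """Basic fuzzy c-means: alternate between membership-weighted
    center updates and distance-based membership updates.
    Returns (centers, U) where each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # random initial memberships
    for _ in range(iters):
        Um = U ** m                            # fuzzified memberships
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))         # closer centers -> higher membership
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
centers, U = fuzzy_c_means(X)
print(np.allclose(U.sum(axis=1), 1.0))  # True
```

The fuzzifier m controls how soft the partition is: m near 1 approaches hard k-means, larger m blurs the memberships.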
This document is a thesis submitted by Tokelo Khalema to the University of the Free State in partial fulfillment of the requirements for a B.Sc. Honors degree in Mathematical Statistics. The thesis compares the Gaussian linear model, two Bayesian Student-t regression models, and the method of least absolute deviations through a Monte Carlo simulation study. The study aims to evaluate how soon and how severely the least squares regression model starts to lose optimality against these robust alternatives under violations of its assumptions. The document includes sections on robust statistical procedures, literature review of the models considered, research methodology, results and applications of the simulation study, and closing remarks.
Im-ception - An exploration into facial PAD through the use of fine tuning de..., by Cooper Wakefield
This document is a thesis submitted by Cooper Wakefield to the University of Queensland for the degree of Bachelor of Engineering. The thesis proposes developing a presentation attack detection (PAD) system through fine tuning a deep convolutional neural network. It aims to leverage pre-trained networks and fine tune the upper layers to differentiate between real and fake facial images with a high degree of accuracy. The thesis outlines the problem of presentation attacks on facial recognition systems, reviews prior approaches to PAD, and describes the proposed solution of using transfer learning on a CNN to classify images as real or fake.
The adaptive mechanisms include the following AI paradigms that exhibit an ability to learn or adapt to new environments:
Swarm Intelligence (SI),
Artificial Neural Networks (ANN),
Evolutionary Computation (EC),
Artificial Immune Systems (AIS), and
Fuzzy Systems (FS).
Chest X-ray Pneumonia Classification with Deep LearningBaoTramDuong2
This document discusses using deep learning models to classify chest x-ray images as either normal or pneumonia. It obtained a dataset of over 5,800 pediatric chest x-rays from a Chinese hospital. Various deep learning models were explored, including multilayer perceptrons, convolutional neural networks, and transfer learning with VGG16, which achieved 92% validation accuracy. The document recommends future work such as distinguishing between viral and bacterial pneumonia and combining models with SVM. It also discusses recommendations to reduce childhood pneumonia prevalence.
This document is a research project submission for a Master's in Data Analytics. It proposes using a convolutional neural network to classify multiple types of skin lesions from dermoscopic images. Specifically, it will use a modified VGG19 network, replacing the classification layers with global average pooling, dropout, and two fully connected layers with softmax activation. The model will be tested on the ISIC 2018 dataset containing over 10,000 images across 7 lesion classes, after preprocessing the images. The goal is to assist dermatologists in identifying lesion types.
The document lists Shamik Tiwari's research publications and academic activities. It includes 5 published journal articles, 1 book chapter, and 2 accepted journal articles. It also lists that he coordinates academic monitoring, curriculum development, guides Ph.D. students, delivered lectures, and serves as a reviewer for several journals. He has also completed many online courses and achieved high student feedback for his online teaching.
IRJET- A Survey on Medical Image Interpretation for Predicting PneumoniaIRJET Journal
This document summarizes research on using machine learning and deep learning techniques to interpret medical images and predict pneumonia. It first discusses how medical image analysis is an active field for machine learning. It then reviews several related studies on using convolutional neural networks (CNNs) and transfer learning to classify chest x-rays and detect pneumonia. Specifically, it examines research on developing CNN models for pneumonia classification and using pre-trained CNN architectures like VGG16, VGG19, and ResNet with transfer learning. The document concludes that computer-aided diagnosis systems using deep learning can provide accurate predictions to assist radiologists in pneumonia diagnosis from chest x-rays.
IRJET- Classifying Chest Pathology Images using Deep Learning TechniquesIRJET Journal
This document discusses classifying chest pathology images using deep learning techniques. It explores using pre-trained convolutional neural networks (CNNs) to classify chest radiograph images as either healthy or pathological, and to identify specific pathologies. The document reviews previous work on applying deep learning to medical image analysis. It then proposes using features extracted from pre-trained CNN models to classify chest radiographs, focusing on classifying images as healthy vs. pathological as an important screening task. The strengths of deep learning approaches for analyzing various chest diseases are explored.
The document discusses the potential applications of deep learning in healthcare. It begins by explaining that deep learning models can improve accuracy of diagnosis, prognosis, and risk prediction by analyzing large datasets. It then discusses how deep learning can optimize hospital processes like resource allocation and patient flow by early and accurate prediction of diseases. Finally, it mentions that deep learning can help identify patient subgroups for personalized and precision medicine approaches.
This document summarizes a student project on stroke prediction using machine learning algorithms. The students collected two datasets on stroke from Kaggle, one benchmark and one non-benchmark. They preprocessed the data, addressed imbalance, and performed feature engineering. Various classification algorithms were tested on the data, including KNN, decision trees, SVM, and Naive Bayes. The results were evaluated to determine the most accurate model for predicting stroke risk based on attributes like age, gender, medical history, and lifestyle factors. The project aims to help identify individuals at high risk of stroke so preventative actions can be taken.
The document discusses using a U-Net convolutional neural network to automatically segment brain tumors in MRI images. It aims to eliminate the need for domain expertise by using deep learning to extract hierarchical features. The U-Net model is trained on the BRATS 2017 dataset and is able to segment tumors with 5% higher accuracy than previous methods, as measured by the Dice similarity coefficient. The system could be expanded to analyze additional MRI modalities and further improve automated tumor detection.
Techniques of Brain Cancer Detection from MRI using Machine LearningIRJET Journal
The document discusses techniques for detecting brain cancer from MRI scans using machine learning. It first provides background on brain tumors and MRI. It then outlines the cancer detection process, including pre-processing the MRI data, segmenting the images, extracting features, and classifying tumors using techniques like CNNs, SVMs, MLP, and Naive Bayes. The document reviews related work applying these techniques and compares their results, finding accuracy can be improved with larger, higher resolution datasets.
Prospects of Deep Learning in Medical ImagingGodswll Egegwu
A SEMINAR Presentation on the Prospects of Deep Learning in Medical Imaging Presented to the Department of Computer Science, Nasarawa State Polytechnic, Lafia.
BY:
EGEGWU, GODSWILL
08166643792
http://facebook.com/godswill.egegwu
http://egegwugodswill.name.ng
This document describes a study that used deep learning models to diagnose Alzheimer's disease using MRI scans. Specifically, it used convolutional neural networks (CNNs) trained on MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The study achieved an F1-score of 89% for classifying Alzheimer's vs normal cognition. It also used a CycleGAN for data augmentation, which generated additional synthetic MRI images and improved the CNN classification accuracy to 95% F1-score. The document provides background on Alzheimer's diagnosis, CNNs, GANs including CycleGAN, related work applying machine learning to Alzheimer's diagnosis, and the methodology used in this study for data preprocessing, model training, and evaluating the impact of GAN
Towards A More Secure Web Based Tele Radiology System: A Steganographic ApproachCSCJournals
While it is possible to make a patient's medical images available to a practicing radiologist online e.g. through open network systems inter connectivity and email attachments, these methods don't guarantee the security, confidentiality and tamper free reliability required in a medical information system infrastructure. The possibility of securely and covertly transmitting such medical images remotely for clinical interpretation and diagnosis through a secure steganographic technique was the focus of this study.
We propose a method that uses an Enhanced Least Significant Bit (ELSB) steganographic insertion method to embed a patient's Medical Image (MI) in the spatial domain of a cover digital image and his/her health records in the frequency domain of the same cover image as a watermark to ensure tamper detection and nonrepudiation. The ELSB method uses the Marsenne Twister (MT) Pseudo Random Number Generator (PRNG) to randomly embed and conceal the patient's data in the cover image. This technique significantly increases the imperceptibility of the hidden information to steganalysis thereby enhancing the security of the embedded patient's data.
In measuring the effectiveness of the proposed method, the study adopted the Design Science Research (DSR) methodology, a paradigm for problem solving in computing and Information Systems (IS) that involves design and implementation of artefacts and methods considered novel and the analytical testing of the performance of such artefacts in pursuit of understanding and enhancing an existing method, artefact or practice.
The fidelity measures of the stego images from the proposed method were compared with those from the traditional Least Significant Bit (LSB) method in order to establish the imperceptibility of the embedded information. The results demonstrated improvements of between 1 and 2.6 decibels (dB) in the Peak Signal to Noise Ratio (PSNR), and of up to 0.4 in the MSE ratio for the proposed method.
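The fidelity measures named here can be computed directly; a minimal sketch for grayscale pixel sequences (not tied to the paper's test images):

```python
import math

def mse(original, stego):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)

def psnr(original, stego, max_val=255):
    """Peak signal-to-noise ratio in dB; higher means the stego image
    is closer to the original, i.e. better imperceptibility."""
    err = mse(original, stego)
    if err == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / err)
```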
A transfer learning with deep neural network approach for diabetic retinopath... – IJECEIAES
Diabetic retinopathy is an eye disease caused by high blood sugar and pressure, which damages the blood vessels in the eye. Diabetic retinopathy is the root cause of more than 1% of blindness worldwide. Early detection of this disease is crucial, as it prevents progression to a more severe level. However, the current machine learning-based approaches for detecting the severity level of diabetic retinopathy either i) rely on manually extracted features, which makes an approach impractical, or ii) are trained on small datasets and thus cannot be generalized. In this study, we propose a transfer learning-based approach for detecting the severity level of diabetic retinopathy with high accuracy. Our model is a deep learning model based on the global average pooling (GAP) technique with various pre-trained convolutional neural network (CNN) models. The experimental results of our approach, in which our best model achieved 82.4% quadratic weighted kappa (QWK), corroborate the ability of our model to detect the severity level of diabetic retinopathy efficiently.
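The quadratic weighted kappa used as the headline metric can be computed from the confusion counts; a self-contained sketch (illustrative, not the authors' evaluation code):

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement on ordinal labels (e.g. retinopathy severity grades),
    with larger disagreements penalised quadratically. 1 = perfect
    agreement, 0 = chance-level, negative = worse than chance."""
    n = len(y_true)
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_t = [y_true.count(k) for k in range(n_classes)]
    hist_p = [y_pred.count(k) for k in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic weight
            num += w * observed[i][j]                # observed disagreement
            den += w * hist_t[i] * hist_p[j] / n     # expected by chance
    return 1.0 - num / den
```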
This talk will cover various medical applications of deep learning, including tumor segmentation in histology slides, MRI, CT, and X-ray data, as well as more complicated tasks such as cell counting, where the challenge is to count how many objects are in an image. It will also cover generative adversarial networks and how they can be used for medical applications. This presentation is accessible to non-doctors and non-computer scientists.
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
In this project, we propose a novel DNN-based automatic detection of diabetic retinopathy, in which deep neural networks are used to classify images that indicate the disease. The main aim of this project is to find a suitable way to detect and classify these problems. The proposed deep neural network (RBFNN) classifier gives high precision in grouping these diseases through spatial examination. The RBFNN classifier does not require a large training time, so model production can be expedited. On our dataset of 80,000 images, the proposed RBFNN achieves a sensitivity of 95% and an accuracy of 75% on 5,000 validation images. Fuzzy c-means clustering is used to store the processed image information in this project. Finally, the proposed system is developed using MATLAB simulation.
This document is a thesis submitted by Tokelo Khalema to the University of the Free State in partial fulfillment of the requirements for a B.Sc. Honors degree in Mathematical Statistics. The thesis compares the Gaussian linear model, two Bayesian Student-t regression models, and the method of least absolute deviations through a Monte Carlo simulation study. The study aims to evaluate how soon and how severely the least squares regression model starts to lose optimality against these robust alternatives under violations of its assumptions. The document includes sections on robust statistical procedures, literature review of the models considered, research methodology, results and applications of the simulation study, and closing remarks.
Im-ception - An exploration into facial PAD through the use of fine tuning de... – Cooper Wakefield
This document is a thesis submitted by Cooper Wakefield to the University of Queensland for the degree of Bachelor of Engineering. The thesis proposes developing a presentation attack detection (PAD) system through fine tuning a deep convolutional neural network. It aims to leverage pre-trained networks and fine tune the upper layers to differentiate between real and fake facial images with a high degree of accuracy. The thesis outlines the problem of presentation attacks on facial recognition systems, reviews prior approaches to PAD, and describes the proposed solution of using transfer learning on a CNN to classify images as real or fake.
The document discusses the Value-at-Risk (VaR) measure which estimates the maximum potential loss of a portfolio over a given time period at a certain confidence level. It outlines some key parameters used in calculating VaR such as the sampling period, analysis horizon, and sampling frequency. The document also differentiates between relative VaR, marginal VaR, and conditional VaR which are related but distinct VaR measures. Finally, it provides an overview of different VaR methodologies including historical simulation, parametric simulation, and Monte Carlo simulation which will be explored in more detail later.
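Of the methodologies listed, historical simulation is the simplest to sketch: VaR at a given confidence level is just an empirical quantile of past losses (an illustrative helper, with index handling kept deliberately simple):

```python
def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR: the loss threshold that past returns
    exceeded only (1 - confidence) of the time, reported as a positive
    loss number."""
    losses = sorted(-r for r in returns)   # convert returns to losses, ascending
    k = int(confidence * len(losses))      # index of the confidence quantile
    k = min(k, len(losses) - 1)            # guard against out-of-range index
    return losses[k]
```

Parametric and Monte Carlo VaR replace the empirical loss distribution with a fitted or simulated one, but the quantile step is the same.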
This document is a thesis submitted by Livinus Obiora Nweke for the degree of Master of Science in Computer Science. The thesis proposes a framework for validating network artifacts in digital forensics investigations based on stochastic and probabilistic modeling of the internal consistency of artifacts. The framework consists of three phases - data collection, feature selection using Monte Carlo Feature Selection, and a validation process using logistic regression analysis. The framework is demonstrated on network artifacts from intrusion detection systems. The experiment results show the validity of the network artifacts and can support assertions from the artifacts in investigations.
This thesis evaluates different methods for applying the Kalman filter to correct air quality model forecasts in Hong Kong. It introduces two alternative techniques - the logarithmic ratio bias method and relative difference bias method - to model bias beyond the traditional difference method. The performance of each correction method is assessed for PM2.5, NO2 and O3 at three monitoring stations over three years of hourly data using various statistical metrics. The aim is to determine the optimal bias modeling approach for each pollutant and location.
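The traditional difference method that the thesis extends can be illustrated with a scalar Kalman filter that tracks the bias (observation minus forecast) and adds the current estimate to each new forecast; the noise settings `q` and `r` below are illustrative values, not the thesis's:

```python
def kalman_bias_correction(forecasts, observations, q=0.1, r=1.0):
    """Scalar Kalman filter tracking forecast bias under the
    'difference' definition (bias = observation - forecast).
    q: process noise, r: observation noise. Returns corrected forecasts."""
    b, p = 0.0, 1.0                    # bias estimate and its variance
    corrected = []
    for f, y in zip(forecasts, observations):
        corrected.append(f + b)        # correct using the current estimate
        p += q                         # predict: uncertainty grows
        k = p / (p + r)                # Kalman gain
        b += k * ((y - f) - b)         # update with the newly observed bias
        p *= (1 - k)
    return corrected
```

The logarithmic-ratio and relative-difference variants studied in the thesis would change only the bias definition and the final correction step, not the filter recursion.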
This thesis examines determinants of sovereign bond spreads using Bayesian model averaging (BMA). It considers 44 potential explanatory variables for bond spreads in 47 OECD countries from 1980-2010. Most variables previously suggested provide low inclusion probabilities, while unemployment and government consumption rank highly. These results are robust to different parameter and model priors. The author employs BMA to address uncertainty about which variables best explain spreads by averaging thousands of possible models.
This thesis presents an approach for non-rigid multi-modal object tracking using Gaussian mixture models (GMM). The target is represented by a GMM with each ellipsoid corresponding to a different fragment of the target. A region growing algorithm is used to automatically adapt the fragment set and extract accurate boundaries. Tracking performance is improved by incorporating joint Lucas-Kanade feature tracking to handle large motions. Experimental results demonstrate the effectiveness of the approach on challenging sequences.
Analysis and Classification of ECG Signal using Neural Network – ZHENG YAN LAM
This report describes an analysis and classification of electrocardiogram (ECG) signals using a neural network. The report includes:
1) An introduction to ECG signals, including acquisition and characteristics.
2) A description of neural network classification, including architectures like feedforward and feedback networks.
3) Details of experimental setup and methodology, including ECG preprocessing, feature extraction via wavelet decomposition, and a neural network classifier with a systematic data structure.
4) Results of two experiments on ECG classification, showing recognition rates of 80.89% and 93.75% respectively.
So in summary, the report presents a study using neural networks to classify ECG signals, with details of the experimental setup and the recognition rates achieved.
This document is a final report from 1993 on modelling and designing scalable parallel computing systems. It details the development of a fractal parallel computer topology that can extend to fill a wafer. Algorithms were developed for routing and load balancing, and a simulation program tested a 64-node network using UNIX and PC workstations. Benchmarking of a 16-node example network demonstrated scalability. The report discusses implementing hardware control to support applications using wafer-scale integration.
The document is Nathaniel Knapp's master's thesis titled "Parasite: Local Scalability Profiling for Parallelization" submitted to Technische Universität München. The thesis presents Parasite, a tool that measures the parallelism of function call sites in programs parallelized using Pthreads. Parasite calculates the parallelism ratio, which is an upper bound on potential speedup and useful for evaluating scalability. The thesis demonstrates Parasite on sorting algorithms, molecular dynamics simulations, and other programs to analyze parallelism and identify factors limiting scalability.
LinkedTV Deliverable D1.4 Visual, text and audio information analysis for hyp... – LinkedTV
Having extensively evaluated the performance of the technologies included in the first release of WP1 multimedia analysis tools, using content from the LinkedTV scenarios and by participating in international benchmarking activities, concrete decisions were made regarding the appropriateness and importance of each individual method or combination of methods. Combined with an updated list of information needs for each scenario, these decisions led to a new set of analysis requirements to be addressed by the final release of WP1 analysis techniques. To this end, coordinated efforts in three directions, namely (a) the improvement of a number of methods in terms of accuracy and time efficiency, (b) the development of new technologies, and (c) the definition of synergies between methods for obtaining new types of information via multimodal processing, resulted in the final set of multimedia analysis methods for video hyperlinking. Moreover, the developed analysis modules have been integrated into a web-based infrastructure, allowing fully automatic linking between the WP1 technologies and the overall LinkedTV platform.
This document presents a search for the standard model Higgs boson using tau leptons detected by the CMS detector at the LHC. Background processes like Z→ττ decay, W+jets, and QCD are estimated. The Z→μμ and Z→ee channels are used to derive estimates for the Z→ττ process. No excess of tau pairs is observed above background predictions for Higgs masses from 115-140 GeV/c2, so upper limits are placed on the Higgs boson production cross section at 95% confidence level.
This thesis document describes Reza Pourramezan's master's thesis project on the design and implementation of an advanced centralized disturbance detection system for smart microgrids. The thesis presents a new advanced phasor data concentrator (APDC) that is capable of compensating for lost data and improving monitoring of distributed energy resources in microgrids. The APDC uses an adaptive compensation system to efficiently estimate missing data elements. It also includes a monitoring unit to reliably detect frequency excursions and identify distributed energy resources affected by islanding events. The performance of the proposed APDC is evaluated experimentally and shown to achieve high data integrity under normal and disturbed microgrid conditions.
This PhD thesis proposes methods for distributed iterative decoding and processing in wireless cooperative networks. It presents a block structure layered design for jointly designing channel encoders across all nodes in a wireless network. It also develops a generic implementation framework for sum-product algorithm message passing on factor graphs, using a proposed Karhunen-Loeve transform message representation. This framework can be applied to receiver processing in wireless networks. The thesis aims to advance both the global design of channel codes across network nodes, and the receiver processing of individual nodes, to better support wireless cooperative networks.
This document presents Marcus Low Junxiang's honours thesis on 3D surface change detection. It explores methods for preprocessing 3D point cloud data, point cloud registration, data smoothing techniques, and threshold-based change detection. The goal is to develop a comprehensive pipeline for detecting changes between two 3D surface scans of the same scene captured at different times. The thesis covers topics like moving least squares smoothing, correspondence between surfaces, differencing algorithms, and experiments evaluating the proposed pipeline.
Trade-off between recognition and reconstruction: Application of Robotics Visi... – stainvai
Autonomous and efficient action of robots requires a robust robot vision system that can cope with variable light and view conditions. These include partial occlusion, blur, and mainly a large scale difference of object size due to variable distance to the objects. This change in scale leads to reduced resolution for objects seen from a distance. One of the most important tasks for the robot's visual system is object recognition. This task is also affected by orientation and background changes. These real-world conditions require the development of specific object recognition methods.

This work is devoted to robotic object recognition. We develop recognition methods based on training that includes incorporation of prior knowledge about the problem. The prior knowledge is incorporated via learning constraints during training (parameter estimation). A significant part of the work is devoted to the study of reconstruction constraints. In general, there is a trade-off between the prior-knowledge constraints and the constraints emerging from the classification or regression task at hand. In order to avoid the additional estimation of the optimal trade-off between these two constraints, we consider this trade-off as a hyperparameter (under a Bayesian framework) and integrate over a certain (discrete) distribution. We also study various constraints resulting from information theory considerations.

Experimental results on two face datasets are presented. Significant improvement in face recognition is achieved for various image degradations such as various forms of image blur, partial occlusion, and noise. Additional improvement in recognition performance is achieved when preprocessing the degraded images via state-of-the-art image restoration techniques.
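The idea of integrating the trade-off out rather than tuning it can be shown in one dimension: fit a penalised slope for several trade-off values and average the resulting predictions under a uniform discrete distribution (a toy sketch; the thesis applies this under a Bayesian framework to recognition constraints, not to this simple regression):

```python
def fit_w(xs, ys, lam, w_prior=0.0):
    """Penalised least squares for y = w*x: minimises
    sum (y - w*x)^2 + lam * (w - w_prior)^2, in closed form."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (sxy + lam * w_prior) / (sxx + lam)

def averaged_prediction(xs, ys, x_new, lams=(0.01, 0.1, 1.0, 10.0)):
    """Instead of estimating the optimal trade-off lam, integrate it
    out over a discrete uniform distribution: average the predictions
    made under each candidate value."""
    return sum(fit_w(xs, ys, lam) * x_new for lam in lams) / len(lams)
```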
This document presents a thesis on leader-following consensus control of multi-agent systems considering communication constraints. It introduces the motivation for studying consensus problems in multi-agent systems and provides an overview of relevant literature. It then describes the modeling of multi-agent systems using algebraic graph theory to represent different communication constraints. A Lyapunov-based control approach is developed to achieve consensus under constant and time-varying communication delays and packet data losses. Simulation and experimental results validate the proposed control approach.
This document provides an introduction to spatial data analysis using open source software R. It discusses spatial data concepts like spatial reference systems and coordinate reference systems. It describes how to create, load and visualize spatial point, line and polygon data in R. It also covers digital image processing and classification in QGIS. Methods discussed include spatial point pattern analysis, interpolation, geostatistics, spatial modeling and accuracy assessment. The document uses data from Kilimanjaro region as an example to demonstrate these spatial analysis techniques in R and QGIS.
Use PyCharm for remote debugging of WSL on a Windows machine – shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
artificial intelligence and data science contents.pptx – GauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Introduction: e-waste definition; sources of e-waste; hazardous substances in e-waste; effects of e-waste on environment and human health; need for e-waste management; e-waste handling rules; waste minimization techniques for managing e-waste; recycling of e-waste; disposal treatment methods of e-waste; mechanism of extraction of precious metal from leaching solution; global scenario of e-waste; e-waste in India; case studies.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024 – Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Comparative analysis between traditional aquaponics and reconstructed aquapon... – bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
UCLA
UCLA Electronic Theses and Dissertations

Title: Detecting Coronavirus Disease 2019 Pneumonia in Chest X-Ray Images Using Deep Learning
Permalink: https://escholarship.org/uc/item/7355g8v8
Author: Zhu, Ziqi
Publication Date: 2020
Peer reviewed | Thesis/dissertation

eScholarship.org Powered by the California Digital Library
University of California
UNIVERSITY OF CALIFORNIA
Los Angeles
Detecting Coronavirus Disease 2019 Pneumonia
in Chest X-Ray Images
Using Deep Learning
A thesis submitted in partial satisfaction
of the requirements for the degree
Master of Science in Statistics
by
Ziqi Zhu
2020
ABSTRACT OF THE THESIS
Detecting Coronavirus Disease 2019 Pneumonia
in Chest X-Ray Images
Using Deep Learning
by
Ziqi Zhu
Master of Science in Statistics
University of California, Los Angeles, 2020
Professor Yingnian Wu, Chair
The coronavirus disease 2019 (COVID-19) pandemic has already become a global threat. To
fight against COVID-19, effective and fast screening methods are needed. This study focuses
on leveraging deep learning techniques to automatically detect COVID-19 pneumonia in chest
X-ray images. Two models are trained based on transfer learning and residual neural network.
The first one is a binary classifier that separates COVID-19 pneumonia and non-COVID-19
cases. It classifies all test cases correctly. The second one is a four-class classifier that
distinguishes COVID-19 pneumonia, viral pneumonia, bacterial pneumonia and normal cases.
It reaches an average accuracy, precision, sensitivity, specificity, and F1-score of 93%, 93%,
93%, 97%, and 93%, respectively. To understand how the four-class classifier detects
COVID-19 pneumonia, we apply the Gradient-weighted Class Activation Mapping (Grad-CAM)
method and find that the classifier is able to focus on the patchy areas in chest X-ray
images and make accurate predictions.
The thesis of Ziqi Zhu is approved.
Chad J. Hazlett
Hongquan Xu
Yingnian Wu, Committee Chair
University of California, Los Angeles
2020
CHAPTER 1
Introduction
The novel coronavirus disease 2019 (COVID-19) pandemic is an ongoing pandemic caused
by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of 28 May 2020,
more than 5.5 million cases of COVID-19 have been reported in 216 countries, areas and
territories, resulting in 353,373 deaths [Org20]. While most people only have mild to moderate
symptoms, some patients have developed severe illnesses that include pneumonia and acute
respiratory distress syndrome (ARDS). To control the spread of COVID-19, effective
screening of patients is critical. So far, the gold standard screening method is the reverse
transcription polymerase chain reaction (RT-PCR) test which has been designed to detect
SARS-CoV-2 genetically. But it only has a positive rate ranging between 30% and 60%
[AYH20, YYS]. For patients who develop severe illnesses such as pneumonia and ARDS, a
good complementary screening method is radiography examination, where chest radiography
imaging (e.g., X-ray or computed tomography (CT) scan) is analyzed for SARS-CoV-2 viral
infection indicators including bilateral, peripheral ground-glass opacities and pulmonary
consolidations [SAB20]. However, the viral infection indicators can be subtle and it is difficult
for radiologists to distinguish COVID-19 pneumonia from normal cases or other pneumonias.
Thus, computer-aided diagnostic systems that can help detect COVID-19 pneumonia in chest
radiography images are highly desired.
In the deep learning area, convolutional neural networks (CNNs) are typically used to solve
object detection and image classification problems due to their ability to preserve spatial structure
and recognize image features. LeNet-5 [LBB98], introduced by LeCun et al. in 1998, is
the first instantiation of a CNN that has been successfully used in practice. With only seven layers, it achieved high performance on the digit recognition problem. In 2012, AlexNet
[KSH12] entered the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [RDS15],
a benchmark for object detection and image classification tasks at large scale, and was able
to outperform all previous non-deep-learning-based models by a significant margin with top-5
error rate at 16.4%. In 2013, ZFNet [ZF14] was designed with the architecture of AlexNet and
better hyperparameters. Thus, it reached a lower top-5 error rate of 11.7%. In 2014, VGGNet
[SZ14] and GoogLeNet [SLJ15] both made another jump in performance, with top-5 error
rate at 7.3% and 6.7%, respectively. What VGGNet and GoogLeNet have in common is
that they both have deeper architectures. In 2015, the ILSVRC winner was the residual neural
network (ResNet) with 152 layers [HZR16]. It achieved a top-5 error of 3.6%, which is below
the human error of 5.1%. This was the first time that a CNN model outperformed humans in large-scale
object recognition tasks, and it marks a promising future for CNN models.
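The top-5 error rate quoted throughout this history counts a prediction as correct whenever the true class appears among the five highest-scored classes; a minimal sketch:

```python
def top5_error(score_lists, true_labels):
    """Fraction of examples whose true label is NOT among the five
    highest-scoring predicted classes, i.e. the ILSVRC top-5 error."""
    misses = 0
    for scores, label in zip(score_lists, true_labels):
        # indices of the five classes with the highest scores
        top5 = sorted(range(len(scores)), key=lambda c: scores[c],
                      reverse=True)[:5]
        misses += label not in top5
    return misses / len(true_labels)
```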
Motivated by the need for fast and accurate analysis of radiography images, a number of
COVID-19 pneumonia detection models based on state-of-the-art CNN models have been
proposed, and the results have shown to be promising [CRK20, NKP20, AM20]. However, those
studies report results on small datasets under a binary classification (COVID-19
pneumonia and non-COVID-19 cases) or three-class classification (COVID-19 pneumonia,
other pneumonias, and normal cases) setting. In this study, we have prepared a comparatively
large dataset consisting of chest X-ray images of COVID-19 pneumonia, viral pneumonia,
bacterial pneumonia, and normal cases, and have trained two models based on the transfer
learning concept and the ResNet architecture. One model is trained to distinguish COVID-19
pneumonia and non-COVID-19 cases in chest X-ray images, while the other one is trained
to classify COVID-19 pneumonia, viral pneumonia, bacterial pneumonia, and normal cases.
Using the first model, we can quickly screen patients who are suspected of having COVID-19
pneumonia and arrange immediate medical care for them. The second model can help
clinicians not only better screen for COVID-19, but also decide on a treatment strategy and make
a treatment plan based on the cause of the infection.
This thesis is organized as follows. First, Chapter 2 describes the generation of the dataset
for two models. Chapter 3 describes the model architecture, transfer learning concept,
implementation details, performance evaluation metrics, and the model interpretation strategy.
Chapter 4 presents and discusses the results of experiments conducted to evaluate and interpret
the proposed models. Finally, conclusions are drawn in Chapter 5.
CHAPTER 2
Dataset
2.1 Data Description
The dataset used to train and evaluate our models consists of 6,216 chest X-ray images across
6,001 patients. To generate this dataset, we combine and modify two different datasets
that are publicly available: COVID Chest X-Ray Dataset [CMD20] and Labeled Optical
Coherence Tomography (OCT) and Chest X-Ray Images [KZG18]. The following section
gives a description of the two datasets and a summary of how we generate our dataset:
COVID Chest X-Ray Dataset: This dataset comprises 360 chest X-ray images and
CT scans across 198 patients who are positive for or suspected of COVID-19 pneumonia or
other pneumonias. In this study, we only select the 232 chest X-ray images with anteroposterior
or posteroanterior chest view from 145 patients who are positive for or suspected of COVID-19
pneumonia.
Labeled OCT and Chest X-Ray Images: This dataset has a total of 109,309 OCT
images and 5,856 chest X-ray images that have been collected and labeled by the same lab.
The chest X-ray images include 4,273 pneumonia images (1,493 viral and 2,780 bacterial) and
1,583 normal images from a total of 5,856 patients. X-ray images are already split into training
set and testing set. In this study, we choose all chest X-ray images and use the predefined
training set and testing set.
The summary of our dataset is shown in Table 2.1. Sample chest X-ray images for each
pneumonia type are shown in Figure 2.1.
Table 2.1: Details of our dataset
Original Dataset                     Type        Total Patients   Total Images
COVID Chest X-Ray Dataset            COVID-19    145              232
Labeled OCT and Chest X-Ray Images   Viral       1493             1493
                                     Bacterial   2780             2780
                                     Normal      1583             1583
Figure 2.1: Sample chest X-ray images from our dataset: (A) COVID-19 pneumonia, (B) viral pneumonia, (C) bacterial pneumonia, and (D) normal case.
2.2 Dataset Preprocessing
Our dataset consists of 232 COVID-19 pneumonia images, 1,493 viral pneumonia images,
2,780 bacterial pneumonia images and 1,583 normal case images, shown in Table 2.1. For
COVID-19 pneumonia images, there are multiple images from some of the patients. In order
to ensure that images of each patient appear only in training set or testing set, we separate
the data by patient before splitting into training and testing set. Among 232 COVID-19
pneumonia images, 172 images fall into the training set and the remaining 60 images fall into the
testing set. For chest X-ray images of the other pneumonia types, we use the predefined
training set and testing set in the "Labeled OCT and Chest X-Ray Images" dataset in our
four-class classification model, and label all of them as non-COVID-19 cases in our binary
classification model. The data distributions of the binary classification problem and four-class
classification problem are shown in Figure 2.2.
Figure 2.2: Data distribution for: (A) binary classification, and (B) four-class classification.
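The per-patient separation described above can be sketched as a grouped split, where whole patients, not individual images, are assigned to the training or testing set (an illustrative helper, not the thesis's actual code):

```python
import random

def split_by_patient(records, test_frac=0.25, seed=0):
    """Split (patient_id, image) records so that every patient's images
    land entirely in the training set or entirely in the testing set,
    preventing leakage of a patient across the split."""
    patients = sorted({pid for pid, _ in records})
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, int(len(patients) * test_frac))
    test_ids = set(patients[:n_test])
    train = [r for r in records if r[0] not in test_ids]
    test = [r for r in records if r[0] in test_ids]
    return train, test
```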
A noticeable problem is the limited amount of COVID-19 pneumonia images, which causes
the data to be imbalanced and adds challenge to the performance of our classification models.
To alleviate this problem, we use an over-sampling method to even up the data between classes.
More specifically, we make COVID-19 pneumonia images 8-fold in the four-class classification
problem and 24-fold in the binary classification problem. The details of our final training
sets and testing sets are shown in Table 2.2 and Table 2.3. The final data distribution of two
6
17. problems are shown in Figure 2.3.
Table 2.2: Details of the training and testing sets for the binary classification model

Label          Pneumonia Type                              Training Set   Testing Set   Total
Non-COVID-19   Viral, Bacterial, and Normal (no infection) 5,232          624           5,856
COVID-19       COVID-19                                    4,128          60            4,188
Table 2.3: Details of the training and testing sets for the four-class classification model

Label       Pneumonia Type          Training Set   Testing Set   Total
COVID-19    COVID-19                1,376          60            1,436
Viral       Viral                   1,345          148           1,493
Bacterial   Bacterial               2,538          242           2,780
Normal      Normal (no infection)   1,349          234           1,583
Figure 2.3: Final data distribution for: (A) binary classification, and (B) four-class classification.
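The over-sampling arithmetic behind Tables 2.2 and 2.3 can be checked with a minimal sketch; simple list replication stands in for the actual data pipeline, and the file names are hypothetical:

```python
def oversample(images, factor):
    """Simple over-sampling: replicate the minority-class image list `factor` times."""
    return images * factor

# Hypothetical file names; 172 is the number of COVID-19 training images.
covid_train = [f"covid_{i}.png" for i in range(172)]

assert len(oversample(covid_train, 8)) == 1376   # four-class training set (Table 2.3)
assert len(oversample(covid_train, 24)) == 4128  # binary training set (Table 2.2)
```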
CHAPTER 3
Methodology
In this chapter, we will discuss the model architecture, transfer learning concept,
implementation details, performance evaluation metrics, and the strategy for interpreting models.
3.1 ResNet Architecture
Residual neural network (ResNet) was proposed by He et al. in 2015 [HZR16]. The hypothesis
behind ResNet is that deeper networks are harder to optimize rather than fundamentally worse:
a deeper model should be able to perform at least as well as a shallower one by copying the
learned parameters from the shallower model and setting the additional layers to identity
mappings. To help optimize deeper models, residual blocks are designed to fit a residual
mapping F(x) instead of the desired underlying mapping H(x), so that H(x) = F(x) + x, and
the full ResNet architecture is built by stacking residual blocks.
More specifically, every residual block has two 3 × 3 convolutional layers. Periodically, the
number of filters is doubled and spatial downsampling is applied. Figure 3.1 illustrates the
structure within the residual block and Figure 3.2 shows an example of ResNet architecture -
ResNet with 18 layers (ResNet18).
Figure 3.1: Structure of residual block
Figure 3.2: Schema of ResNet18 architecture
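The identity-shortcut idea can be illustrated with a toy sketch in pure Python, using element-wise addition; `residual_block` and `zero_f` are our illustrative names, not code from the thesis:

```python
def residual_block(x, f):
    """Output of a residual block: H(x) = F(x) + x (identity shortcut)."""
    return [fi + xi for fi, xi in zip(f(x), x)]

# If the learned residual mapping F is zero, the block reduces to the
# identity mapping, which is the easy-to-optimize default ResNet is built around.
zero_f = lambda x: [0.0] * len(x)

assert residual_block([1.0, 2.0, 3.0], zero_f) == [1.0, 2.0, 3.0]
```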
In this study, we build two deep models based on ResNet18 for the classification under
two schemes. One is a binary classification that distinguishes COVID-19 pneumonia and
non-COVID-19 cases, while the other is a four-class classification that classifies COVID-19
pneumonia, viral pneumonia, bacterial pneumonia, and normal cases.
3.2 Transfer Learning
Successful training of deep neural networks often requires a large-scale dataset and a long
training period. Moreover, a basic assumption of many deep learning models is that the
training and testing data should be drawn from the same distribution. In many real-world
applications, the availability of data is limited due to the high cost of collecting and labelling
data. In such cases, retaining and reusing previously learned knowledge for a different data
distribution, task or domain is critical. Transfer learning is a machine learning method where
a previously trained model is reused as the starting point for a new model and task. It not
only alleviates the need and effort to collect training data, but also accelerates the training
process.
There are two commonly used strategies to exploit transfer learning on deep convolutional
neural networks. The first strategy, called feature extraction, uses the pretrained model,
retaining both its initial architecture and learned parameters, only to extract image features
as the input of a new classification model. The second strategy makes some modifications to
the pretrained model, such as architecture adjustments and parameter fine-tuning, to improve
the extracted image features on the new dataset and achieve optimal results.
In this study, the second strategy is used. More specifically, we use ImageNet [RDS15],
a large-scale dataset consisting of 1.2 million high-resolution images in 1,000 different classes
(e.g., fish, bird, tree, flower, sport, room), to pretrain ResNet18. Since images in ImageNet
are different from our dataset, we cannot directly use the pretrained model to extract image
features for classification. Instead, we need to fine-tune parameters of the pretrained model
on our dataset to generate better image features. In addition, the architecture of ResNet18 is
modified: the number of outputs in the final fully-connected layer is changed from 1,000
to the number of classes in each classification problem. Thus, the fine-tuned model outputs
a score for each class and classifies an input image into the class with the maximum score. The
schematic representation of the training process is depicted in Figure 3.3.
Figure 3.3: Schematic of transfer learning with ResNet18 for: (A) four-class classification
problem, and (B) binary classification problem.
3.3 Implementation Details
3.3.1 Data Augmentation
We apply data augmentation methods, including rotation, scaling, translation, and horizontal
flip, on the training set to address the data deficiency problem. That is, each image in the
training set is randomly transformed by the following operations:
• Rotation: Rotating the image by an angle between 0 and 15 degrees, clockwise or
counterclockwise.
• Scaling: Rescaling the image by a factor sampled randomly between 90% and 110%.
• Translation: Translating the image horizontally and vertically by between -10% and 10%.
• Horizontal Flip: Horizontally flipping the image with a probability of 0.5.
3.3.2 Hyperparameters
In this study, all images are normalized and resized to 224 × 224 pixels. We use five-fold cross
validation to evaluate the model and tune the hyperparameters, and obtain the generalized
results in the testing stage. The following hyperparameters are used for training:
Binary classification model: Batch size = 256, cross entropy loss, stochastic gradient
descent (SGD) optimizer with momentum = 0.9 and weight decay (L2 penalty) = 0.01,
learning rate = 0.003 (decayed by a factor of 0.1 at epochs 10 and 20), 50 epochs in total,
and early stopping with patience = 10.
Four-class classification model: Batch size = 256, cross entropy loss, SGD optimizer
with momentum = 0.9 and weight decay (L2 penalty) = 0.1, learning rate = 0.003 (decayed
by a factor of 0.1 at epochs 20 and 40), 50 epochs in total, and early stopping with
patience = 15.
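In PyTorch, the binary model's optimizer and learning-rate schedule can be sketched as follows. This is a minimal sketch with a stand-in model; the thesis does not list its training code:

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the ResNet18 classifier
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.003,
                            momentum=0.9, weight_decay=0.01)
# Decay the learning rate by a factor of 0.1 at epochs 10 and 20.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[10, 20], gamma=0.1)

for epoch in range(21):
    # ... one training epoch over the data loader would go here ...
    optimizer.step()
    scheduler.step()

# After both milestones, lr = 0.003 * 0.1 * 0.1.
assert abs(optimizer.param_groups[0]["lr"] - 3e-5) < 1e-12
```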
3.4 Performance Evaluation Metrics
Due to the data imbalance problem, the performance of the model is evaluated using eight
evaluation metrics - accuracy, confusion matrix, precision, recall (or sensitivity), specificity
(or selectivity), F1-score, receiver operating characteristic (ROC) curve and area under the
curve (AUC).
3.4.1 Confusion Matrix
A confusion matrix is a summary of predicted labels and true labels on a classification
problem. It gives us insights not only into the errors but also into the types of errors that
are being made by the classifier. The schema of confusion matrix in a binary classification
problem is shown in Figure 3.4.
Figure 3.4: Schema of confusion matrix
The accuracy, precision, recall (or sensitivity), specificity (or selectivity) and F1-score can
be computed from the confusion matrix as follows:

    accuracy = (TP + TN) / (TP + FP + TN + FN)

    precision = TP / (TP + FP)

    recall (or sensitivity) = TP / (TP + FN)

    specificity (or selectivity) = TN / (TN + FP)

    F1-score = (2 × precision × recall) / (precision + recall)
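These definitions compute directly from the four counts; a minimal sketch follows, where the counts are illustrative and not from the thesis:

```python
def binary_metrics(tp, fp, tn, fn):
    """Metrics derived from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # also called selectivity
        "f1": 2 * precision * recall / (precision + recall),
    }

# Illustrative counts: 50 TP, 5 FP, 40 TN, 5 FN.
m = binary_metrics(tp=50, fp=5, tn=40, fn=5)
assert m["accuracy"] == 0.9
assert abs(m["f1"] - 50 / 55) < 1e-12
```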
For a multi-class classification problem, we can use the one-vs-all methodology to compute
the above metrics for each class, and use the macro average and the weighted average of the
metrics to evaluate overall model performance. The macro average is the unweighted mean of
the target metric (e.g., accuracy) over all classes, while the weighted average weights each
class by its sample size. If every class has the same sample size, the macro average and the
weighted average are always the same.
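The difference between the two averages can be shown with a short computation; the per-class recalls below are hypothetical, while the class sizes are the test-set sizes from Table 2.3:

```python
def macro_avg(scores):
    """Unweighted mean over classes."""
    return sum(scores) / len(scores)

def weighted_avg(scores, sizes):
    """Mean over classes weighted by sample size."""
    return sum(s * n for s, n in zip(scores, sizes)) / sum(sizes)

# Hypothetical per-class recalls; sizes are the test-set sizes in Table 2.3.
recalls = [1.00, 0.85, 0.95, 0.90]   # COVID-19, viral, bacterial, normal
sizes = [60, 148, 242, 234]

assert abs(macro_avg(recalls) - 0.925) < 1e-9
assert round(weighted_avg(recalls, sizes), 3) == 0.916
```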
3.4.2 ROC Curve and AUC
The ROC curve is a graphical plot that shows the performance of a binary classification model.
It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at
various discrimination thresholds, where TPR is also known as sensitivity or recall, and FPR
can be calculated as (1 − specificity). As the model's discrimination threshold changes, the
confusion matrix changes and a new point on the ROC curve is obtained.
The ROC curve for a random binary classification model is the diagonal line from (0,0) to
(1,1). A curve above the diagonal represents a classifier that is better than random, while a
curve below the diagonal means the model is worse than random. A perfect binary
classification model yields a curve that passes through the point (0,1), meaning 100%
sensitivity and 100% specificity.
AUC refers to the area under the ROC curve, which is always between 0 and 1. From the
concept of the ROC curve, it follows that the higher the AUC, the better the binary
classification model. A random classifier has an AUC of 0.5, and a strong model has an AUC
close to 1. With an AUC less than 0.5, the model tends to invert the classes. A schema of the
ROC curve and AUC is shown in Figure 3.5.
Figure 3.5: Schema of ROC curve and AUC : red line: a perfect classifier, blue curve: a great
classifier, yellow line: a random classifier, and shaded area: AUC for the random classifier.
For a multi-class classification model, we can plot an ROC curve for each class using the
one-vs-all methodology.
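Sweeping the threshold and collecting (FPR, TPR) pairs can be sketched in a few lines; the scores and labels below are illustrative, and `roc_points` is our name:

```python
def roc_points(scores, labels, thresholds):
    """(FPR, TPR) pairs at each discrimination threshold (labels: 1 = positive)."""
    pts = []
    pos = sum(labels)
    neg = len(labels) - pos
    for t in thresholds:
        preds = [s >= t for s in scores]
        tp = sum(p and l for p, l in zip(preds, labels))
        fp = sum(p and not l for p, l in zip(preds, labels))
        pts.append((fp / neg, tp / pos))
    return pts

# A perfect scorer separates the classes, giving the ideal point (0, 1).
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
assert roc_points(scores, labels, thresholds=[0.5]) == [(0.0, 1.0)]
```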
3.5 Model Interpretation
Gradient-weighted class activation mapping (Grad-CAM) [SCD17] is a deep learning technique
that can generate visual explanations of CNN-based models. Thus, it can make models
more transparent and easier to interpret. Using the gradients of a target concept (i.e., the
COVID-19 class in our four-class classification) flowing into the final convolutional layer, it
can produce a localization map showing important regions for the model to make predictions.
Grad-CAM not only provides a way to evaluate models, but also helps humans study and
better understand deep learning models.
In our study, we apply the Grad-CAM method to the four-class classification model and
choose COVID-19 pneumonia as the target concept, to understand how the model makes
predictions for COVID-19 cases.
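The core of the Grad-CAM computation is simple once the activations and their gradients are available; the sketch below is our illustration of the technique from [SCD17], and a real application would obtain the two tensors via hooks on the final convolutional layer:

```python
import torch
import torch.nn.functional as F

def grad_cam(activations, gradients):
    """Grad-CAM map: ReLU of the gradient-weighted sum of feature maps.
    activations/gradients: (C, H, W) tensors from the final conv layer."""
    weights = gradients.mean(dim=(1, 2))  # global-average-pool the gradients
    cam = (weights[:, None, None] * activations).sum(dim=0)
    return F.relu(cam)

# Toy check: a channel whose gradient is negative is suppressed by the ReLU,
# so only regions that increase the target class score are highlighted.
acts = torch.ones(2, 4, 4)
grads = torch.tensor([1.0, -1.0])[:, None, None].expand(2, 4, 4)
assert torch.equal(grad_cam(acts, grads), torch.zeros(4, 4))
```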
CHAPTER 4
Results
In this chapter, we will discuss the performance of the two classification models as well as
the interpretation of the four-class classification model.
4.1 Performance of the Binary Classification Model
A binary classifier is implemented to separate COVID-19 pneumonia and non-COVID-19
cases. The classifier manages to classify all COVID-19 and non-COVID-19 test cases correctly.
Accuracy, precision, sensitivity, specificity, F1-score and AUC are all 100.0%. Figure 4.1
shows the visualization of the confusion matrix for the binary classification. The summary of
results is in Table 4.1 and the ROC curve is shown in Figure 4.2.
Figure 4.1: Confusion matrix of the binary classification
Table 4.1: Results for binary classification
Label Size Accuracy Precision Recall Selectivity F1-score AUC
COVID-19 60 1.00 1.00 1.00 1.00 1.00 1.00
Non-COVID-19 624 1.00 1.00 1.00 1.00 1.00 1.00
Figure 4.2: ROC curve of the binary classification model
4.2 Performance of the Four-class Classification Model
A four-class classifier is implemented to distinguish COVID-19 pneumonia, viral pneumonia,
bacterial pneumonia, and normal cases. This classifier achieves an average accuracy of 93%,
with an average precision, sensitivity, specificity, and F1-score of 93%, 93%, 97%, and 93%,
respectively. ROC curves are generated to evaluate the model’s ability to distinguish each
class from other classes using one-vs-all method, and all AUCs are more than 96%. Figure
4.3 shows the visualization of the confusion matrix for the four-class classification model.
Table 4.2 summarizes the details of the results for this four-class classification model and
Figure 4.4 shows four ROC curves and the zoomed area.
Figure 4.4: ROC curve of the four-class classification model: (A) the original ROC curves,
and (B) a zoomed area of the ROC curves.
Overall, the model reaches very high performance in distinguishing the four classes. For
COVID-19 pneumonia, all COVID-19 test cases are classified correctly, which is very
important in screening for COVID-19 pneumonia. However, the model seems to be less
powerful in distinguishing the other three categories: it tends to classify normal cases into
the other categories and to classify viral pneumonia images as bacterial pneumonia.
4.3 Model Interpretation
The Grad-CAM method generates a localization map highlighting the regions of an image
that are important for the model's predictions. After applying Grad-CAM to COVID-19 images
in the four-class classification model, we can try to understand how the model predicts
COVID-19 pneumonia given chest X-ray images. We are able to obtain some chest X-ray
images of COVID-19 cases with detailed descriptions and diagnoses by radiologists [Rad]. By
comparing the diagnoses and the localization maps, we can not only get a better understanding
of how the model makes predictions but also check whether the model focuses on the same
regions as radiologists do. Some samples are shown in Figures 4.5, 4.6, 4.7, 4.8, and 4.9. Two
things
need to be noted: 1) left and right are reversed in chest X-ray images, and 2) in the
following figures, the left images are the original chest X-rays while the right images (resized
to 224 × 224 pixels) show the localization maps generated by Grad-CAM. Radiologists'
comments are listed in the captions.
Figure 4.5: Comment: patchy areas of air space opacification bilaterally with a lower zone
predominance.
Figure 4.6: Comment: there is peripheral patchy air space opacification seen in both lung
lower zones with diffuse ground-glass haze bilaterally.
Figure 4.7: Comment: multifocal consolidation in right mid zone and left mid/lower zones.
Figure 4.8: Comment: multiple faint alveolar opacities are identified, predominantly peripheral
with greater involvement of the upper lobes.
Figure 4.9: Comment: there is established left upper lobe and now progressive patchy
consolidation in left lower, right upper and middle lobes.
In Figures 4.5 and 4.6, our model focuses on the relevant areas. In Figures 4.7-4.9, our model
tends to be unfocused and highlights larger areas than the radiologists' descriptions indicate.
From the above figures, we can conclude that 1) our model manages to focus on the lung
area rather than the other areas shown in the chest X-ray images (e.g., cervical spine, upper
limb, costae, and abdomen), and 2) our model tends to focus on areas with rich textures
(e.g., patchy areas). Thus, our model is most effective when patchy areas are the main
indicator of COVID-19 pneumonia. Otherwise, although our model can still make accurate
predictions, it tends to be unfocused and fails to highlight the specific areas that indicate
COVID-19 pneumonia.
CHAPTER 5
Conclusion
In this study, we train two deep learning models based on the transfer learning concept and
the ResNet18 architecture for COVID-19 pneumonia detection in chest X-ray images. The first
one is a binary classification model that aims to separate COVID-19 pneumonia and non-
COVID-19 cases. It is able to classify all test cases correctly. The second one is a four-class
classification model that aims to distinguish COVID-19 pneumonia, viral pneumonia, bacterial
pneumonia and normal cases. It reaches an average accuracy, precision, sensitivity, specificity,
F1-score, and AUC of 93%, 93%, 93%, 97%, 93%, and 99%, respectively. This model is able
to detect all COVID-19 test cases, but it tends to classify normal cases into other classes
and tends to classify viral pneumonia images as bacterial pneumonia. To shed light on
how the four-class classification model makes predictions for COVID-19 pneumonia cases, we
apply the Grad-CAM method to generate localization maps that highlight the regions the
model relies on to make predictions. By comparing the highlighted regions with radiologists'
descriptions of the chest X-ray images, we find that our model is most effective when patchy
areas are the main indicators of COVID-19 pneumonia. Otherwise, although our model can
still make accurate predictions, it tends to be unfocused and fails to highlight the specific
areas that indicate COVID-19 pneumonia.
COVID-19 has already become a global threat and has taken away hundreds of thousands
of people’s lives. This study shows the feasibility of building a computer-aided diagnostic
system that can help clinicians detect COVID-19 pneumonia from radiology images accurately
and quickly. Moreover, model interpretation techniques allow us to further evaluate and
understand models. However, the limited data remains a challenge for model performance.
By collecting more chest X-ray images of COVID-19 pneumonia, other pneumonia types, and
normal cases, the model can become more robust and powerful.
REFERENCES
[AKM17] Benjamin Antin, Joshua Kravitz, and Emil Martayan. “Detecting pneumonia in
chest X-Rays with supervised learning.” Semanticscholar.org, 2017.
[AM20] Ioannis D Apostolopoulos and Tzani A Mpesiana. “Covid-19: automatic detection
from x-ray images utilizing transfer learning with convolutional neural networks.”
Physical and Engineering Sciences in Medicine, p. 1, 2020.
[AYH20] Tao Ai, Zhenlu Yang, Hongyan Hou, Chenao Zhan, Chong Chen, Wenzhi Lv, Qian
Tao, Ziyong Sun, and Liming Xia. “Correlation of chest CT and RT-PCR testing in
coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases.” Radiology,
p. 200642, 2020.
[CMD20] Joseph Paul Cohen, Paul Morrison, and Lan Dao. “COVID-19 image data collec-
tion.” arXiv preprint arXiv:2003.11597, 2020.
[CRK20] Muhammad EH Chowdhury, Tawsifur Rahman, Amith Khandakar, Rashid Mazhar,
Muhammad Abdul Kadir, Zaid Bin Mahbub, Khandakar R Islam, Muham-
mad Salman Khan, Atif Iqbal, Nasser Al-Emadi, et al. “Can AI help in screening
viral and COVID-19 pneumonia?” arXiv preprint arXiv:2003.13145, 2020.
[DJV14] Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng,
and Trevor Darrell. “Decaf: A deep convolutional activation feature for generic
visual recognition.” In International conference on machine learning, pp. 647–655,
2014.
[HZR16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning
for image recognition.” In Proceedings of the IEEE conference on computer vision
and pattern recognition, pp. 770–778, 2016.
[KGC18] Daniel S Kermany, Michael Goldbaum, Wenjia Cai, Carolina CS Valentim, Huiying
Liang, Sally L Baxter, Alex McKeown, Ge Yang, Xiaokang Wu, Fangbing Yan,
et al. “Identifying medical diagnoses and treatable diseases by image-based deep
learning.” Cell, 172(5):1122–1131, 2018.
[KSH12] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “Imagenet classification
with deep convolutional neural networks.” In Advances in neural information
processing systems, pp. 1097–1105, 2012.
[KZG18] Daniel Kermany, Kang Zhang, and Michael Goldbaum. “Labeled optical coherence
tomography (oct) and chest X-ray images for classification.” Mendeley data, 2,
2018.
[LBB98] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. “Gradient-based
learning applied to document recognition.” Proceedings of the IEEE, 86(11):2278–
2324, 1998.
[NKP20] Ali Narin, Ceren Kaya, and Ziynet Pamuk. “Automatic detection of coronavirus
disease (covid-19) using x-ray images and deep convolutional neural networks.”
arXiv preprint arXiv:2003.10849, 2020.
[Org20] World Health Organization. “Coronavirus disease (COVID-19) outbreak situation.”
https://www.who.int/emergencies/diseases/novel-coronavirus-2019,
2020. [Online; accessed 28-May-2020].
[PY09] Sinno Jialin Pan and Qiang Yang. “A survey on transfer learning.” IEEE Transac-
tions on knowledge and data engineering, 22(10):1345–1359, 2009.
[Rad] Radiopaedia Blog Rss. “Cases.” https://radiopaedia.org/cases. [Online;
accessed 27-May-2020].
[RDS15] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma,
Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C.
Berg, and Li Fei-Fei. “ImageNet Large Scale Visual Recognition Challenge.”
International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
[SAB20] Sana Salehi, Aidin Abedi, Sudheer Balakrishnan, and Ali Gholamrezanezhad.
“Coronavirus disease 2019 (COVID-19): a systematic review of imaging findings in
919 patients.” American Journal of Roentgenology, pp. 1–7, 2020.
[SAS14] Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson.
“CNN features off-the-shelf: an astounding baseline for recognition.” In Proceedings
of the IEEE conference on computer vision and pattern recognition workshops, pp.
806–813, 2014.
[SCD17] Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedan-
tam, Devi Parikh, and Dhruv Batra. “Grad-cam: Visual explanations from deep
networks via gradient-based localization.” In Proceedings of the IEEE international
conference on computer vision, pp. 618–626, 2017.
[SLJ15] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir
Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. “Going
deeper with convolutions.” In Proceedings of the IEEE conference on computer
vision and pattern recognition, pp. 1–9, 2015.
[SZ14] Karen Simonyan and Andrew Zisserman. “Very deep convolutional networks for
large-scale image recognition.” arXiv preprint arXiv:1409.1556, 2014.
[UKK20] Buddhisha Udugama, Pranav Kadhiresan, Hannah N Kozlowski, Ayden Malek-
jahani, Matthew Osborne, Vanessa YC Li, Hongmin Chen, Samira Mubareka,
Jonathan B Gubbay, and Warren CW Chan. “Diagnosing COVID-19: the disease
and tools for detection.” ACS nano, 14(4):3822–3835, 2020.
[Wik20] Wikipedia contributors. “Coronavirus disease 2019 — Wikipedia, The Free En-
cyclopedia.” https://en.wikipedia.org/w/index.php?title=Coronavirus_
disease_2019&oldid=959365303, 2020. [Online; accessed 28-May-2020].
[WW20] Linda Wang and Alexander Wong. “COVID-Net: A tailored deep convolutional
neural network design for detection of COVID-19 cases from chest radiography
images.” arXiv preprint arXiv:2003.09871, 2020.
[YYS] Y Yang, M Yang, C Shen, F Wang, J Yuan, J Li, M Zhang, Z Wang, L Xing,
J Wei, et al. “Evaluating the accuracy of different respiratory specimens in the
laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections.”
medRxiv, 2020.02.11.20021493, 2020.
[ZF14] Matthew D Zeiler and Rob Fergus. “Visualizing and understanding convolutional
networks.” In European conference on computer vision, pp. 818–833. Springer,
2014.