This document presents research on using convolutional neural networks (CNNs) to detect skin lesions from dermoscopic images. The researchers:
1. Developed a CNN (U-Net) to segment skin lesions from images, achieving a Dice coefficient of 0.8689.
2. Used a fine-tuned VGG-16 network to classify images as benign or malignant. They found that using their automatic segmentations as input improved sensitivity over using unaltered images.
3. Concluded that their deep learning approach can help dermatologists diagnose skin cancer, and that automatic segmentation improves classification sensitivity compared to using whole images, even without perfect segmentation. This verifies their hypothesis that segmentation enhances classification.
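The Dice coefficient reported in point 1 measures the overlap between a predicted segmentation mask and the ground-truth mask. A minimal NumPy sketch (illustrative only, not the authors' code):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1 = lesion, 0 = background)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # 2|A∩B| / (|A| + |B|); eps guards against two empty masks
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Two toy 4x4 masks that agree on 3 of their foreground pixels
a = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
b = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
print(dice_coefficient(a, b))  # 2*3 / (4+3) ≈ 0.8571
```

A perfect segmentation gives 1.0, so the reported 0.8689 indicates substantial but imperfect overlap.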
https://telecombcn-dl.github.io/2018-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Satellite Image Classification with Deep Learning: Survey (ijtsrd)
Satellite imagery is important for many applications, including disaster response, law enforcement and environmental monitoring. These applications require the identification of objects in the imagery, and because the geographic area to be covered is very large and the analysts available to conduct the searches are few, automation is required. Yet traditional object detection and classification algorithms are too inaccurate and unreliable to solve the problem. Deep learning is part of a broader family of machine learning methods that have shown promise for automating such tasks, and it has achieved success in image understanding by means of convolutional neural networks. The problem of object and facility recognition in satellite imagery is considered. The system consists of an ensemble of convolutional neural networks and additional neural networks that integrate satellite metadata with image features. Roshni Rajendran | Liji Samuel, "Satellite Image Classification with Deep Learning: Survey", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4, Issue-2, February 2020.
URL: https://www.ijtsrd.com/papers/ijtsrd30031.pdf
Paper Url : https://www.ijtsrd.com/engineering/computer-engineering/30031/satellite-image-classification-with-deep-learning-survey/roshni-rajendran
This presentation is an analysis of the paper "SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing".
Poster - Convolutional Neural Networks for Real-time Road Sign Detection-V3 (Guangrui Liu)
The document summarizes a convolutional neural network model called YOLO that performs real-time road sign detection in a single stage. It divides the input image into a 7x7 grid, with each grid cell predicting 2 bounding boxes and confidence scores. The model is trained on a dataset of 484 stop signs and 284 yield signs over 5000 batches, and runs at over 24 frames per second on video with an overall accuracy of 92.5%, detecting stop signs at 90% accuracy and yield signs at 95% accuracy.
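The grid layout described above fixes the size of the network's output tensor: with an S x S grid, B boxes per cell (each box carrying 4 coordinates plus a confidence score) and C per-cell class probabilities, the output is S x S x (B*5 + C). A small illustrative calculation for the two-class (stop, yield) setup:

```python
# YOLO-style output tensor size: S x S grid, B boxes per cell
# (each box = x, y, w, h, confidence), C class probabilities per cell.
S, B, C = 7, 2, 2                  # grid size, boxes per cell, classes
output_shape = (S, S, B * 5 + C)
print(output_shape)                # (7, 7, 12)
```

So each forward pass emits all 98 candidate boxes at once, which is what makes the single-stage approach fast.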
This document summarizes lecture material on face recognition. It discusses face detection, alignment, identification, and verification. It also reviews several popular face recognition systems like DeepFace, FaceNet, and Deep ID. Experiments were conducted at UPC on various databases using deep neural networks like VGG, GoogleNet, and ResNet. The best results achieved 97% accuracy on a database of 3,500 identities and 100,000 images. Ongoing work involves verification using advanced techniques like joint Bayesian models, siamese networks, and triplets.
An Image Based PCB Fault Detection and Its Classification (rahulmonikasharma)
The field of electronics is growing like never before. The habitat of electronic components is the printed circuit board (PCB). With the advent of newer and finer technologies, it has become almost impossible to detect faults in a printed circuit board manually, which consumes a great deal of manpower and time. This paper proposes a simple and cost-effective method of fault diagnosis in a PCB using image processing techniques. In addition to fault detection and classification, the paper addresses various problems faced during the pre-processing phase, and it overcomes drawbacks of previous works such as improper orientation and size variations of the image. An image subtraction algorithm is used for fault detection. The work concentrates on the most commonly occurring faults and implements the method using the MATLAB tool.
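The image subtraction step can be sketched as follows. This is an illustrative NumPy version (the paper uses MATLAB); the threshold value is arbitrary, and it assumes the reference and test images are already registered and equally sized:

```python
import numpy as np

def detect_pcb_faults(reference, test_img, threshold=30):
    """Flag pixels where a test-board image deviates from a fault-free
    reference image by more than `threshold` grey levels."""
    # Cast to signed ints so the subtraction cannot wrap around at 0/255
    diff = np.abs(reference.astype(np.int16) - test_img.astype(np.int16))
    return (diff > threshold).astype(np.uint8)  # 1 = suspected fault

ref = np.full((4, 4), 200, dtype=np.uint8)      # toy fault-free board
board = ref.copy()
board[1, 2] = 90                                # simulated broken trace
print(detect_pcb_faults(ref, board).sum())      # 1 faulty pixel flagged
```

The classification stage would then examine each flagged region's shape and polarity (brighter or darker than the reference) to name the fault type.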
This document summarizes an academic paper that proposes a method for incrementally training object detection models to classify unseen object classes in real-time. It begins by providing background on object detection techniques like YOLO and SSD that can perform detection in a single pass. The paper aims to improve these single-shot detectors through incremental learning to classify new object classes without retraining the entire model from scratch. It conducted experiments on YOLO and VGG16 to investigate how well they can classify objects from unseen classes and whether their performance is affected by factors like background, bounding box size, or network architecture. The goal is to develop a more robust object detection method that can easily adapt to new classes of objects in real-time applications.
Codetecon #KRK 3 - Object detection with Deep Learning (Matthew Opala)
There has been enormous progress in object detection algorithms. From multi-stage methods like R-CNN to end-to-end ones like SSD and YOLO, the accuracy of these methods has improved significantly. Current applications include pedestrian detection for cars and face detection on Facebook.
But that’s just the beginning. I am going to show the algorithms for solving the problem, show what’s currently possible, and what will be possible in the near future.
This paper provides an overview of the runs submitted to TRECVID 2016 by ITI-CERTH. ITI-CERTH participated in the Ad-hoc Video Search (AVS), Multimedia Event Detection (MED), Instance Search (INS) and Surveillance Event Detection (SED) tasks. Our AVS task participation is based on a method that combines linguistic analysis of the query with concept-based annotation of video fragments. In the MED task, in the 000Ex subtask we exploit the textual description of an event class in order to retrieve related videos, without using positive samples; in the 010Ex and 1000Ex subtasks, a kernel subclass version of our discriminant analysis method (KSDA) combined with a fast linear SVM is employed. The INS task is performed by employing VERGE, an interactive retrieval application that integrates retrieval functionalities considering only visual information. For the surveillance event detection (SED) task, we deployed a novel activity detection algorithm based on Motion Boundary Activity Areas (MBAA), dense trajectories, Fisher vectors and an overlapping sliding window.
1) The document presents a method for detecting building damage from very high resolution satellite images using one-class SVM classification and shadow information.
2) Initial building damage is detected using one-class SVM classification on multitemporal images. Shadows are then detected and changes in shadows over time are identified.
3) The initial damage detection results are refined by considering areas of shadow change, removing detections not near shadow changes. This combined method improved damage detection accuracy over using spectral data alone.
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a... (YutaSuzuki27)
In the image classification task, we only need to learn local features, but in the image segmentation task, we also need to learn positional information. Therefore, there is a difference between the image segmentation task and the image classification task in the features to be learned. In this study, we propose SE-U-Net++, which efficiently learns both local features and positional information by incorporating SE blocks, and a transfer learning algorithm that bridges the difference between the tasks by comparing parameters in the convolutional layer.
Title: Deep Learning based Segmentation Pipeline for Label-Free Phase-Contrast Microscopy Images
THE 28th IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS
5 - 7 October 2020
Video Link: https://youtu.be/b5tGt6GMN9E
In comparison with other object detection algorithms, YOLO uses an end-to-end neural network that predicts bounding boxes and class probabilities all at once.
Automatic Building detection for satellite Images using IGV and DSM (Amit Raikar)
This document presents a method for automatic building detection from satellite images using internal gray variance (IGV) and a digital surface model (DSM). The proposed method aims to detect low-rise buildings and buildings with partially bright and partially dark rooftops more accurately than existing methods. The key steps include image enhancement, IGV feature extraction, seed point detection using the enhanced image and IGV, clustering using DSM data, binarization, thinning, shadow detection, and segmentation. Results on test satellite images show the method achieves higher detection percentages and lower branch factors than an existing method.
The document compares frame difference and Kalman filter techniques for detecting moving vehicles in video surveillance. Frame difference is a simple but low accuracy method that uses thresholding on differences between frames. Kalman filtering provides better accuracy by modeling each pixel as a Kalman filter and updating estimates based on observations. The paper applies both methods to a vehicle video and finds that Kalman filtering produces cleaner detection with fewer false positives compared to frame difference.
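The frame-difference technique compared above can be sketched in a few lines; this is an illustrative NumPy version with an arbitrary threshold, not the paper's implementation:

```python
import numpy as np

def frame_difference(prev_frame, curr_frame, threshold=25):
    """Binary motion mask: 1 where consecutive greyscale frames differ
    by more than `threshold` intensity levels."""
    # Signed arithmetic avoids uint8 wrap-around in the subtraction
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

prev = np.zeros((3, 3), dtype=np.uint8)   # empty road
curr = prev.copy()
curr[0, :] = 200                          # a "vehicle" entering the top row
print(frame_difference(prev, curr))       # ones along the top row
```

Its weakness, as the paper notes, is that any illumination change or noise above the threshold also fires, which is why the per-pixel Kalman filter yields cleaner masks.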
Gaussian kernel based anatomically-aided diffuse optical tomography reconstruction. The document introduces a kernel method for diffuse optical tomography (DOT) image reconstruction that uses anatomical guidance without requiring image segmentation. A Gaussian kernel is used to relate absorption coefficients between neighboring nodes based on their features. Simulation results show the kernel method achieves comparable or better image quality than soft-prior methods while being more robust to incorrect priors. Experimental validation using a tissue phantom also shows the kernel method can provide anatomical guidance without segmentation. Future work will investigate applying this method to clinical breast imaging data.
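A Gaussian kernel of the kind described can be sketched as follows; the 1-D "anatomical" feature vectors and the bandwidth are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def gaussian_kernel_matrix(features, sigma=1.0):
    """K[i, j] = exp(-||f_i - f_j||^2 / (2 sigma^2)) between node features.
    Nodes with similar anatomical features get coupling weights near 1."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# 3 mesh nodes with 1-D anatomical features; nodes 0 and 1 are similar
f = np.array([[0.0], [0.1], [5.0]])
K = gaussian_kernel_matrix(f, sigma=0.5)
print(K[0, 1], K[0, 2])  # node 0 couples strongly to 1, negligibly to 2
```

Because the coupling decays smoothly with feature distance, no hard segmentation boundary is needed, which is the point of the method.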
Александр Заричковый, "Faster than real-time face detection" (Fwdays)
I will talk about object and face detection problems, the evolution of different approaches to solving them, and the ideas behind each of these approaches. I will also describe a meta-architecture that achieves state-of-the-art results on the face detection problem and runs faster than real time.
An artificial neural network was used to accurately identify the interaction positions of gamma photons in a gamma camera detector module. Training datasets were acquired along lines parallel to the x and y axes to simplify the training process and optimize the neural network structure. The proposed method improved discrimination accuracy at the edges of the detector compared to conventional algorithms and reduced the energy resolution from 22.8% to 15.7%, demonstrating its effectiveness for gamma camera systems.
Avihu Efrat's Viola and Jones face detection slides (wolf)
The document summarizes the Viola-Jones object detection framework. It uses a cascade of classifiers with increasingly more complex features trained with AdaBoost to rapidly detect objects. Integral images allow for very fast feature evaluations. The framework was applied to face detection, achieving very fast average detection speeds of 270 microseconds per sub-window while maintaining low false positive rates.
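The integral image mentioned above is what makes those feature evaluations fast: once the table is built, the sum of any rectangle takes four array lookups regardless of its size. A minimal sketch:

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] in O(1) via 4 lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 9 + 10 = 30
```

A Haar feature is then just the difference of two or three such rectangle sums, so each weak classifier costs a handful of additions.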
KaoNet: Face Recognition and Generation App using Deep Learning (Van Huy)
KaoNet is a face recognition and generation app using deep learning. It uses convolutional neural networks (CNNs) for face recognition and generative adversarial networks (GANs) for face generation. The app was trained on a dataset of celebrity faces collected from online sources. Initial results for face recognition were poor due to overfitting and limited data. Expanding the dataset improved validation accuracy to 98%. The GAN was also able to generate realistic looking faces after training.
IRJET - Real Time Object Detection using YOLOv3 (IRJET Journal)
The document describes using the YOLO (You Only Look Once) algorithm for real-time object detection. YOLO uses a single neural network to predict bounding boxes and class probabilities for the entire image simultaneously. This allows it to detect multiple objects faster than algorithms that require region proposals or sliding windows. The authors trained a YOLO model to detect bottles, cars, and mobiles using 6000 iterations. On their test dataset, the model achieved a mean average precision of 98.14%, intersection over union of 83.19%, and F1-score of 0.94, demonstrating accurate real-time object detection.
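The intersection-over-union figure cited above compares a predicted box against a ground-truth box. An illustrative computation for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # 0 if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / 150 ≈ 0.333
```

An average IoU of 83.19% therefore means the predicted boxes overlap their ground truth substantially, not merely touch it.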
This is a preliminary study whose objective has been the reconstruction of missing parts or scratches of digital images, an important problem used extensively in artwork restoration. This restoration can be done using two approaches: image inpainting and texture synthesis. There are many techniques within these two approaches that can carry out the process optimally and accurately. In this paper, the advantages and disadvantages of most algorithms of the image inpainting approach are discussed. Among the different algorithms, the proposed dynamic masking method outperformed the other techniques. This modification provides rapid and simple reconstruction of small missing and damaged portions of images, two to three orders of magnitude faster than current methods, while producing comparable results.
THE EFFECT OF PHYSICAL BASED FEATURES FOR RECOGNITION OF RECAPTURED IMAGES (ijcsit)
With the development of multimedia technology and digital devices, it is very easy to recapture high-quality images from LCD screens. In authentication, the use of such recaptured images can be very dangerous, so it is important to recognize recaptured images in order to increase authenticity. Although a number of features have been proposed in various state-of-the-art visual recognition tasks, it is still difficult to decide which feature or combination of features has the most significant impact on this task. In this paper, an image recapture detection method based on a set of physical-based features including texture, HSV colour and blurriness is proposed. The paper also evaluates the performance of different distinctive features in the context of recognizing recaptured images. Several experimental setups were conducted to demonstrate the performance of the proposed method, and in all the experimental results the proposed method is efficient, with a good recognition rate. Among the combination of low-level features, the CS-LBP operator, which is used to extract the texture feature, is the most robust.
The document discusses face recognition using principal components analysis (PCA). It provides three key points:
1. PCA is used to reduce the dimensionality of face image data to 2D or 3D by finding patterns in high-dimensional data and visualizing it. This allows for face recognition by representing each face as a set of weights of significant eigenvectors.
2. A training set is used to form the PCA coordinate system and represent each training face as weights of eigenvectors. A test face is then recognized as the closest training face based on Euclidean distance between their representations in the PCA space.
3. PCA allows for data compression, noise reduction, and classification of faces by projecting high-dimensional image data onto a lower-dimensional subspace spanned by the significant eigenvectors.
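The pipeline in points 1-3 can be sketched in a few lines of NumPy; the random toy data and the choice of k = 5 components are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((20, 64))        # 20 toy "faces" (flattened images)
mean = faces.mean(axis=0)
centered = faces - mean

# Principal components via SVD of the centered training set;
# rows of vt are the eigenvectors ("eigenfaces"), keep the top k
_, _, vt = np.linalg.svd(centered, full_matrices=False)
k = 5
components = vt[:k]                 # (k, 64)

# Each training face represented as k eigenvector weights
train_weights = centered @ components.T

def recognize(test_face):
    """Index of the training face closest in PCA space (Euclidean)."""
    w = (test_face - mean) @ components.T
    dists = np.linalg.norm(train_weights - w, axis=1)
    return int(np.argmin(dists))

print(recognize(faces[7]))          # → 7 (its own projection is closest)
```

A test face projects to the same k-dimensional space, and the nearest training representation wins, exactly the Euclidean-distance matching described in point 2.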
Detection and recognition of face using neural networkSmriti Tikoo
This document describes research on face detection and recognition using neural networks. It discusses using the Viola-Jones algorithm for face detection and a backpropagation neural network for face recognition. The Viola-Jones algorithm uses haar features, integral images, AdaBoost training, and cascading classifiers for real-time face detection. A backpropagation network with sigmoid activation functions is trained on facial images for recognition. Results show the network can accurately recognize faces after training. The document concludes the approach allows face recognition from an input image and discusses limitations and potential improvements.
This document summarizes an academic paper that proposes a method for incrementally training object detection models to classify unseen object classes in real-time. It begins by providing background on object detection techniques like YOLO and SSD that can perform detection in a single pass. The paper aims to improve these single-shot detectors through incremental learning to classify new object classes without retraining the entire model from scratch. It conducted experiments on YOLO and VGG16 to investigate how well they can classify objects from unseen classes and whether their performance is affected by factors like background, bounding box size, or network architecture. The goal is to develop a more robust object detection method that can easily adapt to new classes of objects in real-time applications.
Codetecon #KRK 3 - Object detection with Deep LearningMatthew Opala
There’s been enormous progress in object detection algorithms. Starting from multi-stage ones like R-CNN to end-to-end ones like SSD or YOLO, accuracy of the methods improved significantly. Current applications include pedestrian detection for cars and face detection on facebook.
But that’s just the beginning. I am going to show the algorithms for solving the problem, show what’s currently possible, and what will be possible in the near future.
This paper provides an overview of the runs submitted to TRECVID 2016 by ITI-CERTH. ITI-CERTH participated in the Ad-hoc Video Search (AVS), Multimedia Event Detection (MED), Instance Search (INS) and Surveillance Event Detection (SED) tasks. Our AVS task participation is based on a method that combines the linguistic analysis of the query and the concept-based annotation of video fragments. In the MED task, in 000Ex task we exploit the textual description of an event class in order retrieve related videos, without using positive samples. Furthermore, in 010Ex and 1000Ex tasks, a kernel sub class version of our discriminant analysis method (KSDA) combined with a fast linear SVM is employed. The INS task is performed by employing VERGE, which is an interactive
retrieval application that integrates retrieval functionalities that consider only visual information. For the surveillance event detection (SED) task, we deployed a novel activity detection algorithm that is based on Motion Boundary Activity Areas (MBAA), dense trajectories, Fisher vectors and an overlapping sliding window.
1) The document presents a method for detecting building damage from very high resolution satellite images using one-class SVM classification and shadow information.
2) Initial building damage is detected using one-class SVM classification on multitemporal images. Shadows are then detected and changes in shadows over time are identified.
3) The initial damage detection results are refined by considering areas of shadow change, removing detections not near shadow changes. This combined method improved damage detection accuracy over using spectral data alone.
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...YutaSuzuki27
In the image classification task, we only need to learn local features, but in the image segmentation task, we also need to learn positional information. Therefore, there is a difference between the image segmentation task and the image classification task in the features to be learned. In this study, we propose SE-U-Net++, which efficiently learns both local features and positional information by incorporating SE blocks, and a transfer learning algorithm that bridges the difference between the tasks by comparing parameters in the convolutional layer.
Title: Deep Learning based Segmentation Pipeline for Label-Free Phase-Contrast Microscopy Images
THE 28th IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS
5 - 7 October 2020
Video Link: https://youtu.be/b5tGt6GMN9E
In Comparison with other object detection algorithms, YOLO proposes the use of an end-to-end neural network that makes predictions of bounding boxes and class probabilities all at once.
Automatic Building detection for satellite Images using IGV and DSMAmit Raikar
This document presents a method for automatic building detection from satellite images using internal gray variance (IGV) and digital surface model (DSM). The proposed method aims to detect low-rising buildings and buildings with partially bright and partially dark rooftops more accurately than existing methods. The key steps include image enhancement, IGV feature extraction, seed point detection using the enhanced image and IGV, clustering using DSM data, binarization, thinning, shadow detection, and segmentation. Results on test satellite images show the method achieves higher detection percentages and lower branch factors than an existing method.
The document compares frame difference and Kalman filter techniques for detecting moving vehicles in video surveillance. Frame difference is a simple but low accuracy method that uses thresholding on differences between frames. Kalman filtering provides better accuracy by modeling each pixel as a Kalman filter and updating estimates based on observations. The paper applies both methods to a vehicle video and finds that Kalman filtering produces cleaner detection with fewer false positives compared to frame difference.
https://telecombcn-dl.github.io/2018-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Gaussian kernel based anatomically-aided diffuse optical tomography reconstruction. The document introduces a kernel method for diffuse optical tomography (DOT) image reconstruction that uses anatomical guidance without requiring image segmentation. A Gaussian kernel is used to relate absorption coefficients between neighboring nodes based on their features. Simulation results show the kernel method achieves comparable or better image quality than soft-prior methods while being more robust to incorrect priors. Experimental validation using a tissue phantom also shows the kernel method can provide anatomical guidance without segmentation. Future work will investigate applying this method to clinical breast imaging data.
Александр Заричковый "Faster than real-time face detection"Fwdays
I will talk about object and face detection problems, evolution of different approaches to solving these problems and about the ideas behind each of these approaches. Also I will describe meta-architecture that achieve state of the art results on faces detection problem and works faster than real-time.
An artificial neural network was used to accurately identify the interaction positions of gamma photons in a gamma camera detector module. Training datasets were acquired along lines parallel to the x and y axes to simplify the training process and optimize the neural network structure. The proposed method improved discrimination accuracy at the edges of the detector compared to conventional algorithms and reduced the energy resolution from 22.8% to 15.7%, demonstrating its effectiveness for gamma camera systems.
Avihu Efrat's Viola and Jones face detection slideswolf
The document summarizes the Viola-Jones object detection framework. It uses a cascade of classifiers with increasingly more complex features trained with AdaBoost to rapidly detect objects. Integral images allow for very fast feature evaluations. The framework was applied to face detection, achieving very fast average detection speeds of 270 microseconds per sub-window while maintaining low false positive rates.
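The integral-image trick behind those very fast feature evaluations can be illustrated in a few lines of NumPy; this is a generic sketch of the technique, not the framework's original implementation.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns, padded with a zero row and
    column so rectangle sums need no boundary checks."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)), mode="constant")

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using 4 lookups: O(1) per rectangle."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
# Any rectangular sum now costs four array accesses, which is what lets
# the cascade evaluate thousands of Haar-like features per sub-window.
```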
KaoNet: Face Recognition and Generation App using Deep LearningVan Huy
KaoNet is a face recognition and generation app using deep learning. It uses convolutional neural networks (CNNs) for face recognition and generative adversarial networks (GANs) for face generation. The app was trained on a dataset of celebrity faces collected from online sources. Initial results for face recognition were poor due to overfitting and limited data. Expanding the dataset improved validation accuracy to 98%. The GAN was also able to generate realistic looking faces after training.
IRJET - Real Time Object Detection using YOLOv3IRJET Journal
The document describes using the YOLO (You Only Look Once) algorithm for real-time object detection. YOLO uses a single neural network to predict bounding boxes and class probabilities for the entire image simultaneously. This allows it to detect multiple objects faster than algorithms that require region proposals or sliding windows. The authors trained a YOLO model to detect bottles, cars, and mobiles using 6000 iterations. On their test dataset, the model achieved a mean average precision of 98.14%, intersection over union of 83.19%, and F1-score of 0.94, demonstrating accurate real-time object detection.
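The intersection-over-union figure quoted above is computed per predicted/ground-truth box pair; a minimal sketch of the metric (with hypothetical box coordinates) looks like this:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (empty if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A predicted box covering half of a ground-truth box:
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # overlap 2, union 6 -> 1/3
```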
This is a preliminary study whose objective is the reconstruction of missing parts or scratches in digital images, an important task used extensively in artwork restoration. Restoration can be done using two approaches: image inpainting and texture synthesis, and many techniques within each can carry out the process optimally and accurately. In this paper, the advantages and disadvantages of most image inpainting algorithms are discussed. Among the different algorithms, the proposed dynamic masking method outperformed the other techniques. This modification yields a rapid and simple reconstruction of small missing and damaged portions of images that is two to three orders of magnitude faster than current methods while producing comparable results.
THE EFFECT OF PHYSICAL BASED FEATURES FOR RECOGNITION OF RECAPTURED IMAGESijcsit
With the development of multimedia technology and digital devices, it is very simple and easy to recapture high-quality images from LCD screens. In authentication, the use of such recaptured images can be very dangerous, so it is important to recognize recaptured images in order to increase authenticity. Although a number of features have been proposed for various state-of-the-art visual recognition tasks, it is still difficult to decide which feature or combination of features has the most significant impact on this task. In this paper, an image recapture detection method based on a set of physical features including texture, HSV colour and blurriness is proposed. The paper also evaluates the performance of different distinctive features in the context of recognizing recaptured images. Several experimental setups were conducted to demonstrate the performance of the proposed method, and in all of them the method proved efficient, with a good recognition rate. Among the combinations of low-level features, the CS-LBP operator, used to extract the texture feature, was the most robust.
The document discusses face recognition using principal components analysis (PCA). It provides three key points:
1. PCA is used to reduce the dimensionality of face image data to 2D or 3D by finding patterns in high-dimensional data and visualizing it. This allows for face recognition by representing each face as a set of weights of significant eigenvectors.
2. A training set is used to form the PCA coordinate system and represent each training face as weights of eigenvectors. A test face is then recognized as the closest training face based on Euclidean distance between their representations in the PCA space.
3. PCA allows for data compression, noise reduction, and classification of faces by projecting high-dimensional image data onto a lower-dimensional subspace.
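The pipeline in points 1-2 — form a PCA coordinate system from training faces, represent each face as eigenvector weights, and recognize a test face by nearest Euclidean distance — can be sketched with NumPy. The tiny random "faces" below are placeholders for real flattened image vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 6 training "faces", each a flattened 64-pixel image.
train = rng.normal(size=(6, 64))
labels = ["ann", "ann", "bob", "bob", "eve", "eve"]

mean_face = train.mean(axis=0)
centered = train - mean_face
# Principal axes ("eigenfaces") via SVD of the centered data; keep k of them.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
k = 3
eigenfaces = vt[:k]
train_weights = centered @ eigenfaces.T   # each face as k weights

def recognize(face):
    """Project a test face into the PCA space and return the label of the
    nearest training face by Euclidean distance."""
    w = (face - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(train_weights - w, axis=1)
    return labels[int(np.argmin(dists))]

print(recognize(train[2]))  # distance zero to its own weights, so "bob"
```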
Detection and recognition of face using neural networkSmriti Tikoo
This document describes research on face detection and recognition using neural networks. It discusses using the Viola-Jones algorithm for face detection and a backpropagation neural network for face recognition. The Viola-Jones algorithm uses haar features, integral images, AdaBoost training, and cascading classifiers for real-time face detection. A backpropagation network with sigmoid activation functions is trained on facial images for recognition. Results show the network can accurately recognize faces after training. The document concludes the approach allows face recognition from an input image and discusses limitations and potential improvements.
This document provides an overview of facial recognition technology and its applications. It discusses how facial recognition systems work by using nodal points on faces to recognize individuals. The objectives are to design software that can accurately detect faces from images without physical interaction and allow for high identification and verification rates. The research methodology involves a workflow for the facial recognition system. Potential applications mentioned include using facial recognition for computer and border security, voting verification, and commercial uses like residential security and banking.
Face recognition: A Comparison of Appearance Based Approachessadique_ghitm
Face recognition approaches can be divided into three main categories: direct correlation, eigenfaces, and fisherfaces. Direct correlation directly compares pixel intensity values between images. Eigenfaces uses principal component analysis to project faces into a face space defined by eigenvectors. Fisherfaces aims to maximize between-class variations while minimizing within-class variations to better account for differences in lighting and expressions. Pre-processing techniques like color normalization, histogram equalization, and edge detection can improve the accuracy of face recognition systems by reducing the effects of lighting variations. Testing various pre-processing techniques on different approaches found that the fisherfaces method combined with SLBC preprocessing achieved the lowest error rate of 17.8%, followed closely by direct correlation with intensity normalization at 18.
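Histogram equalization, mentioned above as a pre-processing step against lighting variation, is simple to sketch for an 8-bit grayscale image; this is a generic implementation of the standard technique, not the paper's code.

```python
import numpy as np

def equalize(img):
    """Spread an 8-bit image's intensities over the full 0-255 range by
    mapping each pixel through the normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # first nonzero CDF value
    # Classic equalization formula, clipped and rounded to gray levels.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast image squeezed into [100, 120] stretches to use [0, 255].
img = np.linspace(100, 120, 64, dtype=np.uint8).reshape(8, 8)
out = equalize(img)
```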
A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. It is typically employed to authenticate users through ID verification services, and works by pinpointing and measuring facial features from a given image.[1]
Development began on similar systems in the 1960s, beginning as a form of computer application. Since their inception, facial recognition systems have seen wider uses in recent times on smartphones and in other forms of technology, such as robotics. Because computerized facial recognition involves the measurement of a human's physiological characteristics, facial recognition systems are categorized as biometrics. Although the accuracy of facial recognition systems as a biometric technology is lower than iris recognition and fingerprint recognition, it is widely adopted due to its contactless process.[2] Facial recognition systems have been deployed in advanced human–computer interaction, video surveillance and automatic indexing of images.[3]
Facial recognition systems are employed throughout the world today by governments and private companies.[4] Their effectiveness varies, and some systems have previously been scrapped because of their ineffectiveness. The use of facial recognition systems has also raised controversy, with claims that the systems violate citizens' privacy, commonly make incorrect identifications, encourage gender norms and racial profiling, and do not protect important biometric data. The appearance of synthetic media such as deepfakes has also raised concerns about its security.[5] These claims have led to the ban of facial recognition systems in several cities in the United States.[6] As a result of growing societal concerns, Meta announced[7] that it plans to shut down the Facebook facial recognition system, deleting the face scan data of more than one billion users.[8] This change will represent one of the largest shifts in facial recognition usage in the technology's history.
This document outlines a project that uses face recognition from face motion manifolds. It proposes an information-theoretic approach using Resistor-Average Distance (RAD) as a dissimilarity measure between distributions of face images. A kernel-based algorithm is introduced that allows modeling of complex, nonlinear manifolds while retaining the closed-form RAD expression between normal distributions. Recognition rates of 90-100% can be achieved on databases of 10-100 people by modeling errors in face registration. The algorithm uses kernel PCA to nonlinearly map data and computes RAD on the mapped data as the dissimilarity measure between face image distributions.
This document presents a method for face detection and gender recognition using data science. It introduces the importance of estimating age and gender from facial images for applications like access control and surveillance. The method uses mean absolute error and cumulative score to measure age estimation accuracy. It then describes the steps of the method which include segmenting the input image pixels, removing non-face objects, separating connected components, identifying components as faces or non-faces, refining face positions, finding faces through template matching, and detecting gender based on mean intensity and template matching. Test results on sample images are presented showing high accuracy of face detection and low false positive rates.
Face and Eye Detection Varying Scenarios With Haar Classifier_2015Showrav Mazumder
The document presents a face and eye detection system. It discusses challenges in face detection like image quality, pose variation, and facial expressions. It describes the history of face detection and various methods like knowledge-based, feature-invariant, template matching, and appearance-based. The methodology section explains the Viola-Jones algorithm using Haar-like features, integral images, AdaBoost, and cascade classifiers. The implementation uses OpenCV for detection. Experiments showed high detection rates for single faces but lower rates for group faces and detecting eyes with pose variations. Future work involves improving classifiers and detecting side faces in real-time.
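The cascade idea in the Viola-Jones methodology described above — cheap early stages reject most sub-windows, and only survivors reach the expensive later stages — can be sketched generically. The stage functions and thresholds below are made up for illustration; a real detector would use boosted Haar-feature classifiers.

```python
def cascade_detect(stage_fns, thresholds, window):
    """Run a window through classifier stages; reject on the first failure.
    Each stage function returns a confidence score for the window."""
    for stage, threshold in zip(stage_fns, thresholds):
        if stage(window) < threshold:
            return False          # rejected early; later stages never run
    return True                   # survived every stage: report a face

# Hypothetical stages: progressively stricter checks on a "window"
# represented here by a single score value.
stages = [lambda w: w, lambda w: w * 0.9, lambda w: w * 0.8]
thresholds = [0.2, 0.3, 0.4]
print(cascade_detect(stages, thresholds, 0.9))   # passes all stages -> True
print(cascade_detect(stages, thresholds, 0.1))   # fails stage 1 -> False
```

The early-exit structure is what keeps average per-window cost low: most background windows are discarded after one or two cheap tests.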
This document describes a system for detecting brain tumors in MRI images using image segmentation. It begins with an introduction and abstract. It then describes the existing manual tumor detection system and its limitations. The proposed system applies preprocessing like noise removal, image segmentation to detect tumor edges, feature extraction, and classification to automatically detect tumors. The system is implemented in MATLAB and aims to help doctors detect tumors faster and earlier.
This document describes a system for detecting brain tumors in MRI images using image segmentation. It discusses how existing manual detection of tumors is difficult due to noise and requires many days. The proposed system applies preprocessing like filtering and grayscale conversion. It then uses image segmentation techniques to detect tumor edges and boundaries. Features are extracted and classification is used to differentiate between normal and tumor images, helping doctors detect tumors earlier. The system is implemented in MATLAB and aims to overcome difficulties in early tumor detection.
This document describes a system for detecting brain tumors in MRI images using image segmentation. It begins with an introduction and abstract. It then describes the existing manual tumor detection system and its limitations. The proposed system applies preprocessing like noise removal, image segmentation to detect tumor edges, feature extraction, and classification to automatically detect tumors. The system is implemented in MATLAB and segmented images are used for accurate diagnosis. It concludes the segmentation method can help physicians with improved diagnosis and treatment.
Cross Pose Facial Recognition Method for Tracking any Person's Location an Ap...ijtsrd
In today's world, there are a number of existing methods for facial recognition. These methods are based on frontal-view face data; few methods address non-frontal-view face recognition. Most face recognition algorithms use a "feature space" approach, in which different feature vectors are extracted from the face and their distances are compared to determine matches. In this paper, it is proposed how any person can be located in a campus or a city using a cross-pose face recognition method. The paper focuses on three parts: 1) generation of multi-view images, 2) comparison of images, and 3) showing the actual location of a person. Sanjay D. Sawaitul "Cross Pose Facial Recognition Method for Tracking any Person's Location an Approach" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-1, December 2017, URL: http://www.ijtsrd.com/papers/ijtsrd7186.pdf http://www.ijtsrd.com/computer-science/data-processing/7186/cross-pose-facial-recognition-method-for--tracking-any-persons-location-an-approach/sanjay-d-sawaitul
This document discusses face detection techniques. It begins with an introduction that defines face detection and discusses why it is important and challenging. It then covers topics like image segmentation, face detection approaches, morphological image processing, and skin color-based face detection. The document analyzes literature on face detection methods and provides descriptions of techniques like thresholding, edge detection, region-based segmentation, and template matching. It also includes a case study on specific face detection software applications and concludes by summarizing the discussed techniques.
This document describes a face detection method using principal component analysis. It first preprocesses images using histogram equalization to address illumination issues. It then detects faces using skin segmentation to identify skin regions. Finally, it recognizes the extracted facial features using principal component analysis and a neural network, which reduces the dimensionality of the images for efficient recognition.
Humans often use faces to recognize individuals, and advancements in computing capability over the past few decades now enable similar recognitions automatically. Early facial recognition algorithms used simple geometric models, but the recognition process has now matured into a science of sophisticated mathematical representations and matching processes. Major advancements and initiatives in the past 10 to 15 years have propelled facial recognition technology into the spotlight. Facial recognition can be used for both verification and identification.
Facial recognition is a type of biometric system that identifies individuals by analyzing patterns in images of their faces. The presentation summarizes how facial recognition systems work by detecting faces, normalizing them, extracting distinguishing features to create a template, and then matching templates to identify individuals. It notes advantages like convenience but also challenges like difficulty with changes in appearance over time. Applications discussed include security, banking, and voter verification.
IRJET- Survey on Face Detection MethodsIRJET Journal
The document reviews 15 papers on various face detection methods published between 2013 and 2018. It finds that the most popular feature extraction method is skin color segmentation, which achieves detection rates of 88-98%. The Viola-Jones method typically detects face regions as well as other body parts at a rate of 80-90%. Common face detection methods reviewed include skin color segmentation, Viola-Jones, Haar features, 3D mean shift, and Cascaded Head and Shoulder Detection. OpenCV, Python or MATLAB are typically used to implement real-time face detection systems.
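Skin-color segmentation, the most popular feature-extraction method in the reviewed papers, is often implemented as a simple per-pixel color rule. The sketch below uses one well-known explicit RGB rule set (a common heuristic for uniform daylight, not the exact rule from any of the surveyed papers):

```python
import numpy as np

def skin_mask(rgb):
    """Boolean mask of likely skin pixels for an (H, W, 3) uint8 RGB image,
    using a widely cited explicit RGB rule (uniform-daylight variant)."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (
        (r > 95) & (g > 40) & (b > 20)
        & (r - np.minimum(g, b) > 15)   # enough spread with red dominant
        & (np.abs(r - g) > 15)
        & (r > g) & (r > b)
    )

# A 1x2 image: one skin-like pixel, one blue pixel.
img = np.array([[[200, 120, 90], [30, 40, 200]]], dtype=np.uint8)
mask = skin_mask(img)
```

Rules like this are fast and explain the 88-98% detection rates reported, but they are sensitive to illumination, which is why many systems combine them with a learned detector such as Viola-Jones.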
Face Recognition Based Attendance System with Auto Alert to Guardian using Ca...ijtsrd
This document presents a face recognition based attendance system that automatically marks student attendance using image processing techniques. It uses the Viola-Jones face detection algorithm to detect faces in images and then performs face recognition using algorithms like PCA to identify students and mark their attendance in a database. It also provides alerts to guardians if a student is marked absent by sending SMS or making phone calls. The system aims to automate the manual attendance marking process which is time-consuming and error-prone. It discusses the architecture of the system and the face detection and recognition algorithms used in detail. The paper concludes that the automatic attendance system replaces the manual process and is faster, more efficient and saves time and costs.
This document provides an overview of digital image processing (DIP) and discusses various topics related to it. It begins with welcoming remarks and introductions. It then discusses key areas of application for image processing like optical character recognition, security, compression, and medical imaging. Some main techniques covered include image acquisition, pre-processing, enhancement, segmentation, feature extraction, classification, and understanding. Application areas like remote sensing, astronomy, security, and OCR are also summarized. The document provides examples and illustrations of different image processing concepts.
Similar to The Impact of Segmentation on the Accuracy and Sensitivity of a Melanoma Classifier Based on Skin Lesion Images (20)
Visual Information Retrieval: Advances, Challenges and OpportunitiesOge Marques
Visual Information Retrieval: Advances, Challenges and Opportunities discusses advances and challenges in visual information retrieval. Key points include:
- Visual information retrieval aims to find relevant images/videos based on visual and text queries, addressing the "semantic gap" between low-level features and high-level meanings.
- Advances include improved text-based, content-based, and mixed search methods, as well as applications in medical image retrieval and mobile visual search.
- Ongoing challenges include capturing image similarity, addressing various representation gaps, understanding user intentions, and developing broad domain solutions.
Image Processing and Computer Vision in iOSOge Marques
- Image processing and computer vision applications are becoming more common on mobile devices like the iPhone and iPad. There are many opportunities to build successful apps that can improve how users work with images and videos.
- The talk provided an overview of developing image and computer vision apps for iOS, including recommended tools like Core Image and OpenCV. It also offered advice on focusing an app idea on solving a specific problem and being aware of competition and market timing.
- Mobile image processing and computer vision have a promising future, and there is a need for good solutions to specific problems in this area that developers can work on building.
Using games to improve computer vision solutionsOge Marques
Dr. Oge Marques discusses using games to improve computer vision solutions. Specifically, Dr. Marques describes a two-player web-based guessing game called Ask'nSeek that helps solve the computer vision problems of object detection, labeling, and semantic scene segmentation. Ask'nSeek logs spatial relationships and labels from a small number of games per image to train machine learning models for these tasks.
Image retrieval: challenges and opportunitiesOge Marques
Google Goggles is a mobile visual search system that allows users to search for information by taking photos with their smartphone cameras. It uses computer vision techniques like interest point detection, feature extraction and indexing to match query images to images in large online databases. This kind of visual search is relevant because of the rise of powerful mobile devices with cameras and popular image-sharing apps. It enables new commercial opportunities for visual search and discovery on mobile phones.
Advances and Challenges in Visual Information Search and Retrieval (WVC 2012 ...Oge Marques
Part I – Concepts, challenges, and state of the art
Part II – Medical image retrieval
Part III – Mobile visual search
Part IV – Where is image search headed?
Mobile Visual Search (MVS) is a fascinating research field that has the potential to impact how visual data is organized, annotated, and retrieved using mobile devices. The document outlines opportunities in MVS, basic concepts, and technical aspects of MVS systems. It discusses the MVS pipeline including descriptor extraction, interest point detection, feature descriptor computation, feature indexing/matching, and geometric verification. Challenges of MVS like low latency, robust recognition, and handling broad/narrow domains are also covered. The Compressed Histogram of Gradients (CHoG) descriptor is presented as an example of a compact descriptor designed for MVS.
This document discusses advances in image search and retrieval. It begins with an overview of visual information retrieval and its challenges, including the semantic gap between low-level visual features and high-level semantics. It then covers recent techniques like Google image search and similarity search. The document outlines core concepts like capturing similarity, large datasets, and user needs. It also revisits a 2000 paper on the challenges still facing the field, including the unsolved semantic gap and need for standardized evaluation benchmarks.
Image Processing and Computer Vision in iPhone and iPadOge Marques
This document provides an overview of image processing and computer vision applications for the iPhone and iPad. It discusses the growing market for mobile apps in this field and the technical capabilities of iPhone devices. The document outlines a mini-course on developing iPhone and iPad apps for image processing and computer vision. It covers fundamentals of iOS development like Xcode, Objective-C, classes and objects, and the model-view-controller design pattern. It also discusses OpenCV and examples of commercial apps and student projects.
Recent advances in visual information retrieval marques klu june 2010Oge Marques
The document summarizes key points from a 2010 presentation on visual information retrieval (VIR). It revisits conclusions from a 2000 paper on challenges facing content-based image retrieval (CBIR). While some predictions were accurate, like increased data sizes and interaction options, others were not, like solving image understanding. Significant progress was made on benchmarks and datasets but less on similarity metrics. Medical image retrieval poses new challenges to understand but offers opportunities if VIR methods can adapt to new domains.
Oge Marques (FAU) - invited talk at WISMA 2010 (Barcelona, May 2010)Oge Marques
- Image search and retrieval remains a challenging problem with many open issues even 10 years after it was deemed to be past its early years.
- While progress has been made in areas like datasets, benchmarks and interfaces, core problems around similarity, semantics, and bridging the semantic gap between low-level visual features and high-level concepts remain largely unsolved.
- Narrowing domains and combining content-based techniques with metadata and user involvement through tagging and feedback may provide more successful solutions going forward.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
End-to-end pipeline agility - Berlin Buzzwords 2024Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
1. **Introduction to Jio Cinema**:
- Brief overview of Jio Cinema as a streaming platform.
- Its significance in the Indian market.
- Introduction to retention and engagement strategies in the streaming industry.
2. **Understanding Retention and Engagement**:
- Define retention and engagement in the context of streaming platforms.
- Importance of retaining users in a competitive market.
- Key metrics used to measure retention and engagement.
3. **Jio Cinema's Content Strategy**:
- Analysis of the content library offered by Jio Cinema.
- Focus on exclusive content, originals, and partnerships.
- Catering to diverse audience preferences (regional, genre-specific, etc.).
- User-generated content and interactive features.
4. **Personalization and Recommendation Algorithms**:
- How Jio Cinema leverages user data for personalized recommendations.
- Algorithmic strategies for suggesting content based on user preferences, viewing history, and behavior.
- Dynamic content curation to keep users engaged.
5. **User Experience and Interface Design**:
- Evaluation of Jio Cinema's user interface (UI) and user experience (UX).
- Accessibility features and device compatibility.
- Seamless navigation and search functionality.
- Integration with other Jio services.
6. **Community Building and Social Features**:
- Strategies for fostering a sense of community among users.
- User reviews, ratings, and comments.
- Social sharing and engagement features.
- Interactive events and campaigns.
7. **Retention through Loyalty Programs and Incentives**:
- Overview of loyalty programs and rewards offered by Jio Cinema.
- Subscription plans and benefits.
- Promotional offers, discounts, and partnerships.
- Gamification elements to encourage continued usage.
8. **Customer Support and Feedback Mechanisms**:
- Analysis of Jio Cinema's customer support infrastructure.
- Channels for user feedback and suggestions.
- Handling of user complaints and queries.
- Continuous improvement based on user feedback.
9. **Multichannel Engagement Strategies**:
- Utilization of multiple channels for user engagement (email, push notifications, SMS, etc.).
- Targeted marketing campaigns and promotions.
- Cross-promotion with other Jio services and partnerships.
- Integration with social media platforms.
10. **Data Analytics and Iterative Improvement**:
- Role of data analytics in understanding user behavior and preferences.
- A/B testing and experimentation to optimize engagement strategies.
- Iterative improvement based on data-driven insights.
A presentation that explains Power BI licensing.
The Impact of Segmentation on the Accuracy and Sensitivity of a Melanoma Classifier Based on Skin Lesion Images
1. #SIIM17
The Impact of Segmentation on the
Accuracy and Sensitivity of a
Melanoma Classifier
Based on Skin Lesion Images
Oge Marques, PhD
Professor
College of Engineering and Computer Science
Florida Atlantic University
2. #SIIM17
Our Team
Adrià Romero López Xavier Giró-i-Nieto
Image Processing Group
Signal Theory and
Communications Department
MIDDLE Research Group
Oge Marques Borko Furht Jack Burdick Janet Weinthal
NSF Award No. 1464537, I/UCRC Phase II under NSF 13-542
3. Outline
• Motivation
• Context
• Scope and goals
• Challenges
• State of the art
• Our work
• Hypothesis
• Methodology
• Experimental results
• Ongoing and future work
• Concluding remarks
5. Context
• This is not a typical SIIM presentation
• No specialized imaging equipment
• No PACS
• No DICOM
• No metadata
• No workflows
• Instead...
• Regular photographs
• Unstructured (and purely visual) data
• Minimal ground truth
6. Scope
• Skin Disease: An Illustrated Taxonomy
• Our focus: skin lesion analysis for (early) melanoma detection
[Source: Esteva et al., Nature (2017)]
7. Scope and Goals
• Scope:
  • Help physicians to detect melanoma (a 2-class classifier)
• Goals:
  • Design an intelligent medical imaging skin lesion diagnosis system using deep learning techniques
  • Achieve (or improve upon) state-of-the-art results for skin lesion classification
16. Transfer Learning
1. Train on ImageNet
2. Small dataset: use the network as a feature extractor (freeze the early layers, train only the top)
3. Medium dataset: fine-tuning (more data = retrain more of the network, or all of it)
Slide credit: Bay Area Deep Learning School presentation by A. Karpathy
Medical imaging case
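The two transfer-learning regimes on the slide can be sketched in Keras. This is an illustration, not the authors' exact configuration: the dense head, layer sizes, and learning rates are assumptions, and `weights="imagenet"` would load the pretrained filters in practice (`weights=None` here keeps the sketch self-contained):

```python
import tensorflow as tf

# VGG-16 convolutional base (weights=None avoids the ImageNet download;
# real fine-tuning would use weights="imagenet").
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   input_shape=(224, 224, 3))

# Regime 2 (small dataset): freeze the base, use it as a fixed feature extractor.
base.trainable = False

# New 2-class head (benign vs. malignant), trained from scratch.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Regime 3 (medium dataset): unfreeze the last conv block and fine-tune it
# with a low learning rate; more data would justify retraining more layers.
base.trainable = True
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
```

Recompiling after changing `trainable` flags is required for the freeze/unfreeze to take effect in training.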
19. Our Hypothesis
• Image segmentation improves the performance of skin lesion classifiers using convolutional neural networks.
[Source: International Skin Imaging Collaboration Archive]
Example inputs: not segmented, perfectly segmented, partially segmented.
20. Methods
• ISBI 2016 Challenge dataset: Skin Lesion Analysis towards Melanoma Detection
• 1279 RGB images, labeled as either benign or malignant

                  Benign   Malignant   Total images
Training subset     727       173          900
Testing subset      304        75          379
21. Methods
• Dataset balancing through downsampling.
• Dataset split: 70-30% training/testing.
• Input images:
  • Unsegmented images: straight from the dataset.
  • Perfectly segmented images: bitwise AND of each unaltered image with its corresponding binary mask, provided by the ISIC dataset.
  • Partially segmented images: original binary masks morphologically dilated with a disk-shaped structuring element (50-pixel radius).
• Additional preprocessing (resizing and normalization) was also performed to match the input size expected by the VGG16 architecture.
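The three input variants can be sketched in pure NumPy. The slide does not name the tooling used, so the function names here are illustrative; the structuring-element and mask logic follow the description above (a real pipeline would likely use OpenCV or scipy.ndimage):

```python
import numpy as np

def disk_offsets(radius):
    # (dy, dx) offsets inside a disk-shaped structuring element
    r = int(radius)
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    keep = ys ** 2 + xs ** 2 <= r ** 2
    return list(zip(ys[keep], xs[keep]))

def dilate(mask, radius):
    # morphological dilation of a boolean mask by a disk (shift-and-OR)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy, dx in disk_offsets(radius):
        out[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)] |= \
            mask[max(0, -dy):h + min(0, -dy), max(0, -dx):w + min(0, -dx)]
    return out

def apply_mask(image, mask):
    # "bitwise AND" of an RGB image with a binary mask: zero out the background
    return image * mask[..., None].astype(image.dtype)

# toy 5x5 "image" with a single-pixel lesion mask (radius 1 stands in for
# the 50-pixel radius used on full-size dermoscopic images)
image = np.full((5, 5, 3), 100, dtype=np.uint8)
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True

perfect = apply_mask(image, mask)              # perfectly segmented
partial = apply_mask(image, dilate(mask, 1))   # partially segmented (dilated mask)
```

Dilating the mask before applying it keeps a band of healthy skin around the lesion, which is exactly the "partial segmentation" condition tested in the next slide.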
23. Further Investigation
• What if we vary the degree of border expansion?

                        Sensitivity   Accuracy    AUC
Perfect segmentation       45.3%       58.7%     62.2%
+25 px                     53.3%       61.3%     64.2%
+50 px                     56.0%       60.7%     62.6%
+75 px                     57.3%       59.3%     60.8%
+100 px                    34.7%       55.3%     57.9%
Unsegmented                24.0%       51.3%     53.2%
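The two headline metrics in the table can be made concrete. With malignant as the positive class, sensitivity is TP / (TP + FN), the fraction of malignant lesions the classifier catches, while accuracy is the fraction of all lesions classified correctly. A minimal sketch with toy labels (not data from the study):

```python
import numpy as np

def sensitivity_accuracy(y_true, y_pred):
    # 1 = malignant (positive class), 0 = benign
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))   # malignant, correctly flagged
    fn = np.sum((y_true == 1) & (y_pred == 0))   # malignant, missed
    return tp / (tp + fn), np.mean(y_true == y_pred)

# toy example: 3 malignant and 3 benign lesions, an imperfect classifier
sens, acc = sensitivity_accuracy([1, 1, 1, 0, 0, 0], [1, 0, 0, 0, 1, 0])
```

The distinction matters clinically: a missed melanoma (false negative) is far costlier than a false alarm, which is why the table above emphasizes sensitivity rather than accuracy alone.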
24. Ongoing and Future Work
• Additional / larger / more challenging datasets
• Other CNN architectures
• Better image preprocessing
• Partnerships and collaborations
• Mobile app
25. Concluding remarks
• Challenges
• Difficulty in acquiring datasets and reproducing / benchmarking results
• The “black box” aspect of DL-based solutions
• Hard to tell positives from negatives
• Learning curve: TensorFlow, Keras, HPC, DL concepts and best practices, etc.
• Opportunities
• Many variations of the basic classification problem
• Mobile app market
• Tech-minded dermatology practices