SlideShare a Scribd company logo
Kevin McGuinness
kevin.mcguinness@dcu.ie
Research Fellow
Insight Centre for Data Analytics
Dublin City University
DEEP
LEARNING
WORKSHOP
Dublin City University
28-29 April 2017
Learning where to look: focus
and attention in deep vision
Overview
Visual attention models and their applications
Deep vision for medical image analysis
Deep crowd analysis
Interactive deep vision: image segmentation
Visual Attention Models and their
Applications
The importance of visual attention
The importance of visual attention
The importance of visual attention
The importance of visual attention
Why don’t we see the changes?
We don’t really see the whole image
We only focus on small specific regions: the salient parts
Human beings reliably attend to the same regions of images
when shown
What we perceive
Where we look
What we actually see
Can we predict where humans will look?
Yes! Computational models of visual saliency
Why might this be useful?
SalNet: deep visual saliency model
Predict map of visual attention from image pixels
(find the parts of the image that stand out)
● Feedforward 8 layer “fully convolutional”
architecture
● Transfer learning in bottom 3 layers from
pretrained VGG-M model on ImageNet
● Trained on SALICON dataset (simulated
crowdsourced attention dataset using
mouse and artificial foveation)
● Top-5 in MIT 300 saliency benchmark
http://saliency.mit.edu/results_mit300.html
Predicted Ground truth
Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845
ImageGroundtruthPrediction
ImageGroundtruthPrediction
SalGAN
Adversarial loss
Data loss
Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual
Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017.
18
SalNet and SalGAN benchmarks
Applications of visual attention
Intelligent image cropping
Image retrieval
Improved image classification
Intelligent image cropping
Image retrieval: query by example
Given:
● An example query image
that illustrates the user's
information need
● A very large dataset of
images
Task:
● Rank all images in the
dataset according to how
likely they are to fulfil the
user's information need
23
Retrieval benchmarks
Oxford Buildings
2007
Paris Buildings
2008
TRECVID INS
2014
Bags of convolutional features instance search
Objective: rank images according to relevance to
query image
Local CNN features and BoW
● Pretrained VGG-16 network
● Features from conv-5
● L2-norm, PCA, L2-norm
● K-means clustering -> BoW
● Cosine similarity
● Query augmentation, spatial reranking
Scalable, fast, high-performance on Oxford 5K,
Paris 6K and TRECVid INS
BoW Descriptor
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search, ICMR 2016 http://arxiv.org/abs/1604.04653
Bags of convolutional features instance search
BoW Descriptor
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search, ICMR 2016 http://arxiv.org/abs/1604.04653
Using saliency to improve retrieval
CNN
CNN
Saliency
Semantic
features
Importance
weighting
Weighted
features
Pooling (e.g.
BoW)
Image descriptors
Saliency weighted retrieval
Oxford Paris INSTRE
Global Local Global Local Global Local
No weighting 0.614 0.680 0.621 0.720 0.304 0.472
Center prior 0.656 0.702 0.691 0.758 0.407 0.546
Saliency 0.680 0.717 0.716 0.770 0.514 0.617
QE saliency - 0.784 - 0.834 0.719
Mean Average Precision
12.4%
Using saliency to improve image classification
Conv 1
Conv 3
Conv 4
Conv 5
FC 1
FC 1
FC 3 - Output
Drop Out
Drop Out
Batch Norm.
Max-Pooling
Max-Pooling
RGB
Saliency
Conv 1
Batch Norm.
Max-Pooling
Figure credit: Eric Arazo
Why does it improve classification accuracy?
Acoustic guitar +25 %
Volleyball +23 %
Deep Vision for Medical Image
Analysis
Task: predict KL grade from X-Ray images
Pipeline: locate and classify
Detection performance
FCN detection performance
Template matching: (J > 0.5) 8.3%
SVM on handcrafted features: (J > 0.5): 38.6%
Multi-objective learning helps!
Same network used to regress on KL grade and predict a discrete KL grade
Jointly train on both objectives
Comparison with the state of the art
How far are we from human-level accuracy?
Most errors are between grade 0 and 1 and grade 1 and 2.
Human experts have a hard time with these grades too.
Agreement among humans on OAI
● weighted kappa of 0.70 [0.65-0.76]
Human machine agreement
● weighted kappa of 0.67 [0.65-0.68]
Predictions agree with the “gold standard” about as well as
the “gold standard” agrees with itself.
Confusion matrix
Neonatal brain image segmentation
Volumetric semantic segmentation: label each pixel with class of brain matter.
Applications:
● Prerequisite for volumetric analysis
● Early identification of risk factors for impaired brain development
Challenge:
● Neonatal brains very different
● Sparse training data! Neobrains challenge has 2 training examples
Cerebellum Unmyelinated
white matter
Cortical grey
matter
Ventricles Cerebrospinal
fluid
Brainstem
The task
Model
● 8 layer FCN
● 64/96 convolution filters per layer
● Atrous (dilated) convolution to increase receptive field
without sacrificing prediction resolution
● 9D per pixel softmax over classes
● Binary cross entropy loss
● L2
regularization
● Aggressive data augmentation: scale, crop, rotate, flip,
gamma
● Train on 2 axial volumes (~50 slides per volume) for 500
epochs using Adam optimizer
Animation from: http://nicolovaligi.com/deep-learning-models-semantic-segmentation.html
Sample results
Cerebellum Unmyelinated
white matter
Cortical grey
matter
Ventricles Cerebrospinal
fluid
Brainstem
Tissue Ours LRDE_LTCI
UPF_SIMBioSy
s
Cerebellum 0.92 0.94 0.94
Myelinated white matter 0.51 0.06 0.54
Basal ganglia and thalami 0.91 0.91 0.93
Ventricles 0.89 0.87 0.83
Unmyelinated white matter 0.93 0.93 0.91
Brainstem 0.82 0.85 0.85
Cortical grey matter 0.88 0.87 0.85
Cerebrospinal fluid 0.83 0.83 0.79
UWM+MWM 0.93 0.93 0.90
CSF+Ven 0.84 0.84 0.79
0.85 0.80 0.83
Neobrains challenge
New state of the art on
Neobrains infant brain
segmentation challenge for axial
volume segmentation
Deep learning with only 2
training examples!
No ensembling yet. Best
competing approach is a large
ensemble.
Second best is also a deep net.
Deep Vision for Crowd Analysis
Fully convolutional crowd counting
FCN ∑
Crowd
count
estimate
8x conv layers
2x max pooling
Crowd density
estimate
Marsden et al. Fully convolutional crowd counting on highly congested scenes. VISAPP 2017
https://arxiv.org/abs/1612.00220
True count: 1544
Predicted count: 1566
Benchmark results: UCF CC 50 dataset
45-45,000 people per image
State of the art improved by 11% (MSE) and 13% (MSE)
Interactive Deep Vision
Interactive image segmentation
Image Encoder
FCN
Interaction Encoder
FCN
Segmentation Decoder
FCN
Channel Concatenation
Ground truth (MS
COCO)
Weak Labels
Cross Entropy
Loss
Auto-generated
User Outline
Gaussian Process
Input Image
Crop from MS
COCO
Dilated
Convolutions
Pixel Predictions
DeepClip: training
Image Encoder
FCN
Interaction Encoder
FCN
Segmentation Decoder
FCN
Channel Concatenation
User
interactions
Input Image
Dilated
Convolutions
Pixel Predictions
DeepClip: prediction
Binary
Segmentation
Vector tracer
Graph Cuts
Optimizer
Closing remarks
● Deep learning has completely revolutionized computer vision
● Human visual attention is important! Incorporating visual attention models
helps in many tasks
● You don’t need a huge amount of training data to train an effective deep
model
● Simulation techniques are effective for data generation
● Multi-task deep learning is an effective way of providing “more signal” during
training.
Questions?

More Related Content

What's hot

CNN Tutorial
CNN TutorialCNN Tutorial
CNN Tutorial
Sungjoon Choi
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep Learning
Brodmann17
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Abhishek Bhandwaldar
 
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
datasciencekorea
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
Yogendra Tamang
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
S N
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
Yan Xu
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Seonho Park
 
Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용
홍배 김
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural Network
Junho Cho
 
Deep Learning Tutorial
Deep Learning Tutorial Deep Learning Tutorial
Deep Learning Tutorial
Ligeng Zhu
 
Tutorial on Deep Learning
Tutorial on Deep LearningTutorial on Deep Learning
Tutorial on Deep Learning
inside-BigData.com
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
Poo Kuan Hoong
 
Geek Night 17.0 - Artificial Intelligence and Machine Learning
Geek Night 17.0 - Artificial Intelligence and Machine LearningGeek Night 17.0 - Artificial Intelligence and Machine Learning
Geek Night 17.0 - Artificial Intelligence and Machine Learning
GeekNightHyderabad
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural Networks
Hannes Hapke
 
Deep learning in Computer Vision
Deep learning in Computer VisionDeep learning in Computer Vision
Deep learning in Computer Vision
David Dao
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
Sungjoon Choi
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
Mustafa Aldemir
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
Yogendra Tamang
 

What's hot (20)

CNN Tutorial
CNN TutorialCNN Tutorial
CNN Tutorial
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep Learning
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural Network
 
Deep Learning Tutorial
Deep Learning Tutorial Deep Learning Tutorial
Deep Learning Tutorial
 
Tutorial on Deep Learning
Tutorial on Deep LearningTutorial on Deep Learning
Tutorial on Deep Learning
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
Geek Night 17.0 - Artificial Intelligence and Machine Learning
Geek Night 17.0 - Artificial Intelligence and Machine LearningGeek Night 17.0 - Artificial Intelligence and Machine Learning
Geek Night 17.0 - Artificial Intelligence and Machine Learning
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural Networks
 
Deep learning in Computer Vision
Deep learning in Computer VisionDeep learning in Computer Vision
Deep learning in Computer Vision
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
 

Similar to Learning where to look: focus and attention in deep vision

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
Hiroshi Fukui
 
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdfresearchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
AvijitChaudhuri3
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdf
jagan477830
 
Brain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptxBrain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptx
vikyt2211
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generation
ijtsrd
 
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
AFEng1
 
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Universitat Politècnica de Catalunya
 
Image Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine LearningImage Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine Learning
ijtsrd
 
Project Presentation.pptx
Project Presentation.pptxProject Presentation.pptx
Project Presentation.pptx
BME62ThejeswarSeggam
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
Anil Kumar Gupta
 
Scene recognition using Convolutional Neural Network
Scene recognition using Convolutional Neural NetworkScene recognition using Convolutional Neural Network
Scene recognition using Convolutional Neural Network
DhirajGidde
 
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
IRJET- A Survey on Medical Image Interpretation for Predicting PneumoniaIRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
IRJET Journal
 
Overview of computer vision and machine learning
Overview of computer vision and machine learningOverview of computer vision and machine learning
Overview of computer vision and machine learning
smckeever
 
Skin_Cancer.pptx
Skin_Cancer.pptxSkin_Cancer.pptx
Skin_Cancer.pptx
Mahabuba Nasrin Eity
 
Top Cited Articles in Signal & Image Processing 2021-2022
Top Cited Articles in Signal & Image Processing 2021-2022Top Cited Articles in Signal & Image Processing 2021-2022
Top Cited Articles in Signal & Image Processing 2021-2022
sipij
 
Recent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectivesRecent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectives
Namkug Kim
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
IRJET Journal
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
AIRCC Publishing Corporation
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
AIRCC Publishing Corporation
 
Glaucoma Detection using Deep Learning (1).pptx
Glaucoma Detection using Deep Learning (1).pptxGlaucoma Detection using Deep Learning (1).pptx
Glaucoma Detection using Deep Learning (1).pptx
noyarav597
 

Similar to Learning where to look: focus and attention in deep vision (20)

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdfresearchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
researchpaper_2023_Skin_Csdbjsjvnvsdnfvancer.pdf
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdf
 
Brain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptxBrain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptx
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generation
 
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
Improved Brain Segmentation using Pixel Separation and Additional Segmentatio...
 
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
 
Image Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine LearningImage Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine Learning
 
Project Presentation.pptx
Project Presentation.pptxProject Presentation.pptx
Project Presentation.pptx
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Scene recognition using Convolutional Neural Network
Scene recognition using Convolutional Neural NetworkScene recognition using Convolutional Neural Network
Scene recognition using Convolutional Neural Network
 
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
IRJET- A Survey on Medical Image Interpretation for Predicting PneumoniaIRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia
 
Overview of computer vision and machine learning
Overview of computer vision and machine learningOverview of computer vision and machine learning
Overview of computer vision and machine learning
 
Skin_Cancer.pptx
Skin_Cancer.pptxSkin_Cancer.pptx
Skin_Cancer.pptx
 
Top Cited Articles in Signal & Image Processing 2021-2022
Top Cited Articles in Signal & Image Processing 2021-2022Top Cited Articles in Signal & Image Processing 2021-2022
Top Cited Articles in Signal & Image Processing 2021-2022
 
Recent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectivesRecent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectives
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
 
Glaucoma Detection using Deep Learning (1).pptx
Glaucoma Detection using Deep Learning (1).pptxGlaucoma Detection using Deep Learning (1).pptx
Glaucoma Detection using Deep Learning (1).pptx
 

More from Universitat Politècnica de Catalunya

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Universitat Politècnica de Catalunya
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
Universitat Politècnica de Catalunya
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Universitat Politècnica de Catalunya
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
Universitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Universitat Politècnica de Catalunya
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Universitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
Universitat Politècnica de Catalunya
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
Universitat Politècnica de Catalunya
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
 

Recently uploaded

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 

Recently uploaded (20)

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 

Learning where to look: focus and attention in deep vision

  • 1. Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University DEEP LEARNING WORKSHOP Dublin City University 28-29 April 2017 Learning where to look: focus and attention in deep vision
  • 2. Overview Visual attention models and their applications Deep vision for medical image analysis Deep crowd analysis Interactive deep vision: image segmentation
  • 3. Visual Attention Models and their Applications
  • 4. The importance of visual attention
  • 5.
  • 6. The importance of visual attention
  • 7. The importance of visual attention
  • 8.
  • 9. The importance of visual attention
  • 10. Why don’t we see the changes? We don’t really see the whole image We only focus on small specific regions: the salient parts Human beings reliably attend to the same regions of images when shown
  • 14. Can we predict where humans will look? Yes! Computational models of visual saliency Why might this be useful?
  • 15. SalNet: deep visual saliency model Predict map of visual attention from image pixels (find the parts of the image that stand out) ● Feedforward 8 layer “fully convolutional” architecture ● Transfer learning in bottom 3 layers from pretrained VGG-M model on ImageNet ● Trained on SALICON dataset (simulated crowdsourced attention dataset using mouse and artificial foveation) ● Top-5 in MIT 300 saliency benchmark http://saliency.mit.edu/results_mit300.html Predicted Ground truth Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845
  • 18. SalGAN Adversarial loss Data loss Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017. 18
  • 19. SalNet and SalGAN benchmarks
  • 20. Applications of visual attention Intelligent image cropping Image retrieval Improved image classification
  • 22.
  • 23. Image retrieval: query by example Given: ● An example query image that illustrates the user's information need ● A very large dataset of images Task: ● Rank all images in the dataset according to how likely they are to fulfil the user's information need 23
  • 24. Retrieval benchmarks Oxford Buildings 2007 Paris Buildings 2008 TRECVID INS 2014
  • 25. Bags of convolutional features instance search Objective: rank images according to relevance to query image Local CNN features and BoW ● Pretrained VGG-16 network ● Features from conv-5 ● L2-norm, PCA, L2-norm ● K-means clustering -> BoW ● Cosine similarity ● Query augmentation, spatial reranking Scalable, fast, high-performance on Oxford 5K, Paris 6K and TRECVid INS BoW Descriptor Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search, ICMR 2016 http://arxiv.org/abs/1604.04653
  • 26. Bags of convolutional features instance search BoW Descriptor Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search, ICMR 2016 http://arxiv.org/abs/1604.04653
  • 27. Using saliency to improve retrieval CNN CNN Saliency Semantic features Importance weighting Weighted features Pooling (e.g. BoW) Image descriptors
  • 28. Saliency weighted retrieval Oxford Paris INSTRE Global Local Global Local Global Local No weighting 0.614 0.680 0.621 0.720 0.304 0.472 Center prior 0.656 0.702 0.691 0.758 0.407 0.546 Saliency 0.680 0.717 0.716 0.770 0.514 0.617 QE saliency - 0.784 - 0.834 0.719 Mean Average Precision
  • 29. 12.4% Using saliency to improve image classification Conv 1 Conv 3 Conv 4 Conv 5 FC 1 FC 1 FC 3 - Output Drop Out Drop Out Batch Norm. Max-Pooling Max-Pooling RGB Saliency Conv 1 Batch Norm. Max-Pooling Figure credit: Eric Arazo
  • 30. Why does it improve classification accuracy? Acoustic guitar +25 % Volleyball +23 %
  • 31. Deep Vision for Medical Image Analysis
  • 32. Task: predict KL grade from X-Ray images
  • 34. Detection performance FCN detection performance Template matching: (J > 0.5) 8.3% SVM on handcrafted features: (J > 0.5): 38.6%
  • 35. Multi-objective learning helps! Same network used to regress on KL grade and predict a discrete KL grade Jointly train on both objectives
  • 36. Comparison with the state of the art
  • 37. How far are we from human-level accuracy? Most errors are between grade 0 and 1 and grade 1 and 2. Human experts have a hard time with these grades too. Agreement among humans on OAI ● weighted kappa of 0.70 [0.65-0.76] Human machine agreement ● weighted kappa of 0.67 [0.65-0.68] Predictions agree with the “gold standard” about as well as the “gold standard” agrees with itself. Confusion matrix
  • 38. Neonatal brain image segmentation Volumetric semantic segmentation: label each pixel with class of brain matter. Applications: ● Prerequisite for volumetric analysis ● Early identification of risk factors for impaired brain development Challenge: ● Neonatal brains very different ● Sparse training data! Neobrains challenge has 2 training examples
  • 39. Cerebellum Unmyelinated white matter Cortical grey matter Ventricles Cerebrospinal fluid Brainstem The task
  • 40. Model ● 8 layer FCN ● 64/96 convolution filters per layer ● Atrous (dilated) convolution to increase receptive field without sacrificing prediction resolution ● 9D per pixel softmax over classes ● Binary cross entropy loss ● L2 regularization ● Aggressive data augmentation: scale, crop, rotate, flip, gamma ● Train on 2 axial volumes (~50 slides per volume) for 500 epochs using Adam optimizer Animation from: http://nicolovaligi.com/deep-learning-models-semantic-segmentation.html
  • 41. Sample results Cerebellum Unmyelinated white matter Cortical grey matter Ventricles Cerebrospinal fluid Brainstem
  • 42. Tissue Ours LRDE_LTCI UPF_SIMBioSy s Cerebellum 0.92 0.94 0.94 Myelinated white matter 0.51 0.06 0.54 Basal ganglia and thalami 0.91 0.91 0.93 Ventricles 0.89 0.87 0.83 Unmyelinated white matter 0.93 0.93 0.91 Brainstem 0.82 0.85 0.85 Cortical grey matter 0.88 0.87 0.85 Cerebrospinal fluid 0.83 0.83 0.79 UWM+MWM 0.93 0.93 0.90 CSF+Ven 0.84 0.84 0.79 0.85 0.80 0.83 Neobrains challenge New state of the art on Neobrains infant brain segmentation challenge for axial volume segmentation Deep learning with only 2 training examples! No ensembling yet. Best competing approach is a large ensemble. Second best is also a deep net.
  • 43. Deep Vision for Crowd Analysis
  • 44. Fully convolutional crowd counting FCN ∑ Crowd count estimate 8x conv layers 2x max pooling Crowd density estimate Marsden et al. Fully convolutional crowd counting on highly congested scenes. VISAPP 2017 https://arxiv.org/abs/1612.00220
  • 46. Benchmark results: UCF CC 50 dataset 45-45,000 people per image State of the art improved by 11% (MSE) and 13% (MSE)
  • 49. Image Encoder FCN Interaction Encoder FCN Segmentation Decoder FCN Channel Concatenation Ground truth (MS COCO) Weak Labels Cross Entropy Loss Auto-generated User Outline Gaussian Process Input Image Crop from MS COCO Dilated Convolutions Pixel Predictions DeepClip: training
  • 50. Image Encoder FCN Interaction Encoder FCN Segmentation Decoder FCN Channel Concatenation User interactions Input Image Dilated Convolutions Pixel Predictions DeepClip: prediction Binary Segmentation Vector tracer Graph Cuts Optimizer
  • 51.
  • 52. Closing remarks ● Deep learning has completely revolutionized computer vision ● Human visual attention is important! Incorporating visual attention models helps in many tasks ● You don’t need a huge amount of training data to train an effective deep model ● Simulation techniques are effective for data generation ● Multi-task deep learning is an effective way of providing “more signal” during training.