Visual Attention: Detecting Saliency on Images
Vicente Ordonez, Department of Computer Science, State University of New York, Stony Brook, NY 11790
I will be working mainly on the following paper: "Learning to Detect a Salient Object". T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum (Xi'an Jiaotong University and Microsoft Research Asia), CVPR 2007. http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
The Method
Features:
Multiscale contrast (Done!)
Center surround histogram (Done!)
Color spatial distribution (Done!)
Supervised learning using Conditional Random Fields to determine the parameters that combine the features obtained above. (Done!)
[I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
Multiscale Contrast Function
Generate the Gaussian pyramid for the input image.
For each level in the pyramid: apply Gaussian blurring, then resample.
I'm using a 6-level Gaussian pyramid for each RGB channel.
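The pyramid construction can be sketched as follows. This is a minimal illustration under assumptions, not the implementation behind the slides: it works on one channel stored as a list of rows, uses a 5-tap binomial kernel as a stand-in for the Gaussian, and the function names are made up for the example. A real implementation would more likely rely on NumPy or OpenCV.

```python
def blur_1d(row, kernel=(1, 4, 6, 4, 1)):
    """Blur a 1-D sequence with a small binomial (Gaussian-like) kernel."""
    half = len(kernel) // 2
    total = float(sum(kernel))
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp indices at the borders
            acc += w * row[j]
        out.append(acc / total)
    return out


def blur(image):
    """Separable blur: filter every row, then every column."""
    rows = [blur_1d(r) for r in image]
    cols = [blur_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def downsample(image):
    """Keep every second pixel in both directions."""
    return [row[::2] for row in image[::2]]


def gaussian_pyramid(image, levels=6):
    """Return `levels` images; each one is a blurred, half-size copy of the previous."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(blur(pyramid[-1])))
    return pyramid


# Example: one 64x64 channel gives levels of height 64, 32, 16, 8, 4, 2.
channel = [[float((x + y) % 7) for x in range(64)] for y in range(64)]
pyramid = gaussian_pyramid(channel, levels=6)
```

For an RGB image the same function would be called once per channel.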
What a Gaussian pyramid looks like (figure from David Forsyth)
Multiscale Contrast Function
Generate a contrast map for each level of the pyramid.
Sum all of the results to produce the final multiscale contrast map.
These two steps correspond to the multiscale contrast formula in the paper.
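A minimal sketch of these two steps, under stated assumptions: contrast at a pixel is taken as the sum of squared differences to its neighbours in a small window, each level's map is brought back to full size by nearest-neighbour lookup, and the summed map is scaled to [0, 1]. The window radius and the function names are illustrative choices, not values taken from the slides.

```python
def contrast_map(level, radius=1):
    """Sum of squared differences between each pixel and its window neighbours."""
    h, w = len(level), len(level[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    diff = level[y][x] - level[ny][nx]
                    acc += diff * diff
            out[y][x] = acc
    return out


def upsample_nearest(image, height, width):
    """Resize to (height, width) by nearest-neighbour lookup."""
    h, w = len(image), len(image[0])
    return [[image[y * h // height][x * w // width] for x in range(width)]
            for y in range(height)]


def multiscale_contrast(levels, radius=1):
    """Add up the per-level contrast maps at the resolution of levels[0]."""
    height, width = len(levels[0]), len(levels[0][0])
    total = [[0.0] * width for _ in range(height)]
    for level in levels:
        cmap = upsample_nearest(contrast_map(level, radius), height, width)
        for y in range(height):
            for x in range(width):
                total[y][x] += cmap[y][x]
    peak = max(max(row) for row in total)
    if peak > 0:  # scale to [0, 1]; a perfectly flat image stays all zeros
        total = [[v / peak for v in row] for row in total]
    return total


# Example: a single bright pixel on a dark background.
spot = [[0.0] * 4 for _ in range(4)]
spot[1][1] = 1.0
result = multiscale_contrast([spot])
```

Here `levels` is expected to be the list of pyramid levels for one channel, largest first.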
Input image
Contrast maps: original image, contrast map at level 1, contrast map at level 4, contrast map at level 6
Multiscale Contrast Map Output
Center Surround Histogram Feature
For each possible rectangle with a reasonable size and aspect ratio:
Create a surrounding rectangle and calculate the histogram of the rectangle and of the surrounding area.
Pick and record the rectangle that maximizes the Chi-Square distance between the two histograms, and also record that Chi-Square distance.
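The search above can be sketched for a single pixel as follows. This is a brute-force illustration under assumptions: the image is already quantized so that each pixel holds a histogram bin index, the surround is a fixed-width ring around the rectangle (the `margin` parameter is a simplification introduced here), and the candidate sizes are passed in by the caller.

```python
def region_histogram(bins_image, top, left, bottom, right, n_bins):
    """Count bin indices inside the half-open box [top, bottom) x [left, right)."""
    hist = [0] * n_bins
    for y in range(max(top, 0), min(bottom, len(bins_image))):
        for x in range(max(left, 0), min(right, len(bins_image[0]))):
            hist[bins_image[y][x]] += 1
    return hist


def chi_square(h1, h2):
    """Chi-square distance between two histograms after normalising each to sum 1."""
    s1, s2 = float(sum(h1)) or 1.0, float(sum(h2)) or 1.0
    dist = 0.0
    for a, b in zip(h1, h2):
        a, b = a / s1, b / s2
        if a + b > 0:
            dist += (a - b) ** 2 / (a + b)
    return 0.5 * dist


def best_rectangle(bins_image, cy, cx, half_sizes, n_bins, margin=2):
    """Try rectangles centred on (cy, cx); return (distance, box) for the best one."""
    best = (-1.0, None)
    for half_h, half_w in half_sizes:
        inner = (cy - half_h, cx - half_w, cy + half_h + 1, cx + half_w + 1)
        outer = (inner[0] - margin, inner[1] - margin,
                 inner[2] + margin, inner[3] + margin)
        h_in = region_histogram(bins_image, *inner, n_bins)
        h_out = region_histogram(bins_image, *outer, n_bins)
        h_ring = [o - i for o, i in zip(h_out, h_in)]  # surround = outer minus inner
        dist = chi_square(h_in, h_ring)
        if dist > best[0]:
            best = (dist, inner)
    return best


# Example: a 3x3 block of bin 1 in the middle of a 9x9 image of bin 0.
image = [[1 if 3 <= y <= 5 and 3 <= x <= 5 else 0 for x in range(9)]
         for y in range(9)]
distance, box = best_rectangle(image, 4, 4, [(1, 1), (2, 2), (3, 3)], n_bins=2)
```

The rectangle that exactly covers the block wins because its histogram shares no bins with its surround.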
Center Surround Histogram Feature
The algorithm as described before is computationally expensive, so it needs a technique called the integral histogram, which allows fast calculation of the histogram of any rectangular region of an image. The technique was introduced in "Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces" by Fatih Porikli, Mitsubishi Electric Research Lab, CVPR 2005.
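A minimal sketch of the integral histogram idea, assuming the image has already been quantized into bin indices; the function names are illustrative. After one pass over the image, the histogram of any axis-aligned box comes from four table lookups, regardless of the box size.

```python
def integral_histogram(bins_image, n_bins):
    """Build ih[y][x][b] = number of pixels with bin b in rows < y and columns < x."""
    h, w = len(bins_image), len(bins_image[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            up, left, diag = ih[y][x + 1], ih[y + 1][x], ih[y][x]
            for b in range(n_bins):
                cell[b] = up[b] + left[b] - diag[b]
            cell[bins_image[y][x]] += 1  # add the current pixel's own bin
    return ih


def box_histogram(ih, top, left, bottom, right):
    """Histogram of the half-open box [top, bottom) x [left, right) from four lookups."""
    a, b = ih[bottom][right], ih[top][right]
    c, d = ih[bottom][left], ih[top][left]
    return [a[k] - b[k] - c[k] + d[k] for k in range(len(a))]


# Example on a tiny 3x3 image with three bins.
image = [[0, 1, 1],
         [2, 1, 0],
         [0, 0, 2]]
ih = integral_histogram(image, n_bins=3)
whole = box_histogram(ih, 0, 0, 3, 3)
corner = box_histogram(ih, 0, 1, 2, 3)
```

The table costs memory proportional to width x height x number of bins, which is the price paid for constant-time box queries.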
Center Surround Histogram Feature
Use the map of Chi-Square distances and the map of the most salient rectangle per pixel to generate the Center Surround Histogram Feature, following the formula given in the paper.
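A reconstruction of that formula from Liu et al. (CVPR 2007), which should be checked against the paper before being relied on. Here R*(x') is the most distinct rectangle centered at pixel x', R*_S(x') is its surround, and the weight is a Gaussian falloff with distance from the rectangle center; the paper ties the variance to the rectangle size.

```latex
f_h(x, I) \;\propto\; \sum_{\{x' \,\mid\, x \in R^*(x')\}} w_{xx'} \, \chi^2\!\big(R^*(x'),\, R^*_S(x')\big),
\qquad
w_{xx'} = \exp\!\big(-0.5\, \sigma_{x'}^{-2} \, \lVert x - x' \rVert^2\big)
```

In words: every rectangle that covers pixel x contributes its Chi-Square distance to x, with contributions weighted down the farther x lies from that rectangle's center.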
Center Surround Histogram Results: my implementation (15.2 sec, size = 245x384) and the results reported in the paper
Center Surround Histogram Results: my implementation (13.6 sec, size = 247x346) and the results reported in the paper
Center Surround Histogram Results: my implementation (10.2 sec, size = 248x277)
More Results
Color Spatial Distribution
Color Spatial Distribution
Make an initial clustering of the colors in the image using k-means, then refine the clusters with a Gaussian Mixture Model whose parameters are estimated with the EM algorithm. I am using 5 clusters (5 colors) per image. The results look similar to those presented in the paper, with an execution time of around 17 seconds per image.
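The initial clustering step can be sketched with plain k-means over RGB tuples. This is an illustration under assumptions: the initial centers are picked at evenly spaced positions in the pixel list to keep the run deterministic, the iteration count is arbitrary, and the Gaussian Mixture Model refinement with EM is left out.

```python
def kmeans_colors(pixels, k=5, iterations=10):
    """Cluster RGB tuples with plain k-means; returns (centers, labels)."""
    # Deterministic start: pick k pixels evenly spaced through the list.
    step = max(len(pixels) // k, 1)
    centers = [pixels[min(i * step, len(pixels) - 1)] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center in squared RGB distance.
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                n = float(len(members))
                centers[c] = tuple(sum(ch) / n for ch in zip(*members))
    return centers, labels


# Example: four red and four blue pixels split cleanly into two clusters.
pixels = [(255, 0, 0)] * 4 + [(0, 0, 255)] * 4
centers, labels = kmeans_colors(pixels, k=2)
```

The resulting centers and labels would then seed the mixture model, which replaces the hard labels with per-pixel cluster probabilities.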
Color Spatial Distribution
For each cluster, calculate the variance of the horizontal positions of its pixels, and then do the same for the vertical positions. Sum the two variances and use this value to give more weight to clusters with less spatial variance. Penalize clusters that have the majority of their pixels away from the center of the image.
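A minimal sketch of the variance weighting, under assumptions: each pixel carries a single hard cluster label instead of the mixture model's soft probabilities, the summed variances are rescaled to [0, 1] across clusters, and the penalty for clusters far from the image center is omitted.

```python
def color_spatial_feature(labels):
    """labels[y][x] is the cluster index of each pixel; returns a per-pixel weight map."""
    h, w = len(labels), len(labels[0])
    points = {}
    for y in range(h):
        for x in range(w):
            points.setdefault(labels[y][x], []).append((x, y))
    variance = {}
    for c, pts in points.items():
        n = float(len(pts))
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        var_x = sum((x - mx) ** 2 for x, _ in pts) / n  # horizontal spread
        var_y = sum((y - my) ** 2 for _, y in pts) / n  # vertical spread
        variance[c] = var_x + var_y
    lo, hi = min(variance.values()), max(variance.values())
    span = (hi - lo) or 1.0
    # Rescale to [0, 1] and invert: compact clusters get weights near 1.
    weight = {c: 1.0 - (v - lo) / span for c, v in variance.items()}
    return [[weight[labels[y][x]] for x in range(w)] for y in range(h)]


# Example: a compact 3x3 cluster (label 1) inside a 7x7 background (label 0).
labels = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
          for y in range(7)]
feature = color_spatial_feature(labels)
```

The compact cluster receives the highest weight and the widely spread background the lowest.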
Combine Features Together
Conditional Random Field Training and Inference
"Accelerated Training of Conditional Random Fields with Stochastic Gradient Methods". S. Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy. ICML'06 (Intl Conf on Machine Learning).
I did the training using the toolbox that accompanies the above paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
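To make the role of the learned parameters concrete, here is a sketch of only the per-pixel (unary) part of such a model. It is not the toolbox's API and not a full CRF: the pairwise smoothness term between neighbouring pixels is left out, the weights are hypothetical placeholders for values that training would produce, and every feature map is assumed to be scaled to [0, 1].

```python
def unary_labels(feature_maps, weights):
    """Pick, per pixel, the label with the lower weighted unary cost."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Calling a pixel background costs more the higher its feature values;
            # calling it salient costs more the lower they are.
            cost_bg = sum(lam * f[y][x] for lam, f in zip(weights, feature_maps))
            cost_fg = sum(lam * (1.0 - f[y][x]) for lam, f in zip(weights, feature_maps))
            mask[y][x] = 1 if cost_fg < cost_bg else 0
    return mask


# Example with two tiny feature maps and hypothetical, equal weights.
contrast = [[0.9, 0.1]]
center_surround = [[0.8, 0.2]]
mask = unary_labels([contrast, center_surround], weights=[0.5, 0.5])
```

With the pairwise term included, neighbouring pixels would additionally be encouraged to share a label, which is what smooths the final masks.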
Mask outputs using CRF inference (four examples)
Panels in each example: input, multiscale contrast map, center surround histogram, color spatial variance; then input, combined features, ground truth.

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 

Visual Saliency: Learning to Detect Salient Objects

  • 1. Visual Attention: Detecting Saliency on Images. Vicente Ordonez, Department of Computer Science, State University of New York, Stony Brook, NY 11790
  • 2. I will be working mainly on the following paper: "Learning to Detect a Salient Object," T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum (Xi'an Jiaotong University and Microsoft Research Asia), CVPR 2007. http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
  • 3. What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
  • 4. This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
  • 5. The Method. Features: multiscale contrast (Done!); center-surround histogram (Mostly Done!) (Done!); color spatial distribution (Done!). Supervised learning using Conditional Random Fields to determine the parameters that combine the features obtained above (Done!). [I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
  • 6. Multiscale Contrast Function: Generate the Gaussian pyramid for the input image. For each level in the pyramid, apply Gaussian blurring and then resample. I'm using a 6-level Gaussian pyramid for each RGB channel.
  • 7. What a Gaussian pyramid looks like. Figure from David Forsyth.
  • 8. Multiscale Contrast Function: Generate a contrast map for each level of the pyramid. Sum all of the results to produce the final multiscale contrast map. The two steps mentioned above are described in this formula:
  • 11. Contrast maps: original image; contrast map at level 1; contrast map at level 4; contrast map at level 6.
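The multiscale contrast steps above (build the pyramid, compute a contrast map per level, sum the maps) can be sketched roughly as follows. This is only an illustration, not the implementation behind these slides: the function names are hypothetical, and the 5-tap binomial blur, the 3x3 neighborhood, the nearest-neighbor upsampling, and the final normalization to [0, 1] are all assumptions.

```python
import numpy as np

def blur(img):
    """Separable 5-tap binomial blur (assumed stand-in for a Gaussian)."""
    k = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    h, w = img.shape[:2]
    padded = np.pad(img, ((2, 2), (2, 2), (0, 0)), mode="edge")
    rows = sum(k[i] * padded[:, i:i + w] for i in range(5))   # horizontal pass
    return sum(k[i] * rows[i:i + h] for i in range(5))        # vertical pass

def gaussian_pyramid(img, levels=6):
    """Blur and subsample by 2 at each step; level 0 is the input image."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

def contrast_map(level, radius=1):
    """Sum of squared color differences to the neighbors in a window."""
    h, w, _ = level.shape
    padded = np.pad(level, ((radius, radius), (radius, radius), (0, 0)),
                    mode="edge")
    out = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            diff = level - padded[dy:dy + h, dx:dx + w]
            out += (diff ** 2).sum(axis=2)
    return out

def multiscale_contrast(img, levels=6):
    """Sum the per-level contrast maps after resizing them to the input size."""
    h, w, _ = img.shape
    total = np.zeros((h, w))
    for level in gaussian_pyramid(img, levels):
        c = contrast_map(level)
        ys = np.arange(h) * c.shape[0] // h   # nearest-neighbor upsampling
        xs = np.arange(w) * c.shape[1] // w
        total += c[np.ix_(ys, xs)]
    peak = total.max()
    return total / peak if peak > 0 else total
```

For an H x W x 3 array `img`, `multiscale_contrast(img)` returns an H x W map that is largest near strong color edges and zero for a constant image.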
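  • 13. Center Surround Histogram Feature
  • 14. For each possible rectangle with a reasonable size and aspect ratio:
  • 15. Create a surrounding rectangle and calculate the histogram of the rectangle and the histogram of the surrounding area.
  • 16. Pick and record the rectangle that maximizes the chi-square distance between the two histograms calculated above, and also record that chi-square distance.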
  • 17. Center Surround Histogram Feature: The algorithm as described before is computationally expensive, so it requires a technique called the integral histogram, which allows fast calculation of the histogram of any given rectangular region of an image. The technique was introduced in "Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces" by Fatih Porikli, Mitsubishi Electric Research Lab, in CVPR 2005.
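A minimal sketch of the integral histogram idea, together with the chi-square comparison between a rectangle and its surround, might look like this. It assumes the image has already been quantized into integer bin indices. All function names and the `margin` parameter are hypothetical, and the surround is taken as a fixed-width band around the rectangle, which is a simplification and not necessarily the surround used in the paper or in my implementation.

```python
import numpy as np

def integral_histogram(bins_img, n_bins):
    """ih[y, x, b] = number of pixels with bin b inside bins_img[:y, :x]."""
    h, w = bins_img.shape
    one_hot = np.zeros((h, w, n_bins))
    one_hot[np.arange(h)[:, None], np.arange(w)[None, :], bins_img] = 1
    ih = np.zeros((h + 1, w + 1, n_bins))
    ih[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return ih

def rect_hist(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and cols left..right-1 from 4 lookups."""
    return ih[bottom, right] - ih[top, right] - ih[bottom, left] + ih[top, left]

def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two histograms after normalization."""
    p = p / max(p.sum(), eps)
    q = q / max(q.sum(), eps)
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

def center_surround_distance(ih, top, left, bottom, right, margin):
    """Distance between a rectangle and the band of width `margin` around it."""
    h, w = ih.shape[0] - 1, ih.shape[1] - 1
    center = rect_hist(ih, top, left, bottom, right)
    outer = rect_hist(ih, max(top - margin, 0), max(left - margin, 0),
                      min(bottom + margin, h), min(right + margin, w))
    return chi_square(center, outer - center)
```

Once `ih` is built, each rectangle costs a constant number of lookups per bin, regardless of the rectangle's size, so the exhaustive search over rectangles stays tractable.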
  • 18. Center Surround Histogram Feature: Use the chi-square distance map and the map of the most salient rectangle per pixel to generate the center-surround histogram feature using the following formula:
  • 19. Center Surround Histogram: results using my implementation (15.2 sec, size = 245x384); results reported in the paper.
  • 20. Center Surround Histogram: results using my implementation (13.6 sec, size = 247x346); results reported in the paper.
  • 21. Center Surround Histogram: results using my implementation (10.2 sec, size = 248x277).
  • 34. Color Spatial Distribution: Make an initial clustering of the colors in the image using k-means. Further refine the clusters using Gaussian Mixture Models, whose parameters are calculated with the EM algorithm. I am using 5 clusters (5 colors) per image. The results look similar to those presented in the paper, with an execution time of around 17 seconds per image.
  • 35. Color Spatial Distribution: Calculate the variance of the horizontal positions of the pixels in each cluster, and then the same for the vertical positions. Sum the two variances and use this value to give more weight to clusters with less spatial variance. Penalize clusters that have the majority of their pixels far from the center of the image.
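The weighting described above can be sketched as follows, assuming the per-pixel soft assignments to the color clusters (for example, the posteriors of the fitted Gaussian Mixture Model) are already available as an array. The function name, the min-max normalization of the variances and center distances, and the way the two weights are multiplied are assumptions made for illustration only.

```python
import numpy as np

def color_spatial_distribution(resp):
    """resp[y, x, c] is the soft assignment of pixel (y, x) to cluster c."""
    h, w, _ = resp.shape
    ys, xs = np.mgrid[0:h, 0:w]
    mass = resp.sum(axis=(0, 1)) + 1e-12            # total weight per cluster

    # Weighted mean and variance of pixel positions for each cluster.
    mean_x = (resp * xs[..., None]).sum(axis=(0, 1)) / mass
    mean_y = (resp * ys[..., None]).sum(axis=(0, 1)) / mass
    var_x = (resp * (xs[..., None] - mean_x) ** 2).sum(axis=(0, 1)) / mass
    var_y = (resp * (ys[..., None] - mean_y) ** 2).sum(axis=(0, 1)) / mass
    var = var_x + var_y
    spread = (var - var.min()) / (var.max() - var.min() + 1e-12)

    # Average distance of each cluster's pixels from the image center.
    dist = np.hypot(xs - (w - 1) / 2.0, ys - (h - 1) / 2.0)
    far = (resp * dist[..., None]).sum(axis=(0, 1)) / mass
    far = (far - far.min()) / (far.max() - far.min() + 1e-12)

    # Compact clusters near the center get the largest weight.
    weights = (1.0 - spread) * (1.0 - far)
    feat = (resp * weights).sum(axis=2)
    return feat / (feat.max() + 1e-12)
```

With this weighting, a color that is concentrated in one compact region near the center scores high, while a color spread across the whole image (typically background) scores low.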
  • 45. Conditional Random Field Training and Inference. "Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent," S. Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy. ICML'06 (Intl. Conf. on Machine Learning). I did the training using the toolbox from the above paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
  • 46. Mask outputs using CRF inference. Panels: input, multiscale contrast map, center-surround histogram, color spatial variance, combined features, ground truth.
  • 47. Mask outputs using CRF inference. Panels: input, multiscale contrast map, center-surround histogram, color spatial variance, combined features, ground truth.
  • 48. Mask outputs using CRF inference. Panels: input, multiscale contrast map, center-surround histogram, color spatial variance, combined features, ground truth.
  • 49. Mask outputs using CRF inference. Panels: input, multiscale contrast map, center-surround histogram, color spatial variance, combined features, ground truth.
  • 50. Precision / Recall obtained
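For reference, precision and recall between a predicted mask and a ground-truth mask can be computed as in the sketch below. It assumes a pixel-level comparison of two binary masks of the same shape. The function name is hypothetical, and the evaluation behind the numbers on this slide may have been set up differently (for example, on bounding rectangles).

```python
import numpy as np

def precision_recall(pred, truth):
    """Pixel-level precision and recall of a binary mask against ground truth."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()      # salient pixels found correctly
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / truth.sum() if truth.sum() else 0.0
    return float(precision), float(recall)
```

Precision is the fraction of predicted salient pixels that are truly salient, and recall is the fraction of truly salient pixels that were predicted.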
  • 51. Some Conclusions: The results of the original research paper on computing the visual features have been replicated to a considerable extent. The Conditional Random Field framework used in this project performed well for this task. The center-surround histogram map was the feature that gave the highest precision. The time required to compute each individual feature is on the order of several seconds.

Editor's Notes

  1. Not so good result
  2. Good result
  3. Not so good result