This document proposes an adaptive metric learning algorithm (AML) for visual saliency detection. AML learns two complementary distance metrics: 1) a generic metric (GML) that considers the global distribution of training data, and 2) a specific metric (SML) that considers the structure of individual images. GML and SML are combined to better distinguish salient objects from background. The algorithm also uses superpixel-wise Fisher vector coding of low-level features to enhance saliency detection performance. Experimental results show the proposed AML approach outperforms other state-of-the-art saliency detection methods.
An ensemble classification algorithm for hyperspectral images (sipij)
Hyperspectral image analysis has been used for many purposes in environmental monitoring, remote
sensing, vegetation research, and land cover classification. A hyperspectral image consists of many
layers, each representing a specific wavelength; the layers stack on top of one another, forming a
cube-like image covering the entire spectrum. This work aims to classify hyperspectral images and
produce an accurate thematic map. Spatial information is extracted by applying morphological
profiles and local binary patterns. A support vector machine, an efficient classifier for
hyperspectral data, performs the classification, and a genetic algorithm selects the best feature
subset for it. The selected features are classified to obtain the classes and produce a thematic
map. Experiments are carried out on the AVIRIS Indian Pines and ROSIS Pavia University datasets;
the proposed method achieves 93% accuracy on Indian Pines and 92% on Pavia University.
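The genetic-algorithm feature selection step described above can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: it evolves a binary band mask against a simple Fisher-ratio separability score (a stand-in for the SVM accuracy the paper would actually optimize), on synthetic two-class data where only the first three bands are informative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: only the first 3 of 10 "bands" are informative.
n, d = 200, 10
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)
X[y == 1, :3] += 2.0  # class 1 shifted along the informative bands

def fitness(mask):
    """Fisher-style separability of the selected bands (higher is better)."""
    if mask.sum() == 0:
        return -np.inf
    Xa, Xb = X[y == 0][:, mask], X[y == 1][:, mask]
    gap = np.abs(Xa.mean(0) - Xb.mean(0))
    spread = Xa.std(0) + Xb.std(0) + 1e-9
    return float((gap / spread).mean())

# Simple GA: tournament selection, uniform crossover, bit-flip mutation.
pop = rng.random((30, d)) < 0.5
for _ in range(40):
    scores = np.array([fitness(m) for m in pop])
    new = []
    for _ in range(len(pop)):
        i, j = rng.integers(0, len(pop), 2)
        p1 = pop[i] if scores[i] >= scores[j] else pop[j]
        i, j = rng.integers(0, len(pop), 2)
        p2 = pop[i] if scores[i] >= scores[j] else pop[j]
        child = np.where(rng.random(d) < 0.5, p1, p2)  # uniform crossover
        child ^= rng.random(d) < 0.05                  # bit-flip mutation
        new.append(child)
    pop = np.array(new)

best = pop[np.argmax([fitness(m) for m in pop])]
print(np.flatnonzero(best))  # the informative bands should dominate
```

In the paper the fitness would instead be the SVM's classification accuracy on the candidate subset, which is far more expensive but follows the same evolutionary loop.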
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid (Rajyalakshmi Reddy)
The document discusses a multi-focus image fusion method based on Laplacian pyramid decomposition. It begins with an introduction to image fusion and multi-scale transforms. It then describes the proposed Laplacian pyramid based fusion method, which decomposes images into multiple resolution levels and fuses the levels using different operators. Experimental results show the proposed method provides better visual quality and quantitative metrics than average and wavelet based fusion methods.
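As a rough illustration of the decompose-then-fuse idea (not the paper's exact operators), the following NumPy sketch builds Laplacian pyramids with simple block-average downsampling and pixel-replication upsampling, fuses detail levels by max-absolute selection, and averages the base level:

```python
import numpy as np

def down(img):
    """2x downsample by 2x2 block averaging (assumes even dimensions)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def up(img):
    """2x upsample by pixel replication."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    pyr = []
    for _ in range(levels):
        small = down(img)
        pyr.append(img - up(small))   # detail (band-pass) layer
        img = small
    pyr.append(img)                   # low-resolution base
    return pyr

def fuse(img_a, img_b, levels=3):
    pa, pb = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    fused = []
    for la, lb in zip(pa[:-1], pb[:-1]):
        fused.append(np.where(np.abs(la) >= np.abs(lb), la, lb))  # keep stronger detail
    fused.append(0.5 * (pa[-1] + pb[-1]))                          # average the base
    out = fused[-1]
    for lap in reversed(fused[:-1]):                               # reconstruct
        out = up(out) + lap
    return out

# Two 64x64 test images, each "in focus" (detailed) in a different half.
rng = np.random.default_rng(1)
base = rng.random((64, 64))
a = base.copy(); a[:, 32:] = 0.5       # right half blurred away
b = base.copy(); b[:, :32] = 0.5       # left half blurred away
f = fuse(a, b)
print(f.shape)  # (64, 64)
```

Selecting the larger-magnitude Laplacian coefficient at each level is the classic rule for multi-focus fusion, since in-focus regions carry more band-pass energy.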
Blind separation of image sources via adaptive dictionary learning (Mohan Raj)
This document summarizes a research paper about blind source separation of image sources using adaptive dictionary learning. The paper proposes a new approach that adaptively learns local dictionaries for each source during the separation process, when the sparse domains of the sources are unknown. This improves source separation quality even in noisy situations. The paper defines a cost function for this approach and proposes extending an existing denoising method to minimize it. Due to practical limitations, a feasible hierarchical method is proposed instead, where local dictionaries are learned for each source along with separation. Experimental results confirm the strength of this proposed approach.
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION (ijistjournal)
A different image fusion algorithm based on the self-organizing feature map is proposed in this paper, aiming to produce quality images. Image fusion integrates complementary and redundant information from multiple images of the same scene to create a single composite image that contains all the important features of the originals; the fused image is thus more suitable for human and machine perception or for further image processing tasks. Existing fusion techniques based on direct operation on pixels or segments fail to produce fused images of the required quality and are mostly application specific, and existing segmentation algorithms become complicated and time consuming when multiple images are to be fused. A new method of segmenting and fusing grayscale images using Self-Organizing Feature Maps (SOM) is proposed in this paper. The SOM produces multiple slices of the source and reference images based on various combinations of gray levels, and these slices can be fused dynamically depending on the application. The proposed technique is applied and analyzed for fusion of multiple images. It is robust in the sense that no information is lost, owing to the properties of the SOM; noise in the source images is removed during the processing stage, and fusion of multiple images is performed dynamically to obtain the desired results. Experimental results demonstrate that, for multi-focus image fusion, the proposed method performs better than some popular image fusion methods in both subjective and objective quality.
Image fusion is a technique of combining two or more
images of the same scene to form a single fused image
that shows the essential information of both. Image
fusion is also used for removing noise from images.
Noise is an undesirable artifact that degrades the
quality of an image, affecting its clarity; it can be of
various kinds, for example Gaussian noise, impulse
noise, uniform noise, and so on. Images sometimes
degrade during acquisition or transmission, or because
of faulty memory locations in the hardware. Image
fusion can be performed at three levels: pixel-level
fusion, feature-level fusion, and decision-level fusion.
There are essentially two kinds of image fusion
techniques: spatial-domain techniques and
transform-domain techniques. PCA fusion, the averaging
method, and high-pass filtering are spatial-domain
methods, while transform-based methods such as the
Discrete Cosine Transform and the Discrete Wavelet
Transform are transform-domain methods. The various
image fusion methods each have their advantages and
disadvantages; many techniques suffer from the problem
of color artifacts in the fused image. In addition,
there is the cyclopean view: one of the most
astonishing properties of human stereo vision is the
fusion of the left and right views of a scene into a
single cyclopean one. Under normal viewing conditions,
the world appears as seen from a virtual eye placed
halfway between the left and right eye positions. The
perceived image of the world is never recorded directly
by any sensory array, but is constructed by our neural
machinery. The term cyclopean refers to a class of
visual stimuli that are defined by binocular disparity
alone. Julesz suspected that stereopsis might reveal
hidden objects, which could be useful for finding
camouflaged targets. The critical result of his
experiments with random-dot stereograms was that
disparity alone is sufficient for stereopsis, whereas
it had previously only been shown that binocular
disparity was necessary for stereopsis.
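The spatial-domain methods mentioned above (averaging and PCA fusion) are easy to sketch. The following is a minimal illustration under common conventions, not any particular paper's code; the PCA variant weights the two sources by the components of the first principal eigenvector of their joint pixel covariance:

```python
import numpy as np

def average_fusion(a, b):
    """Pixel-level averaging: simple, but tends to lower contrast."""
    return 0.5 * (a + b)

def pca_fusion(a, b):
    """Weight each source image by the first principal component of the
    2x2 covariance of their flattened pixel values."""
    data = np.stack([a.ravel(), b.ravel()])
    cov = np.cov(data)                      # 2x2 covariance of the two sources
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    w = np.abs(vecs[:, -1])                 # leading eigenvector
    w = w / w.sum()                         # normalize weights to sum to 1
    return w[0] * a + w[1] * b

rng = np.random.default_rng(2)
a = rng.random((32, 32))
b = rng.random((32, 32))
print(pca_fusion(a, b).shape)  # (32, 32)
```

The PCA weighting favors the source with larger variance, which is a common proxy for the more informative (e.g. better-focused) image.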
A MORPHOLOGICAL MULTIPHASE ACTIVE CONTOUR FOR VASCULAR SEGMENTATION (ijbbjournal)
This paper presents a morphological active contour ideal for vascular segmentation in biomedical images.
The unenhanced images of vessels and background are successfully segmented using a two-step
morphological active contour based upon Chan and Vese’s Active Contour without Edges. Using dilation
and erosion as an approximation of curve evolution, the contour provides an efficient, simple, and robust
alternative to solving partial differential equations used by traditional level-set Active Contour models. The
proposed method is demonstrated with segmented data set images and compared to results garnered from
multiphase Active Contour without Edges, morphological watershed, and Fuzzy C-means segmentations.
Review on Optimal image fusion techniques and Hybrid technique (IRJET Journal)
This document reviews various image fusion techniques and proposes a hybrid technique. It discusses pixel-level, feature-level, and decision-level image fusion. Spatial-domain methods like average fusion and transform-domain methods like the discrete wavelet transform are described. The limitations of existing techniques, such as ringing artifacts and shift variance, are covered. A hybrid technique using set partitioning in hierarchical trees (SPIHT) and the self-organizing migrating algorithm (SOMA) is proposed to improve fusion quality and efficiency over existing methods. The technique is presented as easier to implement and suitable for real-time applications.
Optimized Neural Network for Classification of Multispectral Images (IDES Editor)
This document summarizes an article that proposes using a multiobjective particle swarm optimization (MOPSO) approach to optimize the structure of an artificial neural network for classifying multispectral satellite images. Specifically, the MOPSO is used to simultaneously select the most discriminative spectral bands from the available options and determine the optimal number of nodes in the hidden layer of the neural network. The MOPSO approach is compared to traditional classifiers like maximum likelihood classification and Euclidean classifiers. The results show that the MOPSO-optimized neural network approach provides superior performance for remote sensing image classification problems.
1) The document discusses various medical image fusion techniques including pixel level, feature level, and decision level fusion.
2) It proposes a novel pixel level fusion method called Iterative Block Level Principal Component Averaging fusion that divides images into blocks and calculates principal components for each block.
3) Experimental results on fusing noise free and noise filtered MR images show that the proposed method performs well in terms of average mutual information and structural similarity compared to other algorithms.
Multilinear Kernel Mapping for Feature Dimension Reduction in Content Based M... (ijma)
In the process of content-based multimedia retrieval, multimedia information is processed to
obtain descriptive features. Descriptive representation of features results in a huge feature
count, which in turn causes processing overhead. To reduce this overhead, various dimensionality
reduction approaches have been used, among which PCA and LDA are the most common. However, these
methods do not reflect the significance of feature content in terms of the inter-relations among
all dataset features. To achieve dimension reduction based on histogram transformation, features
with low significance can be eliminated. In this paper, we propose a feature dimensionality
reduction approach based on multi-linear kernel (MLK) modeling. A benchmark dataset is used for
the experimental work, and the proposed method is observed to improve on the conventional system
in the analysis.
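For reference, the baseline PCA reduction that such work compares against can be sketched in a few lines. This is the textbook SVD formulation, not the proposed MLK method:

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature vectors X (n_samples x n_features) onto the
    top-k principal components of the centered data."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 64))   # e.g. 100 images with 64-dim descriptors
Z = pca_reduce(X, 8)
print(Z.shape)  # (100, 8)
```

The retrieval index is then built on the k-dimensional projections instead of the full descriptors, trading a small loss of variance for much cheaper distance computations.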
SINGLE IMAGE SUPER RESOLUTION: A COMPARATIVE STUDY (csandit)
The majority of applications require high-resolution images to derive and analyze data
accurately and easily, and image super resolution plays an effective role in such applications.
Image super resolution is the process of producing a high-resolution image from a low-resolution
image. In this paper, we study various image super resolution techniques with respect to the
quality of their results and their processing time. This comparative study compares four
single-image super-resolution algorithms. For a fair comparison, the algorithms are tested on
the same dataset and the same platform to show the major advantages of one over the others.
This document describes a project that aims to detect backgrounds in images with poor lighting and enhance contrast using morphological operations. The proposed method involves two approaches: 1) dividing the image into blocks and analyzing each block to determine background parameters, and 2) using morphological erosion and dilation with structuring elements to compute minimum and maximum intensity values within windows of the image for background detection and contrast enhancement. The results and conclusions of implementing these methods are then presented.
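The second approach above, computing window-wise minimum and maximum intensities via greyscale erosion and dilation, can be sketched as follows. This is an illustrative reconstruction with a flat structuring element and an assumed contrast-stretch step, not the project's exact code:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def grey_erode(img, size=3):
    """Greyscale erosion with a flat size x size structuring element
    (minimum over each window); edges handled by reflection padding."""
    p = size // 2
    padded = np.pad(img, p, mode="reflect")
    return sliding_window_view(padded, (size, size)).min(axis=(2, 3))

def grey_dilate(img, size=3):
    """Greyscale dilation: maximum over each window."""
    p = size // 2
    padded = np.pad(img, p, mode="reflect")
    return sliding_window_view(padded, (size, size)).max(axis=(2, 3))

def enhance(img, size=15):
    """Estimate the background with a morphological opening (erosion then
    dilation), subtract it, and stretch the residual detail to [0, 1]."""
    background = grey_dilate(grey_erode(img, size), size)
    detail = img - background
    lo, hi = detail.min(), detail.max()
    return (detail - lo) / (hi - lo + 1e-9)

rng = np.random.default_rng(4)
img = rng.random((40, 40)) * 0.2 + np.linspace(0, 0.8, 40)  # dim, uneven lighting
out = enhance(img)
print(out.min() >= 0.0 and out.max() <= 1.0)  # True
```

The opening removes structures smaller than the structuring element, so what remains approximates the slowly varying background illumination; subtracting it flattens the lighting before the contrast stretch.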
IRJET - Review of Various Multi-Focus Image Fusion Methods (IRJET Journal)
This document provides an overview of multi-focus image fusion methods. It discusses various multi-focus image fusion techniques in both the spatial and frequency domains. It reviews several papers on multi-focus image fusion using different methods like region mosaicking on laplacian pyramid (RMLP), discrete wavelet transform (DWT), principal component analysis (PCA), discrete cosine transform (DCT), and implementation on field programmable gate arrays (FPGAs). The document compares the advantages and issues of the techniques discussed in the reviewed papers. It provides context on applications of image fusion in areas like remote sensing, medical imaging, and more.
The document proposes a new feature descriptor called Local Bit-Plane Wavelet Pattern (LBWP) to improve content-based retrieval of biomedical images like CT and MRI scans. LBWP encodes relationships between pixel intensities in different bit planes and applies a wavelet function, capturing more fine-grained image details than prior methods. Evaluation on a dataset from The Cancer Imaging Archive showed LBWP outperformed existing approaches like Local Wavelet Pattern with higher average retrieval precision, rate, and F-score.
DEEP LEARNING BASED TARGET TRACKING AND CLASSIFICATION DIRECTLY IN COMPRESSIV... (sipij)
Past research has found that compressive measurements save data storage and bandwidth. However, compressive measurements are difficult to use directly for target tracking and classification without pixel reconstruction, because the Gaussian random measurement matrix destroys the target location information in the original video frames. This paper summarizes our research on target tracking and classification directly in the compressive measurement domain. We focus on one type of compressive measurement based on pixel subsampling: the compressive measurements are obtained by randomly subsampling the original pixels in video frames. Even in this special setting, conventional trackers do not work well. We propose a deep learning approach that integrates YOLO (You Only Look Once) for multiple target detection and ResNet (residual network) for target classification in low-quality videos. Extensive experiments using optical and mid-wave infrared (MWIR) videos from the SENSIAC database demonstrate the efficacy of the proposed approach.
Image Fusion and Image Quality Assessment of Fused Images (CSCJournals)
Accurate diagnosis of tumor extent is important in radiotherapy. This paper presents image fusion of PET and MRI images. Multi-sensor image fusion is the process of combining information from two or more images into a single image; the resulting image contains more information than any individual input. PET delivers high-resolution molecular imaging with a resolution down to 2.5 mm full width at half maximum (FWHM), which allows us to observe the brain's molecular changes using specific reporter genes and probes. The 7.0 T MRI, with sub-millimeter resolution of the cortical areas down to 250 μm, allows us to visualize the fine details of the brainstem as well as many cortical and sub-cortical areas. The PET-MRI fusion imaging system provides complete information for neurological diseases as well as cognitive neuroscience. The paper presents PCA-based image fusion and also an image fusion algorithm based on the wavelet transform to improve resolution: the two images to be fused are first decomposed into sub-images of different frequencies, the information fusion is performed, and the sub-images are finally reconstructed into a result image with plentiful information. We also propose image fusion in Radon space. This paper presents an assessment of image fusion by measuring the quantity of enhanced information in fused images, using entropy, mean, standard deviation, fusion mutual information, cross correlation, mutual information, root mean square error, the universal image quality index, and relative shift in mean to compare fused image quality. Comparative evaluation of fused images is a critical step in evaluating the relative performance of different image fusion algorithms. In this paper, we also propose an image quality metric based on the human vision system (HVS).
IRJET - Fusion based Brain Tumor Detection (IRJET Journal)
1. The document discusses a method for detecting brain tumors using medical image fusion and support vector machines (SVM).
2. It involves fusing two MRI images using SVM to create a single fused image with more information than the original images. Texture and wavelet features are then extracted from the fused image.
3. The SVM classifier classifies the brain tumors as benign or malignant based on the trained and tested features extracted from the fused image.
FUZZY SEGMENTATION OF MRI CEREBRAL TISSUE USING LEVEL SET ALGORITHM (AM Publications)
The current study investigated a median filter with the fuzzy level set method to propose fuzzy segmentation of magnetic resonance imaging (MRI) cerebral tissue images. An MRI image was used as an input image. A median filter and fuzzy c-means (FCM) clustering were utilized to remove image noise and create image clusters, respectively. The image clusters showed initial and final cluster centers. The level set method was then used for segmentation after separating and extracting white matter from gray matter. Fuzzy c-means was sensitive to the choice of the initial cluster center. Improper center selection caused the method to produce suboptimal solutions. The proposed algorithm was successfully utilized to segment MRI cerebral tissue images. The algorithm efficiently performed segmentation of test MRI cerebral tissue images compared with algorithms proposed in previous studies.
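The FCM clustering step can be sketched on 1-D intensities as follows. The quantile-based initialization is an assumption added here so the toy example converges deterministically; as the abstract notes, FCM is sensitive to exactly this choice of initial centers:

```python
import numpy as np

def fuzzy_cmeans(x, c=3, m=2.0, iters=50):
    """Fuzzy c-means on 1-D intensities x; returns (centers, memberships).
    m > 1 is the fuzzifier; memberships in each row sum to 1."""
    centers = np.quantile(x, np.linspace(0.05, 0.95, c))  # spread initial centers
    for _ in range(iters):
        d = np.abs(x[:, None] - centers[None, :]) + 1e-9  # point-center distances
        u = 1.0 / (d ** (2.0 / (m - 1.0)))                # standard FCM update
        u /= u.sum(axis=1, keepdims=True)
        um = u ** m
        centers = (um * x[:, None]).sum(0) / um.sum(0)    # weighted means
    return centers, u

# Synthetic "tissue" intensities: three well-separated clusters,
# standing in for CSF, gray matter, and white matter.
rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0.2, 0.02, 300),
                    rng.normal(0.5, 0.02, 300),
                    rng.normal(0.8, 0.02, 300)])
centers, u = fuzzy_cmeans(x)
print(np.sort(centers).round(1))  # [0.2 0.5 0.8]
```

Each pixel receives a graded membership in every cluster rather than a hard label, which is what makes FCM attractive for partial-volume effects at tissue boundaries.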
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL (IJCSEIT Journal)
The document proposes an approach combining automatic relevance feedback and particle swarm optimization for image retrieval. It constructs a visual feature database from image features like color moments and Gabor filters. For a query image, it retrieves similar images and generates automatic relevance feedback by labeling images as relevant or irrelevant. It then uses particle swarm optimization to re-weight features and retrieve more relevant images over multiple iterations, splitting the swarm in later iterations. An experiment on Corel images over 5 classes showed the approach could effectively retrieve relevant images through this meta-heuristic process without human interaction.
MIP AND UNSUPERVISED CLUSTERING FOR THE DETECTION OF BRAIN TUMOUR CELLS (AM Publications)
Image processing is widely used in biomedical applications. It can be used to analyze different MRI
brain images in order to detect abnormalities; the objective is to extract meaningful information
from the imaged signals. Image segmentation is a process of partitioning an image into different
parts, with the division often based on the characteristics of the pixels in the image. In our
paper, the segmentation of the tumour tissues is carried out using k-means and fuzzy c-means
clustering. The tumour can be found, and fast detection is achieved, with an execution time of only
a few seconds. The input brain image is taken from the available database, and the presence of a
tumour in the input image can be detected.
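A minimal version of the k-means intensity clustering used for tumour detection might look like this. The synthetic data and the bright-cluster-equals-tumour rule are illustrative assumptions, not the paper's protocol:

```python
import numpy as np

def kmeans(x, k=2, iters=30):
    """Lloyd's k-means on 1-D pixel intensities; returns (centers, labels)."""
    centers = np.quantile(x, np.linspace(0.1, 0.9, k))   # spread initial centers
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return centers, labels

# Synthetic slice: dark "healthy" tissue plus a small bright "tumour" region.
rng = np.random.default_rng(6)
img = rng.normal(0.3, 0.05, (64, 64))
img[20:30, 20:30] = rng.normal(0.9, 0.05, (10, 10))

centers, labels = kmeans(img.ravel(), k=2)
mask = (labels == np.argmax(centers)).reshape(64, 64)  # brighter cluster = tumour
print(mask[22:28, 22:28].all(), mask.mean() < 0.05)    # True True
```

Real pipelines would add preprocessing (skull stripping, denoising) and post-process the mask morphologically, but the clustering core is this small, which is why detection runs in seconds.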
This document summarizes a research article that proposes using a Bayesian classifier to aid in level set segmentation for early detection of diabetic retinopathy. Level set segmentation is used to segment retinal images and detect small blood clots. A Bayesian classifier is applied to help propagate the level set contour and classify pixels as normal blood vessels or abnormal blood clots. The method was tested on retinal images and showed it could detect small clots of 0.02mm, indicating it may help detect early proliferation stages. Results demonstrated it outperformed other methods in detecting minute clots for early stage proliferation detection.
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio... (CSCJournals)
The fuzzy c-means (FCM) algorithm has proved its effectiveness for image segmentation. However, it still lacks robustness to noise and outliers, especially in the absence of prior knowledge of the noise. To overcome this problem, a novel multiple-kernel fuzzy c-means (NMKFCM) methodology with spatial information is introduced as a framework for the image-segmentation problem. The algorithm incorporates spatial neighborhood membership values into the standard kernels used in the kernel FCM (KFCM) algorithm and modifies the membership weighting of each cluster. The proposed NMKFCM algorithm provides new flexibility to utilize different pixel information in image segmentation. It is applied to brain MRI degraded by Gaussian noise and salt-and-pepper noise, and proves more robust to noise than other existing image segmentation algorithms from the FCM family.
An Analysis and Comparison of Quality Index Using Clustering Techniques for S... - CSCJournals
This document presents a proposed methodology for microarray image segmentation using clustering techniques. The methodology involves three main steps: preprocessing, gridding, and segmentation. Segmentation is performed using an enhanced fuzzy c-means clustering algorithm (EFCMC) that uses neighborhood pixel information and gray levels. EFCMC can accurately detect absent spots and is tolerant to noise. The methodology is tested on real microarray images and its segmentation quality is assessed using a quality index. Results show EFCMC improves the quality index compared to k-means clustering and fuzzy c-means clustering.
The document describes an image fusion approach that uses adaptive fuzzy logic modeling for global processing followed by Markov random field modeling for local processing.
It begins by introducing image fusion and its applications. It then discusses existing fusion approaches and their limitations. The proposed approach first uses an adaptive fuzzy logic model to minimize redundant information globally. It then applies Markov random field modeling locally for fusion. Experimental results showed the proposed approach improved the universal image quality index by 30-35% compared to fusion with Markov random field modeling alone.
A GENERAL STUDY ON HISTOGRAM EQUALIZATION FOR IMAGE ENHANCEMENT - pharmaindexing
The document discusses several methods for image enhancement using histogram equalization. It begins with an introduction to histogram equalization and its use in increasing image quality and local contrast. It then reviews three existing histogram equalization methods - Bi-Histogram Equalization with Neighborhood Metrics, Class-Based Parametric Approximation to Histogram Equalization, and Texture Enhanced Histogram Equalization Using TV-L1 Image Decomposition. Each aimed to improve on traditional histogram equalization by addressing issues like maintaining brightness, preserving local information, and avoiding intensity saturation artifacts. The document concludes that variational approaches like TV-L1 decomposition have potential to outperform conventional histogram equalization methods for contrast enhancement.
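The classical global histogram equalization that all of the reviewed methods build on is simple to sketch: map each grey level through the normalized cumulative histogram so the output uses the full intensity range. A minimal version for a flat list of 8-bit intensities:

```python
def hist_equalize(img, levels=256):
    """Map grey levels through the normalised cumulative histogram,
    spreading the used levels across the full output range."""
    hist = [0] * levels
    for v in img:
        hist[v] += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    n = len(img)
    cdf_min = next(c for c in cdf if c > 0)   # first non-zero CDF value
    if n == cdf_min:                          # constant image: nothing to spread
        return list(img)
    lut = [round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1))
           for v in range(levels)]
    return [lut[v] for v in img]
```

The reviewed variants (bi-histogram, class-based, TV-L1) all modify either the histogram this lookup table is built from or the sub-ranges it is applied to.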
The document describes a method for image fusion and optimization using the stationary wavelet transform (SWT) and particle swarm optimization (PSO). Image fusion combines information from multiple images to extract the relevant content of each. The proposed method applies the SWT to the source images to decompose them into wavelet coefficients, uses PSO to optimize the transformed coefficients, and applies the inverse SWT to the optimized coefficients to generate the fused image. The method is tested on various images, with performance evaluated using metrics such as peak signal-to-noise ratio, entropy, mean square error and standard deviation.
IRJET - Multimodal Image Classification through Band and K-Means Clustering - IRJET Journal
This document proposes a bilayer graph-based learning framework for multimodal image classification using limited labeled pixels. It constructs a simple graph in the first layer where each vertex is a pixel and edge weights encode pixel similarity. Unsupervised learning estimates grouping relations among pixels. These relations form a hypergraph in the second layer, on which semisupervised learning classifies pixels to address challenges of complex relationships and limited labels in multimodal images. The framework effectively exploits the underlying data structure.
P.Pangideswara Rao has over 5 years of experience in civil engineering projects. He has a diploma in civil engineering and has worked on projects such as airport runways and power plant construction. His roles have included quality assurance engineering and ensuring construction quality standards are met. He is proficient in software such as AutoCAD and seeks to contribute his skills and commitment to the construction industry.
Optimized Neural Network for Classification of Multispectral Images - IDES Editor
This document summarizes an article that proposes using a multiobjective particle swarm optimization (MOPSO) approach to optimize the structure of an artificial neural network for classifying multispectral satellite images. Specifically, the MOPSO is used to simultaneously select the most discriminative spectral bands from the available options and determine the optimal number of nodes in the hidden layer of the neural network. The MOPSO approach is compared to traditional classifiers like maximum likelihood classification and Euclidean classifiers. The results show that the MOPSO-optimized neural network approach provides superior performance for remote sensing image classification problems.
1) The document discusses various medical image fusion techniques including pixel level, feature level, and decision level fusion.
2) It proposes a novel pixel level fusion method called Iterative Block Level Principal Component Averaging fusion that divides images into blocks and calculates principal components for each block.
3) Experimental results on fusing noise free and noise filtered MR images show that the proposed method performs well in terms of average mutual information and structural similarity compared to other algorithms.
Multilinear Kernel Mapping for Feature Dimension Reduction in Content Based M... - ijma
In content-based multimedia retrieval, multimedia information is processed to obtain descriptive features. Descriptive feature representations result in a huge feature count, which in turn causes processing overhead. To reduce this overhead, various dimensionality reduction approaches have been used, among which PCA and LDA are the most common. However, these methods do not reflect the significance of feature content in terms of the inter-relations among all dataset features. To achieve a dimension reduction based on histogram transformation, features with low significance can be eliminated. This paper proposes a feature dimensionality reduction approach based on multi-linear kernel (MLK) modeling. The experimental work is carried out on a benchmark dataset, and the proposed approach is observed to improve on the conventional system.
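As a point of reference for the PCA baseline mentioned above: the first principal direction of a point cloud can be found by power iteration on its covariance matrix. A toy 2-D sketch (the paper works in much higher dimensions):

```python
def top_principal_component(data, iters=200):
    """First principal component of 2-D points via power iteration
    on the 2x2 covariance matrix."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    # covariance entries of the centred data
    cxx = sum((p[0] - mx) ** 2 for p in data) / n
    cyy = sum((p[1] - my) ** 2 for p in data) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in data) / n
    v = [1.0, 0.0]
    for _ in range(iters):
        # multiply by the covariance matrix, then renormalise
        w = [cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1]]
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]
    return v
```

Projecting onto the top few such directions and discarding the rest is exactly the dimensionality reduction the MLK approach aims to improve on.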
SINGLE IMAGE SUPER RESOLUTION: A COMPARATIVE STUDY - csandit
The majority of applications require high-resolution images to derive and analyze data accurately and easily, and image super resolution plays an effective role in those applications. Image super resolution is the process of producing a high-resolution image from a low-resolution image. In this paper, we study various image super resolution techniques with respect to the quality of results and processing time. This comparative study compares four single-image super-resolution algorithms. For a fair comparison, the compared algorithms are tested on the same dataset and the same platform to show the major advantages of one over the others.
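Comparisons of this kind typically score result quality with peak signal-to-noise ratio (PSNR) against a ground-truth high-resolution image. The paper does not list its exact measures, so as an assumption, here is a small reference implementation for flat pixel lists:

```python
import math

def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio between two equal-length images, in dB."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float('inf')   # identical images
    return 10 * math.log10(peak * peak / mse)
```

Higher is better; identical images score infinity, and each halving of the mean squared error adds about 3 dB.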
This document describes a project that aims to detect backgrounds in images with poor lighting and enhance contrast using morphological operations. The proposed method involves two approaches: 1) dividing the image into blocks and analyzing each block to determine background parameters, and 2) using morphological erosion and dilation with structuring elements to compute minimum and maximum intensity values within windows of the image for background detection and contrast enhancement. The results and conclusions of implementing these methods are then presented.
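The second approach above can be sketched directly: erosion and dilation with a flat structuring element are just windowed minimum and maximum, and their composition (an opening) estimates a background that is then subtracted. A minimal sketch on nested lists, with the window radius `r` as an assumption:

```python
def window_filter(img, op, r=1):
    """Apply op (min = erosion, max = dilation) over a (2r+1)x(2r+1)
    window with a flat structuring element; borders are clamped."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            row.append(op(vals))
        out.append(row)
    return out

def enhance(img, r=1):
    """Subtract the morphological opening (background estimate)."""
    background = window_filter(window_filter(img, min, r), max, r)
    return [[p - b for p, b in zip(prow, brow)]
            for prow, brow in zip(img, background)]
```

The opening removes bright structures smaller than the window, so subtracting it flattens the background and boosts local contrast.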
IRJET - Review of Various Multi-Focus Image Fusion Methods - IRJET Journal
This document provides an overview of multi-focus image fusion methods. It discusses various multi-focus image fusion techniques in both the spatial and frequency domains. It reviews several papers on multi-focus image fusion using different methods like region mosaicking on laplacian pyramid (RMLP), discrete wavelet transform (DWT), principal component analysis (PCA), discrete cosine transform (DCT), and implementation on field programmable gate arrays (FPGAs). The document compares the advantages and issues of the techniques discussed in the reviewed papers. It provides context on applications of image fusion in areas like remote sensing, medical imaging, and more.
The document proposes a new feature descriptor called Local Bit-Plane Wavelet Pattern (LBWP) to improve content-based retrieval of biomedical images like CT and MRI scans. LBWP encodes relationships between pixel intensities in different bit planes and applies a wavelet function, capturing more fine-grained image details than prior methods. Evaluation on a dataset from The Cancer Imaging Archive showed LBWP outperformed existing approaches like Local Wavelet Pattern with higher average retrieval precision, rate, and F-score.
DEEP LEARNING BASED TARGET TRACKING AND CLASSIFICATION DIRECTLY IN COMPRESSIV... - sipij
Past research has found that compressive measurements save data storage and bandwidth. However, compressive measurements are difficult to use directly for target tracking and classification without pixel reconstruction, because the Gaussian random matrix destroys the target location information in the original video frames. This paper summarizes our research effort on target tracking and classification directly in the compressive measurement domain. We focus on one type of compressive measurement based on pixel subsampling: the compressive measurements are obtained by randomly subsampling the original pixels in video frames. Even in this special setting, conventional trackers still do not work well. We propose a deep learning approach that integrates YOLO (You Only Look Once) for multiple target detection and ResNet (residual network) for target classification in low-quality videos. Extensive experiments using optical and mid-wave infrared (MWIR) videos from the SENSIAC database demonstrate the efficacy of the proposed approach.
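The pixel-subsampling measurement model described above is easy to emulate: keep a random fraction of each frame's pixels and discard the rest. A toy sketch on a flattened frame; using `None` to mark missing measurements is an illustrative choice, not the paper's encoding:

```python
import random

def subsample_frame(frame, keep_ratio, seed=0):
    """Keep a fixed random fraction of the pixels in a flattened frame;
    dropped positions become None (missing measurements)."""
    rng = random.Random(seed)
    n = len(frame)
    kept = set(rng.sample(range(n), int(n * keep_ratio)))
    return [v if i in kept else None for i, v in enumerate(frame)]
```

Unlike a dense Gaussian measurement matrix, this scheme preserves the spatial position of every retained pixel, which is what makes detection directly in the measurement domain feasible.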
Image Fusion and Image Quality Assessment of Fused Images - CSCJournals
Accurate diagnosis of tumor extent is important in radiotherapy. This paper presents the fusion of PET and MRI images. Multi-sensor image fusion is the process of combining information from two or more images into a single image; the resulting image contains more information than any individual input. PET delivers high-resolution molecular imaging with a resolution down to 2.5 mm full width at half maximum (FWHM), which allows us to observe the brain's molecular changes using specific reporter genes and probes. The 7.0 T MRI, with sub-millimeter resolution images of the cortical areas down to 250 µm, allows us to visualize the fine details of the brainstem as well as many cortical and sub-cortical areas. The PET-MRI fusion imaging system provides complete information on neurological diseases as well as cognitive neurosciences. The paper presents PCA-based image fusion and also an image fusion algorithm based on the wavelet transform to improve resolution: the two images to be fused are first decomposed into sub-images at different frequencies, information fusion is performed, and the sub-images are reconstructed into a result image with plentiful information. We also propose image fusion in Radon space. The paper assesses image fusion by measuring the quantity of enhanced information in fused images, using entropy, mean, standard deviation, Fusion Mutual Information, cross correlation, Mutual Information, Root Mean Square Error, Universal Image Quality Index and relative shift in mean to compare fused image quality. Comparative evaluation of fused images is a critical step in judging the relative performance of different image fusion algorithms. We also propose an image quality metric based on the human visual system (HVS).
IRJET - Fusion based Brain Tumor Detection - IRJET Journal
1. The document discusses a method for detecting brain tumors using medical image fusion and support vector machines (SVM).
2. It involves fusing two MRI images using SVM to create a single fused image with more information than the original images. Texture and wavelet features are then extracted from the fused image.
3. The SVM classifier classifies the brain tumors as benign or malignant based on the trained and tested features extracted from the fused image.
FUZZY SEGMENTATION OF MRI CEREBRAL TISSUE USING LEVEL SET ALGORITHM - AM Publications
The current study investigated a median filter with the fuzzy level set method to propose fuzzy segmentation of magnetic resonance imaging (MRI) cerebral tissue images. An MRI image was used as an input image. A median filter and fuzzy c-means (FCM) clustering were utilized to remove image noise and create image clusters, respectively. The image clusters showed initial and final cluster centers. The level set method was then used for segmentation after separating and extracting white matter from gray matter. Fuzzy c-means was sensitive to the choice of the initial cluster center. Improper center selection caused the method to produce suboptimal solutions. The proposed algorithm was successfully utilized to segment MRI cerebral tissue images. The algorithm efficiently performed segmentation of test MRI cerebral tissue images compared with algorithms proposed in previous studies.
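The median filtering stage used for denoising above is simple to illustrate: each sample is replaced by the median of its local window, which suppresses impulse noise while preserving edges. A 1-D sketch (the study works on 2-D MRI slices, and the window radius here is an assumption):

```python
def median_filter_1d(signal, r=1):
    """Replace each sample with the median of its (2r+1)-wide window
    (borders clamped): removes impulse noise, keeps edges sharp."""
    n = len(signal)
    out = []
    for i in range(n):
        window = sorted(signal[max(0, i - r):min(n, i + r + 1)])
        out.append(window[len(window) // 2])
    return out
```

An isolated spike never survives, because it can be the median of its window only if the noise spans more than half the window.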
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL - IJCSEIT Journal
The document proposes an approach combining automatic relevance feedback and particle swarm optimization for image retrieval. It constructs a visual feature database from image features like color moments and Gabor filters. For a query image, it retrieves similar images and generates automatic relevance feedback by labeling images as relevant or irrelevant. It then uses particle swarm optimization to re-weight features and retrieve more relevant images over multiple iterations, splitting the swarm in later iterations. An experiment on Corel images over 5 classes showed the approach could effectively retrieve relevant images through this meta-heuristic process without human interaction.
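The particle swarm optimization step that re-weights features can be illustrated with a generic global-best PSO minimizing an arbitrary objective. The inertia and acceleration constants below are conventional textbook choices, not the paper's settings:

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=100, seed=0):
    """Minimal global-best particle swarm optimiser."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5   # inertia, cognitive and social constants
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # velocity pulled toward personal best and global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In the retrieval setting, each particle position would be a vector of feature weights and `f` a retrieval loss built from the automatic relevance labels.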
MIP AND UNSUPERVISED CLUSTERING FOR THE DETECTION OF BRAIN TUMOUR CELLS - AM Publications
Image processing is widely used in biomedical applications. It can be used to analyze different MRI brain images in order to find abnormalities; the objective is to extract meaningful information from the imaged signals. Image segmentation is a process of partitioning an image into different parts, often based on the characteristics of the pixels in the image. In our paper, segmentation of the tumour tissues is carried out using k-means and fuzzy c-means clustering. The tumour can be found, and fast detection is achieved within only a few seconds of execution. The input image of the brain is taken from the available database and the presence of a tumour in the input image can be detected.
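The k-means half of the clustering step reduces, for scalar intensities, to Lloyd's algorithm: alternate nearest-centre assignment with centre re-estimation. A minimal 1-D sketch (the real pipeline clusters MRI pixel data):

```python
def kmeans_1d(pixels, k=2, iters=20):
    """Plain k-means (Lloyd's algorithm) on scalar intensities."""
    lo, hi = min(pixels), max(pixels)
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for _ in range(iters):
        # assign each pixel to the nearest centre
        groups = [[] for _ in range(k)]
        for p in pixels:
            j = min(range(k), key=lambda c: abs(p - centers[c]))
            groups[j].append(p)
        # move each centre to the mean of its group (keep empty centres)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return [min(range(k), key=lambda c: abs(p - centers[c])) for p in pixels]
```

Fuzzy c-means differs only in that each pixel carries soft memberships to all centres instead of one hard assignment.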
The document describes the seven main chakras in the human energy system, plus the crown chakra. For each chakra it lists the associated planet, location in the body, color, symbol, sound, emotion, alignment statement, essential oils, active hours, and mudra or hand position. The chakras range from the Saturn chakra at the base of the spine to the Crown chakra above the head. Each chakra is represented by a different color, symbol, sound and emotion.
The document describes several concepts and techniques of bioclimatic architecture. It explains the solar path and how it affects the radiation received by facades in different seasons. It also describes the mechanisms of heat transfer, heat capacity and thermal inertia, and factors that influence thermal comfort such as air temperature, radiation and humidity.
This document describes a method for determining the amount of atenolol, a hypertension medication, in pharmaceutical products by Fourier-transform infrared (FTIR) spectrometry. The method involves extracting the atenolol from tablets into chloroform, constructing a calibration curve from standards, and comparing samples against the curve to quantify the atenolol content. The method was validated and successfully applied to analyze real samples from different
Netiquette provides norms of behavior to make the Internet and technology a more pleasant place where mutual respect is fostered. Although no police force monitors compliance, netiquette rules have been adopted by users to promote safety and humanity in online communications. Some of these rules include treating others with respect, avoiding spam or sharing dangerous content, and reporting harassment or risky situations rather than being complicit.
This document describes several centrally acting drugs and calcium antagonists used to treat hypertension. Methyldopa and moxonidine are centrally acting antihypertensives, while amlodipine, felodipine, isradipine, nicardipine and nifedipine are calcium channel antagonists. Each has specific indications, doses, adverse effects and safety considerations.
Visual Saliency Model Using SIFT and Comparison of Learning Approaches - csandit
This document discusses a study that aims to develop a visual saliency model to predict where humans look in images. It uses the SIFT feature in addition to low, mid, and high-level image features to train machine learning models on an eye-tracking dataset. Support vector machines (SVM) achieved the best performance, accurately predicting fixations 88% of the time. Including the SIFT feature further improved SVM performance to 91% accuracy. The study evaluates different machine learning methods and determines SVM to be best suited for this binary classification task using high-dimensional image data.
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3... - AnuragVijayAgrawal
This document summarizes a research paper presentation on co-saliency detection approaches. The paper discusses related works on saliency detection in single images and videos. It describes the methodology used, which extracts image features and examines bottom-up cues to create co-saliency maps. Results on a benchmark image pair dataset show state-of-the-art methods achieve high accuracy. In conclusion, co-saliency detection is an emerging field that aims to identify shared salient regions across multiple images, though challenges remain to be addressed.
A Survey On Tracking Moving Objects Using Various Algorithms - IJMTST Journal
Sparse representation has been applied to the object tracking problem. Mining the self-similarities between particles via multitask learning can improve tracking performance. However, some particles may differ from others when they are sampled from a large region, and forcing all particles to share the same structure may degrade the results. To overcome this problem, we propose a tracking algorithm based on robust multitask sparse representation (RMTT) in this letter. When learning the particle representations, we decompose the sparse coefficient matrix into two parts: joint sparse regularization is imposed on one coefficient matrix, while element-wise sparse regularization is imposed on the other. The former regularization exploits the self-similarities of particles, while the latter accounts for the differences between them.
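The two regularizers on the decomposed coefficient matrix can be written down directly: an L2,1 (joint) norm on one part, which zeroes whole rows so particles share a common support, and an element-wise L1 norm on the other, which lets individual entries differ. A sketch of the penalty term; the weights `lam1` and `lam2` are assumptions and the reconstruction term is omitted:

```python
def l21_norm(M):
    """Sum of row-wise l2 norms: joint sparsity, encourages entire
    rows of M to be zero so all particles share the same support."""
    return sum(sum(x * x for x in row) ** 0.5 for row in M)

def l1_norm(M):
    """Element-wise l1 norm: lets individual entries be zeroed
    independently, capturing per-particle differences."""
    return sum(abs(x) for row in M for x in row)

def rmtt_penalty(C, E, lam1=1.0, lam2=1.0):
    """Regularizer on the two parts of the decomposed coefficient matrix."""
    return lam1 * l21_norm(C) + lam2 * l1_norm(E)
```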
Object Classification of Satellite Images Using Cluster Repulsion Based Kerne... - IOSR Journals
Abstract: We investigated the classification of satellite images and multispectral remote sensing data, focusing on uncertainty analysis in the produced land-cover maps. We propose an efficient technique for classifying multispectral satellite images into road, building and green areas using a Support Vector Machine (SVM). Classification is carried out in three modules: (a) preprocessing using Gaussian filtering and conversion from RGB to Lab color space, (b) object segmentation using the proposed cluster repulsion based kernel Fuzzy C-Means (FCM), and (c) classification using a one-to-many SVM classifier. The goal of this research is to provide efficient classification of satellite images using object-based image analysis. The proposed work is evaluated on satellite images and its accuracy is compared with FCM-based classification. The results show that the proposed technique achieves better results, reaching accuracies of 79%, 84%, 81% and 97.9% for road, tree, building and vehicle classification respectively.
Keywords: Satellite image, FCM clustering, Classification, SVM classifier.
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION - ijaia
Most currently known methods treat the person re-identification task as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current datasets for person re-identification are relatively small; under this limitation on the training set, deep convolutional networks are difficult to train adequately, so it is worthwhile to introduce auxiliary datasets to help training. To solve this problem, this paper proposes a novel deep transfer learning method that combines a comparison model with a classification model and multi-level fusion of the convolutional features on the basis of transfer learning. In a multi-layer convolutional network, the features of each layer are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive, it is also partly complementary, so the information gap between different layers of a convolutional neural network can be used to extract a better feature expression. Finally, the proposed algorithm is fully tested on four datasets (VIPeR, CUHK01, GRID and PRID450S), and the obtained re-identification results prove its effectiveness.
When deep learners change their mind: learning dynamics for active learning - Devansh16
Abstract:
Active learning aims to select for annotation the samples that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples based on the certainty of the network's predictions. However, it is well known that neural networks are overly confident about their predictions and are therefore an untrustworthy source for assessing sample informativeness. In this paper, we propose a new informativeness-based active learning method whose measure is derived from the learning dynamics of a neural network: we track the label assignment of the unlabeled data pool during training. We capture the learning dynamics with a metric called label dispersion, which is low when the network consistently assigns the same label to a sample during training and high when the assigned label changes frequently. We show that label dispersion is a promising predictor of the network's uncertainty, and show on two benchmark datasets that an active learning algorithm based on label dispersion obtains excellent results.
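Label dispersion as described (low for a stable label assignment, high for one that changes frequently) can be sketched as one minus the modal frequency of the tracked predictions; this is a plausible reading of the description above, not necessarily the paper's exact formula:

```python
from collections import Counter

def label_dispersion(history):
    """Fraction of epochs whose predicted label differs from the most
    frequent (modal) prediction: 0 = perfectly stable, near 1 = unstable."""
    counts = Counter(history)
    return 1.0 - max(counts.values()) / len(history)
```

Samples with the highest dispersion (the network keeps changing its mind) would be the ones selected for annotation.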
Improving the Accuracy of Object Based Supervised Image Classification using ... - CSCJournals
A lot of research has been undertaken, and is being carried out, to develop an accurate classifier for the extraction of objects, with varying success rates. Most commonly used advanced classifiers are based on neural networks or support vector machines, which use radial basis functions to define the class boundaries. The drawback of such classifiers is that the class boundaries given by radial basis functions are spherical, while the same is not true for the majority of real data: the class boundaries vary in shape, leading to poor accuracy. This paper deals with a new kind of basis function, the cloud basis function (CBF) neural network, which uses a different feature weighting, derived to emphasize features relevant to class discrimination, to improve classification accuracy. Multi-layer feed-forward and radial basis function (RBF) neural networks are also implemented for accuracy comparison. The CBF NN demonstrated superior performance compared to the other activation functions, giving approximately 3% higher accuracy.
International Journal of Computational Engineering Research (IJCER) - ijceronline
This document presents a hybrid methodology for classifying segmented images using both unsupervised and supervised classification techniques. The proposed methodology involves first segmenting the image into spectrally homogeneous regions using region growing segmentation. Then, a clustering algorithm is applied to the segmented regions for initial classification. Selected regions are used as training data for a supervised classification algorithm to further categorize the image. The hybrid approach combines the benefits of unsupervised clustering and supervised classification. The methodology is evaluated on natural and aerial images to compare its performance to existing seeded region growing and texture extraction segmentation methods.
Human Re-identification with Global and Local Siamese Convolution Neural Network - TELKOMNIKA JOURNAL
This document proposes a global and local structure of Siamese Convolution Neural Network (SCNN) to perform human re-identification in single-shot approaches. The network extracts features from global and local parts of input images. A decision fusion technique then combines the global and local features. Experimental results on the VIPeR dataset show the proposed method achieves a normalized Area Under Curve score of 95.75% without occlusion, outperforming using local or global features alone. With occlusion, the score is 77.5%, still better than alternatives. The method performs well for re-identification including in occlusion cases by leveraging both global and local information.
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...IRJET Journal
This document summarizes several relevance feedback techniques used in content-based image retrieval to bridge the semantic gap between low-level visual features and high-level semantic concepts. It reviews subspace learning algorithms like feature adaptation and relevance feedback, probabilistic feature weighting with positive and negative examples, asymmetric bagging and random subspaces for support vector machines, navigation pattern-based relevance feedback, biased discriminative Euclidean embedding, and feature line embedding biased discriminant analysis. The goal of these techniques is to retrieve more semantically relevant images through an iterative feedback process between the user and retrieval system.
IRJET-A Review on Implementation of High Dimension Colour Transform in Domain...IRJET Journal
This document reviews algorithms for detecting salient regions in images using high dimensional color transforms. It summarizes several existing methods that use color contrast, frequency analysis, and superpixel segmentation. A key method discussed creates a saliency map by finding the optimal linear combination of color coefficients in a high dimensional color space. This allows more accurate detection of salient objects versus methods using only RGB color. The performance of this high dimensional color transform method is improved by also utilizing relative location and color contrast between superpixels as learned features.
RANSAC BASED MOTION COMPENSATED RESTORATION FOR COLONOSCOPY IMAGESsipij
Colonoscopy is a procedure that has been used widely to detect the abnormality in a colon. Colonoscopy images suffer from a lot of problems that make it hard for the doctor to investigate/ understand a colon patient. Unfortunately, with the current technology, three is no way for doctors to know if the whole colon surface has been investigated or not. We have developed a method that utilizes RANSAC-based image registration to align sequences of any length in the colonoscopy video and restores each frame of the video using information from these aligned images. We proposed two methods. First method used the deep neural net for the classification of informative and non-informative image. The classification result was used as a preprocessing for alignment method. Also, we proposed a visualization structure for the classification results. The second method used the alignment to decide/classify the bad and good alignment by using two factors. The first factor is the accumulated error and the second factor contain three checking steps that check the pair error alignment beside the geometry transform status. The second method was able to align long sequences.
Review of Image Segmentation Techniques based on Region Merging ApproachEditor IJMTER
Image segmentation is an important task in computer vision and object recognition. Since
fully automatic image segmentation is usually very hard for natural images, interactive schemes with a
few simple user inputs are good solutions. In image segmentation the image is dividing into various
segments for processing images. The complexity of image content is a bigger challenge for carrying out
automatic image segmentation. On regions based scheme, the images are merged based on the similarity
criteria depending upon comparing the mean values of both the regions to be merged. So, the similar
regions are then merged and the dissimilar regions are merged together.
The document discusses using machine learning algorithms and supervised learning methods to develop an automated system for detecting nanoparticles and estimating their size and spatial distribution from scanning electron microscope images. The goal is to enable industrial-scale manufacturing of nanomaterials by applying quality control tools. Specifically, the research uses support vector machines and scale-invariant feature transform to extract features from images and classify pixels as nanorods or background in order to predict locations and dimensions of nanorods.
Image Retrieval using Graph based Visual SaliencyIRJET Journal
This document discusses image retrieval using graph-based visual saliency. It begins with an abstract that describes saliency detection methods and graph-based visual saliency (GBVS), which forms activation maps from image features and normalizes them to highlight salient parts. The purpose is to evaluate GBVS using statistical metrics like precision and recall, and to use genetic algorithms to improve its performance. It then provides background on saliency, different saliency approaches, what graph-based visual saliency is, its advantages and applications. Finally, it reviews several related works on visual saliency models.
This document discusses using wavelet domain saliency maps for secret communication in RGB images. It proposes a method to compute saliency maps using both approximation and detail coefficients from discrete wavelet transforms of the color channels. Higher numbers of secret bits would be embedded in less salient regions according to the saliency map. The saliency map approach is compared to other methods and could make steganography more secure by embedding data in less noticeable image regions.
Object based Classification of Satellite Images by Combining the HDP, IBP and...IRJET Journal
This document presents a method for unsupervised object-based classification of very high resolution panchromatic and multispectral satellite images. It uses hierarchical Dirichlet process (HDP) and Indian buffet process (IBP) to determine color frequencies for different areas after image segmentation using k-means clustering. Support vector machine (SVM) classification is then applied based on the color frequencies to classify the image objects into land cover classes. The hierarchical structure of the model transmits spatial information between image layers to provide cues for classification. The method aims to solve problems with traditional probabilistic topic models and achieve effective unsupervised classification of high resolution satellite images.
Robust Clustering of Eye Movement Recordings for QuantiGiuseppe Fineschi
Characterizing the location and extent of a viewer’s interest, in terms of eye movement recordings, informs a range of investigations in image and scene viewing. We present an automatic data-driven method for accomplishing this, which clusters visual point-of-regard (POR) measurements into gazes and regions-ofinterest using the mean shift procedure. Clusters produced using this method form a structured representation of viewer interest, and at the same time are replicable and not heavily influenced by noise or outliers. Thus, they are useful in answering fine-grained questions about where and how a viewer examined an image.
This document provides an overview of salient object detection techniques, including both traditional and deep learning-based methods. It discusses early models of saliency detection based on cognitive theories of human visual attention. Global contrast and diffusion-based methods for salient object detection are described. The use of fully convolutional neural networks for deep learning-based salient object detection is also covered. Both qualitative and quantitative comparisons of detection techniques are presented. The document concludes by noting improvements in recent models from including edge and context information, but that detection remains challenging across a variety of difficult image scenarios.
Similar to adaptive metric learning for saliency detection base paper (20)
This study Examines the Effectiveness of Talent Procurement through the Imple...DharmaBanothu
In the world with high technology and fast
forward mindset recruiters are walking/showing interest
towards E-Recruitment. Present most of the HRs of
many companies are choosing E-Recruitment as the best
choice for recruitment. E-Recruitment is being done
through many online platforms like Linkedin, Naukri,
Instagram , Facebook etc. Now with high technology E-
Recruitment has gone through next level by using
Artificial Intelligence too.
Key Words : Talent Management, Talent Acquisition , E-
Recruitment , Artificial Intelligence Introduction
Effectiveness of Talent Acquisition through E-
Recruitment in this topic we will discuss about 4important
and interlinked topics which are
Sachpazis_Consolidation Settlement Calculation Program-The Python Code and th...Dr.Costas Sachpazis
Consolidation Settlement Calculation Program-The Python Code
By Professor Dr. Costas Sachpazis, Civil Engineer & Geologist
This program calculates the consolidation settlement for a foundation based on soil layer properties and foundation data. It allows users to input multiple soil layers and foundation characteristics to determine the total settlement.
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...IJCNCJournal
Paper Title
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation with Hybrid Beam Forming Power Transfer in WSN-IoT Applications
Authors
Reginald Jude Sixtus J and Tamilarasi Muthu, Puducherry Technological University, India
Abstract
Non-Orthogonal Multiple Access (NOMA) helps to overcome various difficulties in future technology wireless communications. NOMA, when utilized with millimeter wave multiple-input multiple-output (MIMO) systems, channel estimation becomes extremely difficult. For reaping the benefits of the NOMA and mm-Wave combination, effective channel estimation is required. In this paper, we propose an enhanced particle swarm optimization based long short-term memory estimator network (PSOLSTMEstNet), which is a neural network model that can be employed to forecast the bandwidth required in the mm-Wave MIMO network. The prime advantage of the LSTM is that it has the capability of dynamically adapting to the functioning pattern of fluctuating channel state. The LSTM stage with adaptive coding and modulation enhances the BER.PSO algorithm is employed to optimize input weights of LSTM network. The modified algorithm splits the power by channel condition of every single user. Participants will be first sorted into distinct groups depending upon respective channel conditions, using a hybrid beamforming approach. The network characteristics are fine-estimated using PSO-LSTMEstNet after a rough approximation of channels parameters derived from the received data.
Keywords
Signal to Noise Ratio (SNR), Bit Error Rate (BER), mm-Wave, MIMO, NOMA, deep learning, optimization.
Volume URL: https://airccse.org/journal/ijc2022.html
Abstract URL:https://aircconline.com/abstract/ijcnc/v14n5/14522cnc05.html
Pdf URL: https://aircconline.com/ijcnc/V14N5/14522cnc05.pdf
#scopuspublication #scopusindexed #callforpapers #researchpapers #cfp #researchers #phdstudent #researchScholar #journalpaper #submission #journalsubmission #WBAN #requirements #tailoredtreatment #MACstrategy #enhancedefficiency #protrcal #computing #analysis #wirelessbodyareanetworks #wirelessnetworks
#adhocnetwork #VANETs #OLSRrouting #routing #MPR #nderesidualenergy #korea #cognitiveradionetworks #radionetworks #rendezvoussequence
Here's where you can reach us : ijcnc@airccse.org or ijcnc@aircconline.com
Tools & Techniques for Commissioning and Maintaining PV Systems W-Animations ...Transcat
Join us for this solutions-based webinar on the tools and techniques for commissioning and maintaining PV Systems. In this session, we'll review the process of building and maintaining a solar array, starting with installation and commissioning, then reviewing operations and maintenance of the system. This course will review insulation resistance testing, I-V curve testing, earth-bond continuity, ground resistance testing, performance tests, visual inspections, ground and arc fault testing procedures, and power quality analysis.
Fluke Solar Application Specialist Will White is presenting on this engaging topic:
Will has worked in the renewable energy industry since 2005, first as an installer for a small east coast solar integrator before adding sales, design, and project management to his skillset. In 2022, Will joined Fluke as a solar application specialist, where he supports their renewable energy testing equipment like IV-curve tracers, electrical meters, and thermal imaging cameras. Experienced in wind power, solar thermal, energy storage, and all scales of PV, Will has primarily focused on residential and small commercial systems. He is passionate about implementing high-quality, code-compliant installation techniques.
Supermarket Management System Project Report.pdfKamal Acharya
Supermarket management is a stand-alone J2EE using Eclipse Juno program.
This project contains all the necessary required information about maintaining
the supermarket billing system.
The core idea of this project to minimize the paper work and centralize the
data. Here all the communication is taken in secure manner. That is, in this
application the information will be stored in client itself. For further security the
data base is stored in the back-end oracle and so no intruders can access it.
Accident detection system project report.pdfKamal Acharya
The Rapid growth of technology and infrastructure has made our lives easier. The
advent of technology has also increased the traffic hazards and the road accidents take place
frequently which causes huge loss of life and property because of the poor emergency facilities.
Many lives could have been saved if emergency service could get accident information and
reach in time. Our project will provide an optimum solution to this draw back. A piezo electric
sensor can be used as a crash or rollover detector of the vehicle during and after a crash. With
signals from a piezo electric sensor, a severe accident can be recognized. According to this
project when a vehicle meets with an accident immediately piezo electric sensor will detect the
signal or if a car rolls over. Then with the help of GSM module and GPS module, the location
will be sent to the emergency contact. Then after conforming the location necessary action will
be taken. If the person meets with a small accident or if there is no serious threat to anyone’s
life, then the alert message can be terminated by the driver by a switch provided in order to
avoid wasting the valuable time of the medical rescue team.
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...PriyankaKilaniya
Energy efficiency has been important since the latter part of the last century. The main object of this survey is to determine the energy efficiency knowledge among consumers. Two separate districts in Bangladesh are selected to conduct the survey on households and showrooms about the energy and seller also. The survey uses the data to find some regression equations from which it is easy to predict energy efficiency knowledge. The data is analyzed and calculated based on five important criteria. The initial target was to find some factors that help predict a person's energy efficiency knowledge. From the survey, it is found that the energy efficiency awareness among the people of our country is very low. Relationships between household energy use behaviors are estimated using a unique dataset of about 40 households and 20 showrooms in Bangladesh's Chapainawabganj and Bagerhat districts. Knowledge of energy consumption and energy efficiency technology options is found to be associated with household use of energy conservation practices. Household characteristics also influence household energy use behavior. Younger household cohorts are more likely to adopt energy-efficient technologies and energy conservation practices and place primary importance on energy saving for environmental reasons. Education also influences attitudes toward energy conservation in Bangladesh. Low-education households indicate they primarily save electricity for the environment while high-education households indicate they are motivated by environmental concerns.
Build the Next Generation of Apps with the Einstein 1 Platform.
Rejoignez Philippe Ozil pour une session de workshops qui vous guidera à travers les détails de la plateforme Einstein 1, l'importance des données pour la création d'applications d'intelligence artificielle et les différents outils et technologies que Salesforce propose pour vous apporter tous les bénéfices de l'IA.
3322 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 11, NOVEMBER 2015
Fig. 2. The comparison between low-level features and our SFV feature. (a) input image. (b) saliency map based on low-level features. (c) saliency map
based on SFV. (d) ground truth.
Fig. 3. Pipeline of the adaptive metric learning algorithm. IT [5], GB [19], LR [14], RC [6] are other four saliency methods.
adopted for this purpose in different applications since it takes
into account the covariance information when estimating the
data distributions and improves the performance of learning
methods significantly. To our knowledge, we are the first to successfully formulate saliency detection as a metric learning problem, and our method works well on different databases. We also propose a Superpixel-wise Fisher Vector coding approach which maps low-level features, such as RGB and LAB, to a high-dimensional sparse vector.
Compared with using low-level features directly, the SFV is
more discriminative in challenging environments as shown
in Figure 2. Thus we use SFV features to describe each
superpixel.
In this paper, we adopt an effective feature coding method
and propose a novel metric learning based saliency detection
model, which incorporates both supervised and
semi-supervised information. Our algorithm considers both
the global distribution of the whole training dataset (GML)
and the typical structure of a specific image (SML), and
we successfully fuse them together to extract the clustering
characteristics for estimating the final saliency map. Figure 3
shows the pipeline of our method. First, as an extension of the
traditional Fisher Vector coding [17], Superpixel-wise Fisher
Vector coding is proposed to describe superpixels by learning
the parameters of a Gaussian mixture model (Section III-A).
Second, we train a Generic metric from the training
set (Section III-B1) and apply it to a single image to find
the saliency seeds with the assistance of the superpixel-wise
objectness map generated by [18] (Section III-C).
Third, a Specific metric based on kernel classification is
learnt from the chosen seeds for each image (Section III-B2).
Finally, by integrating the Generic metric and Specific
metric together (Section III-D), we obtain the clustering
information for each superpixel and use it to generate the
final saliency map (Section III-E). The GML and SML maps shown in Figure 3 are intermediate images that are not actually generated when computing saliency maps; they serve only as comparisons to demonstrate the effectiveness of the fused results in Section IV-A. The main contributions of our work include:
• For the first time, two metric learning approaches are applied to saliency detection to learn the optimal distance measure between superpixels. GML is learnt from the global training set while SML is learnt from training samples of the specific image. They are complementary to each other and achieve promising results after the affinity aggregation.
• A Superpixel-wise Fisher Vector coding method is put forward for the first time, which incorporates image contextual information when representing superpixels and makes supervised learning methods more suitable for single-image processing.
• An accurate seed selection method is presented based on the Mahalanobis distance metric. The selected seeds serve as training samples for the Specific metric learning and as reference nodes when evaluating saliency values.
LI et al.: ADAPTIVE METRIC LEARNING FOR SALIENCY DETECTION 3323
Experimental results on various image sets show that our method is comparable with state-of-the-art methods and that the proposed metric learning approaches can be extended to other fields as well.
II. RELATED WORK
Saliency detection has advanced significantly in recent years. Numerous
unsupervised approaches have been proposed under different
theoretical models. Cheng et al. [6] propose a global region
contrast algorithm which simultaneously considers the spatial
coherence across the regions and the global contrast over
the entire image. However, low-level color contrast becomes
invalid when dealing with challenging scenes. Li et al. [20]
compute the dense and sparse reconstruction errors based
on background templates which are extracted from image
boundaries. They propose several integration strategies, such
as multi-scale reconstruction error and Bayesian integration,
which improve the performance of saliency detection
significantly. In [21], boundary connectivity, a robust
background measure, is first applied to saliency detection.
It characterizes the spatial layout of image regions and
provides a specific geometrical explanation to its definition.
Perazzi et al. [22] formulate complete contrast and saliency estimation using high-dimensional Gaussian filters.
They modify SLIC [23] and demonstrate the effectiveness of
their superpixel segmentation approach in detecting salient
objects.
Furthermore, since the sizes and locations of objects are unknown in advance, boundary priors and objectness are often adopted to highlight the salient regions or suppress the background.
Jiang et al. [18] construct saliency by integrating
three visual cues, including uniqueness, focusness and
objectness (UFO), where uniqueness represents color contrast;
focusness indicates the degree of focus, often appearing as the
reverse of blurriness; objectness proposed by Alexe et al. [24]
is the likelihood of a given image window containing an
object. In [25], Wei et al. define the saliency value of
each patch as the shortest distance to the image boundary,
observing that image boundaries are more likely to be the
background. However, this assumption is less convincing,
especially when the scene is challenging.
Compared with unsupervised approaches, supervised methods are comparatively rare. In [26] and [27], Jiang et al.
also propose a multi-scale learning approach, which maps the
regional feature vector to a saliency score and fuses these
scores across multiple levels to generate the final saliency
map. They introduce a novel feature vector, which integrates
the regional contrast, regional property and regional
backgroundness descriptors together, to represent each region
and learn a discriminative random forest regressor to predict
regional scores. Shen and Wu [14] treat an image as the
combination of sparse noises and the low-rank matrix. They
extract low-level features to form high-level priors and then
incorporate the priors to a low-rank matrix recovery model
for constructing the saliency map. However, the saliency
assignment near the object is unsatisfactory due to the ambiguity
of prior maps. Liu et al. [28] formulate the saliency detection
as a partial differential equation problem and solve it under
an adaptive PDE learning framework. They learn the optimal
saliency seeds via discrete submodularity and use the seeds as boundary conditions to solve the linear elliptic system.
Inspired by these works, we construct a metric fusion
framework which contains two complementary metric learning
approaches to generate robust and accurate saliency maps even
in complex scenes. Our method encodes low-level features into
a high-dimensional feature space and incorporates multi-scale
and objectness information when measuring saliency values.
Therefore, our method can uniformly highlight objects with
explicit object boundaries.
III. PROPOSED ALGORITHM
In this section, we present an effective and robust adaptive
metric learning method for visual saliency detection. The
proposed algorithm proceeds through five steps to generate
the final saliency map. Firstly, we extract low-level features
to encode the superpixels generated by the simple
linear iterative clustering (SLIC) [23] algorithm
with a Superpixel-wise Fisher Vector representation.
Secondly, two Mahalanobis distance metric learning
approaches, Generic metric learning and Specific metric learning, are introduced to learn the optimal distance measure
of superpixels. Thirdly, we propose a novel seeds selection
strategy based on the Mahalanobis distance to generate
saliency seeds, which serve as training samples for the Specific metric and as reference nodes when evaluating the saliency values. Fourthly, a metric fusion framework is
presented to fuse the Generic and Specific metrics together.
Finally, we obtain graceful and smooth saliency maps by
combining the spectral clustering and multi-scale information.
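To make the five steps concrete, here is a minimal runnable sketch of how the stages compose. Every component is a deliberately simplified stand-in (a regular grid instead of SLIC superpixels, mean color instead of SFV coding, an identity matrix in place of the learned metrics, image-border superpixels as background seeds), so it illustrates only the data flow of the pipeline, not the paper's actual models.

```python
import numpy as np

def grid_superpixels(image, n_side=8):
    # Step 1 (stub): label pixels on a regular n_side x n_side grid.
    h, w, _ = image.shape
    ys = np.arange(h) * n_side // h
    xs = np.arange(w) * n_side // w
    return ys[:, None] * n_side + xs[None, :]

def encode(image, labels):
    # Step 1 (stub): one feature vector per superpixel
    # (mean color here; SFV coding in the paper).
    n = labels.max() + 1
    feats = np.zeros((n, image.shape[2]))
    for k in range(n):
        feats[k] = image[labels == k].mean(axis=0)
    return feats

def saliency_map(image, metric=None):
    labels = grid_superpixels(image)
    feats = encode(image, labels)
    # Step 2 (stub): the learned Mahalanobis metric; identity here.
    M = np.eye(feats.shape[1]) if metric is None else metric
    # Step 3 (stub): treat border superpixels as background seeds.
    border = np.unique(np.concatenate(
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    # Steps 4-5 (stub): saliency = normalized Mahalanobis distance
    # to the nearest background seed, broadcast back to pixels.
    d = feats[:, None, :] - feats[border][None, :, :]
    dist = np.einsum('ijk,kl,ijl->ij', d, M, d).min(axis=1)
    sal = dist / (dist.max() + 1e-12)
    return sal[labels]
```

Swapping in real SLIC segmentation, SFV features, and the fused Generic/Specific metric at the marked stubs would recover the structure of the full method.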
A. Superpixel-Wise Fisher Vector Coding (SFV)
Appropriate feature coding approaches can effectively
extract main information and remove the redundancies, thus
greatly improving the performance of saliency detection.
Fisher Vector can be regarded as an extension of the
well-known bag-of-words representation, since it captures
the first-order and second-order differences between local
features and the centers of a Mixture of Gaussian Distributions.
Recently, Chen et al. [29] extend Fisher Vector to the point
level image representation for object detection. For a different
purpose, we propose to further extend the FV coding to
superpixel level and experimentally verify the superiority of
our Superpixel-wise Fisher Vector coding method.
Given a superpixel i = {pt, t = 1, . . . , T}, where pt is a d-dimensional image pixel and T is the number of pixels within i, we train a Gaussian mixture model (GMM) $\lambda(p_t) = \sum_{k=1}^{K} \upsilon_k \psi_k(p_t)$ from all the pixels of an image using the Maximum Likelihood (ML) criterion. The parameters of the K-component GMM are defined as λ = {υk, μk, Σk, k = 1, . . . , K}, where υk, μk and Σk are the mixture weight, mean vector and covariance matrix of Gaussian k respectively. Similar to the FV coding method, the SFV representation can be written in a 2dK-dimensional concatenated form:

$$\phi_i = \{\zeta_{\mu_1}, \zeta_{\sigma_1}, \ldots, \zeta_{\mu_K}, \zeta_{\sigma_K}\} \quad (1)$$
where ζμk and ζσk are defined as:

$$\zeta_{\mu_k} = \frac{1}{T\sqrt{\upsilon_k}} \sum_{t=1}^{T} \eta_t(k)\,\frac{p_t - \mu_k}{\sigma_k}, \qquad \zeta_{\sigma_k} = \frac{1}{T\sqrt{\upsilon_k}} \sum_{t=1}^{T} \eta_t(k)\,\frac{1}{\sqrt{2}}\left\{\frac{(p_t - \mu_k)^2}{\sigma_k^2} - 1\right\}$$

where σk is the square root of the diagonal values of Σk, and ηt(k) is the soft assignment of pt to Gaussian k.
The SFV representation ϕi is hereby used to describe
superpixel i in this paper. It has several advantages:
• As an extension of Fisher Vector coding,
SFV successfully realizes superpixel level coding
representation, making Fisher Vector more suitable for
single image processing. Instead of averaging low-level
features of contained pixels, SFV statistically analyzes
the internal feature distribution of each superpixel,
providing a more accurate and reliable representation
for it. Experiments show that our SFV generates smoother and more uniform saliency maps and improves the precision-recall curve by about 2 percent compared with low-level features on the MSRA-1000 database, as shown in Figure 7.
• SFV can be regarded as an adaptive Fisher Vector coding,
since the parameters of the GMM model are trained
on a specific image online. This means even the same
superpixels in different images have different coding
representations. Therefore, our SFV better considers
image contextual information.
• Due to the small number of superpixels in an image and
their disjoint nature, SFV is much faster than existing
state-of-the-art FV variants. Furthermore, besides saliency
detection, SFV can also be applied to other vision tasks,
such as image segmentation and content-aware image
resizing, etc.
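As a concrete illustration of Eqn (1), the following numpy sketch encodes the pixels of one superpixel given GMM parameters with diagonal covariances. The GMM training itself (by maximum likelihood on all pixels of the image) is assumed to have been done elsewhere, and the function name and interface are ours, not the authors'.

```python
import numpy as np

def sfv_encode(pixels, weights, means, variances):
    """Encode one superpixel's pixels (T, d) into the 2dK-dimensional SFV
    {zeta_mu_1, zeta_sigma_1, ..., zeta_mu_K, zeta_sigma_K} of Eqn (1).

    weights (K,), means (K, d), variances (K, d): parameters of a GMM
    with diagonal covariances, pre-trained on all pixels of the image.
    """
    T, d = pixels.shape
    sigma = np.sqrt(variances)                    # sqrt of diag of Sigma_k
    diff = pixels[:, None, :] - means[None, :, :]          # (T, K, d)
    # Soft assignments eta_t(k) of each pixel to each Gaussian component
    log_gauss = -0.5 * np.sum(diff ** 2 / variances
                              + np.log(2 * np.pi * variances), axis=2)
    logit = np.log(weights) + log_gauss                    # (T, K)
    eta = np.exp(logit - logit.max(axis=1, keepdims=True))
    eta /= eta.sum(axis=1, keepdims=True)
    # First- and second-order statistics of Eqn (1)
    norm = (T * np.sqrt(weights))[:, None]                 # (K, 1)
    zeta_mu = (eta[:, :, None] * diff / sigma).sum(axis=0) / norm
    zeta_sigma = (eta[:, :, None] * (diff ** 2 / variances - 1)
                  ).sum(axis=0) / (np.sqrt(2) * norm)
    # Interleave per component: [zeta_mu_k, zeta_sigma_k] for k = 1..K
    return np.hstack([zeta_mu, zeta_sigma]).ravel()
```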
B. Adaptive Metric Learning
Learning a discriminative metric can better distinguish samples from different classes while shortening distances within the same class. Numerous models and methods
have been proposed in the last decade, especially for the
Mahalanobis distance metric learning, such as information
theoretic metric learning (ITML) [30], large margin nearest
neighbor (LMNN) [31], [32], and logistic discriminative based
metric learning (LDML) [33].
However, most existing metric learning approaches learn a
fixed metric for all samples without considering the deeper
structure of the data, thereby breaking down in the presence of
irrelevant or unreliable features. In this paper, we propose an
adaptive metric learning approach, which considers both the
global distribution of the whole training set (GML) and the
specific structure of a single image (SML) to better separate
objects from the background. Our approach can also be viewed
as an integration of a supervised distance metric learning
model (GML) and a semi-supervised distance metric learning
model (SML). Since GML and SML are complementary to
each other, we get promising results after fusing
them together under an affinity aggregation framework
(Section III-D).
1) Generic Metric Learning (GML): Metric learning has
been widely applied to vision tasks, but has never been used for saliency detection because of its long training time, which is infeasible for single-image processing. In this part, we solve
this problem by pre-training a Generic metric Mg from the
first 500 images of MSRA-1000 database using gradient
descent, and we verify, both experimentally and empirically,
that Mg is generally suitable for all images.
First, we construct a training set {ϕi, i = 1, 2, . . . , M} consisting of superpixels extracted from all training images, where ϕi is the SFV representation of superpixel i. To find the most discriminative Mg, we minimize

$$M_g^* = \arg\min_{M_g} \frac{1}{2}\alpha\|M_g\|^2 + \sum_{n}\sum_{\{ij \mid \delta_i^n = 1,\ \delta_j^n = 0\}} D(i, j) \quad (2)$$

$$D(i, j) = \exp\{-(\phi_i - \phi_j)^T M_g (\phi_i - \phi_j)/\sigma_1^2\} \quad (3)$$

where $\delta_i^n$ is an indicator of the ith superpixel in the nth image belonging to the foreground or background, and D(i, j) is the exponential Mahalanobis distance between i and j under the distance metric Mg. We set σ1 = 0.1 to control the strength of distances.
Considering that the background is varied and cluttered, and that different object regions are also distinct from one another, we impose restrictions only on pairwise distances between positive and negative samples, which is more reliable and consistent with the fact that salient objects are always distinct from the background. This minimization aims at maximizing feature distances between foreground and background samples, thereby significantly improving the performance of saliency detection. Eqn 2 can be easily solved by gradient descent. The Generic metric incorporates information from all superpixels in the training images, and is thus appropriate for most images.
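A minimal numpy sketch of solving Eqn (2) by gradient descent might look as follows. The learning rate, iteration count, and the projection of Mg back onto the symmetric positive semi-definite cone are our assumptions; the paper states only that Eqn 2 is solved by gradient descent.

```python
import numpy as np

def learn_generic_metric(fg, bg, alpha=1e-2, sigma1=0.1, lr=1e-3, iters=200):
    """Minimize 0.5*alpha*||Mg||^2 + sum over fg/bg pairs of
    exp(-(phi_i - phi_j)^T Mg (phi_i - phi_j) / sigma1^2)   (Eqn 2-3).

    fg: (Nf, d) SFV vectors of foreground superpixels (delta = 1);
    bg: (Nb, d) SFV vectors of background superpixels (delta = 0).
    """
    d = fg.shape[1]
    Mg = np.eye(d)
    # All foreground-background difference vectors phi_i - phi_j
    delta = (fg[:, None, :] - bg[None, :, :]).reshape(-1, d)   # (P, d)
    for _ in range(iters):
        q = np.einsum('pi,ij,pj->p', delta, Mg, delta)  # quadratic forms
        w = np.exp(-q / sigma1 ** 2)                    # D(i, j) values
        # Gradient of the objective: alpha*Mg - sum_p w_p * d_p d_p^T / s^2
        grad = alpha * Mg - (delta.T * w) @ delta / sigma1 ** 2
        Mg = Mg - lr * grad
        # Keep Mg a valid Mahalanobis metric: symmetric, PSD (our choice)
        Mg = 0.5 * (Mg + Mg.T)
        eigval, eigvec = np.linalg.eigh(Mg)
        Mg = (eigvec * np.clip(eigval, 0.0, None)) @ eigvec.T
    return Mg
```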
2) Specific Metric Learning (SML): Recently,
Wang et al. [34] propose a novel doublet-SVM metric
learning approach based on a kernel classification framework, thereby formulating metric learning as an SVM problem and achieving desirable results with less training time. However,
experiments show that directly applying doublet-SVM
to saliency detection cannot ensure good detection
accuracy. Therefore, we modify this approach by adding
a constraint ω(τ1,τ2), which significantly improves the
performance of the final saliency map.
Let {ϕi, i = 1, 2, . . . , m} be the training dataset, where ϕi is
the SFV representation of a labeled superpixel extracted from
a specific image. The detailed process of extracting labeled
superpixels from an image will be discussed in Section III-C.
We first divide these samples into foreground seeds and
background seeds and label them as 1 and 0 respectively.
Given a training sample ϕi with label hi , we find its q1 nearest
neighbors with the same label and q2 nearest neighbors with
different labels, and then (q1 + q2) doublets are constructed
for it. Each doublet consists of the training sample ϕi and
one of its nearest neighbors. By combining the doublets of
all samples together, a doublet set χ = {x1, x2, . . . , xZ } is
established, where xτ = (ϕτ,1, ϕτ,2), τ = 1, 2, . . . Z is one
of the doublets, and ϕτ,1 and ϕτ,2 are the SFV of superpixel
τ1 and τ2 in doublet xτ , We assign xτ a label as follows:
lτ = −1 if hτ,1 = hτ,2, and lτ = 1 if hτ,1 = hτ,2.
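The doublet construction above can be sketched as follows (a hypothetical helper, not the authors' code; plain Euclidean distances are used to find the nearest neighbours):

```python
import numpy as np

def build_doublets(feats, labels, q1=2, q2=2):
    """Doublet construction (Section III-B2): pair each sample with its
    q1 nearest same-label neighbours (doublet label -1) and its q2
    nearest different-label neighbours (doublet label +1)."""
    doublets, dlabels = [], []
    for i in range(len(feats)):
        dists = np.linalg.norm(feats - feats[i], axis=1)
        same = np.array([j for j in range(len(feats))
                         if j != i and labels[j] == labels[i]], dtype=int)
        diff = np.array([j for j in range(len(feats))
                         if labels[j] != labels[i]], dtype=int)
        for j in same[np.argsort(dists[same])][:q1]:
            doublets.append((i, int(j))); dlabels.append(-1)
        for j in diff[np.argsort(dists[diff])][:q2]:
            doublets.append((i, int(j))); dlabels.append(+1)
    return doublets, np.array(dlabels)
```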
LI et al.: ADAPTIVE METRIC LEARNING FOR SALIENCY DETECTION 3325
As an extension of degree-2 polynomial kernel, we define
the doublet level degree-2 polynomial kernel as:
K_p(x_τ, x_ι) = tr{ω_{(τ1,τ2)}(ϕ_{τ,1} − ϕ_{τ,2})(ϕ_{τ,1} − ϕ_{τ,2})^T ω_{(ι1,ι2)}(ϕ_{ι,1} − ϕ_{ι,2})(ϕ_{ι,1} − ϕ_{ι,2})^T}
             = ω_{(τ1,τ2)} ω_{(ι1,ι2)} {(ϕ_{τ,1} − ϕ_{τ,2})^T (ϕ_{ι,1} − ϕ_{ι,2})}²   (4)

where ω_{(τ1,τ2)} = θ_{(τ1,τ2)} ∗ O_{(τ1,τ2)} is a weight parameter,

θ_{(τ1,τ2)} = 1 − exp{−dist(τ1, τ2)/σ_2}   (5)

O_{(τ1,τ2)} = 1 − exp{−(O_{τ1} − O_{τ2})²/σ_2}   (6)
where dist(τ1, τ2) is the spatial distance between superpixels
τ1 and τ2, and θ_{(τ1,τ2)} is the corresponding exponential spatial
distance. O_{τ1} is the objectness score of superpixel τ1 defined in
Eqn 11, and O_{(τ1,τ2)} is the superpixel-wise objectness distance
between τ1 and τ2. We set σ_2 = 0.1. The weight parameter
ω_{(τ1,τ2)} provides crucial spatial and prior information about the
objects of interest, and is thus more robust for evaluating the
similarity between a pair of superpixels than the feature distance
alone. To determine the similarity of the two samples in a doublet,
we further define a kernel decision function as follows:
E(x) = sgn{Σ_τ α_τ l_τ K_p(x_τ, x) + β}   (7)

where α_τ is the weight of doublet x_τ and β is a bias parameter.
We have

Σ_τ α_τ l_τ K_p(x_τ, x) + β = ω_{(x1,x2)}(ϕ_{x,1} − ϕ_{x,2})^T M_s (ϕ_{x,1} − ϕ_{x,2}) + β   (8)

M_s = Σ_τ α_τ l_τ ω_{(τ1,τ2)}(ϕ_{τ,1} − ϕ_{τ,2})(ϕ_{τ,1} − ϕ_{τ,2})^T   (9)
For ease of computation, we set ω_{(x1,x2)} = 1. The
proposed Specific metric M_s can be easily solved by existing
SVM solvers. Since the Specific metric is trained only on the
test image, it is much faster to obtain than existing metric
learning approaches. According to [34], doublet-SVM
is on average 2000 times faster than ITML [30].
Therefore, it is feasible to train a Specific metric for each
image to better distinguish its objects from the background.
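Given the dual coefficients α_τ returned by an SVM solver, Eqn 9 assembles M_s as a weighted sum of rank-one matrices. A minimal sketch (hypothetical helper; the solver output is supplied as inputs):

```python
import numpy as np

def specific_metric(feats, doublets, dlabels, alphas, omegas):
    """Assemble the Specific metric via Eqn 9:
    Ms = sum_tau alpha_tau * l_tau * omega_tau
         * (phi_t1 - phi_t2)(phi_t1 - phi_t2)^T."""
    dim = feats.shape[1]
    Ms = np.zeros((dim, dim))
    for (i, j), l, a, w in zip(doublets, dlabels, alphas, omegas):
        d = feats[i] - feats[j]          # SFV difference of the doublet
        Ms += a * l * w * np.outer(d, d) # rank-one contribution
    return Ms
```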
In this part, we propose two metric learning approaches:
GML and SML. The first focuses on the global distribution
of the whole training set, while the second explores the deeper
structure of a specific image. GML can be pretrained offline and
is generally suitable for all images, while SML is much faster,
since it can be solved by existing SVM solvers. Note that the
image-specific metric is not always better than the Generic metric,
as it has fewer training samples and less reliable labels. Instead,
the two metrics are complementary and can be fused to improve
the final detection results.
C. Iterative Seeds Selection by Mahalanobis Distance (ISMD)
As a preliminary criterion of saliency detection, saliency
seeds directly influence the performance of seeds-based
solutions. Recently, Liu et al. [28] propose an optimal
seeds selection strategy via submodularity: by adding a stopping
criterion, the submodular problem can be solved and the
optimal seed set obtained accordingly. In [35], Lu et al.
learn optimal seeds by combining bottom-up saliency maps
and mid-level vision cues. Inspired by their work, we propose
a compact but efficient iterative seeds selection scheme based
on Mahalanobis distance assessment (ISMD).
Alexe et al. [24] present a novel objectness method to
measure the likelihood of a given image window containing
an object. Jiang et al. [18] extend the original objectness to
Pixel-level Objectness O(p) and Region-level Objectness Oi
by defining:
O(p) = Σ_{w=1}^{W} P(w)   (10)

O_i = (1/T) Σ_{p∈i} O(p)   (11)

where W is the number of sampling windows that contain
pixel p, P(w) is the probability score of the wth window,
and T is the number of pixels within region i. We redefine the
region-level objectness as superpixel-wise objectness in this
paper.
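Eqns 10-11 can be sketched as follows (hypothetical helpers; `windows` and `scores` stand in for the sampled windows and probabilities of the objectness measure of [24]):

```python
import numpy as np

def pixel_objectness(shape, windows, scores):
    """Pixel-level objectness, Eqn 10: O(p) sums the scores P(w) of all
    sampled windows w that contain pixel p.
    windows: (r0, r1, c0, c1) half-open boxes; scores: P(w) per window."""
    O = np.zeros(shape)
    for (r0, r1, c0, c1), p in zip(windows, scores):
        O[r0:r1, c0:c1] += p              # every covered pixel gets P(w)
    return O

def region_objectness(O, mask):
    """Superpixel-wise objectness, Eqn 11: mean of O(p) over the region."""
    return float(O[mask].mean())
```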
Motivated by the fact that highlights of the superpixel-wise
objectness map are more likely to be foreground seeds,
a set of initial foreground seeds is constructed from the lightest
two percent of regions of the objectness map. Considering
that the background is massive and scattered, we pick several
of the lowest objectness values from each boundary of the
superpixel-wise objectness map as initial background seeds.
The intuition is that if superpixel i is a foreground seed, the
ratio of its distances to the foreground seeds and to the background
seeds should be small. We formulate the ratio as follows:

ψ_i = Σ_{fs} d_rat(i, fs) / Σ_{bs} d_rat(i, bs)   (12)

where

d_rat(i, fs) = φ(i, fs)(ϕ_i − ϕ_fs) M_g (ϕ_i − ϕ_fs)^T   (13)

is the Mahalanobis distance between superpixel i and one
of the foreground seeds fs under the Generic metric M_g, and
φ(i, fs) = d(i, fs) ∗ O(i, fs) is a weight parameter, where

d(i, fs) = exp(−dist²(i, fs)/σ_2)   (14)

is another exponential spatial distance between superpixel i
and fs. Only when ψ_i ≤ Γ_0 or ψ_i ≥ Γ_1 can superpixel i be
added to the foreground seed set or the background seed set,
respectively, where Γ_0 and Γ_1 are two thresholds. With the
seeds newly added each time, we iterate this process N_1 times.
Since most of the area in an image belongs to the back-
ground, in order to generate more background seeds, the
iteration continues N_2 more times, but only selects seeds
3326 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 11, NOVEMBER 2015
Fig. 4. Iterative seeds selection by Mahalanobis distance. Initial saliency
seeds are first selected from the lightest and the darkest parts of the superpixel-
wise objectness map. By computing the Mahalanobis distance between any
superpixel and the chosen seeds, we iteratively increase the foreground and
background seeds.
with Σ_{bs} d_rat(i, bs) ≤ Γ_2, where Γ_2 is a threshold. Then we
obtain the final seed set, as illustrated in Figure 4.
As elaborated in Section III-B2, the Specific metric Ms
can be learnt from the labeled seeds via doublet-SVM.
One may be concerned that M_s will rely too heavily on M_g, since the
labeled seeds are generated under M_g. Fortunately, by learning
a generally suitable metric, we can enforce a very high seed
accuracy (98.82% on the MSRA-1000 database), which means the
seeds-based Specific metric is reliable enough to measure the
distance.
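The selection loop of Section III-C can be sketched as follows (a simplified, hypothetical implementation: the spatial/objectness weight φ(i, fs) of Eqn 13 and the extra background-only iterations are omitted, and the thresholds are illustrative):

```python
import numpy as np

def ismd(feats, Mg, fg, bg, t_fg=0.5, t_bg=2.0, n_iter=3):
    """Iterative seeds selection by Mahalanobis distance (simplified):
    a superpixel joins the foreground seeds when its foreground-to-
    background distance ratio (Eqn 12) is small, and the background
    seeds when it is large; newly added seeds feed the next iteration."""
    fg, bg = set(fg), set(bg)
    for _ in range(n_iter):
        for i in range(len(feats)):
            if i in fg or i in bg:
                continue
            def dist(j):
                d = feats[i] - feats[j]
                return float(d @ Mg @ d)   # Mahalanobis distance under Mg
            ratio = sum(dist(j) for j in fg) / max(sum(dist(j) for j in bg), 1e-12)
            if ratio <= t_fg:
                fg.add(i)
            elif ratio >= t_bg:
                bg.add(i)
    return fg, bg
```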
D. Metric Fusion for Extracting Spectral
Clustering Characteristics
Appropriately aggregating several affinity matrices may
enhance the relevant and useful information while alleviating
the irrelevant and unreliable information. Spectral clustering
is an important unsupervised clustering algorithm for
transferring a feature representation into a more discriminative
indicator space, a property we call "spectral clustering
characteristics". Spectral clustering has been applied in many
fields for its effective and outstanding performance.
In this section, we merge metric fusion into a spectral
clustering feature extraction process [36] and learn the
optimal aggregation weight for each affinity matrix. The fusion
strategy significantly improves the saliency detection results,
as shown in Figure 5. Based on the two metrics learnt above,
two affinity matrices Π_g and Π_s are constructed, with
(i, j)th elements

π_{i,j}^g = exp{−φ(i, j)(ϕ_i − ϕ_j) M_g (ϕ_i − ϕ_j)^T/σ_3}

π_{i,j}^s = exp{−φ(i, j)(ϕ_i − ϕ_j) M_s (ϕ_i − ϕ_j)^T/σ_3}   (15)
where σ_3 = 0.1. The affinity aggregation strategy aims
at finding the optimal clustering characteristic vector u of
all the superpixels in an image and the weight parameter
ϑ = [ϑ_g, ϑ_s]^T associated with Π_g and Π_s, so the fusion
Fig. 5. Evaluation of metrics. (a) input images. (b) Generic metric.
(c) Specific metric. (d) fused results. (e) ground truth.
problem can be formulated as:

min_{ϑ_g, ϑ_s, u_1,...,u_r} { Σ_{i,j} ϑ_g² π_{i,j}^g ‖u_i − u_j‖² + Σ_{i,j} ϑ_s² π_{i,j}^s ‖u_i − u_j‖² }
= min_{ϑ_g, ϑ_s, u_1,...,u_r} { ϑ_g² u^T(H_g − Π_g)u + ϑ_s² u^T(H_s − Π_s)u }
= min_{ϑ_g, ϑ_s} (β_g ϑ_g² + β_s ϑ_s²)   (16)
where u_i is the clustering characteristic indicator of
superpixel i, r is the number of superpixels in an image, and
H_g = diag{h_11, . . . , h_rr} is the diagonal degree matrix of Π_g
with diagonal elements h_ii = Σ_j π_{i,j}^g, so that
β_g = u^T(H_g − Π_g)u. To solve this problem, we first employ
two constraints: the normalized weight constraint ϑ_g + ϑ_s = 1
and the normalized spectral clustering constraint u^T H u = 1.
By fixing ϑ, the clustering characteristic vector u can be easily
obtained using standard spectral clustering. If u is given,
Eqn 16 can be formulated as:
min_{ϑ_g, ϑ_s} (β_g ϑ_g² + β_s ϑ_s²) = min_{μ_g, μ_s} (ρ_g μ_g² + ρ_s μ_s²)   (17)

subject to

μ_g² + μ_s² = 1,   μ_g/√α_g + μ_s/√α_s = 1   (18)

where α_g = u^T H_g u, ρ_g = β_g/α_g, and μ_g = √α_g ϑ_g. This can be
easily solved by existing 1D line-search methods.
To summarize, metric fusion finds the optimal clustering
characteristic vector and the optimal weight parameter ϑ via
a two-step iterative strategy. Since the affinity matrices
incorporate φ(i, j) in Eqn 15, convergence is very fast, about
three iterations per image. We use the indicator representation
to compute saliency maps (Section III-E).
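The weight-update step of this two-step strategy reduces Eqns 17-18 to a one-dimensional search. A minimal grid-based sketch (hypothetical helper, not the authors' implementation; a finer line search could replace the grid):

```python
import numpy as np

def fuse_weights(beta_g, beta_s, alpha_g, alpha_s, n_grid=20001):
    """Solve Eqns 17-18 by a 1D search: parametrise the circle constraint
    as mu_g = cos t, mu_s = sin t, keep the points that also satisfy the
    line constraint mu_g/sqrt(alpha_g) + mu_s/sqrt(alpha_s) = 1, and
    return the weights theta = mu/sqrt(alpha) at the minimiser of Eqn 17."""
    rho_g, rho_s = beta_g / alpha_g, beta_s / alpha_s
    t = np.linspace(0.0, 2.0 * np.pi, n_grid)
    mg, ms = np.cos(t), np.sin(t)
    feasible = np.abs(mg / np.sqrt(alpha_g) + ms / np.sqrt(alpha_s) - 1) < 1e-3
    obj = np.where(feasible, rho_g * mg**2 + rho_s * ms**2, np.inf)
    k = int(np.argmin(obj))
    return mg[k] / np.sqrt(alpha_g), ms[k] / np.sqrt(alpha_s)
```

Note how the metric with the smaller residual β receives the larger fused weight, which matches the intuition that the more reliable affinity matrix should dominate.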
E. Context-Based Multi-Scale Saliency Detection
In this section, we propose a context-based multi-scale
saliency detection algorithm to compute the saliency map for
each image. Lacking knowledge of the sizes of objects, we first
generate superpixels at S different scales. Then the K-means
algorithm is applied at each scale to segment an image into
Fig. 6. The distribution of saliency values of ground truth foregrounds and backgrounds. (a) Generic metric on MSRA-1000. (b) Specific metric on
MSRA-1000. (c) AML on MSRA-1000. (d) AML on MSRA-5000.
N clusters via their SFV features. According to the intuition
that a superpixel is salient if its cluster neighbors are close
to the foreground seeds and far from the background seeds,
we define the distance between superpixel i and saliency seeds
in scale s as:
D_{i,fs}^(s) = Σ_{q=1}^{fn(s)} { γ‖u_i − u_q‖ + (1 − γ) Σ_{j=1}^{N_c^(s)} W_{i,j} ‖u_j − u_q‖ }

D_{i,bs}^(s) = Σ_{q=1}^{bn(s)} { γ‖u_i − u_q‖ + (1 − γ) Σ_{j=1}^{N_c^(s)} W_{i,j} ‖u_j − u_q‖ }   (19)

where

W_{i,j} = Q_1 exp{−dist(i, j)/σ_2} ∗ Q_2 exp{−(O_i − O_j)²/σ_2}   (20)
is the weighted distance between superpixel i and its cluster
neighbor j, u_i is the clustering characteristic indicator
of superpixel i, and fn and bn are the numbers of foreground
and background seeds chosen by our ISMD seeds selection
approach. Q_1, Q_2 and γ are weight parameters, and N_c is the
number of cluster neighbors of superpixel i. The saliency
value of superpixel i can be formulated as:
sal(i) = Σ_{s=1}^{S} ν_s ∗ exp(O_i) / (1 + {1 − exp(−D_{i,fs}^(s)/σ_4)}/D_{i,bs}^(s))

       = Σ_{s=1}^{S} ν_s ∗ exp(O_i) ∗ D_{i,bs}^(s) / (D_{i,bs}^(s) + 1 − exp(−D_{i,fs}^(s)/σ_4))   (21)
where ν_s is the weight of scale s and σ_4 = 0.1.
Considering all the other superpixels in the same cluster,
together with multiple scales, smooths the saliency map
effectively and makes our approach more robust in
complicated scenes.
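Once the per-scale seed distances are available, Eqn 21 is straightforward to evaluate; a sketch (hypothetical helper, with the distances and objectness score supplied as inputs):

```python
import numpy as np

def saliency(O_i, D_fs, D_bs, nu, sigma4=0.1):
    """Saliency of superpixel i (Eqn 21), summed over scales s:
    nu_s * exp(O_i) * D_bs / (D_bs + 1 - exp(-D_fs / sigma4))."""
    D_fs, D_bs, nu = (np.asarray(a, dtype=float) for a in (D_fs, D_bs, nu))
    return float(np.sum(nu * np.exp(O_i) * D_bs
                        / (D_bs + 1.0 - np.exp(-D_fs / sigma4))))
```

The form makes the intended behaviour visible: saliency grows with the distance to background seeds and shrinks as the distance to foreground seeds grows.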
IV. EXPERIMENTS
We evaluate the proposed method on four benchmark
datasets. The first one is MSRA-1000 [13], a subset
of MSRA-5000, which has been widely used in previous
works for its accurate human-labelled masks. The second
one is the MSRA-5000 dataset [15], which includes 5000 more
comprehensive images. The third one is THUS-10000 [37],
which consists of 10000 images, each with an unambiguous
salient object and pixel-wise ground truth labeling. The last
Fig. 7. (a) Precision-recall curve for Generic metric, Specific metric, and
fused results without neighbor smoothness (MSRA-1000 and Berkeley-300).
Precision-recall curve based on SFV and low-level features. Precision-recall
curve for other two fusion methods. (b) Images of fused results based on SFV
and low-level features.
one is Berkeley-300 [38] which contains more challenging
scenes with multiple objects of different sizes and locations.
Since we have already used the first 500 images
of MSRA-1000 for training, we evaluate our algorithm
and compare it with other methods on the remaining 500 images
of MSRA-1000, on 4500 images of MSRA-5000 excluding the
500 training images (MSRA-5000 contains all the images of
MSRA-1000), on 9501 images of THUS-10000 (THUS-10000
contains 499 training images), and on Berkeley-300.
A. Evaluation of Metrics
We perform several comparative experiments, as shown
in Figure 5, Figure 6 and Figure 7(a), to demonstrate the
efficiency of the Generic metric (GML), the Specific metric (SML),
and their combination (AML based on SFV). To eliminate
the influence of the neighbor smoothness in Eqn 19 when
comparing metrics, we compute only the distance between each
superpixel and the seeds, instead of the sum of weighted distances
over its cluster neighbors:
D_{i,fs}^(s) = Σ_{q=1}^{fn(s)} ‖u_i − u_q‖,   D_{i,bs}^(s) = Σ_{q=1}^{bn(s)} ‖u_i − u_q‖   (22)
The precision-recall curves of the Generic metric and the Specific
metric are almost the same, but their combination outperforms
both. We also tried adding or multiplying the saliency maps
generated by the two metrics directly, but the resulting PR curves
are much lower than those of our fusion approach in Figure 7(a). This
is consistent with our motivation: Mg is trained from the
Fig. 8. Results of different methods. (a), (b) Precision-recall curves on MSRA-1000. (c) Average precisions, recalls, F-measures and AUC on MSRA-1000.
(d), (e) Precision-recall curves on MSRA-5000. (f) Average precisions, recalls, F-measures and AUC on MSRA-5000.
Fig. 9. Results of different methods. (a), (b) Precision-recall curves on THUS-10000. (c) Average precisions, recalls, F-measures and AUC on THUS-10000.
(d), (e) Precision-recall curves on Berkeley-300. (f) Average precisions, recalls, F-measures and AUC on Berkeley-300.
whole training dataset, containing the global distribution of the
data, and Ms aims at a single image, considering the specific
structure of samples.
Figure 5 demonstrates that the fused results significantly
remove the light saliency values in the background regions
produced by GML and SML. Since most steps in computing
saliency maps under different metrics are the same,
e.g., the objectness prior map and seeds selection, it is reasonable
that Figure 5 (b) and (c) are similar, though there are still
differences between them. To further prove this, we conduct an
extra experiment as shown in Figure 11. The second row shows the
results generated by fusing the GML with itself, the third row the
results generated by fusing the SML with itself, and the fourth row
those obtained by fusing the GML and SML. We call them GG, SS,
and AML respectively. Limited by the image resolution, some
differences between the GML and SML may not be visible in
Figure 5, but integrating each metric with itself apparently
enlarges their distinctiveness. Furthermore, if one metric is
incorrect, the other can compensate for it. The SS performs better
than the GG in Figure 11 (a)-(e), while the GG is better in (f)-(g),
and the AML tends to take the best of both, which demonstrates that
the GML and SML are indeed complementary to each other and improve
the performance of saliency detection after fusion.
Figure 11 (k)-(m) show that if both the GML and SML produce
bad results, the fused results are still bad.
In addition, we plot the distribution of saliency values in
Figure 6. Ground truth masks provide a specific label, 1 or 0,
for each pixel, and we regard a superpixel as foreground when
more than 80% of its pixels are labelled 1; otherwise, the
superpixel is background. We put all the foreground superpixels
from the whole dataset together and plot the distribution of
their saliency values computed by different saliency methods as
the red line; the blue line is the distribution of saliency values
of the background superpixels. Figure 6(a), (b), (c) show the
saliency distributions produced by GML, SML and AML on
MSRA-1000 respectively, and Figure 6(d) shows AML on
MSRA-5000. AML is better than GML and SML, since its
background saliency values are closer to 0.
Furthermore, our Generic metric is robust across databases.
We apply the metric trained on MSRA-1000 to
all the databases, including MSRA-1000, MSRA-5000,
THUS-10000, and Berkeley-300. As shown in
Figure 8 and Figure 9, the results are still promising even
on different databases, which demonstrates the effectiveness
and adaptiveness of our Generic metric. Overall, the fused
results based on two outstanding and complementary metrics
achieve higher precision and recall values and generate more
accurate saliency maps.
B. Evaluation of Superpixel-Wise Fisher Vector
We have mentioned that our Superpixel-wise Fisher Vector
coding approach can improve the performance of saliency
detection by capturing the average first-order and second-order
differences between local features and the centers of a Mixture
of Gaussian Distributions. In experiments, we extract the
low-level RGB and LAB features to learn a 12D SFV
representation for each superpixel (D = 6, K = 1,
2DK = 12). Figure 7(a) shows the efficiency of our
SFV coding approach by comparing the precision-recall curves
of low-level features and the SFV on MSRA-1000 database.
Figure 7(b) shows the corresponding images.
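A minimal sketch of the coding step under these settings (a single Gaussian with diagonal covariance, K = 1; the gradient formulas follow the standard Fisher vector of [17], and the paper's normalization details are omitted):

```python
import numpy as np

def sfv(local_feats, mu, sigma):
    """Superpixel-wise Fisher vector for a single Gaussian (K = 1):
    concatenates the averaged first-order and second-order differences
    between the local features and the Gaussian centre, giving a
    2*D-dimensional code (2*6 = 12 for the 6D RGB+LAB features)."""
    X = (np.asarray(local_feats) - mu) / sigma            # standardised residuals
    g_mu = X.mean(axis=0)                                 # first-order statistics
    g_sigma = ((X**2 - 1.0) / np.sqrt(2.0)).mean(axis=0)  # second-order statistics
    return np.concatenate([g_mu, g_sigma])
```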
C. Evaluation of Saliency Maps
We compare the proposed saliency detection model with
several state-of-the-art methods: IT [5], GB [19], FT [13],
Fig. 10. The comparison of previous methods, our algorithm and ground truth. (a) Test image. (b) IT [5]. (c) GB [19]. (d) GC [39].
(e) CB [44]. (f) UFO [18]. (g) Proposed. (h) Ground truth.
GC [39], UFO [18], SVO [40], HS [41], PD [42], AMC [43],
RCJ [37], DSR [20], DRFI [26], CB [44], RC [6], LR [14]
and XL [45]. We use the source codes provided by the
authors or implement the methods based on the available
codes or software.
We conduct several quantitative comparisons with typical
saliency detection methods. Figure 8(a), (b), (d) and (e)
show that the proposed AML is comparable with most
state-of-the-art methods on the MSRA-1000 and MSRA-5000
databases. Figure 8(c) and (f) compare the average precision,
recall, F-measure and AUC. We use AUC as an evaluation
criterion, since it represents the area under the PR curve
and effectively reflects the global properties of different
algorithms.
algorithms. Instead of using the bounding boxes to evaluate
the saliency detection performances on MSRA-5000 database,
we adopt the accurate human-labeled masks provided by [26]
to ensure more reliable comparative results. We also perform
experiments on THUS-10000 and Berkeley-300 databases
as shown in Figure 9. Precision-recall curves show that
AML reaches 97.4%, 94.0%, 96.5%, 81.5% precision rate on
MSRA-1000, MSRA-5000, THUS-10000, and Berkeley-300
respectively. All of them demonstrate the efficiency of our
method.
Figure 10 shows sample results of five previous
approaches and our AML algorithm. The IT and GB methods
are capable of finding the salient regions in most cases, but
they tend to highlight boundaries and miss much of the object
information because of the blurriness of their saliency maps. The
GC method cannot capture all the salient pixels and often
mislabels small background patches as salient regions. The
CB and UFO models can highlight objects uniformly, but
they fail on challenging scenes. Our method can catch both
small and large salient objects even in complex environments.
In addition, we can highlight objects uniformly with accurate
boundaries, regardless of the number and locations of the
salient objects.
We also test the average computational cost on different
datasets: 18.15s on MSRA-1000, 18.42s on MSRA-5000,
17.90s on THUS-10000 and 18.78s on Berkeley-300. The
proposed algorithm is implemented in MATLAB on a PC
with an Intel i7-3370 CPU (3.4 GHz) and 32 GB memory.
D. Evaluation of Selected Seeds
We train an effective Specific metric based on the
assumption that the selected seeds are correct. In experiments,
Fig. 11. Example results of different metrics. The first line is the input images, the second line is the results generated by fusing the GML with itself, the
third line is the results generated by fusing the SML with itself, the fourth line is obtained by fusing the GML and SML, and the last line is the ground truth
images.
we cannot ensure that the chosen seeds are completely
accurate, but we can enforce a very high seeds accuracy. The
accuracy of selected seeds is defined as follows:
sa = (fsc + bsc)/(fst + bst) = (fsc + bsc)/{(fsc + fsic) + (bsc + bsic)}   (23)
where

fsc = Σ_n Σ_i (gt_i^n & seed_i^n)

bsc = Σ_n Σ_i (¬gt_i^n & ¬seed_i^n)   (24)
Here, i indexes the ith superpixel extracted from the nth image
of a given database, and gt_i^n and seed_i^n are the ground truth
label and the label assigned by our seeds selection mechanism
for superpixel i. The
accuracy rates of four databases are: 0.9882 on MSRA-1000,
0.9769 on MSRA-5000, 0.9822 on THUS-10000 and 0.8874
on Berkeley-300. We experimentally verify that the seeds are
accurate enough to generate a reliable Specific metric for
each image.
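The accuracy computation of Eqns 23-24 can be sketched as follows (hypothetical helper; labels are booleans with True denoting foreground):

```python
import numpy as np

def seeds_accuracy(gt, seed):
    """Seed-selection accuracy, Eqns 23-24: the fraction of selected
    seeds whose label agrees with the ground truth."""
    gt, seed = np.asarray(gt, dtype=bool), np.asarray(seed, dtype=bool)
    fsc = int(np.sum(gt & seed))     # correctly labelled foreground seeds
    bsc = int(np.sum(~gt & ~seed))   # correctly labelled background seeds
    return (fsc + bsc) / gt.size
```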
V. CONCLUSION
In this paper, we propose two Mahalanobis distance metric
learning models and a superpixel-wise Fisher vector
representation for visual saliency detection. To our
knowledge, we are the first to apply metric learning to
saliency detection and to employ a metric fusion mechanism
to improve the detection accuracy. Different from previous
methods, we adopt a new feature coding strategy and make
the supervised metric learning more suitable for single image
processing. In addition, we propose an accurate seeds selection
method based on the Mahalanobis distance measure to train
the Specific metric and construct the final saliency map.
We estimate the saliency value of each superpixel from a
multi-scale view and include the contextual information when
computing it. Experimental results with sixteen state-of-the-art
algorithms on four benchmark image databases demonstrate
the efficiency of our metric learning approach and the saliency
detection model. In the future, we plan to explore more robust
object detection approaches to further improve the accuracy
of saliency detection.
REFERENCES
[1] C. Siagian and L. Itti, “Rapid biologically-inspired scene classification
using features shared with visual attention,” IEEE Trans. Pattern Anal.
Mach. Intell., vol. 29, no. 2, pp. 300–312, Feb. 2007.
[2] H. Liu, X. Xie, X. Tang, Z.-W. Li, and W.-Y. Ma, “Effective browsing
of Web image search results,” in Proc. 6th ACM SIGMM Int. Workshop
Multimedia Inf. Retr., 2004, pp. 84–90.
[3] C. Christopoulos, A. Skodras, and T. Ebrahimi, “The JPEG2000 still
image coding system: An overview,” IEEE Trans. Consum. Electron.,
vol. 46, no. 4, pp. 1103–1127, Nov. 2000.
[4] Y. Niu, F. Liu, X. Li, and M. Gleicher, “Warp propagation for video
resizing,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2010,
pp. 537–544.
[5] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual
attention for rapid scene analysis,” IEEE Trans. Pattern Anal. Mach.
Intell., vol. 20, no. 11, pp. 1254–1259, Nov. 1998.
[6] M.-M. Cheng, G.-X. Zhang, N. J. Mitra, X. Huang, and S.-M. Hu,
“Global contrast based salient region detection,” in Proc. IEEE Conf.
Comput. Vis. Pattern Recognit., Jun. 2011, pp. 409–416.
[7] Y. Xie, H. Lu, and M.-H. Yang, “Bayesian saliency via low and mid
level cues,” IEEE Trans. Image Process., vol. 22, no. 5, pp. 1689–1698,
May 2013.
[8] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang, “Saliency detection
via graph-based manifold ranking,” in Proc. IEEE Conf. Comput. Vis.
Pattern Recognit., Jun. 2013, pp. 3166–3173.
[9] J. Sun, H. Lu, and X. Liu, “Saliency region detection based on Markov
absorption probabilities,” IEEE Trans. Image Process., vol. 24, no. 5,
pp. 1639–1649, May 2015.
[10] Y.-F. Ma and H.-J. Zhang, “Contrast-based image attention analysis by
using fuzzy growing,” in Proc. 11th ACM Int. Conf. Multimedia, 2003,
pp. 374–381.
[11] J. Sun, H. Lu, and S. Li, “Saliency detection based on integration
of boundary and soft-segmentation,” in Proc. IEEE Int. Conf. Image
Process., Sep./Oct. 2012, pp. 1085–1088.
[12] X. Hou and L. Zhang, “Saliency detection: A spectral residual approach,”
in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2007,
pp. 1–8.
[13] R. Achanta, S. Hemami, F. Estrada, and S. Süsstrunk, “Frequency-tuned
salient region detection,” in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit. (CVPR), Jun. 2009, pp. 1597–1604.
[14] X. Shen and Y. Wu, “A unified approach to salient object detection via
low rank matrix recovery,” in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit., Jun. 2012, pp. 853–860.
[15] T. Liu et al., “Learning to detect a salient object,” IEEE Trans. Pattern
Anal. Mach. Intell., vol. 33, no. 2, pp. 353–367, Feb. 2011.
[16] J. Yang and M.-H. Yang, “Top-down visual saliency via joint CRF and
dictionary learning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
Jun. 2012, pp. 2296–2303.
[17] J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek, “Image
classification with the Fisher vector: Theory and practice,” Int. J.
Comput. Vis., vol. 105, no. 3, pp. 222–245, 2013.
[18] P. Jiang, H. Ling, J. Yu, and J. Peng, “Salient region detection by UFO:
Uniqueness, focusness and objectness,” in Proc. IEEE Int. Conf. Comput.
Vis., Dec. 2013, pp. 1976–1983.
[19] J. Harel, C. Koch, and P. Perona, “Graph-based visual saliency,” in Proc.
Adv. Neural Inf. Process. Syst., 2006, pp. 545–552.
[20] X. Li, H. Lu, L. Zhang, X. Ruan, and M.-H. Yang, “Saliency detection
via dense and sparse reconstruction,” in Proc. IEEE Int. Conf. Comput.
Vis., Dec. 2013, pp. 2976–2983.
[21] W. Zhu, S. Liang, Y. Wei, and J. Sun, “Saliency optimization from
robust background detection,” in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit., Jun. 2014, pp. 2814–2821.
[22] F. Perazzi, P. Krahenbuhl, Y. Pritch, and A. Hornung, “Saliency filters:
Contrast based filtering for salient region detection,” in Proc. IEEE Conf.
Comput. Vis. Pattern Recognit., Jun. 2012, pp. 733–740.
[23] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk,
“SLIC superpixels compared to state-of-the-art superpixel methods,”
IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 11, pp. 2274–2282,
Nov. 2012.
[24] B. Alexe, T. Deselaers, and V. Ferrari, “Measuring the objectness of
image windows,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34,
no. 11, pp. 2189–2202, Nov. 2012.
[25] Y. Wei, F. Wen, W. Zhu, and J. Sun, “Geodesic saliency using
background priors,” in Proc. 12th Eur. Conf. Comput. Vis. (ECCV), 2012,
pp. 29–42.
[26] H. Jiang, Z. Yuan, M.-M. Cheng, Y. Gong, N. Zheng, and J. Wang.
(2014). “Salient object detection: A discriminative regional feature inte-
gration approach.” [Online]. Available: http://arxiv.org/abs/1410.5926
[27] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, and S. Li, “Salient
object detection: A discriminative regional feature integration approach,”
in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013,
pp. 2083–2090.
[28] R. Liu, J. Cao, Z. Lin, and S. Shan, “Adaptive partial differential
equation learning for visual saliency detection,” in Proc. IEEE Conf.
Comput. Vis. Pattern Recognit., Jun. 2014, pp. 3866–3873.
[29] Q. Chen et al., “Efficient maximum appearance search for large-scale
object detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
Jun. 2013, pp. 3190–3197.
[30] J. V. Davis, B. Kulis, P. Jain, S. Sra, and I. S. Dhillon,
“Information-theoretic metric learning,” in Proc. 24th Int. Conf. Mach.
Learn., 2007, pp. 209–216.
[31] K. Q. Weinberger, J. Blitzer, and L. K. Saul, “Distance metric learning
for large margin nearest neighbor classification,” in Proc. Adv. Neural
Inf. Process. Syst., 2005, pp. 1473–1480.
[32] K. Q. Weinberger and L. K. Saul, “Fast solvers and efficient
implementations for distance metric learning,” in Proc. 25th Int. Conf.
Mach. Learn., 2008, pp. 1160–1167.
[33] M. Guillaumin, J. Verbeek, and C. Schmid, “Is that you? Metric
learning approaches for face identification,” in Proc. IEEE 12th Int.
Conf. Comput. Vis., Sep./Oct. 2009, pp. 498–505.
[34] F. Wang, W. Zuo, L. Zhang, D. Meng, and D. Zhang. (2013). “A kernel
classification framework for metric learning.” [Online]. Available:
http://arxiv.org/abs/1309.5823
[35] S. Lu, V. Mahadevan, and N. Vasconcelos, “Learning optimal seeds for
diffusion-based salient object detection,” in Proc. IEEE Conf. Comput.
Vis. Pattern Recognit., Jun. 2014, pp. 2790–2797.
[36] H.-C. Huang, Y.-Y. Chuang, and C.-S. Chen, “Affinity aggregation for
spectral clustering,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
Jun. 2012, pp. 773–780.
[37] M.-M. Cheng, N. J. Mitra, X. Huang, P. H. S. Torr, and S.-M. Hu,
“Global contrast based salient region detection,” IEEE Trans. Pattern
Anal. Mach. Intell., vol. 37, no. 3, pp. 569–582, Mar. 2014.
[38] V. Movahedi and J. H. Elder, “Design and perceptual validation
of performance measures for salient object segmentation,” in Proc.
IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Workshops,
Jun. 2010, pp. 49–56.
[39] M.-M. Cheng, J. Warrell, W.-Y. Lin, S. Zheng, V. Vineet, and N. Crook,
“Efficient salient region detection with soft image abstraction,” in Proc.
IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 1529–1536.
[40] K.-Y. Chang, T.-L. Liu, H.-T. Chen, and S.-H. Lai, “Fusing generic
objectness and visual saliency for salient object detection,” in Proc. IEEE
Int. Conf. Comput. Vis., Nov. 2011, pp. 914–921.
[41] Q. Yan, L. Xu, J. Shi, and J. Jia, “Hierarchical saliency detection,”
in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013,
pp. 1155–1162.
[42] R. Margolin, A. Tal, and L. Zelnik-Manor, “What makes a patch
distinct?” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
Jun. 2013, pp. 1139–1146.
[43] B. Jiang, L. Zhang, H. Lu, C. Yang, and M.-H. Yang, “Saliency detection
via absorbing Markov chain,” in Proc. IEEE Int. Conf. Comput. Vis.,
Dec. 2013, pp. 1665–1672.
[44] H. Jiang, J. Wang, Z. Yuan, T. Liu, N. Zheng, and S. Li, “Automatic
salient object segmentation based on context and shape prior,” in Proc.
BMVC, 2011, pp. 110.1–110.12.
[45] Y. Xie and H. Lu, “Visual saliency detection based on Bayesian model,”
in Proc. 18th IEEE Int. Conf. Image Process., Sep. 2011, pp. 645–648.
Shuang Li is currently pursuing the
B.E. degree with the School of Information
and Communication Engineering, Dalian University
of Technology (DUT), China. From 2012 to 2015,
she was a Research Assistant with the Computer
Vision Group, DUT. Her research interests focus
on saliency detection and object recognition.
Huchuan Lu (SM’12) received the M.Sc. degree
in signal and information processing and the
Ph.D. degree in system engineering from the Dalian
University of Technology (DUT), Dalian, China,
in 1998 and 2008, respectively. He joined as a
Faculty Member in 1998, and is currently a Full
Professor with the School of Information and
Communication Engineering, DUT. His current
research interests include the areas of computer
vision and pattern recognition with a focus on visual
tracking, saliency detection, and segmentation.
He is also a member of the Association for Computing Machinery and
an Associate Editor of the IEEE TRANSACTIONS ON SYSTEMS, MAN AND
CYBERNETICS—PART B.
Zhe Lin (M’10) received the B.Eng. degree in
automatic control from the University of Science
and Technology of China, in 2002, the M.S. degree
in electrical engineering from the Korea Advanced
Institute of Science and Technology, in 2004, and the
Ph.D. degree in electrical and computer engineering
from the University of Maryland, College Park,
in 2009. He has been a Research Intern with
Microsoft Live Labs Research. He is currently a
Senior Research Scientist with Adobe Research,
San Jose, CA. His research interests include deep
learning, object detection and recognition, image classification and tagging,
content-based image and video retrieval, human motion tracking, and activity
analysis.
Xiaohui Shen (M’11) received the B.S. and
M.S. degrees from the Department of Automation,
Tsinghua University, China, and the Ph.D. degree
from the Department of Electrical Engineering
and Computer Sciences, Northwestern University,
in 2013. He is currently a Research Scientist with
Adobe Research, San Jose, CA. He is generally
interested in the research problems in the area of
computer vision, in particular, image retrieval, object
detection, and image understanding.
Brian Price received the Ph.D. degree in computer
science from Brigham Young University under the
advisement of Dr. B. Morse. He has contributed
new features to many Adobe products, such as
Photoshop, Photoshop Elements, and After-Effects,
mostly involving interactive image segmentation and
matting. He is currently a Senior Research Scientist
with Adobe Research, specializing in computer
vision. His research interests include semantic seg-
mentation, interactive object selection and matting,
stereo and RGBD, and broad interest in computer
vision and its intersections with machine learning and computer graphics.