Final Year IEEE Project 2013-2014 - Digital Image Processing Project Title and Abstract



Elysium Technologies Private Limited
Singapore | Madurai | Chennai | Trichy | Coimbatore | Cochin | Ramnad | Pondicherry | Trivandrum | Salem | Erode | Tirunelveli

13 years of experience. Automated services with 24/7 help desk support. Experienced and expert developers working with advanced technologies and tools. Legitimate member of all journals, with 1,50,000 successive records in all languages. More than 12 branches in Tamil Nadu, Kerala, and Karnataka. Ticketing and appointment systems, with individual care for every student. Around 250 developers and 20 researchers.
Branch addresses:

• Madurai: 227-230 Church Road, Anna Nagar, Madurai – 625020. Phone: 0452-4390702, 4392702, +91-9944793398.
• Chennai: S.P. Towers, No. 81 Valluvar Kottam High Road, Nungambakkam, Chennai – 600034. Phone: 044-42072702, +91-9600354638.
• Trichy: 15, III Floor, SI Towers, Melapudur Main Road, Trichy – 620001. Phone: 0431-4002234, +91-9790464324.
• Coimbatore: 577/4, DB Road, RS Puram, opposite KFC, Coimbatore – 641002. Phone: 0422-4377758, +91-9677751577.
• Tirunelveli: Plot No. 4, C Colony, P&T Extension, Perumal Puram, Tirunelveli – 627007. Phone: 0462-2532104, +91-9677733255.
• Ramanathapuram: 1st Floor, A.R. IT Park, Rasi Color Scan Building, Ramanathapuram – 623501. Phone: 04567-223225.
• Erode: 74, 2nd Floor, K.V.K Complex, upstairs of Krishna Sweets, Mettur Road, opposite the bus stand, Erode – 638011. Phone: 0424-4030055, +91-9677748477.
• Pondicherry: No. 88, First Floor, S.V. Patel Salai, Pondicherry – 605001. Phone: 0413-4200640, +91-9677704822.
• Salem: TNHB A-Block, opposite Hotel Ganesh, near the bus stand, Salem – 636007. Phone: 0427-4042220, +91-9894444716.
ETPL DIP-001: Local Edge-Preserving Multiscale Decomposition for High Dynamic Range Image Tone Mapping

A novel filter is proposed for edge-preserving decomposition of an image. It differs from previous filters in its locally adaptive property. The filtered image contains local means everywhere and preserves local salient edges. Comparisons are made between our filtered result and the results of three other methods, and a detailed analysis is made of the behavior of the filter. A multiscale decomposition with this filter is proposed for manipulating a high dynamic range image; it has three detail layers and one base layer. The multiscale decomposition with the filter addresses three assumptions: 1) the base layer preserves local means everywhere; 2) every scale's salient edges are relatively large gradients in a local window; and 3) all of the nonzero gradient information belongs to the detail layer. An effective function is also proposed for compressing the detail layers. The reproduced image gives a good visualization. Experimental results on real images demonstrate that our algorithm is especially effective at preserving or enhancing local details.

ETPL DIP-002: Multistructure Large Deformation Diffeomorphic Brain Registration (Biomedical Engineering)

Whole-brain MRI registration has many useful applications in group analysis and morphometry, yet accurate registration across different neuropathological groups remains challenging. Structure-specific information, or anatomical guidance, can be used to initialize and constrain registration to improve accuracy and robustness. We describe here a multistructure diffeomorphic registration approach that uses concurrent subcortical and cortical shape matching to guide the overall registration. Validation experiments carried out on openly available datasets demonstrate comparable or improved alignment of subcortical and cortical brain structures over leading brain registration algorithms. We also demonstrate that a group-wise average atlas built with multistructure registration accounts for greater intersubject variability and provides more sensitive tensor-based morphometry measurements.

ETPL DIP-003: Iterative Closest Normal Point for 3D Face Recognition (Pattern Analysis and Machine Intelligence)

The common approach for 3D face recognition is to register a probe face to each of the gallery faces and then calculate the sum of the distances between their points. This approach is computationally expensive and sensitive to facial expression variation. In this paper, we introduce the iterative closest normal point method for finding the corresponding points between a generic reference face and every input face. The proposed correspondence-finding method samples a set of points for each face, denoted as the closest normal points. These points are effectively aligned across all faces, enabling effective application of discriminant analysis methods for 3D face recognition. As a result, the expression variation problem is addressed by minimizing the within-class variability of the face samples while maximizing the between-class variability. As an important conclusion, we show that the surface normal vectors of the face at the sampled points contain more discriminatory information than the coordinates of the points. We have performed comprehensive experiments on the Face Recognition Grand Challenge database, which is presently the largest available 3D face database.
We have achieved verification rates of 99.6 and 99.2 percent at a false acceptance rate of 0.1 percent for the all-versus-all and ROC III experiments, respectively, which, to the best of our knowledge, represent seven and four times lower error rates, respectively, than the best existing methods on this database.

ETPL DIP-004: Face Recognition and Verification Using Photometric Stereo (Information Forensics and Security)

This paper presents a new database suitable for both 2D and 3D face recognition based on photometric stereo (PS): the Photoface database. The database was collected using a custom-made four-source PS device designed to enable data capture with minimal interaction necessary from the subjects. The device, which automatically detects the presence of a subject using ultrasound, was placed at the entrance to a busy workplace and captured 1839 sessions of face images with natural pose and expression. This means that the acquired data is more realistic for everyday use than existing databases and is, therefore, an invaluable test bed for state-of-the-art recognition algorithms. The paper also presents experiments with various face recognition and verification algorithms using the albedo, surface normals, and recovered depth maps. Finally, we have conducted experiments to demonstrate how different methods in the PS pipeline (i.e., normal field computation and depth map reconstruction) affect recognition and verification performance. These experiments help to 1) demonstrate the usefulness of PS, and our device in particular, for minimal-interaction face recognition, and 2) highlight the optimal reconstruction and recognition algorithms for use with natural-expression PS data. The database can be downloaded from

ETPL DIP-005: Objective Quality Assessment of Tone-Mapped Images

Tone-mapping operators (TMOs) that convert high dynamic range (HDR) to low dynamic range (LDR) images provide practically useful tools for the visualization of HDR images on standard LDR displays. Different TMOs create different tone-mapped images, and a natural question is which one has the best quality. Without an appropriate quality measure, different TMOs cannot be compared, and further improvement is directionless. Subjective rating may be a reliable evaluation method, but it is expensive and time consuming and, more importantly, is difficult to embed into optimization frameworks.
Here we propose an objective quality assessment algorithm for tone-mapped images that combines: 1) a multiscale signal fidelity measure based on a modified structural similarity index and 2) a naturalness measure based on the intensity statistics of natural images. Validations using independent subject-rated image databases show good correlations between subjective ranking scores and the proposed tone-mapped image quality index (TMQI). Furthermore, we demonstrate extended applications of TMQI with two examples: parameter tuning for TMOs and adaptive fusion of multiple tone-mapped images.

ETPL DIP-006: Segmentation and Tracing of Single Neurons from 3D Confocal Microscope Images (Biomedical and Health Informatics)

In order to understand the brain, we first need to understand the morphology of neurons. In the neurobiology community, there have been recent pushes to analyze both neuron connectivity and the influence of structure on function. Currently, a technical roadblock that stands in the way of these studies is the inability to automatically trace neuronal structure from microscopy. On the image processing side, proposed tracing algorithms face difficulties with low contrast, indistinct boundaries, clutter, and complex branching structure. To tackle these difficulties, we develop Tree2Tree, a robust automatic neuron segmentation and morphology generation algorithm. Tree2Tree uses a local medial tree generation strategy in combination with global tree linking to build a maximum likelihood global tree. Recasting the neuron tracing problem in a graph-theoretic context enables Tree2Tree to estimate bifurcations naturally, which remains a challenge for current neuron tracing algorithms. Tests on cluttered confocal microscopy images of Drosophila neurons give results that correspond to ground truth within a margin of ±2.75% normalized mean absolute error.
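The two-component pooling described for DIP-005 can be sketched as follows. This is a hypothetical illustration: the mixing weight and exponents below are placeholders, not the published TMQI parameters, and the fidelity score S and naturalness score N are assumed to be precomputed.

```python
# Hypothetical sketch: combine a structural-fidelity score S and a
# naturalness score N (both in [0, 1]) into one quality index, in the
# spirit of the TMQI described above. The weight and exponents are
# illustrative placeholders, not the published TMQI parameters.

def quality_index(S, N, a=0.8, alpha=0.3, beta=0.7):
    """Weighted power-law pooling of fidelity and naturalness."""
    if not (0.0 <= S <= 1.0 and 0.0 <= N <= 1.0):
        raise ValueError("S and N must lie in [0, 1]")
    return a * S**alpha + (1.0 - a) * N**beta

# A tone mapping with high fidelity but unnatural intensities is
# penalized through the second term:
q_good = quality_index(S=0.95, N=0.90)
q_flat = quality_index(S=0.95, N=0.20)
```

Because both terms enter as concave power functions, a very low score in either component drags the index down, which is why such a measure can drive parameter tuning for TMOs.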
ETPL DIP-007: Silhouette Analysis-Based Action Recognition via Exploiting Human Poses (Circuits and Systems for Video Technology)

In this paper, we propose a novel scheme for human action recognition that combines the advantages of both local and global representations. We explore human silhouettes for human action representation by taking into account the correlation between sequential poses in an action. A modified bag-of-words model, named the bag of correlated poses, is introduced to encode temporally local features of actions. To utilize the property of visual word ambiguity, we adopt a soft assignment strategy to reduce the dimensionality of our model and circumvent the penalties of computational complexity and quantization error. To compensate for the loss of structural information, we propose an extended motion template, i.e., an extension of the motion history image, to capture holistic structural features. The proposed scheme takes advantage of both local and global features and, therefore, provides a discriminative representation for human actions. Experimental results confirm the complementary properties of the two descriptors, and the proposed approach outperforms the state-of-the-art methods on the IXMAS action recognition dataset.

ETPL DIP-008: Pose-Invariant Face Recognition Using Markov Random Fields

One of the key challenges for current face recognition techniques is how to handle pose variations between the probe and gallery face images. In this paper, we present a method for reconstructing the virtual frontal view from a given nonfrontal face image using Markov random fields (MRFs) and an efficient variant of the belief propagation algorithm. In the proposed approach, the input face image is divided into a grid of overlapping patches, and a globally optimal set of local warps is estimated to synthesize the patches at the frontal view. A set of possible warps for each patch is obtained by aligning it with images from a training database of frontal faces. The alignments are performed efficiently in the Fourier domain using an extension of the Lucas-Kanade algorithm that can handle illumination variations. The problem of finding the optimal warps is then formulated as a discrete labeling problem using an MRF. The reconstructed frontal face image can then be used with any face recognition technique.
The two main advantages of our method are that it requires neither manually selected facial landmarks nor head pose estimation. In order to improve the performance of our pose normalization method in face recognition, we also present an algorithm for classifying whether a given face image is at a frontal or nonfrontal pose. Experimental results on different datasets are presented to demonstrate the effectiveness of the proposed approach.

ETPL DIP-009: Color Video Denoising Based on Combined Interframe and Intercolor Prediction (Circuits and Systems for Video Technology)

An advanced color video denoising scheme, which we call CIFIC, based on combined interframe and intercolor prediction, is proposed in this paper. CIFIC performs the denoising filtering in the RGB color space and exploits both the interframe and intercolor correlation in the color video signal directly by forming multiple predictors for each color component, using all three color components in the current frame as well as the motion-compensated neighboring reference frames. The temporal correspondence is established through joint-RGB motion estimation (ME), which acquires a single motion trajectory for the red, green, and blue components. Then the current noisy observation as well as the interframe and intercolor predictors are combined by a linear minimum mean squared error (LMMSE) filter to obtain the denoised estimate for every color component. Ill conditioning in the weight determination of the LMMSE filter is detected and remedied by gradually removing the “least contributing” predictor. Furthermore, our previous work on the LMMSE filter applied in the adaptive luminance-chrominance space (LAYUV for short) is revisited. By reformulating LAYUV and comparing it with CIFIC, we deduce that LAYUV is a restricted version of CIFIC, and thus CIFIC can theoretically achieve a lower denoising error.
Experimental results verify the improvement brought by the joint-RGB ME and the integration of the intercolor prediction, as well as the superiority of CIFIC over LAYUV. Meanwhile, when compared with other state-of-the-art algorithms, CIFIC provides competitive performance both in terms of the color peak signal-to-noise ratio and in perceptual quality.
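The LMMSE combination step at the heart of DIP-009 can be sketched in a few lines: several predictors of a clean signal are merged with weights obtained from the predictor covariance and the predictor/target cross-covariance. The toy signal and noise levels below are illustrative stand-ins, not the paper's actual video data.

```python
import numpy as np

# Hypothetical sketch of the linear minimum mean squared error (LMMSE)
# combination described for DIP-009: predictors are merged with weights
# w = R^{-1} p, where R is the predictor covariance matrix and p the
# predictor/target cross-covariance. The toy setup below stands in for
# the noisy observation plus interframe and intercolor predictors.

rng = np.random.default_rng(0)
clean = rng.standard_normal(10_000)              # stand-in for one color component
preds = np.stack([
    clean + 0.3 * rng.standard_normal(10_000),   # e.g. the noisy observation
    clean + 0.5 * rng.standard_normal(10_000),   # e.g. an interframe predictor
    clean + 0.7 * rng.standard_normal(10_000),   # e.g. an intercolor predictor
])

R = preds @ preds.T / preds.shape[1]             # predictor covariance (3 x 3)
p = preds @ clean / preds.shape[1]               # cross-covariance with target
w = np.linalg.solve(R, p)                        # LMMSE weights
estimate = w @ preds                             # combined denoised estimate

mse_best_single = min(np.mean((pr - clean) ** 2) for pr in preds)
mse_lmmse = np.mean((estimate - clean) ** 2)
```

With independent noise on each predictor, the combined estimate has a lower mean squared error than the best single predictor, which is the motivation for forming multiple predictors per color component. A nearly singular R here corresponds to the ill conditioning the paper remedies by dropping the least contributing predictor.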
ETPL DIP-010: Wang-Landau Monte Carlo-Based Tracking Methods for Abrupt Motions (Pattern Analysis and Machine Intelligence)

We propose a novel tracking algorithm based on the Wang-Landau Monte Carlo (WLMC) sampling method for dealing with abrupt motions efficiently. Abrupt motions cause conventional tracking methods to fail because they violate the motion smoothness constraint. To address this problem, we introduce the Wang-Landau sampling method and integrate it into a Markov chain Monte Carlo (MCMC)-based tracking framework. By incorporating the novel density-of-states term estimated by the Wang-Landau sampling method into the acceptance ratio of MCMC, our WLMC-based tracking method alleviates the motion smoothness constraint and robustly tracks abrupt motions. Meanwhile, the marginal likelihood term of the acceptance ratio preserves accuracy in tracking smooth motions. The method is then extended to obtain good performance in terms of scalability, even on a high-dimensional state space; hence, it covers drastic changes in not only the position but also the scale of a target. To achieve this, we modify our method by combining it with the N-fold way algorithm and present the N-Fold Wang-Landau (NFWL)-based tracking method. The N-fold way algorithm helps estimate the density-of-states with a smaller number of samples.
Experimental results demonstrate that our approach efficiently samples the states of the target, even over the whole state space, without loss of time, and tracks the target accurately and robustly when position and scale change severely.

ETPL DIP-011: Multi-View ML Object Tracking With Online Learning on Riemannian Manifolds by Combining Geometric Constraints

This paper addresses issues in object tracking with occlusion scenarios, where multiple uncalibrated cameras with overlapping fields of view are exploited. We propose a novel method where tracking is first done independently in each individual view and then the tracking results are mapped between views to improve the tracking jointly. The proposed tracker uses the assumptions that objects are visible in at least one view and move upright on a common planar ground that may induce a homography relation between views. A method for online learning of object appearances on Riemannian manifolds is also introduced. The main novelties of the paper are: 1) a similarity measure, based on geodesics between a candidate object and a set of mapped references from multiple views on a Riemannian manifold; 2) multi-view maximum likelihood estimation of object bounding box parameters, based on Gaussian-distributed geodesics on the manifold; 3) online learning of object appearances on the manifold, taking into account possible occlusions; 4) projective transformations of objects between views, whose parameters are estimated from the warped vertical axis by combining planar homography, epipolar geometry, and the vertical vanishing point; and 5) single-view trackers embedded in a three-layer multi-view tracking scheme. Experiments have been conducted on videos from multiple uncalibrated cameras, where objects undergo long-term partial or full occlusions and frequent intersections.
Comparisons have been made with three existing methods, where the performance is evaluated both qualitatively and quantitatively. The results show the effectiveness of the proposed method in terms of robustness against tracking drift caused by occlusions.

ETPL DIP-012: Multi-Atlas Segmentation with Joint Label Fusion (Pattern Analysis and Machine Intelligence)

Multi-atlas segmentation is an effective approach for automatically labeling objects of interest in biomedical images. In this approach, multiple expert-segmented example images, called atlases, are registered to a target image, and the deformed atlas segmentations are combined using label fusion. Among the proposed label fusion strategies, weighted voting with spatially varying weight distributions derived from atlas-target intensity similarity has been particularly successful. However, one limitation of these strategies is that the weights are computed independently for each atlas, without taking into account the fact that different atlases may produce similar label errors. To address this limitation, we propose a new solution for the label fusion problem in which weighted voting is formulated in terms of minimizing the total expectation of labeling error, and in which pairwise dependency between atlases is explicitly modeled as the joint probability of two atlases making a segmentation error at a voxel. This probability is approximated using the intensity similarity between a pair of atlases and the target image in the neighborhood of each voxel. We validate our method on two medical image segmentation problems: hippocampus segmentation and hippocampus subfield segmentation in magnetic resonance (MR) images. For both problems, we show consistent and significant improvement over label fusion strategies that assign atlas weights independently.

ETPL DIP-013: Spatially Coherent Fuzzy Clustering for Accurate and Noise-Robust Image Segmentation

In this letter, we present a new FCM-based method for spatially coherent and noise-robust image segmentation. Our contribution is twofold: 1) the spatial information of local image features is integrated into both the similarity measure and the membership function to compensate for the effect of noise; and 2) an anisotropic neighborhood, based on phase congruency features, is introduced to allow more accurate segmentation without image smoothing. The segmentation results, for both synthetic and real images, demonstrate that our method efficiently preserves the homogeneity of the regions and is more robust to noise than related FCM-based methods.

ETPL DIP-014: Adaptive Markov Random Fields for Joint Unmixing and Segmentation of Hyperspectral Images

Linear spectral unmixing is a challenging problem in hyperspectral imaging that consists of decomposing an observed pixel into a linear combination of pure spectra (or endmembers) with their corresponding proportions (or abundances).
Endmember extraction algorithms can be employed to recover the spectral signatures, while the abundances are estimated in an inversion step. Recent works have shown that exploiting spatial dependencies between image pixels can improve spectral unmixing. Markov random fields (MRFs) are classically used to model these spatial correlations and partition the image into multiple classes with homogeneous abundances. This paper proposes to define the MRF sites using similarity regions. These regions are built using a self-complementary area filter that stems from morphological theory. This kind of filter divides the original image into flat zones where the underlying pixels have the same spectral values. Once the MRF has been established, a hierarchical Bayesian algorithm is proposed to estimate the abundances, the class labels, the noise variance, and the corresponding hyperparameters. A hybrid Gibbs sampler is constructed to generate samples according to the posterior distribution of the unknown parameters and hyperparameters. Simulations conducted on synthetic and real AVIRIS data demonstrate the good performance of the algorithm.

ETPL DIP-015: Depth Estimation of Face Images Using the Nonlinear Least-Squares Model

In this paper, we propose an efficient algorithm to reconstruct the 3D structure of a human face from one or more of its 2D images with different poses. In our algorithm, the nonlinear least-squares model is first employed to estimate the depth values of facial feature points and the pose of the 2D face image concerned by means of the similarity transform. Furthermore, different optimization schemes are presented with regard to the accuracy levels and the training time required. Our algorithm also embeds the symmetry of the human face into the optimization procedure, in order to alleviate the sensitivities arising from changes in pose.
In addition, a regularization term, based on linear correlation, is added to the objective function to improve the estimation accuracy of the 3D structure. Further, a model-integration method is proposed to improve the depth-estimation accuracy when multiple nonfrontal-view face images are available. Experimental results on 2D and 3D databases demonstrate the feasibility and efficiency of the proposed methods.
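The two ingredients of the DIP-015 formulation, a nonlinear least-squares objective plus a regularization term, can be sketched with a scalar Gauss-Newton solver. The toy model below (fitting the rate of an exponential, with a penalty pulling the estimate toward a prior) is purely illustrative; the paper's unknowns are facial-feature depths and pose parameters.

```python
import numpy as np

# Hypothetical sketch: nonlinear least squares with a regularization
# term, minimized by Gauss-Newton. The exponential model and the prior
# value are illustrative placeholders, not the DIP-015 face model.

def gauss_newton(x, y, b0, prior, lam, iters=50):
    """Minimize sum((exp(b*x) - y)^2) + lam*(b - prior)^2 over scalar b."""
    b = b0
    for _ in range(iters):
        r_data = np.exp(b * x) - y           # data residuals
        r_reg = np.sqrt(lam) * (b - prior)   # regularization residual
        J_data = x * np.exp(b * x)           # d(r_data)/db
        J_reg = np.sqrt(lam)                 # d(r_reg)/db
        num = J_data @ r_data + J_reg * r_reg
        den = J_data @ J_data + J_reg * J_reg
        b -= num / den                       # scalar Gauss-Newton step
    return b

x = np.linspace(0.0, 1.0, 50)
y = np.exp(0.8 * x)                          # noise-free observations
b_hat = gauss_newton(x, y, b0=0.1, prior=0.5, lam=1e-3)
```

Writing the penalty as an extra residual row, as above, is the standard way to fold a regularizer into a least-squares solver; with a small `lam` the estimate stays close to the data-only solution while remaining well behaved when the data constrain the parameter weakly.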
ETPL DIP-016: Local Energy Pattern for Texture Classification Using Self-Adaptive Quantization Thresholds

Local energy pattern, a statistical histogram-based representation, is proposed for texture classification. First, we use normalized local-oriented energies to generate local feature vectors, which describe the local structures distinctively and are less sensitive to imaging conditions. Then, each local feature vector is quantized by self-adaptive quantization thresholds determined in the learning stage using histogram specification, and the quantized local feature vector is transformed to a number by N-nary coding, which helps preserve more structure information during vector quantization. Finally, the frequency histogram is used as the representation feature. The performance is benchmarked by material categorization on the KTH-TIPS and KTH-TIPS2-a databases. Our method is compared with typical statistical approaches, such as basic image features, local binary pattern (LBP), local ternary pattern, completed LBP, the Weber local descriptor, and the VZ algorithms (VZ-MR8 and VZ-Joint). The results show that our method is superior to the other methods on the KTH-TIPS2-a database and achieves competitive performance on the KTH-TIPS database. Furthermore, we extend the representation from static images to dynamic textures and achieve favorable recognition results on the University of California at Los Angeles (UCLA) dynamic texture database.

ETPL DIP-017: Perceptual Quality Metric With Internal Generative Mechanism

Objective image quality assessment (IQA) aims to evaluate image quality consistently with human perception.
Most existing perceptual IQA metrics cannot accurately represent degradations from different types of distortion; for example, existing structural similarity metrics perform well on content-dependent distortions but not as well as the peak signal-to-noise ratio (PSNR) on content-independent distortions. In this paper, we integrate the merits of the existing IQA metrics, guided by the recently revealed internal generative mechanism (IGM). The IGM indicates that the human visual system actively predicts sensory information and tries to avoid residual uncertainty for image perception and understanding. Inspired by the IGM theory, we adopt an autoregressive prediction algorithm to decompose an input scene into two portions: the predicted portion, with the predicted visual content, and the disorderly portion, with the residual content. Distortions on the predicted portion degrade the primary visual information, and structural similarity procedures are employed to measure its degradation; distortions on the disorderly portion mainly change the uncertain information, and the PSNR is employed for it. Finally, according to the noise energy deployment on the two portions, we combine the two evaluation results to obtain the overall quality score. Experimental results on six publicly available databases demonstrate that the proposed metric is comparable with the state-of-the-art quality metrics.

ETPL DIP-018: Quantitative Analysis of Human-Model Agreement in Visual Saliency Modeling: A Comparative Study

Visual attention is a process that enables biological and machine vision systems to select the most relevant regions from a scene. Relevance is determined by two components: 1) top-down factors driven by task and 2) bottom-up factors that highlight image regions that are different from their surroundings.
The latter are often referred to as “visual saliency.” Modeling bottom-up visual saliency has been the subject of numerous research efforts during the past 20 years, with many successful applications in computer vision and robotics. Available models have been tested with different datasets (e.g., synthetic psychological search arrays, natural images, or videos) using different evaluation scores (e.g., search slopes, comparison to human eye tracking) and parameter settings, which has made direct comparison of models difficult. Here, we perform an exhaustive comparison of 35 state-of-the-art saliency models over 54 challenging synthetic patterns, three natural image datasets, and two video datasets, using three evaluation scores. We find that although model rankings vary, some models consistently perform better. Analysis of the datasets reveals that existing datasets are highly center-biased, which influences some of the evaluation scores. Computational complexity analysis shows that some models are very fast yet yield competitive eye movement prediction accuracy. Different models often have common easy and difficult stimuli. Furthermore, several concerns in visual saliency modeling, eye movement datasets, and evaluation scores are discussed, and insights for future work are provided. Our study allows one to assess the state of the art, helps organize this rapidly growing field, and sets a unified comparison framework for gauging future efforts, similar to the PASCAL VOC challenge in the object recognition and detection domains.

ETPL DIP-019: Local Edge-Preserving Multiscale Decomposition for High Dynamic Range Image Tone Mapping

A novel filter is proposed for edge-preserving decomposition of an image. It differs from previous filters in its locally adaptive property. The filtered image contains local means everywhere and preserves local salient edges. Comparisons are made between our filtered result and the results of three other methods, and a detailed analysis is made of the behavior of the filter. A multiscale decomposition with this filter is proposed for manipulating a high dynamic range image; it has three detail layers and one base layer. The multiscale decomposition with the filter addresses three assumptions: 1) the base layer preserves local means everywhere; 2) every scale's salient edges are relatively large gradients in a local window; and 3) all of the nonzero gradient information belongs to the detail layer. An effective function is also proposed for compressing the detail layers. The reproduced image gives a good visualization.
Experimental results on real images demonstrate that our algorithm is especially effective at preserving or enhancing local details.

ETPL DIP-020: LLSURE: Local Linear SURE-Based Edge-Preserving Image Filtering

In this paper, we propose a novel approach for performing high-quality edge-preserving image filtering. Based on a local linear model, and using the principle of Stein's unbiased risk estimate as an estimator for the mean squared error from the noisy image only, we derive a simple explicit image filter that can filter out noise while preserving edges and fine-scale details. Moreover, this filter has a fast, exact linear-time algorithm whose computational complexity is independent of the filtering kernel size; thus, it can be applied to real-time image processing tasks. The experimental results demonstrate the effectiveness of the new filter for various computer vision applications, including noise reduction, detail smoothing and enhancement, high dynamic range compression, and flash/no-flash denoising.

ETPL DIP-021: Optimal Inversion of the Generalized Anscombe Transformation for Poisson-Gaussian Noise

Many digital imaging devices operate by successive photon-to-electron, electron-to-voltage, and voltage-to-digit conversions. These processes are subject to various signal-dependent errors, which are typically modeled as Poisson-Gaussian noise. The removal of such noise can be effected indirectly by applying a variance-stabilizing transformation (VST) to the noisy data, denoising the stabilized data with a Gaussian denoising algorithm, and finally applying an inverse VST to the denoised data. The generalized Anscombe transformation (GAT) is often used for variance stabilization, but its unbiased inverse transformation has not been rigorously studied in the past. We introduce the exact unbiased inverse of the GAT and show that it plays an integral part in ensuring accurate denoising results.
We demonstrate that this exact inverse leads to state-of-the-art results without any notable increase in the computational complexity compared to the other inverses. We also show that this inverse is optimal in the sense that it can be interpreted as a maximum likelihood inverse. Moreover, we thoroughly analyze the behavior of the proposed inverse, which also enables us to derive a closed-form approximation for it. This paper generalizes our work on the exact unbiased inverse of the Anscombe transformation, which we have presented earlier for the removal of pure Poisson noise.
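The stabilize-denoise-invert pipeline described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's method: the parameter names (`alpha` for the detector gain, `sigma` for the Gaussian noise standard deviation, `g` for an offset) are assumptions, and the simple algebraic inverse shown here is exactly the biased inverse that the paper's exact unbiased inverse improves upon.

```python
import numpy as np

def anscombe(x):
    # Classical Anscombe VST for pure Poisson data:
    # maps variance ~lambda to variance ~1.
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

def anscombe_inv_algebraic(d):
    # Naive algebraic inverse of the Anscombe transformation.
    # It is biased at low counts, which is why an exact unbiased
    # inverse matters in practice.
    return (d / 2.0) ** 2 - 3.0 / 8.0

def gat(x, alpha=1.0, sigma=0.0, g=0.0):
    # Generalized Anscombe transformation for Poisson-Gaussian noise
    # (illustrative parameterization; argument clipped at zero).
    arg = alpha * x + (3.0 / 8.0) * alpha ** 2 + sigma ** 2 - alpha * g
    return (2.0 / alpha) * np.sqrt(np.maximum(arg, 0.0))

def vst_denoise(noisy, gaussian_denoiser, alpha=1.0, sigma=0.0, g=0.0):
    # Indirect removal of signal-dependent noise:
    # stabilize -> Gaussian denoise -> invert. The algebraic inverse is
    # used here only as a placeholder for the exact unbiased inverse.
    d = gaussian_denoiser(gat(noisy, alpha, sigma, g))
    return anscombe_inv_algebraic(d)
```

With `alpha = 1`, `sigma = 0`, and `g = 0`, the GAT reduces to the classical Anscombe transformation, and the pipeline with an identity "denoiser" reproduces the input exactly.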
ETPL DIP-022 Blind Separation of Time/Position Varying Mixtures
Abstract: We address the challenging open problem of blindly separating time/position varying mixtures, and attempt to separate the sources from such mixtures without prior information about the sources or the mixing system. Unlike studies concerning instantaneous or convolutive mixtures, we assume that the mixing system (medium) varies in time/position. So far, attempts to solve this problem have mostly used online algorithms that track the mixing system with methods previously developed for instantaneous or convolutive mixtures. In contrast with these attempts, we develop a unified approach in the form of staged sparse component analysis (SSCA). Accordingly, we assume that the sources are either sparse or can be “sparsified.” In the first stage, we estimate the filters of the mixing system, based on the scatter plot of the sparse mixtures' data, using proper clustering and curve/surface fitting. In the second stage, the mixing system is inverted, yielding the estimated sources. We use the SSCA approach to solve three types of mixtures: time/position varying instantaneous mixtures, single-path mixtures, and multipath mixtures. Real-life scenarios and simulated mixtures are used to demonstrate the performance of our approach.

ETPL DIP-023 Nonlocal Transform-Domain Filter for Volumetric Data Denoising and Reconstruction
Abstract: We present an extension of the BM3D filter to volumetric data. The proposed algorithm, BM4D, implements the grouping and collaborative filtering paradigm, where mutually similar d-dimensional patches are stacked together in a (d+1)-dimensional array and jointly filtered in the transform domain.
While in BM3D the basic data patches are blocks of pixels, in BM4D we utilize cubes of voxels, which are stacked into a 4-D “group.” The 4-D transform applied to the group simultaneously exploits the local correlation among voxels in each cube and the nonlocal correlation between the corresponding voxels of different cubes. Thus, the spectrum of the group is highly sparse, leading to very effective separation of signal and noise through coefficient shrinkage. After inverse transformation, we obtain estimates of each grouped cube, which are then adaptively aggregated at their original locations. We evaluate the algorithm on denoising of volumetric data corrupted by Gaussian and Rician noise, as well as on reconstruction of volumetric phantom data with nonzero phase from noisy and incomplete Fourier-domain (k-space) measurements. Experimental results demonstrate the state-of-the-art denoising performance of BM4D and its effectiveness when exploited as a regularizer in volumetric data reconstruction.

ETPL DIP-024 Huber Fractal Image Coding Based on a Fitting Plane
Abstract: Recently, there has been significant interest in robust fractal image coding for robustness against outliers. However, the known robust fractal coding methods (HFIC, LAD-FIC, etc.) are not optimal: besides their high computational cost, they use the corrupted domain block as the independent variable in the robust regression model, which may adversely affect the robust estimation of the fractal parameters (depending on the noise level). This paper presents a Huber fitting plane-based fractal image coding (HFPFIC) method. This method builds Huber fitting planes (HFPs) for the domain and range blocks, respectively, ensuring the use of an uncorrupted independent variable in the robust model. On this basis, a new matching error function is introduced to robustly evaluate the best scaling factor.
Meanwhile, a median absolute deviation (MAD) about the median decomposition criterion is proposed to achieve fast adaptive quadtree partitioning for images corrupted by salt & pepper noise. To reduce computational cost, the no-search method is applied to speed up the encoding process. Experimental results show that the proposed HFPFIC yields superior performance over conventional robust fractal image coding methods in encoding speed and the quality of the restored image. Furthermore, the no-search method can significantly reduce encoding time and achieve less than 2.0 s for
the HFPFIC with acceptable image quality degradation. In addition, we show that, combined with the MAD decomposition scheme, the HFP technique used as a robust method can further reduce the encoding time while maintaining image quality.

ETPL DIP-025 Demosaicking of Noisy Bayer-Sampled Color Images With Least-Squares Luma-Chroma Demultiplexing and Noise Level Estimation
Abstract: This paper adapts the least-squares luma-chroma demultiplexing (LSLCD) demosaicking method to noisy Bayer color filter array (CFA) images. A model is presented for the noise in white-balanced gamma-corrected CFA images. A method to estimate the noise level in each of the red, green, and blue color channels is then developed. Based on the estimated noise parameters, one of a finite set of configurations adapted to a particular level of noise is selected to demosaic the noisy data. The noise-adaptive demosaicking scheme is called LSLCD with noise estimation (LSLCD-NE). Experimental results demonstrate state-of-the-art performance over a wide range of noise levels, with low computational complexity. Many results with several algorithms, noise levels, and images are presented on our companion web site, along with software to allow reproduction of our results.

ETPL DIP-026 Multiscale Gradients-Based Color Filter Array Interpolation
Abstract: Single-sensor digital cameras use color filter arrays to capture a subset of the color data at each pixel coordinate. Demosaicing, or color filter array (CFA) interpolation, is the process of estimating the missing color samples to reconstruct a full color image. In this paper, we propose a demosaicing method that uses multiscale color gradients to adaptively combine color difference estimates from different directions.
The proposed solution does not require any thresholds, since it makes no hard decisions, and it is noniterative. Although most suitable for the Bayer CFA pattern, the method can be extended to other mosaic patterns. To demonstrate this, we describe its application to the Lukac CFA pattern. Experimental results show that it outperforms other available demosaicing methods by a clear margin in terms of CPSNR and S-CIELAB measures for both mosaic patterns.

ETPL DIP-027 Optimal Local Dimming for LC Image Formation With Controllable Backlighting
Abstract: Light emitting diode (LED)-backlit liquid crystal displays (LCDs) hold the promise of improving image quality while reducing energy consumption through signal-dependent local dimming. However, most existing local dimming algorithms are motivated by simple implementation and often lack concern for visual quality. To fully realize the potential of LED-backlit LCDs and reduce the artifacts that often occur in current systems, we propose a novel local dimming technique that can achieve the theoretically highest fidelity of intensity reproduction in either the l1 or l2 metric. Both exact and fast approximate versions of the optimal local dimming algorithm are proposed. Simulation results demonstrate the superior performance of the proposed algorithm in terms of visual quality and power consumption.

ETPL DIP-028 Multiscale Bi-Gaussian Filter for Adjacent Curvilinear Structures Detection With Application to Vasculature Images
Abstract: Intensity or gray-level derivatives have been widely used in image segmentation and enhancement. Conventional derivative filters often suffer from an undesired merging of adjacent objects because of their intrinsic use of an inappropriately broad Gaussian kernel; as a result, neighboring structures cannot be properly resolved.
To avoid this problem, we propose to replace the low-level Gaussian kernel with a bi-Gaussian function, which allows independent selection of scales in the foreground and background. By selecting a narrow neighborhood for the background relative to the foreground, the proposed method reduces interference from adjacent objects while preserving the ability of intraregion smoothing. Our idea is inspired by a comparative analysis of existing line filters, in which several traditional methods, including the vesselness, gradient flux, and medialness
models, are integrated into a unified framework. The comparison subsequently aids in understanding the principles of the different filtering kernels, which is also a contribution of this paper. Based on some axiomatic scale-space assumptions, the full representation of our bi-Gaussian kernel is deduced. The popular γ-normalization scheme for multiscale integration is extended to the bi-Gaussian operators. Finally, combined with a parameter-free shape estimation scheme, a derivative filter is developed for the typical applications of curvilinear structure detection and vasculature image enhancement. Experiments on synthetic and real data verify that the proposed method outperforms several conventional filters in separating closely located objects while remaining robust to noise.

ETPL DIP-029 Visually Lossless Encoding for JPEG2000
Abstract: Due to the exponential growth in image sizes, visually lossless coding is increasingly being considered as an alternative to numerically lossless coding, which has limited compression ratios. This paper presents a method of encoding color images in a visually lossless manner using JPEG2000. In order to hide coding artifacts caused by quantization, visibility thresholds (VTs) are measured and used for quantization of subband signals in JPEG2000. The VTs are experimentally determined from statistically modeled quantization distortion, which is based on the distribution of wavelet coefficients and the dead-zone quantizer of JPEG2000. The resulting VTs are adjusted for locally changing backgrounds through a visual masking model, and then used to determine the minimum number of coding passes to be included in the final codestream for visually lossless quality under the desired viewing conditions.
Codestreams produced by this scheme are fully JPEG2000 Part-I compliant.

ETPL DIP-030 Rate-Distortion Analysis of Dead-Zone Plus Uniform Threshold Scalar Quantization and Its Application—Part I: Fundamental Theory
Abstract: This paper provides a systematic rate-distortion (R-D) analysis of dead-zone plus uniform threshold scalar quantization (DZ+UTSQ) with nearly uniform reconstruction quantization (NURQ) for the generalized Gaussian distribution (GGD), which consists of two aspects: R-D performance analysis and R-D modeling. In R-D performance analysis, we first derive the preliminary constraint of optimum entropy-constrained DZ+UTSQ/NURQ for GGD, under which the property of the GGD distortion-rate (D-R) function is elucidated. Then, for the GGD source of actual transform coefficients, the refined constraint and precise conditions of optimum DZ+UTSQ/NURQ are rigorously deduced in the practical coding bit rate range, and efficient DZ+UTSQ/NURQ design criteria are proposed to reasonably simplify the use of effective quantizers in practice. In R-D modeling, inspired by the R-D performance analysis, the D-R function is first developed, followed by novel rate-quantization (R-Q) and distortion-quantization (D-Q) models derived using analytical and heuristic methods. The D-R, R-Q, and D-Q models form the source model describing the relationship between the rate, distortion, and quantization steps. One application of the proposed source model is the design of an effective two-pass VBR coding algorithm on the H.264/AVC reference encoder, which achieves constant video quality and desirable rate control accuracy.
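The quantizer family analyzed in the abstract above can be made concrete with a small sketch. This is a generic illustration under stated assumptions, not the paper's optimized design: the dead-zone threshold, uniform step, and fractional reconstruction offset `delta` are free parameters here, whereas the paper derives their optimal settings.

```python
import math

def dz_utsq_index(x, step, dead_zone):
    # Dead-zone plus uniform threshold scalar quantization (DZ+UTSQ):
    # inputs with |x| below the dead-zone threshold map to index 0;
    # beyond it, cells of uniform width `step` are used.
    if abs(x) < dead_zone:
        return 0
    return int(math.copysign(math.floor((abs(x) - dead_zone) / step) + 1, x))

def nurq_reconstruct(idx, step, dead_zone, delta=0.5):
    # Nearly uniform reconstruction quantization (NURQ): place the
    # reconstruction level at a fractional offset `delta` (in units of
    # `step`) inside the selected cell; index 0 reconstructs to zero.
    if idx == 0:
        return 0.0
    return math.copysign(dead_zone + (abs(idx) - 1 + delta) * step, idx)
```

For example, with `step = 1.0` and `dead_zone = 0.5`, an input of 0.2 falls in the dead zone (index 0), while 0.7 quantizes to index 1 and reconstructs at 1.0 when `delta = 0.5`.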
ETPL DIP-031 Rate-Distortion Analysis of Dead-Zone Plus Uniform Threshold Scalar Quantization and Its Application—Part II: Two-Pass VBR Coding for H.264/AVC
Abstract: In the first part of this paper, we derive a source model describing the relationship between the rate, distortion, and quantization steps of dead-zone plus uniform threshold scalar quantizers with nearly uniform reconstruction quantizers for the generalized Gaussian distribution. This source model consists of rate-quantization, distortion-quantization (D-Q), and distortion-rate (D-R) models. In this part, we first rigorously confirm the accuracy of the proposed source model by comparing the calculated results with the coding data of JM 16.0. Efficient parameter estimation strategies are then developed to better employ this source model in our two-pass rate control method for H.264 variable bit rate coding. Based on our D-Q and D-R models, the proposed method offers high stability and low complexity and is easy
to implement. Extensive experiments demonstrate that the proposed method achieves: 1) an average peak signal-to-noise ratio variance of only 0.0658 dB, compared to 1.8758 dB for JM 16.0's method, with an average rate control error of 1.95%; and 2) significant improvement in smoothing the video quality compared with the latest two-pass rate control method.

ETPL DIP-032 Nonrigid Image Registration With Crystal Dislocation Energy
Abstract: The goal of nonrigid image registration is to find a suitable transformation such that the transformed moving image becomes similar to the reference image. The image registration problem can also be treated as an optimization problem that tries to minimize an objective energy function measuring the differences between the two involved images. In this paper, we consider image matching as the process of aligning object boundaries in two different images. The registration energy function can be defined based on the total energy associated with the object boundaries. The optimal transformation is obtained by finding the equilibrium state when the total energy is minimized, which indicates that the object boundaries have found their correspondences and stopped deforming. We draw an analogy between the above process and the dislocation system in physics. The object boundaries are viewed as dislocations (line defects) in a crystal. The well-developed dislocation energy is then used to derive the energy assigned to object boundaries in images. The newly derived registration energy function takes the global gradient information of the entire image into consideration, and produces an orientation-dependent, long-range interaction between the two images to drive the registration process.
This interaction endows the new registration framework with both a fast convergence rate and high registration accuracy. Moreover, the new energy function can be adapted to realize symmetric diffeomorphic transformation so as to ensure one-to-one matching between subjects. In this paper, the superiority of the new method is theoretically proven and experimentally tested, with comparisons against the state-of-the-art SyN method. Experimental results with 3-D magnetic resonance brain images demonstrate that the proposed method outperforms the compared methods in terms of both registration accuracy and computation time.

ETPL DIP-033 Double Shrinking Sparse Dimension Reduction
Abstract: Learning tasks such as classification and clustering usually perform better and cost less (time and space) on compressed representations than on the original data. Previous works mainly compress data via dimension reduction. In this paper, we propose “double shrinking” to compress image data in both dimensionality and cardinality by building either sparse low-dimensional representations or a sparse projection matrix for dimension reduction. We formulate the double shrinking model (DSM) as an l1-regularized variance maximization with the constraint ||x||2 = 1, and develop a double shrinking algorithm (DSA) to optimize the DSM. DSA is a path-following algorithm that can build the whole solution path of locally optimal solutions at different sparsity levels. Each solution on the path is a “warm start” for searching the next, sparser one. In each iteration of DSA, the direction, the step size, and the Lagrangian multiplier are deduced from the Karush-Kuhn-Tucker conditions. The magnitudes of trivial variables are shrunk while the importance of critical variables is simultaneously augmented along the selected direction with the determined step length.
Double shrinking can be applied to manifold learning and feature selection for better interpretation of features, and can be combined with classification and clustering to boost their performance. The experimental results suggest that double shrinking produces efficient and effective data compression.

ETPL DIP-034 Reinitialization-Free Level Set Evolution via Reaction Diffusion
Abstract: This paper presents a novel reaction-diffusion (RD) method for implicit active contours that is completely free of the costly reinitialization procedure in level set evolution (LSE). A diffusion term is introduced into LSE, resulting in an RD-LSE equation, from which a piecewise constant solution can be
derived. In order to obtain a stable numerical solution from the RD-based LSE, we propose a two-step splitting method to iteratively solve the RD-LSE equation: we first iterate the LSE equation, then solve the diffusion equation. The second step regularizes the level set function obtained in the first step to ensure stability, and thus the complex and costly reinitialization procedure is completely eliminated from LSE. By successfully applying diffusion to LSE, the RD-LSE model remains stable under the simple finite difference method, which is very easy to implement. The proposed RD method can be generalized to solve the LSE for both the variational level set method and the partial differential equation-based level set method. The RD-LSE method shows very good performance on boundary antileakage. Extensive and promising experimental results on synthetic and real images validate the effectiveness of the proposed RD-LSE approach.

ETPL DIP-035 Track Creation and Deletion Framework for Long-Term Online Multiface Tracking
Abstract: To improve visual tracking, a large number of papers study more powerful features or better cue fusion mechanisms, such as adaptation or contextual models. A complementary approach consists of improving track management, that is, deciding when to add a target or stop tracking it, for example in case of failure. This is an essential component of effective multiobject tracking applications, and it is often not trivial. Deciding whether or not to stop a track is a compromise between avoiding erroneous early stopping while tracking is fine and avoiding erroneous continuation of tracking when there is an actual failure.
This decision process, very rarely addressed in the literature, is difficult due to object detector deficiencies and observation models that are insufficient to describe the full variability of tracked objects and deliver reliable likelihood (tracking) information. This paper addresses the track management issue and presents a real-time online multiface tracking algorithm that effectively deals with the above difficulties. The tracking itself is formulated in a multiobject state-space Bayesian filtering framework solved with Markov chain Monte Carlo. Within this framework, an explicit probabilistic filtering step decides when to add or remove a target from the tracker, with decisions relying on multiple cues such as face detections, likelihood measures, long-term observations, and track state characteristics. The method has been applied to three challenging datasets of more than 9 h in total, and demonstrates a significant performance increase compared to more traditional approaches (Markov chain Monte Carlo, reversible-jump Markov chain Monte Carlo) relying only on head detection and likelihood for track management.

ETPL DIP-036 Wavelet Domain Multifractal Analysis for Static and Dynamic Texture Classification
Abstract: In this paper, we propose a new texture descriptor for both static and dynamic textures. The new descriptor is built on wavelet-based spatial-frequency analysis of two complementary wavelet pyramids: the standard multiscale pyramid and the wavelet leader pyramid. These wavelet pyramids essentially capture the local texture responses in multiple high-pass channels in a multiscale and multiorientation fashion, in which there exists a strong power-law relationship for natural images. Such a power-law relationship is characterized by so-called multifractal analysis. In addition, two further techniques, scale normalization and multiorientation image averaging, are introduced to improve the robustness of the proposed descriptor.
Combining these techniques, the proposed descriptor enjoys both high discriminative power and robustness against many environmental changes. We apply the descriptor to classifying both static and dynamic textures. Our method has demonstrated excellent performance in comparison with state-of-the-art approaches on several public benchmark datasets.

ETPL DIP-037 Video Object Tracking in the Compressed Domain Using Spatio-Temporal Markov Random Fields
Abstract: Despite recent progress in both pixel-domain and compressed-domain video object tracking, the need for a tracking framework with both reasonable accuracy and reasonable complexity still exists. This paper presents a method for tracking moving objects in H.264/AVC-compressed video sequences
using a spatio-temporal Markov random field (ST-MRF) model. An ST-MRF model naturally integrates the spatial and temporal aspects of the object's motion. Built upon such a model, the proposed method works in the compressed domain and uses only the motion vectors (MVs) and block coding modes from the compressed bitstream to perform tracking. First, the MVs are preprocessed through intracoded block motion approximation and global motion compensation. At each frame, the decision of whether a particular block belongs to the object being tracked is made with the help of the ST-MRF model, which is updated from frame to frame in order to follow the changes in the object's motion. The proposed method is tested on a number of standard sequences, and the results demonstrate its advantages over some of the recent state-of-the-art methods.

ETPL DIP-038 Online Object Tracking With Sparse Prototypes
Abstract: Online object tracking is a challenging problem, as it entails learning an effective model to account for appearance changes caused by intrinsic and extrinsic factors. In this paper, we propose a novel online object tracking algorithm with sparse prototypes, which combines classic principal component analysis (PCA) with recent sparse representation schemes for learning effective appearance models. We introduce l1 regularization into the PCA reconstruction, and develop a novel algorithm to represent an object by sparse prototypes that account explicitly for data and noise. For tracking, objects are represented by the sparse prototypes learned online with updates. In order to reduce tracking drift, we present a method that takes occlusion and motion blur into account, rather than simply including image observations for model update.
Both qualitative and quantitative evaluations on challenging image sequences demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods.

ETPL DIP-039 Automatic Dynamic Texture Segmentation Using Local Descriptors and Optical Flow
Abstract: A dynamic texture (DT) is an extension of texture to the temporal domain. How to segment a DT is a challenging problem. In this paper, we address the problem of segmenting a DT into disjoint regions. DTs may differ in their spatial mode (i.e., appearance) and/or temporal mode (i.e., motion field). To this end, we develop a framework based on the appearance and motion modes. For the appearance mode, we use a new local spatial texture descriptor to describe the spatial mode of the DT; for the motion mode, we use optical flow and a local temporal texture descriptor to represent the temporal variations of the DT. In addition, we organize the optical flow using the histogram of oriented optical flow (HOOF). To compute the distance between two HOOFs, we develop a simple, effective, and efficient distance measure based on Weber's law. Furthermore, we address the problem of threshold selection by determining the thresholds for the segmentation method through offline supervised statistical learning. The experimental results show that our method provides very good segmentation results compared to the state-of-the-art methods in segmenting regions that differ in their dynamics.

ETPL DIP-040 Efficient Image Classification via Multiple Rank Regression
Abstract: The problem of image classification has aroused considerable research interest in the field of image processing. Traditional methods often convert an image to a vector and then use a vector-based classifier. In this paper, a novel multiple rank regression model (MRR) for matrix data classification is proposed.
Unlike traditional vector-based methods, we employ multiple-rank left and right projecting vectors to regress each matrix data set to its label for each category. The convergence behavior, initialization, computational complexity, and parameter determination are also analyzed. Compared with vector-based regression methods, MRR achieves higher accuracy and has lower computational complexity. Compared with traditional supervised tensor-based methods, MRR performs
better for matrix data classification. Promising experimental results on face, object, and handwritten digit image classification tasks show the effectiveness of our method.

ETPL DIP-041 Regularized Discriminative Spectral Regression Method for Heterogeneous Face Matching
Abstract: Face recognition is confronted with situations in which face images are captured in various modalities, such as the visual modality, the near-infrared modality, and the sketch modality. This is known as heterogeneous face recognition. To solve this problem, we propose a new method called discriminative spectral regression (DSR). DSR maps heterogeneous face images into a common discriminative subspace in which robust classification can be achieved. In the proposed method, the subspace learning problem is transformed into a least squares problem. Different mappings should map heterogeneous images from the same class close to each other, while images from different classes should be separated as far as possible. To realize this, we introduce two novel regularization terms, which reflect the category relationships among the data, into the least squares approach. Experiments conducted on two heterogeneous face databases validate the superiority of the proposed method over previous methods.

ETPL DIP-042 Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
Abstract: Due to the popularity of social media websites, extensive research efforts have been dedicated to tag-based social image search. Both visual information and tags have been investigated in this research field. However, most existing methods use tags and visual characteristics either separately or sequentially in order to estimate the relevance of images.
In this paper, we propose an approach that simultaneously utilizes both visual and textual information to estimate the relevance of user-tagged images. The relevance estimation is determined with a hypergraph learning approach. In this method, a social image hypergraph is constructed, where vertices represent images and hyperedges represent visual or textual terms. Learning is achieved using a set of pseudo-positive images, where the weights of hyperedges are updated throughout the learning process. In this way, the impact of different tags and visual words can be automatically modulated. Comparative results of experiments conducted on a dataset including 370+ images are presented, which demonstrate the effectiveness of the proposed approach.

ETPL DIP-043 Action Search by Example Using Randomized Visual Vocabularies
Abstract: Because actions can be small video objects, it is a challenging problem to search for similar actions in crowded and dynamic scenes when only a single query example is provided. We propose a fast action search method that can efficiently locate similar actions spatiotemporally. Both the query action and the video datasets are characterized by spatio-temporal interest points. Instead of using a unified visual vocabulary to index all interest points in the database, we propose randomized visual vocabularies to enable fast and robust interest point matching. To accelerate action localization, we have developed a coarse-to-fine video subvolume search scheme, which is several orders of magnitude faster than the existing spatio-temporal branch-and-bound search. Our experiments on cross-dataset action search show promising results when compared with the state of the art. Additional experiments on a 5-h versatile video dataset validate the efficiency of our method, where an action search can be finished in just 37.6 s on a regular desktop machine.
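The randomized-vocabulary idea above can be illustrated with a toy sketch: several independent randomized quantizers map each descriptor to a visual word, and database descriptors are scored by how many quantizers agree with the query. This is an assumption-laden illustration (a simple sign-of-random-projection quantizer, hypothetical function names), not the paper's actual vocabulary construction.

```python
import numpy as np

def build_vocabs(dim, n_vocabs, n_bits, seed=0):
    # Each "vocabulary" is a set of random hyperplanes; the sign pattern
    # of a descriptor against them serves as its visual-word id
    # (an LSH-style stand-in for a learned randomized vocabulary).
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((n_bits, dim)) for _ in range(n_vocabs)]

def word_id(desc, planes):
    # Quantize a descriptor: pack the sign bits into one integer word id.
    bits = (planes @ desc) > 0
    return int(bits @ (1 << np.arange(len(bits))))

def match_votes(query, database, vocabs):
    # Score each database descriptor by the number of vocabularies that
    # assign it the same word as the query; multiple independent
    # vocabularies make the matching robust to any single quantizer's
    # boundary effects.
    votes = np.zeros(len(database), dtype=int)
    for planes in vocabs:
        q = word_id(query, planes)
        votes += np.array([word_id(d, planes) == q for d in database])
    return votes
```

An identical descriptor always receives the maximum vote count, while unrelated descriptors are unlikely to collide in many vocabularies at once.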
ETPL DIP-044 Robust Albedo Estimation From a Facial Image With Cast Shadow Under General Unknown Lighting
Abstract: Albedo estimation from a facial image is crucial for various computer vision tasks, such as 3-D morphable-model fitting, shape recovery, and illumination-invariant face recognition, but the currently available methods do not give good estimation results. Most methods ignore the influence of cast shadows and require a statistical model to obtain facial albedo. This paper describes a method for albedo estimation that makes combined use of image intensity and facial depth information for an image with
cast shadows and general unknown light. In order to estimate the albedo map of a face, we formulate the albedo estimation problem as a linear programming problem that minimizes the intensity error under the assumption that the surface of the face has constant albedo. Since the solution thus obtained has significant errors in certain parts of the facial image, the albedo estimate needs to be compensated. We minimize the mean square error of the albedo under the assumption that the surface normals, which are calculated from the facial depth information, are corrupted with noise. The proposed method is simple, and the experimental results show that it gives better estimates than other methods.

ETPL DIP-045 Separable Markov Random Field Model and Its Applications in Low Level Vision
Abstract: This brief proposes a continuously valued Markov random field (MRF) model with a separable filter bank, denoted MRFSepa, which significantly reduces the computational complexity of MRF modeling. In this framework, we design a novel gradient-based discriminative learning method to learn the potential functions and separable filter banks. We learn MRFSepa models with 2-D and 3-D separable filter banks for the applications of gray-scale/color image denoising and color image demosaicing. By implementing the MRFSepa model on a graphics processing unit, we achieve real-time image denoising and fast image demosaicing with high-quality results.

ETPL DIP-046 Two-Direction Nonlocal Model for Image Denoising
Abstract: Similarities inherent in natural images have been widely exploited for image denoising and other applications. In fact, if a cluster of similar image patches is rearranged into a matrix, similarities exist both between columns and between rows.
Using these similarities, we present a two-direction nonlocal (TDNL) variational model for image denoising. The solution of our model consists of three components: one is a scaled version of the original observed image, and the other two are obtained by exploiting the similarities. Specifically, using the similarity between columns, we obtain a nonlocal-means-like estimate of each patch that accounts for all similar patches, with the weights given not by pairwise similarities but by a set of clusterwise coefficients. Moreover, using the similarity between rows, we also obtain nonlocal-autoregression-like estimates for the center pixels of the similar patches. The TDNL model leads to an alternating minimization algorithm. Experiments indicate that the model performs on par with or better than state-of-the-art denoising methods.

ETPL DIP-047 Optimizing the Error Diffusion Filter for Blue Noise Halftoning With Multiscale Error Diffusion
Abstract: A good halftone output should exhibit a blue noise characteristic produced by isotropically distributed isolated dots. Multiscale error diffusion (MED) algorithms try to achieve this by exploiting radially symmetric, noncausal error diffusion filters to guarantee spatial homogeneity. In this brief, an optimized diffusion filter is proposed to make the diffusion nearly isotropic. When it is used with MED, the resulting output has a nearly ideal blue noise characteristic.

ETPL DIP-049 Sparse Representation With Kernels
Abstract: Recent research has shown the initial success of sparse coding (Sc) in solving many computer vision tasks. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which helps in finding a sparse representation of nonlinear features, we propose kernel sparse representation (KSR). Essentially, KSR is sparse coding performed in a high-dimensional feature space mapped by an implicit mapping function.
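In the implicit feature space, the KSR objective reduces to kernel evaluations only: minimizing k(x,x) - 2 cᵀk(D,x) + cᵀK(D,D)c + λ‖c‖₁. The sketch below solves this with a plain ISTA loop; it is a minimal illustration under assumed choices (an RBF kernel, a fixed step size), not the solver used in the paper, and all function names are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the columns of A and B."""
    d2 = (A**2).sum(0)[:, None] + (B**2).sum(0)[None, :] - 2 * A.T @ B
    return np.exp(-gamma * d2)

def kernel_sparse_code(x, D, lam=0.1, gamma=1.0, n_iter=200):
    """ISTA in the implicit feature space: minimize
    k(x,x) - 2 c.T k(D,x) + c.T K(D,D) c + lam * ||c||_1 over the code c."""
    K = rbf_kernel(D, D, gamma)                # Gram matrix of dictionary atoms
    kx = rbf_kernel(D, x[:, None], gamma).ravel()
    c = np.zeros(D.shape[1])
    t = 1.0 / (2 * np.linalg.norm(K, 2))       # step from the Lipschitz constant
    for _ in range(n_iter):
        g = 2 * (K @ c - kx)                   # gradient of the smooth part
        c = c - t * g
        c = np.sign(c) * np.maximum(np.abs(c) - t * lam, 0.0)  # soft threshold
    return c
```

When the query equals a dictionary atom, the recovered code concentrates on that atom, which is the behavior sparse coding is expected to show.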
We apply KSR to feature coding in image classification, face recognition, and kernel matrix approximation. More specifically, by incorporating KSR into spatial pyramid matching (SPM), we develop KSRSPM, which achieves good performance for image classification. Moreover, KSR-based feature coding can be shown to be a generalization of the efficient match kernel and an extension of Sc-based SPM. We further show that our proposed KSR with a histogram intersection kernel (HIK) can be considered a soft-assignment extension of HIK-based feature quantization in the feature coding process. Beyond feature coding, compared with sparse coding, KSR learns more discriminative sparse codes and achieves higher accuracy for face recognition. KSR can also be applied to kernel matrix approximation in large-scale learning tasks, where it demonstrates robustness, especially when only a small fraction of the data is used. Extensive experimental results demonstrate promising performance of KSR in image classification, face recognition, and kernel matrix approximation, confirming its effectiveness in computer vision and machine learning tasks.

ETPL DIP-050 Image-Difference Prediction: From Grayscale to Color
Abstract: Existing image-difference measures show excellent accuracy in predicting distortions such as lossy compression, noise, and blur. Their performance on certain other distortions, such as gamut mapping, could be improved, partly because they either do not interpret chromatic information correctly or ignore it entirely. We present an image-difference framework comprising image normalization, feature extraction, and feature combination. Based on this framework, we create image-difference measures by selecting specific implementations for each step. Particular emphasis is placed on using color information to improve the assessment of gamut-mapped images.
Our best image-difference measure shows significantly higher prediction accuracy on a gamut-mapping dataset than all other evaluated measures.

ETPL DIP-051 When Does Computational Imaging Improve Performance?
Abstract: A number of computational imaging techniques have been introduced to improve image quality by increasing light throughput. These techniques use optical coding to measure a stronger signal level. However, their performance is limited by the decoding step, which amplifies noise. Although it is well understood that optical coding can increase performance at low light levels, little is known about the quantitative advantage of computational imaging in general settings. In this paper, we derive performance bounds for various computational imaging techniques and discuss the implications of these bounds for several real-world scenarios (e.g., illumination conditions, scene properties, and sensor noise characteristics). Our results show that computational imaging techniques do not provide a significant performance advantage when imaging with illumination brighter than typical daylight. These results can be readily used by practitioners to design the most suitable imaging system for the application at hand.

ETPL DIP-052 Anisotropic Interpolation of Sparse Generalized Image Samples
Abstract: Practical image-acquisition systems are often modeled as a continuous-domain prefilter followed by an ideal sampler, where generalized samples are obtained after convolution with the impulse response of the device. In this paper, our goal is to interpolate images from a given subset of such samples. We express our solution in the continuous domain, considering consistent resampling as a data-fidelity constraint. To make the problem well posed and ensure edge-preserving solutions, we develop an efficient anisotropic regularization approach based on an improved version of the edge-enhancing anisotropic diffusion equation.
Following variational principles, our reconstruction algorithm minimizes successive quadratic cost functionals. To ensure fast convergence, we solve the corresponding sequence of linear problems with multigrid iterations specifically tailored to their sparse structure. We conduct illustrative experiments and discuss the potential of our approach in terms of both algorithmic design and reconstruction quality. In particular, we present results that use as little as 2% of the image samples.

ETPL DIP-053 Clustered-Dot Halftoning With Direct Binary Search
Abstract: In this paper, we present a new algorithm for aperiodic clustered-dot halftoning based on direct binary search (DBS). The DBS optimization framework is modified for designing clustered-dot texture by using filters of different sizes in the initialization and update steps of the algorithm. Following an intuitive explanation of how clustered-dot texture results from this modified framework, we derive a closed-form cost metric which, when minimized, equivalently generates stochastic clustered-dot texture. An analysis of the cost metric and its influence on texture quality is presented, followed by a modification of the metric that reduces computational cost and makes it more suitable for screen design.

ETPL DIP-054 Task-Specific Image Partitioning
Abstract: Image partitioning is an important preprocessing step for many state-of-the-art algorithms used for high-level computer vision tasks. Typically, partitioning is conducted without regard to the task at hand. We propose a task-specific image partitioning framework that produces a region-based image representation leading to higher task performance than any task-oblivious partitioning framework and the few existing supervised partitioning frameworks. The proposed method partitions the image by correlation clustering, maximizing a linear discriminant function defined over a superpixel graph.
The parameters of the discriminant function that define task-specific similarity/dissimilarity among superpixels are estimated with a structured support vector machine (S-SVM) using task-specific training data. S-SVM learning leads to better generalization, while the construction of the superpixel graph used to define the discriminant function allows a rich set of features to be incorporated, improving discriminability and robustness. We evaluate the learned task-aware partitioning algorithms on three benchmark datasets. Results show that task-aware partitioning leads to better labeling performance than the partitions computed by state-of-the-art general-purpose and supervised partitioning algorithms. We believe the task-specific image partitioning paradigm is widely applicable to improving performance in high-level image understanding tasks.

ETPL DIP-055 Generalized Inverse-Approach Model for Spectral-Signal Recovery
Abstract: We study the transformation of a spectral signal to the response of a system as a linear mapping from a higher- to a lower-dimensional space in order to examine inverse-approach models more closely. The problem of spectral-signal recovery from the response of a transformation system is stated on the basis of the generalized inverse-approach theorem, which provides a modular model for generating a spectral signal from a given response value. The controlling criteria, including the robustness of the inverse model to perturbations of the response caused by noise and the condition number for matrix inversion, are proposed together with the mean square error to create an efficient model for spectral-signal recovery. Spectral-reflectance recovery and color correction of natural surface colors are numerically investigated to appraise different illuminant-observer transformation matrices based on the proposed controlling criteria, both in the absence and in the presence of noise.
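The generalized-inverse recovery described above can be sketched in a few lines: given a wide transformation matrix M (response = M · spectrum), the minimum-norm estimate is the Moore-Penrose pseudoinverse applied to the response, and a Tikhonov term can be added for robustness to noise. This is a minimal numpy sketch with a synthetic M; it illustrates the idea only, not the paper's specific illuminant-observer matrices or criteria.

```python
import numpy as np

def recover_spectrum(M, r, reg=0.0):
    """Recover a spectral signal s from response r = M @ s, where M maps a
    high-dimensional spectrum to a low-dimensional response (m < n).
    reg == 0 gives the minimum-norm generalized-inverse solution;
    reg > 0 adds Tikhonov regularization for robustness to noisy responses."""
    if reg == 0.0:
        return np.linalg.pinv(M) @ r                      # Moore-Penrose inverse
    n = M.shape[1]
    return np.linalg.solve(M.T @ M + reg * np.eye(n), M.T @ r)
```

The condition number of M (np.linalg.cond(M)) indicates how strongly response perturbations are amplified, which is one of the controlling criteria mentioned in the abstract.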
ETPL DIP-056 Spatio-Temporal Auxiliary Particle Filtering With l1-Norm-Based Appearance Model Learning for Robust Visual Tracking
Abstract: In this paper, we propose an efficient and accurate visual tracker equipped with a new particle filtering algorithm and a robust subspace-learning-based appearance model. The proposed tracker avoids the drifting problems caused by abrupt motion changes and severe appearance variations, which are well-known difficulties in visual tracking. The proposed algorithm is based on a type of auxiliary particle filtering that uses a spatio-temporal sliding window. Compared to conventional particle filtering algorithms, spatio-temporal auxiliary particle filtering is computationally efficient and is successfully applied to visual tracking. In addition, real-time robust principal component pursuit (RRPCP) with l1-norm optimization is utilized to obtain a new appearance-model learning block for reliable visual tracking, especially under occlusions of the object's appearance. The overall tracking framework, based on these dual ideas, is robust against occlusions and out-of-plane motions thanks to the proposed spatio-temporal filtering and the recursive form of RRPCP. The designed tracker has been evaluated on challenging video sequences, and the results confirm its advantages.

ETPL DIP-057 Manifold Regularized Multitask Learning for Semi-Supervised Multilabel Image Classification
Abstract: It is a significant challenge to classify images with multiple labels using only a small number of labeled samples. One option is to learn a binary classifier for each label and use manifold regularization to improve classification performance by exploring the underlying geometric structure of the data distribution. However, such an approach does not perform well in practice when images from multiple concepts are represented by high-dimensional visual features, because manifold regularization alone is insufficient to control the model complexity. In this paper, we propose a manifold regularized multitask learning (MRMTL) algorithm. MRMTL learns a discriminative subspace shared by multiple classification tasks by exploiting the common structure of these tasks.
It effectively controls model complexity because the tasks limit one another's search volume, and the manifold regularization ensures that the functions in the shared hypothesis space are smooth along the data manifold. We conduct extensive experiments on the PASCAL VOC'07 dataset (20 classes) and the MIR dataset (38 classes), comparing MRMTL with popular image classification algorithms. The results suggest that MRMTL is effective for image classification.

ETPL DIP-058 Linear Distance Coding for Image Classification
Abstract: The feature coding-pooling framework performs well in image classification tasks because it can generate discriminative and robust image representations. However, the unavoidable information loss incurred by feature quantization in the coding process and the undesired dependence of pooling on the image spatial layout may severely limit classification performance. In this paper, we propose a linear distance coding (LDC) method that captures the discriminative information lost in traditional coding methods while alleviating the dependence of pooling on the image spatial layout. The core of LDC lies in transforming the local features of an image into more discriminative distance vectors, using the robust image-to-class distance. These distance vectors are further encoded into sparse codes to capture the salient features of the image. LDC is shown, theoretically and experimentally, to be complementary to traditional coding methods, so their combination can achieve higher classification accuracy. We demonstrate the effectiveness of LDC on six datasets, two of each of three types (specific object, scene, and general object): Flower 102 and PFID 61, Scene 15 and Indoor 67, Caltech 101 and Caltech 256. The results show that our method generally outperforms traditional coding methods and achieves or approaches state-of-the-art performance on these datasets.
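The core transformation behind LDC, mapping a local feature to a vector of image-to-class distances, can be sketched directly. Below, each entry of the distance vector is the distance from the feature to its nearest stored feature of that class; this nearest-neighbor form is an assumed simplification for illustration, not the paper's exact robust distance.

```python
import numpy as np

def distance_vectors(features, class_feats):
    """Map each local feature (row of `features`) to a distance vector whose
    entry c is the distance to the nearest stored feature of class c."""
    out = np.empty((features.shape[0], len(class_feats)))
    for c, F in enumerate(class_feats):
        # pairwise squared distances to every stored feature of class c
        d2 = ((features[:, None, :] - F[None, :, :]) ** 2).sum(-1)
        out[:, c] = np.sqrt(d2.min(axis=1))
    return out
```

The resulting distance vectors can then be sparse-coded and pooled like any other local descriptor, which is how the abstract describes combining LDC with traditional coding.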
ETPL DIP-059 What Are We Tracking: A Unified Approach of Tracking and Recognition
Abstract: Tracking is essentially a matching problem. While traditional tracking methods mostly focus on low-level image correspondences between frames, we argue that high-level semantic correspondences are indispensable for making tracking more reliable. Based on this, we propose a unified approach to low-level object tracking and high-level recognition for single-object tracking, in which the target category is actively recognized during tracking. High-level offline models corresponding to the recognized category are then adaptively selected and combined with low-level online tracking models to achieve better tracking performance. Extensive experimental results show that our approach outperforms state-of-the-art online models in many challenging tracking scenarios such as drastic view change, scale change, background clutter, and morphable objects.

ETPL DIP-060 Unsupervised Amplitude and Texture Classification of SAR Images With Multinomial Latent Model
Abstract: In this paper, we combine amplitude and texture statistics of synthetic aperture radar images for model-based classification. In a finite mixture model, we bring together Nakagami densities to model the class amplitudes and a 2-D auto-regressive texture model with t-distributed regression error to model the textures of the classes. A non-stationary multinomial logistic latent class-label model is used as a mixture density to obtain spatially smooth class segments. The classification expectation-maximization algorithm is performed to estimate the class parameters and classify the pixels. We resort to the integrated classification likelihood criterion to determine the number of classes in the model. We present classification results for land covers in both supervised and unsupervised settings on TerraSAR-X as well as COSMO-SkyMed data.

ETPL DIP-061 Fuzzy C-Means Clustering With Local Information and Kernel Metric for Image Segmentation
Abstract: In this paper, we present an improved fuzzy C-means (FCM) algorithm for image segmentation that introduces a tradeoff weighted fuzzy factor and a kernel metric.
The tradeoff weighted fuzzy factor depends simultaneously on the spatial distance of all neighboring pixels and on their gray-level difference. Using this factor, the new algorithm can accurately estimate the damping extent of neighboring pixels. To further enhance robustness to noise and outliers, we introduce a kernel distance measure into the objective function. The new algorithm adaptively determines the kernel parameter using a fast bandwidth selection rule based on the distance variance of all data points in the collection. Furthermore, the tradeoff weighted fuzzy factor and the kernel distance measure are both parameter free. Experimental results on synthetic and real images show that the new algorithm is effective and efficient, and is relatively independent of the type of noise.

ETPL DIP-062 Rate-Distortion Optimized Rate Control for Depth Map-Based 3-D Video Coding
Abstract: In this paper, a novel rate control scheme with optimized bit allocation for 3-D video coding is proposed. First, we investigate the R-D characteristics of the texture and depth map of the coded view, as well as the quality dependency between the virtual view and the coded view. Second, an optimal bit allocation scheme is developed to allocate target bits for both the texture and depth maps of different views. Meanwhile, a simplified model-parameter estimation scheme is adopted to speed up the coding process. Finally, experimental results on various 3-D video sequences demonstrate that the proposed algorithm achieves excellent R-D efficiency and bit-rate accuracy compared to benchmark algorithms.

ETPL DIP-063 Performance Evaluation Methodology for Historical Document Image Binarization
Abstract: Document image binarization is of great importance in the document image analysis and recognition pipeline, since it affects the subsequent stages of the recognition process.
The evaluation of a binarization method aids in studying its algorithmic behavior and verifying its effectiveness by providing qualitative and quantitative indications of its performance. This paper presents a pixel-based binarization evaluation methodology for historical handwritten and machine-printed document images. In the proposed evaluation scheme, the recall and precision measures are modified using a weighting scheme that diminishes potential evaluation bias. Additional performance metrics of the proposed scheme consist of the percentage rates of broken and missed text, false alarms, background noise, character enlargement, and merging. Several experiments conducted in comparison with other pixel-based evaluation measures demonstrate the validity of the proposed scheme.

ETPL DIP-064 Video Quality Pooling Adaptive to Perceptual Distortion Severity
Abstract: It is generally recognized that severe video distortions that are transient in space and/or time have a large effect on overall perceived video quality. To understand this phenomenon, we study the distribution of spatio-temporally local quality scores obtained from several video quality assessment (VQA) algorithms on videos suffering from compression and lossy transmission over communication channels. We propose a content-adaptive spatial and temporal pooling strategy based on the observed distribution. Our method adaptively emphasizes the "worst" scores along both the spatial and temporal dimensions of a video sequence and also considers the perceptual effect of large-area cohesive motion flow such as egomotion. We demonstrate the efficacy of the method by testing it with three different VQA algorithms on the LIVE Video Quality database and the EPFL-PoliMI video quality database.

ETPL DIP-065 Modified Gradient Search for Level Set Based Image Segmentation
Abstract: Level set methods are a popular way to solve the image segmentation problem. The solution contour is found by solving an optimization problem in which a cost functional is minimized. Gradient descent methods are often used for this optimization since they are easy to implement and applicable to general nonconvex functionals. They are, however, sensitive to local minima and often display slow convergence. Traditionally, cost functionals have been modified to avoid these problems.
In this paper, we instead propose two modified gradient descent methods, one using a momentum term and one based on resilient propagation, both commonly used in the machine learning community. In a series of 2-D/3-D experiments using real and synthetic data with ground truth, the modifications are shown to reduce sensitivity to local optima and to increase the convergence rate. Parameter sensitivity is also investigated. The proposed methods are very simple modifications of the basic method and are directly compatible with any type of level set implementation. Downloadable reference code with examples is available online.

ETPL DIP-066 Maximum Margin Correlation Filter: A New Approach for Localization and Classification
Abstract: Support vector machine (SVM) classifiers are popular in many computer vision tasks. In most of them, the SVM classifier assumes that the object to be classified is centered in the query image, which might not always hold, e.g., when locating and classifying a particular class of vehicles in a large scene. In this paper, we introduce a new classifier called the Maximum Margin Correlation Filter (MMCF), which, while exhibiting the good generalization capabilities of SVM classifiers, is also capable of localizing objects of interest, thereby avoiding the need for the image centering usually required by SVM classifiers. In other words, MMCF can simultaneously localize and classify objects of interest. We test the efficacy of the proposed classifier on three tasks: vehicle recognition, eye localization, and face classification. We demonstrate that MMCF outperforms SVM classifiers as well as well-known correlation filters.

ETPL DIP-067 Adaptive Fingerprint Image Enhancement With Emphasis on Preprocessing of Data
Abstract: This article proposes several improvements to an adaptive fingerprint enhancement method based on contextual filtering.
The term adaptive implies that the parameters of the method are automatically adjusted based on the input fingerprint image. Five processing blocks comprise the adaptive fingerprint enhancement method, four of which are updated in our proposed system; hence, the overall system is novel. The four updated processing blocks are: 1) preprocessing; 2) global analysis; 3) local analysis; and 4) matched filtering. In the preprocessing and local analysis blocks, a nonlinear dynamic range adjustment method is used. In the global analysis and matched filtering blocks, different forms of order-statistical filters are applied. These processing blocks yield an improved, new adaptive fingerprint image processing method. The performance of the updated processing blocks is presented in the evaluation part of this paper, where the algorithm is evaluated against the NIST-developed NBIS software for fingerprint recognition on the FVC databases.

ETPL DIP-068 Objective Quality Assessment of Tone-Mapped Images
Abstract: Tone-mapping operators (TMOs), which convert high dynamic range (HDR) images to low dynamic range (LDR) images, provide practically useful tools for visualizing HDR images on standard LDR displays. Different TMOs create different tone-mapped images, and a natural question is which one has the best quality. Without an appropriate quality measure, different TMOs cannot be compared and further improvement is directionless. Subjective rating may be a reliable evaluation method, but it is expensive and time consuming and, more importantly, difficult to embed into optimization frameworks. Here we propose an objective quality assessment algorithm for tone-mapped images that combines: 1) a multiscale signal fidelity measure based on a modified structural similarity index and 2) a naturalness measure based on the intensity statistics of natural images. Validation using independent subject-rated image databases shows good correlation between subjective ranking scores and the proposed tone-mapped image quality index (TMQI).
Furthermore, we demonstrate extended applications of TMQI with two examples: parameter tuning for TMOs and adaptive fusion of multiple tone-mapped images.

ETPL DIP-069 Catching a Rat by Its Edglets
Abstract: Computer vision is a noninvasive method for monitoring laboratory animals. In this article, we propose a robust tracking method capable of extracting a rodent from a frame under uncontrolled, normal laboratory conditions. The method consists of two steps. First, a sliding window combines three features to coarsely track the animal. Then, the edglets of the rodent are used to adjust the tracked region to the animal's boundary. The method achieves an average tracking error smaller than that of a representative state-of-the-art method.

ETPL DIP-070 Juxtaposed Color Halftoning Relying on Discrete Lines
Abstract: Most halftoning techniques allow screen dots to overlap, relying on the assumption that the inks are transparent, i.e., that they do not scatter a significant portion of the light back to the air. However, many special-effect inks, such as metallic, iridescent, or pigmented inks, are not transparent. To create halftone images, halftone dots formed by such inks should be juxtaposed, i.e., printed side by side. We propose an efficient juxtaposed color halftoning technique for placing any desired number of colorant layers side by side without overlap. The method uses a monochrome library of screen elements made of discrete lines with rational thicknesses. Discrete-line juxtaposed color halftoning is performed efficiently by multiple accesses to the screen element library.

ETPL DIP-071 Image Noise Level Estimation by Principal Component Analysis
Abstract: The problem of blind noise level estimation arises in many image processing applications, such as denoising, compression, and segmentation. In this paper, we propose a new noise level estimation method based on principal component analysis of image blocks.
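The PCA idea is compact enough to sketch directly: collect overlapping image blocks, form their covariance matrix, and read the noise variance off the smallest eigenvalue (the directions the clean image content does not occupy). This is a minimal numpy illustration of that principle, not the paper's full estimator; block size and the dense patch extraction are assumed choices.

```python
import numpy as np

def estimate_noise_std(img, block=5):
    """Estimate the std of i.i.d. Gaussian noise as the square root of the
    smallest eigenvalue of the image-block covariance matrix."""
    h, w = img.shape
    # all overlapping block x block patches, flattened to row vectors
    patches = np.stack([
        img[i:i + block, j:j + block].ravel()
        for i in range(h - block + 1)
        for j in range(w - block + 1)
    ])
    cov = np.cov(patches, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)        # ascending order
    return float(np.sqrt(max(eigvals[0], 0.0)))
```

Because the estimate uses only the weakest principal component, it does not require homogeneous (flat) regions in the image, which matches the claim in the abstract about texture-only images.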
We show that the noise variance can be estimated as the smallest eigenvalue of the image-block covariance matrix. Compared with 13 existing methods, the proposed approach offers a good compromise between speed and accuracy: it is at least 15 times faster than methods of similar accuracy, and at least two times more accurate than the other methods. Our method does not assume the existence of homogeneous areas in the input image and, hence, can successfully process images containing only textures.

ETPL DIP-072 Nonlocal Image Restoration With Bilateral Variance Estimation: A Low-Rank Approach
Abstract: Simultaneous sparse coding (SSC), or nonlocal image representation, has shown great potential in various low-level vision tasks, leading to several state-of-the-art image restoration techniques, including BM3D and LSSC. However, it still lacks a physically plausible explanation of why SSC is a better model than conventional sparse coding for the class of natural images. Meanwhile, the sparsity optimization problem, especially when tangled with dictionary learning, is computationally difficult to solve. In this paper, we take a low-rank approach toward SSC and provide a conceptually simple interpretation from a bilateral variance estimation perspective: the singular-value decomposition of similar packed patches can be viewed as pooling both local and nonlocal information for estimating signal variances. This perspective inspires a new class of image restoration algorithms called spatially adaptive iterative singular-value thresholding (SAIST). For noisy data, SAIST generalizes the celebrated BayesShrink from local to nonlocal models; for incomplete data, SAIST extends previous deterministic-annealing-based solutions to sparsity optimization by incorporating the idea of dictionary learning.
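The singular-value thresholding at the heart of this low-rank view can be sketched in a few lines: stack similar patches as columns, shrink the singular values, and reconstruct. This shows one shrinkage step only; the full SAIST algorithm iterates and adapts the threshold spatially, and the fixed soft threshold here is an assumed simplification.

```python
import numpy as np

def sv_threshold(patch_matrix, tau):
    """One singular-value soft-thresholding step: a low-rank estimate of a
    group of similar patches stacked as the columns of `patch_matrix`."""
    U, s, Vt = np.linalg.svd(patch_matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-shrink the singular values
    return (U * s_shrunk) @ Vt            # reconstruct the denoised group
```

Because similar patches are nearly identical, the clean group is close to low rank, so shrinking small singular values suppresses noise while preserving the shared structure.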
In addition to conceptual simplicity and computational efficiency, SAIST achieves highly competitive (often better) objective performance compared to several state-of-the-art methods in image denoising and completion experiments. Our subjective quality results compare favorably with those obtained by existing techniques, especially at high noise levels and with a large amount of missing data.

ETPL DIP-073 Variational Approach for the Fusion of Exposure Bracketed Pairs
Abstract: When taking pictures of a dark scene with artificial lighting, ambient light is not sufficient for most cameras to capture both accurate color and detail. The exposure-bracketing feature available in many camera models lets the user obtain a series of pictures taken in rapid succession with different exposure times, the implicit idea being that the user picks the best image of the set. But in many cases none of these images is good enough; in general, good brightness and color information are retained at longer exposure settings, whereas sharp details are obtained at shorter ones. In this paper, we propose a variational method for automatically combining an exposure-bracketed pair of images into a single picture that reflects the desired properties of each. We introduce an energy functional consisting of two terms: one measuring the difference in edge information from the short-exposure image, and the other measuring the local color difference from a warped version of the long-exposure image. The method handles camera and subject motion as well as noise, and the results compare favorably with the state of the art.

ETPL DIP-074 Image Denoising With Dominant Sets by a Coalitional Game Approach
Abstract: Dominant sets are a new graph-partition method for pairwise data clustering proposed by Pavan and Pelillo.
We address the problem of dominant sets with a coalitional game model, in which each data point is treated as a player and similar data points are encouraged to group together for cooperation. We propose betrayal and hermit rules to describe the cooperative behaviors among the players. After applying these rules, an optimal and stable graph partition emerges, and no player in the partition will change its group. For computational feasibility, we design an approximate algorithm for finding a dominant set of mutually similar players and then apply the algorithm to applications such as image denoising. In image denoising, every pixel is treated as a player that seeks similar partners according to its patch appearance in its local neighborhood. By averaging the noisy effects with the