This document summarizes a research paper that proposes a custom spatiotemporal fusion method for video saliency detection. The method involves taking video frames and computing colour and motion saliency. It then performs temporal fusion and pixel saliency fusion. Colour information then guides a spatiotemporal diffusion process using a permutation matrix. The results show the proposed method achieves overall best performance compared to other state-of-the-art saliency detection methods on a publicly available dataset, based on five global saliency evaluation metrics.
This document summarizes an intelligent object inpainting approach for video repairing using MATLAB. It proposes a new object-based method for video inpainting that can maintain spatial consistency and temporal motion continuity simultaneously. The method involves three steps: 1) constructing a virtual contour of occluded objects, 2) selecting and mapping key postures from available frames, and 3) generating synthetic postures if a matching sequence cannot be found. It aims to improve on previous patch-based methods by handling videos captured by stationary or mobile cameras and compensating for insufficient available postures. The results demonstrate better performance than 3D-based methods at reducing computational complexity while maintaining accuracy.
A Novel Blind SR Method to Improve the Spatial Resolution of Real Life Video ... - IRJET Journal
This document proposes a novel blind super resolution method to improve the spatial resolution of real-life video sequences. The key aspects of the proposed method are:
1) It estimates blur without knowing the point spread function or noise statistics using a non-uniform interpolation super resolution method and multi-scale processing.
2) It uses a cost function with fidelity and regularization terms of a Huber-Markov random field to preserve edges and fine details in the reconstructed high resolution frames.
3) It performs masking to suppress artifacts from inaccurate motions, adaptively weighting the fidelity term at each iteration for faster convergence.
The method is tested on real-life videos with complex motions, objects, and brightness changes, showing
Survey paper on image compression techniques - IRJET Journal
This document summarizes and compares several popular image compression techniques: wavelet compression, JPEG/DCT compression, vector quantization (VQ), fractal compression, and genetic algorithm compression. It finds that all techniques perform satisfactorily at 0.5 bits per pixel, but for very low bit rates like 0.25 bpp, wavelet compression techniques like EZW perform best in terms of compression ratio and quality. Specifically, EZW and JPEG are more practical than others at low bit rates. The document also notes advantages and disadvantages of each technique and concludes hybrid approaches may achieve even higher compression ratios while maintaining image quality.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
This document presents a novel approach for jointly optimizing spatial prediction and transform coding in video compression. It aims to improve performance and reduce complexity compared to existing techniques. The proposed method uses singular value decomposition (SVD) to compress images. SVD decomposes an image matrix into three matrices, allowing the image to be approximated using only a few singular values. This achieves compression by removing redundant information. The document outlines the SVD approach for image compression and measures compression performance using compression ratio and mean squared error between the original and compressed images. It then discusses trends in image and video coding, including combining natural and synthetic content. Finally, it provides a block diagram of the proposed system and compares its compression performance to existing discrete cosine transform-
In the current scenario, compression of video files is in high demand. Color video compression has become a significant technology for reducing memory space and transmission time. Video compression using the fractal technique is based on the self-similarity concept, comparing range blocks with domain blocks. However, its computational complexity is very high. In this paper we present a hybrid video compression technique that compresses Audio/Video Interleaved files and overcomes the problem of computational complexity. We implement the Discrete Wavelet Transform and a hybrid fractal HV partition technique using Particle Swarm Optimization (called mapping of PSO) for compression of videos. The analysis demonstrates that the hybrid technique gives a very good speed-up in compressing video and achieves a good Peak Signal to Noise Ratio.
An improved image compression algorithm based on daubechies wavelets with ar... - Alexander Decker
This document summarizes an academic article that proposes a new image compression algorithm using Daubechies wavelets and arithmetic coding. It first discusses existing image compression techniques and their limitations. It then describes the proposed algorithm, which applies Daubechies wavelet transform followed by 2D Walsh wavelet transform on image blocks and arithmetic coding. Results show the proposed method achieves higher compression ratios and PSNR values than existing algorithms like EZW and SPIHT. Future work aims to improve results by exploring different wavelets and compression techniques.
This document describes an image fusion method using pyramidal decomposition. It proposes extracting fine details from input images using guided filtering and fusing the base layers of images across multiple exposures or focal points using a multiresolution pyramid approach. A weight map is generated considering exposure, contrast, and saturation to guide the fusion of base layers. The fused base layer is then combined with extracted fine details to produce a detail-enhanced fused image. The goal is to preserve details in both very dark and extremely bright regions of the input images. It is argued that this method can effectively fuse images from different exposures or focal points without introducing artifacts.
A systematic image compression in the combination of linear vector quantisati... - eSAT Publishing House
1) The document presents a method for image compression that combines linear vector quantization and discrete wavelet transform.
2) Linear vector quantization is used to generate codebooks and encode image blocks, achieving better PSNR and MSE than self-organizing maps.
3) The encoded blocks are then subjected to discrete wavelet transform. Low-low subbands are stored for reconstruction while other subbands are discarded.
4) Experimental results show the proposed method achieves higher PSNR and lower MSE than existing techniques, preserving both texture and edge information.
A New Approach for video denoising and enhancement using optical flow Estimation - IRJET Journal
This document proposes a new approach for video denoising and enhancement using optical flow estimation. It discusses using motion compensation via optical flow estimation along with principal component analysis (PCA) to provide fine video details. However, PCA has limitations in fully eliminating noise. The proposed method aims to replace PCA with wavelet transformation, which provides multi-resolution analysis and sparsity advantages for better denoising results in terms of PSNR and RMSE compared to PCA. It involves estimating optical flow between frames for motion compensation before applying wavelet transformation for noise removal and video reconstruction.
Efficient 3D stereo vision stabilization for multi-camera viewpoints - journalBEEI
In this paper, an algorithm is developed for 3D stereo vision to improve the image stabilization process for multi-camera viewpoints. Accurate, unique matching key-points are found using the Harris-Laplace corner detection method under different photometric changes and geometric transformations in the images. The connectivity of correct matching pairs is then improved by minimizing the global error using a spanning tree algorithm. The tree algorithm helps to stabilize randomly positioned camera viewpoints in linear order. The unique matching key-points are calculated only once with this method. The calculated planar transformation is then applied for real-time video rendering. The proposed algorithm can process more than 200 camera viewpoints within two seconds.
Improving the iterative back projection estimation through Lorentzian sharp i... - IJECEIAES
This document summarizes a study that proposed an enhancement technique for the iterative back projection (IBP) super resolution estimation method. The study aimed to improve the IBP method by using a Lorentzian error function with a sharp infinite symmetrical filter (SISEF) to provide edge enhancement. The IBP method suffers from jaggy and ringing artifacts due to the iterative reconstruction process and lack of edge guidance during back projection. The proposed method combines IBP with the Lorentzian SISEF filter to produce a higher resolution output image with finer edge details while increasing robustness to noise and reducing ringing artifacts. The SISEF filter provides precise edge information to guide the back projection process, and the Lorentzian error norm suppresses
Video saliency-recognition by applying custom spatio temporal fusion technique - IAESIJAI
Video saliency detection is a major growing field with relatively few contributions. The usual approach today is frame-wise saliency detection, which leads to several complications, including an incoherent pixel-based saliency map that limits its usefulness. This paper provides a novel solution to saliency detection and mapping with a custom spatio-temporal fusion method that uses frame-wise overall motion colour saliency along with pixel-based consistent spatio-temporal diffusion for temporal uniformity. The proposed method section discusses how the video is fragmented into groups of frames and how each frame undergoes diffusion and integration in a temporal fashion so that the colour saliency mapping can be computed. The inter-group frames are then used to form the pixel-based saliency fusion, after which the features, that is, the fusion of pixel saliency and colour information, guide the diffusion of the spatio-temporal saliency. The result has been tested with five publicly available global saliency evaluation metrics, and the conclusion is that the proposed algorithm performs better than several state-of-the-art saliency detection methods, improving accuracy by a good margin. The results demonstrate the method's robustness, reliability, versatility and accuracy.
Robust foreground modelling to segment and detect multiple moving objects in ... - IJECEIAES
This document summarizes a research paper that proposes a robust foreground modeling method to segment and detect multiple moving objects in videos. The proposed method uses a running average technique to model the background and subtract it from video frames to detect foreground objects. Morphological operations like dilation and erosion are applied to reduce noise and merge connected regions. Convex hull processing is also used to define object boundaries more clearly. The method was tested on standard video datasets and achieved better performance than other techniques in segmenting objects under various challenging conditions like illumination changes and occlusion. Experimental results demonstrated high precision, recall and specificity based on comparisons with ground truth data.
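The running average background model mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the learning rate `alpha`, the difference `threshold` and the toy 3x3 frames are all assumptions, and the morphological clean-up and convex hull steps are omitted.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the model: B <- (1 - alpha) * B + alpha * F."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Mark pixels whose absolute difference from the background exceeds the threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy grayscale frames (assumed data): a static scene, then a bright object at the centre.
static = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
moving = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]

# Learn the background from a few static frames.
background = [[float(v) for v in row] for row in static]
for _ in range(5):
    background = update_background(background, static)

# Detect the object first, then fold the new frame into the model.
mask = foreground_mask(background, moving)
background = update_background(background, moving)
```

A small `alpha` makes the model adapt slowly, so a briefly visible object is not absorbed into the background; a larger value tracks illumination changes faster at the cost of ghosting.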
This paper describes a novel system for vectorizing 2D raster cartoons. The output videos are resolution independent and smaller in file size. As a first step, the input video is segmented into scenes; all subsequent processing is done for each scene separately. Every scene contains foreground and background objects, so foreground-background classification is performed in each scene. Background details can be occluded by foreground objects, but when a foreground object moves away from its previous position, the occluded details are exposed in a later frame; that frame can be used to fill the occluded area and generate a static background. The classified foreground objects are identified and their motion is tracked, which requires simple user assistance; the animation of the foreground objects is generated from those motion details. The static background and foreground objects are segmented using K-means clustering, and each cluster is vectorized using potrace. The vector video is regenerated from the vectorized background and the foreground objects' animation paths.
This document discusses a proposed approach for multi-focus image fusion using a discrete cosine wavelet sharpness criterion. Multi-focus image fusion combines information from multiple images of the same scene to produce an "all-in-focus" image. The proposed approach uses a discrete cosine transform to calculate sharpness values for sub-blocks of the input images and selects the sharpest sub-blocks to include in the fused image. Experimental results on images of a clock, bottle, and book show the discrete cosine wavelet criterion produces fused images with higher quality than a bilateral gradient-based sharpness criterion, as measured by mutual information metrics.
International Journal of Engineering Research and Development - IJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
This document presents a new method for image compression called Haar Wavelet Based Joint Compression Method Using Adaptive Fractal Image Compression (DWT+AFIC). It combines discrete wavelet transform with an existing adaptive fractal image compression technique to improve compression ratio and reconstructed image quality compared to previous fractal image compression methods. The document introduces fractal image compression and its limitations, describes the proposed DWT+AFIC method and 5 other compression techniques, provides simulation results on test images showing DWT+AFIC achieves higher peak signal to noise ratios and compression ratios than other methods, and concludes DWT+AFIC decreases encoding time while increasing compression ratio and maintaining reconstructed image quality.
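To make the wavelet stage concrete, here is a minimal sketch of a one-level orthonormal Haar transform on a single row of pixels. It illustrates only the generic Haar step; the adaptive fractal stage (AFIC) of the summarized method is not shown, and the function names and sample row are assumptions.

```python
import math


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original sequence from approximation and detail coefficients."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


row = [8.0, 8.0, 6.0, 2.0]  # assumed sample pixel row
approx, detail = haar_forward(row)
restored = haar_inverse(approx, detail)
```

Smooth regions produce detail coefficients at or near zero (the first pair above), which is what makes the transform useful before a quantization or coding stage.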
Development of depth map from stereo images using sum of absolute differences... - nooriasukmaningtyas
This article proposes a framework for depth map reconstruction using stereo images. The depth map provides important information that is commonly used in applications such as autonomous vehicle navigation, drone navigation and 3D surface reconstruction. To develop an accurate depth map, the framework must be robust against challenging regions of low texture, plain color and repetitive pattern in the input stereo images. Building the map requires several stages: matching cost calculation, cost aggregation, optimization and refinement. Hence, this work develops a framework with the sum of absolute differences (SAD) and a combination of two edge-preserving filters to increase robustness against the challenging regions. The SAD is computed using a block matching technique to increase the efficiency of the matching process in low-texture and plain-color regions. Moreover, the two edge-preserving filters increase the accuracy in repetitive-pattern regions. The results, obtained on the Middlebury standard dataset, show that the proposed method is accurate and capable of handling the challenging regions. The framework is also efficient and can be applied to 3D surface reconstruction. Moreover, this work is highly competitive with previously available methods.
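The matching cost stage can be illustrated with a small SAD block matching sketch. It is an assumption-laden toy, not the article's framework: the function names, window size, disparity range and synthetic image pair are invented for illustration, and the cost aggregation, optimization and edge-preserving refinement stages are left out.

```python
def sad(left, right, row, col, disparity, half):
    """Sum of absolute differences between a window in the left image and the
    window shifted `disparity` pixels to the left in the right image."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc] - right[row + dr][col + dc - disparity])
    return total


def disparity_map(left, right, max_disparity, half=1):
    """Pick, per pixel, the disparity with the lowest SAD cost (winner takes all)."""
    height, width = len(left), len(left[0])
    result = [[0] * width for _ in range(height)]
    for row in range(half, height - half):
        for col in range(half + max_disparity, width - half):
            costs = [sad(left, right, row, col, d, half) for d in range(max_disparity + 1)]
            result[row][col] = costs.index(min(costs))
    return result


# Synthetic pair (assumed data): the left view is the right view shifted by 2 pixels.
width, height, true_shift = 10, 5, 2
right = [[(37 * c * c + 11 * r) % 101 for c in range(width)] for r in range(height)]
left = [[right[r][c - true_shift] if c >= true_shift else 0 for c in range(width)]
        for r in range(height)]

disparities = disparity_map(left, right, max_disparity=3)
```

On this textured toy input every interior pixel recovers the shift of 2. On low-texture or repetitive regions several disparities would tie or nearly tie, which is the weakness the later filtering stages are meant to address.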
Survey on Various Image Denoising Techniques - IRJET Journal
This document summarizes several techniques for image denoising. It begins by defining image noise and explaining how noise degrades image quality. It then reviews 7 different published techniques for image denoising, summarizing the key aspects of each technique. These include methods using local spectral component decomposition, SVD-based denoising, patch-based near-optimal denoising, LPG-PCA denoising, trivariate shrinkage filtering, SURE-LET denoising, and 3D transform-domain collaborative filtering. The document concludes that LSCD provides better denoising results according to PSNR analysis and provides an overview of the state-of-the-art in image denoising techniques.
IRJET - Underwater Image Enhancement using PCNN and NSCT Fusion - IRJET Journal
This document discusses techniques for enhancing underwater images that have been degraded due to scattering and absorption in the water medium. It proposes a new method for color image fusion using Non-Subsampled Contourlet Transform (NSCT) and Pulse Coupled Neural Network (PCNN). NSCT is used to decompose the image into sub-bands, while PCNN is used to fuse the high frequency sub-band coefficients. The proposed method is shown to outperform other fusion methods in objective quality assessment metrics. Various other underwater image enhancement techniques are also discussed, including wavelength compensation, multi-band fusion, image mode filtering, and approaches using neural networks like convolutional neural networks.
Propose shot boundary detection methods by using visual hybrid features - IJECEIAES
Shot boundary detection is a fundamental technique that plays an important role in a variety of video processing tasks such as summarization, retrieval, and object tracking. The technique involves segmenting a video sequence into shots, each of which is a sequence of interrelated temporal frames. This paper introduces two methods for detecting cut shot boundaries by employing visual hybrid features and compares them. Comparing them improves shot detection performance by selecting the strongest features. The first method uses hybrid features that include a statistical histogram of the hue-saturation-value color space and the grey level co-occurrence matrix. The second method uses hybrid features that include the discrete wavelet transform and the grey level co-occurrence matrix. The frame size was reduced, which had the advantage of reducing computation time. Local adaptive thresholds were also used, which enhanced the methods' performance. The tested videos were obtained from the BBC archive, which included BBC Learning English and BBC News. Experimental results indicate that the second method achieved an accuracy of 97.618%, higher than the first method and other methods under the evaluation metrics used.
Design and implementation of video tracking system based on camera field of view - sipij
The basic idea of this paper is to design and implement a video tracking system based on the Camera Field of
View (CFOV). Otsu's method was used to detect targets such as vehicles and people. Whereas most
algorithms take a long time to execute this process, an algorithm was developed that achieves it in
little time. Histogram projection was used in both directions to detect the target within the search region;
it is robust to various light conditions in Charge Coupled Device (CCD) camera images and saves
computation time.
Our algorithm is based on background subtraction, and a normalized cross-correlation operation over a series
of sequential sub-images estimates the motion vector. The camera field of view (CFOV) was determined and
calibrated to find the relation between real distance and image distance. The system was tested by
measuring the real position of an object in the laboratory and comparing it with the computed one.
The results are promising for developing the system further in future.
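Otsu's method, which the paper uses for target detection, chooses the gray level that maximizes the between-class variance of the histogram. The sketch below is a generic version of that idea; the function name and the toy pixel values are assumptions, and the histogram projection and cross-correlation steps are not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the level t that maximizes between-class variance,
    where values <= t form one class and values > t the other."""
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(levels):
        weight_low += histogram[level]
        sum_low += level * histogram[level]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one class is empty, so the variance is undefined here
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


# Toy data (assumed): a dark background and a brighter target.
pixels = [20] * 50 + [22] * 50 + [200] * 30 + [210] * 30
threshold = otsu_threshold(pixels)
mask = [1 if value > threshold else 0 for value in pixels]
```

Because the threshold is derived from the histogram of each image, it adapts to overall brightness, which fits the robustness to lighting conditions described above.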
This paper presents a simple technique to perform inverse halftoning using a deep learning framework. The proposed method inherits the usability and superiority of deep residual learning to reconstruct the halftone image into a continuous-tone representation. It involves a series of convolution operations and activation functions in the form of residual block elements. We investigate the use of pre-activation and standard activation functions in each residual block. The experimental section validates the proposed method's ability to effectively reconstruct the halftone image. This section also shows the proposed method's superiority in the inverse halftoning task over handcrafted feature schemes and earlier deep learning approaches. The proposed method achieves 30.37 dB and 0.9481 on the average peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) scores, respectively. This is an improvement of around 1.67 dB and 0.0481, respectively, over the closest competing scheme.
INFORMATION SATURATION IN MULTISPECTRAL PIXEL LEVEL IMAGE FUSION - IJCI JOURNAL
The availability of imaging sensors operating in multiple spectral bands has led to the requirement of
image fusion algorithms that combine the images from these sensors in an efficient way to give an
image that is more informative as well as perceptible to the human eye. Multispectral image fusion is the
process of combining images from different spectral bands that are optically acquired. In this paper, we
used a pixel-level image fusion based on principal component analysis that combines satellite images of the
same scene from seven different spectral bands. The principal component analysis technique is used
because it is the best method for grayscale image fusion and gives better results. The main aim of the
PCA technique is to reduce a large set of variables into a small set which still contains most of the
information that was present in the large set. The paper compares different parameters, namely entropy,
standard deviation, correlation coefficient etc., for different numbers of fused images, from two to seven.
Finally, the paper shows that the information content in an image gets saturated after fusing four images.
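A common form of PCA-based pixel-level fusion weights each band by its component in the principal eigenvector of the band covariance matrix. The sketch below does this for two bands only, using the closed-form eigenvector of a 2x2 covariance matrix. It is an illustration under stated assumptions rather than the paper's seven-band procedure: the function name, the sample pixel values and the use of absolute values to keep the weights non-negative are all choices made here.

```python
import math


def pca_fusion_weights(band_a, band_b):
    """Weights from the principal eigenvector of the 2x2 covariance matrix
    of two bands (flattened to 1-D pixel lists), normalized to sum to 1."""
    n = len(band_a)
    mean_a = sum(band_a) / n
    mean_b = sum(band_b) / n
    var_a = sum((a - mean_a) ** 2 for a in band_a) / (n - 1)
    var_b = sum((b - mean_b) ** 2 for b in band_b) / (n - 1)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(band_a, band_b)) / (n - 1)

    if cov == 0:
        # Uncorrelated bands: the principal axis is the higher-variance band.
        return (1.0, 0.0) if var_a >= var_b else (0.0, 1.0)

    half_trace = (var_a + var_b) / 2.0
    det = var_a * var_b - cov * cov
    largest = half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))

    # Eigenvector for the largest eigenvalue is (cov, largest - var_a).
    # Absolute values are an assumption made here to keep weights non-negative.
    v_a, v_b = abs(cov), abs(largest - var_a)
    return v_a / (v_a + v_b), v_b / (v_a + v_b)


# Assumed sample data: two flattened bands of the same four pixels.
band_a = [10.0, 20.0, 30.0, 40.0]
band_b = [12.0, 14.0, 16.0, 18.0]

w_a, w_b = pca_fusion_weights(band_a, band_b)
fused = [w_a * a + w_b * b for a, b in zip(band_a, band_b)]
```

The band with more variance receives the larger weight, so the fused image is dominated by the band carrying more contrast.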
IRJET-Underwater Image Enhancement by Wavelet Decomposition using FPGA - IRJET Journal
This document describes a method for enhancing underwater images using wavelet decomposition and fusion on an FPGA (field programmable gate array). Underwater images often have low contrast and visibility due to light scattering in water. The proposed method performs color correction and contrast enhancement on an input underwater image. It then decomposes the color-corrected and contrast-enhanced images into low and high frequency components using wavelet transforms. Image fusion is performed on the wavelet coefficients to combine the detailed information from both images. The fused image is reconstructed via inverse wavelet transform. Experimental results show the proposed fusion-based approach improves underwater image visibility. Implementing the algorithm on an FPGA provides benefits over general processors for computationally intensive image processing.
Internet data almost doubles every year. Multimedia communication needs
less storage space and fast transmission, so the large volume of video data has become
the reason for video compression. The aim of this paper is to achieve temporal compression
for three-dimensional (3D) videos using motion estimation-compensation and wavelets.
Instead of performing a two-dimensional (2D) motion search, as is common in conventional
video codecs, the use of a 3D motion search is proposed, which is able to better exploit
the temporal correlations of 3D content. This leads to more accurate motion prediction and
a smaller residual. The discrete wavelet transform (DWT) compression scheme has been
added for a better compression ratio. The DWT has a high energy compaction property and has
thus greatly impacted the field of compression. The quality parameters peak signal to noise ratio
(PSNR) and mean square error (MSE) have been calculated. The simulation results show
that the proposed work improves the PSNR over existing work.
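The two quality parameters named above follow the standard definitions: MSE is the mean of the squared pixel differences, and PSNR is 10 log10(peak^2 / MSE). A minimal sketch, assuming 8-bit images (peak value 255) and using invented sample pixel values:

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized 2-D images."""
    total, count = 0.0, 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    return total / count


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)


# Assumed sample data: a 2x2 original and its lossy reconstruction.
original = [[52, 55], [61, 59]]
reconstructed = [[50, 55], [61, 63]]

error = mse(original, reconstructed)       # (4 + 0 + 0 + 16) / 4 = 5.0
quality = psnr(original, reconstructed)    # 10 * log10(255^2 / 5)
```

A lower MSE always gives a higher PSNR, so the two metrics rank reconstructions identically; PSNR is reported because the logarithmic scale is easier to compare across codecs.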
Enhancement of Medical Images using Histogram Based Hybrid Technique - INFOGAIN PUBLICATION
Digital image processing is a very important area of research. A number of techniques are available for enhancing gray scale as well as color images, and they work very efficiently. Important techniques, namely Histogram Equalization, BBHE, RSWHE, RSWHE (recursion=2, gamma=No) and AGCWD (Recursion=0, gamma=0), have been used quite frequently for image enhancement. However, the present techniques have some shortcomings. The major one is that the brightness of the image deteriorates considerably during enhancement, so a technique was needed that enhances the image without reducing its brightness. To remove this shortcoming, a new hybrid technique, namely RSWHE+AGCWD (recursion=2, gamma=0 or 1), was proposed. The results of the proposed technique were compared with the existing techniques. With the proposed methodology, the brightness did not decrease during image enhancement, so the results and the technique were validated and accepted. Parameters such as PSNR, MSE and AMBE are used for performance evaluation and validation of the proposed technique against the existing techniques, which it outperforms.
Because of the rapid growth in technology breakthroughs, including
multimedia and cell phones, Telugu character recognition (TCR) has recently
become a popular study area. It is still necessary to construct automated and
intelligent online TCR models, even if many studies have focused on offline
TCR models. The Telugu character dataset construction and validation using
an Inception and ResNet-based model are presented. The collection of 645
letters in the dataset includes 18 Achus, 38 Hallus, 35 Othulu, 34×16
Guninthamulu, and 10 Ankelu. The proposed technique aims to efficiently
recognize and identify distinctive Telugu characters online. This model's main
pre-processing steps to achieve its goals include normalization, smoothing,
and interpolation. Improved recognition performance can be attained by using
stochastic gradient descent (SGD) to optimize the model's hyperparameters.
Scientific workload execution on a distributed computing platform such as a
cloud environment is time-consuming and expensive. The scientific workload
has task dependencies with different service level agreement (SLA)
prerequisites at different levels. Existing workload scheduling (WS) designs
are not efficient in assuring SLA at the task level. They also induce higher
costs, as the majority of scheduling mechanisms reduce either time or energy.
To reduce cost, both energy and makespan must be optimized together when
allocating resources. No prior work has considered optimizing energy and
processing time together in meeting task-level SLA requirements. This paper
presents a task-level energy and performance assurance-workload scheduling
(TLEPA-WS) algorithm for the distributed computing environment. The
TLEPA-WS guarantees energy minimization with the performance
requirement of the parallel application under a distributed computational
environment. Experimental results show a significant reduction in energy use
and makespan, thereby reducing the cost of workload execution in comparison
with various standard workload execution models.
Similar to Video saliency-detection using custom spatiotemporal fusion method
A systematic image compression in the combination of linear vector quantisati... (eSAT Publishing House)
1) The document presents a method for image compression that combines linear vector quantization and discrete wavelet transform.
2) Linear vector quantization is used to generate codebooks and encode image blocks, achieving better PSNR and MSE than self-organizing maps.
3) The encoded blocks are then subjected to discrete wavelet transform. Low-low subbands are stored for reconstruction while other subbands are discarded.
4) Experimental results show the proposed method achieves higher PSNR and lower MSE than existing techniques, preserving both texture and edge information.
A New Approach for video denoising and enhancement using optical flow Estimation (IRJET Journal)
This document proposes a new approach for video denoising and enhancement using optical flow estimation. It discusses using motion compensation via optical flow estimation along with principal component analysis (PCA) to provide fine video details. However, PCA has limitations in fully eliminating noise. The proposed method aims to replace PCA with wavelet transformation, which provides multi-resolution analysis and sparsity advantages for better denoising results in terms of PSNR and RMSE compared to PCA. It involves estimating optical flow between frames for motion compensation before applying wavelet transformation for noise removal and video reconstruction.
Efficient 3D stereo vision stabilization for multi-camera viewpoints (journalBEEI)
In this paper, an algorithm is developed in 3D stereo vision to improve the image stabilization process for multi-camera viewpoints. Accurate, unique matching key-points are found using the Harris-Laplace corner detection method under different photometric changes and geometric transformations in images. The connectivity of correct matching pairs is then improved by minimizing the global error using a spanning tree algorithm. The tree algorithm helps to stabilize randomly positioned camera viewpoints in linear order. The unique matching key-points are calculated only once with this method. The calculated planar transformation is then applied for real-time video rendering. The proposed algorithm can process more than 200 camera viewpoints within two seconds.
Improving the iterative back projection estimation through Lorentzian sharp i... (IJECEIAES)
This document summarizes a study that proposed an enhancement technique for the iterative back projection (IBP) super resolution estimation method. The study aimed to improve the IBP method by using a Lorentzian error function with a sharp infinite symmetrical filter (SISEF) to provide edge enhancement. The IBP method suffers from jaggy and ringing artifacts due to the iterative reconstruction process and lack of edge guidance during back projection. The proposed method combines IBP with the Lorentzian SISEF filter to produce a higher resolution output image with finer edge details while increasing robustness to noise and reducing ringing artifacts. The SISEF filter provides precise edge information to guide the back projection process, and the Lorentzian error norm suppresses
Video saliency-recognition by applying custom spatio temporal fusion technique (IAESIJAI)
Video saliency detection is a major and growing field with relatively few contributions so far. The general method available today is to conduct frame-wise saliency detection, which leads to several complications, including an incoherent pixel-based saliency map that is of limited use. This paper provides a novel solution to saliency detection and mapping: a custom spatio-temporal fusion method that uses frame-wise overall motion and colour saliency along with pixel-based consistent spatio-temporal diffusion for temporal uniformity. The proposed method section discusses how the video is fragmented into groups of frames, and how each frame undergoes diffusion and integration in a temporal fashion so that the colour saliency mapping can be computed. The inter-group frames are then used to form the pixel-based saliency fusion, after which the features, that is, the fusion of pixel saliency and colour information, guide the diffusion of the spatio-temporal saliency. The result has been tested with 5 publicly available global saliency evaluation metrics, and the conclusion is that the proposed algorithm performs better than several state-of-the-art saliency detection methods, with a good margin of increase in accuracy. The results demonstrate its robustness, reliability, versatility and accuracy.
Robust foreground modelling to segment and detect multiple moving objects in ... (IJECEIAES)
This document summarizes a research paper that proposes a robust foreground modeling method to segment and detect multiple moving objects in videos. The proposed method uses a running average technique to model the background and subtract it from video frames to detect foreground objects. Morphological operations like dilation and erosion are applied to reduce noise and merge connected regions. Convex hull processing is also used to define object boundaries more clearly. The method was tested on standard video datasets and achieved better performance than other techniques in segmenting objects under various challenging conditions like illumination changes and occlusion. Experimental results demonstrated high precision, recall and specificity based on comparisons with ground truth data.
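The running-average background model and subtraction step described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the learning rate `alpha`, the difference `threshold`, and the tiny sample frames are assumptions, and the morphological and convex hull stages are omitted.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]

def foreground_mask(background, frame, threshold=30):
    """Mark a pixel as foreground (1) when it differs strongly from the background."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]

# Hypothetical 2x3 grayscale frames: a bright object appears at row 0, column 1.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10, 200, 10], [10, 10, 10]]

mask = foreground_mask(background, frame)
print(mask)  # [[0, 1, 0], [0, 0, 0]]

# Update the model afterwards so slow scene changes are absorbed over time.
background = update_background(background, frame, alpha=0.5)
print(background[0][1])  # 105.0
```

A small `alpha` makes the background adapt slowly, so briefly stationary objects are not absorbed too quickly; a larger value tracks illumination changes faster.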
This paper describes a novel system for vectorizing 2D raster cartoons. The output videos are resolution independent and smaller in file size. As a first step, the input video is segmented into scenes; thereafter, all processing is done for each scene separately. Every scene contains foreground and background objects, so foreground-background classification is performed in every scene. Background details can be occluded by foreground objects, but when a foreground object moves from its previous position, the occluded details are exposed in one of the following frames; that frame can be used to fill the occluded area and generate a static background. The classified foreground objects are identified and their motion is tracked, for which simple user assistance is required; from those motion details, the foreground objects' animation is generated. The static background and foreground objects are segmented using K-means clustering, and each cluster is vectorized using potrace. The vector video is regenerated from the vectorized background and the foreground object animation paths.
This document discusses a proposed approach for multi-focus image fusion using a discrete cosine wavelet sharpness criterion. Multi-focus image fusion combines information from multiple images of the same scene to produce an "all-in-focus" image. The proposed approach uses a discrete cosine transform to calculate sharpness values for sub-blocks of the input images and selects the sharpest sub-blocks to include in the fused image. Experimental results on images of a clock, bottle, and book show the discrete cosine wavelet criterion produces fused images with higher quality than a bilateral gradient-based sharpness criterion, as measured by mutual information metrics.
International Journal of Engineering Research and Development (IJERD Editor)
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
This document presents a new method for image compression called Haar Wavelet Based Joint Compression Method Using Adaptive Fractal Image Compression (DWT+AFIC). It combines discrete wavelet transform with an existing adaptive fractal image compression technique to improve compression ratio and reconstructed image quality compared to previous fractal image compression methods. The document introduces fractal image compression and its limitations, describes the proposed DWT+AFIC method and 5 other compression techniques, provides simulation results on test images showing DWT+AFIC achieves higher peak signal to noise ratios and compression ratios than other methods, and concludes DWT+AFIC decreases encoding time while increasing compression ratio and maintaining reconstructed image quality.
Development of depth map from stereo images using sum of absolute differences... (nooriasukmaningtyas)
This article proposes a framework for depth map reconstruction using stereo images. Fundamentally, this map provides important information that is commonly used in essential applications such as autonomous vehicle navigation, drone navigation and 3D surface reconstruction. To develop an accurate depth map, the framework must be robust against the challenging regions of low texture, plain color and repetitive pattern in the input stereo image. The development of this map requires several stages: matching cost calculation, cost aggregation, optimization and refinement. Hence, this work develops a framework with the sum of absolute differences (SAD) and a combination of two edge-preserving filters to increase robustness against the challenging regions. The SAD is computed using a block matching technique to increase the efficiency of the matching process in the low-texture and plain-color regions. Moreover, the two edge-preserving filters increase the accuracy in the repetitive-pattern region. The results show that the proposed method is accurate and capable of working with the challenging regions. The results are obtained on the Middlebury standard dataset. The framework is also efficient and can be applied to 3D surface reconstruction. Moreover, this work is highly competitive with previously available methods.
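The matching cost stage can be illustrated with a minimal SAD block matching sketch. It covers only the cost calculation and a winner-takes-all choice for a single block; the helper names, block size, and toy images are assumptions, and the article's aggregation, edge-preserving filtering, and refinement stages are not reproduced.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )

def block(image, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]

def best_disparity(left_img, right_img, top, left, size, max_disparity):
    """Search the same rows of the right image for the lowest-cost horizontal shift."""
    ref = block(left_img, top, left, size)
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if left - d < 0:  # candidate block would fall outside the image
            break
        cost = sad(ref, block(right_img, top, left - d, size))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost

# Hypothetical rectified pair: the pattern (9, 5) sits 2 pixels further left in the right view.
left_img = [[0, 0, 0, 9, 5, 0], [0, 0, 0, 9, 5, 0]]
right_img = [[0, 9, 5, 0, 0, 0], [0, 9, 5, 0, 0, 0]]

print(best_disparity(left_img, right_img, top=0, left=3, size=2, max_disparity=3))  # (2, 0)
```

In low-texture regions many shifts yield nearly the same cost, which is the ambiguity that the later filtering and refinement stages are meant to resolve.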
Survey on Various Image Denoising Techniques (IRJET Journal)
This document summarizes several techniques for image denoising. It begins by defining image noise and explaining how noise degrades image quality. It then reviews 7 different published techniques for image denoising, summarizing the key aspects of each technique. These include methods using local spectral component decomposition, SVD-based denoising, patch-based near-optimal denoising, LPG-PCA denoising, trivariate shrinkage filtering, SURE-LET denoising, and 3D transform-domain collaborative filtering. The document concludes that LSCD provides better denoising results according to PSNR analysis and provides an overview of the state-of-the-art in image denoising techniques.
IRJET - Underwater Image Enhancement using PCNN and NSCT Fusion (IRJET Journal)
This document discusses techniques for enhancing underwater images that have been degraded due to scattering and absorption in the water medium. It proposes a new method for color image fusion using Non-Subsampled Contourlet Transform (NSCT) and Pulse Coupled Neural Network (PCNN). NSCT is used to decompose the image into sub-bands, while PCNN is used to fuse the high frequency sub-band coefficients. The proposed method is shown to outperform other fusion methods in objective quality assessment metrics. Various other underwater image enhancement techniques are also discussed, including wavelength compensation, multi-band fusion, image mode filtering, and approaches using neural networks like convolutional neural networks.
Propose shot boundary detection methods by using visual hybrid features (IJECEIAES)
Shot boundary detection is a fundamental technique that plays an important role in a variety of video processing tasks such as summarization, retrieval, object tracking, and so on. This technique involves segmenting a video sequence into shots, each of which is a sequence of interrelated temporal frames. This paper introduces two methods that detect the cut shot boundary by employing visual hybrid features, and compares them. Selecting the strongest features enhances shot detection performance. The first method uses hybrid features that include the statistics histogram of the hue-saturation-value color space and the grey level co-occurrence matrix. The second method uses hybrid features that include the discrete wavelet transform and the grey level co-occurrence matrix. The frame size was reduced, which had the advantage of reducing the computation time. Local adaptive thresholds were also used, which enhanced the methods' performance. The tested videos were obtained from the BBC archive, which included BBC Learning English and BBC News. Experimental results indicate that the second method achieved 97.618% accuracy, which was higher than the first method and other methods under the evaluation metrics.
Design and implementation of video tracking system based on camera field of view (sipij)
The basic idea of this paper is to design and implement a video tracking system based on the Camera Field of
View (CFOV). Otsu’s method was used to detect targets such as vehicles and people. Whereas most
algorithms spend a lot of time executing this process, an algorithm was developed to achieve it in
little time. Histogram projection was used in both directions to detect the target within the search region;
it is robust to various light conditions in Charge-Coupled Device (CCD) camera images and saves
computation time.
Our algorithm is based on background subtraction, and a normalized cross-correlation operation over a series
of sequential sub-images estimates the motion vector. The camera field of view (CFOV) was determined and
calibrated to find the relation between real distance and image distance. The system was tested by
measuring the real position of an object in the laboratory and comparing it with the computed one.
These results are promising for further development of the system.
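Otsu's method, which the abstract above names for target detection, can be sketched in a few lines. This is the standard between-class-variance formulation written from scratch, not the paper's code; the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0
    sum_bg = 0.0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Hypothetical flattened image: four dark background pixels, four bright target pixels.
pixels = [10, 12, 11, 10, 200, 205, 198, 202]
t = otsu_threshold(pixels)
target_mask = [1 if p > t else 0 for p in pixels]
print(t)            # 12
print(target_mask)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Because the threshold is derived from the histogram of each image, it adapts to overall brightness, which is consistent with the robustness to lighting conditions mentioned above.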
This paper presents a simple technique to perform inverse halftoning using a deep learning framework. The proposed method inherits the usability and superiority of deep residual learning to reconstruct the halftone image into a continuous-tone representation. It involves a series of convolution operations and activation functions in the form of residual block elements. We investigate the usage of a pre-activation function and a standard activation function in each residual block. The experimental section validates the proposed method's ability to effectively reconstruct the halftone image. This section also exhibits the proposed method's superiority in the inverse halftoning task compared to handcrafted feature schemes and former deep learning approaches. The proposed method achieves 30.37 dB and 0.9481 on the average peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) scores, respectively. It gives improvements of around 1.67 dB and 0.0481 for those values compared to the most competitive scheme.
INFORMATION SATURATION IN MULTISPECTRAL PIXEL LEVEL IMAGE FUSION (IJCI JOURNAL)
The availability of imaging sensors operating in multiple spectral bands has led to the requirement for
image fusion algorithms that combine the images from these sensors in an efficient way to give an
image that is more informative as well as perceptible to the human eye. Multispectral image fusion is the
process of combining images from different spectral bands that are optically acquired. In this paper, we
use pixel-level image fusion based on principal component analysis to combine satellite images of the
same scene from seven different spectral bands. The principal component analysis (PCA)
technique is used because it is the best method for grayscale image fusion and gives better results. The main aim of
the PCA technique is to reduce a large set of variables to a small set that still contains most of the
information that was present in the large set. The paper compares different parameters, namely entropy,
standard deviation, correlation coefficient, etc., for different numbers of fused images, from two to seven.
Finally, the paper shows that the information content in an image becomes saturated after fusing four images.
IRJET-Underwater Image Enhancement by Wavelet Decomposition using FPGA (IRJET Journal)
This document describes a method for enhancing underwater images using wavelet decomposition and fusion on an FPGA (field programmable gate array). Underwater images often have low contrast and visibility due to light scattering in water. The proposed method performs color correction and contrast enhancement on an input underwater image. It then decomposes the color-corrected and contrast-enhanced images into low and high frequency components using wavelet transforms. Image fusion is performed on the wavelet coefficients to combine the detailed information from both images. The fused image is reconstructed via inverse wavelet transform. Experimental results show the proposed fusion-based approach improves underwater image visibility. Implementing the algorithm on an FPGA provides benefits over general processors for computationally intensive image processing.
Investigating human subjects is the goal of predicting human emotions in the
real world scenario. A significant number of psychological effects require
(feelings) to be produced, directly releasing human emotions. The
development of effect theory leads one to believe that one must be aware of
one's sentiments and emotions to forecast one's behavior. The proposed line
of inquiry focuses on developing a reliable model incorporating
neurophysiological data into actual feelings. Any change in emotional affect
will directly elicit a response in the body's physiological systems. This
approach is named after the notion of Gaussian mixture models (GMM). The
statistical reaction following data processing, quantitative findings on emotion
labels, and coincidental responses with training samples all directly impact the
outcomes that are accomplished. In terms of statistical parameters such as
population mean and standard deviation, the suggested method is evaluated
compared to a technique considered to be state-of-the-art. The proposed
system determines an individual's emotional state after a minimum of 6
learning iterations using the Gaussian expectation-maximization (GEM)
statistical model, in which the iterations continue until the error tends to zero. Perhaps
each of these improves predictions while simultaneously increasing the
amount of value extracted.
Early diagnosis of cancers is a major requirement for patients and a
complicated job for the oncologist. Early diagnosis makes the patient more
likely to survive. Over the past few decades, fuzzy logic has emerged as an
effective technique in the identification of diseases such as different types of
cancers. The recognition of cancer diseases mostly deals with inexactness,
inaccuracy, and vagueness. This paper aims to design a fuzzy expert system
(FES) and implement it for the detection of prostate cancer. Specifically,
prostate-specific antigen (PSA), prostate volume (PV), age, and percentage
free PSA (%FPSA) are used as input parameters to determine prostate cancer risk (PCR), while
PCR serves as the output parameter. The Mamdani fuzzy inference method is used
to calculate a range of PCR. The system provides a scale of prostate
cancer risk and clears the path for the oncologist to determine whether their
patients need a biopsy. The system is fast, as it requires minimal calculation
and hence comparatively less time, which reduces mortality and morbidity; it
is also more reliable than other economic systems and can be used frequently by
doctors.
The biomedical profession has gained importance due to the rapid and accurate diagnosis of clinical patients using computer-aided diagnosis (CAD) tools.
The diagnosis and treatment of Alzheimer’s disease (AD) using complementary multimodalities can improve the quality of life and mental state of patients.
In this study, we integrated a lightweight custom convolutional neural network
(CNN) model and nature-inspired optimization techniques to enhance the performance, robustness, and stability of progress detection in AD. A multi-modal
fusion database approach was implemented, including positron emission tomography (PET) and magnetic resonance imaging (MRI) datasets, to create a fused
database. We compared the performance of custom and pre-trained deep learning models with and without optimization and found that employing nature-inspired algorithms like the particle swarm optimization (PSO) algorithm significantly improved system performance. The proposed methodology,
which includes a fused multimodality database and optimization strategy, improved performance metrics such as training, validation, test accuracy, precision, and recall. Furthermore, PSO was found to improve the performance of
pre-trained models by 3-5% and custom models by up to 22%. Combining different medical imaging modalities improved the overall model performance by
2-5%. In conclusion, a customized lightweight CNN model and nature-inspired
optimization techniques can significantly enhance progress detection, leading to
better biomedical research and patient care.
Class imbalance is a pervasive issue in the field of disease classification from
medical images. It is necessary to balance out the class distribution while training a model. However, in the case of rare medical diseases, images from affected
patients are much harder to come by compared to images from non-affected
patients, resulting in unwanted class imbalance. Various processes of tackling
class imbalance issues have been explored so far, each having its fair share of
drawbacks. In this research, we propose an outlier detection based image classification technique which can handle even the most extreme case of class imbalance. We have utilized a dataset of malaria parasitized and uninfected cells. An
autoencoder model titled AnoMalNet is trained with only the uninfected cell images at the beginning and then used to classify both the affected and non-affected
cell images by thresholding a loss value. We have achieved an accuracy, precision, recall, and F1 score of 98.49%, 97.07%, 100%, and 98.52% respectively,
performing better than large deep learning models and other published works.
As our proposed approach can provide competitive results without needing the
disease-positive samples during training, it should prove to be useful in binary
disease classification on imbalanced datasets.
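The thresholding step described above, where a sample is flagged when its reconstruction loss is high, can be sketched together with the reported metric types. The loss values, labels, and threshold below are made up for illustration; the autoencoder itself and the way the authors choose the threshold are not shown.

```python
def classify_by_loss(losses, threshold):
    """Label a sample 1 (anomalous) when its reconstruction loss exceeds the threshold."""
    return [1 if loss > threshold else 0 for loss in losses]

def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Hypothetical per-image reconstruction losses from a model trained only on class 0.
losses = [0.02, 0.03, 0.30, 0.25, 0.08, 0.01]
labels = [0, 0, 1, 1, 0, 0]

predictions = classify_by_loss(losses, threshold=0.05)
print(predictions)  # [0, 0, 1, 1, 1, 0]
accuracy, precision, recall, f1 = binary_metrics(labels, predictions)
```

Lowering the threshold tends to raise recall at the cost of precision, so in practice the threshold would be tuned on held-out data.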
Recently, plant identification has become an active trend due to encouraging
results achieved in plant species detection and plant classification fields
among numerous available plants using deep learning methods. Therefore,
plant classification analysis is performed in this work to address the problem
of accurate plant species detection in the presence of multiple leaves together,
flowers, and noise. Thus, a convolutional neural network based deep feature
learning and classification (CNN-DFLC) model is designed to analyze
patterns of plant leaves and perform classification using generated fine-grained
feature weights. The proposed CNN-DFLC model precisely estimates
which plant species the given image belongs to. Several layers and
blocks are utilized to design the proposed CNN-DFLC model. Fine-grained
feature weights are obtained using convolutional and pooling layers. The
obtained feature maps in training are utilized to predict labels and model
performance is tested on the Vietnam plant image (VPN-200) dataset. This
dataset consists of a total number of 20,000 images and testing results are
achieved in terms of classification accuracy, precision, recall, and other
performance metrics. The mean classification accuracy obtained using the
proposed CNN-DFLC model is 96.42% considering all 200 classes from the
VPN-200 dataset.
Big data as a service (BDaaS) platform is widely used by various
organizations for handling and processing the high volume of data generated
from different internet of things (IoT) devices. Data generated from these IoT
devices are kept in the form of big data with the help of cloud computing
technology. Researchers are putting efforts into providing a more secure and
protected access environment for the data available on the cloud. In order to
create a safe, distributed, and decentralised environment in the cloud,
blockchain technology has emerged as a useful tool. In this research paper, we
have proposed a system that uses blockchain technology as a tool to regulate
data access that is provided by BDaaS platforms. We are securing the access
policy of data by using a modified form of ciphertext policy-attribute based
encryption (CP-ABE) technique with the help of blockchain technology. For
secure data access in BDaaS, algorithms have been created using a mix of CP-ABE
with blockchain technology. The proposed smart contract algorithms are
implemented using Eclipse 7.0 IDE, and the cloud environment has been
simulated on the CloudSim tool. Key generation time, encryption time,
and decryption time have been calculated and compared with an access control
mechanism without blockchain technology.
Internet of things (IoT) has become one of the eminent phenomena in human
life along with its collaboration with wireless sensor networks (WSNs), due
to enormous growth in the domain; there has been a demand to address the
various issues regarding it such as energy consumption, redundancy, and
overhead. Data aggregation (DA) is considered the basic mechanism to
minimize energy consumption and communication overhead; however,
security plays an important role where node security is essential due to the
volatile nature of WSN. Thus, we design and develop proximate node aware
secure data aggregation (PNA-SDA). In the PNA-SDA mechanism, additional
data is used to secure the original data, and further information is shared with
the proximate node; moreover, further security is achieved by updating the
state each time. Moreover, the node that does not have updated information is
considered as the compromised node and discarded. PNA-SDA is evaluated
considering the different parameters like average energy consumption, and
average deceased node; also, comparative analysis is carried out with the
existing model in terms of throughput and correct packet identification.
Drones provide an alternative progression in protection submissions since
they are capable of conducting autonomous seismic investigations. Recent
advancement in unmanned aerial vehicle (UAV) communication is an internet
of a drone combined with 5G networks. Because of the quick utilization of
rapidly progressed registering frameworks besides 5G officialdoms, the
information from the user is consistently refreshed and pooled. Thus, safety
or confidentiality is vital among clients, and a proficient substantiation
methodology utilizing a vigorous sanctuary key. Conventional procedures
ensure a few restrictions however taking care of the assault arrangements in
information transmission over the internet of drones (IOD) environmental
frameworks. A unique hyperelliptical curve (HEC) cryptographically based
validation system is proposed to provide protected data facilities among
drones. The proposed method has been compared with the existing methods
in terms of packet loss rate, computational cost, and delay and thereby
provides better insight into efficient and secure communication. Finally, the
simulation results show that our strategy is efficient in both computation and
communication.
Monitoring behavior, numerous actions, or any such information is considered
as surveillance and is done for information gathering, influencing, managing,
or directing purposes. Citizens employ surveillance to safeguard their
communities. Governments do this for the purposes of intelligence collection,
including espionage, crime prevention, the defense of a method, a person, a
group, or an item; or the investigation of criminal activity. Using an internet
of things (IoT) rover instead of humans, the area can be secured with better
secrecy and efficiency, which provides an additional safety step. In this
paper, there is a discussion about an IoT rover for remote surveillance based
around a Raspberry Pi microprocessor which will be able to monitor a
closed/open space. This rover will allow safer survey operations and would
help to reduce the risks involved with it.
In a world where climate change looms large the spotlight often shines on
greenhouse gases, but the shadow of man-made aerosols should not be
underestimated. These tiny particles play a pivotal role in disrupting Earth's
radiative equilibrium, yet many mysteries surround their influence on various
physical aspects of our planet. The root of these mysteries lies in the limited
data we have on aerosol sources, formation processes, conversion dynamics,
and collection methods. Aerosols, composed of particulate matter (PM),
sulfates, and nitrates, hold significant sway across the hemisphere. Accurate
measurement demands the refinement of in-situ, satellite, and ground-based
techniques. As aerosols interact intricately with the environment, their full
impact remains an enigma. Enter a groundbreaking study in Morocco that
dared to compare an internet of things (IoT) system with satellite-based
atmospheric models, with a focus on fine particles below 10 and 2.5
micrometers in diameter. The initial results, particularly in regions abundant
with extraction pits, shed light on the IoT system's potential to decode
aerosols' role in the grand narrative of climate change. These findings inspire
hope as we confront the formidable global challenge of climate change.
The use of technology has a significant impact to reduce the consequences of
accidents. Sensors, small components that detect interactions experienced by
various components, play a crucial role in this regard. This study focuses on
how the MPU6050 sensor module can be used to detect the movement of
people who are falling, defined as the inability of the lower body, including
the hips and feet, to support the body effectively. An airbag system is
proposed to reduce the impact of a fall. The data processing method in this
study involves the use of a threshold value to identify falling motion. The
results of the study have identified a threshold value for falling motion,
including an acceleration relative (AR) value of less than or equal to 0.38 g,
an angle slope of more than or equal to 40 degrees, and an angular velocity
of more than or equal to 30 °/s. The airbag system is designed to inflate
faster than the time of impact, with a gas flow rate of 0.04876 m³/s and an
inflating time of 0.05 s. The overall system has a specificity value of 100%,
a sensitivity of 85%, and an accuracy of 94%.
The fundamental principle of the paper is that the soil moisture sensor obtains
the moisture content level of the soil sample. The water pump is automatically
activated if the moisture content is insufficient, which causes water to flow
into the soil. The water pump is immediately turned off when the moisture
content is high enough. Smart home, smart city, smart transportation, and
smart farming are just a few of the new intelligent ideas that internet of things
(IoT) includes. The goal of this method is to increase productivity and
decrease manual labour among farmers. In this paper, we present a system for
monitoring and regulating water flow that employs a soil moisture sensor to
keep track of soil moisture content as well as the land’s water level to keep
track of and regulate the amount of water supplied to the plant. The device
also includes an automated LED lighting system.
In order to provide sensing services to low-powered IoT devices, wireless sensor networks (WSNs) organize specialized transducers into networks. Energy usage is one of the most important design concerns in WSN because it is very hard to replace or recharge the batteries in sensor nodes. For an energy-constrained network, the clustering technique is crucial in preserving battery life. By strategically selecting a cluster head (CH), a network's load can be balanced, resulting in decreased energy usage and extended system life. Although clustering has been predominantly used in the literature, the concept of chain-based clustering has not yet been explored. As a result, in this paper, we employ a chain-based clustering architecture for data dissemination in the network. Furthermore, for CH selection, we employ the coati optimisation algorithm, which was recently proposed and has demonstrated significant improvement over other optimization algorithms. In this method, the parameters considered for selecting the CH are energy, node density, distance, and the network’s average energy. The simulation results show tremendous improvement over the competitive cluster-based routing algorithms in the context of network lifetime, stability period (first node dead), transmission rate, and the network's power reserves.
The construction industry is an industry that is always surrounded by
uncertainties and risks. The industry is always associated with a threatindustry which has a complex, tedious layout and techniques characterized by
unpredictable circumstances. It comprises a variety of human talents and the
coordination of different areas and activities associated with it. In this
competitive era of the construction industry, delays and cost overruns of the
project are often common in every project and the causes of that are also
common. One of the problems which we are trying to cater to is the improper
handling of materials at the construction site. In this paper, we propose
developing a system that is capable of tracking construction material on site
that would benefit the contractor and client for better control over inventory
on-site and to minimize loss of material that occurs due to theft and misplacing
of materials.
Today, health monitoring relies heavily on technological advancements. This
study proposes a low-power wide-area network (LPWAN) based, multinodal
health monitoring system to monitor vital physiological data. The suggested
system consists of two nodes, an indoor node, and an outdoor node, and the
nodes communicate via long range (LoRa) transceivers. Outdoor nodes use an
MPU6050 module, heart rate, oxygen pulse, temperature, and skin resistance
sensors and transmit sensed values to the indoor node. We transferred the data
received by the master node to the cloud using the Adafruit cloud service. The
system can operate with a coverage of 4.5 km, where the optimal distance
between outdoor sensor nodes and the indoor master node is 4 km. To further
predict fall detection, various machine learning classification techniques have
been applied. Upon comparing various classifier techniques, the decision tree
method achieved an accuracy of 0.99864 with a training and testing ratio of
70:30. By developing accurate prediction models, we can identify high-risk
individuals and implement preventative measures to reduce the likelihood of
a fall occurring. Remote monitoring of the health and physical status of elderly
people has proven to be the most beneficial application of this technology.
The effectiveness of adaptive filters is mainly dependent on the design
techniques and the adaptation algorithm. The most common adaptation
technique used is least mean square (LMS) due to its computational simplicity.
The application depends on the adaptive filter configuration used; such filters
are well known for system identification and real time applications. In this work, a
modified delayed μ-law proportionate normalized least mean square
(DMPNLMS) algorithm has been proposed. It is the improvised version of the
µ-law proportionate normalized least mean square (MPNLMS) algorithm.
The algorithm is realized using Ladner-Fischer type of parallel prefix
logarithmic adder to reduce the silicon area. The simulation and
implementation of very large-scale integration (VLSI) architecture are done
using MATLAB, Vivado suite and complementary metal–oxide–
semiconductor (CMOS) 90 nm technology node using Cadence RTL and
Genus Compiler respectively. The DMPNLMS method exhibits a reduction
in mean square error, a higher rate of convergence, and more stability. The
synthesis results demonstrate that it is area and delay effective, making it
practical for applications where a faster operating speed is required.
The increasing demand for faster, robust, and efficient device development of enabling technology to mass production of industrial research in circuit design deals with challenges like size, efficiency, power, and scalability. This paper presents a design and analysis of a low power, high speed full adder using negative capacitance field effect transistors. A comprehensive study is performed with adiabatic logic and reversible logic. The performance of the full adder is studied with the metal oxide field effect transistor (MOSFET) and the negative capacitance field effect transistor (NCFET). The NCFET based full adder offers low power and high speed compared with the conventional MOSFET. The complete design and analysis are performed using Cadence Virtuoso. The adiabatic logic offers a low delay of 0.023 ns and the reversible logic offers a low power of 7.19 mW.
The global agriculture system faces significant challenges in meeting the
growing demand for food production, particularly given projections that the
world's population will reach 70% by 2050. Hydroponic farming is an
increasingly popular technique in this field, offering a promising solution to
these challenges. This paper will present the improvement of the current
traditional hydroponic method by providing a system that can be used to
monitor and control the important element in order to help the plant grow up
smoothly. This proposed system is quite efficient and user-friendly that can
be used by anyone. This is a combination of a traditional hydroponic system,
an automatic control system and a smartphone. The primary objective is to
develop a smart system capable of monitoring and controlling potential
hydrogen (pH) levels, a key factor that affects hydroponic plant growth.
Ultimately, this paper offers an alternative approach to address the challenges
of the existing agricultural system and promote the production of clean,
disease-free, and healthy food for a better future.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
6th International Conference on Machine Learning & Applications (CMLA 2024)
ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Machine Learning & Applications.
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Video saliency-detection using custom spatiotemporal fusion method
International Journal of Reconfigurable and Embedded Systems (IJRES)
Vol. 12, No. 2, July 2023, pp. 269~275
ISSN: 2089-4864, DOI: 10.11591/ijres.v12.i2.pp269-275
Journal homepage: http://ijres.iaescore.com
Video saliency-detection using custom spatiotemporal fusion method
Vinay C. Warad, Ruksar Fatima
Department of Computer Science and Engineering, Khaja Bandanawaz College of Engineering, Kalaburagi, India
Article Info
Article history:
Received Jul 20, 2022
Revised Oct 15, 2022
Accepted Dec 10, 2022
ABSTRACT
There has been considerable research in the field of image saliency, but far
less in video saliency. Our proposed solution aims to increase precision and
accuracy during compression and to reduce coding complexity, time consumption
and memory allocation problems. It is a modified high-definition video
compression (HEVC) pixel-based consistent spatiotemporal diffusion with
temporal uniformity. It involves splitting the video into groups of frames,
computing colour saliency, integrating temporal fusion, conducting pixel
saliency fusion, and then letting colour information guide the diffusion
process for the spatiotemporal mapping with the help of a permutation matrix.
The proposed solution is tested on a publicly available extensive dataset with
five global saliency evaluation metrics and is compared with several other
state-of-the-art saliency detection methods. The results show the best overall
performance among all candidates.
Keywords:
Computing colour saliency
High-definition video compression pixel
Image saliency
Spatiotemporal diffusion
Video saliency
This is an open access article under the CC BY-SA license.
Corresponding Author:
Vinay C. Warad
Department of Computer Science and Engineering, Khaja Bandanawaz College of Engineering
Kalaburagi, Karnataka 585104, India
Email: vinay_c111@rediffmail.com
1. INTRODUCTION
Researchers have long tried to imitate the functioning of the human eye and the brain. The brain is able
to distinguish between the important and unimportant features of the scene the eyes are viewing and take
in only what is necessary. Various researchers have imitated this process, and in today's world it is applied
in conference videos, broadcasting and streaming. There has been considerable research in the field of image
saliency but far less in video saliency. A few works have made a significant impact in this field. Itti's model
[1] is one of the most researched and most prominent models for image saliency. The Fourier transform is
used together with the phase spectrum, and [2], [3] support image saliency using frequency tuning. Other
works have used the principles of inhibition of return and winner-take-all, which are inspired by the
visual nervous system [4], [5].
Video saliency detection is more difficult because the images are not still, which increases memory
allocation and computational complexity. The video saliency detection methodology in [6] involves
determining the position of an object with reference to another. The works in [7]-[10] compute a
space-time saliency map as well as a motion saliency map. Static and dynamic saliency mappings are fused in
[11] to obtain a space-time saliency detection model. A dynamic texture model is employed in [12] to obtain
motion patterns for both stationary and dynamic scenes.
The works in [13]-[15] use a fusion model, but it results in low-level saliency. Global temporal clues
are used in [16], [17] to forge a robust low-level saliency map. The disadvantage of these methodologies
is that the accumulation of error is quite high, which has led to several wrong detections.
The proposed solution is a modified spatiotemporal fusion saliency detection method. It involves a
spatiotemporal background to obtain high saliency values around the foreground objects. Then, after ignoring
the hollow effects, a series of adjustments are made to the general saliency strategies to increase the
efficiency of both motion and colour saliencies. The use of cross-frame super pixels and one-to-one
spatiotemporal fusion helps increase overall accuracy and precision during compression.
2. RELATED WORK
This section discusses some of the research papers that have helped in the completion of the
proposed algorithm. The survey in [18] covers the various video saliency methodologies along with their
advantages and disadvantages. Borji [19] follows the same outline, but also includes the various aspects
that make it difficult for algorithms to imitate human eye-brain coordination, and how to overcome them.
The work in [20] makes a notable contribution to this field of research. It provides a database named
dynamic human fixation 1K (DHF1K) that helps in pointing out fixations that are needed during dynamic scene
free viewing, and the attentive convolutional neural network-long short-term memory network (ACLNet),
which augments the original convolutional neural network and long short-term memory (CNN-LSTM) model to
enable fast end-to-end saliency learning. The works in [21], [22] make some corrections to the smooth
pursuits (SP) logic. This involves manual annotations of the SPs with fixation along the arithmetic points
and SP salient locations by training slicing convolutional neural networks.
The high-definition video compression (HEVC) system has become the new standard video compression
algorithm used today. Changes made to the HEVC algorithms with the help of a spatial saliency algorithm
that uses the concept of a motion vector [23] have led to better compression and efficiency. A salient
object segmentation that uses the combination of conditional random field (CRF) and saliency measure has
also been introduced. It uses a statistical framework and local colour contrast, motion and illumination
features [24]. Fang et al. [25] also use spatiotemporal fusion with uncertainty in statistics to measure
visual saliency. A geodesic robustness methodology is used to get the saliency map in [26], [27]. The works
in [28]-[30] have been a great help to our solution formation with their super-pixel usage and adaptive
colour quantization. Their measurement of the difference between spatial distance and histograms has helped
to obtain the super-pixel saliency map. The works in [31], [32] gave us an overall idea of the various
evaluation metrics to be used in this paper. The first section has the introduction and section 2 succeeds
it with the related work [33]. Sections 3 and 4 present the proposed algorithm, its methodologies and
modifications along with its final experimentation and comparison. Section 5 concludes the paper.
3. PROPOSED SYSTEM
3.1. Modeling based saliency adjustment
The robustness is obtained by combining long-term inter-batch information with colour contrast
computation. Background and foreground appearance models are represented by $BM \in \mathbb{R}^{3 \times bn}$
and $FM \in \mathbb{R}^{3 \times fn}$, with $bn$ and $fn$ being their sizes respectively. The $i$-th super
pixel's RGB history in all regions is taken care of with the following equations:

$$intraC_i = \exp\left(\lambda - \left|\varphi(MC_i) - \varphi(CM_i)\right|\right), \quad \lambda = 0.5$$

$$interC_i = \varphi\left(\frac{\min\|(R_i,G_i,B_i),BM\|_2 \cdot \frac{1}{bn}\sum\|(R_i,G_i,B_i),BM\|_2}{\min\|(R_i,G_i,B_i),FM\|_2 \cdot \frac{1}{fn}\sum\|(R_i,G_i,B_i),FM\|_2}\right)$$

Here, $\lambda$ is the upper bound discrepancy degree and helps invert the penalty between the motion and
colour saliencies.
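The two adjustment terms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper does not define $\varphi$, so min-max normalisation is assumed here, $\|x, y\|_2$ is read as the Euclidean distance between $x$ and $y$, and the function names and the small `eps` guard are hypothetical.

```python
import math

def normalise(values):
    """Assumed form of phi: min-max scaling of a list to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def intra_c(mc, cm, lam=0.5):
    """intraC_i = exp(lambda - |phi(MC_i) - phi(CM_i)|) for every super pixel i."""
    mc_n, cm_n = normalise(mc), normalise(cm)
    return [math.exp(lam - abs(m - c)) for m, c in zip(mc_n, cm_n)]

def _model_term(colour, model):
    """Minimum distance to the model times the mean distance to the model."""
    dists = [math.dist(colour, entry) for entry in model]
    return min(dists) * (sum(dists) / len(dists))

def inter_c(colours, bm, fm, eps=1e-12):
    """interC_i = phi(background term / foreground term).

    colours: one (R, G, B) tuple per super pixel
    bm, fm:  background and foreground appearance models (lists of RGB tuples)
    eps:     hypothetical guard against a zero foreground term
    """
    ratios = [_model_term(c, bm) / (_model_term(c, fm) + eps) for c in colours]
    return normalise(ratios)
```

Under these assumptions, a super pixel whose colour is far from the background model and close to the foreground model receives a large $interC_i$, while $intraC_i$ is largest when the normalised motion and colour contrasts agree.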
3.2. Contrast-based saliency mapping
The video sequence is now divided into several short groups of frames $G_i = \{F_1, F_2, F_3, \ldots, F_n\}$.
Each frame $F_k$ (where $k$ denotes the frame number) is modified using simple linear iterative
clustering with a boundary-aware smoothing method, which removes unnecessary details. The colour and
motion gradient maps are combined into the spatiotemporal gradient map with the pixel-based computation
$SMT = \|u_x, u_y\|_2 \odot \|\nabla(F)\|_2$, where $u_x$ and $u_y$ are the horizontal and vertical gradients
of the optical flow and $\nabla(F)$ is the colour gradient map. We then calculate the $i$-th super pixel's
motion contrast using (1).

$$MC_i = \sum_{a_j \in \psi_i} \frac{\|U_i, U_j\|_2}{\|a_i, a_j\|_2}, \quad \psi_i = \{\tau + 1 \geq \|a_i, a_j\|_2 \geq \tau\} \tag{1}$$
Here the $l_2$ norm is used, and $U$ and $a_i$ denote the optical flow gradient in two directions and the
position centre of the $i$-th super-pixel respectively. $\psi_i$ denotes the computational contrast range and
is calculated using the shortest Euclidean distance between the spatiotemporal map and the $i$-th super-pixel.

$$\tau = \frac{r}{\|\Lambda(SMT)\|_0} \sum_{\tau \in \|\tau, i\| \leq r} \|\Lambda(SMT_\tau)\|_0; \quad l = 0.5\min\{width, height\}, \quad \Lambda \rightarrow \text{down sampling} \tag{2}$$
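As a rough illustration of (1), the following Python sketch computes the motion contrast of every super pixel from its centre position and optical flow vector. It assumes that $\|x, y\|_2$ means the Euclidean distance between $x$ and $y$ and that $\tau$ has already been obtained from (2); the function name and the data layout are hypothetical.

```python
import math

def motion_contrast(centres, flows, tau):
    """MC_i: sum over super pixels j whose centre lies in the ring
    tau <= ||a_i, a_j||_2 <= tau + 1 of ||U_i, U_j||_2 / ||a_i, a_j||_2.

    centres: list of (x, y) super-pixel centres a_i
    flows:   list of (u_x, u_y) optical flow gradients U_i
    tau:     inner radius of the contrast range (assumed precomputed)
    """
    contrast = []
    for i, (a_i, u_i) in enumerate(zip(centres, flows)):
        total = 0.0
        for j, (a_j, u_j) in enumerate(zip(centres, flows)):
            if i == j:
                continue  # skip the super pixel itself (distance zero)
            d = math.dist(a_i, a_j)
            if tau <= d <= tau + 1:
                total += math.dist(u_i, u_j) / d
        contrast.append(total)
    return contrast
```

The same routine works for any per-super-pixel feature vector, so passing RGB triples instead of flow vectors gives a contrast of the same form for colour.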
Colour saliency is computed in the same way as the optical flow gradient, except that we use the red,
green and blue values of the $i$-th super pixel. So, the equation is

$$CM_i = \sum_{a_j \in \psi_i} \frac{\|(R_i,G_i,B_i),(R_j,G_j,B_j)\|_2}{\|a_i, a_j\|_2}$$

The following equation smooths both $MC$ and $CM$, as temporal and saliency value refining is done by spatial
information integration.

$$CM_{k,i} \leftarrow \frac{\sum_{\tau=k-1}^{k+1}\sum_{a_{\tau,j}\in\mu_\phi} \exp\left(-\|c_{k,i}, c_{\tau,j}\|_1/\mu\right)\cdot CM_{\tau,j}}{\sum_{\tau=k-1}^{k+1}\sum_{a_{\tau,j}\in\mu_\phi} \exp\left(-\|c_{k,i}, c_{\tau,j}\|_1/\mu\right)} \tag{3}$$
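Here, $c_{k,i}$ is the average RGB colour value of the $i$-th super-pixel in the $k$-th frame, while $\sigma$
controls the smoothing strength. The condition $\|a_{k,i}, a_{\tau,j}\|_2 \leq \theta$ needs to be satisfied,
and this is done using $\mu$.

$$\theta = \frac{1}{m \times n}\sum_{k=1}^{n}\sum_{i=1}^{m}\left\|\frac{1}{m}\sum_{i=1}^{m}F(SMT_{k,i}),\ F(SMT_{k,i})\right\|_1; \quad m, n = \text{frame numbers} \tag{4}$$

$$F(SMT_i) = \begin{cases} a_i, & SMT_i \leq \epsilon \times \frac{1}{m}\sum_{i=1}^{m}SMT_i \\ 0, & \text{otherwise} \end{cases}; \quad \epsilon = \text{filter strength control} \tag{5}$$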
At each batch frame level, the $q$-th frame's smoothing rate is dynamically updated with
$(1 - \gamma)\theta_{s-1} + \gamma\theta_s \rightarrow \theta_s$, where $\gamma = 0.2$ is the learning weight.
The colour and motion saliencies are then integrated to get the pixel-based saliency map
$LLS = CM \odot MC$. Although this fused saliency map increases accuracy considerably, the rate decreases;
this is dealt with in the next section.
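The smoothing, rate update and fusion steps of this subsection can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the smoothing range is taken to be all super pixels of frames $k-1$ to $k+1$ whose centres lie within $\theta$ of the current centre, the $l_1$ colour distance is used in the weight, and all function names and data layouts are hypothetical.

```python
import math

def smooth_saliency(sal, colours, centres, k, i, theta, mu):
    """Eq. (3)-style refinement of super pixel i in frame k.

    sal[t][j]     saliency (CM or MC) of super pixel j in frame t
    colours[t][j] average (R, G, B) of super pixel j in frame t
    centres[t][j] (x, y) centre of super pixel j in frame t
    theta         spatial smoothing range, mu: colour weight scale
    """
    num = den = 0.0
    first, last = max(k - 1, 0), min(k + 1, len(sal) - 1)
    for t in range(first, last + 1):
        for j in range(len(sal[t])):
            if math.dist(centres[k][i], centres[t][j]) > theta:
                continue  # outside the smoothing range
            l1 = sum(abs(p - q) for p, q in zip(colours[k][i], colours[t][j]))
            w = math.exp(-l1 / mu)
            num += w * sal[t][j]
            den += w
    return num / den  # den >= 1 because (k, i) itself is always in range

def update_theta(theta_prev, theta_new, gamma=0.2):
    """(1 - gamma) * theta_{s-1} + gamma * theta_s -> theta_s."""
    return (1 - gamma) * theta_prev + gamma * theta_new

def fuse(cm, mc):
    """LLS = CM (element-wise product) MC."""
    return [c * m for c, m in zip(cm, mc)]
```

With $\gamma = 0.2$ the smoothing range changes slowly from batch to batch, which keeps the refinement stable when a single batch has an unusual gradient distribution.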
3.3. Accuracy boosting
Matrix $M$ is considered as the input. It is decomposed into a sparse component $S$ and a low-rank
component $D$ with $\min_{D,S} \alpha\|S\|_1 + \|D\|_*$ subject to $M = S + D$, where the nuclear norm of $D$
is used. This is solved with the help of robust principal component analysis (RPCA) [30] using
$S \leftarrow sign(M - D - S)\left[|M - D - S| - \alpha\beta\right]_+$ and $D \leftarrow V[\Sigma - \beta I]_+U$,
$(V, \Sigma, U) \leftarrow svd(Z)$, where $svd(Z)$ denotes the singular value decomposition of the
Lagrange multiplier and $\alpha$ and $\beta$ represent the low-rank and sparse threshold parameters
respectively. To reduce incorrect detections caused by the misplacement of the optical flow of super pixels
in the foreground's region, the given region's rough foreground is located. The feature subspace of a
frame $k$ is spanned as $gI_k = \{LLS_{k,1}, LLS_{k,2}, \ldots, LLS_{k,m}\}$, and thus for the entire frame
group we get $gB_\tau = \{gI_1, gI_2, \ldots, gI_n\}$. This way the rough foreground is calculated as

$$RF_i = \left[\sum_{k=1}^{n} LLS_{k,i} - \frac{\omega}{n \times m}\sum_{k=1}^{n}\sum_{i=1}^{m} LLS_{k,i}\right]_+$$
Here $\omega$ is the reliability control factor. We also get two subspaces from $LLS$ and the RGB colour,
given by $SB = \{cv_1, cv_2, \ldots, cv_n\} \in \mathbb{R}^{3v \times n}$, where
$cv_i = \{vec(R_{i,1}, G_{i,1}, B_{i,1}, \ldots, R_{i,m}, G_{i,m}, B_{i,m})\}^K$, and
$SF = \{vec(LLS_1), \ldots, vec(LLS_n)\} \in \mathbb{R}^{v \times n}$. This helps in making a one-to-one
correspondence, followed by a pixel-based saliency mapping infusion that is spread over the entire group of
frames. $SB$ over $SF$ causes disruptive foreground salient movements, and hence with the help of
[31]-[33] this issue was resolved with an alternate solution.
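$$\begin{aligned} \min_{M_{c,x},\,S_{c,x},\,\vartheta,\,A\odot\vartheta}\ & \|M_c\|_* + \|D_x\|_* + \|A + \vartheta\|^2 + \alpha_1\|S_c\|_1 + \alpha_2\|S_x\| \\ \text{s.t. } & M_c = D_c + S_c,\ M_x = D_x + S_x,\ M_c = SB \odot \vartheta,\ M_x = SF \odot \vartheta, \\ & \vartheta = \{E_1, E_2, \ldots, E_n\},\ E_i \in \{0,1\}^{m \times m},\ E_i 1^K = 1 \end{aligned} \tag{6}$$

where $\|\cdot\|_*$ is the nuclear norm and $A$ is the position matrix.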
The variables $D_c$ and $D_x$ represent the colour and saliency mappings, $\vartheta$ is the permutation
matrix, while $S_x$ and $S_c$ represent the colour feature sparse component space and the saliency feature
space. This entire equation set helps in correcting super-pixel correspondences.
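Two building blocks of this subsection are simple enough to sketch in plain Python: the element-wise shrinkage $sign(x)[|x| - t]_+$ that appears in the sparse update, and the rough foreground $RF_i$. The sketch is illustrative only; the function names are hypothetical, the threshold `t` stands in for $\alpha\beta$, and the low-rank update is omitted because it needs a singular value decomposition routine.

```python
def soft_threshold(x, t):
    """sign(x) * max(|x| - t, 0): shrinks x towards zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def rough_foreground(lls, omega):
    """RF_i = [sum_k LLS_{k,i} - omega / (n * m) * sum_k sum_i LLS_{k,i}]_+

    lls:   n frames, each a list of m fused saliency values LLS_{k,i}
    omega: reliability control factor
    """
    n, m = len(lls), len(lls[0])
    total = sum(sum(frame) for frame in lls)
    offset = omega * total / (n * m)
    return [max(sum(lls[k][i] for k in range(n)) - offset, 0.0)
            for i in range(m)]
```

Super pixels whose accumulated saliency over the group stays below the scaled group mean are clipped to zero, so only consistently salient regions survive as rough foreground.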
3.4. Mathematical model
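Equation (6) generates a distributed version of convex problems:

$$\begin{aligned} D(M_{c,x}, S_{c,x}, \vartheta, A \odot \vartheta) ={}& \alpha_1\|S_c\|_1 + \alpha_2\|E_x\|_2 + \beta_1\|M_c\|_* + \beta_2\|M_x\|_* + \|A \odot \vartheta\|^2 \\ &+ trace\left(Z_1^K(M_c - D_c - S_c)\right) + trace\left(Z_2^K(M_x - D_x - S_x)\right) \\ &+ \frac{\pi}{2}\left(\|M_c - D_c - S_c\|^2 + \|M_x - D_x - S_x\|^2\right) \end{aligned}$$

where $Z_i$ represents the Lagrangian multiplier and $\pi$ denotes the steps of iterations. The optimized
solution using partial derivatives is

$$S_{c,x}^{k+1} = \frac{1}{2}\left\|S_{c,x}^{k} - \left(M_{c,x}^{k} - S_{c,x}^{k} + Z_{1,2}^{k}/\pi_k\right)\right\|_2^2 + \min_{S_{c,x}^{k}} \alpha_{1,2}\|S_{c,x}^{k}\|_1/\pi_k$$

and

$$D_{c,x}^{k+1} = \frac{1}{2}\left\|D_{c,x}^{k} - \left(M_{c,x}^{k} - D_{c,x}^{k} + Z_{1,2}^{k}/\pi_k\right)\right\|_2^2 + \min_{D_{c,x}^{k}} \beta_{1,2}\|D_{c,x}^{k}\|_*/\pi_k$$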
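$D_i$ is updated to become $D_{c,x}^{k+1} \leftarrow V\left[\Sigma - \frac{\beta_{1,2}}{\pi_k}\right]_+U^K$, where
$(V, \Sigma, U) \leftarrow svd\left(M_{c,x}^{k} - S_{c,x}^{k} + \frac{Z_{1,2}^{k}}{\pi_k}\right)$.
Similarly, for $S_i$, $S_{c,x}^{k+1} \leftarrow sign\left(\frac{|J|}{\pi_k}\right)\left[J - \frac{\alpha_{1,2}}{\pi_k}\right]_+$
with $J = M_{c,x}^{k} - D_{c,x}^{k} + Z_{c,x}^{k}/\pi_k$.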
To determine the value of $E$, the norm cost $L \in \mathbb{R}^{m \times m}$ is calculated as
$l_{i,j}^{k} = \|O_{k,i} - H(V_1, j)\|_2$, $V_1 = H(SB, k) \odot E_k$ and
$l_{i,j}^{k} = \|O_{k,i} - H(V_2, j)\|_2$, $V_2 = H(SB, k) \odot E_k$. We then use an objective matrix $O$ to
calculate the $k$-th $RF$, with $O_{k,i} = S_{c,x}(k,i) + D_{c,x}(k,i) - Z_{1,2}(k,i)/\pi_k$. $L_\tau$ needs to
be changed, as it is hard to approximate the value of $\min\|A + \vartheta\|^2$.
$L_\tau = \{r_{1,1}^{\tau} + d_{1,1}^{\tau}, r_{1,2}^{\tau} + d_{1,2}^{\tau}, \ldots, r_{m,m}^{\tau} + d_{m,m}^{\tau}\} \in \mathbb{R}^{m \times m}$
for $\tau \in [k-1, k+1]$ is changed to $L_k$ as shown in (7).

$$H(L_k, j) \leftarrow \sum_{\tau=k-1}^{k+1}\sum_{p_{t,v} \in \xi} H(L_\tau, v) \cdot \exp\left(-\|c_{\tau,v}, c_{k,j}\|_1/\mu\right) \tag{7}$$
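The global optimization is solved using the equations $SF^{k+1} \leftarrow SF^{k} \odot \vartheta$,
$SB^{k+1} \leftarrow SB^{k} \odot \vartheta$ and
$Z_{1,2}^{k+1} \leftarrow \pi_k(M_{c,x}^{k} - D_{c,x}^{k} - S_{c,x}^{k}) + Z_{1,2}^{k}$, where
$\pi_{k+1} \leftarrow \pi_k \times 1.05$. The alignment of the super pixels is now given by
$gS_i = \frac{1}{n-1}\sum_{\tau=1, i \neq \tau}^{n} H(SF \odot \vartheta, \tau)$. To reduce the incorrect
detections and alignments we introduce $\widetilde{SF}$ and use (8)-(10).

$$\widetilde{SF} \leftarrow SF \odot \vartheta \tag{8}$$

$$SF \leftarrow \widetilde{SF} \cdot \left(1^{m \times n} - X(S_c)\right) + \rho \cdot \overline{SF} \cdot X(S_c) \tag{9}$$

$$\rho_{i,j} = \begin{cases} 0.5, & \frac{1}{n}\sum_{j=1}^{n}\widetilde{SF}_{i,j} < \widetilde{SF}_{i,j} \\ 2, & \text{otherwise} \end{cases} \tag{10}$$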
The mapping for the $i$-th video frame is given by

$$gS_i = \frac{H(\rho, i) - \left(H(\rho, i) \cdot X(S_c)\right)}{H(\rho, i)(n-1)}\sum_{\tau=1, i \neq \tau}^{n} H(SF \odot \vartheta, \tau)$$

The inner temporal batch $x_r$ of the current group's frames needs to be diffused based on the degree of
colour similarity. The final output is given by

$$gS_{i,j} = \frac{x_r \cdot y_r + \sum_{i=1}^{n} y_i \cdot gS_{i,j}}{y_r + \sum_{i=1}^{n} y_i}; \quad y_r = \exp\left(-\|c_{r,j}, c_{i,j}\|_2/\mu\right)$$

where $x_l$ showcases the colour distance-based weights.
4. RESULTS, EXPERIMENTS AND DATABASE
The proposed solution has been compared with [34] as the base reference, as well as with the
operational block description length (OBDL) algorithm of [35], the dynamic adaptive whitening saliency
(AWS-D) algorithm of [36], the object-to-motion convolutional neural network two-layer long short-term
memory (OMCNN-2CLSTM) algorithm in [36], the attentive convolutional (ACL) algorithm [37], and the
saliency-aware video compression (SAVC) algorithm from [38] and [39]. The database used is the same as
the one in the base paper. It is a high-definition eye-tracking database that is openly available on GitHub at
https://github.com/spzhubuaa/Video-based-Eye-Tracking-Dataset [40]. Ten video sequences at three different
resolutions, 1920 × 1080, 1280 × 720, and 832 × 480, were taken for experimentation. To evaluate the
performance of all the saliency methods, we employed five global evaluation metrics, namely area under the
ROC curve (AUC), similarity (SIM), correlation coefficient (CC), normalized scanpath saliency (NSS) and
Kullback-Leibler divergence (KL).
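For readers unfamiliar with these metrics, the sketch below shows the commonly used definitions of SIM, CC, NSS and KL on flattened saliency maps. It is an illustration under stated assumptions, not the evaluation code used in the paper: maps are plain lists of non-negative values, `fixations` is a 0/1 list of the same length, and the small constant `eps` in KL is an assumed regulariser. AUC is omitted because it depends on the chosen sampling variant.

```python
import math


def normalise_sum(p):
    """Scale a map so that its values sum to 1."""
    total = sum(p)
    return [v / total for v in p]


def sim(pred, gt):
    """Similarity: histogram intersection of the two normalised maps."""
    p, q = normalise_sum(pred), normalise_sum(gt)
    return sum(min(a, b) for a, b in zip(p, q))


def cc(pred, gt):
    """Pearson linear correlation coefficient between the two maps."""
    n = len(pred)
    mp, mg = sum(pred) / n, sum(gt) / n
    cov = sum((a - mp) * (b - mg) for a, b in zip(pred, gt))
    sp = math.sqrt(sum((a - mp) ** 2 for a in pred))
    sg = math.sqrt(sum((b - mg) ** 2 for b in gt))
    return cov / (sp * sg)


def nss(pred, fixations):
    """Mean of the z-scored prediction at fixated locations."""
    n = len(pred)
    mean = sum(pred) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in pred) / n)
    hits = [(a - mean) / std for a, f in zip(pred, fixations) if f]
    return sum(hits) / len(hits)


def kl(pred, gt, eps=1e-12):
    """KL divergence of the ground-truth distribution from the prediction."""
    p, q = normalise_sum(pred), normalise_sum(gt)
    return sum(b * math.log(eps + b / (a + eps)) for a, b in zip(p, q))


pred = [0.1, 0.2, 0.3, 0.4]
score_sim = sim(pred, pred)
score_cc = cc(pred, pred)
score_kl = kl(pred, pred)
score_nss = nss(pred, [0, 0, 0, 1])
```

Higher is better for AUC, SIM, CC and NSS, whereas a lower KL value indicates a closer match to the ground truth.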
The XU algorithm is quite similar to HEVC; hence its saliency detection is better than that of most
algorithms, but it faces problems when the input contains complex images. Other than that, our proposed
solution has performed remarkably well and has the best compression efficiency and precision among all the
algorithms compared. Table 1 shows the results for the saliency algorithms used. Figure 1 shows the
saliency evaluation and comparison graph.
Table 1. Results for the saliency algorithms used: fixation maps, XU [40], base paper [34] and the
proposed algorithm
Columns (video sequences): BasketBall, FourPeople, RaceHorses
Rows (saliency maps, shown as images): Fixation maps; XU [40]; Base paper [34]; Proposed algorithm
Figure 1. Saliency evaluation and comparison graph
5. CONCLUSION
This paper has proposed a modified spatiotemporal fusion method for video saliency detection.
It involves a modified fusion calculation along with several changes to the basic HEVC code to
include colour contrast computations and to boost both motion and colour values. There is also a
spatiotemporal pixel-based coherency boost to increase temporal-scope saliency. The proposed work was
tested on the same database as the base paper and compared with other state-of-the-art methods using
five global evaluation metrics: AUC, SIM, CC, NSS and KL. We conclude that the proposed
algorithm has the best performance of all the methods compared, with better compression
efficiency and precision.
REFERENCES
[1] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254–1259, 1998, doi: 10.1109/34.730558.
[2] C. Guo, Q. Ma, and L. Zhang, “Spatio-temporal saliency detection using phase spectrum of quaternion fourier transform,” in 26th
IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Jun. 2008, pp. 1–8, doi: 10.1109/CVPR.2008.4587715.
[3] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in 2009 IEEE Conference on
Computer Vision and Pattern Recognition, Jun. 2010, pp. 1597–1604, doi: 10.1109/cvpr.2009.5206596.
[4] M. Cerf, E. P. Frady, and C. Koch, “Faces and text attract gaze independent of the task: experimental data and computer model,”
Journal of Vision, vol. 9, no. 12, pp. 10–10, Nov. 2009, doi: 10.1167/9.12.10.
[5] M. Cerf, J. Harel, W. Einhäuser, and C. Koch, “Predicting human gaze using low-level saliency combined with face detection,”
Advances in Neural Information Processing Systems 20 (NIPS 2007), 2008.
ISSN: 2089-4864
Int J Reconfigurable & Embedded Syst, Vol. 12, No. 2, July 2023: 269-275
[6] L. J. Li and L. Fei-Fei, “What, where and who? Classifying events by scene and object recognition,” in Proceedings of the IEEE
International Conference on Computer Vision, 2007, pp. 1–8, doi: 10.1109/ICCV.2007.4408872.
[7] B. Scassellati, “Theory of mind for a humanoid robot,” Autonomous Robots, vol. 12, no. 1, pp. 13–24, 2002, doi:
10.1023/A:1013298507114.
[8] S. Marat, T. H. Phuoc, L. Granjon, N. Guyader, D. Pellerin, and A. Guérin-Dugué, “Spatio-temporal saliency model to predict eye
movements in video free viewing,” 2008 16th European Signal Processing Conference, Lausanne, 2008, pp. 1-5.
[9] Y. F. Ma and H. J. Zhang, “A model of motion attention for video skimming,” in IEEE International Conference on Image
Processing, 2002, vol. 1, pp. I-129-I–132, doi: 10.1109/icip.2002.1037976.
[10] S. Li and M. C. Lee, “Fast visual tracking using motion saliency in video,” in ICASSP, IEEE International Conference on
Acoustics, Speech and Signal Processing - Proceedings, 2007, vol. 1, pp. I-1073-I–1076, doi: 10.1109/ICASSP.2007.366097.
[11] R. J. Peters and L. Itti, “Beyond bottom-up: incorporating task-dependent influences into a computational model of spatial
attention,” in 2007 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2007, pp. 1–8, doi:
10.1109/CVPR.2007.383337.
[12] A. C. Schütz, D. I. Braun, and K. R. Gegenfurtner, “Object recognition during foveating eye movements,” Vision Research, vol.
49, no. 18, pp. 2241–2253, 2009, doi: 10.1016/j.visres.2009.05.022.
[13] F. Zhou, S. B. Kang, and M. F. Cohen, “Time-mapping using space-time saliency,” in 2014 IEEE Conference on Computer
Vision and Pattern Recognition, Jun. 2014, pp. 3358–3365, doi: 10.1109/CVPR.2014.429.
[14] Z. Liu, X. Zhang, S. Luo, and O. Le Meur, “Superpixel-based spatiotemporal saliency detection,” IEEE Transactions on Circuits
and Systems for Video Technology, vol. 24, no. 9, pp. 1522–1540, Sep. 2014, doi: 10.1109/TCSVT.2014.2308642.
[15] Y. Li, S. Li, C. Chen, A. Hao and H. Qin, “Accurate and robust video saliency detection via self-paced diffusion,” in IEEE
Transactions on Multimedia, vol. 22, no. 5, pp. 1153-1167, May 2020, doi: 10.1109/TMM.2019.2940851.
[16] Y. Fang, G. Ding, J. Li and Z. Fang, “Deep3DSaliency: deep stereoscopic video saliency detection model by 3D convolutional
networks,” in IEEE Transactions on Image Processing, vol. 28, no. 5, pp. 2305-2318, May 2019, doi: 10.1109/TIP.2018.2885229.
[17] C. Chen, Y. Li, S. Li, H. Qin and A. Hao, “A novel bottom-up saliency detection method for video with dynamic background,” in
IEEE Signal Processing Letters, vol. 25, no. 2, pp. 154-158, Feb. 2018, doi: 10.1109/LSP.2017.2775212.
[18] T. M. Hoang and J. Zhou, “Recent trending on learning based video compression: A survey,” Cognitive Robotics, vol. 1,
pp. 145–158, 2021, doi: 10.1016/j.cogr.2021.08.003.
[19] A. Borji, “Saliency prediction in the deep learning era: successes and limitations,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 43, no. 2, pp. 679–700, Feb. 2021, doi: 10.1109/TPAMI.2019.2935715.
[20] W. Wang, J. Shen, J. Xie, M.-M. Cheng, H. Ling, and A. Borji, “Revisiting video saliency prediction in the deep learning era,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 1, pp. 220–237, Jan. 2021, doi:
10.1109/TPAMI.2019.2924417.
[21] M. Startsev and M. Dorr, “Supersaliency: a novel pipeline for predicting smooth pursuit-based attention improves generalisability
of video saliency,” IEEE Access, vol. 8, pp. 1276–1289, 2020, doi: 10.1109/ACCESS.2019.2961835.
[22] H. Li, F. Qi, and G. Shi, “A novel spatio-temporal 3D convolutional encoder-decoder network for dynamic saliency prediction,”
IEEE Access, vol. 9, pp. 36328–36341, 2021, doi: 10.1109/ACCESS.2021.3063372.
[23] F. Guo, W. Wang, Z. Shen, J. Shen, L. Shao, and D. Tao, “Motion-aware rapid video saliency detection,” in IEEE Transactions
on Circuits and Systems for Video Technology, vol. 30, no. 12, pp. 4887-4898, Dec. 2020, doi: 10.1109/TCSVT.2019.2906226.
[24] E. Rahtu, J. Kannala, M. Salo, and J. Heikkilä, “Segmenting salient objects from images and videos,” in Lecture Notes in
Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6315
LNCS, no. PART 5, 2010, pp. 366–379, doi: 10.1007/978-3-642-15555-0_27.
[25] Y. Fang, Z. Wang, and W. Lin, “Video saliency incorporating spatiotemporal cues and uncertainty weighting,” in Proceedings -
IEEE International Conference on Multimedia and Expo, Jul. 2013, pp. 1–6, doi: 10.1109/ICME.2013.6607572.
[26] W. Wang, J. Shen, and F. Porikli, “Saliency-aware geodesic video object segmentation,” in 2015 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), Jun. 2015, vol. 07-12-June, pp. 3395–3402, doi: 10.1109/CVPR.2015.7298961.
[27] W. Wang, J. Shen, and L. Shao, “Consistent video saliency using local gradient flow optimization and global refinement,”
IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 4185–4196, Nov. 2015, doi: 10.1109/TIP.2015.2460013.
[28] Z. Liu, L. Meur, and S. Luo, “Superpixel-based saliency detection,” in International Workshop on Image Analysis for Multimedia
Interactive Services, Jul. 2013, pp. 1–4, doi: 10.1109/WIAMIS.2013.6616119.
[29] Z. Bylinskii, T. Judd, A. Oliva, A. Torralba, and F. Durand, “What do different evaluation metrics tell us about saliency models?,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 3, pp. 740–757, Mar. 2019, doi:
10.1109/TPAMI.2018.2815601.
[30] J. Wright, Y. Peng, Y. Ma, A. Ganesh, and S. Rao, “Robust principal component analysis: exact recovery of corrupted low-rank
matrices by convex optimization,” in Advances in Neural Information Processing Systems 22 - Proceedings of the 2009
Conference, 2009, pp. 2080–2088.
[31] X. Zhou, C. Yang, and W. Yu, “Moving object detection by detecting contiguous outliers in the low-rank representation,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 3, pp. 597–610, 2013, doi: 10.1109/TPAMI.2012.132.
[32] Z. Zeng, T.-H. Chan, K. Jia, and D. Xu, “Finding correspondence from multiple images via sparse and low-rank decomposition,”
in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in
Bioinformatics), vol. 7576 LNCS, no. PART 5, 2012, pp. 325–339, doi: 10.1007/978-3-642-33715-4_24.
[33] P. Ji, H. Li, M. Salzmann, and Y. Dai, “Robust motion segmentation with unknown correspondences,” in Lecture Notes in
Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8694
LNCS, no. PART 6, 2014, pp. 204–219, doi: 10.1007/978-3-319-10599-4_14.
[34] S. Zhu, C. Liu, and Z. Xu, “High-definition video compression system based on perception guidance of salient information of a
convolutional neural network and HEVC compression domain,” IEEE Transactions on Circuits and Systems for Video
Technology, vol. 30, no. 7, pp. 1–1, 2020, doi: 10.1109/TCSVT.2019.2911396.
[35] S. H. Khatoonabadi, N. Vasconcelos, I. V. Bajic, and Y. Shan, “How many bits does it take for a stimulus to be salient?,” in 2015
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2015, vol. 07-12-June, pp. 5501–5510, doi:
10.1109/CVPR.2015.7299189.
[36] V. Leboran, A. Garcia-Diaz, X. R. Fdez-Vidal, and X. M. Pardo, “Dynamic whitening saliency,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 39, no. 5, pp. 893–907, May 2017, doi: 10.1109/TPAMI.2016.2567391.
[37] W. Wang, J. Shen, F. Guo, M.-M. Cheng, and A. Borji, “Revisiting video saliency: a large-scale benchmark and a new model,” in
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2018, pp. 4894–4903, doi:
10.1109/CVPR.2018.00514.
[38] H. Hadizadeh and I. V. Bajic, “Saliency-aware video compression,” IEEE Transactions on Image Processing, vol. 23, no. 1, pp.
19–33, Jan. 2014, doi: 10.1109/TIP.2013.2282897.
[39] M. Xu, L. Jiang, X. Sun, Z. Ye, and Z. Wang, “Learning to detect video saliency with HEVC features,” IEEE Transactions on
Image Processing, vol. 26, no. 1, pp. 369–385, Jan. 2017, doi: 10.1109/TIP.2016.2628583.
[40] F. Zhang, “VED100: A video-based eye-tracking dataset on visual saliency detection,” Jan 1, 2019. Distributed by Github.
https://github.com/spzhubuaa/VED100-A-Video-Based-Eye-Tracking-Dataset-on-Visual-Saliency-Detection
BIOGRAPHIES OF AUTHORS
Vinay C. Warad is working as an assistant professor in the department of computer
science and engineering at Khawaja Bandanawaz College of Engineering. He has 8 years of
teaching experience. His areas of interest are video saliency and image retrieval. He can be contacted
at email: vinaywarad999@gmail.com.
Dr. Ruksar Fatima is a professor and head of the department of computer science
and engineering, vice principal and examination in charge at Khaja Bandanawaz College of
Engineering (KBNCE), Kalaburagi, Karnataka. She is an Advisory Board Member for IJESRT
(International Journal of Engineering and Research Technology). She is a member of the
International Association of Engineers (IAENG). She can be contacted at email:
ruksarf@gmail.com.