Video Stitching using Improved RANSAC and SIFT (IRJET Journal)
1. The document discusses techniques for stitching multiple video frames into a panoramic video using Scale-Invariant Feature Transform (SIFT) and an improved RANSAC algorithm.
2. Key points and feature descriptors are extracted from frames using SIFT to find correspondences between frames. The improved RANSAC algorithm is used to estimate homography matrices between frames and filter outlier matches.
3. Frames are blended together to compensate for exposure differences and misalignments before being mapped to a reference plane to create the panoramic video mosaic. The algorithm aims to produce a high quality panoramic video in real-time.
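The three steps above can be sketched end to end. The snippet below is a hedged illustration, not the paper's improved RANSAC: it runs a plain RANSAC loop around a direct-linear-transform (DLT) homography fit on synthetic correspondences, which stand in for the SIFT matches the paper uses.

```python
import numpy as np

def homography_dlt(src, dst):
    # Direct Linear Transform: stack two equations per correspondence and
    # take the SVD null vector as the 3x3 homography.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    H = np.linalg.svd(np.asarray(A, float))[2][-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, iters=500, thresh=2.0, seed=0):
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), bool)
    for _ in range(iters):
        sample = rng.choice(len(src), 4, replace=False)
        H = homography_dlt(src[sample], dst[sample])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return homography_dlt(src[best], dst[best]), best   # refit on all inliers

# Synthetic matches: 48 correct correspondences under a known homography
# plus 12 gross outliers standing in for bad feature matches.
rng = np.random.default_rng(1)
H_true = np.array([[0.9, 0.05, 10.0], [-0.04, 1.1, -5.0], [1e-4, 2e-4, 1.0]])
src = rng.uniform(0, 500, (60, 2))
dst = project(H_true, src)
dst[:12] += rng.uniform(30, 80, (12, 2))
H_est, inliers = ransac_homography(src, dst)
```

In a real stitcher the `src`/`dst` arrays would come from matched SIFT descriptors, and the estimated homography would then drive the warp onto the reference plane before blending.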
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Google Research SIGGRAPH Whitepaper | Total Relighting: Learning to Relight Portraits for Background Replacement (Alejandro Franceschi)
Abstract:
Given a portrait and an arbitrary high dynamic range lighting environment, our framework uses machine learning to composite the subject into a new scene, while accurately modeling their appearance in the target illumination condition. We estimate a high quality alpha matte, foreground element, albedo map, and surface normals, and we propose a novel, per-pixel lighting representation within a deep learning framework.
This paper proposes a new algorithm for single-image super-resolution that exploits image compressibility in the wavelet domain using compressed sensing theory. The algorithm incorporates the downsampling low-pass filter into the measurement matrix to decrease coherence between the wavelet basis and sampling basis, allowing use of wavelets. It then uses a greedy algorithm to solve for sparse wavelet coefficients representing the high-resolution image. Results show improved performance over existing super-resolution approaches without requiring training data.
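The greedy sparse-recovery step can be illustrated with orthogonal matching pursuit (OMP), a standard greedy solver for this kind of problem; the Gaussian measurement matrix below is an illustrative stand-in for the paper's filter-embedded measurement matrix, not its actual construction.

```python
import numpy as np

def omp(Phi, y, k):
    # Orthogonal Matching Pursuit: repeatedly pick the dictionary column most
    # correlated with the residual, then least-squares refit on the support.
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
n, m, k = 128, 96, 3                       # signal length, measurements, sparsity
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[7, 40, 101]] = [1.5, -2.0, 1.0]    # a 3-sparse coefficient vector
x_hat = omp(Phi, Phi @ x_true, k)
```

With enough measurements relative to the sparsity, OMP recovers the support exactly in the noiseless case; in the super-resolution setting the recovered coefficients would be wavelet coefficients of the high-resolution image.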
1. The document presents an approach to enhance the realism of synthetic images rendered by game engines. A convolutional network is trained to modify rendered images using intermediate representations from the rendering process.
2. The network is trained with an adversarial objective to provide strong supervision at multiple perceptual levels. A new strategy is proposed for sampling image patches during training to address differences in scene layout distributions between datasets.
3. The approach significantly enhances photorealism over recent image-to-image translation methods and baselines, as shown in controlled experiments. It can add realistic details like gloss, vegetation, and road textures while keeping enhancements consistent with the input image content.
This document summarizes a research paper that presents a real-time 3D reconstruction method using stereo vision from a driving car. The method extends LSD-SLAM with stereo capabilities to simultaneously track camera pose and reconstruct semi-dense depth maps. It is evaluated on the KITTI dataset and compared to laser scans and traditional stereo methods. Results show the direct SLAM technique generates visually pleasing and globally consistent semi-dense reconstructions in real-time on a single CPU.
DTAM: Dense Tracking and Mapping in Real-Time, Robot Vision Group (Lihang Li)
These are the slides on DTAM from my group meeting report; I hope they help anyone who wants to implement DTAM and needs to understand it deeply.
This document presents a new technique for enhancing the contrast of low-contrast satellite images using discrete wavelet transform (DWT) and singular value decomposition (SVD). It begins with an abstract and introduction describing the technique. The technique uses DWT to decompose an input satellite image into frequency subbands, and SVD to estimate the singular value matrix of the low-low subband. The singular values are modified to enhance contrast before reconstructing the final image. The proposed DWT-SVD technique is compared to general histogram equalization (GHE) and singular value equalization (SVE), with results suggesting it outperforms these methods both visually and quantitatively. The document also discusses using fast Fourier transform and bi-log
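The LL-subband manipulation at the heart of the technique can be sketched as follows. This is a minimal illustration assuming a one-level Haar transform and a fixed gain on the singular values; the paper derives its gain from a reference equalized image, which is omitted here.

```python
import numpy as np

def haar2d(img):
    # One-level orthonormal 2-D Haar DWT (image sides must be even)
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)   # LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    out = np.empty((2 * LL.shape[0], 2 * LL.shape[1]))
    out[0::2, 0::2] = (LL + LH + HL + HH) / 2
    out[0::2, 1::2] = (LL - LH + HL - HH) / 2
    out[1::2, 0::2] = (LL + LH - HL - HH) / 2
    out[1::2, 1::2] = (LL - LH - HL + HH) / 2
    return out

def dwt_svd_enhance(img, gain=1.3):
    # Scale the singular values of the LL subband to stretch global contrast
    # while leaving the detail (edge) subbands untouched.
    LL, LH, HL, HH = haar2d(img.astype(float))
    U, S, Vt = np.linalg.svd(LL, full_matrices=False)
    return np.clip(ihaar2d(U @ np.diag(S * gain) @ Vt, LH, HL, HH), 0, 255)

x = np.linspace(0, np.pi, 64)
img = 110 + 25 * np.outer(np.sin(x), np.sin(x))   # smooth, low-contrast test image
enh = dwt_svd_enhance(img)
```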
This paper presents a new approach for the enhancement of synthetic radar imagery using the discrete wavelet transform (DWT) and its variants. Approaches such as nonlocal filtering (NLF) techniques and multiscale iterative reconstruction (e.g., the BM3D method) do not solve the RE/SR imaging inverse problems in descriptive settings that impose structured regularization constraints and exploit the sparsity of the desired image representations for resolution enhancement (RE) and superresolution (SR) of coherent remote sensing (RS) imagery. Such approaches are not properly adapted to the SR recovery of speckle-corrupted low-resolution (LR) coherent radar imagery. These pitfalls are avoided by the DWT approach, wherein the despeckled/deblurred HR image is recovered from the LR speckle- and blur-corrupted radar image by applying descriptive-experiment-design-regularization (DEDR) based reconstructive steps. Next, multistage RE is performed in each scaled refined SR frame via iterative reconstruction of the upscaled radar images, followed by DWT-based sparsity-promoting denoising with guaranteed consistency preservation in each resolution frame. The performance of the proposed method is compared with existing techniques in terms of the number of iterations required.
Abstract: Primarily due to progress in super-resolution imagery, segment-based image analysis methods for generating and updating geographical information are becoming increasingly important. This work presents an image segmentation based on colour features with K-means clustering. The work is divided into two stages. First, the colour separation of the satellite image is enhanced using decorrelation stretching; then the regions are grouped into a set of five classes using the K-means clustering algorithm. The spatial information is first gathered around every pixel, and two filtering procedures are added to suppress the effect of pseudo-edges. In addition, a spatial-information weight is constructed and clustered with K-means, and the regularization strength in each region is controlled by the cluster-centre value. Experimental results, on both simulated and real datasets, demonstrate that the proposed approach can effectively reduce the pseudo-edges of total-variation regularization.
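The clustering stage can be sketched with plain Lloyd's K-means on per-pixel colour features. This is an illustration, not the paper's exact pipeline: the decorrelation-stretch stage is omitted, farthest-point seeding replaces the spatial weighting, and two classes are used instead of five so the toy result is easy to check.

```python
import numpy as np

def kmeans_segment(X, k=2, iters=10, seed=0):
    # Lloyd's algorithm with farthest-point initialisation on (n, features) rows
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])          # farthest point from current seeds
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Toy "satellite" image: two colour regions plus a little sensor noise
rng = np.random.default_rng(1)
img = np.zeros((32, 32, 3))
img[:, :16] = [0.2, 0.6, 0.2]    # "vegetation"
img[:, 16:] = [0.5, 0.4, 0.3]    # "soil"
img += rng.normal(0, 0.02, img.shape)
labels, _ = kmeans_segment(img.reshape(-1, 3), k=2)
seg = labels.reshape(32, 32)
```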
This document presents a paper that proposes an image registration algorithm using log-polar transform and FFT-based correlation. The algorithm first estimates the angle, scale, and translation between two images by converting them to the log-polar domain, where rotation and scaling appear as translation. It then recovers the residual translation using gradient correlation in the spatial domain. The algorithm is tested on various images related by similarity transformations and is shown to accurately recover scales up to 5.85 times while being robust to noise. It provides a computationally efficient way to register images using properties of the Fourier transform and log-polar mappings.
Intel, Intelligent Systems Lab: Stable View Synthesis Whitepaper (Alejandro Franceschi)
We present Stable View Synthesis (SVS). Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene. The method operates on a geometric scaffold computed via structure-from-motion and multi-view stereo. Each point on this 3D scaffold is associated with view rays and corresponding feature vectors that encode the appearance of this point in the input images.
The core of SVS is view-dependent on-surface feature aggregation, in which directional feature vectors at each 3D point are processed to produce a new feature vector for a ray that maps this point into the new target view.
The target view is then rendered by a convolutional network from a tensor of features synthesized in this way for all pixels. The method is composed of differentiable modules and is trained end-to-end. It supports spatially-varying view-dependent importance weighting and feature transformation of source images at each point; spatial and temporal stability due to the smooth dependence of on-surface feature aggregation on the target view; and synthesis of view-dependent effects such as specular reflection.
Experimental results demonstrate that SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets, achieving unprecedented levels of realism in free-viewpoint video of challenging large-scale scenes.
Reversible watermarking based on invariant image classification and dynamic h...JPINFOTECH JAYAPRAKASH
This document proposes a new reversible watermarking scheme that has two main contributions: 1) A dynamic histogram shifting modulation that adapts to local image content specifics when inserting data into textured areas. 2) A classification process that identifies parts of the image best suited for watermarking with different reversible modulations. This classification is based on a reference image derived from the original, allowing synchronized watermark embedding and extraction. Experiments show the proposed method can insert more data with lower distortion than existing schemes.
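The histogram-shifting modulation that the paper adapts can be sketched in its basic (non-dynamic) form: shift the histogram between a peak bin and an empty bin to open a gap, embed one bit per peak-valued pixel, and invert both steps on extraction. The sketch below assumes an empty bin exists to the right of the peak.

```python
import numpy as np

def hs_embed(img, bits):
    # Histogram-shifting embedding: shift bins in (peak, zero) up by one to
    # open a gap, then encode each bit in a peak-valued pixel (peak -> peak+bit).
    hist = np.bincount(img.ravel(), minlength=256)
    peak = int(hist.argmax())
    zero = peak + 1 + int(hist[peak + 1:].argmin())   # an empty bin right of the peak
    flat = img.astype(int).ravel().copy()
    flat[(flat > peak) & (flat < zero)] += 1
    carriers = np.flatnonzero(flat == peak)
    assert len(bits) <= len(carriers), "payload exceeds capacity"
    flat[carriers[:len(bits)]] += np.asarray(bits, int)
    return flat.reshape(img.shape).astype(np.uint8), peak, zero

def hs_extract(marked, peak, zero, n_bits):
    flat = marked.astype(int).ravel().copy()
    carriers = np.flatnonzero((flat == peak) | (flat == peak + 1))[:n_bits]
    bits = (flat[carriers] == peak + 1).astype(int)
    flat[carriers] -= bits                            # undo the embedding
    flat[(flat > peak + 1) & (flat <= zero)] -= 1     # undo the shift
    return bits, flat.reshape(marked.shape).astype(np.uint8)

rng = np.random.default_rng(0)
img = rng.integers(0, 128, (64, 64)).astype(np.uint8)
bits = rng.integers(0, 2, 24)
marked, peak, zero = hs_embed(img, bits)
rec_bits, restored = hs_extract(marked, peak, zero, len(bits))
```

Each pixel changes by at most one grey level, and extraction restores the original image bit-exactly; the paper's contribution is making the shift adapt to local content rather than using one global peak/zero pair as here.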
Optic Flow Estimation by Deep Learning outlines several key concepts in optical flow estimation including:
- Optical flow is the apparent motion of brightness patterns in images. Estimating optical flow involves making assumptions like brightness constancy and spatial coherence.
- Classical algorithms like Lucas-Kanade and Horn-Schunck use techniques like regularization, coarse-to-fine processing, and descriptor matching to address challenges like the aperture problem, large displacements, and occlusions.
- Recent deep learning approaches like FlowNet, DeepFlow, and EpicFlow use convolutional neural networks to directly learn optical flow, achieving state-of-the-art performance on benchmarks. These approaches combine descriptor matching, variational optimization,
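The classical Lucas-Kanade step mentioned above reduces, for a single window, to solving a 2x2 linear system built from image gradients:

```python
import numpy as np

def lucas_kanade(I0, I1):
    # Single-window Lucas-Kanade: brightness constancy Ix*u + Iy*v + It = 0,
    # solved in the least-squares sense over the whole window.
    Iy, Ix = np.gradient(I0)
    It = I1 - I0
    A = np.array([[(Ix * Ix).sum(), (Ix * Iy).sum()],
                  [(Ix * Iy).sum(), (Iy * Iy).sum()]])
    b = -np.array([(Ix * It).sum(), (Iy * It).sum()])
    return np.linalg.solve(A, b)        # (u, v) = flow along x and y

# A Gaussian blob translated by 0.7 px in x: a case where a single
# linearised step is already accurate.
yy, xx = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * 5.0 ** 2))
u, v = lucas_kanade(blob(30.0, 32.0), blob(30.7, 32.0))
```

The aperture problem shows up here directly: when the window contains only a 1-D gradient, the 2x2 matrix A becomes singular, which is why the classical methods add regularization or coarse-to-fine processing.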
ADAPTIVE, SCALABLE, TRANSFORM-DOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABILIZATION (cscpconf)
Video stabilization, which is important for better analysis and user experience, is typically done through Global Motion Estimation (GME) and compensation. GME can be done in the image domain using many techniques, or in the transform domain using the well-known phase correlation methods, which relate motion to phase shift in the spectrum. While image-domain methods are generally slower (due to dense vector field computations), they can do global as well as local motion estimation. Transform-domain methods cannot normally do local motion, but are faster, more accurate on homogeneous images, and resilient to even rapid illumination changes and large motion. However, both approaches can become very time consuming if one needs more accuracy and smoothness, because of the nature of the tradeoff. We show here that wavelet transforms can be used in a novel way to achieve very smooth stabilization along with a significant speedup in this Fourier-domain computation without sacrificing accuracy. We do this by adaptively selecting and combining motion computed on a specific pair of sub-bands using the wavelet interpolation capability. Our approach yields a smooth, scalable, fast and adaptive algorithm (based on time requirement and recent motion history) with significantly better accuracy than a single-level wavelet decomposition based approach.
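The phase-correlation principle this work builds on can be sketched in a few lines: a global translation between frames appears as a linear phase in the cross-power spectrum, and the inverse FFT of the normalized spectrum peaks at the shift.

```python
import numpy as np

def phase_correlate(I0, I1):
    # Global translation from the phase of the cross-power spectrum
    R = np.conj(np.fft.fft2(I0)) * np.fft.fft2(I1)
    corr = np.fft.ifft2(R / (np.abs(R) + 1e-12)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = I0.shape
    # unwrap the circular peak location to signed shifts
    return (dy - h if dy > h // 2 else dy), (dx - w if dx > w // 2 else dx)

rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 64))
shifted = np.roll(frame, (5, -3), axis=(0, 1))   # simulated global camera motion
dy, dx = phase_correlate(frame, shifted)
```

The paper's contribution is to run this computation on adaptively chosen wavelet sub-bands rather than on the full frames, trading band size against accuracy.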
A Novel Blind SR Method to Improve the Spatial Resolution of Real Life Video Sequences (IRJET Journal)
This document proposes a novel blind super resolution method to improve the spatial resolution of real-life video sequences. The key aspects of the proposed method are:
1) It estimates blur without knowing the point spread function or noise statistics using a non-uniform interpolation super resolution method and multi-scale processing.
2) It uses a cost function with fidelity and regularization terms of a Huber-Markov random field to preserve edges and fine details in the reconstructed high resolution frames.
3) It performs masking to suppress artifacts from inaccurate motions, adaptively weighting the fidelity term at each iteration for faster convergence.
The method is tested on real-life videos with complex motions, objects, and brightness changes, showing
The document discusses an approach to upscaling video using a back iteration algorithm. It begins with an abstract describing how the back iteration algorithm is similar to back-projection algorithms used in tomography. It then discusses how the back iteration algorithm is implemented iteratively on individual video frames to upscale the video. Key aspects of the algorithm include motion estimation, intensity calculation, and registering frames at sub-pixel accuracy. The document provides details on the mathematical model and implementation of the back iteration algorithm for video upscaling. It presents results of applying the algorithm and concludes with discussing opportunities for future improvements.
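The back-projection iteration described above can be given a minimal sketch, assuming a 3x3 box PSF and box-average decimation as the forward model (the document's motion estimation and sub-pixel registration are omitted):

```python
import numpy as np

def box3(x):
    # 3x3 circular box blur, standing in for the camera PSF
    return sum(np.roll(np.roll(x, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

def downsample(x, f):
    h, w = x.shape
    return x.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def upsample(x, f):
    return np.repeat(np.repeat(x, f, axis=0), f, axis=1)

def ibp_upscale(lr, f=2, iters=20):
    # Iterative back-projection: simulate the LR frame from the current HR
    # estimate, and back-project the LR residual onto the HR grid.
    hr = upsample(lr, f)
    for _ in range(iters):
        err = lr - downsample(box3(hr), f)
        hr = hr + box3(upsample(err, f))
    return hr

# Smooth ground truth -> simulated LR observation -> reconstruction
n = np.arange(64)
hr_true = 2 + np.outer(np.sin(2 * np.pi * n / 64), np.cos(2 * np.pi * n / 64))
lr = downsample(box3(hr_true), 2)
hr_rec = ibp_upscale(lr)
```

Each iteration drives the simulated low-resolution frame toward the observed one, which is the same consistency principle used by back-projection in tomography.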
This document proposes improvements to algorithms for robot navigation using omnidirectional cameras. It summarizes previous approaches that used two omnidirectional cameras and the Sum of Absolute Difference (SAD) algorithm to localize points in 3D scenes. The previous SAD method performed poorly on repetitive textures. The document then proposes enhancements to SAD and the Kanade-Lucas-Tomasi (KLT) feature tracker to address these issues. It also describes using the improved algorithms to reconstruct 3D structures and estimate camera motion. Experiments using simulators show the new approaches outperform previous methods in problems related to repetitive textures and camera rotation.
Mixed Reality: Pose Aware Object Replacement for Alternate Realities (Alejandro Franceschi)
This document explains how the technology involved can semantically replace moving objects, humans, and other visual input, transforming it, through mixed reality, into whatever the viewer would prefer the real world to look like instead.
From videogames to movies, to education, healthcare, commerce, communications, and industrial solutions, this will radically change the way we interact with the world, with others, and ourselves.
PAN Sharpening of Remotely Sensed Images using Undecimated Multiresolution De... (journal ijrtem)
ABSTRACT : In many applications satellite images are used on the basis of resolution, where a high resolution is one of the major issues in the remotely sensed image. In this paper, we propose a new pan-sharpening technique to enhance the resolution of the satellite image by injecting the high frequency details from High-Resolution Panchromatic (HRP) image into Low Resolution Multi-Spectral (LRMS) image using Discrete Wavelet Transform (DWT) and Stationary Wavelet Transform (SWT). SWT algorithm is designed in such a way to overcome the lack of translation-invariance of DWT and is used to enhance the edges on the intermediate stage by preserving spatial information. Translation-invariance is attained by eliminating the down samplers and up samplers present in the DWT. Results show that the performance of the proposed fusion method is better than that of the state-of-art methods in terms of visual quality and other several frequently used metrics, such as the Correlation Coefficient, Peak Signal to Noise Ratio and Root Mean Square Error. Keywords: Image Fusion, Pan-sharpening, Discrete Wavelet Transform, Stationary Wavelet Transform, Quality Metrics
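The detail-injection idea can be sketched with an undecimated, translation-invariant decomposition, using a box filter as a stand-in for the wavelet scaling filter; this illustrates the injection step, not the paper's exact DWT+SWT pipeline:

```python
import numpy as np

def box_blur(x, k=5):
    # Separable k x k circular box blur: a stand-in for the undecimated
    # (translation-invariant) scaling filter.
    out = sum(np.roll(x, d, axis=0) for d in range(-(k // 2), k // 2 + 1)) / k
    return sum(np.roll(out, d, axis=1) for d in range(-(k // 2), k // 2 + 1)) / k

def pan_sharpen(ms_lr, pan, f=2):
    # Undecimated detail injection: add the PAN image's high-frequency plane
    # (pan minus its low-pass) to the upsampled multispectral band.
    ms_up = np.repeat(np.repeat(ms_lr, f, axis=0), f, axis=1)
    return ms_up + (pan - box_blur(pan))

# Synthetic scene: a smooth component plus fine stripes that survive only in
# the high-resolution PAN channel.
n = np.arange(64)
scene = np.outer(np.sin(2 * np.pi * n / 64), np.ones(64)) \
      + np.outer(np.ones(64), np.cos(2 * np.pi * 12 * n / 64))
pan = scene
ms_lr = box_blur(scene).reshape(32, 2, 32, 2).mean(axis=(1, 3))
fused = pan_sharpen(ms_lr, pan)
```

Because no downsampling happens inside the decomposition, the injected edge plane is translation-invariant, which is exactly the property the SWT is used for in the paper.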
Survey on Single Image Super Resolution Techniques (IOSR Journals)
Super-resolution is the process of recovering a high-resolution image from multiple low-resolution images of the same scene. The key objective of super-resolution (SR) imaging is to reconstruct a higher-resolution image based on a set of images, acquired from the same scene and denoted as 'low-resolution' images, to overcome the limitations and/or ill-posed conditions of the image acquisition process and facilitate better content visualization and scene recognition. In this paper, we provide a comprehensive review of existing super-resolution techniques and highlight the future research challenges. This includes the formulation of an observation model and coverage of the dominant algorithm, iterative back projection. We critique these methods and identify areas which promise performance improvements. Future directions for super-resolution algorithms are also discussed. Finally, results of available methods are given.
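The observation model such surveys formulate can be written as warp -> blur -> decimate -> noise. A hedged sketch, with an integer shift standing in for general motion:

```python
import numpy as np

def observe(x, shift, f=2, noise=0.01, rng=None):
    # One LR frame: integer warp -> 3x3 box blur (PSF) -> decimation -> noise
    rng = rng or np.random.default_rng(0)
    warped = np.roll(x, shift, axis=(0, 1))
    blurred = sum(np.roll(np.roll(warped, dy, 0), dx, 1)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    lr = blurred[::f, ::f]                     # decimation
    return lr + rng.normal(0, noise, lr.shape)

# Four LR frames of the same scene under slightly different shifts: the
# independent samplings that SR algorithms combine into one HR image.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 64))
frames = [observe(x, (dy, dx), rng=rng) for dy, dx in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Iterative back projection then inverts this model by repeatedly simulating the frames from the current HR estimate and back-projecting the residuals.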
This document describes research applying deep convolutional networks to intrinsic image decomposition. The network is trained on synthetic data to map RGB pixels to shading and reflectance estimates. It outperforms a popular method (Retinex) on a benchmark dataset, producing more accurate albedo maps and comparable lighting estimates. Future work could explore network architecture and training on a wider range of real-world data.
Parallel implementation of geodesic distance transform with application in superpixel segmentation (Tuan Q. Pham)
This paper presents a parallel implementation of geodesic distance transform (GDT) using OpenMP to speed up the algorithm on multi-core CPUs. The sequential chamfer distance propagation algorithm is parallelized by partitioning the image into bands that are processed concurrently by different threads. Experimental results show a speedup of 2.6 times on a quad-core machine without loss of accuracy. This parallel GDT forms part of a C implementation for geodesic superpixel segmentation of natural images.
This document outlines methods for passive stereo vision, from traditional to deep learning-based approaches. It discusses modeling from multiple views, stereo matching techniques like dense correspondence search and cost aggregation. Traditional methods include semi-global matching and energy minimization using graph cuts or belief propagation. Deep learning has also been applied to learn sparse depth representations and end-to-end stereo matching. The document provides an overview of techniques and challenges in passive stereo vision.
The document compares three image fusion techniques: wavelet transform, IHS (Intensity-Hue-Saturation), and PCA (Principal Component Analysis). For each technique, it describes the methodology, syntax used, and features. It then applies each technique to sample images to produce fused images. The RGB values of the fused images are recorded and compared in a table. The wavelet technique uses max area selection and consistency verification for feature selection. IHS transforms RGB to IHS values and replaces intensity with a panchromatic image. PCA replaces the first principal component with a high-resolution panchromatic image. The document concludes no single technique is best and the quality depends on the application.
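The PCA substitution described above can be sketched as follows; the mean/std matching of the panchromatic band to the first principal component is a common simplification, not necessarily the exact matching used in the document:

```python
import numpy as np

def pca_fuse(ms, pan):
    # PCA pan-sharpening: project the bands onto principal components, swap
    # the first component for the statistics-matched PAN, and project back.
    h, w, bands = ms.shape
    X = ms.reshape(-1, bands)
    mu = X.mean(axis=0)
    Xc = X - mu
    vecs = np.linalg.eigh(np.cov(Xc.T))[1][:, ::-1]   # eigenvectors, descending variance
    pcs = Xc @ vecs
    p = pan.ravel().astype(float)
    p = (p - p.mean()) / (p.std() + 1e-12)            # match PC1's statistics
    pcs[:, 0] = p * pcs[:, 0].std() + pcs[:, 0].mean()
    return (pcs @ vecs.T + mu).reshape(h, w, bands)

rng = np.random.default_rng(0)
ms = rng.random((16, 16, 3))           # toy multispectral cube
pan = rng.random((16, 16))             # toy panchromatic band
fused = pca_fuse(ms, pan)
```

A useful sanity check on the design: substituting PC1 by itself must reproduce the input exactly, since the eigenvector basis is orthonormal.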
Effective Pixel Interpolation for Image Super Resolution (IOSR Journals)
Abstract: In the near future, there is an imminent demand for high-resolution images. To fulfil this demand, Super Resolution (SR) is an approach used to reconstruct a High Resolution (HR) image from one or more Low Resolution (LR) images. The aim of SR is to extract the independent information from each LR image in the set and combine that information into a single HR image. Conventional interpolation methods can produce sharp edges; however, they are approximators and tend to weaken fine structure. To overcome this drawback, a new Effective Pixel Interpolation method is introduced. It has been numerically verified that the resulting algorithm restores sharp edges and enhances fine structures satisfactorily, outperforming conventional methods. The suggested algorithm has also proved efficient enough to be applicable for real-time resolution enhancement of images. Statistical examples are shown to verify the claim. Image fusion technology is also used to fuse two processed images obtained through the algorithm. Keywords: Super Resolution, Interpolation, EESM, Image Fusion
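The conventional interpolation baseline the abstract refers to can be made concrete with a hand-rolled bilinear upscaler (the proposed EESM interpolation itself is not specified in enough detail here to reproduce):

```python
import numpy as np

def bilinear_upscale(img, f):
    # Classic bilinear interpolation onto an f-times larger grid
    h, w = img.shape
    ys = np.linspace(0, h - 1, f * h)
    xs = np.linspace(0, w - 1, f * w)
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    tl = img[np.ix_(y0, x0)];     tr = img[np.ix_(y0, x0 + 1)]
    bl = img[np.ix_(y0 + 1, x0)]; br = img[np.ix_(y0 + 1, x0 + 1)]
    return (1 - wy) * (1 - wx) * tl + (1 - wy) * wx * tr \
         + wy * (1 - wx) * bl + wy * wx * br

ramp = np.add.outer(np.arange(8, dtype=float), np.arange(8, dtype=float))
up = bilinear_upscale(ramp, 2)
```

Bilinear interpolation reproduces affine intensity surfaces exactly, which is precisely why it behaves as a smoothing approximator on fine structure: everything sharper than affine is averaged away, the weakness the abstract targets.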
MULTIPLE REGION OF INTEREST TRACKING OF NON-RIGID OBJECTS USING DEMON'S ALGORITHM (cscpconf)
1) The document proposes a multiple region of interest (ROI) tracking algorithm for non-rigid objects using Demon's algorithm and a pyramidal approach.
2) A pyramidal implementation of Demon's algorithm improves computational efficiency and accuracy of tracking by calculating displacement fields at multiple image resolutions.
3) The algorithm is applied to track non-rigid ROIs in laparoscopy videos, which could help surgeons perform minimal invasive surgery.
This paper presents a new approach for the enhancement of Synthetic Radar Imagery using Discrete Wavelet Transform and its variants. Some of the approaches like nonlocal filtering (NLF) techniques, and multiscale iterative reconstruction (e.g., the BM3D method) do not solve the RE/SR imaging inverse problems in descriptive settings imposing some structured regularization constraints and exploits the sparsity of the desired image representations for resolution enhancement (RE) and superresolution (SR) of coherent remote sensing (RS). Such approaches are not properly adapted to the SR recovery of the speckle-corrupted low resolution (LR) coherent radar imagery. These pitfalls are eradicated by using DWT approach wherein the despeckled/deblurred HR image is recovered from the LR speckle/blurry corrupted radar image by applying some of the descriptive-experiment-design-regularization (DEDR) based re-constructive steps. Next, the multistage RE is consequently performed in each scaled refined SR frame via the iterative reconstruction of the upscaled radar images, followed by the discrete-wavelet-transform-based sparsity promoting denoising with guaranteed consistency preservation in each resolution frame. The performance of the method proposed is compared in terms of the number of iterations taken by it with other techniques existing in the literature.
Abstract: Primarily due to the progresses in super resolution imagery, the methods of segment-based image analysis for generating and updating geographical information are becoming more and more important. This work presents a image segmentation based on colour features with K-means clustering. The entire work is divided into two stages. First enhancement of color separation of satellite image using de correlation stretching is carried out and then the regions are grouped into a set of five classes using K-means clustering algorithm. At first, the spatial data is concentrated focused around every pixel, and at that point two separating procedures are added to smother the impact of pseudoedges. What's more, the spatial data weight is built and grouped with k-means bunching, and the regularization quality in every district is controlled by the bunching focus esteem. The exploratory results, on both reenacted and genuine datasets, demonstrate that the proposed methodology can adequately lessen the pseudoedges of the aggregate variety regularization in the level.
This document presents a paper that proposes an image registration algorithm using log-polar transform and FFT-based correlation. The algorithm first estimates the angle, scale, and translation between two images by converting them to the log-polar domain, where rotation and scaling appear as translation. It then recovers the residual translation using gradient correlation in the spatial domain. The algorithm is tested on various images related by similarity transformations and is shown to accurately recover scales up to 5.85 times while being robust to noise. It provides a computationally efficient way to register images using properties of the Fourier transform and log-polar mappings.
Intel, Intelligent Systems Lab: Syable View Synthesis WhitepaperAlejandro Franceschi
Intel, Intelligent Systems Lab:
Stable View Synthesis Whitepaper
We present Stable View Synthesis (SVS). Given a set
of source images depicting a scene from freely distributed
viewpoints, SVS synthesizes new views of the scene. The
method operates on a geometric scaffold computed via
structure-from-motion and multi-view stereo. Each point
on this 3D scaffold is associated with view rays and corresponding feature vectors that encode the appearance of
this point in the input images.
The core of SVS is view dependent on-surface feature aggregation, in which directional feature vectors at each 3D point are processed to produce a new feature vector for a ray that maps this point into the new target view.
The target view is then rendered by a convolutional network from a tensor of features synthesized in this way for all pixels. The method is composed of differentiable modules and is trained end-to-end. It supports spatially-varying view-dependent importance weighting and feature transformation of source images at each point; spatial and temporal stability due to the smooth dependence of on-surface feature aggregation on the target view; and synthesis of view-dependent effects such as specular reflection.
Experimental results demonstrate that SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets, achieving unprecedented levels of realism in free-viewpoint video of challenging large-scale scenes.
Reversible watermarking based on invariant image classification and dynamic h...JPINFOTECH JAYAPRAKASH
This document proposes a new reversible watermarking scheme that has two main contributions: 1) A dynamic histogram shifting modulation that adapts to local image content specifics when inserting data into textured areas. 2) A classification process that identifies parts of the image best suited for watermarking with different reversible modulations. This classification is based on a reference image derived from the original, allowing synchronized watermark embedding and extraction. Experiments show the proposed method can insert more data with lower distortion than existing schemes.
Optic Flow Estimation by Deep Learning outlines several key concepts in optical flow estimation including:
- Optical flow is the apparent motion of brightness patterns in images. Estimating optical flow involves making assumptions like brightness constancy and spatial coherence.
- Classical algorithms like Lucas-Kanade and Horn-Schunck use techniques like regularization, coarse-to-fine processing, and descriptor matching to address challenges like the aperture problem, large displacements, and occlusions.
- Recent deep learning approaches like FlowNet, DeepFlow, and EpicFlow use convolutional neural networks to directly learn optical flow, achieving state-of-the-art performance on benchmarks. These approaches combine descriptor matching and variational optimization with learned features.
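The brightness-constancy step that classical estimators build on can be sketched in a few lines. Below is a hypothetical, minimal 1-D Lucas-Kanade estimator on synthetic data (the names and window choice are illustrative, not from any of the cited papers): it solves Ix·u + It = 0 in a least-squares sense over the signal.

```python
def lucas_kanade_1d(f0, f1):
    """Estimate a single sub-sample shift u such that f1(x) ~ f0(x - u),
    from the brightness-constancy equation Ix*u + It = 0."""
    num = 0.0
    den = 0.0
    for x in range(1, len(f0) - 1):
        ix = (f0[x + 1] - f0[x - 1]) / 2.0   # spatial gradient (central difference)
        it = f1[x] - f0[x]                   # temporal derivative
        num += ix * it
        den += ix * ix
    return -num / den if den else 0.0

# A smooth ramp shifted right by one sample should give u close to 1.
f0 = [x * 0.5 for x in range(10)]
f1 = [(x - 1) * 0.5 for x in range(10)]
print(round(lucas_kanade_1d(f0, f1), 2))  # → 1.0
```

Real implementations solve this per pixel over a small window, coarse-to-fine, which is exactly where the aperture and large-displacement problems mentioned above arise.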
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...cscpconf
Video stabilization, which is important for better analysis and user experience, is typically done through Global Motion Estimation (GME) and compensation. GME can be performed in the image domain using many techniques, or in the transform domain using the well-known phase correlation methods, which relate motion to a phase shift in the spectrum. Image domain methods are generally slower (due to dense vector field computations) but can estimate both global and local motion. Transform domain methods cannot normally estimate local motion, but are faster, more accurate on homogeneous images, and resilient even to rapid illumination changes and large motion. Both approaches can become very time consuming when more accuracy and smoothness are required, because of the nature of this tradeoff. We show here that wavelet transforms can be used in a novel way to achieve very smooth stabilization along with a significant speedup of this Fourier domain computation, without sacrificing accuracy. We do this by adaptively selecting and combining motion computed on a specific pair of sub-bands using the wavelet interpolation capability. Our approach yields a smooth, scalable, fast and adaptive algorithm (based on time requirements and recent motion history) with significantly better accuracy than a single-level wavelet decomposition based approach.
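The phase-correlation principle the abstract relies on — a spatial translation becomes a pure phase shift in the spectrum — can be illustrated with a toy 1-D implementation. This sketch uses a naive O(n²) DFT so it stays dependency-free; real systems use a 2-D FFT and sub-pixel peak fitting.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2), demo only)."""
    n = len(x)
    s = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(s * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def phase_correlate(a, b):
    """Return the circular shift s such that b[i] == a[(i - s) % n]."""
    fa, fb = dft(a), dft(b)
    cross = []
    for u, v in zip(fa, fb):
        p = v * u.conjugate()                 # cross-power spectrum
        cross.append(p / abs(p) if abs(p) > 1e-12 else 0j)
    corr = dft(cross, inverse=True)           # a delta at the shift
    mags = [c.real for c in corr]
    return mags.index(max(mags))

a = [0, 0, 1, 3, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 3, 1, 0]   # a shifted right by 2
print(phase_correlate(a, b))    # → 2
```

Normalizing the cross-power spectrum to unit magnitude is what makes the method resilient to illumination changes, as the abstract notes.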
A Novel Blind SR Method to Improve the Spatial Resolution of Real Life Video ...IRJET Journal
This document proposes a novel blind super resolution method to improve the spatial resolution of real-life video sequences. The key aspects of the proposed method are:
1) It estimates blur without knowing the point spread function or noise statistics using a non-uniform interpolation super resolution method and multi-scale processing.
2) It uses a cost function with fidelity and regularization terms of a Huber-Markov random field to preserve edges and fine details in the reconstructed high resolution frames.
3) It performs masking to suppress artifacts from inaccurate motions, adaptively weighting the fidelity term at each iteration for faster convergence.
The method is tested on real-life videos with complex motions, objects, and brightness changes.
The document discusses an approach to upscaling video using a back iteration algorithm. It begins with an abstract describing how the back iteration algorithm is similar to back-projection algorithms used in tomography. It then discusses how the back iteration algorithm is implemented iteratively on individual video frames to upscale the video. Key aspects of the algorithm include motion estimation, intensity calculation, and registering frames at sub-pixel accuracy. The document provides details on the mathematical model and implementation of the back iteration algorithm for video upscaling. It presents results of applying the algorithm and concludes with discussing opportunities for future improvements.
This document proposes improvements to algorithms for robot navigation using omnidirectional cameras. It summarizes previous approaches that used two omnidirectional cameras and the Sum of Absolute Difference (SAD) algorithm to localize points in 3D scenes. The previous SAD method performed poorly on repetitive textures. The document then proposes enhancements to SAD and the Kanade-Lucas-Tomasi (KLT) feature tracker to address these issues. It also describes using the improved algorithms to reconstruct 3D structures and estimate camera motion. Experiments using simulators show the new approaches outperform previous methods in problems related to repetitive textures and camera rotation.
Mixed Reality: Pose Aware Object Replacement for Alternate RealitiesAlejandro Franceschi
This document explains how the technology involved can semantically replace moving objects, humans, and other such visual input and, using mixed reality, transform it into whatever the viewer would prefer the real world to look like instead.
From videogames to movies, to education, healthcare, commerce, communications, and industrial solutions, this will radically change the way we interact with the world, with others, and ourselves.
PAN Sharpening of Remotely Sensed Images using Undecimated Multiresolution De...journal ijrtem
ABSTRACT: In many applications satellite images are used on the basis of resolution, and achieving high resolution is one of the major issues in remotely sensed imagery. In this paper, we propose a new pan-sharpening technique to enhance the resolution of the satellite image by injecting the high-frequency details from a High-Resolution Panchromatic (HRP) image into a Low Resolution Multi-Spectral (LRMS) image using the Discrete Wavelet Transform (DWT) and the Stationary Wavelet Transform (SWT). The SWT algorithm is designed to overcome the lack of translation-invariance of the DWT and is used to enhance the edges at the intermediate stage while preserving spatial information. Translation-invariance is attained by eliminating the down-samplers and up-samplers present in the DWT. Results show that the performance of the proposed fusion method is better than that of state-of-the-art methods in terms of visual quality and several other frequently used metrics, such as the Correlation Coefficient, Peak Signal to Noise Ratio and Root Mean Square Error.
Keywords: Image Fusion, Pan-sharpening, Discrete Wavelet Transform, Stationary Wavelet Transform, Quality Metrics
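The detail-injection idea can be sketched with a one-level 1-D Haar transform (the simplest wavelet; the paper itself uses DWT plus SWT on 2-D images, and the signals below are made up): keep the multispectral approximation band but take the detail band from the panchromatic signal.

```python
def haar_dwt(x):
    """One-level 1-D Haar analysis: (approximation, detail)."""
    a = [(x[2*i] + x[2*i+1]) / 2 for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i+1]) / 2 for i in range(len(x) // 2)]
    return a, d

def haar_idwt(a, d):
    """Perfect-reconstruction synthesis: a+d, a-d interleaved."""
    out = []
    for ai, di in zip(a, d):
        out += [ai + di, ai - di]
    return out

# Detail injection: multispectral approximation + panchromatic detail.
ms  = [10, 10, 12, 12, 14, 14, 16, 16]   # low-resolution, smooth
pan = [10, 11, 11, 13, 13, 15, 15, 17]   # carries the high frequencies
a_ms, _  = haar_dwt(ms)
_, d_pan = haar_dwt(pan)
fused = haar_idwt(a_ms, d_pan)
print(fused)  # → [9.5, 10.5, 11.0, 13.0, 13.0, 15.0, 15.0, 17.0]
```

The SWT variant drops the 2:1 down-sampling inside `haar_dwt`, which is precisely what restores the translation-invariance the abstract mentions.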
Survey on Single image Super Resolution TechniquesIOSR Journals
Super-resolution is the process of recovering a high-resolution image from multiple low-resolution images of the same scene. The key objective of super-resolution (SR) imaging is to reconstruct a higher-resolution image from a set of images, acquired from the same scene and denoted as 'low-resolution' images, to overcome the limitations and/or ill-posed conditions of the image acquisition process and facilitate better content visualization and scene recognition. In this paper, we provide a comprehensive review of existing super-resolution techniques and highlight the future research challenges. This includes the formulation of an observation model and coverage of the dominant algorithm, iterative back-projection. We critique these methods and identify areas which promise performance improvements. Future directions for super-resolution algorithms are discussed, and results of available methods are given.
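The iterative back-projection loop singled out above can be sketched in 1-D: simulate the low-resolution observation from the current high-resolution estimate and back-project the residual. The 2:1 averaging and replication operators below are illustrative choices, not the survey's exact observation model.

```python
def downsample(x):
    """Simulated acquisition: 2:1 averaging of neighbouring samples."""
    return [(x[2*i] + x[2*i+1]) / 2 for i in range(len(x) // 2)]

def upsample(e):
    """Back-projection operator: replicate each value twice."""
    out = []
    for v in e:
        out += [v, v]
    return out

def iterative_back_projection(y, iters=10):
    """Find a high-res signal whose simulated observation matches y."""
    x = [0.0] * (2 * len(y))                       # initial guess
    for _ in range(iters):
        err = [yi - si for yi, si in zip(y, downsample(x))]
        x = [xi + ei for xi, ei in zip(x, upsample(err))]
    return x

y = [1.0, 3.0, 5.0]
x = iterative_back_projection(y)
print([round(v, 3) for v in downsample(x)])  # → [1.0, 3.0, 5.0]
```

In practice the downsampling operator includes blur and warping, and the back-projection kernel and step size control convergence; the structure of the loop is unchanged.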
This document describes research applying deep convolutional networks to intrinsic image decomposition. The network is trained on synthetic data to map RGB pixels to shading and reflectance estimates. It outperforms a popular method (Retinex) on a benchmark dataset, producing more accurate albedo maps and comparable lighting estimates. Future work could explore network architecture and training on a wider range of real-world data.
Parallel implementation of geodesic distance transform with application in su...Tuan Q. Pham
This paper presents a parallel implementation of geodesic distance transform (GDT) using OpenMP to speed up the algorithm on multi-core CPUs. The sequential chamfer distance propagation algorithm is parallelized by partitioning the image into bands that are processed concurrently by different threads. Experimental results show a speedup of 2.6 times on a quad-core machine without loss of accuracy. This parallel GDT forms part of a C implementation for geodesic superpixel segmentation of natural images.
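The sequential chamfer propagation being parallelized can be sketched as two raster scans with local weights 1 and ~1.4 (a Euclidean approximation; the geodesic variant adds a grey-level cost per step, omitted here, and the paper's contribution is partitioning these scans into concurrent bands):

```python
INF = 10**9

def chamfer_distance(grid):
    """Two-pass chamfer distance to the nearest seed (cells with value 1)."""
    h, w = len(grid), len(grid[0])
    d = [[0.0 if grid[y][x] else INF for x in range(w)] for y in range(h)]
    a, b = 1.0, 1.4                      # axial / diagonal step costs
    # Forward pass: propagate from top-left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y-1][x] + a)
                if x > 0:     d[y][x] = min(d[y][x], d[y-1][x-1] + b)
                if x < w - 1: d[y][x] = min(d[y][x], d[y-1][x+1] + b)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x-1] + a)
    # Backward pass: propagate from bottom-right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y+1][x] + a)
                if x > 0:     d[y][x] = min(d[y][x], d[y+1][x-1] + b)
                if x < w - 1: d[y][x] = min(d[y][x], d[y+1][x+1] + b)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x+1] + a)
    return d

seeds = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]
print(chamfer_distance(seeds))
```

The row-wise data dependence of these two scans is what makes band partitioning (with a small overlap between threads) the natural parallelization.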
This document outlines methods for passive stereo vision, from traditional to deep learning-based approaches. It discusses modeling from multiple views, stereo matching techniques like dense correspondence search and cost aggregation. Traditional methods include semi-global matching and energy minimization using graph cuts or belief propagation. Deep learning has also been applied to learn sparse depth representations and end-to-end stereo matching. The document provides an overview of techniques and challenges in passive stereo vision.
The document compares three image fusion techniques: wavelet transform, IHS (Intensity-Hue-Saturation), and PCA (Principal Component Analysis). For each technique, it describes the methodology, syntax used, and features. It then applies each technique to sample images to produce fused images. The RGB values of the fused images are recorded and compared in a table. The wavelet technique uses max area selection and consistency verification for feature selection. IHS transforms RGB to IHS values and replaces intensity with a panchromatic image. PCA replaces the first principal component with a high-resolution panchromatic image. The document concludes no single technique is best and the quality depends on the application.
Effective Pixel Interpolation for Image Super ResolutionIOSR Journals
In the near future, there will be an imminent demand for high-resolution images. To meet this demand, Super Resolution (SR) is an approach used to reconstruct a High Resolution (HR) image from one or more Low Resolution (LR) images. The aim of SR is to extract the independent information from each LR image in the set and combine that information into a single HR image. Conventional interpolation methods can produce sharp edges; however, they are approximators and tend to weaken fine structure. To overcome this drawback, a new Effective Pixel Interpolation method is introduced. It has been numerically verified that the resulting algorithm restores sharp edges and enhances fine structures satisfactorily, outperforming conventional methods. The suggested algorithm has also proved efficient enough to be applicable to real-time resolution enhancement of images. Statistical examples are shown to verify the claim. Image fusion technology is also used to fuse two processed images obtained through the algorithm.
MULTIPLE REGION OF INTEREST TRACKING OF NON-RIGID OBJECTS USING DEMON'S ALGOR...cscpconf
1) The document proposes a multiple region of interest (ROI) tracking algorithm for non-rigid objects using Demon's algorithm and a pyramidal approach.
2) A pyramidal implementation of Demon's algorithm improves computational efficiency and accuracy of tracking by calculating displacement fields at multiple image resolutions.
3) The algorithm is applied to track non-rigid ROIs in laparoscopy videos, which could help surgeons perform minimal invasive surgery.
Multiple region of interest tracking of non rigid objects using demon's algor...csandit
In this paper we propose an algorithm for tracking multiple ROIs (regions of interest) undergoing non-rigid transformations. Demon's algorithm, based on the idea of Maxwell's demon, is applied here to estimate the displacement field for tracking multiple ROIs. The algorithm works on the pixel intensities of the image sequence, making it suitable for tracking objects/regions undergoing non-rigid transformations. We have incorporated a pyramid-based approach into Demon's algorithm's computation of the displacement field, which leads to a significant reduction in convergence time and an improvement in accuracy. The algorithm is applied to tracking non-rigid objects in laparoscopy videos, which would aid surgeons in Minimally Invasive Surgery (MIS).
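The per-pixel force at the heart of the demons method can be sketched in 1-D. Each sample receives a displacement driven by the intensity difference and the fixed image's gradient (Thirion's classic form; sign and normalization conventions vary between papers, and real implementations iterate this step, warp the moving image, and Gaussian-smooth the field — the pyramid in the paper wraps all of that coarse-to-fine):

```python
def demons_step(fixed, moving):
    """One demons update: per-sample displacement pushing `moving`
    towards `fixed` (1-D sketch; borders left at zero)."""
    u = [0.0] * len(fixed)
    for x in range(1, len(fixed) - 1):
        grad = (fixed[x + 1] - fixed[x - 1]) / 2.0   # fixed-image gradient
        diff = moving[x] - fixed[x]                  # intensity mismatch
        denom = grad * grad + diff * diff            # stabilising denominator
        if denom > 1e-12:
            u[x] = diff * grad / denom
    return u

fixed  = [0.0, 1.0, 2.0, 3.0, 4.0]
moving = [0.0, 0.0, 1.0, 2.0, 3.0]   # fixed shifted right by one
print(demons_step(fixed, moving))     # → [0.0, -0.5, -0.5, -0.5, 0.0]
```

The denominator caps the force where the gradient or mismatch is large, which is what keeps the update stable on noisy laparoscopic frames.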
Boosting CED using robust orientation estimationijma
In this paper, Coherence Enhancement Diffusion (CED) is boosted by feeding it an external orientation obtained from a new robust orientation estimation. In CED, proper scale selection is very important, as the gradient vector at that scale reflects the orientation of the local ridge. For this purpose a new scheme is proposed in which the orientation is pre-calculated using local and integration scales. Experiments show the proposed scheme works much better in noisy environments than traditional Coherence Enhancement Diffusion.
A PROJECT REPORT ON REMOVAL OF UNNECESSARY OBJECTS FROM PHOTOS USING MASKINGIRJET Journal
This document presents a project report on removing unnecessary objects from photos using masking techniques. It discusses using algorithms like Fast Marching and Navier-Stokes to fill in missing image data and maintain continuity across boundaries. The Fast Marching method begins at region boundaries and works inward, prioritizing completion of boundary pixels first. Navier-Stokes uses fluid dynamics equations to continue intensity value functions and ensure they remain continuous at boundaries. Color filtering can also be used to segment specific colored objects or regions. The project aims to implement these techniques to remove unwanted objects from images and fill the resulting gaps seamlessly.
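The boundary-inward ordering that the Fast Marching method formalizes can be imitated with a much-simplified "onion peel" fill — each unknown pixel with at least one known 4-neighbour is filled with the mean of those neighbours, layer by layer. This is purely illustrative (no priority queue, gradient weighting, or sub-pixel front as in the real algorithms the report uses):

```python
def onion_peel_inpaint(img, mask):
    """Fill pixels where mask==1 from the region boundary inward, each
    with the mean of its already-known 4-neighbours (simplified stand-in
    for Fast Marching inpainting)."""
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]
    unknown = {(y, x) for y in range(h) for x in range(w) if mask[y][x]}
    while unknown:
        filled = []
        for (y, x) in unknown:
            nb = [img[yy][xx]
                  for yy, xx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                  if 0 <= yy < h and 0 <= xx < w and (yy, xx) not in unknown]
            if nb:                              # on the current boundary layer
                filled.append(((y, x), sum(nb) / len(nb)))
        if not filled:
            break                               # fully masked image
        for (p, v) in filled:
            img[p[0]][p[1]] = v
            unknown.discard(p)
    return img

img  = [[5, 5, 5],
        [5, 0, 5],
        [5, 5, 5]]
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
print(onion_peel_inpaint(img, mask))  # centre filled with 5.0
```

The Navier-Stokes variant replaces the plain neighbour average with a fluid-dynamics update that also transports gradients across the hole, preserving edges through it.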
A Survey on Single Image Dehazing ApproachesIRJET Journal
This document provides a survey of single image dehazing approaches. It begins with an introduction to the problem of haze in images and how it degrades quality. It then summarizes several existing single image dehazing methods, including those based on the atmospheric scattering model, dark channel prior, color attenuation prior, and deep learning approaches. The survey covers the key assumptions and limitations of each approach. Overall, the document reviews the progress that has been made in developing techniques to remove haze from a single input image.
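One of the surveyed priors — the dark channel — is easy to state in code: in a haze-free patch, at least one colour channel is close to zero, so a large patch-wise minimum signals haze. A toy implementation on a made-up 2×2 RGB image:

```python
def dark_channel(img, patch=3):
    """Dark channel: per pixel, the minimum over RGB channels within a
    local square patch (He et al.'s dark channel prior)."""
    h, w = len(img), len(img[0])
    r = patch // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = []
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    vals.append(min(img[yy][xx]))   # min over the 3 channels
            out[y][x] = min(vals)                   # min over the patch
    return out

# RGB triples in [0,1]; the top-left pixel is "hazy" (bright everywhere).
img = [[(0.9, 0.9, 0.9), (0.2, 0.5, 0.1)],
       [(0.3, 0.1, 0.4), (0.0, 0.6, 0.2)]]
print(dark_channel(img, patch=1))  # → [[0.9, 0.1], [0.1, 0.0]]
```

Dehazing methods built on this prior then estimate the transmission map from the dark channel and invert the atmospheric scattering model.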
This document proposes methods for enhancing and extracting minutiae from fingerprint images using symmetry features. It summarizes previous work on fingerprint enhancement and introduces a new approach using an image pyramid and directional filtering based on the frequency-adapted structure tensor. For minutiae extraction, it adds parabolic symmetry to the local fingerprint model to simultaneously detect minutia position and direction. Experiments on the FVC2004 database show the methods lower the matching error compared to other techniques.
An efficient image segmentation approach through enhanced watershed algorithmAlexander Decker
This document proposes an efficient image segmentation approach combining an enhanced watershed algorithm and color histogram analysis. The watershed algorithm is applied to preprocessed images after merging the results with an enhanced edge detection. Over-segmentation issues are addressed through a post-processing step applying color histogram analysis to each segmented region, improving overall performance. The document provides background on image segmentation techniques, reviews related work applying watershed algorithms, and discusses challenges like over-segmentation that watershed approaches can face.
1) The document proposes using homogeneous motion discovery to generate additional reference frames for 4K video coding. Motion is estimated between reference frames and the current frame to generate affine motion models and associated masks.
2) Experimental results on 3 video sequences show average bit rate savings of 3.78% over HEVC by using the additional reference frames generated from the affine motion models.
3) The approach provides a simpler computation method for high resolution video coding compared to motion hint estimation, which requires super-pixel segmentation that becomes impractical for resolutions like 4K.
Fpga implementation of fusion technique for fingerprint applicationIAEME Publication
Image fusion is the process of combining relevant information from a set of images into a single image, wherein the resultant fused image is more informative and complete than any of the input images. This paper discusses Laplacian Pyramid (LP) based image fusion techniques for a fingerprint application. The technique is implemented in MATLAB and the evaluation parameters Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and matching score are discussed. The same is also implemented on a Virtex-5 FPGA development board using Verilog HDL. The LP based technique provides better results for image fusion than other techniques.
Fpga implementation of fusion technique for fingerprint applicationIAEME Publication
This document discusses the implementation of Laplacian Pyramid (LP) based image fusion techniques for fingerprint applications. LP fusion provides better results than other techniques like PCA and DCT. The technique is implemented in MATLAB and on an FPGA development board using Verilog HDL. Performance is evaluated using mean square error, peak signal to noise ratio, and matching score. LP fusion captures image details at multiple scales and is well-suited for fusing fingerprint images.
This paper presents an approach for image restoration in the presence of blur and noise. The image is divided into independent regions modeled with a Gaussian prior. Wavelet-based methods are used for image denoising, while classical Wiener filtering is used for deblurring. The algorithm finds the maximum a posteriori estimate at the intersection of convex sets generated by Wiener filtering. It provides efficient image restoration without sacrificing the simplicity of filtering, and generates a better restored image compared to previous methods.
X-Ray Image Enhancement using CLAHE MethodIRJET Journal
This document presents a method for enhancing X-ray images using Contrast Limited Adaptive Histogram Equalization (CLAHE). CLAHE improves local contrast and edge definition by applying histogram equalization separately to small regions of the image rather than the entire image. It prevents the overamplification of noise that can occur with adaptive histogram equalization. The proposed method uses an image processing filter chain including noise reduction, high pass filtering, and CLAHE to enhance 2D X-ray images. Key parameters of the filter chain are optimized using an interior point algorithm. The goal is to provide customized tissue contrast for each treatment location to allow for accurate patient setup and analysis in radiation therapy. The CLAHE method is shown to effectively enhance contrast in X-ray images.
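The clip-and-redistribute step that distinguishes CLAHE from plain histogram equalization can be sketched for a single tile (the tile data and parameters below are made up, and full CLAHE additionally blends the mappings of neighbouring tiles bilinearly to hide seams):

```python
def clipped_hist_equalize(tile, levels=256, clip=4):
    """Contrast-limited histogram equalisation of one tile: clip each
    histogram bin at `clip`, redistribute the excess uniformly over all
    bins, then map intensities through the resulting CDF."""
    n = len(tile)
    hist = [0] * levels
    for v in tile:
        hist[v] += 1
    excess = sum(max(0, h - clip) for h in hist)   # mass above the clip limit
    hist = [min(h, clip) for h in hist]
    bonus = excess / levels                        # uniform redistribution
    hist = [h + bonus for h in hist]
    cdf, c = [], 0.0
    for h in hist:
        c += h
        cdf.append(c)
    return [round((levels - 1) * cdf[v] / n) for v in tile]

tile = [100] * 12 + [101] * 3 + [200]              # low-contrast tile
print(sorted(set(clipped_hist_equalize(tile))))    # → [114, 162, 228]
```

Without the clip, the dominant bin at 100 would stretch the 100/101 pair far apart and amplify noise; the clip limit caps that slope, which is exactly the "prevents overamplification" property described above.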
Efficient 3D stereo vision stabilization for multi-camera viewpointsjournalBEEI
In this paper, an algorithm is developed in 3D stereo vision to improve the image stabilization process for multi-camera viewpoints. Accurate unique matching key-points are found using the Harris-Laplace corner detection method under different photometric changes and geometric transformations of the images. The connectivity of correct matching pairs is then improved by minimizing the global error using a spanning tree algorithm, which helps stabilize randomly positioned camera viewpoints in linear order. The unique matching key-points are calculated only once with our method.
The calculated planar transformation is then applied for real-time video rendering. The proposed algorithm can process more than 200 camera viewpoints within two seconds.
Similar to Multi-hypothesis projection-based shift estimation for sweeping panorama reconstruction (20)
Oral presentation on Asymmetric recursive Gaussian filtering for space-varia...Tuan Q. Pham
1) The document describes an asymmetric recursive Gaussian filter for space-variant artificial bokeh.
2) It proposes using different directional sigma values (σx+, σx-, σy+, σy-) at each pixel to allow for discontinuous blur, minimizing intensity leakage across blur boundaries.
3) The approach constrains the rate of change of the directional sigma values to taper increases and decreases, producing good defocus blur for scenes with depth discontinuities like portraits.
Asymmetric recursive Gaussian filtering for space-variant artificial bokehTuan Q. Pham
This document describes an asymmetric recursive Gaussian filter for space-variant artificial bokeh. The filter approximates two-dimensional space-variant blur using separable one-dimensional Gaussian filtering along the x- and y- dimensions. Within each dimension, the Gaussian filter is approximated by parallel forward and backward infinite impulse response (IIR) filters. The filter reduces intensity leakage at blur discontinuities by modifying the blur sigma of the IIR filters differently for the forward and backward passes as they approach discontinuities, resulting in an asymmetric space-variant filter. This asymmetric recursive filter is able to produce visually pleasing background blur for scenes with contents at different depths without smearing artifacts.
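The recursive building block behind such filters is a single-pole IIR filter run forward then backward; a Gaussian is approximated by cascading a few of these. Making `alpha` vary per sample — and differ between the causal and anti-causal passes — is what yields the space-variant, asymmetric behaviour described above. A fixed-alpha sketch of the two passes:

```python
def recursive_smooth(x, alpha):
    """Forward+backward single-pole IIR smoothing: O(1) work per sample
    regardless of the effective blur width."""
    fwd = []
    s = x[0]
    for v in x:                            # causal (left-to-right) pass
        s += alpha * (v - s)
        fwd.append(s)
    out = [0.0] * len(x)
    s = fwd[-1]
    for i in range(len(x) - 1, -1, -1):    # anti-causal (right-to-left) pass
        s += alpha * (fwd[i] - s)
        out[i] = s
    return out

step = [0.0] * 4 + [1.0] * 4
print([round(v, 3) for v in recursive_smooth(step, 0.5)])
```

Because each pass only carries state in its own direction, reducing the forward alpha just before a blur discontinuity (and the backward alpha just after it) stops intensity leaking across the boundary — the asymmetry the paper exploits.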
Parallel implementation of geodesic distance transform with application in su...Tuan Q. Pham
This poster presents a parallel implementation of geodesic distance transform using OpenMP. This work forms part of a C implementation for geodesic superpixel segmentation of natural images. Presented at DICTA 2013 conference
Multi-hypothesis projection-based shift estimation for sweeping panorama reco...Tuan Q. Pham
This document presents a multi-hypothesis projection-based shift estimation technique for improving panorama reconstruction from camera sweeps. It notes that correlation-based shift estimation can produce incorrect results for large translations or small rotations. The proposed method tests multiple shift hypotheses by taking projections and finding the dominant correlation peak. It is fast, processing images at 20 fps while being robust to large motions, perspective changes, moving objects, and motion blur. The technique enables better panorama stitching in challenging real-world conditions.
Non-maximum suppression using fewer than two comparisons per pixelTuan Q. Pham
Tuan Pham presented a paper on improving non-maximum suppression algorithms to require fewer than two comparisons per pixel. He described existing algorithms like spiral scanning and block partitioning. His improvements included selective spiral scanning that tests fewer pixels and quarter-block partitioning that guarantees candidates are local maxima. Evaluation showed his algorithms outperformed existing methods, requiring up to 60% fewer comparisons. He also demonstrated an application in video denoising by detecting highlight points across frames for noise reduction.
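The baseline those algorithms improve on is the naive 3×3 scan, which spends up to eight comparisons per pixel. A reference version (borders skipped, strict maxima only; the paper's spiral-scanning and block-partitioning variants reach the same result with far fewer comparisons):

```python
def nonmax_suppress_3x3(img):
    """Return (row, col) coordinates of strict 3x3 local maxima."""
    h, w = len(img), len(img[0])
    peaks = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = img[y][x]
            if all(v > img[y + dy][x + dx]        # strictly above all 8 neighbours
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)):
                peaks.append((y, x))
    return peaks

img = [[1, 2, 1, 0],
       [2, 9, 2, 0],
       [1, 2, 1, 0],
       [0, 0, 0, 0]]
print(nonmax_suppress_3x3(img))  # → [(1, 1)]
```

Counting the comparisons executed by this loop against an improved scheme is exactly the evaluation metric the paper reports.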
Paper fingerprinting using alpha-masked image matchingTuan Q. Pham
The document summarizes research from Canon Information Systems Research Australia on identifying paper fingerprints (PFP) for authentication purposes. It discusses using alpha-masked image matching and inpainting to make PFP matching more robust to changes in documents, such as printing. Experiments show alpha-masked correlation and normalized correlation are most effective at matching PFPs, even when a significant portion of the image is changed or masked. The researchers conclude PFP matching could be further improved by scanning documents at multiple orientations to separate diffuse and specular reflections.
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error normTuan Q. Pham
1. The document proposes a robust super-resolution algorithm that minimizes a Gaussian-weighted L2 error norm. This suppresses the influence of intensity outliers without requiring additional regularization.
2. The algorithm is based on maximum likelihood estimation but uses a Gaussian error norm instead of a quadratic norm. This makes the algorithm robust against outliers by reducing their influence to zero.
3. The effectiveness of the proposed algorithm is demonstrated on real infrared image sequences with severe aliasing and intensity outliers, where it outperforms other methods in handling outliers and noise.
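Why this suppresses outliers can be seen from the influence function. One common form of the Gaussian error norm is rho(e) = sigma^2 * (1 - exp(-e^2 / (2 sigma^2))) — the exact scaling in the paper may differ — whose derivative e * exp(-e^2 / (2 sigma^2)) means each residual enters the estimation with the weight computed below:

```python
import math

def gaussian_norm_weight(err, sigma):
    """IRLS weight implied by the Gaussian error norm
    rho(e) = sigma^2 * (1 - exp(-e^2 / (2 sigma^2))):
    its influence function e * exp(-e^2 / (2 sigma^2)) down-weights
    each residual by exp(-e^2 / (2 sigma^2))."""
    return math.exp(-err * err / (2.0 * sigma * sigma))

# Inliers keep near-full weight; a gross outlier is effectively ignored,
# which is why no extra regularisation is needed to suppress it.
print(round(gaussian_norm_weight(0.1, 1.0), 3))   # → 0.995
print(round(gaussian_norm_weight(10.0, 1.0), 3))  # → 0.0
```

Contrast this with the quadratic norm, whose influence function grows linearly with the residual, letting a single outlier dominate the fit.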
Separable bilateral filtering for fast video preprocessingTuan Q. Pham
This document summarizes research on separable bilateral filtering for fast video preprocessing. Bilateral filtering reduces noise while preserving edges but has high computational complexity. The researchers propose a separable implementation that approximates the original filter with linear complexity. They apply separable bilateral filtering to video noise reduction and show it achieves better compressed video quality than full-kernel filtering with the same computation. The separable approach makes real-time bilateral filtering possible for applications like video preprocessing.
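A single 1-D bilateral pass — run along rows and then along columns to get the separable approximation — combines a spatial Gaussian with a range (intensity) Gaussian. The parameters and signal below are illustrative:

```python
import math

def bilateral_1d(x, sigma_s, sigma_r, radius):
    """One 1-D bilateral pass: each output is an average of neighbours
    weighted by both spatial distance and intensity difference."""
    out = []
    for i, vi in enumerate(x):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(x), i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)      # spatial
                         - ((vi - x[j]) ** 2) / (2 * sigma_r ** 2)) # range
            num += w * x[j]
            den += w
        out.append(num / den)
    return out

# Noise on both sides of a sharp edge is smoothed; the edge survives.
noisy_edge = [0.0, 0.1, 0.0, 10.0, 10.1, 10.0]
print([round(v, 2) for v in bilateral_1d(noisy_edge, 1.0, 0.5, 2)])
```

The separable result is not identical to the full 2-D kernel, but as the summary notes it is close on most natural images while dropping the complexity from O(r²) to O(r) per pixel.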
Normalized averaging using adaptive applicability functions with applications...Tuan Q. Pham
Normalized averaging is a technique for reconstructing images from sparsely sampled data using adaptive applicability functions. It involves taking a weighted average of signal values based on their associated certainty, where the weights are determined by a local structure analysis. Experimental results show the technique can effectively extend linear structures and texture information into missing regions to reconstruct images, and does so faster than traditional diffusion-based inpainting methods. Further research areas include improving the local structure analysis and neighborhood operator.
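At its core, normalized averaging is one division away from two box filters: convolve signal×certainty and certainty with the same applicability window, then divide. A fixed-window 1-D version (the paper's adaptive applicability functions would shape this window from the local structure analysis):

```python
def normalized_average(signal, certainty, radius=1):
    """Normalized averaging: certainty-weighted local mean. Samples with
    certainty 0 are treated as missing and filled from their neighbours."""
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            num += signal[j] * certainty[j]
            den += certainty[j]
        out.append(num / den if den else 0.0)
    return out

sig  = [2.0, 0.0, 4.0, 0.0, 6.0]   # zeros are missing samples
cert = [1,   0,   1,   0,   1]
print(normalized_average(sig, cert))  # → [2.0, 3.0, 4.0, 5.0, 6.0]
```

Because each output is computed in a single pass rather than by iterating a PDE, this is where the speed advantage over diffusion-based inpainting comes from.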
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
MULTI-HYPOTHESIS PROJECTION-BASED SHIFT ESTIMATION FOR SWEEPING PANORAMA RECONSTRUCTION

Tuan Q. Pham†, Philip Cox
Canon Information Systems Research Australia (CiSRA)
1 Thomas Holt Drive, North Ryde, NSW 2113, Australia.
† tuan.pham@cisra.canon.com.au

ABSTRACT

Global alignment is an important step in many imaging applications for hand-held cameras. We propose a fast algorithm that can handle large global translations in either the x- or y-direction from a pan-tilt camera. The algorithm estimates the translations in the x- and y-direction separately using 1D correlation of the absolute gradient projections along the x- and y-axis. Synthetic experiments show that the proposed multiple shift hypotheses approach is robust to translations up to 90% of the image width, whereas other projection-based alignment methods can handle up to 25% only. The proposed approach can also handle larger rotations than other methods. The robustness of the alignment to non-purely translational image motion and moving objects in the scene is demonstrated by a sweeping panorama application on live images from a Canon camera with minimal user interaction.

Index Terms— shift estimation, image projection, sweep panorama

1. INTRODUCTION

Global alignment is an important task for many imaging applications such as image quality measurement, video stabilization, and moving object detection. For applications on embedded devices, the alignment needs to be both accurate and fast. Robustness against difficult imaging conditions such as low light, camera motion blur or motion in the scene is also desirable. In this paper, we describe a low-cost global shift estimation algorithm that addresses these needs. The algorithm's robustness against difficult imaging conditions and its real-time performance are demonstrated on a sweeping panorama application using live images from a hand-held Canon camera.

In particular, our global alignment algorithm performs separable shift estimation using one-dimensional (1D) projections of the absolute gradient images along the sampling axes. For each image dimension, multiple shift hypotheses are maintained to avoid misdetection due to non-purely translational motion, independent moving objects, or distractions from the non-overlapping areas. The final shift estimate is the one that produces the highest two-dimensional (2D) Normalized Cross-Correlation (NCC) score.

Various enhancements are added to the basic alignment algorithm above to improve its performance. The input images are subsampled prior to analysis to improve speed and noise robustness. Shift estimation is performed over multiple scales to rule out incorrect shifts due to strong correlation of texture at certain frequencies. When appropriate, the images are automatically cropped to improve overlap before gradient projection.

Given the alignment between consecutive frames from a panning camera, a panoramic image can be constructed during image capturing. Overlapping images are stitched along an irregular seam that avoids cutting through moving objects. This seam also minimizes the intensity mismatch of the two images on either side of the seam. Image blending is finally used to eliminate any remaining intensity mismatch after stitching.

1.1. Literature review

Numerous solutions are available for translational image alignment. Amongst them, correlation-based methods are popular for their robustness. However, 2D correlation is costly for large images. 2D phase correlation, for example, requires O(N² log N²) computations for an N×N image using the Fast Fourier Transform (FFT). The computational complexity can be reduced to O(N log N) if the correlation is performed on 1D image projections only [1]. This projection-based alignment algorithm is suitable for images with strong gradient structures along the projection axes. This assumption holds for most indoor and natural landscape scenes.

Adams et al. reported a real-time projection-based alignment of a 320×240 viewfinder video stream at 30 frames per second on standard smartphone hardware [2]. Their algorithm uses projections of the image's gradient energy along four directions. The use of image gradient rather than intensity improves alignment robustness against local lighting changes.

Fig. 1. Simultaneous shift estimation over multiple scales.

Fig. 2. Multiple shift hypotheses from gradient projection correlation.

Despite their speed advantage, previous projection-based alignment algorithms have a number of limitations. First, the images must have a substantial overlap (e.g., more than 90% of the frame area according to [2]) for the alignment to work. This is because image data from non-overlapping areas corrupt the image projections, eventually breaking their correlation. Second, any deviation from a pure translation is likely to break the alignment. The viewfinder algorithm [2], for example, claims to handle a maximum of 1° rotation only. Third, previous gradient projection algorithms are not robust to low lighting conditions. The weak gradient energy of dark current noise at every pixel often overpowers the stronger but sparse gradient of the scene structures when integrated over a whole image row or column. For a similar reason, gradient projection algorithms are also not robust against highly textured scenes such as carpet or foliage.
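The 1D projection-plus-correlation idea reviewed above is compact enough to sketch directly. The following is an illustrative Python toy of the general technique, not the Matlab implementation used in this paper; all function names and the toy images are our own. It projects the absolute x-gradient of each image onto the x-axis and searches the correlation of the two zero-mean profiles for the best horizontal shift.

```python
def x_gradient_projection(image):
    """Sum |dI/dx| (finite differences) over each column -> 1D profile."""
    w = len(image[0])
    proj = [0.0] * (w - 1)
    for row in image:
        for x in range(w - 1):
            proj[x] += abs(row[x + 1] - row[x])
    return proj

def best_1d_shift(p1, p2, max_shift):
    """Shift of p2 relative to p1 that maximises zero-mean correlation.

    A positive result means the content of the second image lies to
    the right of the content of the first image."""
    def zero_mean(p):
        m = sum(p) / len(p)
        return [v - m for v in p]
    a, b = zero_mean(p1), zero_mean(p2)
    scores = {}
    for s in range(-max_shift, max_shift + 1):
        # correlate over the part of the two profiles that overlaps
        scores[s] = sum(a[x] * b[x + s] for x in range(len(a))
                        if 0 <= x + s < len(b))
    return max(scores, key=scores.get)

# Toy example: a vertical bar moved 3 pixels to the right.
img1 = [[0] * 16 for _ in range(8)]
img2 = [[0] * 16 for _ in range(8)]
for y in range(8):
    img1[y][5] = 255
    img2[y][8] = 255
shift = best_1d_shift(x_gradient_projection(img1),
                      x_gradient_projection(img2), 6)
print(shift)  # 3
```

A practical implementation would subsample the images first and, as in Section 2 below, keep several correlation peaks per dimension rather than only the single strongest one.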
1.2. Structure of this paper

In this paper, we present a fast global alignment algorithm with application in sweeping panorama reconstruction. Section 2 presents the new multiple hypotheses global alignment algorithm using gradient projections. Section 3 describes a software prototype for sweeping panorama stitching using the new alignment algorithm. Section 4 evaluates the alignment and panorama stitching algorithms. Section 5 concludes the paper.

2. PROJECTION-BASED IMAGE ALIGNMENT

We propose a projection-based shift estimation algorithm that is robust to large translation, small rotation and perspective change, noise and texture. The global shift is computed over multiple scales as shown in Fig. 1. The input images are first subsampled to a manageable size to reduce noise and computation. A dyadic image pyramid is then constructed for each image [3]. At each pyramid level, a shift estimate is obtained independently using the new projection-based image alignment algorithm described in Section 2.1. The shift candidate with the highest 2D NCC score is the final shift estimate.

Aligning two images at multiple subsampled resolutions and taking the best solution is more robust than alignment at a single original resolution for a number of reasons. First, noise is substantially reduced by subsampling while the gradient information of the scene is largely preserved. Second, subsampling reduces texture variation and its contribution to the gradient projections.

Too much subsampling, however, eliminates useful alignment details. To achieve an optimal gain in signal-to-noise ratio, we align the images over three successively halved pyramid levels starting from an image size around 256² pixels at the base of the pyramid. Block summing is used to subsample the images for efficiency. Because block summing produces slightly more aliased images compared to Gaussian subsampling, some subpixel alignment error is expected. However, the alignment error can be corrected by subpixel peak interpolation of the NCC score at the base pyramid level.

2.1. Multi-hypothesis gradient projection correlation

At each pyramid level, the translation between two input images I1, I2 is estimated by a multi-hypothesis projection-based shift estimation algorithm described in Fig. 2. Image gradients |∂I1/∂x| and |∂I1/∂y| are estimated using finite difference. The magnitude of the x-gradient image is then integrated along image columns to obtain the x-gradient projection: p¹x = ∫ |∂I1/∂x| dy. The y-gradient projection is similarly obtained from the y-gradient image: p¹y = ∫ |∂I1/∂y| dx. The corresponding gradient projections from the two images are correlated to find multiple possible translations in either dimension. Cross-correlation of zero-padded zero-mean signals is used instead of a brute-force search for a correlation peak in [2] to handle a larger range of possible motion. Multiple 2D shift hypotheses are derived from all combinations of the 1D shift hypotheses in both dimensions. A 2D NCC score is obtained for each of these 2D shift hypotheses from the overlapping area of the input images dictated by the shift. The shift hypothesis with the highest 2D NCC score is then refined to a subpixel accuracy by an elliptical paraboloid fit over a 3×3 neighborhood around the 2D NCC peak.

Fig. 3 shows a block diagram with all steps and possible execution paths of our multi-hypothesis projection-based shift estimation algorithm. The efficiency of the new algorithm comes from two improvements over [2] in steps 1 and 2:

1. The input images are subsampled to a manageable size
(e.g., 256×256 pixels) before alignment;

2. The 2D translation is estimated separately in the x- and y-dimension (rather than in four orientations as in [2]) using projections of the image's directional gradient magnitude (rather than the gradient energy as in [2]) onto the corresponding axis.

The algorithm is robust to large translations thanks to a new multiple shift hypotheses algorithm in steps 3 to 6:

3. For each pair of 1D projections, k shift hypotheses are selected from the k strongest 1D NCC peaks (e.g., k=5) using non-maximal suppression [4];

4. Any shift candidate with a dominant 1D NCC score, which is higher than 1.5 times the second highest score along the same dimension, is the final shift for that dimension;

5. If only one dimension has a dominant NCC score, the two images are cropped to an overlapping area along this dimension before returning to step 2;

6. If there is no shift hypothesis with a dominant 1D NCC score, k² 2D shift hypotheses are constructed from the 1D shift hypotheses (see Fig. 2). The shift candidate with the highest 2D NCC score is the final 2D shift.

Note that our algorithm terminates at step 4 if two images have substantial overlap. Step 5 is executed if there is a large shift in only one dimension. Step 6 is the most expensive part because it requires the computation of k² 2D NCC scores. Fortunately, for a sweeping panorama application, the motion is mainly one-dimensional. As a result, most of the examples in this paper branch to step 5, which requires significantly fewer 2D NCC score computations to find the best translation.

Fig. 3. Flow chart describing the proposed projection-based shift estimation algorithm.

Fig. 4. Sweeping panorama (1119 × 353) in the presence of moving objects and perspective image motion (seams shown in yellow). [Screenshot shows frames 1, 11, 21, 31 and 41 of 48 panning images (6 actually used).]

3. SEAMLESS PANORAMA STITCHING

Using the alignment algorithm described in the previous section, a panning image sequence can be combined to form a panoramic image. If the alignment is accurate to a subpixel level, frame averaging can be used for image composition [5]. However, subpixel alignment is difficult for images captured by a moving camera with moving objects in the scene. A more robust compositing method is to segment the mosaic and use a single image per segment [6]. For sweeping panorama, the images undergo a translation mainly in one direction. Two consecutive images can therefore be joined together along a seam that minimizes the intensity mismatch between adjacent segments [7]. Laplacian pyramid fusion [3] can then be used to smooth out any remaining seam artefacts.

To demonstrate our alignment technology on realistic scenes, we built a standalone application that stitches live images from a panning camera. The images are automatically transferred from a Canon 40D camera to a PC. A screenshot of our demo application is given in Fig. 4, where the panorama was reconstructed from six panning images in real-time.

For efficiency, we do not use all captured images for panorama stitching. The images whose fields of view are covered by neighbouring frames can be skipped to reduce the seam computations. All incoming frames still need to be aligned to determine their overlapping areas. The first frame is always used in the panorama. A frame is skipped if it overlaps more than 75% with the last used frame and if the next frame also overlaps more than 25% with the last used frame. The second condition ensures no coverage gap is created by removing a frame. These overlapping parameters can be increased to encourage more frames to be used during stitching. Fig. 4 illustrates an example with this default overlapping parameter where only four out of six captured frames are needed to construct a panoramic image.

Our software prototype automatically determines the sweep direction from the alignment information. There is no need for the user to select the direction, as required in some consumer cameras. Fig. 5b shows an example of a vertical panorama constructed by our system from ten images in Fig. 5a. The output image is a good reproduction of the scene despite few horizontal or vertical structures in the scene, lighting change due to camera auto-gain, and texture of carpet on the floor. Another example of automatic sweep direction detection can be seen in Fig. 9, where the camera was panned
from right to left instead of the traditional left to right motion as in Fig. 4.

Fig. 5. Vertical sweeping panorama produced by our system: (a) 10 input frames (512×340); (b) panorama (543×1330).

Fig. 6. Shift estimation run time set out against image size for three algorithms.

Fig. 7. Estimated shifts for image pairs undergoing a synthetic horizontal shift.

4. EVALUATION

We first present an evaluation of our projection-based shift estimation, followed by results on seamless panorama stitching.

4.1. Shift estimation

We compare our multi-hypothesis projection-based shift estimation algorithm against an FFT-based 2D correlation and the viewfinder alignment algorithm [2]. All three algorithms were implemented in Matlab version R2010b. For the viewfinder alignment algorithm, the images were subsampled to approximately 320×240 pixels to match the viewfinder resolution in [2]. Harris corner detection followed by nearest neighbour corner matching was used to correct for small rotation and scale change as described in [2].

We applied the three shift estimators to panning image pairs of different sizes and recorded the execution time in Matlab. For each available image size, an average runtime and its standard deviation are plotted as error bars in Fig. 6. Runtime varies even for the same image size due to different content overlap. A line is fit to the data points to predict the runtime of each algorithm for an arbitrary image size. All algorithms show a linear run-time performance with respect to the number of input pixels. 2D correlation is the slowest algorithm. Its floating-point FFT operation also triggers an out-of-memory error for images larger than ten Mega Pixels (MP). Our algorithm runs slightly faster than that of Adams et al. because ours does not have the corner detection and matching steps. The red line fit in Fig. 6 shows that it takes us less than 0.05 of a second in Matlab to align a 1 MP image pair and roughly 0.1 second to align an 8 MP image pair. As the image size gets larger, the major part of the run-time is spent on image subsampling, which can be implemented more efficiently in hardware using CCD binning.

To measure the robustness of our projection-based alignment algorithm against large translation, we performed a synthetic shift experiment. Two 512×340 images were cropped from the panoramic image in Fig. 4 such that they are related by a purely horizontal translation, which ranges from 1 to 500 pixels. The estimated shifts [tx ty] are plotted in Fig. 7 for three algorithms: 2D correlation, viewfinder alignment, and this paper's. Both 2D correlation and viewfinder alignment fail to estimate shifts larger than 128 pixels (i.e. tx > 25% of image width). Our multi-hypothesis algorithm, on the other hand, estimates both shift components correctly for a synthetic translation up to 456 pixels (i.e. 90% of image width). As suggested by the 2D correlation subplot on the top row of Fig. 7, the strongest correlation peak does not always correspond to the true shift. Large non-overlapping areas can alter the correlation surface, leading to a sudden switch of the global peak to a different location. This sudden change in the global correlation peak corresponds to the sudden jumps of the tx and ty curves in the 2D correlation subplot.
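The 2D NCC verification that ranks shift hypotheses in Section 2.1, and that the estimators above are judged against, can be illustrated with a small Python sketch. This is our own toy code under the simplest possible assumptions, not the benchmarked Matlab implementations; pyramid subsampling and the subpixel paraboloid refinement are omitted.

```python
import math
import random

def ncc_score(img1, img2, tx, ty):
    """Zero-mean NCC over the overlap of img1 and img2, assuming the
    content of img2 is the content of img1 shifted by (tx, ty) pixels."""
    h, w = len(img1), len(img1[0])
    # pixel pairs inside the overlapping area dictated by the shift
    pairs = [(img1[y][x], img2[y - ty][x - tx])
             for y in range(max(0, ty), min(h, h + ty))
             for x in range(max(0, tx), min(w, w + tx))]
    if not pairs:
        return float("-inf")
    m1 = sum(a for a, _ in pairs) / len(pairs)
    m2 = sum(b for _, b in pairs) / len(pairs)
    num = sum((a - m1) * (b - m2) for a, b in pairs)
    den = math.sqrt(sum((a - m1) ** 2 for a, _ in pairs) *
                    sum((b - m2) ** 2 for _, b in pairs))
    return num / den if den else 0.0

def best_hypothesis(img1, img2, hypotheses):
    """Keep the 2D shift hypothesis with the highest NCC score."""
    return max(hypotheses, key=lambda s: ncc_score(img1, img2, *s))

# Toy check: img2 is img1 translated by (tx, ty) = (3, 1).
random.seed(0)
big = [[random.randint(0, 255) for _ in range(16)] for _ in range(12)]
img1 = [row[0:12] for row in big[0:10]]
img2 = [row[3:15] for row in big[1:11]]
print(best_hypothesis(img1, img2, [(0, 0), (3, 1), (5, 2)]))  # (3, 1)
```

Scoring only the k² candidate shifts produced by the 1D correlations this way is far cheaper than evaluating the NCC at every possible 2D displacement.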
The average accuracy of the estimated shifts in Fig. 7 is tabulated in Table 1. We measured the Root Mean Squared Errors (RMSE) of the estimated shifts within two ground-truth translation intervals. The first interval (1 ≤ tx ≤ 128) is where all three algorithms achieve subpixel accuracy. Within this interval, the viewfinder alignment algorithm is the most accurate and this paper's is the least accurate. The second interval covers a larger range of shifts (1 ≤ tx ≤ 456) and this is where all other algorithms fail. Within this larger motion range, our algorithm produces an average of 2-pixel alignment error for horizontal translation up to 90% of the image width.

Table 1. RMSE of estimated shifts under large translation
                  Correlation   Adams et al.   This paper
  1 ≤ tx ≤ 128        0.118          0.083        0.420
  1 ≤ tx ≤ 456      278.444        279.549        2.281

We also tested the robustness of our shift estimation algorithm against small image rotation. Fig. 8 plots the estimated shifts by the same three alignment algorithms on purely rotated image pairs. The images are generated from frame 1 of the image sequence in Fig. 4 by a rotation, followed by central cropping to 276×448 pixels to remove the missing image boundary. Under zero translation, the viewfinder alignment algorithm is robust up to 3° rotation. Outside this ±3° rotation range, however, the viewfinder alignment algorithm produces unreliably large shift estimation errors. Note that the middle subplot has a 10-times larger vertical axis limit compared to the other two subplots. Our algorithm performs equally well to that of Adams et al. for small rotation (|θ| < 3°). For larger rotation, the error of our alignment increases only gradually, reaching 10-pixel misalignment for 10° rotation.

Fig. 8. Estimated shifts for image pairs undergoing a small synthetic rotation.

The performances of the three alignment algorithms under small image rotation are further described by the RMSEs in Table 2. Within a ±1° rotation range, Adams et al. is the most accurate method, closely followed by this paper. Both achieve subpixel accuracy. For any larger rotation range, our algorithm is the most accurate. We consistently produce less than 2-pixel alignment error for rotation up to 5°. Adams et al., on the other hand, fail to align images with more than 3° rotation.

Table 2. RMSE of estimated shifts under small rotation
                  Correlation   Adams et al.   This paper
  −1° ≤ θ ≤ 1°        1.070          0.673        0.737
  −3° ≤ θ ≤ 3°        3.212          1.684        1.310
  −5° ≤ θ ≤ 5°        5.481        141.555        1.679

4.2. Panorama stitching

We demonstrate the accuracy of our multi-hypothesis projection-based shift estimation on a sweeping panorama application. Five images on the top row of Fig. 4 come from a sequence of 48 images captured by a hand-held camera. Due to the panning motion of the camera, the input images undergo a mainly horizontal translation. The translations are calculated between consecutive image pairs using the alignment algorithm presented in Section 2. Six frames (1, 12, 22, 33, 43, 48) with sufficient content overlap are automatically selected for panorama stitching. The selected frames are stitched together along a set of irregular seams (shown as yellow lines in the panorama).

Fig. 4 demonstrates our solution's robustness to moving objects and non-purely translational motion. Because the intensity difference across the seams is minimized, the stitched image appears seamless. The seams do not cut through moving objects such as the cars on the road. However, one of these cars appears multiple times in the panorama as it moves through the scene during image acquisition. Another visible artefact is the bending of the balcony wall close to the camera. This geometric distortion is due to the approximation of a full 3D projective transformation of the images by a simple 2D translation. Despite these artefacts, the produced panorama is a plausible representation of the scene.

Our global alignment algorithm is also robust to motion blur. An example of a panning sequence with severe motion blur is shown on the top row of Fig. 9. Because multiple 1D shift hypotheses are kept, the correct 2D shifts are successfully detected, leading to a good panorama reconstruction on the bottom row of Fig. 9. Note that the output panorama could have been improved further using motion blur deconvolution. However, deconvolution is out of the scope of this paper.

More panoramas reconstructed by our system are given in Fig. 10. Our algorithm works well outdoors (Fig. 10a) because motion of distant scenes can be approximated by a
6. Frame 11 Frame 9 Frame 6 Frame 3
sweeping panorama from 12 panning images (9 actually used)
Frame 0
(a) motion trail of a moving car
200
400
600
500 1000 1500 2000 2500 3000 (c) over-exposed
Fig. 9. Seamless panorama reconstruction under motion blur (b) ripples due to unstable sweeping motion
(output size is 3456×704).
Fig. 11. Some panoramas produced by a consumer camera.
5. CONCLUSION
We have presented a new projection-based shift estimation
algorithm using multiple shift hypothesis testing. Our shift
(a) Outdoor panorama (8448×1428) from 14 images
estimation algorithm is fast and it can handle large image
translations in either x- or y-direction. The robustness of
the algorithm in real-life situations is demonstrated using a
sweeping panorama stitching application. Our alignment al-
(b) 360◦ panorama (4448×496) from a PTZ camera
gorithm is found to be robust against small perspective change
due to camera motion. It is also robust against motion blur
and moving objects in the scene. We have presented a demo
application for live panorama stitching from a Canon cam-
era. The panorama stitching solution comprises of a multi-
(c) 180◦ panorama (4000×704) of a busy shopping centre hypothesis projection-based image alignment step, an irregu-
Fig. 10. Sweeping panoramas constructed by our system. lar seam stitching step and an optional image blending step.
Sweeping panoramas constructed by our system are shown in Fig. 10. Our algorithm works well outdoors (Fig. 10a) because the motion of distant scenes can be approximated by a translation. Projective distortions only appear when there is a significant depth difference in the scene. The 360◦ indoor panorama in Fig. 10b, for example, shows bending of linear structures due to this perspective effect. These distortions are unavoidable for a wide-angle view because the panorama effectively lies on a cylindrical surface, whereas each input image lies on a different imaging plane. Finally, a 180◦ view of a busy shopping centre is presented in Fig. 10c. The reconstructed panorama captures many people in motion, none of whom are cut by the hidden seams.

For comparison purposes, we captured some panoramic images using a consumer camera available on the market. Unlike our technology, which stitches as few frames as possible along irregular seams, this camera joins as many frames as it captures along straight vertical seams. This strip-based stitching algorithm is prone to motion artefacts such as the motion trail of the car in Fig. 11a. The thin-strip approach is also not robust to jittered camera motion: Fig. 11b shows jitter artefacts on a whiteboard and a nearby window due to an uneven panning motion, and the top drawer of the vertical panorama in Fig. 11c also looks distorted. Our solution does not suffer from jitter artefacts because the images are aligned in both directions before fusion.

6. ACKNOWLEDGMENT

The authors would like to thank Ankit Mohan from Canon USA R&D and Edouard Francois from Canon Research France for their help in improving this paper's presentation.

7. REFERENCES

[1] S. Alliney and C. Morandi, "Digital image registration using projections," PAMI, 8(2):222–233, 1986.

[2] A. Adams, N. Gelfand, and K. Pulli, "Viewfinder alignment," Comput. Graph. Forum, 27(2):597–606, 2008.

[3] E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden, "Pyramid method in image processing," RCA Eng., 29(6):33–41, 1984.

[4] T. Q. Pham, "Non-maximum suppression using fewer than two comparisons per pixel," in Proc. of ACIVS, 2010, pp. 438–451.

[5] H.-Y. Shum and R. Szeliski, "Construction of panoramic mosaics with global and local alignment," IJCV, 36(2):101–130, 2000.

[6] J. Davis, "Mosaics of scenes with moving objects," in Proc. of CVPR, 1998, pp. 354–360.

[7] S. Avidan and A. Shamir, "Seam carving for content-aware image resizing," in Proc. of SIGGRAPH, 2007.