This document evaluates GNSS code and phase solutions. It summarizes the key differences between code-only and code+phase differential GPS (DGPS) processing techniques. Code measurements are affected by biases while phase measurements also contain integer ambiguities. The document tests DGPS code and code+phase solutions using a dual-frequency GPS receiver to collect data at points within 10km of a reference station. Results show coordinate discrepancies between the two solutions are generally below 1m.
Canny Edge Detection Algorithm on FPGA (IOSR Journals)
This document summarizes the implementation of the Canny edge detection algorithm on an FPGA. It begins with an introduction to edge detection and digital image processing. It then describes the high-level implementation of the Canny algorithm using Simulink. The design and system-level block diagram of the implementation on an FPGA is shown, including loading an input image and displaying the output. Simulation and synthesis results are presented, showing the resource utilization on a Spartan 3E FPGA board. The implementation provides real-time edge detection to interface an FPGA with a monitor.
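The Canny stages described above (smoothing, gradient computation, thresholding with hysteresis) can be sketched in software before committing them to hardware. The NumPy sketch below is a simplified model, not the FPGA design: the thresholds are illustrative and non-maximum suppression is omitted for brevity.

```python
import numpy as np

def canny_sketch(img, low=0.1, high=0.3):
    """Simplified Canny stages: smooth, Sobel-style gradient, double threshold."""
    # 1) 3x3 Gaussian-like smoothing
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 16.0
    pad = np.pad(img, 1, mode="edge")
    sm = sum(k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3))
    # 2) central-difference gradients and normalized magnitude
    gx = np.zeros_like(sm); gy = np.zeros_like(sm)
    gx[1:-1, 1:-1] = sm[1:-1, 2:] - sm[1:-1, :-2]
    gy[1:-1, 1:-1] = sm[2:, 1:-1] - sm[:-2, 1:-1]
    mag = np.hypot(gx, gy)
    mag /= max(mag.max(), 1e-9)
    # 3) double threshold: keep strong edges, and weak edges touching a strong one
    strong = mag >= high
    weak = (mag >= low) & ~strong
    grown = strong.copy()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            grown |= np.roll(np.roll(strong, dy, 0), dx, 1)
    return strong | (weak & grown)

# a vertical step edge should be detected around the step
img = np.zeros((8, 8)); img[:, 4:] = 1.0
edges = canny_sketch(img)
```

Running this on the step image marks the columns around the intensity jump as edges, which is the behavior the hardware pipeline must reproduce line by line.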
Recognition and tracking moving objects using moving camera in complex scenes (IJCSEA Journal)
1) The document proposes a method for tracking moving objects in videos captured using a moving camera in complex scenes. It involves video stabilization, key frame extraction, object detection/tracking using Gaussian mixture models and Kalman filters, and object recognition using bag of features.
2) Key frame extraction identifies important frames for processing by computing edge differences between frames and selecting frames above a threshold.
3) Moving objects are detected using background subtraction and Gaussian mixture models, and then tracked across frames using Kalman filters.
4) Object recognition is performed using bag of features, which represents objects as histograms of visual word frequencies to classify objects based on characteristic visual parts.
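The detect-then-track loop in step 3 can be sketched for a single object centroid with a constant-velocity Kalman filter. The state model and noise covariances below are generic textbook choices, not values taken from the paper.

```python
import numpy as np

# Constant-velocity Kalman filter for a 2-D centroid (illustrative matrices).
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)   # we observe position only
Q = 1e-3 * np.eye(4)                                # process noise (assumed)
R = 1e-1 * np.eye(2)                                # measurement noise (assumed)

x = np.zeros(4)           # state: [px, py, vx, vy]
P = np.eye(4)

def kalman_step(x, P, z):
    # predict with the motion model
    x = F @ x
    P = F @ P @ F.T + Q
    # update with a detection z (e.g. a blob centroid from background subtraction)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# track a target moving +1 px/frame in x
for t in range(20):
    x, P = kalman_step(x, P, np.array([float(t), 0.0]))
```

After a few frames the velocity estimate converges to the true motion, which is what lets the tracker coast through brief detection failures.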
SIGGRAPH 2014 Course on Computational Cameras and Displays, part 4 (Matthew O'Toole)
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
This document describes the implementation of a Sobel edge detection algorithm on an FPGA. It discusses first and second order derivative edge detection algorithms. It provides an overview of the FPGA implementation including the use of block RAM for image storage, a VGA interface for display, and resource utilization. The FPGA implementation achieved a processing speed of 400 frames per second for a 500x500 grayscale image. Future work proposed improving performance and developing the design into a complete embedded system on a Zynq SoC.
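A software model of the per-window Sobel computation that the FPGA pipeline performs might look as follows; the gradient threshold is an illustrative assumption, and border pixels are simply left unmarked.

```python
import numpy as np

# First-order Sobel kernels, applied per 3x3 window as the hardware would.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
KY = KX.T

def sobel(img, thresh=1.0):
    h, w = img.shape
    out = np.zeros((h, w), bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = img[y - 1:y + 2, x - 1:x + 2]
            gx = float((KX * win).sum())
            gy = float((KY * win).sum())
            out[y, x] = (gx * gx + gy * gy) ** 0.5 > thresh
    return out

img = np.zeros((6, 6)); img[:, 3:] = 10.0   # vertical step edge
edges = sobel(img)
```

The explicit window loop mirrors the streaming structure a line-buffered FPGA implementation would use, one 3x3 neighborhood per clocked pixel.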
Multiple-Wavelength Holographic Interferometry with Tunable Laser Diodes (InTech; Meyli Valin Fernández)
Multiple-wavelength holographic interferometry uses tunable laser diodes to measure large step heights with high accuracy. Holograms are recorded at several wavelengths, generating phase differences with synthetic wavelengths from 0.4637 mm to 129.1 mm. The 129.1 mm synthetic wavelength allows measuring a 32 mm step height, while the 0.4637 mm synthetic wavelength provides 0.01 mm measurement accuracy. Recursive calculations using phase differences from multiple wavelengths eliminate 2π ambiguities, enabling measurement of the 32 mm step with 0.01 mm accuracy. Precise knowledge of the recording wavelengths is required for correct phase unwrapping.
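The synthetic wavelengths quoted above come from beating two recording wavelengths together, Λ = λ₁λ₂/|λ₁ − λ₂|. The diode wavelengths in this sketch are illustrative values near 850 nm, not the paper's exact settings.

```python
# Synthetic (beat) wavelength for two-wavelength interferometry:
#   Lambda = lam1 * lam2 / |lam1 - lam2|
def synthetic_wavelength(lam1_nm, lam2_nm):
    return lam1_nm * lam2_nm / abs(lam1_nm - lam2_nm)   # result in nm

# Closely spaced wavelengths give a long synthetic wavelength (large
# unambiguous range); widely spaced ones give a short, precise one.
long_nm = synthetic_wavelength(850.000, 850.006)   # on the order of 120 mm
short_nm = synthetic_wavelength(850.0, 851.6)      # on the order of 0.45 mm
```

This is why the recursive scheme works: the long synthetic wavelength resolves the 2π ambiguity of the short one, and the short one supplies the final accuracy.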
Molecular dynamics (MD) is a very useful tool to understand various phenomena in atomistic detail. In MD, we can overcome the size- and time-scale problems by efficient parallelization. In this lecture, I’ll explain various parallelization methods of MD with some examples of GENESIS MD software optimization on Fugaku.
Estimating Human Pose from Occluded Images (ACCV 2009, Jia-Bin Huang)
We address the problem of recovering 3D human pose from single 2D images, in which the pose estimation problem is formulated as a direct nonlinear regression from image observation to 3D joint positions. One key issue that has not been addressed in the literature is how to estimate 3D pose when humans in the scenes are partially or heavily occluded. When occlusions occur, features extracted from image observations (e.g., silhouettes-based shape features, histogram of oriented gradient, etc.) are seriously corrupted, and consequently the regressor (trained on un-occluded images) is unable to estimate pose states correctly. In this paper, we present a method that is capable of handling occlusions using sparse signal representations, in which each test sample is represented as a compact linear combination of training samples. The sparsest solution can then be efficiently obtained by solving a convex optimization problem with certain norms (such as l1-norm). The corrupted test image can be recovered with a sparse linear combination of un-occluded training images which can then be used for estimating human pose correctly (as if no occlusions exist). We also show that the proposed approach implicitly performs relevant feature selection with un-occluded test images. Experimental results on synthetic and real data sets bear out our theory that with sparse representation 3D human pose can be robustly estimated when humans are partially or heavily occluded in the scenes.
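The core idea, representing a test sample as a sparse combination of training samples, can be sketched with a greedy solver. Orthogonal matching pursuit stands in here for the paper's l1 convex program (both seek the sparsest coefficients), and the dictionary is synthetic rather than real pose features.

```python
import numpy as np

rng = np.random.default_rng(3)
D = rng.standard_normal((50, 30))
D /= np.linalg.norm(D, axis=0)          # columns = normalized training samples

def omp(D, y, k):
    """Greedy sparse coding: pick the k atoms that best explain y."""
    support, residual = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

y = 2.0 * D[:, 4] - 1.5 * D[:, 17]       # test sample = sparse combination
x_hat = omp(D, y, k=2)
```

When the test sample truly is a sparse combination of training samples, the recovered coefficients reconstruct it almost exactly, which is the property the paper exploits to discard occlusion-corrupted feature dimensions.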
A Physical Approach to Moving Cast Shadow Detection (ICASSP 2009, Jia-Bin Huang)
This document presents a physical approach to detecting moving cast shadows in video. It introduces a physics-based shadow model that decomposes light sources into direct and ambient components. Color features are used to encode the difference between shadow and background pixels. A weak shadow detector is used to identify shadow candidates, and a Gaussian mixture model learns the shadow model over time. Spatial information is incorporated to improve learning. The approach detects shadows at light/shadow borders separately. Experimental results on various sequences demonstrate improved shadow detection and discrimination rates compared to other methods. Future work will derive physics-based features for a global shadow model and extend the physical model to more complex cases.
Currently, both the market and the academic community require applications based on image and video processing under several real-time constraints. Detection of moving objects, in particular, is a very important task in mobile robotics and surveillance applications. To achieve this, we use an alternative means for real-time motion detection systems. This paper proposes a hardware architecture for motion detection based on the background subtraction algorithm, implemented on FPGAs (Field Programmable Gate Arrays). The following steps are executed: (a) a background image (in gray-level format) is stored in an external SRAM memory, (b) a low-pass filter is applied to both the stored and current images, (c) a subtraction operation between both images is performed, and (d) a morphological filter is applied over the resulting image. Afterward, the gravity center of the object is calculated and sent to a PC (via an RS-232 interface).
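Steps (b)-(d) and the gravity-center computation can be modeled in software as follows. The filter sizes and the threshold are illustrative assumptions, not the paper's hardware parameters.

```python
import numpy as np

def box_blur(img):
    # (b) 3x3 low-pass filter via an edge-padded box average
    p = np.pad(img.astype(float), 1, mode="edge")
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def erode(mask):
    # (d) 3x3 morphological erosion to remove isolated noise pixels
    p = np.pad(mask, 1, mode="constant")
    out = np.ones_like(mask, bool)
    for i in range(3):
        for j in range(3):
            out &= p[i:i + mask.shape[0], j:j + mask.shape[1]]
    return out

def detect(background, frame, thresh=20):
    # (c) absolute subtraction + threshold, then (d) erosion
    diff = np.abs(box_blur(frame) - box_blur(background)) > thresh
    mask = erode(diff)
    if not mask.any():
        return mask, None
    ys, xs = np.nonzero(mask)
    return mask, (xs.mean(), ys.mean())   # gravity center (x, y)

bg = np.zeros((16, 16), np.uint8)
fr = bg.copy(); fr[4:9, 6:11] = 200       # a bright 5x5 moving object
mask, center = detect(bg, fr)
```

The returned gravity center is what the hardware would serialize over the RS-232 link.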
A Novel Approach for Tracking with Implicit Video Shot Detection (IOSR Journals)
1) The document presents a novel approach that combines video shot detection and object tracking using a particle filter to create an efficient tracking algorithm with implicit shot detection.
2) It uses a robust pixel difference method for shot detection that is resistant to sudden illumination changes. It then applies a particle filter for tracking that uses color histograms and Bhattacharyya distance to track objects across frames.
3) The key innovation is that the tracking algorithm is only initiated after a shot change is detected, reducing computational costs by discarding unneeded frames and triggering tracking only when needed. This provides a more efficient solution for tracking large video datasets with minimal preprocessing.
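The color-histogram comparison that drives the particle weights can be sketched as follows; the bin count and patch contents are illustrative, not the paper's settings.

```python
import numpy as np

def color_histogram(patch, bins=8):
    # Normalized per-channel histogram as a simple appearance model
    hist = np.concatenate([np.histogram(patch[..., c], bins=bins,
                                        range=(0, 256))[0] for c in range(3)])
    return hist / hist.sum()

def bhattacharyya_distance(p, q):
    # BC = sum sqrt(p*q); distance = sqrt(1 - BC), 0 for identical histograms
    bc = np.sum(np.sqrt(p * q))
    return np.sqrt(max(0.0, 1.0 - bc))

rng = np.random.default_rng(0)
target = rng.integers(0, 256, (20, 20, 3))
same = bhattacharyya_distance(color_histogram(target), color_histogram(target))
other = rng.integers(200, 256, (20, 20, 3))       # much brighter patch
diff = bhattacharyya_distance(color_histogram(target), color_histogram(other))
```

A particle whose candidate patch matches the target model gets distance near 0 (high weight); a mismatched patch gets a large distance (low weight).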
Optical Computing for Fast Light Transport Analysis (Matthew O'Toole)
Matthew O'Toole and Kiriakos N. Kutulakos. ACM SIGGRAPH Asia, 2010.
We present a general framework for analyzing the transport matrix of a real-world scene at full resolution, without capturing many photos. The key idea is to use projectors and cameras to directly acquire eigenvectors and the Krylov subspace of the unknown transport matrix. To do this, we implement Krylov subspace methods partially in optics, by treating the scene as a black box subroutine that enables optical computation of arbitrary matrix-vector products. We describe two methods—optical Arnoldi to acquire a low-rank approximation of the transport matrix for relighting; and optical GMRES to invert light transport. Our experiments suggest that good-quality relighting and transport inversion are possible from a few dozen low-dynamic range photos, even for scenes with complex shadows, caustics, and other challenging lighting effects.
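The optical Arnoldi idea can be illustrated numerically by treating the transport matrix as a black-box matrix-vector product. Here T is a synthetic low-rank stand-in; in the real system each matvec is one projector-pattern/camera round trip, and the final projection Qᵀ T would also be acquired optically (one photo per basis vector) rather than computed from a known T.

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_rank = 64, 3
T = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, n))
matvec = lambda v: T @ v          # stand-in for one illuminate-and-capture cycle

def arnoldi(matvec, n, k):
    """Build an orthonormal Krylov basis using only black-box matvecs."""
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = np.ones(n) / np.sqrt(n)    # initial "illumination" vector
    for j in range(k):
        w = matvec(Q[:, j])
        for i in range(j + 1):            # Gram-Schmidt against previous basis
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-10:           # Krylov subspace exhausted (breakdown)
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, :k + 1]

Q = arnoldi(matvec, n, k=6)
T_approx = Q @ (Q.T @ T)          # low-rank approximation for relighting
rel_err = np.linalg.norm(T - T_approx) / np.linalg.norm(T)
```

Because the true transport here has rank 3, a handful of matvecs ("photos") already captures it essentially exactly, mirroring the paper's few-dozen-photo budget.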
This document summarizes a previous work on automatically extracting objects from images using embedded watermarks. Specifically:
1) Previous methods embedded a watermark bit (0 or 1) for each pixel to indicate foreground or background.
2) During watermark embedding, the pixel value was quantized depending on the watermark bit.
3) This allowed automatic extraction of the object by decoding the embedded watermarks.
However, the pixel-wise watermarking was fragile to post-processing like compression. To address this, the proposed method uses block-wise watermark embedding and decoding.
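Block-wise quantization embedding of the kind described can be sketched with quantization-index modulation on the block mean: the mean is pushed to an even multiple of a step for bit 0 and an odd multiple for bit 1. The step size and block size below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

STEP, B = 8, 8   # quantization step and block size (assumed values)

def embed_bit(block, bit):
    m = block.mean()
    q = np.round(m / STEP)
    if int(q) % 2 != bit:            # move to the nearest correct-parity level
        q += 1 if m / STEP >= q else -1
    return np.clip(block + (q * STEP - m), 0, 255)

def decode_bit(block):
    return int(np.round(block.mean() / STEP)) % 2

rng = np.random.default_rng(2)
block = rng.integers(60, 190, (B, B)).astype(float)
marked0, marked1 = embed_bit(block, 0), embed_bit(block, 1)
# mild "compression" noise: decoding should survive small per-pixel changes,
# because the block mean moves far less than any single pixel
noisy1 = marked1 + rng.normal(0, 1.0, (B, B))
```

Averaging over the block is exactly what buys the robustness to post-processing that the pixel-wise scheme lacked.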
A Novel Blind SR Method to Improve the Spatial Resolution of Real Life Video ... (IRJET Journal)
This document proposes a novel blind super resolution method to improve the spatial resolution of real-life video sequences. The key aspects of the proposed method are:
1) It estimates blur without knowing the point spread function or noise statistics using a non-uniform interpolation super resolution method and multi-scale processing.
2) It uses a cost function with fidelity and regularization terms of a Huber-Markov random field to preserve edges and fine details in the reconstructed high resolution frames.
3) It performs masking to suppress artifacts from inaccurate motions, adaptively weighting the fidelity term at each iteration for faster convergence.
The method is tested on real-life videos with complex motions, objects, and brightness changes.
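The Huber penalty behind the Huber-Markov regularization term is quadratic for small differences and linear for large ones, which is what preserves edges while smoothing noise. One common form is shown below; the threshold delta is illustrative.

```python
import numpy as np

def huber(t, delta=1.0):
    """Huber penalty: t^2 inside |t| <= delta, linear growth outside."""
    t = np.abs(t)
    return np.where(t <= delta, t ** 2, 2 * delta * t - delta ** 2)

small = huber(0.5)    # quadratic region: 0.5^2 = 0.25, smooths small noise
large = huber(10.0)   # linear region: 2*10 - 1 = 19; a pure quadratic
                      # penalty would give 100, over-penalizing genuine edges
```

The two regions match at |t| = delta with equal slope, so the cost function stays smooth for the optimizer.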
Research Inventy: International Journal of Engineering and Science (researchinventy)
Research Inventy: International Journal of Engineering and Science is published by a group of young academic and industrial researchers, with 12 issues per year. It is an open-access journal, available online and in print, that provides rapid (monthly) publication of articles in all areas of the subject, such as civil, mechanical, chemical, electronic, and computer engineering, as well as production and information technology. The journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers are published by a rapid process within 20 days after acceptance, and the peer-review process takes only 7 days. All articles published in Research Inventy are peer-reviewed.
1. The document discusses using NMR experiments and simulations to study the pore geometry of porous solids.
2. It presents an approach using Carr-Purcell-Meiboom-Gill (CPMG) echo trains measured by NMR for liquid diffusing in submicron pores to characterize the pore morphology.
3. Both numerical simulations of NMR signals from a model porous structure and experimental CPMG measurements on a porous glass are shown, finding reasonable agreement between the two.
This document describes a new method for analyzing infant spontaneous motor patterns using a Kinect sensor and tracking algorithm. The Kinect is used to record 3D video of infants' limbs in motion without any body markers. Custom software then tracks limb positions over time and calculates kinematic measures like velocity and movement units. Initial results show the method can accurately capture and quantify limb movements and correlations between limbs. The goal is to use this non-invasive tracking to study developmental changes in infants' movement patterns from 2-24 weeks of age.
Learning Moving Cast Shadows for Foreground Detection (VS 2008, Jia-Bin Huang)
The document summarizes a research paper about learning moving cast shadows for foreground detection. It presents a proposed algorithm that uses a confidence-rated Gaussian mixture learning approach and Bayesian framework with Markov random fields to model local and global shadow features. This exploits the complementary nature of local and global features to improve shadow detection. The algorithm is evaluated on outdoor and indoor video sequences, showing improved accuracy over previous methods especially in adaptability to different lighting conditions. Future work could incorporate additional features and more powerful models.
Primal-dual coding photography is a new photographic technique that uses coded illumination and exposure to selectively record user-defined subsets of light paths, generalizing conventional photography. It modulates the contribution of specific light paths using a "probing matrix" of primal codes for illumination and dual codes for exposure over multiple frames. This allows effects like enhancing direct light, capturing indirect light of different ranges, separating light transport effects, and making 3D regions invisible or color-coded in the photo. The technique provides guarantees of optimality and convergence for reconstructing images from the coded photo measurements.
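The probing-matrix idea above can be illustrated numerically: each frame contributes the outer product of its exposure (dual) code and illumination (primal) code, and the captured value sums the transport matrix entrywise against that mask. The toy transport matrix below, with strong diagonal "direct" paths plus weak indirect ones, is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 16
# direct light on the diagonal, weak indirect light everywhere else
T = np.diag(rng.random(n) + 1.0) + 0.05 * rng.random((n, n))

def probe(T, P, Qc):
    """image[i] = sum_j Pi[i, j] * T[i, j], with Pi = sum_f outer(Qc[f], P[f])."""
    Pi = Qc.T @ P            # probing matrix accumulated over all frames
    return (Pi * T).sum(axis=1)

# conventional photo: full illumination and full exposure in one frame
conventional = probe(T, np.ones((1, n)), np.ones((1, n)))
# direct-only probing: identity codes over n frames keep only T's diagonal
direct_only = probe(T, np.eye(n), np.eye(n))
```

Choosing the probing matrix close to the identity keeps only paths where pixel i sees source i, which is the direct-light-enhancement effect described above; other probing matrices select indirect ranges instead.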
This document summarizes a method for 4D reconstruction of crop fields using images collected over time from aerial vehicles. The method uses structure from motion and multi-view stereo to reconstruct 3D point clouds of the field for different time points. It then employs a spatio-temporal model and robust data association to align these reconstructions into a unified 4D model that captures how the crops change over time. The performance of the method is evaluated on a new dataset collected over a peanut field with ground truth measurements. Results show the 4D reconstructions qualitatively capture visual appearance changes and are quantitatively accurate for measuring crop geometric properties over time.
The document discusses background subtraction techniques for detecting moving objects in video frames. It introduces the mixture of Gaussians approach, which models each pixel as a combination of Gaussian distributions to determine if it belongs to the background or foreground. The key advantages of this approach are its robustness to repetitive motions and changes in lighting/weather. The document compares various techniques, then covers implementation details and challenges of applying mixture of Gaussians to an outdoor scene with moving vehicles and foliage.
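The per-pixel mixture update can be sketched as follows. K, the learning rate, and both thresholds are typical textbook values, not taken from the document.

```python
# A Stauffer-Grimson-style mixture-of-Gaussians model for one grayscale pixel.
K, ALPHA, MATCH_SIGMAS, BG_THRESH = 3, 0.05, 2.5, 0.7

def make_model():
    # each component is a mutable [weight, mean, variance] triple
    return [[1.0 / K, 128.0, 900.0] for _ in range(K)]

def update(model, x):
    """Fold observation x into the mixture; return True if x looks like background."""
    matched = None
    for comp in model:
        if (x - comp[1]) ** 2 <= (MATCH_SIGMAS ** 2) * comp[2]:
            matched = comp
            break
    for comp in model:
        comp[0] *= (1.0 - ALPHA)                     # decay every weight
    if matched is None:
        weakest = min(model, key=lambda c: c[0])
        weakest[:] = [ALPHA, float(x), 900.0]        # recycle weakest component
    else:
        matched[0] += ALPHA
        matched[1] += ALPHA * (x - matched[1])
        matched[2] += ALPHA * ((x - matched[1]) ** 2 - matched[2])
    total = sum(comp[0] for comp in model)
    for comp in model:
        comp[0] /= total
    if matched is None:
        return False          # no component explains x: foreground
    # background = the highest-weight components covering BG_THRESH of the mass
    mass, is_bg = 0.0, False
    for comp in sorted(model, key=lambda c: -c[0]):
        if comp is matched and mass < BG_THRESH:
            is_bg = True
        mass += comp[0]
    return is_bg

model = make_model()
for _ in range(100):
    is_bg = update(model, 100.0)       # a steady background pixel near 100
is_fg = not update(model, 250.0)       # a sudden bright object
```

Because several Gaussians coexist per pixel, repetitive motions (e.g. swaying foliage) can each own a component and still count as background, which is the robustness property noted above.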
This document summarizes a new algorithm called MewDC-NMF for unsupervised unmixing of hyperspectral images. MewDC-NMF stands for Minimum endmember-wise Distance Constrained Nonnegative Matrix Factorization. It simultaneously extracts endmembers and estimates abundance fractions without requiring pure pixels. This is accomplished by imposing a distance constraint between endmembers to make their spectra more compact during optimization. Experiments on synthetic and real AVIRIS data show MewDC-NMF outperforms other constrained NMF methods in extracting more accurate endmembers and estimating abundances.
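MewDC-NMF builds on plain nonnegative matrix factorization. The sketch below runs only the standard Lee-Seung multiplicative updates, without the paper's endmember-distance constraint, to show the X ≈ WH decomposition being estimated; the data is synthetic, with W playing the endmember spectra and H the abundance fractions.

```python
import numpy as np

rng = np.random.default_rng(4)
bands, pixels, r = 30, 200, 4
W_true = rng.random((bands, r)); H_true = rng.random((r, pixels))
X = W_true @ H_true                      # synthetic hyperspectral data cube

W = rng.random((bands, r)) + 0.1         # random nonnegative initialization
H = rng.random((r, pixels)) + 0.1
for _ in range(300):
    # multiplicative updates keep W and H nonnegative by construction
    H *= (W.T @ X) / (W.T @ W @ H + 1e-9)     # abundance update
    W *= (X @ H.T) / (W @ H @ H.T + 1e-9)     # endmember update

rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

The distance constraint in MewDC-NMF would enter as an extra penalty term modifying these updates, steering the columns of W toward compact, physically plausible endmember spectra.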
In this paper, we propose a novel fast video search algorithm for large video databases. The Histogram of Oriented Gradients (HOG) has been reported to apply reliably to object detection, especially pedestrian detection. In this study, we use HOG-based features as the feature vector of a frame image. Combined with active search, a temporal pruning algorithm, fast and robust video search can be achieved. The proposed search algorithm has been evaluated on 6 hours of video, searching for 200 given video clips, each 15 seconds long. Experimental results show that the proposed algorithm detects similar video clips more accurately, and is more robust against Gaussian noise, than a conventional fast video search algorithm.
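A minimal HOG-style frame descriptor of the kind used above can be sketched as follows. The cell size and bin count are illustrative, and block normalization is simplified to one global L2 norm.

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9):
    """Gradient-orientation histograms per cell, concatenated and normalized."""
    img = img.astype(float)
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned orientation
    h, w = img.shape
    feats = []
    for cy in range(h // cell):
        for cx in range(w // cell):
            sl = np.s_[cy * cell:(cy + 1) * cell, cx * cell:(cx + 1) * cell]
            idx = (ang[sl] / (180.0 / bins)).astype(int) % bins
            hist = np.bincount(idx.ravel(), weights=mag[sl].ravel(),
                               minlength=bins)
            feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-9)

frame = np.zeros((16, 16)); frame[:, 8:] = 1.0        # vertical edge
desc = hog_descriptor(frame)
```

Per-frame vectors like this one are what active search then compares along the timeline, pruning temporal regions whose descriptors cannot match the query clip.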
All optical image processing using third harmonic generation for image correlation (M. Faisal Halim)
Term Paper: All optical image processing using third harmonic generation for image correlation
Optical Information Processing Course
Monday, 20th December, 2010
Background subtraction is a technique used to separate foreground objects from backgrounds in video frames. It works by comparing each frame to a background model and detecting differences which indicate moving foreground objects. Recursive techniques like mixtures of Gaussians model the background pixel values over time using multiple Gaussian distributions, allowing the background model to adapt to changing lighting conditions. Adaptive background/foreground detection uses a background model that evolves over time to distinguish foreground objects from the background in a robust way.
Real Time Detection of Moving Object Based on FPGA (iosrjce)
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes the publication of high-quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
A Moving Target Detection Algorithm Based on Dynamic Background (Chittipolu Praveen)
The document analyzes and compares two common algorithms for moving target detection: background subtraction and frame difference. Background subtraction detects targets by comparing each frame to a background model, while frame difference compares adjacent frames. Both have advantages but also limitations, such as background subtraction being sensitive to dynamic background changes. The document then proposes a new algorithm based on background subtraction. It generates the background for the next frame by combining the current frame and background, so stationary objects become part of the background over time rather than being detected as foreground. Experimental results show this dynamic background algorithm can detect targets more effectively and precisely.
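The dynamic background generation described, combining the current frame and current background so that stationary objects are absorbed over time, can be sketched as a running average; the blending rate and threshold are illustrative.

```python
import numpy as np

# Dynamic background update sketched as a running average:
#   B(t+1) = alpha * F(t) + (1 - alpha) * B(t)
ALPHA, THRESH = 0.1, 30

def step(bg, frame):
    fg = np.abs(frame - bg) > THRESH           # detect against current model
    bg = ALPHA * frame + (1 - ALPHA) * bg      # then absorb the frame
    return bg, fg

bg = np.zeros((8, 8))
frame = np.zeros((8, 8)); frame[2:5, 2:5] = 200.0   # object appears and stops
detections = []
for _ in range(40):
    bg, fg = step(bg, frame)
    detections.append(fg[3, 3])
```

The object is flagged as foreground at first, then fades into the background model over a few dozen frames, which is exactly the "stationary objects become background" behavior the algorithm targets.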
The document discusses background subtraction techniques for detecting moving objects in video frames. It introduces the mixture of Gaussians approach, which models each pixel as a combination of Gaussian distributions to determine if it belongs to the background or foreground. The key advantages of this approach are its robustness to repetitive motions and changes in lighting/weather. The document compares various techniques, then covers implementation details and challenges of applying mixture of Gaussians to an outdoor scene with moving vehicles and foliage.
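The per-pixel classification described above can be sketched in a few lines. This is a simplified, scalar-intensity version in the spirit of the Stauffer-Grimson mixture model; the number of components, learning rate, match threshold, and background ratio below are illustrative choices, not values from the document.

```python
# Simplified per-pixel mixture-of-Gaussians classifier (Stauffer-Grimson
# style): scalar intensities, fixed K, illustrative parameters only.
K, ALPHA, MATCH_SIGMA, BG_RATIO = 3, 0.05, 2.5, 0.7

def make_model():
    # each component: [weight, mean, variance]
    return [[1.0 / K, 128.0, 900.0] for _ in range(K)]

def update(model, x):
    """Update the mixture with intensity x; return True if x is background."""
    matched = None
    for comp in model:
        if abs(x - comp[1]) <= MATCH_SIGMA * comp[2] ** 0.5:
            matched = comp
            break
    for comp in model:
        comp[0] *= (1 - ALPHA)                # decay all weights
    if matched is not None:
        matched[0] += ALPHA                   # reinforce matched component
        delta = x - matched[1]
        matched[1] += ALPHA * delta           # move mean toward x
        matched[2] += ALPHA * (delta * delta - matched[2])
        matched[2] = max(matched[2], 25.0)    # variance floor
    else:
        # replace the least probable component with one centred on x
        model.sort(key=lambda c: c[0])
        model[0] = [ALPHA, float(x), 900.0]
    total = sum(c[0] for c in model)
    for comp in model:
        comp[0] /= total                      # renormalise weights
    # the heaviest components (up to BG_RATIO of total weight) are background
    model.sort(key=lambda c: -c[0])
    acc, bg = 0.0, []
    for comp in model:
        bg.append(comp)
        acc += comp[0]
        if acc > BG_RATIO:
            break
    return matched is not None and matched in bg
```

Feeding a stable intensity stream builds up a dominant background component, after which an outlier intensity is flagged as foreground.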
This document summarizes a new algorithm called MewDC-NMF for unsupervised unmixing of hyperspectral images. MewDC-NMF stands for Minimum endmember-wise Distance Constrained Nonnegative Matrix Factorization. It simultaneously extracts endmembers and estimates abundance fractions without requiring pure pixels. This is accomplished by imposing a distance constraint between endmembers to make their spectra more compact during optimization. Experiments on synthetic and real AVIRIS data show MewDC-NMF outperforms other constrained NMF methods in extracting more accurate endmembers and estimating abundances.
In this paper, we propose a novel fast video search algorithm for large video databases. Histogram of Oriented Gradients (HOG) has been reported to apply reliably to object detection, especially pedestrian detection. In this study, we use HOG-based features as the feature vector of a frame image. Combined with active search, a temporal pruning algorithm, fast and robust video search can be achieved. The proposed search algorithm was evaluated on 6 hours of video, searching for 200 given video clips, each 15 seconds long. Experimental results show the proposed algorithm detects similar video clips more accurately, and is more robust against Gaussian noise, than a conventional fast video search algorithm.
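The pipeline the abstract describes, a per-frame gradient-orientation signature plus a temporal pruning loop, can be illustrated roughly as follows. The histogram here is a much-simplified stand-in for full HOG, and the skip rule only mimics the spirit of active search; the function names, threshold, and bin count are all hypothetical.

```python
# Sketch: HOG-style frame signatures plus an active-search-style temporal
# pruning loop. Simplified stand-ins, not the paper's exact features/bound.
import math

def frame_signature(frame, bins=8):
    """Orientation histogram of image gradients, L1-normalised."""
    h, w = len(frame), len(frame[0])
    hist = [0.0] * bins
    for y in range(h - 1):
        for x in range(w - 1):
            gx = frame[y][x + 1] - frame[y][x]
            gy = frame[y + 1][x] - frame[y][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            ang = math.atan2(gy, gx) % math.pi      # unsigned orientation
            hist[min(int(ang / math.pi * bins), bins - 1)] += mag
    s = sum(hist) or 1.0
    return [v / s for v in hist]

def intersection(p, q):
    return sum(min(a, b) for a, b in zip(p, q))

def active_search(db_sigs, query_sigs, threshold=0.9):
    """Return (start, score) of the best-matching temporal window."""
    n, m = len(db_sigs), len(query_sigs)
    best, t = (-1, 0.0), 0
    while t <= n - m:
        score = sum(intersection(db_sigs[t + i], query_sigs[i])
                    for i in range(m)) / m
        if score > best[1]:
            best = (t, score)
        # pruning: the farther below threshold, the more frames we skip
        t += max(1, int((threshold - score) * m)) if score < threshold else 1
    return best
```

A planted query window in a synthetic signature stream is found at its true offset while low-scoring regions are stepped over.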
All optical image processing using third harmonic generation for image correlation M. Faisal Halim
Term Paper: All optical image processing using third harmonic generation for image correlation
Optical Information Processing Course
Monday, 20th December, 2010
Background subtraction is a technique used to separate foreground objects from backgrounds in video frames. It works by comparing each frame to a background model and detecting differences which indicate moving foreground objects. Recursive techniques like mixtures of Gaussians model the background pixel values over time using multiple Gaussian distributions, allowing the background model to adapt to changing lighting conditions. Adaptive background/foreground detection uses a background model that evolves over time to distinguish foreground objects from the background in a robust way.
Real Time Detection of Moving Object Based on FPGA iosrjce
IOSR Journal of Electronics and Communication Engineering(IOSR-JECE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
A Moving Target Detection Algorithm Based on Dynamic Background Chittipolu Praveen
The document analyzes and compares two common algorithms for moving target detection: background subtraction and frame difference. Background subtraction detects targets by comparing each frame to a background model, while frame difference compares adjacent frames. Both have advantages but also limitations, such as background subtraction being sensitive to dynamic background changes. The document then proposes a new algorithm based on background subtraction. It generates the background for the next frame by combining the current frame and background, so stationary objects become part of the background over time rather than being detected as foreground. Experimental results show this dynamic background algorithm can detect targets more effectively and precisely.
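The contrast between the two algorithms, and the proposed idea of blending the current frame into the background so stationary objects are absorbed over time, can be sketched with a running-average update on flattened (1D) frames; `ALPHA` and `T` are illustrative parameters, not the paper's.

```python
# Sketch: frame difference vs. background subtraction with a running-average
# background update; ALPHA (blend factor) and T (threshold) are illustrative.
ALPHA, T = 0.2, 10

def frame_difference(prev, cur):
    """Mask of pixels that changed between adjacent frames."""
    return [int(abs(c - p) > T) for p, c in zip(prev, cur)]

def bg_subtract_step(bg, cur):
    """One frame of background subtraction.

    Returns (foreground mask, updated background); the new background mixes
    the current frame into the old one, so stopped objects fade into it.
    """
    mask = [int(abs(c - b) > T) for b, c in zip(bg, cur)]
    new_bg = [(1 - ALPHA) * b + ALPHA * c for b, c in zip(bg, cur)]
    return mask, new_bg
```

An object that stops moving disappears immediately under frame difference, while background subtraction keeps flagging it until the running average absorbs it, exactly the trade-off the abstract describes.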
Air pollution monitoring system using mobile gprs sensors array Saurabh Giratkar
This paper contains a brief introduction to vehicular pollution and the effects of increasing vehicular pollution on the environment as well as on human health. To monitor this pollution, a wireless sensor network (WSN) system is proposed. The proposed system consists of a Mobile Data-Acquisition Unit (Mobile-DAQ) and a fixed Internet-Enabled Pollution Monitoring Server (Pollution-Server). The Mobile-DAQ unit integrates a single-chip microcontroller, an air pollution sensors array, a General Packet Radio Service modem (GPRS-Modem), and a Global Positioning System module (GPS-Module). The Pollution-Server is a high-end personal computer application server with Internet connectivity. The Mobile-DAQ unit gathers air pollutant levels (CO, NO2, and SO2) and packs them in a frame with the GPS physical location, time, and date. The frame is subsequently uploaded to the GPRS-Modem and transmitted to the Pollution-Server via the public mobile network. A database server is attached to the Pollution-Server for storing the pollutant levels for further use by various clients such as environment protection agencies, vehicle registration authorities, and tourist and insurance companies.
2008-12 WMO GURME - Air Pollution Monitoring urbanemissions
The document discusses various methods for monitoring air pollution:
1) Different types of monitors are used to measure parameters like particulate matter, ozone, nitrogen oxides at various time resolutions from hourly to daily to provide data for compliance, trends analysis, and model verification.
2) Monitors include continuous gas analyzers, filter samplers, beta attenuation monitors, nephelometers, as well as more specialized equipment like ozonesondes, lidar, and aircraft.
3) The number, location and type of monitors needed depends on the objectives of the monitoring program and balances factors like cost, time resolution, and spatial coverage of the data.
This document presents an overview of air pollution monitoring using remote sensing and GIS technologies. It discusses how satellite remote sensing can provide synoptic views of large areas and monitor multiple pollutants simultaneously. It also describes some common air pollutants and sources. Two case studies are then presented on using these methods to map ambient air pollution zones and monitor air quality in specific regions.
Monitoring of air pollution involves tracking key pollutants like SO2, smoke, and suspended particles on a daily basis. Common methods include measuring SO2 levels, the smoke index, and deposit of grit and dust. The air quality index provides information on air cleanliness and potential health effects. Major effects of air pollution include respiratory illnesses and increased risk of lung cancer. Prevention and control involves techniques like containment, replacing polluting processes, dilution through green belts, legislation like the Clean Air Act, and international coordination through organizations like the WHO.
This presentation describes the design of an air quality monitoring system. The system uses sensors to detect the levels of air pollutants like carbon monoxide and air quality. It displays the sensor readings and pollutant percentages on an LCD screen. The system aims to continuously monitor indoor and outdoor air quality levels to provide data on air pollution levels. Future improvements could include adding more sensors, uploading real-time data online with location details, and storing readings on an SD card.
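Air quality indices of the kind mentioned above are typically computed by linear interpolation between pollutant-specific breakpoints. A minimal sketch, assuming a placeholder breakpoint table (the rows below are illustrative, not official values):

```python
# Linear-interpolation AQI sub-index for one pollutant concentration.
# The breakpoint rows are illustrative placeholders, not an official table.
BREAKPOINTS = [
    # (index_lo, index_hi, conc_lo, conc_hi)
    (0, 50, 0.0, 12.0),
    (51, 100, 12.1, 35.4),
    (101, 150, 35.5, 55.4),
]

def sub_index(conc):
    """Map a concentration to its AQI sub-index by linear interpolation."""
    for i_lo, i_hi, c_lo, c_hi in BREAKPOINTS:
        if c_lo <= conc <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo)
    raise ValueError("concentration outside breakpoint table")
```

The overall AQI is then conventionally the maximum sub-index over all monitored pollutants.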
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
Air Quality Sampling and Monitoring: Stack sampling, instrumentation and methods of analysis of SO2, CO etc, legislation for control of air pollution and automobile
pollution
Art is a creative expression that stimulates the senses or imagination, according to Felicity Hampel. Picasso believed that every child is an artist but that growing up can stop that creativity. Aristotle defined art as anything that requires a maker and cannot create itself.
Landuse Classification from Satellite Imagery using Deep Learning DataWorks Summit
With the abundance of remote sensing satellite imagery, the possibilities are endless as to the kind of insights that can be derived from them. One such use is to determine land use for agriculture and non-agricultural purposes.
In this talk, we’ll be looking at leveraging Sentinel-2 satellite imagery data along with OpenStreetMap labels to be able to classify land use as agricultural or non-agricultural.
Sentinel-2 data has a 10-meter resolution in the RGB bands and is well-suited for land use classification. Using these two datasets, many different machine learning tasks can be performed, such as image segmentation into two classes (farm land and non-farm land) or the more challenging task of identifying the crop type being cultivated on fields.
For this talk, we’ll be looking at leveraging convolutional neural networks (CNNs) built with Apache MXNet to train deep learning models for land use classification. We’ll be covering the different deep learning architectures considered for this particular use case along with the appropriate metrics.
We’ll be leveraging streaming pipelines built on Apache Flink and Apache NiFi for model training and inference. Developers will come away with a better understanding of how to analyze satellite imagery and the different deep learning architectures along with their pros/cons when analyzing satellite imagery for land use. SUNEEL MARTHI and CHRIS OLIVIER, Software Development Engineer Amazon Web Services
This document summarizes a research paper on background subtraction under sudden illumination changes. It proposes using phase features and distance transforms. Key points:
1. It extracts phase features from Gabor wavelet coefficients that are insensitive to illumination changes.
2. It models pixel backgrounds using Gaussian mixtures on the phase space and updates the models under a novel matching condition.
3. Experiments show the method achieves better precision and recall than GMM and LBP methods on test sequences under illumination changes.
Large scale landuse classification of satellite imagery Suneel Marthi
This document summarizes a presentation on classifying land use from satellite imagery. It describes using a neural network to filter out cloudy images, segmenting images with a U-Net model to identify tulip fields, and implementing the workflow with Apache Beam for inference on new images. Examples are shown of detecting large and small tulip fields. Future work proposed includes classifying rock formations using infrared bands and measuring crop health.
IJCER (www.ijceronline.com) International Journal of Computational Engineering... ijceronline
The document evaluates the performance of various foreground extraction algorithms for object detection in visual surveillance. It analyzes three background modeling techniques (change detection mask, median, histogram) and two background subtraction algorithms (frame difference, approximate median). Experimental results on test videos show that background modeling using the median value technique and background subtraction using frame differencing provides the most robust and efficient combination. Processing times are reported for different combinations of algorithms. The study concludes that the median-based approach has good computational efficiency and robustness for background modeling.
This document discusses image reconstruction from projections. It begins by introducing the image reconstruction problem and describes how taking projections from multiple angles can be used to reconstruct an image. It then covers the principles of computed tomography (CT), the Radon transform, and the Fourier-slice theorem. The key idea is that the Fourier transform of a projection is a slice of the 2D Fourier transform of the image. Finally, it describes how filtered back-projection can be used to reconstruct an image by filtering each projection with a ramp filter and back-projecting. Window functions are used to filter the ramp filter to reduce ringing artifacts.
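The Fourier-slice theorem stated above can be verified numerically in the simplest case, an axis-aligned projection: the 1D transform of the projection equals the v = 0 slice of the image's 2D transform. This sketch uses brute-force DFTs on a tiny image and is for illustration only.

```python
# Numerical check of the Fourier-slice theorem for an axis-aligned
# projection: the 1D DFT of the projection equals the v = 0 slice of the
# image's 2D DFT. Brute-force DFTs on a tiny image; illustration only.
import cmath

def dft1(p):
    n = len(p)
    return [sum(p[x] * cmath.exp(-2j * cmath.pi * u * x / n)
                for x in range(n)) for u in range(n)]

def dft2(f):
    n = len(f)
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

image = [[1, 2, 0, 1],
         [0, 3, 1, 0],
         [2, 1, 0, 4],
         [1, 0, 2, 1]]
projection = [sum(row) for row in image]     # integrate along y for each x

P = dft1(projection)
F = dft2(image)
for u in range(4):
    assert abs(P[u] - F[u][0]) < 1e-9        # slice v = 0 matches
```

Filtered back-projection exploits exactly this relationship: each projection contributes one radial slice of the image spectrum, and the ramp filter compensates for the denser sampling of slices near the origin.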
This document summarizes a seminar presentation on an image denoising method based on the curvelet transform. The presentation covered:
1) How image noise occurs and traditional denoising methods like linear filters and edge-preserving smoothing.
2) The curvelet transform process including sub-band decomposition, smooth partitioning, renormalization, and ridgelet analysis.
3) An image denoising algorithm that applies wavelet and curvelet transforms, then combines results using quad tree decomposition.
The document presents a multi-frame marked point process model for extracting targets from ISAR (Inverse Synthetic Aperture Radar) image sequences. The model integrates information across frames using priors on target shape persistency and smooth motion. Experiments show the model achieves better target line and center extraction compared to frame-by-frame detection. Future work involves generalizing the model to identify other objects like airplanes and using extracted features for target classification.
The document presents an experimental validation of an adaptive control scheme for quadrotor MAVs that is robust to uncertainties in mass, center of mass location, and external disturbances. The control scheme uses adaptive techniques to estimate unknown parameters and compensate for their effects. Experimental results show that the adaptive controller more accurately tracks a desired trajectory than a non-adaptive controller, especially when an additional weight is added to introduce parameter uncertainties. The adaptive controller maintains tracking accuracy even in the presence of external disturbances and unknown variations in vehicle parameters.
A Novel Background Subtraction Algorithm for Dynamic Texture Scenes IJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
This document summarizes Dr. Li Song's research on perceptual video coding. It discusses using perceptual cues from the human visual system in video coding to discard superfluous data that humans cannot perceive. Recent research areas covered include just-noticeable distortion based rate-distortion optimization, SSIM based RDO, and analysis-completion frameworks. While perceptual metrics have improved coding performance over PSNR, bridging the gap between metrics and perceived quality remains an ongoing challenge.
The document summarizes the status of the B-physics trigger working group. It discusses various B-physics trigger selections being studied at Level 2 and the event filter, including tighter cuts being applied to improve rate reduction. Future work is outlined, such as robustness studies, algorithm development, and investigations using new detector layout simulations and data sets.
A Framework of Secured and Bio-Inspired Image Steganography Using Chaotic Encryption... Varun Ojha
This document proposes a novel secured image steganography technique called CEGAO that uses chaotic encryption and genetic algorithm optimization. It encrypts a secret image using a logistic map before embedding it in the least significant bits of cover images. To extract the secret image, the recipient decrypts and retrieves the secret image blocks from the stego image using the same logistic map. The technique achieves PSNR values up to 48.55 dB and SSIM values over 0.99, outperforming conventional LSB techniques. While performance is encouraging, the authors aim to further improve resistance against statistical attacks and optimize other aspects of the algorithm.
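The combination of chaotic encryption and LSB embedding can be sketched as follows. This is a minimal stand-in, not the paper's CEGAO scheme: a logistic-map keystream XORs the secret bytes before they are hidden in the least significant bits of a flat list of cover pixels, and all parameter values are illustrative.

```python
# Sketch of chaotic encryption + LSB embedding: a logistic-map keystream
# XORs the secret bytes, which are hidden in cover-pixel LSBs.
# Simplified stand-in for the paper's CEGAO scheme; parameters illustrative.
def logistic_keystream(n, x0=0.7, r=3.99):
    out, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)          # logistic map iteration
        out.append(int(x * 256) & 0xFF)
    return out

def embed(cover, secret, x0=0.7):
    """Encrypt secret bytes with the keystream and hide them in LSBs."""
    ks = logistic_keystream(len(secret), x0)
    bits = []
    for byte, k in zip(secret, ks):
        enc = byte ^ k               # chaotic "encryption"
        bits.extend((enc >> i) & 1 for i in range(8))
    assert len(bits) <= len(cover), "cover too small"
    return [(p & ~1) | b for p, b in zip(cover, bits)] + cover[len(bits):]

def extract(stego, n_bytes, x0=0.7):
    """Recover and decrypt n_bytes; x0 acts as the shared key."""
    ks = logistic_keystream(n_bytes, x0)
    out = []
    for j, k in enumerate(ks):
        enc = sum((stego[8 * j + i] & 1) << i for i in range(8))
        out.append(enc ^ k)
    return out
```

Because only LSBs change, each stego pixel differs from the cover by at most one intensity level, which is why such schemes report high PSNR and SSIM against the cover image.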
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-benosman
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Ryad B. Benosman, Professor at the University of Pittsburgh Medical Center, Carnegie Mellon University and Sorbonne Universitas, presents the "What is Neuromorphic Event-based Computer Vision? Sensors, Theory and Applications" tutorial at the May 2018 Embedded Vision Summit.
In this presentation, Benosman introduces neuromorphic, event-based approaches for image sensing and processing. State-of-the-art image sensors suffer from severe limitations imposed by their very principle of operation. These sensors acquire the visual information as a series of "snapshots" recorded at discrete points in time, hence time-quantized at a predetermined frame rate, resulting in limited temporal resolution, low dynamic range and a high degree of redundancy in the acquired data. Nature suggests a different approach: Biological vision systems are driven and controlled by events happening within the scene in view, and not – like conventional image sensors – by artificially created timing and control signals that have no relation to the source of the visual information.
Translating the frameless paradigm of biological vision to artificial imaging systems implies that control over the acquisition of visual information is no longer imposed externally on an array of pixels but rather the decision making is transferred to each individual pixel, which handles its own information individually. Benosman introduces the fundamentals underlying such bio-inspired, event-based image sensing and processing approaches, and explores their strengths and weaknesses. He shows that bio-inspired vision systems have the potential to outperform conventional, frame-based vision acquisition and processing systems and to establish new benchmarks in terms of data compression, dynamic range, temporal resolution and power efficiency in applications such as 3D vision, object tracking, motor control and visual feedback loops, in real-time.
Threshold adaptation and XOR accumulation algorithm for objects detection IJECEIAES
Object detection, tracking and video analysis are vital tasks for intelligent video surveillance systems and computer vision applications. Object detection based on background modelling is a major technique for dynamic object extraction from video streams. This paper presents the threshold adaptation and XOR accumulation (TAXA) algorithm, applied in three systematic stages throughout video sequences. First, noisy background details are continuously calculated, updated and eliminated with hybrid statistical techniques. Second, thresholds are calculated with an effective mean and Gaussian for the detection of object pixels. Third, a novel decision step uses XOR accumulation to extract object pixels from the thresholds accurately. Each stage is presented with practical representations and theoretical explanations. The proposed algorithm was tested on high-resolution video with difficult scenes and lighting conditions. With an average precision of 0.90, memory usage of 6.56%, CPU usage of 20%, and good time performance, the results were superior overall to the major foreground object extraction algorithms in use. In conclusion, compared to other popular OpenCV methods, the proposed TAXA algorithm has excellent detection ability.
1. Machine learning techniques can be applied to 21cm cosmology studies in various ways such as image reconstruction, signal detection, data analysis, simulation, and foreground subtraction.
2. Neural networks can be used to estimate cosmological parameters from 21cm power spectra or directly recover statistics like bubble size distributions from power spectra.
3. Studies have shown neural networks can accurately recover bubble size distributions from 21cm power spectra, even when including thermal noise at SKA sensitivity levels. This avoids information loss from incomplete image reconstruction.
4. Other work has used neural networks to reconstruct hydrogen distribution maps from galaxy surveys, demonstrating the potential of machine learning to connect 21cm signals to astrophysical sources and properties.
This document summarizes a series of lectures on fundamentals of image processing and analysis delivered at Cambridge University's Engineering Department. The lectures covered topics such as digital imaging, point and local operations, frequency domain methods, image segmentation, representation of objects, and morphological operations. The goal was to introduce basic concepts and techniques in digital image processing and computerized image analysis.
This document summarizes a series of lectures on image processing and analysis given at Cambridge University's Engineering Department. The lectures cover topics such as digital imaging, point and local operations, frequency domain methods, image segmentation, and representation of objects. The goal is to introduce fundamental concepts and techniques in image processing and analysis using computers.
This document summarizes key aspects of video compression techniques discussed in Chapter 10. It introduces basic video compression using motion compensation, where differences between frames are encoded to remove temporal redundancy. It describes methods for searching motion vectors, including sequential, logarithmic and hierarchical searches. It also outlines two early video compression standards, H.261 and H.263, noting their use of motion compensation and treatment of intra-frames and inter-frames.
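Motion-vector search by block matching, the common core of the sequential, logarithmic, and hierarchical methods mentioned above, can be sketched as an exhaustive SAD search over a small window; the block size, search range, and frames below are illustrative.

```python
# Sketch: exhaustive (sequential) motion-vector search by block matching
# with a sum-of-absolute-differences cost; logarithmic and hierarchical
# searches speed up this same comparison. Sizes are illustrative.
def sad(ref, cur, rx, ry, cx, cy, B):
    """Sum of absolute differences between a BxB reference and current block."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(B) for i in range(B))

def best_motion_vector(ref, cur, cx, cy, B=2, R=2):
    """Find (dx, dy) within +/-R minimising SAD for the BxB block at (cx, cy)."""
    h, w = len(ref), len(ref[0])
    best = (0, 0, float("inf"))
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - B and 0 <= ry <= h - B:
                cost = sad(ref, cur, rx, ry, cx, cy, B)
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best[:2]
```

The encoder then transmits the motion vector plus the (small) block residual instead of the raw block, which is how motion compensation removes temporal redundancy.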
This document proposes a new method for foreground detection that combines background subspace learning with an object smoothing model. It uses 2D PCA to learn the background subspace and model the foreground as a sparse matrix. An object smoothing model is then applied to refine the foreground by exploiting the spatial clustered property. The method is tested on three public datasets and achieves better F-score performance compared to GMM, KDE and sparse coding methods for foreground detection.
A tale of scale & speed: How the US Navy is enabling software delivery from l... sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Pushing the limits of ePRTC: 100ns holdover for 100 days Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GridMate - End to end testing is a critical piece to ensure quality and avoid regressions ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Epistemic Interaction - tuning interfaces to provide information for AI support Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
DevOps and Testing slides at DASA Connect Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference, 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants tried to find different ways to think about quality and testing across the DevOps infinity loop.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
1. Background Modeling and Foreground Detection for Video Surveillance: Recent Advances and Future Directions
Thierry BOUWMANS
Associate Professor
MIA Lab - University of La Rochelle - France
2. Plan
Introduction
Fuzzy Background Subtraction
Background Subtraction via a Discriminative Subspace Learning: IMMC
Foreground Detection via Robust Principal Component Analysis (RPCA)
Conclusion - Perspectives
3. Goal
Detection of moving objects in video sequences.
Pixels are classified as background (B) or foreground (F).
Sequence PETS 2006: Image 298 (720 x 576 pixels)
4. Background Subtraction Process
Background initialization (batch algorithm, t ≤ N): the first background image is built from the N training images.
Foreground detection (classification task, from t = N+1): the background image is compared with the current image I(t+1) to produce the foreground mask F(t).
Background maintenance (incremental algorithm, t = t+1): the background image is updated from BG(t), the new frame I(t+1) and the foreground mask.
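The three steps above can be sketched in a few lines of Python (a minimal NumPy illustration, not the deck's actual implementation; the median initialization, absolute-difference test and masked running-average update are common textbook choices):

```python
import numpy as np

def init_background(frames):
    """Batch initialization: pixel-wise median over the N training frames."""
    return np.median(np.stack(frames), axis=0)

def detect_foreground(background, frame, threshold=25.0):
    """Foreground mask F(t): |I(t+1) - B(t)| > Th -> 1 (foreground), else 0."""
    return (np.abs(frame.astype(float) - background) > threshold).astype(np.uint8)

def maintain_background(background, frame, mask, alpha=0.05):
    """Incremental maintenance: running average, applied only where the
    foreground mask says 'background' so moving objects are not absorbed."""
    update = (1 - alpha) * background + alpha * frame
    return np.where(mask == 0, update, background)
```

The mask gates the update, which is exactly why the maintenance step needs the detection result as its third input.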
5. Related Applications
Video surveillance
Optical Motion Capture
Multimedia Applications
Sequences Danse and Jump [Mikic 2002] - University of California, San Diego
Aqu@theque project - Université de La Rochelle
ATON project
6. On the importance of the background subtraction
Acquisition -> Background Subtraction -> Processing
False detections at this step affect the following steps: Tracking (video surveillance), Pattern Recognition (multimedia applications), Convex Hull (motion capture).
8. Multimodal Backgrounds
Rippling water (Water Surface), Camera Jitter, Waving Trees
Source: http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html
9. Statistical Background Modeling
Background Subtraction Web Site: references (553), datasets (10) and codes (27).
Source: http://sites.google.com/site/backgroundsubtraction/Home.html (6256 visitors, source: Google Analytics)
10. Plan
Introduction
Fuzzy Background Subtraction
Background Subtraction via a Discriminative Subspace Learning: IMMC
Foreground Detection via Robust Principal Component Analysis (RPCA)
Conclusion - Perspectives
11. Fuzzy Background Subtraction
A survey in the Handbook on Soft Computing for Video Surveillance, Taylor and Francis Group [HSCVS 2012]
Three approaches developed at the MIA Lab:
Background modeling by a Type-2 Fuzzy Mixture of Gaussians Model [ISVC 2008]
Foreground detection using the Choquet Integral [WIAMIS 2008][FUZZ-IEEE 2008]
Fuzzy background maintenance [ICIP 2008]
12. Weakness of the original MOG
1. False detections due to the matching test
(figure: the matching intervals kσ1, kσ2, kσ3 of the three Gaussians)
13. Weakness of the original MOG
2. False detections due to the presence of outliers in the training step
(figure: exact distribution vs. estimated Gaussian; the mean varies between μmin and μmax)
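Both weaknesses stem from the crisp matching test, which can be stated in a couple of lines (a sketch; k = 2.5 is a threshold commonly used with the MOG):

```python
def mog_match(x, mu, sigma, k=2.5):
    """Crisp MOG matching test: x matches a Gaussian component iff it lies
    inside the interval [mu - k*sigma, mu + k*sigma]."""
    return abs(x - mu) <= k * sigma

def first_match(x, components, k=2.5):
    """Index of the first matched (mu, sigma) component, or None if no
    component matches (the pixel is then treated as foreground)."""
    for idx, (mu, sigma) in enumerate(components):
        if mog_match(x, mu, sigma, k):
            return idx
    return None
```

Because only values inside the interval feed the update, the estimated variance drifts, which is the first source of false detections mentioned above.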
14. Mixture of Gaussians with uncertainty on the mean (T2 FMOG-UM) and on the variance (T2 FMOG-UV) [Zeng 2006]
15. Mixture of Gaussians with uncertainty on the mean (T2 FMOG-UM)
Xt,c: intensity vector in the RGB color space
16. Mixture of Gaussians with uncertainty on the variance (T2 FMOG-UV)
Xt,c: intensity vector in the RGB color space
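A minimal sketch of the uncertain-mean case: the mean is only known within an interval, so every observation gets a lower and an upper membership value instead of a single one. The interval half-width km*sigma/2 and the piecewise form below are one common parameterization of the footprint of uncertainty, not necessarily the exact formulas of [Zeng 2006]:

```python
import math

def gaussian(x, mu, sigma):
    """Unnormalized Gaussian membership value."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def t2_um_bounds(x, mu, sigma, km=2.0):
    """Lower and upper membership values for a Gaussian whose mean is only
    known to lie in [mu - km*sigma/2, mu + km*sigma/2]: the upper bound
    plateaus at 1 over the mean interval, the lower bound uses the farthest
    admissible mean."""
    m1, m2 = mu - km * sigma / 2, mu + km * sigma / 2
    if x < m1:
        upper = gaussian(x, m1, sigma)
    elif x > m2:
        upper = gaussian(x, m2, sigma)
    else:
        upper = 1.0
    lower = gaussian(x, m2, sigma) if x <= mu else gaussian(x, m1, sigma)
    return lower, upper
```

The gap between the two bounds is precisely the uncertainty that the crisp MOG ignores.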
18. Results on the "SHAH" dataset (160 x 128 pixels) - Camera Jitter
Video at http://sites.google.com/site/t2fmog/
Original sequence / MOG
T2 FMOG-UM (km=2) / T2 FMOG-UV (kv=0.9)
19. Results on the "SHAH" dataset (160 x 128 pixels) - Camera Jitter

Method       Error  Image  Image  Image  Image  Total   Variation
             Type   271    373    410    465    Error   in %
MOG          FN     0      1120   4818   2050
             FP     2093   4124   2782   1589   18576
T2-FMOG-UM   FN     0      1414   6043   2520
             FP     203    153    252    46     10631   42.77
T2-FMOG-UV   FN     0      957    2217   1069
             FP     3069   1081   1119   1158   10670   42.56
20. Results on the "SHAH" dataset (160 x 128 pixels) - Camera Jitter
[Stauffer 1999]
[Bowden 2001] – Initialization
[Zivkovic 2004] – K is variable
21. Results on the sequence "CAMPUS" (160 x 128 pixels) - Waving Trees
Video at http://sites.google.com/site/t2fmog/
Original sequence / MOG
T2 FMOG-UM (km=2) / T2 FMOG-UV (kv=0.9)
22. Results on the sequence "Water Surface" (160 x 128 pixels) - Water Surface
Video at http://sites.google.com/site/t2fmog/
Original sequence / MOG
T2 FMOG-UM (km=2) / T2 FMOG-UV (kv=0.9)
23. Fuzzy Foreground Detection
Features: color, edge, stereo features, motion features, texture.
Multiple features give more robustness in the presence of illumination changes, shadows and multimodal backgrounds.
24. Choice of the features
Color (3 components)
Texture (Local Binary Pattern [Heikkila - PAMI 2006])
For each feature, a similarity measure (S) is computed from its value in the background image and its value in the current image.
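A minimal sketch of the basic 3x3 LBP operator behind the texture feature (the clockwise bit ordering is a convention chosen here for illustration, not prescribed by the slide):

```python
import numpy as np

def lbp_code(patch):
    """8-bit Local Binary Pattern code of the centre pixel of a 3x3 patch:
    each of the 8 neighbours contributes one bit, set when the neighbour is
    greater than or equal to the centre value."""
    center = patch[1, 1]
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code
```

Since the code depends only on the ordering of neighbours relative to the centre, it is invariant to monotonic illumination changes, which is the property the slide relies on.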
25. Aggregation of the Color and Texture features with the Choquet Integral
From BG(t) and I(t+1), a similarity measure is computed for each color feature (SC,1, SC,2, SC,3) and for the texture feature (ST).
The fuzzy integral aggregates them; the classification B/F then produces the foreground mask.
26. How to compute S for the Color and the Texture?
Let CB,k and CI,k be the values of color component k in the background image and in the current image, and TB and TI the corresponding texture (LBP) values, all between 0 and 255 (k = one of the color components).
For the color:
SC,k = CI,k / CB,k if CI,k < CB,k
SC,k = 1 if CI,k = CB,k
SC,k = CB,k / CI,k if CB,k < CI,k
so that 0 ≤ SC,k ≤ 1. ST is computed in the same way from TB and TI.
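The piecewise definition above boils down to a ratio of the smaller value over the larger one; a sketch:

```python
def similarity(v_bg, v_cur):
    """Ratio similarity between a feature value in the background image and
    in the current image (both in [0, 255]): 1 when identical, decreasing
    towards 0 as the values diverge."""
    if v_bg == v_cur:
        return 1.0  # also covers the 0/0 case
    return min(v_bg, v_cur) / max(v_bg, v_cur)
```

The same function serves for each color component and for the texture code, since all of them live in [0, 255].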
27. Fuzzy operators
Two fuzzy integrals can be used: the Sugeno integral and the Choquet integral.
They handle uncertainty and imprecision, offer great flexibility, and need only fast and simple operations.
The Sugeno integral suits ordinal aggregation; the Choquet integral suits cardinal aggregation.
28. Data Fusion using the Choquet Integral
Fuzzy measures: μ is defined on all the subsets of X = {x1, x2, x3}: {x1}, {x2}, {x3}, {x1, x2}, {x1, x3}, {x2, x3}, {x1, x2, x3}.
Choquet integral: computed with respect to these fuzzy measures.
29. Fuzzy Foreground Detection
Classification using the Choquet integral:
If Cμ(x,y) < Th then (x,y) ∈ Foreground, else (x,y) ∈ Background,
where Th is a constant threshold and Cμ(x,y) is the value of the Choquet integral for the pixel (x,y); a low aggregated similarity means the pixel differs from the background model.
30. Aggregation Color, Texture
Aqu@thèque (384 x 288 pixels) - Ohta color space
a) Current image  b) Ground truth  c) Choquet integral  d) Sugeno integral
Comparison between the Sugeno and Choquet integrals [Zhang 2006]:

Integral     Choquet  Sugeno
Color space  Ohta     Ohta
S(A,B)       0.40     0.27
31. Aggregation Colors, Texture: Ohta, YCrCb, HSV
Aqu@thèque (384 x 288 pixels)
Values of the fuzzy measures μ (x1, x2: color components; x3: texture):

                 {x1}  {x2}  {x3}  {x1,x2}  {x1,x3}  {x2,x3}  X
                 0.6   0.3   0.1   0.9      0.7      0.4      1
                 0.5   0.4   0.1   0.9      0.6      0.5      1
Choquet - Ohta   0.2   0.5   0.3   0.8      0.7      0.5      1
Choquet - YCrCb  0.5   0.39  0.11  0.89     0.61     0.5      1
Choquet - HSV    0.53  0.34  0.13  0.87     0.66     0.47     1

Evaluation of the Choquet integral for different color spaces:

Color space  Ohta  YCrCb  HSV
S(A,B)       0.40  0.42   0.30
33. Aggregation Colors: PETS 2006 (384 x 288 pixels)
Original sequence / Ground truth
Rows: YCrCb, Ohta, HSV - Columns: OR, Sugeno Integral, Choquet Integral
34. Fuzzy Background Maintenance
Non-selective rule: all pixels are updated with the same rule.
Selective rule: here, the idea is to adapt very quickly a pixel classified as background and very slowly a pixel classified as foreground.
35. Fuzzy adaptive rule
The update rules of the selective scheme are combined and graduated by the result of the Choquet integral, so that the uncertainty of the classification is taken into account.
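A sketch of how the selective rules can be graduated by the Choquet result (the exact combination in [ICIP 2008] may differ; here the background learning rate is simply scaled by Cμ, treating a high aggregated similarity as a confident background classification):

```python
def fuzzy_update(b, i_new, c_mu, alpha=0.05):
    """Fuzzy adaptive maintenance: B(t+1) = (1 - rho)*B(t) + rho*I(t+1) with
    rho = alpha * c_mu, so a pixel confidently classified as background
    (c_mu near 1) adapts quickly while a likely-foreground pixel
    (c_mu near 0) barely moves."""
    rho = alpha * c_mu
    return (1 - rho) * b + rho * i_new
```

Because rho varies continuously with Cμ, a single misclassified frame no longer locks the pixel into the wrong update rule, which is the weakness of the crisp selective scheme.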
36. Results on the Wallflower dataset
Sequence Time of Day
Original Image 1850 / Ground Truth
No selective rule / Selective rule / Fuzzy adaptive rule
Similarity measure:

           No selective  Selective  Fuzzy adaptive
S(A,B) %   58.40         57.08      58.96
37. Computation Time

Algorithm         Frames/Second
T2-FMOG-UM        11
T2-FMOG-UV        12
MOG               20
Choquet integral  31
Sugeno integral   22
OR                40

Resolution 384 x 288, RGB, Pentium 1.66 GHz, 1 GB RAM
38. Assessments - Perspectives
Fuzzy Background Modeling by T2-FMOG: multimodal backgrounds
- Using fuzzy approaches in other statistical models
Fuzzy Foreground Detection using multi-features:
- Using more than two features
- Obtaining the fuzzy measures by learning
Fuzzy Background Maintenance
39. Plan
Introduction
Fuzzy Background Subtraction
Background Subtraction via a Discriminative Subspace Learning: IMMC
Foreground Detection via Robust Principal Component Analysis (RPCA)
Conclusion - Perspectives
40. Background Modeling and Foreground Detection via a Discriminative Subspace Learning (MIA Lab)
Reconstructive subspace learning models (PCA, ICA, IRT) [RPCS 2009]
Assumption: the main information contained in the training sequence is the background, meaning that the foreground has a low contribution.
However, this assumption only holds when the moving objects are either small or far away from the camera.
41. Discriminative Subspace Learning
Advantages:
More efficient, and often gives better classification results.
Robust supervised initialization of the background.
Incremental update of the eigenvectors and eigenvalues.
Approach developed at the MIA Lab:
Background initialization via MMC [MVA 2012]
Background maintenance via Incremental Maximum Margin Criterion (IMMC) [MVA 2012]
42. Background Subtraction via Incremental Maximum Margin Criterion
Denote the training video sequence S = {I1, ..., IN}, where It is the frame at time t and N is the number of training frames.
Let each pixel (x,y) be characterized by its intensity in the grey scale, and assume that we have the ground truth corresponding to this training video sequence, i.e. we know for each pixel its class label, which can be foreground or background.
43. Background Subtraction via Incremental Maximum Margin Criterion
Thus, we compute the inter-class scatter matrix Sb and the intra-class scatter matrix Sw:
Sb = Σi pi (Ii − I)(Ii − I)ᵀ
Sw = Σi pi E[(I(x,y) − Ii)(I(x,y) − Ii)ᵀ | class i]
where c = 2 is the number of classes,
I is the mean of the intensity of the pixel (x,y) over the training video,
Ii is the mean of the samples belonging to class i,
pi is the prior probability of a sample belonging to class i (Background, Foreground).
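For this two-class case the scatter matrices can be sketched directly from their definitions (a minimal illustration; samples are feature row vectors and the priors pi are estimated from the sample counts):

```python
import numpy as np

def scatter_matrices(samples, labels):
    """Inter-class (Sb) and intra-class (Sw) scatter matrices for the c = 2
    classes (background/foreground); priors pi come from sample counts."""
    X = np.asarray(samples, dtype=float)
    y = np.asarray(labels)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in sorted(set(labels)):
        Xc = X[y == c]
        p = len(Xc) / len(X)                             # prior probability pi
        diff = (Xc.mean(axis=0) - mean_all).reshape(-1, 1)
        Sb += p * diff @ diff.T                          # between-class scatter
        Sw += p * np.cov(Xc.T, bias=True).reshape(d, d)  # within-class scatter
    return Sb, Sw
```

The Maximum Margin Criterion then seeks the projection directions maximizing tr(Wᵀ(Sb − Sw)W), which avoids the matrix inversion that LDA needs.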
44. Background Subtraction via Incremental Maximum Margin Criterion
Batch Maximum Margin Criterion algorithm:
Extract the first leading eigenvectors, which correspond to the background. The corresponding eigenvalues are contained in the matrix ΛM and the leading eigenvectors in the matrix ΦM.
The current image It can then be approximated by the mean background plus a weighted sum of the leading eigenbackgrounds ΦM.
45. Background Subtraction via Incremental Maximum Margin Criterion
The coordinates of the current image It in the leading eigenbackground space are computed as:
wt = ΦMᵀ (It − μB)
where μB is the mean background. When wt is back-projected onto the image space, the background image is created:
Bt = μB + ΦM wt
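Assuming ΦM has orthonormal columns and frames are flattened to vectors, the projection and back-projection are two matrix products (a sketch; the names are illustrative):

```python
import numpy as np

def project_and_reconstruct(frame, mean_bg, Phi):
    """Project the flattened current frame onto the leading eigenbackground
    space, w = Phi^T (I - mean), then back-project to get the background."""
    w = Phi.T @ (frame - mean_bg)
    background = mean_bg + Phi @ w
    return w, background
```

Whatever the eigenbackgrounds cannot represent, such as a moving object, is left in the residual |frame − background|, which is what the foreground detection thresholds.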
46. Background Subtraction via Incremental Maximum Margin Criterion
Foreground detection
Background maintenance via IMMC
48. Results on the Wallflower dataset
Original image, ground truth, SG, MOG, KDE, PCA, INMF, IRT, IMMC (30), IMMC (100)
49. Assessments - Perspectives
Advantages:
Robust supervised initialization of the background.
Incremental update of the eigenvectors and eigenvalues.
Disadvantages:
Needs ground truth in the training step.
Perspective: other discriminative subspace learning methods such as LDA.
50. Plan
Introduction
Fuzzy Background Subtraction
Background Subtraction via a Discriminative Subspace Learning: IMMC
Foreground Detection via Robust Principal Component Analysis (RPCA)
Conclusion - Perspectives
51. Foreground Detection via Robust Principal Component Analysis
PCA (Oliver et al. 1999): not robust to outliers.
Robust PCA (Candes et al. 2011): decomposition into low-rank and sparse matrices.
Approach developed at the MIA Lab:
Validation [ICIP 2012][ICIAR 2012][ISVC 2012]
RPCA via Iterative Reweighted Least Squares [BMC 2012]
52. Robust Principal Component Analysis
Candes et al. (ACM 2011) proposed a convex optimization to address the robust PCA problem. The observation matrix A is assumed to be represented as
A = L + S
where L is a low-rank matrix and S is a sparse matrix with a small fraction of nonzero entries.
http://perception.csl.illinois.edu/matrix-rank/home.html
53. Robust Principal Component Analysis
This approach solves for L via the following optimization problem:
min ||L||* + λ||S||1 subject to A = L + S
where ||.||* and ||.||1 are the nuclear norm (the l1-norm of the singular values) and the l1-norm, respectively, and λ > 0 is a balancing parameter.
Under minimal assumptions, this approach, called Principal Component Pursuit (PCP), perfectly recovers the low-rank and the sparse matrices.
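The building blocks of every PCP solver are the proximal operators of the two norms: singular value thresholding for the nuclear norm and soft thresholding for the l1 norm. Below is a toy alternating scheme built from them; it minimizes the relaxed penalty mu*||L||* + lam*mu*||S||1 + 0.5*||A − L − S||F², not the exact constrained PCP solved by the ALM/ADM algorithms, and the heuristic for mu is an assumption made here for illustration:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Soft thresholding: proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def pcp_sketch(A, lam=None, mu=None, n_iter=100):
    """Toy alternating minimization separating A into low-rank L + sparse S."""
    m, n = A.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 * np.abs(A).mean()
    L = np.zeros_like(A)
    S = np.zeros_like(A)
    for _ in range(n_iter):
        L = svt(A - S, mu)           # low-rank update
        S = shrink(A - L, lam * mu)  # sparse update
    return L, S
```

On a surveillance matrix whose columns are vectorized frames, L recovers the (low-rank) background and S the (sparse) moving objects, which is exactly the application discussed next.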
54. Algorithms for solving PCP
Time required to solve a 1000 x 1000 (10^6-entry) RPCA problem:

Algorithm  Accuracy   Rank  ||E||_0  # iterations  time (sec)
IT         5.99e-006  50    101,268  8,550         119,370.3
DUAL       8.65e-006  50    100,024  822           1,855.4
APG        5.85e-006  50    100,347  134           1,468.9
APGP       5.91e-006  50    100,347  134           82.7
ALMP       2.07e-007  50    100,014  34            37.5
ADMP       3.83e-007  50    99,996   23            11.8

About a 10,000x speedup from IT to ADMP!
Source: Z. Lin, Y. Ma, "The Pursuit of Low-dimensional Structures in High-dimensional (Visual) Data: Fast and Scalable Algorithms"
The time required is acceptable for ADM in general, but is it for background modeling and foreground detection?
55. Application to Background Modeling and Foreground Detection
n is the number of pixels in a frame (10^6)
m is the number of frames considered (200)
The computation time is 200 x 12 s = 40 minutes!
Source: http://perception.csl.illinois.edu/matrix-rank/home.html
56. PCP and its application to Background Modeling and Foreground Detection
Only visual validations are provided!
Limitations:
Spatio-temporal aspect: none!
Real-time aspect: PCP takes 40 minutes with the ADM!
Incremental aspect: PCP is a batch algorithm; for example, (Candes et al. 2011) collected 200 images.
57. PCP and its variants
How to improve PCP?
Algorithms for solving PCP (17 algorithms)
Incremental PCP (5 papers)
Real-time PCP (2 papers)
Validation for background modeling and foreground detection (3 papers) [ICIP 2012][ICIAR 2012][ISVC 2012]
Source: T. Bouwmans, Foreground Detection using Principal Component Pursuit: A Survey, under preparation.
58. PCP and its variants
Source: T. Bouwmans, Foreground Detection using Principal Component Pursuit: A Survey, under preparation.
59. Validation Background Modeling and Foreground Detection: Qualitative Evaluation
Original image
Ground truth
PCA
RSL
PCP-EALM
PCP-IADM
PCP-LADM
PCP-LSADM
BPCP-IALM
Source: ICIP 2012, ICIAR 2012, ISVC 2012
60. Validation Background Modeling and Foreground Detection: Quantitative Evaluation
F-Measure: Block PCP gives the best performance!
Source: ICIP 2012, ICIAR 2012, ISVC 2012
61. PCP and its application to Background Modeling and Foreground Detection
Recent improvements:
BPCP (Tang and Nehorai (2012)): spatial, but not incremental and not real time!
Recursive Robust PCP (Qiu and Vaswani (2012)): incremental, but not real time!
Real-time implementation on GPU (Anderson et al. (2012)): real time, but not incremental!
What can we do?
Research on real-time incremental robust PCP!
62. Conclusions - Perspectives
Fuzzy Background Subtraction
Background Subtraction via a Discriminative Subspace Learning: IMMC
Foreground Detection via Robust Principal Component Analysis (RPCA)
Perspectives:
Fuzzy learning rate
Other discriminative subspace learning methods such as LDA
Incremental and real-time RPCA
63. Publications
Chapter - Fuzzy Background Subtraction:
T. Bouwmans, "Background Subtraction For Visual Surveillance: A Fuzzy Approach", Handbook on Soft Computing for Video Surveillance, Taylor and Francis Group, Chapter 5, March 2012.
International Conferences:
F. El Baf, T. Bouwmans, B. Vachon, "Fuzzy Statistical Modeling of Dynamic Backgrounds for Moving Object Detection in Infrared Videos", CVPR 2009 Workshop, pages 1-6, Miami, USA, 22 June 2009.
F. El Baf, T. Bouwmans, B. Vachon, "Type-2 Fuzzy Mixture of Gaussians Model: Application to Background Modeling", ISVC 2008, pages 772-781, Las Vegas, USA, December 2008.
F. El Baf, T. Bouwmans, B. Vachon, "A Fuzzy Approach for Background Subtraction", ICIP 2008, San Diego, California, USA, October 2008.
F. El Baf, T. Bouwmans, B. Vachon, "Fuzzy Integral for Moving Object Detection", FUZZ-IEEE 2008, Hong Kong, China, June 2008.
F. El Baf, T. Bouwmans, B. Vachon, "Fuzzy Foreground Detection for Infrared Videos", CVPR 2008 Workshop, pages 1-6, Anchorage, Alaska, USA, 27 June 2008.
F. El Baf, T. Bouwmans, B. Vachon, "Foreground Detection using the Choquet Integral", International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS 2008, pages 187-190, Klagenfurt, Austria, May 2008.
64. Publications
Background Subtraction via IMMC
Journal:
D. Farcas, C. Marghes, T. Bouwmans, "Background Subtraction via Incremental Maximum Margin Criterion: A Discriminative Approach", Machine Vision and Applications, March 2012.
International Conferences:
C. Marghes, T. Bouwmans, "Background Modeling via Incremental Maximum Margin Criterion", International Workshop on Subspace Methods, ACCV 2010 Workshop Subspace 2010, Queenstown, New Zealand, November 2010.
D. Farcas, T. Bouwmans, "Background Modeling via a Supervised Subspace Learning", International Conference on Image, Video Processing and Computer Vision, IVPCV 2010, pages 1-7, Orlando, USA, July 2010.
65. Publications
Chapter - Foreground Detection via RPCA:
C. Guyon, T. Bouwmans, E. Zahzah, "Robust Principal Component Analysis for Background Subtraction: Systematic Evaluation and Comparative Analysis", INTECH, Principal Component Analysis, Book 1, Chapter 12, pages 223-238, March 2012.
International Conferences:
C. Guyon, T. Bouwmans, E. Zahzah, "Foreground Detection via Robust Low Rank Matrix Factorization including Spatial Constraint with Iterative Reweighted Regression", International Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 2012.
C. Guyon, T. Bouwmans, E. Zahzah, "Moving Object Detection via Robust Low Rank Matrix Decomposition with IRLS scheme", International Symposium on Visual Computing, ISVC 2012, pages 665-674, Rethymnon, Crete, Greece, July 2012.
C. Guyon, T. Bouwmans, E. Zahzah, "Moving Object Detection by Robust PCA solved via a Linearized Symmetric Alternating Direction Method", International Symposium on Visual Computing, ISVC 2012, pages 427-436, Rethymnon, Crete, Greece, July 2012.
C. Guyon, T. Bouwmans, E. Zahzah, "Foreground Detection by Robust PCA solved via a Linearized Alternating Direction Method", International Conference on Image Analysis and Recognition, ICIAR 2012, pages 115-122, Aveiro, Portugal, June 2012.
C. Guyon, T. Bouwmans, E. Zahzah, "Foreground detection based on low-rank and block-sparse matrix decomposition", IEEE International Conference on Image Processing, ICIP 2012, Orlando, Florida, September 2012.
Editor's Notes
Fida EL BAF: My name is Thierry Bouwmans. My talk is about recent advances and future directions for background modeling and foreground detection. I will particularly focus on the methods that I have developed at the MIA Lab over the last five years.
First, I will introduce the main challenges in background modeling and foreground detection. Then, I will present the three main approaches that I developed at my lab using fuzzy tools, discriminative subspace learning and recent advances in robust PCA. Finally, I will conclude with some perspectives.
The goal of background modeling and foreground detection is to detect moving objects in video sequences. For this, pixels need to be classified as background or foreground, as can be seen in the picture: white pixels correspond to foreground and black pixels to background.
This classification is usually achieved by the background subtraction process, which is defined by three main steps: (1) the background initialization, which generates the first background image from N images; (2) the foreground detection, which compares the background image and the new current image at time N+1 and decides whether each pixel corresponds to foreground or background by thresholding the decision rule; (3) the background maintenance, which updates the background image with the recent changes that can occur in the scene. This is why we need to update the background image with the arrival of each new frame; for that, three inputs are used: the background BG(t), the new frame I(t+1) and the foreground mask. It is important to note that the training step may be a batch task, the foreground detection is a classification task, and the background maintenance needs an incremental algorithm.
The related applications are the following: (1) video surveillance, to detect cars and track them; (2) optical motion capture, to detect silhouettes and construct an avatar; (3) multimedia applications such as Aqu@theque, developed at La Rochelle, where we need to detect fish in a tank with moving algae and challenging illumination changes.
The first step of many video analysis systems is the segmentation of the foreground objects from the background. False detections at this step affect the following steps: tracking for video surveillance, pattern recognition for multimedia applications such as Aqu@theque, and convex hull computation for motion capture.
What are the challenges for such a system? We recall that the goal is to classify pixels as foreground or background, but structural background changes, illumination changes or shadows can generate a false classification, as we can see in this picture.
Multimodal backgrounds are the most challenging ones. We can see in these pictures some examples and the resulting false detections. Many algorithms have been developed to deal with these challenges.
Statistical background modeling has attracted much attention. These models can be categorized as follows: Gaussian models, support vector models and subspace learning models. Gaussian models are more adaptable to dynamic backgrounds, whereas subspace learning models are better suited to illumination changes. However, none of these background models can handle dynamic backgrounds and illumination changes correctly. More information is available at the Background Subtraction Web Site, where you can find references, links to codes and links to datasets.
Fida EL BAF Here, we can see how we can generate uncertainty on the mean and the variance. They vary within intervals with uniform possibilities The shaded region is the footprint of uncertainty (FOU) The thick solid and dashed lines denote the lower and upper membership functions.
Fida EL BAF Here, we can see the distribution with the uncertainty on the mean with X which is the intensity vector in the red green blue color space.
Fida EL BAF Here, we can see the distribution with the uncertainty on the variance. For these two cases, the learning and update steps are similar to the original MOG except that we introduce uncertainty with km and kv.
Fida EL BAF For the foreground detection, the matching test is different. The measure H is used to measure the uncertainty related to X. This measure is then threholded to obtain the foreground mask. This measure avoid the first weakness of the mixture of gaussians.
Fida EL BAF Here, we present some results obtained by this fuzzy approach. The best results were obtained with the values km=2 and kv=0.9. We can see that we have less false positive with the fuzzy approach.
Fida EL BAF This fact is confirmed using false negative and false positive. The fuzzy approach outperform the original one.
Fida EL BAF We have tested this fuzzy approach on two others variants proposed by Bowden and Zivkovic. We can see that in each case the results are improved.
Fida EL BAF These results show the robustness of the proposed algorithm against waving trees.
Fida EL BAF These results show the robustness of T2 FMOG-UM against water surfaces. So, fuzzy approach is pertinent for background modeling. Now, we will see how we can use fuzzy tools for foreground detection.
Fida EL BAF The features commonly used to compare the background and the current image, are color, edge, stereo, and texture ones. These features have different properties which allow to handle differently the critical situations like the illumination changes, motion changes, structure background changes. In general, they are used separately and the most used is the color one but the use of more than one feature can improved the results.
Color features are often very discriminative features of objects, but they have several limitations in the presence of illumination changes, camouflage and shadows. Background subtraction methods that rely on color information will most probably fail to correctly detect moving objects whose color is similar between the background and the foreground. To solve these problems, some authors proposed to use other features, like edge, texture and stereo, in addition to the color features. In our work we adopted the same scheme, but which features should be chosen? For example, stereo deals with camouflage, but two cameras are needed; edge handles local illumination changes and the ghosts left when waking foreground objects begin to move; texture is appropriate for illumination changes and shadows, which are a main challenge in our work. So, in addition to the intensity of each color component, we chose to use texture information when modeling the background, and the Local Binary Pattern developed by Heikkila was selected as the texture measure because of its robustness to illumination changes and shadows. On the other hand, the proposed features are very fast to compute, which is an important property from the practical implementation point of view. Now that the features are chosen, how do we integrate the information they hold to detect foreground objects? In general, a simple subtraction is made between the current and the background images to detect regions corresponding to foreground. Another way to establish this comparison consists in defining a similarity measure between pixels at the same location in the current and background images: pixels corresponding to background should be similar, while those corresponding to foreground should not.
In the literature, fuzzy integrals have been successfully and widely applied to classification problems. In the context of foreground detection, these integrals seem to be good candidates for fusing sources obtained from different features. A pixel can be evaluated based on criteria or sources providing information about its state, whether it corresponds to background or foreground: the more criteria provide information about the pixel, the more relevant the decision on the pixel's state.
Here I explain how to compute the similarity measure for color and for texture. We have the background image and the current frame. For each pixel (see the pixel marked in red), after the extraction of the intensity of each color component and the LBP code for the texture feature, the similarity measure for texture is obtained as the ratio of the texture value in the background image and the texture value in the current image, so as to always have a value between zero and one. The similarity measure for the color features is computed in the same way. Note that the value of the LBP code and the value of the color intensity are between 0 and 255.
There are two fuzzy integrals that can be used to fuse the features: the Sugeno integral and the Choquet integral. They allow us to deal with uncertainty and imprecision, offer great flexibility, and can be computed with fast and simple operations. The Choquet integral is adapted for cardinal aggregation, while the Sugeno integral is more suitable for ordinal aggregation; so, the Choquet integral is well suited for foreground detection.
Some color spaces allow the chrominance components to be separated from the luminance. For the chosen color space, two components x1 and x2 are selected according to the relevant information they contain, so as to have the least sensitivity to illumination changes. For texture, x3 indicates the value of the texture feature obtained by the LBP code. With each criterion, we associate a fuzzy measure μ(x1), μ(x2) and μ(x3), where μ(xi) is the degree of importance of the feature xi in the decision whether the pixel corresponds to background or foreground: the higher μ(xi), the more important the corresponding criterion in the decision. To simplify the computation, a λ fuzzy measure (additive) is used to compute the fuzzy measure of all subsets of criteria. By experimentation, the best results are obtained with the last given measures.
Fida EL BAF The foreground detection is achieved by the following classification. The results of the Choquet integral are thresholded.
Fida EL BAF Aquatheque dataset is a system dedicated to aquariums to detect and identify fish in a tank. The goal is to provide some educational information about the selected fish by the user. When testing our algorithm on this datatset, where the illumination conditions are uncontroled, we have obtained this result with Ohta color space. When comparing our algorithm with a similar approach using Sugeno integral in presence of Ohta color space developed by Zhang, the result shows an improvement based on visual interpretation. Numerical evaluation is usually done in terms of false negative (number of foreground pixels that we have missed) and false positive (the number of background pixels that we have marked as foreground). The ground truth is achieved manually. Firstly, we show a quantitative evaluation with respect to the measure derived by Li [33] which compare the detected region and the corresponding ground truth, so as this quantity approaches 1 when these 2 regions are similar, and 0 when they have the least similarity. It is well identified that optimum results are obtained by the Choquet integral. To see the progression of the performance of both algorithms, we drew up the ROC Curve. The overall performance of our algorithm seems to be better than the performance of the compared method of the test sequences used. The area under the curve confirms the result.
Fida EL BAF At the same time, we have tried to test other color space like the YCrCb and the HSV with our algorithm. Furthermore, the Ohta and the YCrCb spaces give almost similar results (SOhta = 0,40; SYCrCb = 0,42), when the HSV space registers (SHSV = 0,30). When observing the effect of YCrCb and Ohta spaces on the images, we have noticed that the YCrCb is slightly better than the Ohta space.
Fida EL BAF Some other results in video sport and video surveillance Applications. For each datasets, we provide a comparison with the method proposed by Zhang. The silhouettes are better detected and the illumination variations on the white border are less detected using our method. Here again the algorithm shows a robustness to illumination changes and shadows.
Fida EL BAF Blind background maintenance consists in updating all the pixels with the same rule. The drawback of this scheme is that the values of pixels classified as foreground are taken into account in the computation of the new background, and so they pollute the background image. To solve this problem, some authors use a selective maintenance scheme, which computes the new background image with a different learning rate depending on the pixel's previous classification as foreground or background, as follows. Here, the idea is to adapt a pixel classified as background very quickly and a pixel classified as foreground very slowly. But the problem is that erroneous classification results may make the background model permanently incorrect.
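As a sketch, both schemes can be written as per-pixel running averages; in the selective one the learning rate depends on the crisp classification (the two rate values below are assumptions for illustration, not the values used in the talk):

```python
import numpy as np

ALPHA_BG = 0.05   # fast adaptation for background pixels (assumed value)
ALPHA_FG = 0.001  # slow adaptation for foreground pixels (assumed value)

def blind_update(background, frame, alpha=ALPHA_BG):
    """Blind maintenance: every pixel is updated with the same rule."""
    return (1.0 - alpha) * background + alpha * frame

def selective_update(background, frame, fg_mask):
    """Selective maintenance: the learning rate of each pixel depends on
    its previous crisp classification as foreground or background."""
    alpha = np.where(fg_mask, ALPHA_FG, ALPHA_BG)
    return (1.0 - alpha) * background + alpha * frame
```

A misclassified pixel keeps the slow foreground rate indefinitely, which is exactly the failure mode described above.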
Fida EL BAF The drawback of selective maintenance is mainly due to the crisp decision, which applies a different rule depending on the classification as background or foreground. To solve this problem, we propose to take into account the uncertainty of the classification. This can be done by graduating the update rule using the result of the Choquet integral, as follows.
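A sketch of the idea: a discrete Choquet integral aggregates the per-pixel similarity values, and its result, rather than a crisp decision, graduates the learning rate (the maximum rate and the fuzzy measure used below are illustrative assumptions):

```python
import numpy as np

def choquet(values, mu):
    """Discrete Choquet integral of criterion values with respect to a
    fuzzy measure mu, given as a dict mapping frozensets of criterion
    indices to weights in [0, 1]."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        total += (values[i] - prev) * mu[frozenset(order[k:])]
        prev = values[i]
    return total

ALPHA_MAX = 0.05  # assumed maximum learning rate

def fuzzy_update(background, frame, confidence):
    """Fuzzy adaptive maintenance: the per-pixel learning rate is
    graduated by the Choquet integral value in [0, 1] (high confidence
    that a pixel is background means faster adaptation)."""
    alpha = ALPHA_MAX * confidence
    return (1.0 - alpha) * background + alpha * frame
```

With confidence 1 the pixel adapts at the full rate; with confidence 0 the background value is left untouched, and intermediate values interpolate smoothly instead of switching between two crisp rules.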
Fida EL BAF This experiment shows the evaluation of the different update rules on the previous experiments. The fuzzy adaptive scheme is only slightly better than the other update rules from the quantitative point of view, but it shows a clear improvement based on visual interpretation.
Fida EL BAF Here, you can see some computation times for the fuzzy approach. The speeds are still acceptable, and they can be further improved by a GPU implementation.
Fida EL BAF So, fuzzy tools have been applied with success to background modeling, foreground detection, and background maintenance. Future work may concern using fuzzy approaches in other statistical models, using more than two features for the foreground detection, and a more adaptive learning rate.
Fida EL BAF Now, I will present how discriminative subspace learning can be used for background subtraction.
Fida EL BAF Reconstructive subspace learning models, such as principal component analysis (PCA), have mainly been used to model the background by significantly reducing the dimension of the data. Reconstructive representations strive to be as informative as possible in terms of approximating the original data well. Their objective is mainly to encompass the variability of the training data, so they put more effort into modeling the background in an unsupervised manner than into precisely classifying pixels as foreground or background during foreground detection.
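For instance, the classic reconstructive scheme (the eigenbackground idea introduced by Oliver et al.) can be sketched as follows; the threshold value and the helper names are illustrative assumptions:

```python
import numpy as np

def eigenbackground(frames, k):
    """Reconstructive sketch: PCA on vectorized training frames. A new
    frame is projected onto the top-k eigenvectors; large reconstruction
    residuals flag foreground pixels."""
    X = np.stack([f.ravel() for f in frames], axis=1).astype(float)
    mean = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mean, full_matrices=False)
    Uk = U[:, :k]

    def detect(frame, threshold=50.0):
        x = frame.ravel().astype(float)[:, None] - mean
        residual = x - Uk @ (Uk.T @ x)  # part not explained by the subspace
        return (np.abs(residual) > threshold).reshape(frame.shape)

    return detect
```

Note that this model is trained without labels: nothing in it is optimized for the background/foreground classification itself, which is the limitation that motivates a discriminative approach.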
Fida EL BAF On the other hand, discriminative methods are usually less suited to reconstructing the data, but they are spatially and computationally much more efficient and often give better classification results than reconstructive methods. So, we propose the use of a discriminative subspace learning model called incremental maximum margin criterion (IMMC). The objective is, first, to enable a robust supervised initialization of the background and, second, a robust classification of pixels as background or foreground. Furthermore, IMMC also allows an incremental update of the eigenvectors and eigenvalues.
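The batch counterpart of IMMC, the maximum margin criterion, projects the data onto the top eigenvectors of Sb - Sw, the between-class scatter minus the within-class scatter. A minimal illustrative sketch (function name and test data are assumptions):

```python
import numpy as np

def mmc_projection(X, y, k):
    """Batch Maximum Margin Criterion: return the top-k eigenvectors of
    Sb - Sw, where Sb and Sw are the between-class and within-class
    scatter matrices of the labeled samples (rows of X)."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        Sb += len(Xc) * diff @ diff.T              # between-class scatter
        Sw += (Xc - Xc.mean(axis=0)).T @ (Xc - Xc.mean(axis=0))  # within
    vals, vecs = np.linalg.eigh(Sb - Sw)           # symmetric matrix
    return vecs[:, np.argsort(vals)[::-1][:k]]     # top-k eigenvectors
```

IMMC updates the same eigenvectors incrementally from streaming samples instead of recomputing the scatter matrices on the whole training set.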
Fida EL BAF Here, on the first line, are the current images. Then, we can see the images that correspond to the background and foreground classes, the background image, and the foreground mask. Note that only the images which correspond to the background class are used to obtain the background image.
Fida EL BAF Here, we present results on the Wallflower dataset. We can see that the proposed method outperforms the Gaussian models and reconstructive subspace learning.
Fida EL BAF So, discriminative approaches allow us to have a robust supervised initialization of the background and an incremental update of the eigenvectors and eigenvalues. The drawback is that the method needs ground-truth images for the training step. For future research, other discriminative subspaces can be used.
Fida EL BAF Now, I will present how recent advances in robust principal component analysis can be used for foreground detection.
Fida EL BAF The first method that used PCA for background modeling and foreground detection is the one proposed by Oliver et al., but this method presents several limitations and is not robust in the presence of outliers. Recent advances in robust PCA, which decompose the data matrix into a low-rank matrix and a sparse matrix, provide a nice framework to separate the moving objects from the background. At the MIA Lab, we first evaluated this method and its variants. Then, we developed an RPCA method based on Iteratively Reweighted Least Squares.
Fida EL BAF In the picture, we can see the observation and how it can be decomposed into low-rank and sparse parts. The low-rank matrix is clean and the sparse matrix contains the noise. Here, we can see the main assumption made in this method: the noise has to be uniformly distributed, which is not the case for the moving objects in background modeling and foreground detection.
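Such a decomposition can be computed, for example, with the classic inexact-ALM iteration for Principal Component Pursuit. This is an illustrative sketch, not the exact solver discussed in the talk; the defaults follow the usual choices lambda = 1/sqrt(max(m, n)) and mu proportional to m*n/||M||_1:

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca_pcp(M, lam=None, mu=None, n_iter=500):
    """Minimal inexact-ALM sketch of Principal Component Pursuit:
    minimize ||L||_* + lam * ||S||_1  subject to  M = L + S."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / np.abs(M).sum()
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)        # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)     # sparse update
        Y += mu * (M - L - S)                    # dual (multiplier) update
        if np.linalg.norm(M - L - S) <= 1e-7 * np.linalg.norm(M):
            break
    return L, S
```

The recovery guarantee behind this formulation assumes the sparse corruptions are spread roughly uniformly over the matrix, which is precisely the assumption that moving objects violate.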
Fida EL BAF Time requirement is a key point in real-time applications such as background modeling. Here, we can see the time required by different solvers. For the Alternating Direction Method, the time is still acceptable!
Fida EL BAF When we apply this method directly to background modeling and foreground detection, the amount of data is much larger: here, two thousand times larger than in the previous example. The computation time then becomes very expensive (forty minutes). On the left of the picture, we can see that the training images are stacked as columns in the observation matrix, so the spatial information is lost. On the right, we can see the decomposition: the low-rank part corresponds to the background and the sparse part to the foreground objects.
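The stacking of frames into the observation matrix, and the recovery of per-frame foreground masks from the sparse part, can be sketched as follows (the function names and the threshold value are assumptions for illustration):

```python
import numpy as np

def stack_frames(frames):
    """Vectorize each h x w grayscale frame into one column of the
    observation matrix; the spatial layout is lost in this step."""
    return np.stack([f.ravel() for f in frames], axis=1)

def masks_from_sparse(S, frame_shape, threshold=25.0):
    """Threshold the magnitude of the sparse part column by column to
    recover a binary foreground mask for each frame."""
    return [np.abs(S[:, i]).reshape(frame_shape) > threshold
            for i in range(S.shape[1])]
```

For n frames of p pixels the observation matrix is p x n, which explains why the SVD-based solvers quickly become too expensive on full-resolution video.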
Fida EL BAF So, the main drawbacks of PCP are that 1) only qualitative results are shown, 2) it is not real time, and 3) PCP is a batch algorithm.
Fida EL BAF There are several variants of PCP, as shown in the table. Stable PCP allows for the presence of noise by introducing a third term and using a different constraint. QPCP takes into account the quantization of the pixels, to allow RPCA on real data. Block PCP deals with entry-wise outliers by using a combined norm. Local PCP deals with multimodal issues. A complete analysis will be provided in the following paper.
Fida EL BAF First, we made several quantitative evaluations on the Wallflower dataset. PCA is the method developed by Oliver et al. RSL is a robust PCA, but it does not decompose the observation into two matrices as PCP does. The other algorithms are PCP solved by different solvers, and finally Block PCP.
Fida EL BAF Here, we can see the F-measure for each method. Block PCP outperforms the other ones.
Fida EL BAF Recent advances have been made, such as the following.
Fida EL BAF Fuzzy tools, discriminative subspaces, and robust PCA offer a nice framework for background modeling and foreground detection. However, they need to be investigated and improved to achieve better performance. For example, future directions may concern fuzzy learning rates, the use of other discriminative subspaces, and an incremental, real-time robust PCA.