This document presents a physical approach to detecting moving cast shadows in video. It introduces a physics-based shadow model that decomposes light sources into direct and ambient components. Color features are used to encode the difference between shadow and background pixels. A weak shadow detector is used to identify shadow candidates, and a Gaussian mixture model learns the shadow model over time. Spatial information is incorporated to improve learning. The approach detects shadows at light/shadow borders separately. Experimental results on various sequences demonstrate improved shadow detection and discrimination rates compared to other methods. Future work will derive physics-based features for a global shadow model and extend the physical model to more complex cases.
Estimating Human Pose from Occluded Images (ACCV 2009) - Jia-Bin Huang
We address the problem of recovering 3D human pose from single 2D images, in which the pose estimation problem is formulated as a direct nonlinear regression from image observations to 3D joint positions. One key issue that has not been addressed in the literature is how to estimate 3D pose when the humans in the scene are partially or heavily occluded. When occlusions occur, features extracted from the image observations (e.g., silhouette-based shape features, histograms of oriented gradients, etc.) are seriously corrupted, and consequently the regressor (trained on un-occluded images) is unable to estimate the pose states correctly. In this paper, we present a method that handles occlusions using sparse signal representations, in which each test sample is represented as a compact linear combination of training samples. The sparsest solution can be obtained efficiently by solving a convex optimization problem with certain norms (such as the l1-norm). The corrupted test image can be recovered as a sparse linear combination of un-occluded training images, which can then be used to estimate the human pose correctly (as if no occlusions existed). We also show that the proposed approach implicitly performs relevant feature selection with un-occluded test images. Experimental results on synthetic and real data sets bear out our theory that, with sparse representations, 3D human pose can be robustly estimated even when humans are partially or heavily occluded in the scene.
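The core idea, representing a test sample as a sparse combination of un-occluded training samples via an l1-penalized convex program, can be sketched as follows. This is a minimal illustration on synthetic feature vectors using scikit-learn's Lasso solver, not the paper's implementation; all names, dimensions, and noise levels are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hypothetical training set: columns of the design matrix are feature vectors
# extracted from un-occluded training images.
n_train, dim = 50, 100
train_feats = rng.normal(size=(n_train, dim))

# A test sample that truly is a sparse combination of a few training samples,
# then corrupted on a subset of dimensions to simulate occlusion.
w_true = np.zeros(n_train)
w_true[[3, 17, 42]] = [0.5, 0.3, 0.2]
test_feat = train_feats.T @ w_true
occluded = test_feat.copy()
occluded[:20] += rng.normal(scale=2.0, size=20)  # corrupted feature entries

# l1-regularized regression recovers a sparse weight vector over training samples.
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(train_feats.T, occluded)
w = lasso.coef_

# Reconstruct a "clean" feature as the sparse combination of training features.
recovered = train_feats.T @ w
```

The recovered feature can then be fed to the pose regressor as if no occlusion had occurred.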
Note on Coupled Line Cameras for Rectangle Reconstruction (ACDDE 2012)Joo-Haeng Lee
The presentation file for the talk in ACDDE 2012.
http://www.acdde2012.org/
It presents the research result published in ICPR 2012 under the title "Camera Calibration from a Single Image based on Coupled Line Cameras and Rectangle Constraint".
https://iapr.papercept.net/conferences/scripts/abstract.pl?ConfID=7&Number=70
A Novel Methodology for Designing Linear Phase IIR Filters - IDES Editor
This paper presents a novel technique for designing an Infinite Impulse Response (IIR) filter with a linear phase response. The design of IIR filters is always a challenging task because a linear phase response is not realizable in this class of filters. Conventional techniques involve a large number of samples and a higher-order filter for better approximation, resulting in complex hardware for implementation. In addition, extensive computational resources are required for obtaining the inverses of huge matrices. In contrast, we propose a technique that uses frequency-domain sampling along with linear programming to achieve a filter design that best approximates a linear phase response. The proposed method gives the closest response with fewer samples (only 10) and is computationally simple. We present the filter design along with its formulation and solution methodology. Numerical results substantiate the efficiency of the proposed method.
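The combination of frequency-domain sampling and linear programming can be illustrated with a small self-contained sketch. Note this is a generic minimax (Chebyshev) approximation of a desired amplitude response over frequency samples, cast as a linear program with SciPy; it is a simplified linear-phase analogue for illustration, not the authors' IIR formulation, and the cutoff, sample count, and tap count are arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

# Frequency samples on [0, pi] and a desired lowpass amplitude response.
K, M = 40, 10                                # frequency samples, filter taps
w = np.linspace(0, np.pi, K)
desired = (w <= 0.3 * np.pi).astype(float)

# Amplitude of a symmetric (hence linear-phase) filter: A(w) = sum_n h_n cos(n w).
C = np.cos(np.outer(w, np.arange(M)))        # K x M design matrix

# Minimize t subject to |C h - desired| <= t: minimax approximation as an LP.
c = np.concatenate([np.zeros(M), [1.0]])     # objective: minimize t
A_ub = np.block([[C, -np.ones((K, 1))],
                 [-C, -np.ones((K, 1))]])
b_ub = np.concatenate([desired, -desired])
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * M + [(0, None)])
h, t = res.x[:M], res.x[M]                   # filter coefficients, peak error
```

The optimal `t` is the peak deviation of the designed amplitude response from the ideal one over the sampled frequencies.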
These slides are my presentation for a reading circle on the "Machine Learning Professional Series".
Japanese version is here.
http://www.slideshare.net/matsukenbook/ss-50545587
Slides of the lectures given at the summer school "Biomedical Image Analysis Summer School: Modalities, Methodologies & Clinical Research", Centrale Paris, Paris, July 9-13, 2012
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14) - Yusuke Uchida
Recently, the Fisher vector representation of local features has attracted much attention because of its effectiveness in both image classification and image retrieval. Another trend in the area of image retrieval is the use of binary features such as ORB, FREAK, and BRISK. Considering the significant accuracy improvements achieved in both image classification and retrieval by Fisher vectors of continuous feature descriptors, if the Fisher vector were also applied to binary features, we would expect the same benefits in binary-feature-based image retrieval and classification. In this paper, we derive a closed-form approximation of the Fisher vector of binary features, which are modeled by a Bernoulli mixture model. In experiments, it is shown that the Fisher vector representation improves the accuracy of image retrieval by 25% compared with a bag-of-binary-words approach.
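The closed-form Fisher vector for Bernoulli-modeled binary features can be sketched as the gradient of the descriptor log-likelihood with respect to the Bernoulli parameters, weighted by component posteriors. This is a minimal, unnormalized illustration with random parameters; a real pipeline would fit the mixture by EM and apply the Fisher information normalization derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K, D = 4, 32                          # mixture components, descriptor bits
pi = np.full(K, 1.0 / K)              # mixture weights (hypothetical, not EM-fit)
mu = rng.uniform(0.2, 0.8, (K, D))    # Bernoulli parameters per component

def fisher_vector(B):
    """Fisher vector (gradient w.r.t. mu) of binary descriptors B (T x D)."""
    T = B.shape[0]
    # log p(b | component k), computed in log space for numerical stability
    log_lik = B @ np.log(mu).T + (1 - B) @ np.log(1 - mu).T   # T x K
    log_post = np.log(pi) + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)                 # posteriors, T x K
    # Closed-form gradient of the mean log-likelihood w.r.t. each mu_k
    return np.concatenate([
        (gamma[:, [k]] * (B - mu[k]) / (mu[k] * (1 - mu[k]))).sum(axis=0) / T
        for k in range(K)
    ])

descriptors = rng.integers(0, 2, size=(100, D)).astype(float)  # e.g. ORB-like bits
fv = fisher_vector(descriptors)       # one K*D-dimensional image signature
```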
C. Guyon, T. Bouwmans, E. Zahzah, “Foreground Detection via Robust Low Rank Matrix Decomposition including Spatio-Temporal Constraint”, International Workshop on Background Model Challenges, ACCV 2012, Daejeon, Korea, November 2012.
Here is my updated CV using the ModernCV template (http://www.latextemplates.com/template/moderncv-cv-and-cover-letter).
You can find the TeX source file at (https://dl.dropbox.com/u/2810224/Homepage/resume/modern%20style.rar)
Towards Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning From Simulation - Jia-Bin Huang
Jia-Bin Huang, Qin Cai, Zicheng Liu, Narendra Ahuja, and Zhengyou Zhang
Towards Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning From Simulation
Proceedings of ACM Symposium on Eye Tracking Research & Applications (ETRA), 2014
ETRA 2014 Best Paper Award
In this paper, we describe a new interactive image completion system that allows users to easily specify various forms of mid-level structure in the image. Our system supports the specification of four basic symmetry types: reflection, translation, rotation, and glide. The user inputs are automatically converted into guidance maps that encode possible candidate shifts and, indirectly, local transformations of rotation and scale. These guidance maps are used in conjunction with a color matching cost for image completion. We show that our system is capable of handling a variety of challenging examples.
http://www.jiabinhuang.com/
Saliency Detection via Divergence Analysis: A Unified Perspective (ICPR 2012) - Jia-Bin Huang
A number of bottom-up saliency detection algorithms have been proposed in the literature. Since these have been developed from intuition and principles inspired by psychophysical studies of human vision, the theoretical relations among them are unclear. In this paper, we present a unifying perspective. Saliency of an image area is defined in terms of divergence between certain feature distributions estimated from the central part and its surround. We show that various, seemingly different saliency estimation algorithms are in fact closely related. We also discuss some commonly used center-surround selection strategies. Experiments with two datasets are presented to quantify the relative advantages of these algorithms.
Best student paper award in Computer Vision and Robotics Track
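The center-surround divergence view above can be made concrete with a toy sketch: saliency of a region measured as the KL divergence between intensity histograms of a center patch and its surrounding ring. This is a single-scale, single-feature instance of the family of algorithms the paper unifies; the window sizes and bin counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_divergence(p, q, eps=1e-8):
    p = p + eps; q = q + eps
    p = p / p.sum(); q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def center_surround_saliency(img, cy, cx, r_in=8, r_out=16, bins=16):
    """KL divergence between intensity histograms of a square center patch
    and its surrounding ring, as a simple saliency score."""
    center = img[cy - r_in:cy + r_in, cx - r_in:cx + r_in]
    outer = img[cy - r_out:cy + r_out, cx - r_out:cx + r_out].copy()
    outer[r_out - r_in:r_out + r_in, r_out - r_in:r_out + r_in] = np.nan
    surround = outer[~np.isnan(outer)]          # ring with the center masked out
    h_c, _ = np.histogram(center, bins=bins, range=(0, 1))
    h_s, _ = np.histogram(surround, bins=bins, range=(0, 1))
    return kl_divergence(h_c, h_s)

# A dark image with one bright patch: the patch location should score higher.
img = rng.uniform(0.0, 0.3, (64, 64))
img[24:40, 24:40] += 0.6                        # distinctive bright patch
s_patch = center_surround_saliency(img, 32, 32) # centered on the patch
s_plain = center_surround_saliency(img, 16, 16) # centered on a plain area
```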
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008) - Jia-Bin Huang
In this paper, we propose a fast re-coloring algorithm to improve the accessibility for the color vision impaired. Compared to people with normal color vision, people with color vision impairment have difficulty in distinguishing between certain combinations of colors. This may hinder visual communication owing to the increasing use of colors in recent years. To address this problem, we re-map the hue components in the HSV color space based on the statistics of local characteristics of the original color image. We enhance the color contrast through generalized histogram equalization. A control parameter is provided for various users to specify the degree of enhancement to meet their needs. Experimental results are illustrated to demonstrate the effectiveness and efficiency of the proposed re-coloring algorithm.
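The contrast-enhancement step can be illustrated with a minimal sketch of histogram equalization applied to a hue channel, with a user-controlled strength parameter as described. This is a simplified stand-in for the paper's generalized histogram equalization: hues are treated as plain values in [0, 1), and the bin count and blend parameter are hypothetical.

```python
import numpy as np

def equalize_hue(hue, strength=1.0):
    """Remap hue values (in [0, 1)) by histogram equalization; `strength`
    blends between the original mapping (0) and fully equalized (1),
    mimicking a user-specified degree of enhancement."""
    hist, _ = np.histogram(hue, bins=256, range=(0.0, 1.0))
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]                               # normalized cumulative histogram
    idx = np.clip((hue * 256).astype(int), 0, 255)
    equalized = cdf[idx]                         # classic CDF remapping
    return (1 - strength) * hue + strength * equalized

rng = np.random.default_rng(0)
# Hues clustered in a narrow band (hard to distinguish) get spread apart.
hue = rng.normal(0.35, 0.02, size=10000).clip(0, 0.999)
remapped = equalize_hue(hue, strength=1.0)
```

Lower `strength` values would give a milder remapping for users who need less enhancement.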
Reading academic papers is one of the most important parts of scientific research. However, junior graduate students may spend a lot of time learning how to read papers efficiently and effectively. In this talk, I will discuss some basic issues and introduce useful websites/tools/tips for paper reading.
Image Completion using Planar Structure Guidance (SIGGRAPH 2014) - Jia-Bin Huang
We propose a method for automatically guiding patch-based image completion using mid-level structural cues. Our method first estimates planar projection parameters, softly segments the known region into planes, and discovers translational regularity within these planes. This information is then converted into soft constraints for the low-level completion algorithm by defining prior probabilities for patch offsets and transformations. Our method handles multiple planes, and in the absence of any detected planes falls back to a baseline fronto-parallel image completion algorithm. We validate our technique through extensive comparisons with state-of-the-art algorithms on a variety of scenes.
Project page: https://sites.google.com/site/jbhuang0604/publications/struct_completion
What makes a creative photograph? This talk summarizes five approaches to make creative photographs. For each approach, many example images from the internet are used to demonstrate how the method works in practice.
For more explanations on example images, please visit my blog: http://jbhuang0604.blogspot.com/
Computer vision has been studied for more than 40 years. Due to the increasingly diverse and rapidly developed topics in vision and the related fields (e.g., machine learning, signal processing, cognitive science), the tasks to come up with new research ideas are usually daunting for junior graduate students in this field. In this talk, I will present five methods to come up with new research ideas. For each method, I will give several examples (i.e., existing works in the literature) to illustrate how the method works in practice.
This is a common sense talk and will not have complicated math equations and theories.
Note: The content of this talk is inspired by "Raskar Idea Hexagon" - Prof. Ramesh Raskar's talk on "How to come up with new Ideas".
To download the presentation slide with videos, please visit
http://jbhuang0604.blogspot.com/2010/05/how-to-come-up-with-new-research-ideas.html
For the video lecture (in Chinese), please visit
http://jbhuang0604.blogspot.com/2010/06/blog-post_14.html
General principles and tricks for writing fast MATLAB code.
Powerpoint slides: https://uofi.box.com/shared/static/yg4ry6s1c9qamsvk6sk7cdbzbmn2z7b8.pptx
A computationally efficient method for sequential MAP-MRF cloud detection - Beniamino Murgante
A computationally efficient method for sequential MAP-MRF cloud detection
Paolo Addesso, Roberto Conte, Maurizio Longo, Rocco Restaino, Gemine Vivone
- University of Salerno
Approximate Bayesian computation for the Ising/Potts model - Matt Moores
Bayes’ formula involves the likelihood function, p(y|theta), which is a problem when the likelihood is unavailable in closed form. ABC is a method for approximating the posterior p(theta|y) without evaluating the likelihood. Instead, pseudo-data is simulated from a generative model and compared with the observations. This talk will give an introduction to ABC algorithms: rejection sampling, ABC-MCMC and ABC-SMC. Application of these algorithms to image analysis will be presented as an illustrative example. These methods have been implemented in the R package bayesImageS.
This is joint work with Christian Robert (Warwick/Dauphine), Kerrie Mengersen and Christopher Drovandi (QUT).
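The ABC rejection sampler covered in the talk can be sketched in a few lines. Here it is applied to a toy Normal-mean model (where the exact posterior is actually known) using the sample mean as the summary statistic; the tolerance, prior range, and sample counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed data from a Normal(theta_true, 1) model; theta is unknown.
theta_true = 2.0
y_obs = rng.normal(theta_true, 1.0, size=100)
s_obs = y_obs.mean()                     # summary statistic of the observations

def abc_rejection(n_samples=200, tol=0.1, prior=(-5.0, 5.0)):
    """ABC rejection: draw theta from the prior, simulate pseudo-data from the
    generative model, and keep theta when the simulated summary statistic is
    within `tol` of the observed one. No likelihood evaluation is needed."""
    accepted = []
    while len(accepted) < n_samples:
        theta = rng.uniform(*prior)
        pseudo = rng.normal(theta, 1.0, size=100)
        if abs(pseudo.mean() - s_obs) < tol:
            accepted.append(theta)
    return np.array(accepted)

posterior = abc_rejection()              # approximate posterior samples for theta
```

ABC-MCMC and ABC-SMC refine this basic scheme by proposing thetas more cleverly than blind prior sampling.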
In this article we consider macrocanonical models for texture synthesis. In these models samples are generated given an input texture image and a set of features which should be matched in expectation. It is known that if the images are quantized, macrocanonical models are given by Gibbs measures, using the maximum entropy principle. We study conditions under which this result extends to real-valued images. If these conditions hold, finding a macrocanonical model amounts to minimizing a convex function and sampling from an associated Gibbs measure. We analyze an algorithm which alternates between sampling and minimizing. We present experiments with neural network features and study the drawbacks and advantages of using this sampling scheme.
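The maximum-entropy principle behind macrocanonical models can be illustrated on a toy discrete problem: find Gibbs weights theta such that feature expectations under p(x) proportional to exp(-theta . f(x)) match prescribed targets, by gradient descent on the convex dual. On this tiny grid the expectation is computed exactly rather than by the sampling scheme the article studies; the grid, features, and targets are all hypothetical.

```python
import numpy as np

x = np.linspace(-3, 3, 601)          # discrete state space
feats = np.stack([x, x**2])          # features whose expectations we match
targets = np.array([0.5, 1.5])       # desired E[x] and E[x^2]

theta = np.zeros(2)
for _ in range(5000):
    logp = -theta @ feats
    logp -= logp.max()
    p = np.exp(logp)
    p /= p.sum()                     # Gibbs measure p(x) ∝ exp(-theta . f(x))
    moments = feats @ p              # exact E[f] on the grid
    theta += 0.05 * (moments - targets)  # descent step on the convex dual

# Recompute the final measure and its moments for the converged theta.
logp = -theta @ feats
p = np.exp(logp - logp.max())
p /= p.sum()
final_moments = feats @ p
```

In the texture-synthesis setting, the exact expectation is intractable and is replaced by samples from the Gibbs measure, which is what motivates the alternating sampling/minimization algorithm.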
Landuse Classification from Satellite Imagery using Deep Learning - DataWorks Summit
With the abundance of remote sensing satellite imagery, the possibilities are endless as to the kind of insights that can be derived from them. One such use is to determine land use for agriculture and non-agricultural purposes.
In this talk, we’ll be looking at leveraging Sentinel-2 satellite imagery data along with OpenStreetMap labels to be able to classify land use as agricultural or non-agricultural.
Sentinel-2 data has a 10-meter resolution in RGB bands and is well-suited for land use classification. Using these two datasets, many different machine learning tasks can be performed, such as image segmentation into two classes (farmland and non-farmland) or the more challenging task of identifying the crop type cultivated on fields.
For this talk, we’ll be looking at leveraging convolutional neural networks (CNNs) built with Apache MXNet to train deep learning models for land use classification. We’ll be covering the different deep learning architectures considered for this particular use case along with the appropriate metrics.
We’ll be leveraging streaming pipelines built on Apache Flink and Apache NiFi for model training and inference. Developers will come away with a better understanding of how to analyze satellite imagery and of the different deep learning architectures, along with their pros and cons, for land use classification.
Speakers: Suneel Marthi and Chris Olivier, Software Development Engineers, Amazon Web Services
C. Guyon, T. Bouwmans, E. Zahzah, “Foreground Detection via Robust Low Rank Matrix Factorization including Spatial Constraint with Iterative Reweighted Regression”, International Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 2012.
Corisco is a method for monocular camera orientation estimation in anthropic environments using edgels. This is my doctorate defense presentation, updated and translated to English.
Research 101 - Paper Writing with LaTeX - Jia-Bin Huang
Paper Writing with LaTeX
PDF: https://filebox.ece.vt.edu/~jbhuang/slides/Research%20101%20-%20Paper%20Writing%20with%20LaTeX.pdf
PPTX: https://filebox.ece.vt.edu/~jbhuang/slides/Research%20101%20-%20Paper%20Writing%20with%20LaTeX.pptx
Computer vision techniques can be seen in various aspects of our daily life, with tremendous impact. These slides aim at introducing basic concepts of computer vision and applications for the general public.
Download link: https://uofi.box.com/shared/static/24vy7aule67o4g6djr83hzurf5a9lfp6.pptx
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015) - Jia-Bin Huang
Self-similarity based super-resolution (SR) algorithms are able to produce visually pleasing results without extensive training on external databases. Such algorithms exploit the statistical prior that patches in a natural image tend to recur within and across scales of the same image. However, the internal dictionary obtained from the given image may not always be sufficiently expressive to cover the textural appearance variations in the scene. In this paper, we extend self-similarity based SR to overcome this drawback. We expand the internal patch search space by allowing geometric variations. We do so by explicitly localizing planes in the scene and using the detected perspective geometry to guide the patch search process. We also incorporate additional affine transformations to accommodate local shape variations. We propose a compositional model to simultaneously handle both types of transformations. We extensively evaluate the performance in both urban and natural scenes. Even without using any external training databases, we achieve significantly superior results on urban scenes, while maintaining comparable performance on natural scenes as other state-of-the-art SR algorithms.
http://bit.ly/selfexemplarsr
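The idea of expanding the internal patch search space with geometric variations can be sketched with a brute-force search that, for each candidate location in a downsampled copy of the image, also tries a few rotations of the candidate patch. This toy version uses naive subsampling and only 90-degree rotations; the paper's method instead uses detected planes, perspective geometry, and affine transformations.

```python
import numpy as np

def best_internal_match(img, patch):
    """Find the best SSD match for `patch` in a 2x-downsampled copy of `img`,
    also searching over 90-degree rotations of each candidate patch."""
    coarse = img[::2, ::2]                    # crude stand-in for downscaling
    ph, pw = patch.shape
    best, best_cost = None, np.inf
    for y in range(coarse.shape[0] - ph + 1):
        for x in range(coarse.shape[1] - pw + 1):
            cand = coarse[y:y + ph, x:x + pw]
            for k in range(4):                # expanded search: rotated candidates
                cost = np.sum((np.rot90(cand, k) - patch) ** 2)
                if cost < best_cost:
                    best, best_cost = (y, x, k), cost
    return best, best_cost

rng = np.random.default_rng(0)
img = rng.uniform(size=(64, 64))
patch = img[::2, ::2][5:13, 7:15].copy()      # a patch that exists at the coarse level
loc, cost = best_internal_match(img, patch)   # should find it exactly
```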
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
A Physical Approach to Moving Cast Shadow Detection (ICASSP 2009)
1. A Physical Approach to Moving Cast Shadow Detection
Jia-Bin Huang and Chu-Song Chen
jbhuang0604@gmail.com, song@iis.sinica.edu.tw
Institute of Information Science, Academia Sinica, Taipei, Taiwan
April 23, 2009
1 / 26
2. Outline
1 Introduction
2 Related Works
3 Physical Model for Cast Shadows
4 Learning and Detecting Cast Shadows
5 Experimental Results
6 Conclusion and Future Work
4. Introduction
Motivation
Moving object detection is one of the most important tasks in low-level vision.
Detecting moving cast shadows is one of the most challenging problems for accurate object detection in video streams, since shadow points are often misclassified as object points.
Without careful consideration, cast shadows may introduce significant errors in segmentation, tracking, and recognition.
5. Introduction
The Cause of Cast Shadows
Light sources are partially or totally blocked by the foreground objects.
Why Is Detecting Cast Shadows Difficult?
1 Shadow points are detectable as foreground points and typically differ significantly from the background.
2 Cast shadows have the same motion as the objects casting them.
3 Shaded regions are usually connected with the foreground objects.
7. Related Works (1/2)
Previous Works (before 2003)
Survey paper: [Prati et al. PAMI 2003]
Statistical parametric: [Mikic et al. ICPR 2000]
Statistical nonparametric: [Horprasert et al. ICCV Workshop 1999]
Deterministic model-based: [Onoguchi ICPR 1998]
Deterministic nonmodel-based: [Cucchiara et al. PAMI 2001]
Major Drawbacks
Need to explicitly tune the parameters for each scene.
Hard to adapt to illumination conditions and environment changes.
8. Related Works (2/2)
Learning-based Approaches
Basic idea: learn the cast shadow model from video sequences.
Shadow Flow, [Porikli et al. ICCV 2005]
Gaussian Mixture Shadow Modeling, [Martel-Brisson et al. PAMI 2007]
Combining Local and Global Features, [Liu et al. CVPR 2007]
Learning Physical Model of Light Sources and Surfaces, [Martel-Brisson et al. CVPR 2008]
Drawbacks
Most of them assume shadow values attenuate linearly along the line between the value of the corresponding background and the origin.
Pixel-based models may suffer from slow learning due to the lack of sufficient samples.
10. Main Idea
A general physics-based shadow model
Decompose the light incident at the background surface into two classes:
Direct light sources (e.g., the sun)
Ambient illumination (e.g., light scattered by the sky, colored light from nearby surfaces (color bleeding))
Suppose we have N direct light sources and M ambient illumination terms; then the intensity function of the light is
E(\lambda) = \sum_{n=1}^{N} E_{\mathrm{incident},n}(\lambda) + \sum_{m=1}^{M} E_{\mathrm{ambient},m}(\lambda).
11. Ambient Illuminations and Direct Light Sources
Lambertian model: the camera sensor response g_k(p) at point p is
g_k(p) = \int E(\lambda, p)\, \rho(\lambda, p)\, S_k(\lambda)\, d\lambda,
where E(\lambda, p) is the intensity function of the light sources, \rho(\lambda, p) is the reflectance of the object surface, and S_k(\lambda) is the sensor spectral sensitivity function.
11 / 26
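The sensor-response integral can be made concrete by discretizing wavelength and evaluating it numerically. The Gaussian illuminant, reflectance, and sensitivity curves below are illustrative stand-ins, not measured spectra.

```python
import numpy as np

lam = np.linspace(400, 700, 301)               # wavelength grid (nm)
dlam = lam[1] - lam[0]

def gaussian(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2)

E = 1.0 + 0.5 * gaussian(lam, 550, 80)          # illuminant: ambient + direct term
rho = gaussian(lam, 600, 50)                    # reflectance of a reddish surface
S = {k: gaussian(lam, mu, 30)                   # sensor spectral sensitivities
     for k, mu in zip("RGB", (610, 540, 460))}

# g_k(p) = integral of E(lambda) rho(lambda) S_k(lambda) d lambda,
# approximated by a Riemann sum over the wavelength grid.
g = {k: float(np.sum(E * rho * S[k]) * dlam) for k in "RGB"}
```

For this reddish surface the R response dominates, as expected from the overlap of the spectra.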
12. Appearance Variation Under Cast Shadow
Part or all of the light sources are blocked by the foreground objects.
The ambient illumination may change slightly (from BG_A to BG_A').
13. Color Feature Vector
We encode the difference vector v_t(p) between the background and shadow values as our color feature, in a spherical coordinate system:
x_{s,t}(p) = [\alpha_t(p), \theta_t(p), \phi_t(p)]^T
Illumination attenuation:
\alpha_t(p) = \|v_t(p)\| / \|BG_t(p)\|
Angle information:
\theta_t(p) = \arctan( v_t^G(p) / v_t^R(p) )
\phi_t(p) = \arccos( v_t^B(p) / \|v_t(p)\| )
13 / 26
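A direct transcription of these features (with v_t(p) taken as the background-minus-observation difference vector, as the slide describes) might look like the following sketch; arctan2 is used for a well-defined angle and a small epsilon guards against division by zero.

```python
import numpy as np

def shadow_color_features(bg, obs, eps=1e-8):
    """Color feature x = [alpha, theta, phi] for a shadow candidate pixel,
    where v is the difference between background and observed RGB values."""
    v = bg - obs
    nv = np.linalg.norm(v) + eps
    alpha = nv / (np.linalg.norm(bg) + eps)             # illumination attenuation
    theta = np.arctan2(v[1], v[0])                      # arctan(vG / vR)
    phi = np.arccos(np.clip(v[2] / nv, -1.0, 1.0))      # arccos(vB / ||v||)
    return np.array([alpha, theta, phi])

bg = np.array([180.0, 180.0, 200.0])        # background RGB value
shadow = 0.6 * bg                           # roughly uniform attenuation under shadow
x = shadow_color_features(bg, shadow)       # feature fed to the GMM
```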
15. Overview
1 Perform background subtraction to obtain foreground candidates (i.e., both real foreground and cast shadows).
2 Apply a weak shadow detector as a pre-filter to obtain shadow candidates (e.g., filter out pixels whose illumination values are larger than the corresponding background values).
3 For these shadow candidates, learn the color feature vector x_{s,t}(p) with a GMM over time.
4 Detect cast shadows using the learned cast shadow model.
15 / 26
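The learning step can be sketched with scikit-learn's GaussianMixture standing in for the deck's online, per-pixel GMM update: fit a mixture on accumulated [alpha, theta, phi] features from shadow candidates, then score new candidates against the dominant component. The synthetic features and component count below are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical color features [alpha, theta, phi] collected from shadow
# candidates over time: one tight cluster of true shadows plus scattered
# foreground outliers that slipped through the weak detector.
shadows = rng.normal([0.6, 0.8, 0.9], 0.05, size=(300, 3))
outliers = rng.uniform([0, -np.pi, 0], [1.5, np.pi, np.pi], size=(60, 3))
feats = np.vstack([shadows, outliers])

gmm = GaussianMixture(n_components=3, random_state=0).fit(feats)

# Treat the heaviest component as the cast-shadow mode and score a new candidate.
dominant = int(np.argmax(gmm.weights_))
new_candidate = np.array([[0.62, 0.78, 0.92]])
post = gmm.predict_proba(new_candidate)[0, dominant]   # shadow posterior
```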
17. Incorporating Spatial Information
Prior Knowledge of Cast Shadows
Cast shadows do not enhance the spatial gradient intensity.
Introduce \omega_t(p) as a confidence value for cast shadows:
\omega_t(p) = ( \varepsilon + |\nabla B_t(p)| ) / ( \varepsilon + \max\{ |\nabla I_t(p)|, |\nabla B_t(p)| \} ),
where \varepsilon is a smoothing term.
To accelerate the learning of the pixel-based shadow model, take \omega_t(p) as the confidence value when updating the shadow model at pixel p.
Penalize samples with larger gradient intensity than the background by lessening the learning rate.
17 / 26
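A vectorized version of this confidence map is straightforward; the sketch below uses numpy gradients, with a uniformly darkened (shadow-like) frame plus a bright blob standing in for a real foreground object, and an arbitrary epsilon.

```python
import numpy as np

def shadow_confidence(bg, frame, eps=1.0):
    """Per-pixel confidence w_t(p) that a candidate is a cast shadow, based on
    the prior that shadows do not increase the spatial gradient intensity."""
    def grad_mag(img):
        gy, gx = np.gradient(img.astype(float))
        return np.hypot(gx, gy)
    gb, gf = grad_mag(bg), grad_mag(frame)
    # High confidence where the frame gradient does not exceed the background's.
    return (eps + gb) / (eps + np.maximum(gf, gb))

bg = np.tile(np.linspace(0, 255, 32), (32, 1))   # smooth background
frame = bg * 0.6                                  # uniformly darkened, shadow-like
frame[10:20, 10:20] = 255.0                       # a bright foreground blob
w = shadow_confidence(bg, frame)                  # near 1 in shadow, low at blob edges
```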
18. Detecting Shadows at Light/Shadow Border
Shadows at the light/shadow border behave differently from shadows inside the shaded region.
Solution: detect cast shadows at the border using only the angle information.
Figure panels: (a) \alpha_t(p), (b) \theta_t(p), (c) \phi_t(p).
20. Qualitative Evaluation
Figure: (a) Original images, (b) Background posterior probability, (c) Shadow posterior probability, and (d) Foreground posterior probability.
22. Effect of Shadows at Shadow/Light Border
Figure: Effect of shadows at the shadow/light border. (a) Original frame of the sequence “Highway I". (b)(c) Foreground posterior without/with considering shadows at the shadow/light border.
24. Conclusion
Provide a better description of background surface value variation under cast shadows.
Incorporate spatial information to accelerate the learning of the pixel-based shadow model.
Take shadows at the light/shadow border into consideration.
25. Future Work
Derive physics-based features for building a global shadow model in a scene:
Jia-Bin Huang and Chu-Song Chen, “Moving Cast Shadow Detection using Physics-based Features", CVPR 2009.
Extend the physical model to handle more general cases (e.g., surfaces with specular reflection, spatially varying ambient illumination, etc.).