Human Visual Perception Inspired
Background Subtraction

Mahfuzul Haque and Manzur Murshed
Talk 2011-buet-perception-event

  1. Human Visual Perception Inspired Background Subtraction. Mahfuzul Haque and Manzur Murshed
  2. Research Goal: Real-time Video Analytics
     • Stage 1: Video Stream
     • Stage 2: Real-time Processing (Event Detection, Action/Activity Recognition, Behaviour Recognition, Behaviour Profiling)
     • Stage N: Analytics (Intelligent Video Surveillance, Automated Alert, Smart Monitoring, Context-aware Environments)
  3. Unexpected Behaviors
     • Mob violence
     • Unusual crowding
     • Sudden group formation/deformation
     • Shooting
     • Public panic
  4. Increasing number of surveillance cameras: a large number of surveillance cameras has been deployed in recent years. Modern airports now have several thousand cameras!
  5. Decreasing reliability: dependence on human monitors has increased, while the reliability of the surveillance system has decreased.
  6. Are we really protected?
  7. Surveillance cameras
  8. Typical Video Analytics Framework: surveillance video stream → 1. Background Subtraction (foreground objects) → 2. Feature Extraction and Foreground Blob Classification (classified foreground blobs) → 3. Tracking and Occlusion Handling (tracked trajectories) → 4. Event/Behaviour Recognition against event/behaviour models → high-level description of unusual events/actions → Alarm!
  9. Background Subtraction: Input → Output
  10. Background Subtraction: How?
     • Basic Background Subtraction (BBS): Current frame − Background = Foreground Blob
     • Dynamic Background Modelling: Background Model vs. Current frame → Foreground Blob
     • Challenges with BBS: not a practical approach; illumination variation; local background motion; camera displacement; shadow and reflection
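The BBS step on this slide can be sketched per pixel: a pixel is foreground when its absolute difference from the reference background exceeds a threshold. A minimal sketch, where the threshold value and the flat-list frame representation are illustrative assumptions, not values from the talk:

```python
# Basic Background Subtraction (BBS): frame - background = foreground blob.
# Frames are flat lists of grey-level intensities (0-255); the threshold
# of 25 is an illustrative assumption.
def bbs(frame, background, threshold=25):
    """Return a binary foreground mask (1 = foreground pixel)."""
    return [1 if abs(f - b) > threshold else 0
            for f, b in zip(frame, background)]

background = [100, 100, 100, 100]
frame      = [102, 180, 100,  30]   # pixels 2 and 4 deviate strongly
print(bbs(frame, background))       # -> [0, 1, 0, 1]
```

The listed challenges (illumination variation, local background motion, shadow) all break this fixed-background, fixed-threshold scheme, which is what motivates the dynamic models on the following slides.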
  11. MOG-based Background Subtraction: each pixel is modelled over its intensity x by a mixture of Gaussians P(x), each component with mean µ and variance σ². Example mixtures: {Sky, Cloud, Leaf, Moving Person}; {Road, Shadow, Moving Car}; {Floor, Shadow, Walking People}.
  12. MOG-based Background Subtraction: Background Model vs. Current frame → Detected object. From Frame 1 (road) to Frame N (road, shadow, car), the models are ordered by ω/σ: (ω1, σ1², µ1) road, 65%; (ω2, σ2², µ2) shadow, 20%; (ω3, σ3², µ3) car, 15%.
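In standard MOG, components are ranked by ω/σ and the first components whose cumulative weight exceeds the background proportion T are treated as background. A minimal sketch of that selection rule, using the road/shadow/car weights from the slide (the σ and µ values are illustrative):

```python
# Each component is (weight w, std sigma, mean mu). Components are ranked
# by w/sigma; the first ones whose cumulative weight exceeds T form the
# background model (the rest are treated as foreground).
def background_components(models, T):
    ranked = sorted(models, key=lambda m: m[0] / m[1], reverse=True)
    chosen, total = [], 0.0
    for w, sigma, mu in ranked:
        chosen.append(mu)
        total += w
        if total > T:
            break
    return chosen

# (weight, sigma, mean): road dominates, then shadow, then car.
models = [(0.65, 4.0, 120), (0.20, 6.0, 80), (0.15, 9.0, 200)]
print(background_components(models, T=0.8))   # -> [120, 80] (road + shadow)
print(background_components(models, T=0.4))   # -> [120]     (road only)
```

This is exactly why the T parameter matters in the scenario slides below: it decides how many of the per-pixel models count as background.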
  13. Typical Surveillance Setup: Video Stream → frame-size reduction and frame-rate reduction → Background Subtraction → Feature Extraction → Event Detection. Parameter tuning is based on the operating environment.
  14. Scenario 1 (α = learning rate, T = background data proportion): first frame, test frame, and ground truth compared over T ∈ {0.4, 0.6, 0.8} and α ∈ {0.1, 0.01, 0.001}. Test sequence: PETS2001_D1C2
  15. Scenario 2 (α = learning rate, T = background data proportion): same grid of T and α values. Test sequence: VSSN06_camera1
  16. Scenario 3 (α = learning rate, T = background data proportion): same grid of T and α values. Test sequence: CAVIAR_EnterExitCrossingPaths2cor
  17. Observations
     • A slow learning rate (α) is not preferable (ghost or black-out).
     • Simple post-processing will not improve the detection quality at a fast learning rate (α).
     • The context behaviour needs to be known in advance.
  18. How can we detect abnormal situations? “Hey, a mob will be approaching soon, and background will be visible only 10% of that duration. Please set T = 0.1”
  19. Research Goals
     • A new background subtraction technique for unconstrained environments, i.e., no context-related information
     • Operational at a fast learning rate (α)
     • Acceptable detection quality
     • High stability across changing operating environments
  20. The New Technique, PMOG
     • Perceptual Mixture of Gaussians
     • Incorporates perceptual characteristics of the human visual system (HVS) in statistical background subtraction: realistic background value prediction; perception-based detection threshold; perceptual model similarity measure
  21. Realistic Background Value Prediction: models are ordered by ω/σ: (ω1, σ1², µ1) road, 65%; (ω2, σ2², µ2) shadow, 20%; (ω3, σ3², µ3) car, 15%. New: each model also keeps the most recent observation, b.
  22. Realistic Background Value Prediction: instead of the running mean µ = (1 − α)µ + αXt, PMOG predicts the background value with the most recent observation b:
     • Higher agility than using the mean
     • Not tied to the learning rate
     • Realistic: an actual intensity value
     • No artificial value due to the mean
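The agility gap between the running mean µ = (1 − α)µ + αXt and the most recent observation b shows up immediately in numbers. A sketch with illustrative values: the background intensity of one pixel jumps from 100 to 150 and stays there for 20 frames, at a slow learning rate α = 0.01:

```python
# Compare the MOG running mean with PMOG's "most recent observation" b
# after the background jumps from 100 to 150. alpha = 0.01 is an
# illustrative slow learning rate.
alpha, mu, b = 0.01, 100.0, 100.0
for _ in range(20):                     # 20 frames at the new value 150
    x = 150.0
    mu = (1 - alpha) * mu + alpha * x   # drifts slowly toward 150
    b = x                               # reflects 150 immediately
print(round(mu, 1), b)                  # -> 109.1 150.0
```

The mean is still near the old value after 20 frames, while b already equals the true background, which is the agility claim on this slide.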
  23. Realistic Background Value Prediction: models ordered by ω/σ, each now carrying its most recent observation: (ω1, σ1², µ1, b1) road, 65%; (ω2, σ2², µ2, b2) shadow, 20%; (ω3, σ3², µ3, b3) car, 15%.
  24. Perception Based Detection Threshold: with models ordered by ω/σ, the conventional threshold around the mean is x = c1·σ; the question is what threshold x to use around the most recent observation b.
  25. Our Problem: How is x related with b? The deviation threshold x around b can be set too low or too high; how should it scale with b?
  26. Weber’s Law: how does the human visual system perceive a noticeable intensity deviation from the background? Ernst Weber, an experimental psychologist in the 19th century, observed that the just-noticeable increment ΔI is linearly proportional to the background intensity I: ΔI = c2·I
  27. Weber’s Law: taking the most recent observation b as the background intensity I, ΔI = c2·I suggests a deviation threshold around b that is proportional to b.
  28. Another perceptual characteristic of HVS: what is the perceptual tolerance level in distinguishing distorted intensity measures? Given a reference image and two distorted images at p dB and q dB, a difference |p − q| < 0.5 dB is not perceivable by the human visual system.
  29. Our Problem: How is x related with b? By Weber’s law, x = c2·b. The perceptual threshold TP (0.5 dB) fixes the constant: the deviation x is just noticeable when
      20·log10(255/(b − x)) − 20·log10(255/b) = TP,
      i.e. x = b·(1 − 10^(−TP/20)).
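Under this Weber/PSNR reading of the slide, the just-noticeable deviation around a background intensity b is b·(1 − 10^(−TP/20)). A quick numeric sketch; the TP values are the human/machine thresholds from the talk, while the sample intensities are illustrative:

```python
# Just-noticeable deviation around background intensity b under a
# perceptual threshold TP in dB: delta = b * (1 - 10 ** (-TP / 20)).
def noticeable_deviation(b, tp_db):
    return b * (1 - 10 ** (-tp_db / 20))

for tp in (0.5, 1.0):              # human vs. machine vision thresholds
    for b in (50, 150, 250):       # darker to brighter backgrounds
        print(tp, b, round(noticeable_deviation(b, tp), 2))
# the allowed deviation grows linearly with b, matching Weber's law
```

Note how the threshold scales with b rather than being a fixed constant: bright backgrounds tolerate larger deviations before a pixel is declared foreground.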
  30. Impact of Perceptual Threshold, TP. Human vision: TP = 0.5 dB. Machine vision: TP = 1.0 dB (minimal impact of shadow, reflection, noise, etc.)
  31. Linear Relationship
  32. Error Sensitivity in Darker Background
  33. Rod and Cone Cells of the Human Eye
     • Rods and cones are two different types of photoreceptor cells in the retina of the human eye
     • Rods operate in less intense light and are responsible for scotopic vision (night vision)
     • Cones operate in relatively bright light and are responsible for photopic vision (colour vision)
  34. Piece-wise Linear Relationship: Scotopic Vision (R), Photopic Vision (C)
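The piece-wise linear relationship can be sketched as two linear segments over background intensity: one slope for the darker (scotopic, rod-driven) range and another for the brighter (photopic, cone-driven) range. The knee point and both slopes below are illustrative assumptions, not values from the talk:

```python
# Piece-wise linear detection threshold over background intensity b:
# one slope in the dark (scotopic) range, another in the bright
# (photopic) range. KNEE and both slopes are illustrative assumptions.
KNEE = 60                 # intensity where rod-driven vision hands over
SCOTOPIC_SLOPE = 0.12
PHOTOPIC_SLOPE = 0.06

def detection_threshold(b):
    if b <= KNEE:
        return SCOTOPIC_SLOPE * b
    # keep the function continuous at the knee
    return SCOTOPIC_SLOPE * KNEE + PHOTOPIC_SLOPE * (b - KNEE)

print(detection_threshold(30))    # dark background
print(detection_threshold(200))   # bright background
```

The point of the two segments is that a single Weber constant cannot fit both regimes, so the threshold-versus-b line changes slope where rods hand over to cones.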
  35. Perceptual Model Similarity in PMOG: model redundancy in MOG
  36. Perceptual Model Similarity in PMOG
  37. Experiments: Test Sequences
     • Total of 50 test sequences from 8 different sources
     • Scenario distribution: indoor, outdoor, multimodal, shadow and reflection, low background-foreground contrast
     • Evaluation: qualitative and quantitative, against Lee (PAMI, 2005) and Stauffer and Grimson (PAMI, 2000)
     • False classification: false positive (FP) and false negative (FN)
  38. Test Sequences: PETS (9), Wallflower (7), UCF (7), IBM (11), CAVIAR (7), VSSN06 (7), Other (2)
  39. Experiments
  40. Experiments
  41. Experiments
  42. Experiments
  43. Experiments
  44. Experiments
  45. Experiments
  46. Experiments
  47. Experiments
  48. PMOG: Summary
     • Realistic background value prediction: high model agility and superior detection quality at a fast learning rate
     • No context-related information: high stability across changing scenarios
     • Perception-based detection threshold: superior detection quality in terms of shadow, noise, and reflection
     • Perceptual model similarity: optimal number of models throughout the system life cycle
     • Parameter-less background subtraction: ideal for real-time video analytics
  49. Panic-driven Event Detection
  50. Event Detection
     • Specific types of events vs. abnormality
     • An event persists for a certain duration of time; the duration is variable
     • Characteristics of the same event are variable in the same environment and from one scene to another
     • How to identify the generic characteristics of an event?
  51. The Proposed Event Detection Approach: architecture. Foreground Detector → Frame-level Feature Extractor → Temporal Feature Extractor → Event Models. Model training: background subtraction on labelled frames → foreground blobs → frame-level feature extraction (30 features) → temporal feature extraction (270 features) → feature ranking and selection → event model training. Real-time execution: background subtraction on incoming frames → foreground blobs → selective frame-level feature extraction → selective temporal feature extraction → trained event models → detection results.
  52. The Proposed Event Detection Approach: frame-level features f1, f2, f3, …, fn over time → temporal features → classifier → event model.
     • Event detection as a temporal data classification problem
     • A distinct set of temporal features can characterise an event
     • Which frame-level features are extracted, and how?
     • How are the observed frame-level features transformed into temporal features?
  53. The Proposed Event Detection Approach
     • Motion-based approaches: key-point detection, point matching in successive frames, flow vectors (position, direction, speed)
     • Tracking-based approaches: object detection, object matching in successive frames, trajectories (object paths)
     • Common characteristics: inter-frame association, context-specific information, event models that are not generic (e.g. Hu et al., ICPR 2008; Xiang et al., IJCV 2006)
     • Proposed approach: no inter-frame association; foreground blob detection; independent frame-level features via a global frame-level descriptor based on blob statistical analysis, independent of scene characteristics; temporal features considering speed and temporal order
  54. The Proposed Event Detection Approach: summary
     • Background subtraction for foreground blob detection
     • Independent frame-level features extracted using blob statistical analysis; no object- or position-specific information, no spatial association
     • Frame-level features are transformed into temporal features considering speed and temporal order
     • Supposed to be more context invariant
  55. Blob Statistical Analysis: frame-level features
     • Blob Area (BA)
     • Filling Ratio (FR)
     • Aspect Ratio (AR)
     • Bounding Box Area (BBA)
     • Bounding Box Width (BBW)
     • Bounding Box Height (BBH)
     • Blob Count (BC)
     • Blob Distance (BD)
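Several of these frame-level features can be read off a blob's pixel set directly. A minimal sketch for one blob; the sample mask and the exact feature definitions (aspect ratio as width/height, filling ratio as blob area over bounding-box area) are reasonable assumptions rather than definitions taken verbatim from the slides:

```python
# Compute a few of the listed frame-level features for a single blob,
# given its foreground pixel coordinates. AR = width/height and
# FR = blob area / bounding-box area are assumed definitions.
def blob_features(pixels):
    xs = [x for x, y in pixels]
    ys = [y for x, y in pixels]
    w = max(xs) - min(xs) + 1
    h = max(ys) - min(ys) + 1
    return {
        "BA":  len(pixels),            # Blob Area
        "BBW": w,                      # Bounding Box Width
        "BBH": h,                      # Bounding Box Height
        "BBA": w * h,                  # Bounding Box Area
        "AR":  w / h,                  # Aspect Ratio
        "FR":  len(pixels) / (w * h),  # Filling Ratio
    }

# An L-shaped blob of 6 pixels inside a 3x3 bounding box.
blob = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (1, 1)]
print(blob_features(blob))
```

Blob Count (BC) and Blob Distance (BD) are per-frame rather than per-blob: the number of blobs and the distances between them.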
  56. Blob Statistical Analysis: temporal features, computed over an overlapping sliding window of frames, preserving temporal order and the speed of variation.
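The overlapping sliding window can be sketched as follows: each window of consecutive frame-level values yields one temporal observation, and successive windows overlap. The window size and stride below are illustrative choices:

```python
# Turn a per-frame feature series into overlapping windows; each window
# becomes one temporal observation. Window size 4 with stride 2 (50%
# overlap) is an illustrative choice.
def sliding_windows(series, size=4, stride=2):
    return [series[i:i + size]
            for i in range(0, len(series) - size + 1, stride)]

blob_count = [1, 1, 2, 3, 3, 4, 2, 1]   # e.g. Blob Count per frame
for w in sliding_windows(blob_count):
    # a simple "speed of variation" summary per window
    print(w, max(w) - min(w))
```

Because windows keep their internal order, temporal order and rate of change survive into the features, which is what distinguishes this from a bag of per-frame statistics.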
  57. Blob Statistical Analysis: Blob Count (BC), Blob Area (BA)
  58. Blob Statistical Analysis: Blob Distance (BD)
  59. Blob Statistical Analysis: Aspect Ratio (AR)
  60. Blob Statistical Analysis: top five features for four different events. Feature ranking uses the absolute-value criterion of a two-sample t-test, based on a pooled variance estimate.
  61. Experimental Results: Specific Event Detection
     • Four different events: meet, split, runaway, and fight
     • CAVIAR dataset with labelled frames
     • 80% of the test frames for model training
     • 100 iterations of 10-fold cross validation
     • Remaining 20% of the test frames for testing
     • SVM classifier as event models
     • Separate model for each event
  62. Experimental Results
  63. Experimental Results: Specific Event Detection; actual vs. predicted, with severity
  64. Experimental Results: Abnormal Event Detection
     • University of Minnesota crowd dataset (UMN dataset)
     • The Runaway event model
     • No additional training or tuning
     • Three different sites
  65. Experimental Results: Abnormal Event Detection (UMN-9)
  66. Experimental Results: Abnormal Event Detection (UMN-10)
  67. Experimental Results: Abnormal Event Detection (UMN-01)
  68. Experimental Results: Abnormal Event Detection (UMN-07)
  69. Experimental Results: Performance Comparison

      Method                  | AUC
      ------------------------|-----
      Proposed Method         | 0.89
      Pure Optical Flow [1]   | 0.84

      [1] R. Mehran, A. Oyama, and M. Shah, “Abnormal crowd behavior detection using social force model,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009, pp. 935–942.
  70. URLs of the images used in this presentation
     • http://www.fotosearch.com/DGV464/766029/
     • http://www.cyprus-trader.com/images/alert.gif
     • http://security.polito.it/~lioy/img/einstein8ci.jpg
     • http://www.dtsc.ca.gov/PollutionPrevention/images/question.jpg
     • http://www.unmikonline.org/civpol/photos/thematic/violence/streetvio2.jpg
     • http://www.airports-worldwide.com/img/uk/heathrow00.jpg
     • http://www.highprogrammer.com/alan/gaming/cons/trips/genconindy2003/exhibithall-crowd-2.jpg
     • http://www.bhopal.org/fcunited/archives/fcu-crowd.jpg
     • http://img.dailymail.co.uk/i/pix/2006/08/passaPA_450x300.jpg
     • http://www.defenestrator.org/drp/files/surveillance-cameras-400.jpg
     • http://www.cityofsound.com/photos/centre_poin/crowd.jpg
     • http://www.hindu.com/2007/08/31/images/2007083156401501.jpg
     • http://paulaoffutt.com/pics/images/crowd-surfing.jpg
     • http://msnbcmedia1.msn.com/j/msnbc/Components/Photos/070225/070225_surveillance_hmed.hmedium.jpg
     • http://www.inkycircus.com/photos/uncategorized/2007/04/25/eye.jpg
  71. Thanks! Q&A. Mahfuzul.Haque@gmail.com
