One shot scene specific crowd counting

One Shot Scene Specific Crowd Counting
Mohammad Asiful Hossain, Mahesh Kumar K, Mehrdad Hosseinzadeh, Omit Chanda and Yang Wang
Presented by
Hafsa Moontari Ali
ID: 7880835
Course Title: Research Methodology
Winter 2020, University of Manitoba

What is Crowd Counting?
●A technique used to count or estimate the number of people in a crowd.
●The most straightforward solution is to count every person in the crowd individually.
●This becomes difficult when images of the crowd are captured in open areas such as streets or parks.

The problem becomes harder...

Where are the potential application areas?
●Urban planning
●Surveillance
●Traffic monitoring
●Geo-political analysis

Contribution of this paper
●Addressed a novel problem, “one-shot scene-specific crowd counting”.
●Developed a crowd counting model using deep learning.
●Significantly outperformed baseline methods.

Proposed Approach
●A density map is predicted from the input static image.
●Each pixel of the density map indicates the crowd density at the corresponding location in the image.
●The crowd count is obtained by summing the entries of the density map.
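The relation between a density map and a count can be illustrated with a small sketch. It assumes, as is common practice but not stated on the slide, that a ground-truth density map is built by placing one normalized Gaussian blob per annotated head position. The function names, the grid size, the `sigma` value and the head coordinates are all hypothetical and chosen only for illustration.

```python
import math

def density_map(points, height, width, sigma=1.5):
    """Place one unit-mass Gaussian blob per annotated head position (assumed scheme)."""
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        # Unnormalized Gaussian weights over the whole grid.
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        # Normalize so that each person contributes exactly 1 to the map.
        for y in range(height):
            for x in range(width):
                dmap[y][x] += weights[y][x] / total
    return dmap

def crowd_count(dmap):
    """The crowd count is the sum of all entries of the density map."""
    return sum(map(sum, dmap))

heads = [(2, 3), (5, 5), (7, 1)]  # hypothetical (row, column) annotations
dmap = density_map(heads, height=10, width=10)
count = crowd_count(dmap)
print(round(count, 6))
```

Because every blob is normalized to sum to one, the entries of the map add up to the number of annotated people (three here), up to floating-point error.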

Model Architecture
●CSRNet, a dilated convolutional neural network, is used as the backbone.
●It employs a convolutional neural network to extract features.
●A dilated convolutional neural network generates the output from these features.
●The split between encoder and decoder is flexible and application-specific.
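To make the idea of dilation concrete, here is a minimal one-dimensional sketch. It is not CSRNet itself, which uses two-dimensional convolutions inside a deep network; the function `dilated_conv1d` and the toy signal are hypothetical and only show how a dilation rate spreads the kernel taps apart so that the same number of weights covers a wider input window.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D convolution (cross-correlation form) with a dilation rate."""
    # Distance between the first and last kernel tap in the input.
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output sees 3 neighbouring inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output sees inputs spread over 5 positions
print(dense)
print(dilated)
```

With a dilation rate of 2, a three-tap kernel spans five input positions, so the receptive field grows without adding parameters or pooling layers.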

Model Architecture
Figure 1: One-shot scene-specific adaptation using CSRNet

Model Learning
●During training, a collection of labeled training images from multiple scenes is used.
●Each scene might correspond to a camera fixed at a particular location.
●It is assumed that each scene has the same number N of training images.
●The model can be generalized to the case where different scenes have different numbers of training images.
●During training, the model learns the parameters of the encoder network.

One Shot Scene Specific Adaptation
●During testing, the crowd counting algorithm is deployed in a specific target scene.
●In this paper, one-shot learning is applied by fine-tuning the decoder network on a single labeled image from the target scene.
●The distance between the predicted density map and the ground-truth density map is used as the loss function.
●Fine-tuning is done by computing the gradient of this distance with respect to the decoder parameters and updating them.
●The model is thereby adapted to the target scene.
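The adaptation step can be sketched in a few lines under strong simplifying assumptions. Here the encoder is treated as frozen and represented only by precomputed per-pixel features, the decoder is assumed to be a single 1x1 layer (one weight per feature channel plus a bias), and the distance is assumed to be the squared Euclidean distance. The feature values, target densities, learning rate and number of steps are hypothetical.

```python
def decode(features, weights, bias):
    """Assumed decoder: a 1x1 layer mapping each pixel's features to a density value."""
    return [sum(w * f for w, f in zip(weights, feat)) + bias for feat in features]

def loss(pred, target):
    """Squared Euclidean distance between predicted and ground-truth density maps."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def fine_tune_step(features, target, weights, bias, lr):
    """One gradient step on the decoder parameters only; encoder features stay fixed."""
    pred = decode(features, weights, bias)
    residual = [p - t for p, t in zip(pred, target)]
    grad_w = [
        2 * sum(r * feat[c] for r, feat in zip(residual, features))
        for c in range(len(weights))
    ]
    grad_b = 2 * sum(residual)
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b

# Hypothetical frozen encoder features for a 4-pixel image with 2 channels,
# and the ground-truth density of the single labeled target-scene image.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
target = [0.2, 0.1, 0.3, 0.15]

weights, bias = [0.5, 0.5], 0.0  # decoder parameters before adaptation
loss_before = loss(decode(features, weights, bias), target)
for _ in range(20):
    weights, bias = fine_tune_step(features, target, weights, bias, lr=0.05)
loss_after = loss(decode(features, weights, bias), target)
print(loss_before, loss_after)
```

Only the decoder parameters change, which is why a single labeled image can be enough to adapt without retraining the whole network. A real implementation would rely on a deep learning framework's automatic differentiation instead of the hand-written gradient used here.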

Experimental Results
Table 1: Comparison of the performance (MAE and MSE) of our approach and the baselines on the WorldExpo’10 dataset and the Trancos dataset. For “ours” and “simple fine-tuning”, two decoder choices are considered: the last layer or the last two layers of CSRNet.
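The two metrics named in the caption can be computed from per-image counts as sketched below. One caveat stated as an assumption: crowd counting papers commonly report "MSE" as the square root of the mean squared count error, and the slide does not say which convention is used, so the sketch exposes both through a hypothetical `root` flag. The counts are made-up numbers, not results from the paper.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mse(predicted, actual, root=True):
    """Mean squared count error; root=True follows the common crowd-counting convention."""
    mean_sq = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return math.sqrt(mean_sq) if root else mean_sq

# Hypothetical per-image counts for three test images.
predicted_counts = [10.0, 22.0, 35.0]
true_counts = [12.0, 20.0, 35.0]

mae_value = mae(predicted_counts, true_counts)
rmse_value = mse(predicted_counts, true_counts)
plain_mse_value = mse(predicted_counts, true_counts, root=False)
print(mae_value, rmse_value, plain_mse_value)
```

Lower values are better for both metrics; the squared variant penalizes large errors on individual images more heavily than MAE does.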

Cross Dataset Testing
Table 2: Performance in cross-dataset testing with the same (a, b) and different (c, d) objects. “W”, “U”, “M” and “T” denote WorldExpo’10, UCSD, Mall and Trancos, respectively.

Future Work
●This paper attempts to deploy a crowd counting model in real-world applications.
●In the future, this approach can be extended to few-shot learning.
●Meta-learning and meta-auxiliary learning can also be employed.
●This approach can be extended to unsupervised learning.
