An FPGA based human detection system with embedded platform
Pei-Yung Hsiao a,*, Shih-Yu Lin a, Shih-Shinh Huang b
a Department of Electrical Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, ROC
b Department of Computer and Communication Eng., Nat'l Kaohsiung First Univ. of Science and Technology, Taiwan, ROC
Article info
Article history:
Received 27 August 2014
Received in revised form 23 December 2014
Accepted 17 January 2015
Available online 29 January 2015
Keywords:
FPGA circuit design
Real-time embedded system
Human detection
HOG
SVM
Adaboost
Abstract
To reach the real-time requirement of a practical machine-learning-based human detection system at the testing (detecting) stage on an embedded platform, the idea of iteratively computing HOG with an FPGA circuit design is proposed. The completed HOG accelerator contains a gradient calculation circuit module and a histogram accumulation circuit module. The linear SVM classification algorithm, which produces the necessary weak classifiers, is combined with the Adaboost algorithm to establish a strong classifier. The human detection system is successfully implemented on a portable embedded platform to reduce system cost and size. Experimental results show that the accuracy error is merely about 0.1–0.4% when comparing the presented FPGA based HW/SW co-design against the PC based pure software. Meanwhile, the computing speed achieves the requirement of a real-time embedded system, 15 fps.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Human and pedestrian detection technologies have attracted great attention in the fields of intelligent transportation systems, computer vision, and perceptive surveillance systems in the past years [1–5]. As is well known, various machine learning algorithms have been proposed to solve the problem of human detection, among which the local feature vector of histograms of oriented gradients (HOG), proposed by Dalal et al. [1] in 2005, has become the most cited human descriptive feature [2–6].
Although many researchers have aimed at designing FPGA based circuits for image processing, detection, and other related applications [7–8], only a few works have focused on developing a hardware circuit for HOG in recent years [4–6]. Yet those few works did not present a whole system built on an embedded platform to achieve a real-time HW/SW co-design system for human detection. Besides, detailed comparisons of computation speed and detection error rate among an FPGA accelerator, an embedded platform, and a personal computer have not appeared in the literature.
In this study, human detection is successfully implemented on a portable embedded platform to reduce system cost and size. Moreover, the computing speed of the testing stage achieves about 15 frames per second, which fully matches the requirement of a real-time embedded system [7].
2. Principles and FPGA design
2.1. Human detection algorithm
The human detection algorithm covers a training stage and a testing (also called detecting) stage, as shown in Fig. 1. Both the SVM [9–10] and Adaboost [3] algorithms are required at the training stage, whereas only the SVM algorithm is used at the testing stage. In our experiments, both still image pictures and videos are utilized as input at the two stages, separately. Two public image datasets are collected for the still images, while two public videos and one additional video, shot by us with a scene set up in our laboratory, are chosen for the video input.
Before entering the training stage, scene images need to be manually collected as positive samples (human) and negative samples (non-human) with a resolution of 64 × 128, the size of a detecting window [1]. To reduce the scanning range of the detecting window at the testing stage, the input image frame first undergoes foreground segmentation to acquire the region of interest (ROI) and thereby diminish the computation time. Because the human objects in the image frame change size with the distance between camera and object, detecting windows of different sizes need to be scaled up/down.
The system judges whether there is a human in each detecting window with the One Detecting Window Strong Classifier Module. The module contains two computing steps. First, all HOG vectors corresponding to the weak classifiers that constitute the strong classifier are calculated; for instance, a strong classifier with 40 weak classifiers requires 40 HOG computations. Second, all weak classifiers perform one SVM prediction, using the SVM model file acquired at the training stage, to identify the object inside the detecting window as human or non-human.

http://dx.doi.org/10.1016/j.mee.2015.01.018
0167-9317/© 2015 Elsevier B.V. All rights reserved.
* Corresponding author. E-mail address: pyhsiao@nuk.edu.tw (P.-Y. Hsiao).
Microelectronic Engineering 138 (2015) 42–46. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/mee
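The two-step decision above can be sketched in software as follows. This is a minimal illustration with names of our own choosing, not the authors' implementation; it assumes each weak classifier is a linear SVM over the 36-D HOG of one block, and that Adaboost supplies a weight for each weak classifier.

```python
# Hedged sketch of the One Detecting Window Strong Classifier Module:
# weak classifiers are linear SVMs on per-block HOG vectors, combined
# by AdaBoost weights (names here are illustrative assumptions).

def weak_classify(hog36, w, b):
    """Linear-SVM weak classifier: +1 (human-like block) or -1."""
    score = sum(x * wi for x, wi in zip(hog36, w)) + b
    return 1 if score >= 0 else -1

def strong_classify(block_hogs, weak_classifiers):
    """block_hogs[t] is the 36-D HOG for the block used by weak classifier t;
    weak_classifiers[t] = (w, b, alpha) with AdaBoost weight alpha.
    Result >= 0 is reported as human (cf. 'If Result >= 0' in Fig. 1)."""
    result = sum(alpha * weak_classify(h, w, b)
                 for h, (w, b, alpha) in zip(block_hogs, weak_classifiers))
    return result >= 0
```

Note that one HOG computation is needed per weak classifier, which is why the paper accelerates HOG iteratively in hardware rather than the single SVM prediction.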
2.2. Histograms of oriented gradients
As mentioned, HOG, proposed by Dalal et al. [1], is the most cited human local feature. The idea is to use a 36-D vector as the descriptive feature in human detection, representing the contour and appearance information of an object in an image block or in a detecting window. For the hardware design of HOG on FPGA, based on the design principles of simplicity, regularity, and modularity, our circuit architecture presents four arithmetic modules, described below.
2.2.1. Gradient components and gradient magnitude
To acquire the vertical and horizontal gradient components, vertical and horizontal differential operations are first performed on a target block. In other words, the mask [−1, 0, 1] or [−1, 0, 1]^T is used for the convolution operation, through Eq. (1), to calculate Gx and Gy, whose values range between −255 and +255.

Gx(x, y) = f(x + 1, y) − f(x − 1, y)
Gy(x, y) = f(x, y + 1) − f(x, y − 1)    (1)

The above Gx and Gy are used to calculate the gradient magnitude with Eq. (2), i.e., the square root of the sum of the squares. The resultant values appear in 0–357.

∇f(x, y) = sqrt(Gx(x, y)^2 + Gy(x, y)^2)    (2)
2.2.2. Gradient orientation
The above Gx and Gy are again used to calculate the gradient orientation with Eq. (3). The results are converted to angles between 0° and 180°, i.e., the unsigned gradient orientation.

θ(x, y) = tan^−1(Gy(x, y) / Gx(x, y))    (3)
2.2.3. Accumulated histogram
After obtaining the gradient magnitude and orientation, the four cells equally divided from a block are separately accumulated to produce four 9-D vectors, which are combined into a 36-D vector v = (v1, v2, ..., v36), as shown in Fig. 2. The gradient orientation is segmented with 20° per bin in a cell, giving a total of 9 bins.
2.2.4. L2 normalization
The acquired 36-D vector is L2-normalized via Eq. (4) so that each component value lies in 0–1, where ε is a small constant.

vi = vi / sqrt(||v||₂² + ε²),  i = 1, 2, ..., 36    (4)
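Sections 2.2.1–2.2.4 can be condensed into a short software sketch. This is our own illustration under stated assumptions, not the FPGA code: a 16 × 16 block, [−1, 0, 1] masks with boundary pixels skipped (as the shrinking manipulation in Section 3.2 suggests), magnitude-weighted votes into 9 unsigned-orientation bins per 8 × 8 cell, and L2 normalization of the concatenated 36-D vector.

```python
import math

def block_hog(block, eps=1e-3):
    """block: 16x16 list of grey values (0..255). Returns the 36-D HOG vector."""
    hist = [[0.0] * 9 for _ in range(4)]           # 4 cells x 9 bins
    for y in range(1, 15):                         # boundary pixels skipped
        for x in range(1, 15):
            gx = block[y][x + 1] - block[y][x - 1] # Eq. (1)
            gy = block[y + 1][x] - block[y - 1][x]
            mag = math.sqrt(gx * gx + gy * gy)     # Eq. (2)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0   # unsigned, Eq. (3)
            b = min(int(ang // 20), 8)             # nine 20-degree bins
            cell = (y // 8) * 2 + (x // 8)         # which of the four 8x8 cells
            hist[cell][b] += mag
    v = [h for cell in hist for h in cell]         # 36-D vector (Fig. 2)
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)      # Eq. (4)
    return [x / norm for x in v]
```

A flat block yields an all-zero vector, while a vertical edge concentrates its energy in the 0° bins of the four cells; the normalized vector has (near-)unit L2 norm whenever the votes dominate ε.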
2.3. FPGA modular circuit design
The block diagram of the developed FPGA circuits is shown in Fig. 3. The HOG vector generator module is illustrated in the lower-right sub-block of the whole block diagram and contains four circuit sub-modules. The full buffering scheme shown in the upper-left part of the HOG vector generator module is designed for adjusting the flow of input block pixels. The gradient module calculates the gradient components, gradient magnitude, and gradient orientation. The histogram module accumulates the histograms of the four cells. In addition, the histogram PISO in the lower-left part buffers the output of the 36-D HOG accelerator. Moreover, the testing and communication circuits are placed in the left part outside the HOG vector generator module in Fig. 3, so as to control signals and share data with the embedded ARM CPU.
The gradient module covers three sub-modules: GradientComponents for calculating the gradient components, ComponentsToMagnitude for producing the gradient magnitude, and ComponentsToOrientation for generating the gradient orientation. Before the image data are input to these sub-modules, a data buffering scheme is required, and an appropriate computation for simplifying floating point manipulation is necessary before designing the ComponentsToOrientation sub-module circuit. The gradient orientation comes in two types, signed and unsigned, of which the latter is used in this study. When accumulating the orientation, a range of 20° is used as the partition basis. By applying Eq. (5), with just one multiplier per the simple and regular hardware implementation principle, each bin value is defined as the accumulated number of pixels in a cell whose orientations fall in the same 20° range; there are therefore nine angle bins in a cell. To effectively reduce the proportion of accumulation deflection, a 32-bit register is utilized for a 2^20 magnification.

Gx(x, y) tan(θi) ≤ Gy(x, y) < Gx(x, y) tan(θi+1)    (5)

Fig. 1. Machine learning based human detection system.
Fig. 2. Combining four 9D histograms to an extracted HOG feature of 36D histogram.
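The comparison scheme of Eq. (5) can be sketched in software as follows. This is a hedged illustration of the idea, not the RTL: the 2^20 magnification and the sign handling for unsigned orientation are modeled here with plain Python integers, and the exact boundary conventions are our assumption.

```python
import math

SHIFT = 20                       # 2^20 magnification into a 32-bit register
SCALE = 1 << SHIFT
# Q20 fixed-point tan() at the lower bin edges 0, 20, 40, 60, 80 degrees
TAN_Q20 = [round(math.tan(math.radians(a)) * SCALE) for a in (0, 20, 40, 60, 80)]

def orientation_bin(gx, gy):
    """Map (Gx, Gy) to one of nine 20-degree bins of the unsigned
    orientation (0-180 degrees) using only multiplies and compares,
    in the spirit of Eq. (5); no arctan is evaluated."""
    if gx < 0:                   # unsigned orientation: (Gx, Gy) ~ (-Gx, -Gy)
        gx, gy = -gx, -gy
    if gx == 0:                  # vertical gradient: 90 degrees lies in bin 4
        return 4
    if gy >= 0:                  # angle in [0, 90): largest edge not above it
        for i in (4, 3, 2, 1, 0):
            if gy * SCALE >= gx * TAN_Q20[i]:
                return i
    for i in (4, 3, 2, 1):       # angle in (90, 180): mirrored, strict compare
        if -gy * SCALE > gx * TAN_Q20[i]:
            return 8 - i
    return 8                     # remaining angles fall in the 160-180 bin
```

In hardware, the multiplications by the precomputed tan() constants replace an arctan unit, which is the point of Eq. (5).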
Besides the three sub-modules in the gradient module, two types of sub-modules, BlockTo4Cells and Vote9DVector, make up the histogram module. The former judges which of the four cells each set of orientation bin and magnitude belongs to. It then combines the two signals, OrientationBin (4-bit) and Magnitude (9-bit), as illustrated in the uppermost part of Fig. 4, into one 13-bit signal. Finally, it delivers the combined 13-bit signal to four parallel sub-modules of the latter type, Vote9Dvector1–Vote9Dvector4 as shown in Fig. 3, in order according to the four cells, with a 1-to-4 demultiplexer. Consequently, each Vote9Dvector sub-module accumulates each set of orientation bin and magnitude into its target cell. The circuit design of the Vote9Dvector sub-module is shown in Fig. 4.
3. Experiment results
3.1. Public image datasets
Two public image datasets, CBCL [11] and CVC [12], are selected for the training and testing stages. The image frames used in the training stage are not reused in the testing stage, in order to guarantee the effectiveness and persuasiveness of the test results, i.e., detection rate and accuracy.
The two datasets naturally present distinct characteristics. After manual treatment, several hundred to a few thousand samples are selected for training, and other, different samples are used one by one for testing. The numbers of selected positive and negative samples are shown in Table 1. A total of 924 positive samples are selected from CBCL for the first experiment, listed in the CBCL row of Table 1: half are used in the training stage and the other half in the testing stage. Because CBCL contains no negative samples, 1038 negative samples are taken from CVC and combined with the first half of the 924 positive samples for training. Similarly, 1359 negative samples are taken from CVC and combined with the second half for testing.
Besides, CVC contains a total of 3356 positive samples. For the second experiment, listed in the CVC row of Table 1, 571 positive samples are picked for training and another 571 different positive samples for testing. For the negative samples from CVC, a total of 4096 negative samples are segmented out, from which 1326 are picked for the training stage and a different 2048 are selected for the testing stage.
3.2. HOG accelerator
The completed HOG accelerator, implemented as a Xilinx FPGA circuit module, reaches a maximum frequency of 192 MHz. The number of cycles for computing one HOG can be formulated as
#cycles = BlockWidth + BlockWidth × BlockHeight + 58.
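As a sanity check (our own arithmetic, using the cycle formula and the 192 MHz figure above), the formula reproduces the 0.001718 ms per HOG later reported in Table 2 for a 16 × 16-pixel block:

```python
# Check the cycle formula and the per-HOG latency at the 192 MHz maximum clock.
def hog_cycles(block_width: int, block_height: int) -> int:
    """#cycles = BlockWidth + BlockWidth * BlockHeight + 58."""
    return block_width + block_width * block_height + 58

cycles = hog_cycles(16, 16)            # 16 + 256 + 58 = 330 cycles
time_ms = cycles / 192e6 * 1e3         # 330 cycles at 192 MHz
# time_ms ~= 0.00172 ms, consistent with the 0.001718 ms in Table 2
```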
In the gradient component module, the boundary of the block image for the gradient convolution is processed with a shrinking manipulation. Moreover, the circuit computation in the ComponentsToMagnitude module takes integer rather than floating point values. These two factors produce the difference between the detection rates reached by the FPGA based HW/SW co-design on the embedded platform and by the execution of pure software on a PC.
Fig. 3. Block diagram for the architecture of our modular circuits for human detection (SMIMS MaCube embedded platform: Colibri T20 with Tegra 2 CPU running Linux software; XC6SLX150T FPGA hardware hosting the one-HOG vector generator module with full buffering, gradient module, histogram module, and histogram PISO; 4096 × 16-bit read/write buffers on the generic memory interface (GMI) data bus; and the SMIMS engine for control and FPGA programming via USB).
Fig. 4. One of four parallel Vote9DVector sub-module circuits (a demultiplexer, selected by OrientationBin[3:0], routes the Magnitude[8:0] input to one of nine 1D-HOG components, each an adder with a register, producing Output1–Output9).
Table 1
The numbers of positive or negative samples selected from two public image datasets.
DataSet | Training/testing | #Pos. samples | #Neg. samples | Total #
CBCL | Training | 462 | 1038 (CVC) | 1500
CBCL | Testing | 462 | 1359 (CVC) | 1821
CVC | Training | 571 | 1326 | 1897
CVC | Testing | 571 | 2048 | 2619
Table 2
Comparison of computation time for one HOG.
Computation basis | Spec. | OS | Speed for one HOG (rate)
PC (SW) | i7-3770/3.4 GHz, DDR3/8 GB | Win7 | 0.035393 ms (20.601)
ARM Colibri T20 (SW) | Tegra 2/1.0 GHz, DDR2/512 MB | Linux | 0.122800 ms (71.478)
FPGA (HW) | Xilinx XC6SLX150T/192 MHz | None | 0.001718 ms (1)
Based on the accuracy results in Table 3, the experiments reveal that the error of detection rate is kept within 0.1–0.4% for the CBCL and CVC datasets, respectively: accuracy decreases of 0.1% (98.5% to 98.4%) and 0.4% (97.4% to 97.0%) are obtained from the last row of Table 3. Detailed detection rate analyses and statistics are described in the next sub-section. The comparison of computing speed for one HOG, however, is given first in Table 2. The computation time of one HOG for a block of 16 × 16 pixels with the designed FPGA hardware circuit takes merely 0.001718 ms. That is, the proposed one-HOG hardware circuit is about 20 times faster than the software computation on a PC and about 71 times faster than the ARM CPU based software on the embedded platform. The computation time of the entire human detection system is further described in the following sub-section.
3.3. Detection rate and computation speed
Various measures can be utilized for calculating the detection rate. In this experiment, three statistical measures are applied to the experiments on the above CBCL and CVC public datasets: positive predictive value (PPV, or precision), true positive rate (TPR, or recall), and accuracy. PPV stands for the probability that a labeled (detected) human is a real human; TPR represents the probability that a human image is identified as human, i.e., the so-called detection rate (or recall); accuracy refers to the proportion of all human and non-human samples being correctly classified.
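The three measures above follow directly from the TP/TN/FP/FN counts later reported in Table 3; a minimal sketch, checked against the CBCL "Pure SW" column (TP = 436, TN = 1359, FP = 0, FN = 26):

```python
# PPV (precision), TPR (recall / detection rate), and accuracy from counts.
def detection_metrics(tp, tn, fp, fn):
    ppv = tp / (tp + fp)                         # detected humans that are real
    tpr = tp / (tp + fn)                         # real humans that are detected
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # all samples correctly classified
    return ppv, tpr, accuracy

ppv, tpr, acc = detection_metrics(436, 1359, 0, 26)
# ppv = 1.0 (100%); tpr and acc agree with the 94.3% and 98.5% of Table 3
# to within rounding.
```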
Comparing the execution of pure software on the embedded HW/SW co-design platform against replacing the HOG module with the FPGA hardware circuit, the above detection rate measures are computed in our experiments. The experimental results are shown in Table 3, where the errors of PPV, TPR, and accuracy between the FPGA accelerated HOG computation and pure software run on the embedded platform or on the PC stay below 0.9%, 0.5%, and 0.4%, respectively. This shows that our HOG hardware computation gives high precision and introduces only small errors in the various detection rate measures in terms of the whole system.
What is more, the computing time required for processing one detecting window at the testing stage is measured and compared, as shown in Table 4. The time observed for the 22–33 iterative HOG calculations is much larger than that for the SVM module. Apparently, designing iteratively used HOG hardware in machine learning based human detection is of higher importance and necessity than SVM hardware [10]. Besides, the HOG time per detecting window at the testing stage, with iterative computation in hardware, is effectively reduced below 1 ms, namely 0.922 ms or 0.882 ms in Table 4. In other words, our FPGA based human detection system can deal with about 1080 detecting windows per second. When 72 detecting windows are applied in each image frame, the system successfully achieves the real-time embedded system requirement of about 15 frames per second. The per-detecting-window speed comparison in Table 4 thus demonstrates our outcome more convincingly than the single-use HOG hardware comparison in Table 2.
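The frame-rate claim can be checked from the per-window totals of Table 4 (the HOG/FPGA rows), assuming the 72 detecting windows per frame stated above; this is our own arithmetic, not a measurement from the paper:

```python
# Windows/s and frames/s from the per-window totals of Table 4.
def frames_per_second(total_ms_per_window: float, windows_per_frame: int = 72) -> float:
    windows_per_second = 1000.0 / total_ms_per_window
    return windows_per_second / windows_per_frame

fps_cbcl = frames_per_second(0.951)    # CBCL, embedded SW with HOG/FPGA
fps_cvc = frames_per_second(0.910)     # CVC, embedded SW with HOG/FPGA
# Both land near the stated 15 fps real-time requirement.
```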
Compared with Dalal's detector [1] on the same CBCL and CVC datasets of Table 1, our detector confirms a slight lift in performance. On average, the PPV, TPR, and accuracy of this system are 0.3%, 1.4%, and 0.4% better than Dalal's detector, respectively, as listed in Table 5.
3.4. Implementation on a real-time embedded platform
In this research, the real-time embedded development platform MaCube, consisting of a Colibri T20 ARM module with an NVIDIA Tegra-2 dual-core Cortex-A9 microprocessor running at 2 × 1.0 GHz, is employed with the Linux OS. Meanwhile, the HW/SW integration environment is established using the Tegra-2 generic memory interface (GMI) data bus and a Xilinx Spartan-6 LX150T FPGA chip.
An interface engine IC is allocated between the ARM SoC and the FPGA, in charge of the procedure control between ARM and FPGA and able to download our designed circuit files to the FPGA. To accelerate data access, the MaCube platform uses DMA for transferring data to/from the FPGA chip, which makes the HW/SW integration more efficient.
Table 3
Detection rate and accuracy for our embedded H/S co-design and pure software human detection systems.
H/S co-design vs. SW | CBCL Pure SW | CBCL Embedded SW with HOG/FPGA | CVC Pure SW | CVC Embedded SW with HOG/FPGA
# Weak classifiers | 33 | 23 | 24 | 22
TP | 436 | 435 | 512 | 509
TN | 1359 | 1358 | 2039 | 2034
FP | 0 | 1 | 9 | 14
FN | 26 | 27 | 59 | 62
PPV | 100% | 99.7% | 98.2% | 97.3%
TPR | 94.3% | 94.1% | 89.6% | 89.1%
Accuracy | 98.5% | 98.4% | 97.4% | 97.0%
Table 4
Comparison of computation efficiency per one detecting window.
Speed vs. HOG & SVM | # Weak classifiers | HOG (ms) | SVM (ms) | Total (ms)
CBCL Embedded SW | 33 | 4.104 | 0.047 | 4.151
CBCL Embedded SW with HOG/FPGA | 23 | 0.922 | 0.029 | 0.951
CVC Embedded SW | 24 | 2.363 | 0.030 | 2.393
CVC Embedded SW with HOG/FPGA | 22 | 0.882 | 0.028 | 0.910
Table 5
Detection rate comparison between Dalal's and our detector.
DataSets | CBCL (SW) (%) | CVC (SW) (%)
Dalal PPV | 99.5 | 98.1
Dalal TPR | 93.1 | 88.0
Dalal Accuracy | 98.2 | 96.9
Ours PPV | 100 | 98.2
Ours TPR | 94.3 | 89.6
Ours Accuracy | 98.5 | 97.4
Fig. 5. Video demonstration of our FPGA based human detection system running on a real-time embedded platform. (a) and (b) Caviar video; (c) and (d) AVSS 2007 video; (e) and (f) our video.
Memory mapping is applied in the user program to correspond with the source and target ends of the DMA data. A DMA transmission is 4096 × 16 bits at a time. The user program writes the data into the write buffer at the source end, calls the driver to start DMA, and then transmits the data into the FPGA; control is returned to the user program after a successful transmission. For the reverse direction of data transmission, i.e., to read data from the FPGA, the driver is first called to start DMA so as to read the FPGA data and write it into the read buffer.
To demonstrate the final results of this study, videos of three different scenes, the public Caviar video [13], the public AVSS 2007 video [14], and the self-shot video from our laboratory, are processed as dynamic dataset experiments, as shown in Fig. 5, where an object inside a detecting window automatically detected as a human by this system is labeled with a red rectangle.
4. Conclusion
In order to get rid of a PC-based computing environment and
transfer to the embedded platform as well as to speed up the com-
puting speed of a bottleneck module, the FPGA based HOG vector
generator is successfully accomplished as a modular circuit design.
The completed HOG accelerator contains two circuit modules of
gradient calculation and histogram accumulation. With iterative
used HOG hardware, accuracy changing rate of the FPGA based
human detection system in comparison with that of pure software
appears merely within 0.4%, which presenting a very small effect of
the accelerating design on the detection rate. Meanwhile, the com-
pleted FPGA based human detection system could process about
1075 detecting windows in a second. In other words, it successfully
achieves the requirement of a real-time embedded system of about
15 fps.
Acknowledgments
This research is partially sponsored by the projects MOST 103-2221-E-390-028-MY2 and NSC 102-2221-E-390-026.
References
[1] N. Dalal, B. Triggs, Proc. IEEE Conf. Comput. Vision Pattern Recognit. 1 (2005)
886–893.
[2] D. Gerónimo, A.M. López, A.D. Sappa, T. Graf, IEEE Trans. Pattern Anal. Mach. Intell. 32 (2010) 1239–1258.
[3] Q. Zhu, M.-C. Yeh, K.-T. Cheng, S. Avidan, Proc. IEEE Conf. Comput. Vision Pattern Recognit. 2 (2006) 1491–1498.
[4] P.Y. Chen, C.C. Huang, C.Y. Lien, Y.H. Tsai, IEEE Trans. Intell. Transp. Syst. 15 (2)
(2014) 656–662.
[5] S. Bauer, U. Brunsmann, S. Schlotterbeck-Macht, In: MPC Workshop, (2009) pp.
49–58.
[6] R. Kadota, H. Sugano, M. Hiromoto, H. Ochi, R. Miyamoto, Y. Nakamura, Proc.
IIH-MSP IEEE (2009) 1330–1333.
[7] P.Y. Hsiao, C.H. Chen, H. Wen, S.J. Chen, IEE Proc. Comput. Digit. Tech. 153 (4)
(2006) 1871–1874.
[8] K.G. Gokhan, S. Afsar, Microprocess. Microsyst. 37 (3) (2013) 270–286.
[9] R.E. Fan, K.W. Chang, C.J. Hsieh, X.R. Wang, C.J. Lin, J. Machine Learning
Research 9 (2008) 1871–1874. <http://www.csie.ntu.edu.tw/~cjlin/liblinear>.
[10] D. Anguita, A. Boni, S. Ridella, IEEE Trans. Neural Networks 14 (5) (2003) 993–
1009.
[11] CBCL Pedestrian DB, <http://cbcl.mit.edu/software-datasets>.
[12] CVC Virtual Dataset, <http://www.cvc.uab.es/adas/databases>.
[13] CAVIAR, (2001) <http://homepages.inf.ed.ac.uk/rbf/CAVIAR>.
[14] AVSS, <http://www.eecs.qmul.ac.uk/~andrea/avss2007_d.html>.
CLA_Newsletter_Spring_2015CLA_Newsletter_Spring_2015
CLA_Newsletter_Spring_2015Kelsey Johnson
 
CosmicMedicCharacter/Tech/Creature Design
CosmicMedicCharacter/Tech/Creature DesignCosmicMedicCharacter/Tech/Creature Design
CosmicMedicCharacter/Tech/Creature DesignDaniel Hoisch
 
Fall_2016_Fur_Linesheets_part1
Fall_2016_Fur_Linesheets_part1Fall_2016_Fur_Linesheets_part1
Fall_2016_Fur_Linesheets_part1Mehtab Badwal
 
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®DGCommunications
 
A Robertson & Son - Report 2004
A Robertson & Son - Report 2004A Robertson & Son - Report 2004
A Robertson & Son - Report 2004Graham Robertson
 
綠能27植萃粉 網路版
綠能27植萃粉 網路版綠能27植萃粉 網路版
綠能27植萃粉 網路版蔓繻 林
 
Animal Euthanasia by gassing
Animal Euthanasia by gassingAnimal Euthanasia by gassing
Animal Euthanasia by gassingsydneyhardrath
 
земная жизнь христа
земная жизнь христаземная жизнь христа
земная жизнь христаmaria_kostyk
 

Viewers also liked (18)

Бодряева Любовь Васильевна
Бодряева Любовь ВасильевнаБодряева Любовь Васильевна
Бодряева Любовь Васильевна
 
Syllabus
SyllabusSyllabus
Syllabus
 
Pronouns
PronounsPronouns
Pronouns
 
An Introduction to Teaching With Social Media
An Introduction to Teaching With Social MediaAn Introduction to Teaching With Social Media
An Introduction to Teaching With Social Media
 
00. pti introduction
00. pti   introduction00. pti   introduction
00. pti introduction
 
Study in ukraine cost
Study in ukraine costStudy in ukraine cost
Study in ukraine cost
 
Non-Fiction Reading Strategies
Non-Fiction Reading StrategiesNon-Fiction Reading Strategies
Non-Fiction Reading Strategies
 
Patent Design
Patent DesignPatent Design
Patent Design
 
CLA_Newsletter_Spring_2015
CLA_Newsletter_Spring_2015CLA_Newsletter_Spring_2015
CLA_Newsletter_Spring_2015
 
CosmicMedicCharacter/Tech/Creature Design
CosmicMedicCharacter/Tech/Creature DesignCosmicMedicCharacter/Tech/Creature Design
CosmicMedicCharacter/Tech/Creature Design
 
6 Ways to Save Your Hearing
6 Ways to Save Your Hearing6 Ways to Save Your Hearing
6 Ways to Save Your Hearing
 
Fall_2016_Fur_Linesheets_part1
Fall_2016_Fur_Linesheets_part1Fall_2016_Fur_Linesheets_part1
Fall_2016_Fur_Linesheets_part1
 
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®
Proper Pool and Fountain Maintenance | Tips from The Grounds Guys®
 
A Robertson & Son - Report 2004
A Robertson & Son - Report 2004A Robertson & Son - Report 2004
A Robertson & Son - Report 2004
 
綠能27植萃粉 網路版
綠能27植萃粉 網路版綠能27植萃粉 網路版
綠能27植萃粉 網路版
 
Animal Euthanasia by gassing
Animal Euthanasia by gassingAnimal Euthanasia by gassing
Animal Euthanasia by gassing
 
земная жизнь христа
земная жизнь христаземная жизнь христа
земная жизнь христа
 
The Top 5 Hearing Aid Myths Exposed
The Top 5 Hearing Aid Myths ExposedThe Top 5 Hearing Aid Myths Exposed
The Top 5 Hearing Aid Myths Exposed
 

Similar to Fpga human detection

A high frame-rate of cell-based histogram-oriented gradients human detector a...
A high frame-rate of cell-based histogram-oriented gradients human detector a...A high frame-rate of cell-based histogram-oriented gradients human detector a...
A high frame-rate of cell-based histogram-oriented gradients human detector a...IAESIJAI
 
IRJET - Face Recognition based Attendance System
IRJET -  	  Face Recognition based Attendance SystemIRJET -  	  Face Recognition based Attendance System
IRJET - Face Recognition based Attendance SystemIRJET Journal
 
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)IRJET Journal
 
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHM
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHMCOMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHM
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHMsipij
 
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithm
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection AlgorithmComparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithm
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithmsipij
 
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...ijcsity
 
Flow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionFlow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionIRJET Journal
 
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...Sergio Orts-Escolano
 
A Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign RecognitionA Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign RecognitionIRJET Journal
 
On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...UniversitasGadjahMada
 
Foreground algorithms for detection and extraction of an object in multimedia...
Foreground algorithms for detection and extraction of an object in multimedia...Foreground algorithms for detection and extraction of an object in multimedia...
Foreground algorithms for detection and extraction of an object in multimedia...IJECEIAES
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsIRJET Journal
 
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...Vladimir Kulyukin
 
Implementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time VideoImplementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time VideoIDES Editor
 
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENT
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENTREAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENT
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENTcscpconf
 

Similar to Fpga human detection (20)

A high frame-rate of cell-based histogram-oriented gradients human detector a...
A high frame-rate of cell-based histogram-oriented gradients human detector a...A high frame-rate of cell-based histogram-oriented gradients human detector a...
A high frame-rate of cell-based histogram-oriented gradients human detector a...
 
imagefiltervhdl.pptx
imagefiltervhdl.pptximagefiltervhdl.pptx
imagefiltervhdl.pptx
 
IRJET - Face Recognition based Attendance System
IRJET -  	  Face Recognition based Attendance SystemIRJET -  	  Face Recognition based Attendance System
IRJET - Face Recognition based Attendance System
 
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)
ROAD SIGN DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)
 
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHM
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHMCOMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHM
COMPARISON OF GPU AND FPGA HARDWARE ACCELERATION OF LANE DETECTION ALGORITHM
 
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithm
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection AlgorithmComparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithm
Comparison of GPU and FPGA Hardware Acceleration of Lane Detection Algorithm
 
IMQA Paper
IMQA PaperIMQA Paper
IMQA Paper
 
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM...
 
Kk3517971799
Kk3517971799Kk3517971799
Kk3517971799
 
Flow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action RecognitionFlow Trajectory Approach for Human Action Recognition
Flow Trajectory Approach for Human Action Recognition
 
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...
 
A Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign RecognitionA Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign Recognition
 
On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...
 
K0445660
K0445660K0445660
K0445660
 
Cuda project paper
Cuda project paperCuda project paper
Cuda project paper
 
Foreground algorithms for detection and extraction of an object in multimedia...
Foreground algorithms for detection and extraction of an object in multimedia...Foreground algorithms for detection and extraction of an object in multimedia...
Foreground algorithms for detection and extraction of an object in multimedia...
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...
Vision-Based Localization and Scanning of 1D UPC and EAN Barcodes with Relaxe...
 
Implementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time VideoImplementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time Video
 
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENT
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENTREAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENT
REAL-TIME PEDESTRIAN DETECTION USING APACHE STORM IN A DISTRIBUTED ENVIRONMENT
 

Recently uploaded

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonJericReyAuditor
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 

Recently uploaded (20)

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lesson
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 

Fpga human detection

As is well known, various machine learning algorithms have been proposed to solve the human detection problem, among which the local feature vector of histograms of oriented gradients (HOG), proposed by Dalal et al. [1] in 2005, has become the most widely cited human descriptive feature [2–6].

Although many researchers have designed FPGA based circuits for image processing, detection, and related applications [7–8], only a few studies in recent years have focused on hardware circuits for HOG [4–6]. Moreover, those few works did not present a complete system built on an embedded platform to achieve a real-time HW/SW co-design for human detection, and detailed comparisons of computation speed and detection error rate among an FPGA accelerator, an embedded platform, and a personal computer have not appeared in the literature.

In this study, human detection is successfully implemented on a portable embedded platform to reduce system cost and size. Moreover, the computing speed of the testing stage reaches about 15 frames per second, which fully satisfies the requirement of a real-time embedded system [7].

2. Principles and FPGA design

2.1. Human detection algorithm

The human detection algorithm comprises a training stage and a testing (or detecting) stage, as shown in Fig. 1. Both SVM [9–10] and Adaboost [3] are required at the training stage, whereas only the SVM algorithm is used at the testing stage. In our experiments, still images and videos serve as input to the two stages: two public image datasets are collected for the training and testing stages, while two public videos and one additional video are chosen for testing. The additional video was shot by us with a scene set up in our laboratory.
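As a minimal behavioral sketch of this division of labor, the testing-stage strong classifier can be viewed as a weighted vote over the weak linear SVM classifiers selected by Adaboost at training time, with a detecting window declared human when the accumulated result is non-negative. All names and parameter values below are illustrative assumptions; in the actual system the coefficients come from the SVM model file and the strong classifier file produced at training.

```python
def weak_response(hog_vec, w, b):
    """Linear SVM decision for one block: sign of the dot product w . h + b."""
    s = sum(wi * hi for wi, hi in zip(w, hog_vec)) + b
    return 1.0 if s >= 0 else -1.0

def strong_classify(hog_vecs, weak_classifiers):
    """Adaboost-style strong classifier: weighted vote of weak classifiers.

    hog_vecs: one HOG vector per weak classifier (one per selected block).
    weak_classifiers: list of (alpha, w, b) tuples from the training stage.
    Returns True (human) when the weighted sum is non-negative.
    """
    result = sum(alpha * weak_response(h, w, b)
                 for h, (alpha, w, b) in zip(hog_vecs, weak_classifiers))
    return result >= 0
```

With, say, 40 weak classifiers, one call to `strong_classify` consumes 40 block-level HOG vectors, matching the one-HOG-computation-per-weak-classifier structure of the detection module.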
http://dx.doi.org/10.1016/j.mee.2015.01.018 0167-9317/© 2015 Elsevier B.V. All rights reserved. ⁎ Corresponding author. E-mail address: pyhsiao@nuk.edu.tw (P.-Y. Hsiao). Microelectronic Engineering 138 (2015) 42–46. Contents lists available at ScienceDirect: Microelectronic Engineering, journal homepage: www.elsevier.com/locate/mee

Before entering the training stage, scene images must be manually collected as positive samples (human) and negative samples (non-human) at the detecting-window resolution of 64 × 128 [1]. At the testing stage, to reduce the scanning range of the detecting window, the input image frame first undergoes foreground segmentation to acquire the region of interest (ROI) and thereby reduce computation time. Since the apparent size of a human in the image frame changes with the distance between camera and object, detecting windows of different sizes must be scaled up or down. The system judges whether each detecting window contains a human by means of the One Detecting Window Strong Classifier Module. The module performs two computing steps. First, all HOG vectors, corresponding to all the weak classifiers built
as a strong classifier, are calculated. For instance, a strong classifier with 40 weak classifiers requires 40 HOG calculations. Second, all weak classifiers perform one SVM predict, using the SVM model file acquired from the training stage, to identify the object inside the detecting window as human or non-human.

2.2. Histograms of oriented gradients

As is well known, HOG proposed by Dalal et al. [1] is the most widely cited local human feature. The idea is to use a 36D vector as the descriptive feature in human detection, representing the contour and appearance information of an object in an image block or detecting window. For the hardware design of HOG on FPGA, following the design principles of simplicity, regularity, and modularity, our circuit architecture comprises the four arithmetic modules described below.

2.2.1. Gradient components and gradient magnitude

To acquire the vertical and horizontal gradient components, vertical and horizontal differential operations are first applied to the target block. In other words, the mask [−1, 0, 1] or [−1, 0, 1]^T is convolved with the image, through Eq. (1), to calculate Gx and Gy, whose values range between −255 and +255.

Gx(x, y) = f(x + 1, y) − f(x − 1, y)
Gy(x, y) = f(x, y + 1) − f(x, y − 1)    (1)

Gx and Gy are then used to calculate the gradient magnitude with Eq. (2), i.e., the square root of the sum of the squares. The resultant values lie in 0–357.

∇f(x, y) = √(Gx(x, y)² + Gy(x, y)²)    (2)

2.2.2. Gradient orientation

Gx and Gy are used again to calculate the gradient orientation with Eq. (3). The results are converted to angles between 0° and 180°, i.e., the unsigned gradient orientation.

θ(x, y) = tan⁻¹(Gy(x, y) / Gx(x, y))    (3)

2.2.3.
Accumulated histogram

After obtaining the gradient magnitude and the gradient orientation, the four cells into which a block is equally divided are separately accumulated to produce four 9D vectors, which are combined into one 36D vector v = (v1, v2, ..., v36), as shown in Fig. 2. The gradient orientation is segmented with 20° per bin in a cell, giving nine bins in total.

2.2.4. L2 normalization

The acquired 36D vector is L2-normalized by Eq. (4) so that each component value lies in 0–1, where ε is a small constant.

v_i = v_i / √(‖v‖₂² + ε²),  i = 1, 2, ..., 36    (4)

2.3. FPGA modular circuit design

The block diagram of the developed FPGA circuits is shown in Fig. 3. The HOG vector generator module is illustrated in the lower-right sub-block of the whole diagram and contains four circuit sub-modules. The full buffering scheme shown in the upper-left part of the HOG vector generator module adjusts the flow of input block pixels. The gradient module calculates the gradient components, gradient magnitude, and gradient orientation. The histogram module accumulates the histograms of the four cells. In addition, the histogram PISO in the lower-left part buffers the output of the 36D HOG accelerator. Moreover, testing and communication circuits are placed in the left part outside the HOG vector generator module in Fig. 3, so as to exchange control signals and share data with the embedded ARM CPU.

The gradient module comprises three sub-modules: GradientComponents for calculating the gradient components, ComponentsToMagnitude for producing the gradient magnitude, and ComponentsToOrientation for generating the gradient orientation, respectively.
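The accumulation of Section 2.2.3 and the normalization of Eq. (4) can be sketched in software as follows. This is a behavioral reference, not the FPGA circuit; the magnitude-weighted voting (each pixel contributes its gradient magnitude to its 20° orientation bin) is our assumption, consistent with the circuit delivering the magnitude alongside each orientation bin for accumulation.

```python
import math

def cell_histogram(angles_deg, magnitudes, n_bins=9):
    """Accumulate one cell's 9D histogram; 20 degrees per unsigned bin."""
    hist = [0.0] * n_bins
    for theta, mag in zip(angles_deg, magnitudes):
        b = min(int(theta // 20), n_bins - 1)   # angles in [0, 180] -> bins 0..8
        hist[b] += mag
    return hist

def block_hog(cells, eps=1e-3):
    """Concatenate four 9D cell histograms into a 36D vector, then apply the
    L2 normalization of Eq. (4): v_i <- v_i / sqrt(||v||_2^2 + eps^2)."""
    v = [x for cell in cells for x in cell]     # 4 cells x 9 bins -> 36D
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]
```

For example, four identical cells with a single nonzero bin yield a 36D vector whose nonzero components each approach 0.5 after normalization, since the L2 norm of the concatenated vector dominates the small ε term.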
Before inputting the image data into these sub-modules, a data buffering scheme is required, and appropriate arithmetic for simplifying floating-point manipulation is necessary before designing the ComponentsToOrientation sub-module circuit.

Fig. 1. Machine learning based human detection system.

Fig. 2. Combining four 9D histograms into an extracted 36D HOG feature.

The gradient orientation comes in two types, signed and unsigned, of which the latter is used in this study. When accumulating the orientation, the range of 20° is used
as the partition basis. By applying Eq. (5) with just one multiplier, following the principle of simple and regular hardware implementation, each bin value is defined as the accumulated number of pixels in a cell whose orientations fall into the same 20° range. Therefore, there are nine angle bins in a cell. To effectively reduce the proportion of accumulation deflection, a 32-bit register is utilized for the 2^20 magnification.

G_x(x, y) · tan(θ_i) ≤ G_y(x, y) < G_x(x, y) · tan(θ_{i+1})    (5)

Besides the three sub-modules in the gradient module, two types of sub-modules, BlockTo4Cells and Vote9DVector, are covered in the histogram module. The former judges which of the four cells each set of orientation bin and magnitude belongs to. It then combines the two signals, OrientationBin (4-bit) and Magnitude (9-bit), as illustrated in the uppermost part of Fig. 4, into a 13-bit signal. Finally, it delivers the combined 13-bit signal to the four parallel latter sub-modules of the same type, Vote9DVector1–Vote9DVector4 as shown in Fig. 3, in order according to the four cells, through a 1-to-4 demultiplexer. Consequently, each Vote9DVector sub-module accumulates each set of orientation bin and magnitude into a target cell. The circuit design of one Vote9DVector sub-module is shown in Fig. 4.

3. Experiment results

3.1. Public image datasets

Two public image datasets, CBCL [11] and CVC [12], are selected for the training and testing stages of the system. The image frames used in the training stage are not reused in the testing stage, in order to guarantee the validity of the test results, i.e., detection rate and accuracy. The two datasets naturally present distinct characteristics.

After manual treatment, several hundred to thousands of samples are selected for training, and other, different samples are used one by one for testing. The numbers of selected positive and negative samples are shown in Table 1.
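A behavioral Python sketch of the Eq. (5) binning may clarify the idea: the arctangent is never computed; instead, G_y is compared against G_x multiplied by pre-scaled integer tangents of the 20° bin boundaries, using the 2^20 fixed-point magnification. The function and constant names are ours, not the paper's, and exact behavior at bin boundaries is simplified:

```python
import math

SCALE = 1 << 20  # the 2^20 fixed-point magnification held in a 32-bit register
# Tangents of the bin boundaries 0°, 20°, 40°, 60°, 80°, pre-scaled to integers
TAN = [round(math.tan(math.radians(a)) * SCALE) for a in (0, 20, 40, 60, 80)]

def orientation_bin(gx, gy):
    """Unsigned-orientation bin (0..8, 20° each) of gradient (gx, gy),
    found by the multiply-and-compare test of Eq. (5) instead of arctan."""
    if gy < 0:                     # fold into the gy >= 0 half-plane
        gx, gy = -gx, -gy
    flip = gx < 0                  # angle in (90°, 180°]: mirror afterwards
    ax = -gx if flip else gx
    k = 0                          # largest k with gy/ax >= tan(20°·k)
    while k < 4 and (gy << 20) >= ax * TAN[k + 1]:
        k += 1
    return 8 - k if flip else k    # mirror maps sub-bin k to bin 8-k

# e.g. gradient (3, 4) has orientation atan(4/3) ≈ 53.1°, i.e. bin 2
print(orientation_bin(3, 4))
```

Each comparison needs only one integer multiply (ax * TAN[k+1]); the left-hand side is a shift, which matches the paper's single-multiplier constraint for simple, regular hardware.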
Here, a total of 924 positive samples are selected from CBCL for the first experiment, listed in the CBCL row of Table 1. Half of them are used in the training stage and the other half in the testing stage. Because no negative samples exist in CBCL, 1038 negative samples are taken from CVC and combined with the first half of the 924 positive samples for training. Similarly, 1359 negative samples are taken from CVC and combined with the second half of the 924 positive samples for testing.

Besides, a total of 3356 positive samples exist in CVC. For the second experiment, listed in the CVC row of Table 1, 571 positive samples are picked out for training, and another set of 571 different positive samples is used for testing. For the negative samples from CVC, a total of 4096 negative samples are segmented out, from which 1326 are picked for the training stage. On the other hand, 2048 different negative samples are selected for the testing stage.

3.2. HOG accelerator

The completed HOG accelerator, implemented as a Xilinx FPGA circuit module, reaches a maximum frequency of 192 MHz. The number of cycles for computing one HOG can be formulated as #cycles = BlockWidth + BlockWidth * BlockHeight + 58.

In the gradient component module, the boundary of the block image for gradient convolution is processed with a shrinking manipulation. Moreover, the computation in the ComponentsToMagnitude module takes the integer rather than the floating-point value. These two factors account for the difference between the detection rates reached by the FPGA based HW/SW co-design on the embedded platform and by the pure-software execution on a PC.
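The cycle formula can be checked against the per-HOG latency reported later in Table 2; for a 16 × 16 block at 192 MHz it reproduces the 0.001718 ms figure. The helper name below is ours:

```python
# Cycle count and latency of the HOG accelerator for one block,
# per the paper's formula: #cycles = BlockWidth + BlockWidth*BlockHeight + 58
def hog_cycles(block_w, block_h):
    return block_w + block_w * block_h + 58

cycles = hog_cycles(16, 16)          # 16 + 16*16 + 58 = 330 cycles
latency_ms = cycles / 192e6 * 1e3    # at the 192 MHz maximum frequency
print(cycles, latency_ms)            # 330 cycles ≈ 0.00172 ms (cf. Table 2)
```

The dominant term BlockWidth * BlockHeight reflects streaming one pixel per cycle through the block, with the remaining cycles covering buffering and pipeline fill.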
[Fig. 3. Block diagram of the architecture of our modular circuits for human detection: the XC6SLX150T FPGA top module (one-HOG vector generator with full buffering, gradient module, histogram module, and histogram PISO) connected through the Tegra 2 generic memory interface (GMI) data bus, dual-port buffer memories, and system signal controller to the Colibri T20 ARM CPU (Linux software) on the SMIMS MaCube embedded platform.]

[Fig. 4. One of the four parallel Vote9DVector sub-module circuits: a demultiplexer selected by OrientationBin[3:0] routes Magnitude[8:0] to one of nine adder/register pairs, each accumulating a 1D HOG component.]

Table 1
The numbers of positive and negative samples selected from the two public image datasets.

DataSet | Training/testing | #Pos. samples | #Neg. samples | Total #
CBCL    | Training         | 462           | 1038 (CVC)    | 1500
CBCL    | Testing          | 462           | 1359 (CVC)    | 1821
CVC     | Training         | 571           | 1326          | 1897
CVC     | Testing          | 571           | 2048          | 2619

Table 2
Comparison of computation time for one HOG.

Computation basis    | Spec.                        | OS    | Speed for one HOG (rate)
PC (SW)              | i7-3770/3.4 GHz, DDR3/8 GB   | Win7  | 0.035393 ms (20.601)
ARM Colibri T20 (SW) | Tegra 2/1.0 GHz, DDR2/512 MB | Linux | 0.122800 ms (71.478)
FPGA (HW)            | Xilinx XC6SLX150T/192 MHz    | None  | 0.001718 ms (1)
Based on the accuracy observation in Table 3, the experimental results reveal that the error of detection rate is kept within 0.1–0.4% for the CBCL and CVC datasets, respectively. The accuracy decreases of 0.1% (98.5–98.4%) and 0.4% (97.4–97.0%) are obtained from the last row of Table 3. The detailed detection rate analyses and statistics are described in the next sub-section. However, the comparison of the computing speed for one HOG is given in advance in Table 2. Computing one HOG for a block of 16 × 16 pixels with the designed FPGA hardware circuit takes merely 0.001718 ms. That is, the proposed one-HOG hardware circuit is about 20 times faster than the software computation on a PC and about 71 times faster than the ARM CPU based software on the embedded platform. The computation time of the entire human detection system is further described in the following sub-section.

3.3. Detection rate and computation speed

Various measures can be utilized for calculating the detection rate. In this experiment, three statistical measures are applied to the experiments on the CBCL and CVC public datasets above: positive predictive value (PPV, or precision), true positive rate (TPR, or recall), and accuracy. PPV stands for the probability that a labeled (detected) human is a real human; TPR represents the proportion of all human images identified as human, i.e., the so-called detection rate (or recall); accuracy refers to the proportion of all human and non-human samples correctly classified.

Comparing software execution on the embedded HW/SW co-design platform with and without the HOG module replaced by the FPGA hardware circuit, the above detection rate measures are evaluated in our experiments.
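The three measures follow directly from the confusion-matrix counts. The small sketch below (function name ours) reproduces the pure-software CBCL column of Table 3; note the table truncates rather than rounds its percentages:

```python
def detection_metrics(tp, tn, fp, fn):
    """PPV (precision), TPR (recall), and accuracy from confusion counts."""
    ppv = tp / (tp + fp)
    tpr = tp / (tp + fn)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return ppv, tpr, acc

# Pure-software CBCL counts from Table 3: TP=436, TN=1359, FP=0, FN=26
ppv, tpr, acc = detection_metrics(436, 1359, 0, 26)
print(f"PPV={ppv:.1%} TPR={tpr:.1%} Acc={acc:.1%}")
# → PPV=100.0% TPR=94.4% Acc=98.6% (Table 3 truncates these to 94.3%, 98.5%)
```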
The experimental results are shown in Table 3, where the errors of PPV, TPR, and accuracy between the FPGA-accelerated HOG computation and the pure software run on the embedded platform or on a PC stay below 0.9%, 0.5%, and 0.4%, respectively. This shows that our HOG hardware computation gives high precision and introduces only small errors in the various detection rate measures of the whole system.

What is more, the computing time required for processing one detecting window at the testing stage is further measured and compared, as shown in Table 4. The time for the 22–33 iterations of HOG calculation is much larger than that for calculating the SVM module. Apparently, an iteratively used HOG hardware in machine learning based human detection is of higher importance and necessity than an SVM hardware [10]. Besides, the human detection time with iterative HOG hardware computation at the testing stage for one detecting window can be effectively suppressed below 1 ms, namely 0.922 ms or 0.882 ms, from Table 4. In other words, our FPGA based human detection system can deal with about 1080 detecting windows per second. When 72 detecting windows are applied to each image frame, the system achieves the real-time embedded system requirement of about 15 frames per second. It further turns out that the per-detecting-window speed comparison in Table 4 demonstrates our outcome and value more convincingly than the single-HOG comparison in Table 2.

In comparison with Dalal's detector [1] on the same CBCL and CVC datasets of Table 1, this detector confirms a slight lift in performance. On average, the PPV, TPR, and accuracy of this system are 0.3%, 1.4%, and 0.4% better than those of Dalal's detector, respectively, as listed in Table 5.

3.4.
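The throughput claim can be checked by simple arithmetic from the per-window totals in Table 4 (0.951 ms for CBCL, 0.910 ms for CVC, both including HOG and SVM time):

```python
# Throughput check from the Table 4 per-window totals (HOG/FPGA co-design)
for total_ms in (0.951, 0.910):
    windows_per_s = 1000.0 / total_ms
    fps = windows_per_s / 72          # 72 detecting windows per frame
    print(f"{windows_per_s:.0f} windows/s -> {fps:.1f} fps")
# ≈ 1052 windows/s (14.6 fps) for CBCL and 1099 windows/s (15.3 fps) for CVC,
# consistent with the paper's "about 1080 windows" and "about 15 fps"
```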
Implementation on a real-time embedded platform

In this research, the real-time embedded development platform MaCube, consisting of an ARM module (Colibri T20) with an NVIDIA Tegra-2 dual-core Cortex-A9 microprocessor running at 2 × 1.0 GHz, is employed with the Linux OS. Meanwhile, the HW/SW integration environment is established using the Tegra-2 generic memory interface (GMI) data bus and a Xilinx Spartan-6 LX150T FPGA chip.

An interface engine IC is allocated between the ARM SoC and the FPGA; it is in charge of the procedure control between the ARM and the FPGA and is able to download our designed circuit files to the FPGA. To accelerate data access, the MaCube platform uses DMA for moving data to/from the FPGA chip, which makes the HW/SW integration more efficient.

Table 3
Detection rate and accuracy of our embedded HW/SW co-design and the pure-software human detection systems.

H/S co-design vs. SW | CBCL: Pure SW | CBCL: Embedded SW with HOG/FPGA | CVC: Pure SW | CVC: Embedded SW with HOG/FPGA
# Weak classifiers   | 33    | 23    | 24    | 22
TP                   | 436   | 435   | 512   | 509
TN                   | 1359  | 1358  | 2039  | 2034
FP                   | 0     | 1     | 9     | 14
FN                   | 26    | 27    | 59    | 62
PPV                  | 100%  | 99.7% | 98.2% | 97.3%
TPR                  | 94.3% | 94.1% | 89.6% | 89.1%
Accuracy             | 98.5% | 98.4% | 97.4% | 97.0%

Table 4
Comparison of computation efficiency per detecting window.

Dataset | System                    | # Weak classifiers | HOG (ms) | SVM (ms) | Total (ms)
CBCL    | Embedded SW               | 33                 | 4.104    | 0.047    | 4.151
CBCL    | Embedded SW with HOG/FPGA | 23                 | 0.922    | 0.029    | 0.951
CVC     | Embedded SW               | 24                 | 2.363    | 0.030    | 2.393
CVC     | Embedded SW with HOG/FPGA | 22                 | 0.882    | 0.028    | 0.910

Table 5
Detection rate comparison between Dalal's detector and ours.

Detector | Measure  | CBCL (SW) (%) | CVC (SW) (%)
Dalal    | PPV      | 99.5          | 98.1
Dalal    | TPR      | 93.1          | 88.0
Dalal    | Accuracy | 98.2          | 96.9
Ours     | PPV      | 100           | 98.2
Ours     | TPR      | 94.3          | 89.6
Ours     | Accuracy | 98.5          | 97.4

[Fig. 5. Video demonstration of our FPGA based human detection system running on a real-time embedded platform: (a), (b) Caviar video; (c), (d) AVSS 2007 video; (e), (f) our video.]
Memory mapping is applied in the user program to correspond with the source end and the target end of the DMA data. The DMA transmission is 4096 × 16 bits at a time. The user program writes the data into the write buffer at the source end, calls the driver to start the DMA, and the data are then transmitted into the FPGA. Control is returned to the user program after a successful transmission. For the reverse direction, i.e., reading data from the FPGA, the driver is first called to start the DMA so as to read the FPGA data and write them into the read buffer.

To demonstrate the final results of this study, videos of three different scenes, the public Caviar video [13], the public AVSS 2007 video [14], and a self-shot video from our laboratory, are processed as dynamic dataset experiments, as shown in Fig. 5, where an object inside a detecting window automatically detected as a human by the system is labeled with a red rectangle.

4. Conclusion

In order to leave the PC-based computing environment for an embedded platform, and to speed up the computing of a bottleneck module, the FPGA based HOG vector generator is successfully accomplished as a modular circuit design. The completed HOG accelerator contains two circuit modules for gradient calculation and histogram accumulation. With the iteratively used HOG hardware, the accuracy change of the FPGA based human detection system in comparison with pure software is merely within 0.4%, which shows that the accelerating design has a very small effect on the detection rate. Meanwhile, the completed FPGA based human detection system can process about 1075 detecting windows per second. In other words, it successfully achieves the real-time embedded system requirement of about 15 fps.

Acknowledgments

This research is partially sponsored under the projects MOST 103-2221-E-390-028-MY2 and NSC102-2221-E-390-026.
References

[1] N. Dalal, B. Triggs, Proc. IEEE Conf. Comput. Vision Pattern Recognit. 1 (2005) 886–893.
[2] D. Gerónimo, A.M. López, A.D. Sappa, T. Graf, IEEE Trans. Pattern Anal. Mach. Intell. 32 (2010) 1239–1258.
[3] Q. Zhu, M.-C. Yeh, K.-T. Cheng, S. Avidan, Proc. IEEE Conf. Comput. Vision Pattern Recognit. 2 (2006) 1491–1498.
[4] P.Y. Chen, C.C. Huang, C.Y. Lien, Y.H. Tsai, IEEE Trans. Intell. Transp. Syst. 15 (2) (2014) 656–662.
[5] S. Bauer, U. Brunsmann, S. Schlotterbeck-Macht, in: MPC Workshop, 2009, pp. 49–58.
[6] R. Kadota, H. Sugano, M. Hiromoto, H. Ochi, R. Miyamoto, Y. Nakamura, Proc. IIH-MSP, IEEE (2009) 1330–1333.
[7] P.Y. Hsiao, C.H. Chen, H. Wen, S.J. Chen, IEE Proc. Comput. Digit. Tech. 153 (4) (2006) 1871–1874.
[8] K.G. Gokhan, S. Afsar, Microprocess. Microsyst. 37 (3) (2013) 270–286.
[9] R.E. Fan, K.W. Chang, C.J. Hsieh, X.R. Wang, C.J. Lin, J. Machine Learning Research 9 (2008) 1871–1874. <http://www.csie.ntu.edu.tw/~cjlin/liblinear>.
[10] D. Anguita, A. Boni, S. Ridella, IEEE Trans. Neural Networks 14 (5) (2003) 993–1009.
[11] CBCL Pedestrian DB, <http://cbcl.mit.edu/software-datasets>.
[12] CVC Virtual Dataset, <http://www.cvc.uab.es/adas/databases>.
[13] CAVIAR, 2001, <http://homepages.inf.ed.ac.uk/rbf/CAVIAR>.
[14] AVSS, <http://www.eecs.qmul.ac.uk/~andrea/avss2007_d.html>.