Pattern Recognition 40 (2007) 2563–2573, www.elsevier.com/locate/pr

Human gait recognition by the fusion of motion and static spatio-temporal templates

Toby H.W. Lam, Raymond S.T. Lee*, David Zhang
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
* Corresponding author. Tel.: +852 27667298. E-mail address: csstlee@comp.polyu.edu.hk (R.S.T. Lee).

Received 5 December 2005; received in revised form 24 June 2006; accepted 16 November 2006

Abstract

In this paper, we propose a gait recognition algorithm that fuses motion and static spatio-temporal templates of sequences of silhouette images: the motion silhouette contour templates (MSCTs) and static silhouette templates (SSTs). MSCTs and SSTs capture the motion and static characteristics of gait, and are computed directly from the silhouette sequence. The performance of the proposed algorithm is evaluated experimentally on the SOTON data set and the USF data set and compared with other published results on these two data sets. The experimental results show that the proposed templates are effective for human identification in indoor and outdoor environments. The proposed algorithm has a recognition rate of around 85% on the SOTON data set and around 80% on the intrinsic difference group (probes A–C) of the USF data set.
© 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: Gait recognition; Motion silhouette contour templates; Static silhouette templates; Biometrics

1. Introduction

Biometrics has received substantial attention from researchers. Biometrics is the recognition of a person according to physiological or behavioral characteristics. Gait, the manner of walking, is a biometric that differs from traditional biometrics. Early medical studies showed that individual gaits are unique, varying from person to person, and are difficult to disguise [1]. In addition, it has been shown that gaits are so characteristic that we recognize friends by their gait [2] and that a gait can even reveal an individual's sex [3]. Unlike biometrics such as fingerprints and palm-prints, gait recognition requires no contact with a capture device, as a gait can be captured at a distance as a low-resolution image sequence.

Gait recognition is basically divided into two types: model-based and model-free recognition [4]. In model-based recognition, researchers use information gathered from the human body, especially from the joints, to construct a model for recognition. In general, the model-based approach is view and scale invariant. Gathering this gait information requires high-quality gait sequences, so some model-based recognition systems require multiple cameras to collect the information. One of the classical model-based gait recognition experiments was undertaken by Johansson [5], who attached light bulbs to a human and then used the movement of the light bulbs to capture the subject's motion. In Ref. [6], the body contours are computed in each frame of the walking sequence, and a stick model is then created from the body contours for recognition. Johnson and Bobick proposed a multi-view gait recognition algorithm which used static body parameters for recognition [7].
These static body parameters are the height of the silhouette, the distance between the head and pelvis, the distance between the left and right foot, and the maximum distance between the pelvis and the feet. Lee and Grimson [8] proposed a similar approach, using the silhouette images to compute seven feature vectors, such as the aspect ratio and centroid of the silhouette, for recognition. They also proposed gait features based on spectral components [8]. Wagg and Nixon [9]
presented an automated model-based method for gait extraction based on the mean shape and motion information of gait.

At present, most gait recognition research uses model-free (or holistic) recognition, which uses motion information directly without the need for model reconstruction. Model-free approaches usually operate on sequences of binary silhouettes, extracting the silhouettes of moving objects from a video using segmentation techniques such as background subtraction. Murase and Sakai [10] proposed a parametric eigenspace representation for moving object recognition. The eigenspace technique was originally used in face recognition [11], but Murase and Sakai [10] applied it to gait recognition and lip reading, projecting the extracted silhouette images onto the eigenspace using principal component analysis (PCA). The sequence of movement forms a trajectory in the eigenspace, the parametric eigenspace representation: the input image sequence is preprocessed into a sequence of binary silhouettes, this binary sequence is projected to form a trajectory in the eigenspace, and the reference trajectory with the smallest distance to the input trajectory is the best match. Huang et al. [12] applied a similar technique using linear discriminant analysis (LDA), also known as canonical analysis, which discriminates better between different classes; three different types of temporal templates were proposed, all generated by the computation of optical flow, and canonical analysis projects the temporal template sequence to form a manifold in the subspace. Foster et al. [13] presented an area-based metric called gait masks: each silhouette in an image sequence is masked and the unmasked area measured, and the differences in this information form a time-varying signal which can be used as a signature for automatic gait recognition. Hayfron-Acquah et al. [14] proposed a gait recognition method that exploits the symmetry of human motion, applying a generalized symmetry operator to the edge of the silhouette image to generate a symmetry map; a gait signature is then generated by a fast Fourier transform (FFT) of the mean of the symmetry map. BenAbdelkader et al. [15] proposed a related gait representation called the image self-similarity plot; these plots are projected onto an eigenspace using PCA. Wang and Tan [16] proposed a transformation that reduces the dimensionality of the input feature space by unwrapping each 2D silhouette image into a 1D distance signal, so that the sequence of silhouette images becomes a time-varying distance signal to which the eigenspace transformation is applied. Liu and Sarkar [17] proposed the averaged silhouette representation, in which the silhouette sequence is transformed into a single image for recognition, with the Euclidean distance used as the similarity measure.

In this paper, we propose a fast, robust and innovative gait recognition algorithm based on motion and static spatio-temporal templates.
We propose to recognize gaits using two gait feature templates in combination: motion silhouette contour templates (MSCTs) and static silhouette templates (SSTs). MSCTs and SSTs embed critical spatial and temporal information, and their use reduces both the computation cost and the size of the database. The efficacy of the proposed method in indoor and outdoor environments has been demonstrated on the SOTON [18] and USF [19] data sets.

The rest of this paper is organized as follows. In Section 2, we describe the MSCT and SST in detail. Section 3 provides details of the proposed recognition algorithm. Section 4 presents the experimental results and Section 5 offers our conclusion.

2. Feature extraction

In this section, we describe the details of the motion spatio-temporal template, the MSCT, and the static spatio-temporal template, the SST.

2.1. Motivation

The main motivation of the proposed templates is to construct discriminative representations from the motion and static characteristics of the walking sequence. As mentioned in the previous section, there are different approaches to gait recognition, such as the holistic and model-based approaches. One common approach is to extract features from each frame of a walking sequence and then generate a sequence of features [10,12,13,16]. Unlike these methods, the method proposed in this paper simply extracts two feature templates, an exemplar MSCT and an exemplar SST, from a sequence of silhouette images. We take the motion characteristic of gait to be the parts of the body in motion during walking, such as the hands and legs; this characteristic is captured by the contours of the silhouette images. We take the static characteristic of gait to be the torso, which remains steady during walking; this characteristic is captured from the silhouette images directly. Unlike existing works, which create a single representation for recognition, our proposed algorithm constructs two representations. The proposed templates are simple and computationally efficient: they can be computed without generating a sequence of features or performing any transformations, and, unlike the model-based approach, no model needs to be constructed. The template construction process is simple and the construction time is short, making it suitable for real-time gait recognition. The following section describes how to construct the exemplar MSCT and exemplar SST.

2.2. Motion silhouette contour template (MSCT) and static silhouette template (SST)

The basis of these templates is a sequence of silhouette images. An MSCT contains information about the movement characteristics of a human gait and an SST contains information about its static characteristics. These templates are used together for gait recognition. First, the silhouettes are extracted and normalized to a fixed size. Then, the gait period is estimated from the silhouette sequence, and the sequence is divided into several cycles according to the estimated gait period. In each cycle, two templates, an MSCT and an SST, are computed, so each silhouette sequence yields a number of MSCTs and SSTs. For ease of computation, an exemplar MSCT and an exemplar SST are obtained by averaging the MSCTs and SSTs in each sequence. Fig. 1 shows a flow diagram of the proposed gait recognition algorithm.

[Fig. 1. Flow diagram of the proposed gait recognition algorithm.]
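Before detailing each step, the following is a minimal sketch of how the whole pipeline of Fig. 1 could fit together in Python. This is our illustration, not the paper's code: the helper names are hypothetical, and they are fleshed out in the sketches accompanying the subsections below.

```python
import numpy as np

# Sketch of the overall pipeline in Fig. 1: normalized silhouettes ->
# gait period -> per-cycle MSCT/SST -> exemplar templates. The helper
# functions are defined in the sketches for Sections 2.2.1-2.2.4.
def extract_exemplar_templates(silhouettes):
    """silhouettes: list of binary (H, W) numpy arrays for one sequence."""
    norm = [normalize_silhouette(s) for s in silhouettes]      # 128 x 88
    p_gait = estimate_gait_period(norm)                        # frames/cycle
    cycles = [norm[i:i + p_gait]                               # whole cycles
              for i in range(0, len(norm) - p_gait + 1, p_gait)]
    mscts = [compute_msct(c, p_gait) for c in cycles]          # Eq. (3)
    ssts = [compute_sst(c) for c in cycles]                    # Eq. (6)
    # Exemplars are the per-sequence averages, Eqs. (5) and (7).
    return np.mean(mscts, axis=0), np.mean(ssts, axis=0)
```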
2.2.1. Preprocessing

In our proposed algorithm, silhouettes are the basis of gait recognition. The silhouettes are extracted by simple background subtraction and thresholding [20]. A binarization process then renders the image in black and white: the background is black and the foreground is white. The bounding box of the silhouette in each frame is computed, the silhouette image is extracted according to the size of the bounding box, and the extracted image is resized to a fixed size (128 × 88 pixels). The purpose of this normalization is to eliminate the scaling effect. Fig. 2 shows examples of the normalized silhouette images.

[Fig. 2. Samples of normalized silhouettes.]

2.2.2. Gait period estimation

Human walking repeats its motion at a stable frequency. Since our proposed gait feature templates depend on the gait period, we must estimate the number of frames in each walking cycle. A single walking cycle can be regarded as the period in which a person moves from the mid-stance position (both legs closest together) to a double support position (both legs furthest apart), then back to mid-stance, to a second double support position, and finally back to mid-stance. Fig. 3 shows samples of silhouette images in one cycle. The gait period $P_{gait}$ can be estimated by counting the number of foreground pixels in the silhouette image [19]: in the mid-stance position the silhouette contains the smallest number of foreground pixels, and in the double support position the greatest. Because sharp changes in the gait cycle are most obvious in the lower part of the body, gait period estimation uses only the lower half of the silhouette image, with the gait period taken as the median of the distances between consecutive minima.

[Fig. 3. Samples of silhouette images in one walking cycle.]
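As an illustration of these two steps, the normalization and period estimation might look as follows. This is a sketch under our own naming; the paper gives no code, and the plain nearest-neighbour resize stands in for whatever resampling the authors used.

```python
import numpy as np

def normalize_silhouette(sil, out_h=128, out_w=88):
    """Crop a binary silhouette to its bounding box and resize it to
    128 x 88 pixels (nearest-neighbour) to remove the scaling effect."""
    ys, xs = np.nonzero(sil)
    crop = sil[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    rows = np.arange(out_h) * h // out_h    # nearest source row per output row
    cols = np.arange(out_w) * w // out_w    # nearest source column
    return crop[rows][:, cols]

def estimate_gait_period(silhouettes):
    """Estimate P_gait as the median distance between consecutive minima
    of the foreground-pixel count in the lower half of each frame [19]."""
    counts = np.array([s[s.shape[0] // 2:].sum() for s in silhouettes])
    # Mid-stance frames show up as local minima of the pixel count.
    minima = [t for t in range(1, len(counts) - 1)
              if counts[t] < counts[t - 1] and counts[t] <= counts[t + 1]]
    gaps = np.diff(minima)
    return int(np.median(gaps)) if gaps.size else len(silhouettes)
```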
2.2.3. Generating exemplar MSCTs

An MSCT contains the motion information of the human gait and is constructed in three steps. First, the silhouette images are extracted and normalized to a fixed size of 128 × 88 pixels. Then, the sequence of silhouettes is used to estimate the gait period $P_{gait}$. Finally, the silhouette image sequence is divided into several cycles according to the estimated gait period, and an MSCT is created from each cycle; the exemplar MSCT is the average of these MSCTs.

The MSCT is generated from a sequence of silhouette contours. The contour $CS_i$ is obtained by subtracting the eroded silhouette $ES_i$ from the original silhouette $S_i$, as in Eq. (1); the eroded silhouette $ES_i$ is computed by the erosion operation, as in Eq. (2):

$$CS_i = S_i - ES_i = S_i - (S_i \ominus S), \qquad (1)$$

$$ES_i = S_i \ominus S = \bigcap_{s \in S} (S_i)_{-s}, \qquad (2)$$

where $S_i$ is the original silhouette to be eroded, $ES_i$ is the eroded silhouette, $CS_i$ is the silhouette contour, $S$ is the structuring element and $\ominus$ is the erosion operator. $(S_i)_{-s}$ denotes the translation of the silhouette image $S_i$ by $s$. The structuring element $S$ is a set of coordinate points; foreground and background pixels are represented by 1's and 0's, respectively. Fig. 4 shows the 3 × 3 all-ones structuring element which we adopted for the erosion operation. The erosion operator superimposes the structuring element on the input image: if every nonzero element of the structuring element is contained in the input image, the output pixel is 1; otherwise, it is 0 [21]. Fig. 5 shows an original silhouette, the eroded silhouette and the silhouette contour computed by the erosion operation.

[Fig. 4. The 3 × 3 structuring element adopted for the erosion operation.]

[Fig. 5. (a) Original silhouette, (b) eroded silhouette, (c) silhouette contour.]
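Eqs. (1) and (2) amount to a single binary erosion followed by a subtraction. A minimal sketch, using scipy's morphology routines as one possible implementation (the library choice is ours, not the paper's):

```python
import numpy as np
from scipy.ndimage import binary_erosion

def silhouette_contour(sil):
    """CS_i = S_i - (S_i eroded by S), Eqs. (1)-(2), using the 3 x 3
    all-ones structuring element of Fig. 4."""
    sil = sil.astype(bool)
    eroded = binary_erosion(sil, structure=np.ones((3, 3), dtype=bool))
    return sil & ~eroded   # one-pixel-thick contour of the silhouette
```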
Algorithm (3) is used to create the MSCT from the sequence of silhouette contour images:

$$\mathrm{MSCT}_i(x, y, t) = \begin{cases} 255 & \text{if } CS_i(x, y, t) = 1, \\ \max(0,\ \mathrm{MSCT}_i(x, y, t-1) - \delta) & \text{otherwise}, \end{cases} \qquad (3)$$

where $i$ is the cycle number in the gait sequence, $\delta$ is the intensity decay parameter and $\mathrm{MSCT}_i$ is the $i$th motion silhouette contour template. The intensity decay parameter is computed as

$$\delta = 255 / P_{gait}, \qquad (4)$$

where $P_{gait}$ is the estimated gait period. The use of a dynamic decay value rather than a fixed intensity decay parameter eliminates the effect of walking speed. Fig. 6 shows some example MSCTs.

The number of MSCTs in a sequence depends on the gait period and the number of frames in the walking sequence. Since different subjects may produce different numbers of MSCTs, which increases the computational complexity, an exemplar MSCT is obtained as the average of the $\mathrm{MSCT}_i$ in each walking sequence:

$$\overline{\mathrm{MSCT}} = \frac{\sum_{i=1}^{n} \mathrm{MSCT}_i}{n}, \qquad (5)$$

where $n$ is the number of MSCTs in the sequence. Fig. 7 shows some examples of exemplar MSCTs. A great advantage of using MSCTs is that the contour images from which they are formed are an order of magnitude smaller than silhouette images and are thus more computationally efficient. However, if the silhouettes are extracted at low quality, an MSCT may embed irrelevant information which affects the recognition rate. In the following section we describe how this error can be reduced by using the SST.

[Fig. 6. Examples of motion silhouette contour templates (MSCTs).]

[Fig. 7. Examples of exemplar MSCTs.]

2.2.4. Generating exemplar SSTs

SSTs are used in our recognition algorithm in conjunction with MSCTs as a way of reducing the recognition error. An SST is generated in much the same way as an MSCT except that it uses the entire silhouette image:

$$\mathrm{SST}_i(x, y, t) = \begin{cases} 1 & \text{if } S_i(x, y, t) = S_i(x, y, t-1), \\ 0 & \text{otherwise}, \end{cases} \qquad (6)$$

where $i$ is the cycle number in the gait sequence and $\mathrm{SST}_i$ is the $i$th static silhouette template. Fig. 8 shows examples of the SST. As with the MSCT, the number of SSTs in the sequence depends on the gait period and the number of frames in the sequence. We further obtain the exemplar SST by averaging the $\mathrm{SST}_i$ in each walking sequence:

$$\overline{\mathrm{SST}} = \frac{\sum_{i=1}^{n} \mathrm{SST}_i}{n}, \qquad (7)$$

where $n$ is the number of cycles in the sequence. Fig. 9 shows some examples of exemplar SSTs.

[Fig. 8. Examples of static silhouette templates (SSTs).]

[Fig. 9. Examples of exemplar SSTs.]
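The per-cycle templates could then be sketched as below, reusing silhouette_contour from the previous sketch. The function names are ours, and compute_sst reflects our reading of Eq. (6) as marking pixels that do not change between consecutive frames.

```python
import numpy as np

def compute_msct(cycle, p_gait):
    """MSCT of one gait cycle, Eq. (3): contour pixels are refreshed to
    255 and all others decay by delta = 255 / P_gait, Eq. (4)."""
    delta = 255.0 / p_gait
    msct = np.zeros(cycle[0].shape)
    for sil in cycle:
        msct = np.maximum(0.0, msct - delta)       # intensity decay
        msct[silhouette_contour(sil)] = 255.0      # moving contour CS_i
    return msct

def compute_sst(cycle):
    """SST of one gait cycle, our reading of Eq. (6): average, over
    consecutive frame pairs, of the pixels that stay unchanged."""
    pairs = zip(cycle[:-1], cycle[1:])
    static = [(prev == curr).astype(float) for prev, curr in pairs]
    return np.mean(static, axis=0)
```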
3. Recognition

The similarity score represents the level of similarity between the testing data and the training data. In this section, we explain the details of the similarity measures in our proposed algorithm. For ease of understanding, "gallery" means training data and "probe" means testing data. Suppose there are $N_{gallery}$ subjects in the gallery data set and $N_{probe}$ subjects in the probe data set, and each subject contributes one walking sequence. A probe sequence is captured and its similarity score against each gallery sequence is measured. Let $u$ be the probe sequence and $v$ the gallery sequence. A probe sequence $Seq_u = \{Seq_u(1), Seq_u(2), \ldots, Seq_u(P)\}$ from the probe data set and a gallery sequence $Seq_v = \{Seq_v(1), Seq_v(2), \ldots, Seq_v(Q)\}$ from the gallery data set are used for calculating the similarity score, where $P$ and $Q$ are, respectively, the number of frames in the probe and gallery sequences. We calculate the gait period of each subject in the probe and gallery sequences and follow the procedures described in Section 2 to create the exemplar MSCT and SST for each sequence. After that, each subject has two templates, $\overline{\mathrm{MSCT}}_u$ and $\overline{\mathrm{SST}}_u$, for the probe sequence and another two, $\overline{\mathrm{MSCT}}_v$ and $\overline{\mathrm{SST}}_v$, for the gallery sequence.

Our algorithm makes use of two similarity scores: $\mathrm{SimScore}_{\mathrm{MSCT}}$, which measures the similarity between the gallery and probe MSCTs, and $\mathrm{SimScore}_{\mathrm{SST}}$, which measures the similarity between the gallery and probe SSTs. Both are calculated using the Euclidean distance.
The similarity score $\mathrm{SimScore}_{\mathrm{MSCT}}$ is computed by Eq. (8) and $\mathrm{SimScore}_{\mathrm{SST}}$ by Eq. (9):

$$\mathrm{SimScore}_{\mathrm{MSCT}}(\overline{\mathrm{MSCT}}_u, \overline{\mathrm{MSCT}}_v) = \frac{\|\overline{\mathrm{MSCT}}_u - \overline{\mathrm{MSCT}}_v\|}{\overline{\mathrm{SimScore}}_{\mathrm{MSCT}}}, \qquad (8)$$

$$\mathrm{SimScore}_{\mathrm{SST}}(\overline{\mathrm{SST}}_u, \overline{\mathrm{SST}}_v) = \frac{\|\overline{\mathrm{SST}}_u - \overline{\mathrm{SST}}_v\|}{\overline{\mathrm{SimScore}}_{\mathrm{SST}}}, \qquad (9)$$

where $\overline{\mathrm{MSCT}}_u$ and $\overline{\mathrm{SST}}_u$ are the exemplar MSCT and SST of the probe sequence $u$, $\overline{\mathrm{MSCT}}_v$ and $\overline{\mathrm{SST}}_v$ are the exemplar MSCT and SST of the gallery sequence $v$, and $\overline{\mathrm{SimScore}}_{\mathrm{MSCT}}$ and $\overline{\mathrm{SimScore}}_{\mathrm{SST}}$ are the mean similarity scores of the exemplar MSCTs and exemplar SSTs, respectively, computed by Eqs. (10) and (11):

$$\overline{\mathrm{SimScore}}_{\mathrm{MSCT}} = \frac{\sum_{i=1}^{N_{gallery}} \sum_{j=1}^{N_{probe}} \|\overline{\mathrm{MSCT}}_i - \overline{\mathrm{MSCT}}_j\|}{N_{gallery} \times N_{probe}}, \qquad (10)$$

$$\overline{\mathrm{SimScore}}_{\mathrm{SST}} = \frac{\sum_{i=1}^{N_{gallery}} \sum_{j=1}^{N_{probe}} \|\overline{\mathrm{SST}}_i - \overline{\mathrm{SST}}_j\|}{N_{gallery} \times N_{probe}}, \qquad (11)$$

where $N_{gallery}$ is the number of subjects in the gallery set and $N_{probe}$ is the number of subjects in the probe set. The final similarity score between two subjects is

$$\mathrm{SimScore}(u, v) = \mathrm{SimScore}_{\mathrm{MSCT}}(\overline{\mathrm{MSCT}}_u, \overline{\mathrm{MSCT}}_v) + \mathrm{SimScore}_{\mathrm{SST}}(\overline{\mathrm{SST}}_u, \overline{\mathrm{SST}}_v). \qquad (12)$$

In our proposed recognition algorithm, the nearest neighbor (NN) classifier is adopted for classification. For a testing sample $u$, we calculate the final similarity score with each subject in the gallery data set by Eq. (12), giving $N_{gallery}$ final similarity scores in total. The testing sample $u$ is classified as the subject $v$ whose score is the minimum over all training patterns, i.e.,

$$\min_i \mathrm{SimScore}(u, i) = \mathrm{SimScore}(u, v), \qquad (13)$$

where $i = 1, 2, \ldots, N_{gallery}$.
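Given the exemplar templates, the whole matching stage of Eqs. (8)–(13) reduces to a few normalized Euclidean distances. A sketch with hypothetical names, assuming each exemplar is a flattened numpy vector:

```python
import numpy as np

def mean_pairwise_distance(gallery_tpls, probe_tpls):
    """Mean Euclidean distance over all gallery/probe template pairs,
    the normalizers of Eqs. (10) and (11)."""
    return np.mean([np.linalg.norm(g - p)
                    for g in gallery_tpls for p in probe_tpls])

def classify(probe_msct, probe_sst, gallery_mscts, gallery_ssts,
             mean_msct, mean_sst):
    """Fuse the normalized scores (Eq. (12)) and return the index of the
    nearest gallery subject (the NN rule of Eq. (13))."""
    scores = []
    for g_msct, g_sst in zip(gallery_mscts, gallery_ssts):
        s_m = np.linalg.norm(probe_msct - g_msct) / mean_msct   # Eq. (8)
        s_s = np.linalg.norm(probe_sst - g_sst) / mean_sst      # Eq. (9)
        scores.append(s_m + s_s)                                # Eq. (12)
    return int(np.argmin(scores))                               # Eq. (13)
```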
4. Experiments

In this section, we show the performance of the proposed gait recognition algorithm on two data sets: the SOTON data set [18], captured in an indoor environment, and the USF data set [19], captured in an outdoor environment. Fig. 10 shows some silhouette images from these two data sets. For the evaluation, we adopted the FERET scheme [1] and measured the identification rate and the verification rate using cumulative match characteristics (CMCs). All experiments were implemented in Matlab and run on a P4 2.26 GHz computer with 512 MB of memory.

[Fig. 10. Sample silhouette images from (a) the SOTON data set and (b) the USF data set.]
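For reference, the rank-k identification rates behind a CMC curve can be read off the fused score matrix as follows. This is our sketch of the standard computation, not the FERET tooling itself:

```python
import numpy as np

def cmc(scores, probe_labels, gallery_labels, max_rank=25):
    """scores[i, j]: fused SimScore between probe i and gallery subject j.
    Returns, for k = 1..max_rank, the fraction of probes whose true
    identity appears among the k best (lowest-score) gallery matches."""
    hits = np.zeros(max_rank)
    for i, row in enumerate(scores):
        order = np.argsort(row)                      # best match first
        ranked = [gallery_labels[j] for j in order]
        rank = ranked.index(probe_labels[i]) + 1     # rank of the true ID
        if rank <= max_rank:
            hits[rank - 1:] += 1                     # cumulative hit count
    return hits / len(probe_labels)
```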
4.1. Recognition on the SOTON data set

For the SOTON data set, we had to make a number of adjustments. Since there were insufficient frames in each walking sequence of the SOTON data set to estimate the gait period, we fixed the gait period at 30 frames when generating the proposed feature templates. There were varying numbers of walking sequences for each subject, so we constructed three further data sets: A, B and C. In data set A, 50% of the image sequences of each subject were selected for training and the remainder used for testing; in data set B, 75% were used for training and the remainder for testing; in data set C, 90% were used for training and the remainder for testing.

The proposed algorithm was tested on its ability to recognize using an MSCT and an SST together, an MSCT alone, and an SST alone, with the NN classifier used in all tests. Table 1 shows that the algorithm achieved its best result with the combined templates, with a rank 1 recognition rate above 86% for all three subsets. Fig. 11 shows the recognition rate for the three data sets plotted as CMC curves. The algorithm performs well with both the MSCT and the SST, but the MSCT is the better of the two.

Table 1. Performance on the SOTON HiD data set

                              Rank 1 (%)   Rank 5 (%)   Rank 10 (%)
(I) MSCT and SST
  (1) 50% train, 50% test       86.41        94.75        96.16
  (2) 75% train, 25% test       88.95        94.30        95.54
  (3) 90% train, 10% test       89.56        95.18        95.58
(II) MSCT
  (1) 50% train, 50% test       81.16        90.25        93.72
  (2) 75% train, 25% test       85.03        91.80        94.65
  (3) 90% train, 10% test       88.35        93.98        95.18
(III) SST
  (1) 50% train, 50% test       77.69        89.50        93.25
  (2) 75% train, 25% test       81.64        91.09        93.58
  (3) 90% train, 10% test       80.32        92.37        93.98

[Fig. 11. Cumulative match characteristics for the SOTON data set.]

4.2. Recognition on the USF data set

In this experiment, the proposed algorithm was evaluated on the outdoor USF data set. The USF version 2.1 data set contains 1 gallery set and 12 probe sets, A–L. This data set is experimentally challenging in that it covers a number of covariates such as shoe type, surface type and viewing angle. We compared our proposed algorithm with the baseline algorithm [19] and the UMD Hidden Markov Model (HMM) algorithm [22]. To ease our explanation, we placed the probe sets under three group headings: (I) intrinsic difference, (II) surface difference, and (III) extrinsic difference. Table 2 provides more detailed information about these groupings.

Table 2. USF HiD probe sets (version 2.1)

Group                       Probe   Data set covariates             Number of samples
(I) Intrinsic difference    A       View                            122
                            B       Shoe                            54
                            C       View, shoe                      54
(II) Surface difference     D       Surface                         121
                            E       Surface, shoe                   60
                            F       Surface, view                   121
                            G       Surface, shoe, view             60
(III) Extrinsic difference  H       Briefcase                       120
                            I       Shoe, briefcase                 47
                            J       View, briefcase                 70
                            K       Time, shoe, clothing            33
                            L       Surface, time, shoe, clothing   33

The recognition performance is shown in Table 3. The experimental results show that the proposed algorithm is only slightly worse than the baseline algorithm in group (II), and has a better recognition rate than the baseline in groups (I) and (III). The rank 1 performance of the proposed algorithm in group (I) is slightly worse than that of the UMD HMM algorithm, by 2%; however, there is a clear gap between the proposed algorithm and the UMD HMM approach in groups (II) and (III). This gives rise to a number of interesting observations. It would seem that the proposed templates are insensitive to differences in viewing angle and shoe type. In group (I), the baseline algorithm's recognition rate is around 66%; our proposed algorithm improves on this by 14% on average, and performs nearly the same as the UMD HMM algorithm, with rank 1 recognition rates of 80% and 82%, respectively. In group (III) (probes H–L), the average identification rate of the proposed algorithm is higher than that of the baseline algorithm by 2%, with significantly high recognition rates on probes K and L. Although there is a gap between our proposed algorithm and the UMD HMM algorithm, the recognition rate on probe K remains high, indicating that the proposed templates retain their discriminative power over time. The fact that the proposed algorithm does not work very well in group (II) (probes D–G, surface differences) indicates that the proposed templates are sensitive to the surface type.
Fig. 12 shows the recognition rate of the proposed gait recognition algorithm on the USF data set at different ranks, where rank n means the individual is matched within the top n samples in the ordered similarity scores.

Table 3. Match scores of the proposed algorithm and other algorithms on the USF data set (version 2.1)

Group  Probe   Baseline        UMD HMM          MSCT            SST             MSCT and SST
               Rank 1  Rank 5  Rank 1  Rank 5a  Rank 1  Rank 5  Rank 1  Rank 5  Rank 1  Rank 5
I      A       73      88      89      —        71      92      77      90      80      94
       B       78      93      88      —        82      96      83      93      89      94
       C       48      78      68      —        59      85      69      81      72      87
       Mean    66      86      82      —        71      91      76      88      80      92
II     D       32      66      35      —        15      46      12      41      14      41
       E       22      55      28      —        10      40      13      32      10      35
       F       17      42      15      —        8       22      9       29      10      26
       G       17      38      21      —        8       25      12      28      13      28
       Mean    22      50      25      —        10      33      12      33      12      33
III    H       61      85      85      —        54      83      38      63      49      78
       I       57      78      80      —        55      82      33      58      43      75
       J       36      62      58      —        37      67      19      42      30      61
       K       3       12      17      —        42      55      39      42      39      55
       L       3       15      15      —        12      42      9       24      9       36
       Mean    32      50      51      —        40      66      28      46      34      61

a Since rank 5 performance is not reported in Ref. [22], it is omitted from the comparison.

[Fig. 12. Recognition rate using the MSCT and SST together on the USF HiD data set.]

We also applied the feature templates individually for gait recognition on the USF data set; these recognition rates are also recorded in Table 3. The recognition rate when using the two templates together is higher than when using either feature template individually. The MSCT had a higher recognition rate than the SST in group (III), which means that the MSCT retains more distinctive information than the SST under the carrying, clothing and time covariates. Figs. 13 and 14 show the recognition rates of the MSCT and SST when applied to the USF data set individually.

[Fig. 13. Recognition rate using the MSCT on the USF HiD data set.]

Compared with the baseline algorithm, the proposed algorithm achieves a significant improvement in groups (I) and (III). The experiments showed that the algorithm does not work very well if the surface type differs from the gallery set: the extracted silhouette images may include noise such as shadows under different surface types, and the distorted silhouette images may affect the recognition rate. To further improve the recognition rate, we should find methods to reconstruct the distorted silhouette images into noise-free silhouette images. The UMD HMM approach uses a Hidden Markov Model for recognition, whereas our proposed algorithm uses the feature templates directly, without any model construction or transformation before recognition; this probably affects the recognition performance. In the future, we would like to investigate adopting an HMM or other statistical models with our proposed templates for gait recognition.
[Fig. 14. Recognition rate using the SST on the USF HiD data set.]

We further compared our proposed algorithm with another approach, the CMU key frame algorithm [23], on the USF data set. In this experiment, we adopted the USF version 1.7 data set, in which the silhouette images are extracted by a parameterized algorithm. It contains 1 gallery set and 7 probe sets, A–G; the data set covariates are similar to those of the version 2.1 data set. The CMU work uses key frames of the walking sequence for gait recognition [23]. Table 4 shows the match scores of the proposed algorithm and the key frame approach (CMU) on the USF version 1.7 data set. The experimental results show that the proposed algorithm is comparable with the key frame algorithm: its performance is better than the key frame approach on probes D and E, while its rank 1 recognition rate is slightly worse than the CMU algorithm on probes A and F. Unlike on the version 2.1 data set, the proposed representations, MSCT and SST, perform better in group (II). Since the silhouette images in USF version 1.7 are extracted by a parameterized algorithm, the quality of the extracted silhouettes depends on the parameter values; this reveals that the recognition performance depends on the quality of the silhouette images.

Table 4. Match scores of the proposed algorithm and the CMU algorithm on the USF data set (version 1.7)

Group  Probe   CMU             MSCT            SST             MSCT and SST
               Rank 1  Rank 5  Rank 1  Rank 5  Rank 1  Rank 5  Rank 1  Rank 5
I      A       87      100     69      89      69      85      79      92
       B       81      90      54      76      59      71      61      76
       C       66      83      37      61      34      54      37      61
       Mean    78      91      53      75      54      70      59      76
II     D       21      59      16      33      23      39      24      39
       E       19      50      18      34      11      34      20      41
       F       27      53      10      31      16      33      20      34
       G       23      43      5       34      14      30      9       32
       Mean    23      51      12      33      16      34      18      37

5. Conclusions

In this paper, we proposed a gait recognition algorithm for human identification by the fusion of motion and static spatio-temporal templates. Two feature templates are proposed: the motion silhouette contour template (MSCT) and the static silhouette template (SST), which embed the motion and static characteristics of gait. The performance of the proposed algorithm was evaluated experimentally on the SOTON data set [18] and the USF data set [19]. In the experiments, the recognition rate on the SOTON data set is around 85%. On the USF data set (version 2.1), under the same surface type, the recognition rate of the proposed algorithm is higher than that of the baseline algorithm: the average rank 1 recognition rate is 80% in group (I) (probes A–C) and 34% in group (III) (probes H–L). The experimental results show that the performance of the proposed algorithm is promising in both indoor and outdoor environments.

In our proposed algorithm, the two feature templates, MSCT and SST, are used together for gait recognition. These feature templates retain their discriminative power under various covariates such as shoe type, viewing angle and time. However, when the surface type of the probe set differs from that of the gallery set, the performance of our proposed algorithm is a little worse than the baseline algorithm and the UMD HMM algorithm.
This shows that the discriminative power of these feature templates is affected by the surface type: the recognition rate is lowered by distorted silhouette images. Since a person's shadow differs under different surface types, the accuracy of the silhouette extraction may be affected. To further improve the recognition rate when the surface types differ, we will investigate algorithms for reconstructing the silhouette images, so as to create noise-free silhouette images under different conditions such as shoe, clothing and surface-type differences.

In our proposed algorithm, the two templates are used directly for recognition without any model creation, parameter setting or transformation. The proposed algorithm is simple and suitable for real-time recognition; experiments showed that the average processing time is around 7.7 s, and the performance is
comparable with some existing works. However, the algorithm still has room for improvement. In the future, we shall seek to apply dimension reduction techniques such as kernel PCA to reduce the computational complexity; such techniques could further reduce the execution time. We would also like to adopt other statistical models, such as the Hidden Markov Model, to further improve the recognition performance of our algorithm. Furthermore, we would like to find new gait feature templates for recognition.

Acknowledgments

This work was partially supported by the iJADE projects BQ569, A-PF74 and Cogito iJADE project PG50 of the Hong Kong Polytechnic University.

References

[1] M.P. Murray, A.B. Drought, R.C. Kory, Walking patterns of normal men, J. Bone Joint Surg. 46-A (2) (1964) 335–360.
[2] J. Cutting, L. Kozlowski, Recognizing friends by their walk: gait perception without familiarity cues, Bull. Psychon. Soc. 9 (5) (1977) 353–356.
[3] C. Barclay, J. Cutting, L. Kozlowski, Temporal and spatial factors in gait perception that influence gender recognition, Percept. Psychophys. 23 (2) (1978) 145–152.
[4] N.V. Boulgouris, D. Hatzinakos, K.N. Plataniotis, Gait recognition: a challenging signal processing technology for biometric identification, IEEE Signal Process. Mag. 22 (6) (2005) 78–90.
[5] G. Johansson, Visual motion perception, Sci. Am. (1975) 75–88.
[6] S.A. Niyogi, E.H. Adelson, Analyzing and recognizing walking figures in XYT, Proc. Computer Vision and Pattern Recognition (1994) 469–474.
[7] A. Johnson, A. Bobick, A multi-view method for gait recognition using static body parameters, in: Proceedings of the Third International Conference on Audio- and Video-based Biometric Person Authentication, 2001, pp. 301–311.
[8] L. Lee, W.E.L. Grimson, Gait analysis for recognition and classification, in: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2002, pp. 148–155.
[9] D.K. Wagg, M.S. Nixon, On automated model-based extraction and analysis of gait, in: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2004, pp. 11–16.
[10] H. Murase, R. Sakai, Moving object recognition in eigenspace representation: gait analysis and lip reading, Pattern Recognition Lett. 17 (1996) 155–162.
[11] M. Turk, A. Pentland, Face recognition using eigenfaces, Proc. Computer Vision and Pattern Recognition (1991) 586–591.
[12] P.S. Huang, C.J. Harris, M.S. Nixon, Human gait recognition in canonical space using temporal templates, IEE Proc. Vision Image Signal Process. 146 (2) (1999) 93–100.
[13] J.P. Foster, M.S. Nixon, A. Prügel-Bennett, Automatic gait recognition using area-based metrics, Pattern Recognition Lett. 24 (2003) 2489–2497.
[14] J.B. Hayfron-Acquah, M.S. Nixon, J.N. Carter, Automatic gait recognition by symmetry analysis, Pattern Recognition Lett. 24 (2003) 2175–2183.
[15] C. BenAbdelkader, R. Cutler, H. Nanda, L.S. Davis, EigenGait: motion-based recognition of people using image self-similarity, in: Proceedings of the International Conference on Audio- and Video-based Biometric Person Authentication (AVBPA), 2001.
[16] L. Wang, T. Tan, Silhouette analysis-based gait recognition for human identification, IEEE Trans. PAMI 25 (12) (2003) 1505–1518.
[17] Z. Liu, S. Sarkar, Simplest representation yet for gait recognition: averaged silhouette, in: Proceedings of the International Conference on Pattern Recognition, vol. 4, 2004, pp. 211–214.
[18] J.D. Shutler, M.G. Grant, M.S. Nixon, J.N. Carter, On a large sequence-based human gait database, in: Proceedings of the Fourth International Conference on Recent Advances in Soft Computing, 2002, pp. 66–71.
[19] S. Sarkar, P.J. Phillips, Z. Liu, I.R. Vega, P. Grother, K.W. Bowyer, The HumanID gait challenge problem: data sets, performance, and analysis, IEEE Trans. PAMI 27 (2) (2005) 162–177.
[20] P.S. Huang, C.J. Harris, M.S. Nixon, Human gait recognition in canonical space using temporal templates, IEE Proc. Vision Image Signal Process. 146 (2) (1999) 93–100.
[21] R. van den Boomgaard, R. van Balen, Methods for fast morphological image transforms using bitmapped binary images, Comput. Vision Graphics Image Process.: Graphical Models Image Process. 54 (3) (1992) 252–258.
[22] A. Kale, A. Sundaresan, A. Rajagopalan, N. Cuntoor, A. Roy-Chowdhury, V. Kruger, R. Chellappa, Identification of humans using gait, IEEE Trans. Image Process. (2004) 1163–1173.
[23] R.T. Collins, R. Gross, J. Shi, Silhouette-based human identification from body shape and gait, in: Proceedings of the International Conference on Automatic Face and Gesture Recognition, 2002, pp. 351–356.

About the Author—TOBY LAM graduated from the Department of Computing of the Hong Kong Polytechnic University in 2003. He is now a Ph.D. candidate working in the fields of pattern recognition, gait recognition, biometrics, intelligent agent technology and agent ontology.

About the Author—RAYMOND LEE received his B.Sc. from Hong Kong University in 1989, and his M.Sc. and Ph.D. from the Hong Kong Polytechnic University in 1997 and 2000, respectively. After graduating from Hong Kong University, he joined the Hong Kong Government at the Hong Kong Observatory as a meteorological scientist for weather forecasting, and from 1989 to 1993 took part in the development of telecommunication systems for the provision of meteorological services. Prior to joining the Hong Kong Polytechnic University in September 1998, he also worked as an MIS manager and system consultant in various business organizations in Hong Kong, where he developed various IS and e-commerce projects. His major research areas include intelligent agent technology (IAT), agent ontology, chaotic neural networks, pattern recognition, epistemology, visual perception and visual psychology, weather simulation and forecasting, and intelligent e-commerce systems.

About the Author—DAVID ZHANG (M'92–SM'95) graduated in computer science from Peking University in 1974. He received his M.Sc. and Ph.D. in computer science from the Harbin Institute of Technology (HIT) in 1982 and 1985, respectively. From 1986 to 1988 he was a postdoctoral fellow at Tsinghua University and then an associate professor at the Academia Sinica, Beijing. In 1994 he received his second Ph.D., in electrical and computer engineering, from the University of Waterloo, Ont., Canada. Currently, he is a Chair Professor at the Hong Kong Polytechnic University, where he is the Founding Director of the Biometrics Technology Centre (UGC/CRC) supported by the Hong Kong SAR Government. He also serves as an adjunct professor at Tsinghua University, Shanghai Jiao Tong University, Beihang University, the Harbin Institute of Technology, and the University of Waterloo. Professor Zhang is a Croucher Senior Research Fellow and a Distinguished Speaker of the IEEE Computer Society.