ARTICLE IN PRESS
Neurocomputing 73 (2010) 1892–1899

Recursive spatiotemporal subspace learning for gait recognition

Rong Hu*, Wei Shen, Hongyuan Wang
Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China

Available online 12 March 2010

Abstract

In this paper, we propose a new gait recognition method using recursive spatiotemporal subspace learning. In the first stage, the periodic dynamic feature of gait over time is extracted by Principal Component Analysis (PCA), and gait sequences are represented in the form of a Periodicity Feature Vector (PFV). In the second stage, the shape feature of gait over space is extracted by Discriminative Locality Alignment (DLA), based on the PFV representation of the gait sequences. After this recursive subspace learning, the gait sequence data is compressed into a very compact vector, named the Gait Feature Vector (GFV), which is used for individual recognition. Compared to other gait recognition methods, the GFV is an effective representation of gait because the recursive spatiotemporal subspace learning technique extracts both shape features and dynamic features. At the same time, representing gait sequences in PFV form is an efficient way to save storage space and computation time. Experimental results show that the proposed method achieves highly competitive performance with respect to published gait recognition approaches on the USF HumanID gait database.

© 2010 Elsevier B.V. All rights reserved.

Keywords: Gait recognition; Recursive spatiotemporal subspace learning; Periodicity feature vector; Gait feature vector; Principal Component Analysis; Discriminative Locality Alignment
1. Introduction

Gait recognition [1–3] is a challenging signal processing technology for video surveillance and biometric identification, and it has attracted increasing attention. Compared to other biometrics such as face [4], fingerprints [5], iris [6], and signature [7], gait can identify people more effectively under public surveillance conditions. The benefits of gait recognition are: it is hard to fake, because gait recognition requires no prior consent of the observed subject; no special equipment is required for image acquisition, and cameras can easily be installed in public places; and it offers potential for recognition at a long distance, when the observed subject occupies too few image pixels for other biometrics to be perceivable.

As with most biometric recognition applications, gait recognition seeks to extract a human gait feature for identification. The main difference is that gait data is a three-dimensional cube, as in Fig. 1, while other biometrics [4–7] are two-dimensional. The additional dimension of gait is the time axis, which contains the dynamics of gait. As a consequence, gait recognition must perform two feature extractions to obtain both the shape features and the dynamic features. Fig. 1 shows the framework of gait recognition. There are two alternative paths for gait feature extraction: the upper one uses a space-then-time extraction order, and the lower one uses a time-then-space extraction order. The final gait feature is obtained by a recursive feature extraction on the interim space or time gait feature.

* Corresponding author. Tel./fax: +86 027 87543535. E-mail addresses: hr@smail.hust.edu.cn (R. Hu), shenwei@smail.hust.edu.cn (W. Shen), wythywl@public.wh.hb.cn (H. Wang).
0925-2312/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.neucom.2009.12.034

2. Related work and our contribution

In recent years, various techniques have been proposed for human recognition by gait.
These techniques can be divided into model-based and model-free approaches. Model-based approaches aim to derive the movement of the torso and/or the legs. BenAbdelkader et al.'s approach using structural stride parameters (stride and cadence) [8] is a prime example of a model-based approach. An early system for automatic extraction and description of gait models was proposed by Cunado et al. [9]; Yam et al. then extended this system to describe both legs and to handle walking as well as running [10]; and Wagg and Nixon [11] developed an alternative model-based system which uses evidence gathering and model-based analysis driven by anatomical constraints.

Unlike model-based approaches, model-free approaches operate directly on the gait sequences without assuming any specific model of the walking human. A prime example of a model-free approach is Kale et al.'s and Sundaresan et al.'s deployment of hidden Markov models (HMMs) [12,13], which considers two different image features: the width of the outer contour of a binary silhouette, and the entire binary silhouette itself. Wang et al. and Vega and Sarkar considered other image features, silhouette boundary vector variations [14] and change of feature relationship [15], to which Principal Component Analysis is applied to reduce their dimensions while
increasing the discriminative power. Sarkar et al. proposed an approach which performs recognition by temporal correlation of silhouettes [16]; the aim was to develop a technique against which future performance could be evaluated. Simple temporal correlation [14–16] is the most common method for sequence matching, while other methods such as Fourier analysis [17] and dynamic time warping [18] are also available.

[Fig. 1. The framework of gait recognition to extract gait feature data.]

Model-based approaches mainly focus on the dynamics of gait, while shape features are omitted. And since model-based approaches rely on the identification of specific gait parameters in the gait sequence, they usually require high-quality gait sequences to be useful. Moreover, other hindrances, such as self-occlusion of the walking subject, may even render the computation of model parameters impossible. Because of these serious drawbacks of model-based approaches, we focus on model-free approaches.

Model-free approaches usually perform space feature extraction first, and then match the feature sequence using simple temporal correlation, dynamic time warping, etc. But in real problems the silhouette images are quite distorted and the extracted space feature cannot represent the gait stance, so the classification rate is not satisfying. Efforts have been made to resolve such problems by averaging, as in the temporal template [20], the gait energy image [19], and the eigenstance of gait [13], which achieved obvious improvements. However, the average image method is too simple to contain all the dynamic features of gait, and the eigenstance method needs accurate gait cycle detection and phase estimation.
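The silhouette sequence matching just described can be sketched with a generic frame-wise similarity. This is only an illustration of temporal-correlation matching under assumed array shapes (binary silhouette volumes of equal length), not Sarkar et al.'s exact baseline algorithm; `sequence_similarity` and the Tanimoto (intersection-over-union) measure are our own illustrative choices.

```python
import numpy as np

def sequence_similarity(probe, gallery):
    """Mean frame-wise Tanimoto similarity between two (T, H, W)
    binary silhouette sequences of equal length T."""
    inter = np.logical_and(probe, gallery).sum(axis=(1, 2))
    union = np.logical_or(probe, gallery).sum(axis=(1, 2))
    return float(np.mean(inter / np.maximum(union, 1)))

# Toy usage: a sequence compared with itself scores 1.0.
rng = np.random.default_rng(3)
a = rng.random((10, 32, 22)) > 0.5
print(sequence_similarity(a, a))  # 1.0
```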
Based on the above discussion, we propose a new gait recognition method using recursive spatiotemporal subspace learning. Different from most model-free approaches, we extract the time feature first. Since gait is a periodic activity with a stable frequency, each location of the aligned gait images should exhibit a certain periodic characteristic during the movement. To make full use of this periodic (dynamic) feature of gait, we adopt the discrete Fourier transform (DFT) to transform the periodic signal at each location of the aligned gait images into the frequency domain. An unsupervised subspace learning method, Principal Component Analysis (PCA), is then applied to obtain the most discriminative components from the frequency signals. The extracted feature vector is called the Periodicity Feature Vector (PFV), and it represents the dynamic feature at each location of gait. Based on the PFV representation of gait, a recursive subspace learning process named Discriminative Locality Alignment (DLA) is applied to extract the shape feature of gait. After DLA, the original gait sequence data is compressed into a very compact vector named the Gait Feature Vector (GFV). The Euclidean distance between GFVs is used for measuring the similarity of gaits. In comparison with the state of the art, the contributions of this paper are:

1. Unlike model-based approaches, the recursive spatiotemporal subspace learning method considers both the dynamic and shape features of gait, which improves the recognition rate. Since it does not rely on the identification of specific gait parameters, it does not require high-quality gait sequences. Other hindrances such as self-occlusion are not a problem either.
2. PFV is robust to silhouette distortions. Different from most model-free approaches, we extract the dynamic feature first, so the shape feature of gait is based on the entire sequence rather than on single frames. Silhouette distortion in a single frame is averaged out over the whole sequence, which makes PFV a robust representation of gait dynamics.
3. PFV is an efficient representation of gait dynamics which makes full use of the periodic characteristic of gait. Compared to the averaged gait image, the PFV representation contains more dynamic features.
4. The PFV representation of gait saves both storage space and computation time. It compresses the gait sequence into a single vector image while preserving most of the temporal information. At the same time, there is no need to extract shape features frame by frame; based on the vector image, shape feature extraction is carried out only once.
5. Compared to traditional Fisher linear discriminant analysis (LDA), Discriminative Locality Alignment (DLA) reveals the nonlinear structure hidden in high-dimensional, non-Gaussian distributed data. Meanwhile, DLA does not suffer from the matrix singularity problem.

The rest of the paper is organized as follows: Section 3 introduces the details of extracting the periodicity feature vector using DFT and PCA; Section 4 introduces the details of extracting the gait feature vector based on the gait PFV representation using DLA, and the procedure of individual recognition; Section 5 presents the experimental results and analysis; Section 6 concludes the paper.

3. Periodicity feature vector

3.1. Motivation

Gait is a typical periodic activity with a stable frequency, so we seek an efficient way to represent the dynamics at different gait locations while making the best use of this periodic characteristic. Under the assumption that the observed person is properly aligned along the whole gait sequence, the sample sequences at different gait locations should be periodic. And for periodic signals, the power spectrum (computed via the DFT) is a natural representation, since it contains all the dynamic characteristics and sequences are automatically aligned at different spectral positions.
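This phase-alignment property is easy to verify numerically: two phase-shifted copies of the same periodic signal have identical DFT power spectra, so per-location spectra need no temporal synchronization. A minimal check (illustrative only):

```python
import numpy as np

t = np.arange(64)
s1 = np.sin(2 * np.pi * 4 * t / 64)        # 4 full cycles
s2 = np.sin(2 * np.pi * 4 * t / 64 + 1.3)  # same signal, different phase
p1 = np.abs(np.fft.rfft(s1)) ** 2          # power spectrum of s1
p2 = np.abs(np.fft.rfft(s2)) ** 2          # power spectrum of s2
print(np.allclose(p1, p2))                 # True: the spectra coincide
```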
However, the power spectra alone are still not discriminative enough: it is hard to tell the difference between those power spectra. To achieve the best discriminative effect, a subspace learning step should follow. This is a typical case of unsupervised subspace learning [30], since we do not know how many classes there are or which class a training spectrum belongs to. We adopt PCA [31] to increase the discriminative power between spectrum signals.

3.2. Preprocessing

Silhouettes are extracted using the simple background subtraction method [16], and a normalization step follows immediately to make sure that all extracted silhouettes are scaled to the same height and the centroid of the body is aligned with the image's centerline. The sample sequence at position (x, y) is denoted by P(x, y, t), in which the variable t represents the time axis. To reduce the sequential noise, P(x, y, t) is first Gaussian filtered in the time dimension,

    P_g(x, y, t) = P(x, y, t) * G_\sigma(t)    (1)

where G_\sigma(t) is a Gaussian kernel and '*' denotes the convolution operation. The Fourier transform is then applied to these time sequences. But before that, P_g(x, y, t) should be zero-meaned to remove the DC component from its power spectrum,

    P_z(x, y, t) = P_g(x, y, t) - (1/N) \sum_{t=1}^{N} P_g(x, y, t)    (2)

where N is the length of the gait sequence.

Since the Fourier transform is carried out in discrete form and the length of gait sequences varies, interpolation is necessary to make sure that all power spectra are aligned at the same frequency locations. Suppose first that the gait sequences have the same sampling rate. These sequences are transformed by the DFT to the frequency domain, in which the frequency samples are uniformly distributed over the range [0, f_s/2], where f_s is the sampling rate. If the length of the k-th gait sequence is N_k, then the number of samples of the corresponding frequency sequence is also N_k. The aim of interpolation is to transform this sequence from N_k samples to N samples while preserving its frequency characteristic, where N \neq N_k. The first step of interpolation is to expand the power spectrum from N_k samples to aN_k samples while keeping the running sum unchanged,

    \sum_{i=1}^{N_k} x(i) = \sum_{j=1}^{aN_k} \tilde{x}(j)    (3)

where x(i) represents the spectrum sequence and a is a positive integer. \tilde{x}(j) is calculated by the following criterion:

    \tilde{x}(j) = x(i)/a,  if j \in [(i-1)a + 1, ia]    (4)

The second step of interpolation is to compress the expanded sequence from aN_k samples to N samples, again keeping the running sum unchanged,

    \sum_{i=1}^{aN_k} \tilde{x}(i) = \sum_{j=1}^{N} \hat{x}(j)    (5)

\hat{x}(j) is calculated as follows:

    \hat{x}(j) = \sum_{i=(j-1)b+1}^{jb} \tilde{x}(i)    (6)

Note that both a and b are positive integers, properly chosen such that aN_k = bN.

The sampling rates of gait sequences are the same most of the time. However, if the sampling rates differ, a trailing-zero padding step should be applied before interpolation. Trailing zeros expand those frequency sequences whose sampling rates are less than the maximum sampling rate. Suppose the original frequency range of the k-th gait sequence is [0, f_k] while the maximum frequency range is [0, f_max]; trailing zeros are then added over the frequency range [f_k, f_max], and the number of trailing zeros is

    N_zeros = (f_max - f_k) / (f_k / N_k)    (7)

where N_k is the number of samples in the original frequency sequence.

3.3. PCA training

The training set of PCA consists of the power spectra at each location of the training gait sequences. Suppose N_pca is the number of training gait sequences and the size of a silhouette image is L; then the training set of PCA can be represented by X = {x_1, x_2, ..., x_n}, where x_i is a processed spectrum as described in Section 3.2 and n = L × N_pca. The dimension of x_i is N, which equals the number of samples in the power spectra. First, compute the scatter matrix S of the training set X,

    S = \sum_{i=1}^{n} (x_i - m)(x_i - m)^T    (8)

where m = (1/n) \sum_{i=1}^{n} x_i. We then obtain the d-dimensional periodicity feature vector y_k by

    y_k = [e_1, e_2, ..., e_d]^T x_k = M x_k    (9)

where d < N and [e_1, e_2, ..., e_d] are the d eigenvectors of the scatter matrix S with the highest eigenvalues. The dimension d of the periodicity feature vector y_k depends on the eigenvalues of S. Suppose {\lambda_1, \lambda_2, ..., \lambda_N} are the eigenvalues of S arranged from high to low; the parameter d is decided by the following criterion:

    W_d = \sum_{i=1}^{d} \lambda_i / \sum_{i=1}^{N} \lambda_i > T_s    (10)

where T_s is the threshold. Once the transform matrix [e_1, e_2, ..., e_d]^T is established in the training step, it can be used directly to get the periodicity feature vectors of all power spectra, and the time sequences are replaced by the periodicity feature vectors y_k. Since the dimension of y_k is much less than the original gait sequence length, the PFV representation of gait saves a lot of storage space.

The PFV representation of gait is a vector image, which can be divided into dimensions. The collection of the i-th dimensional value of y_k at each location of gait forms the i-th dimensional image of the PFV representation. To give a more intuitive understanding, we scale the values into the range [0, 255] and show them as gray-level images. Fig. 2 shows the first five dimensions of the gait PFV representation together with the averaged gait image as a comparison. From Fig. 2 we can see that each dimension represents a different periodicity characteristic.

[Fig. 2. (a) Averaged gait image; (b)–(f) the first five dimensions of the gait PFV representation.]

4. Space feature extraction

Principal Component Analysis (PCA) and Fisher's linear discriminant analysis (LDA) are two of the most popular linear dimensionality reduction algorithms.
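The pipeline of Sections 3.2–3.3 (temporal Gaussian smoothing, DC removal, DFT power spectrum, then PCA with the energy criterion of Eq. (10)) can be sketched as follows. This is a simplified illustration on synthetic data: the interpolation of Eqs. (3)–(7) is omitted by assuming equal sequence lengths, and all shapes and function names are our assumptions, not the authors' code.

```python
import numpy as np

def power_spectra(seq, sigma=1.0):
    """Eqs. (1)-(2): Gaussian-smooth each location's time series, remove
    the DC component, and return the DFT power spectrum per location.
    `seq` is an (L, T) array: L pixel locations observed over T frames."""
    t = np.arange(-3 * sigma, 3 * sigma + 1)
    g = np.exp(-t ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    smooth = np.apply_along_axis(lambda s: np.convolve(s, g, mode='same'), 1, seq)
    zero_mean = smooth - smooth.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(zero_mean, axis=1)) ** 2   # (L, N) spectra

def train_pca(X, Ts=0.9):
    """Eqs. (8)-(10): scatter matrix of the training spectra; keep the top
    d eigenvectors whose eigenvalues cover a fraction Ts of the energy."""
    m = X.mean(axis=0)
    S = (X - m).T @ (X - m)                              # scatter matrix, Eq. (8)
    lam, E = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]                        # eigenvalues high to low
    lam, E = lam[order], E[:, order]
    d = int(np.searchsorted(np.cumsum(lam) / lam.sum(), Ts)) + 1   # Eq. (10)
    return E[:, :d].T                                    # transform matrix M, Eq. (9)

# Toy usage: 3 sequences, 64 locations, 60 frames each.
rng = np.random.default_rng(0)
seqs = [rng.random((64, 60)) for _ in range(3)]
X = np.vstack([power_spectra(s) for s in seqs])          # training spectra (192, 31)
M = train_pca(X)
pfv = M @ power_spectra(seqs[0]).T                       # (d, L) PFV vector image
```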
PCA [31] maximizes the mutual information between the original high-dimensional Gaussian distributed data and the projected low-dimensional data. PCA is optimal for the reconstruction of Gaussian distributed data, but it is not optimal for classification problems. LDA [32] overcomes this shortcoming by utilizing the class label information: it finds the projection directions that maximize the trace of the between-class scatter matrix and minimize the trace of the within-class scatter matrix simultaneously. While LDA is a good algorithm for classification, it also has several problems. First, LDA considers only the global Euclidean structure, so it cannot discover the nonlinear structure hidden in high-dimensional non-Gaussian distributed data. Second, LDA is in fact based on the assumption that all samples contribute equally to discriminative dimensionality reduction, although samples around the margins are more important for classification than inner samples. Finally, LDA suffers from the matrix singularity problem, since the between-class scatter matrix is often singular.

In this section, a PCA+DLA [21] subspace selection method is used for extracting the space feature of the gait PFV representation. PCA is first applied to the gait vector images to eliminate useless information, and then DLA (Discriminative Locality Alignment) is used for classification. Compared to LDA, DLA has three particular advantages: 1) because it focuses on the local patch of each sample, it can deal with the nonlinearity of the sample distribution while preserving the discriminative information; 2) since the importance of marginal samples is emphasized in discriminative subspace selection, it learns low-dimensional representations that are well suited for classification; and 3) because it obviates the need to compute the inverse of a matrix, it has no matrix singularity problem.

4.1. PCA on vector images

The training samples of traditional subspace learning methods for space feature extraction are scalar images. However, the training samples in this paper are vector images, since gait sequences are represented in PFV form, so some changes are needed. Suppose G = {g_1, g_2, ..., g_n} is the training set for space feature extraction, where n is the number of gait sequences in the gallery set and g_k represents the k-th gait vector image. g_k is an L × d matrix, where L is the size of the vector image and d is the dimension of the periodicity feature vectors. The covariance between the i-th and j-th dimensions of the gait vector images is illustrated in Fig. 3, where y_{k,i} and y_{k,j} are the i-th and j-th dimensions of g_k, and E[y_i] and E[y_j] are the means of the i-th and j-th dimensions of the gait vector images, respectively. The covariance is formulated as follows:

    cov(i, j) = \sum_{k=1}^{n} |y_{k,i} - E[y_i]| \cdot |y_{k,j} - E[y_j]| \cos\theta    (11)

where |y_{k,i} - E[y_i]| and |y_{k,j} - E[y_j]| represent the lengths of the vectors (y_{k,i} - E[y_i]) and (y_{k,j} - E[y_j]), respectively, and \theta represents the angle between them. Eq. (11) can be expressed in matrix form as

    S = \sum_{k=1}^{n} (g_k - m)(g_k - m)^T    (12)

where both g_k and m are L × d matrices, and m = (1/n) \sum_{k=1}^{n} g_k is the mean vector image. We obtain the transformation matrix M_PCA = [e_1, e_2, ..., e_{d_1}]^T, d_1 < L, by calculating the eigenvectors of S in Eq. (12), and then transform g_k (k = 1, 2, ..., n) into the lower-dimensional space by

    x_k = M_PCA g_k    (13)

where x_k \in R^{d_1 \times d}. The Euclidean distance is used for measuring the distance between points in the lower-dimensional space.

[Fig. 3. An illustration of calculating the covariance between dimensions i and j of vector images.]

4.2. Discriminative locality alignment

Many methods have been proposed to overcome these shortcomings of LDA, including geometric mean [22], transductive component analysis [23], discriminant locally linear embedding with high-order tensor data [24], and so on. Discriminative Locality Alignment (DLA) [21] is one of these newly proposed methods, and it achieves excellent performance. DLA operates in three stages. In the first stage, for each sample in the dataset, one patch is built from the given sample and its neighbors, which include samples not only from the same class but also from different classes. On each patch, an objective function is designed to preserve the local discriminative information. In the second stage, a margin degree is defined for each sample as a measure of the sample's importance in contributing to classification. Then, the objective functions are weighted
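Eq. (12) amounts to treating each pixel location's d-dimensional PFV as one sample value, so the scatter matrix of the vector images is L × L. A toy sketch with assumed sizes (illustrative, not the authors' implementation):

```python
import numpy as np

# Assumed toy sizes: n gait vector images g_k of shape (L, d),
# reduced to d1 spatial components.
rng = np.random.default_rng(1)
L, d, n, d1 = 50, 5, 8, 10
G = rng.random((n, L, d))                    # training vector images

m = G.mean(axis=0)                           # mean vector image (L, d)
S = sum((g - m) @ (g - m).T for g in G)      # Eq. (12): (L, L) scatter matrix
lam, E = np.linalg.eigh(S)
M_pca = E[:, np.argsort(lam)[::-1][:d1]].T   # top-d1 eigenvectors, (d1, L)
X = np.stack([M_pca @ g for g in G])         # Eq. (13): each x_k is (d1, d)
```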
based on the margin degree. In the final stage, all the weighted objective functions are integrated together to form a global coordinate system. The projection matrix can be obtained by solving a standard eigen-decomposition problem.

For a given gait x_i (after PCA), according to the class label information, DLA divides the other samples into two groups: samples in the same class as x_i and samples from different classes than x_i. It selects the k_1 nearest neighbors from the samples in the same class and the k_2 nearest neighbors from the samples in different classes, and the local patch for x_i is represented by X_i = [x_i, x_{i_1}, ..., x_{i_{k_1}}, x_{i^1}, ..., x_{i^{k_2}}]. For each patch, the corresponding output in the low-dimensional space is Y_i = [y_i, y_{i_1}, ..., y_{i_{k_1}}, y_{i^1}, ..., y_{i^{k_2}}]. DLA expects the distances between y_i and its neighbor samples from the same class to be as small as possible and, at the same time, the distances between y_i and its neighbor samples from different classes to be as large as possible, so we get

    \arg\min_{y_i} ( \sum_{j=1}^{k_1} ||y_i - y_{i_j}||^2 - \beta \sum_{p=1}^{k_2} ||y_i - y_{i^p}||^2 )    (14)

where \beta is a scaling factor in [0, 1] that unifies the different measures of the within-class distance and the between-class distance. Define the coefficient vector

    \omega_i = [1, ..., 1, -\beta, ..., -\beta]^T    (15)

(k_1 entries equal to 1 followed by k_2 entries equal to -\beta). Then Eq. (14) reduces to

    \arg\min_{Y_i} tr(Y_i L_i Y_i^T)    (16)

where L_i encapsulates both the local geometry and the discriminative information, and is given by

    L_i = [ \sum_{j=1}^{k_1+k_2} (\omega_i)_j    -\omega_i^T
            -\omega_i                            diag(\omega_i) ]    (17)

To quantify the importance of a sample x_i for discriminative subspace selection, DLA defines a measure termed the margin degree m_i. For a sample, its margin degree is proportional to the number of samples that have class labels different from that of the sample but lie in the \epsilon-ball centered at the sample. The definition of the margin degree m_i for the i-th sample is

    m_i = exp( -1 / ((n_i + \delta) t) )    (18)

where n_i is the number of samples in the \epsilon-ball centered at x_i with labels different from the label of x_i, \delta is a regularization parameter, and t is a scaling factor. In DLA, the part optimization of the i-th patch is weighted by the margin degree of the i-th sample before the whole alignment stage, i.e.,

    \arg\min_{Y_i} m_i tr(Y_i L_i Y_i^T) = \arg\min_{Y_i} tr(Y_i m_i L_i Y_i^T)    (19)

Y_i is selected from the global coordinate Y = [y_1, ..., y_n], such that

    Y_i = Y S_i    (20)

where S_i \in R^{n \times (k_1 + k_2 + 1)} is the selection matrix. Then Eq. (19) can be rewritten as

    \arg\min_Y tr(Y S_i m_i L_i S_i^T Y^T)    (21)

By summing all the part optimizations of Eq. (21) together, we obtain the whole alignment as

    \arg\min_Y tr( Y ( \sum_{i=1}^{n} S_i m_i L_i S_i^T ) Y^T ) = \arg\min_Y tr(Y L Y^T)    (22)

where L = \sum_{i=1}^{n} S_i m_i L_i S_i^T \in R^{n \times n} is the alignment matrix. To obtain the linear and orthogonal projection matrix U, such that Y = U^T X, standard eigen-decomposition is applied to the matrix X L X^T \in R^{d_1 \times d_1}, where X \in R^{d_1 \times d \times n} is the input data set. We obtain the transformation matrix M_DLA = [e_1, e_2, ..., e_{d_2}]^T, d_2 < d_1, by calculating the eigenvectors of X L X^T, and transform x_i (i = 1, 2, ..., n) into the gait feature space by

    z_i = M_DLA x_i = M_DLA M_PCA g_i    (23)

where z_i \in R^{d_2 \times d} is called the gait feature vector (GFV). The Euclidean distance between gait feature vectors is used to measure the similarity between gaits.

4.3. Individual recognition

We train on the spectrum signals of the gait sequences for PFV extraction and get the transformation matrix M; we train on the labeled gallery gaits, represented in PFV form, for space feature extraction and get the transformation matrices M_PCA and M_DLA. For any probe gait sequence:

(1) Calculate the power spectra at each gait location as described in Section 3.2;
(2) Use the matrix M to transform each spectrum into a periodicity feature vector (Eq. (9));
(3) Extract the gait feature vector (GFV) using M_PCA and M_DLA (Eq. (23));
(4) Calculate the Euclidean distance between the probe GFV and the gallery GFVs as follows:

    D(z_i, z_j) = \sqrt{ \sum_{m=1}^{d_2} \sum_{n=1}^{d} (z_i^{m,n} - z_j^{m,n})^2 }    (24)

where z_i represents the probe GFV and z_j represents a gallery GFV. There are c classes (individuals) in the gallery. We assign a probe gait z_i to class k if its nearest gallery neighbor belongs to class k, i.e.,

    k = \arg\min_k ( \min_{z_j \in D_k} D(z_i, z_j) )    (25)

where D_k is the set of gait sequences belonging to the k-th class.

5. Experiment and analysis

5.1. Data and parameters

Our experiments are carried out on the USF HumanID gait database [16]. This database consists of 122 persons, and for each person there are up to five covariates: viewpoint (left/right), shoe type (two different types), surface type (grass/concrete), carrying condition (with/without a briefcase), clothing, and time. Twelve experiments are designed for individual recognition, as shown in Table 1. Fig. 4 shows the first three dimensions of the PFV representation in the gallery set and their corresponding sequences in probe sets A–L. From Fig. 4, we can see that each dimension of the PFV representation represents a different type of periodic characteristic of the gait dynamics, and they are unique to individuals.

There are three parameters to be decided in the process of reducing the dimensionality of gait data: the dimension d of the periodicity feature vectors (Section 3.3), the number of principal components d_1 in the PCA stage of space feature extraction (Section 4.1), and the number of chosen eigenvectors d_2 in the DLA
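The patch construction, margin-degree weighting, and whole-alignment steps of Eqs. (14)–(22) can be sketched on toy vectorized samples as below. All parameter values (k1, k2, beta, the epsilon-ball radius, delta, and the scaling factor) are illustrative, and taking the eigenvectors with the smallest eigenvalues of X L X^T reflects the minimization in Eq. (22); this is a sketch, not the authors' implementation.

```python
import numpy as np

def dla_alignment(X, labels, k1=2, k2=2, beta=0.5, eps=1.0, delta=1.0, tau=1.0):
    """Build the n x n alignment matrix L of Eq. (22) from data X (d1, n)."""
    n = X.shape[1]
    D = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)  # pairwise distances
    Lmat = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(D[i])
        same = [j for j in order if j != i and labels[j] == labels[i]][:k1]
        diff = [j for j in order if labels[j] != labels[i]][:k2]
        w = np.concatenate([np.ones(len(same)), -beta * np.ones(len(diff))])
        Li = np.zeros((1 + len(w), 1 + len(w)))                # Eq. (17)
        Li[0, 0] = w.sum()
        Li[0, 1:] = -w
        Li[1:, 0] = -w
        Li[1:, 1:] = np.diag(w)
        ni = np.sum((D[i] < eps) & (labels != labels[i]))      # eps-ball count
        mi = np.exp(-1.0 / ((ni + delta) * tau))               # margin degree, Eq. (18)
        idx = [i] + same + diff                                # selection S_i, Eq. (20)
        Lmat[np.ix_(idx, idx)] += mi * Li                      # accumulate, Eq. (22)
    return Lmat

# Toy usage: 12 samples in R^6, three classes of four samples each.
rng = np.random.default_rng(2)
X = rng.random((6, 12))
labels = np.repeat([0, 1, 2], 4)
Lmat = dla_alignment(X, labels)
lam, E = np.linalg.eigh(X @ Lmat @ X.T)
U = E[:, :3]          # d2 = 3 directions minimizing tr(Y L Y^T), Y = U^T X
Y = U.T @ X
```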
Table 1. Twelve experiments designed for individual recognition in the USF HumanID gait database.

Exp.  Probe (Surface, Shoe, View, Carry, Time)  Number of subjects  Difference
A     (G, A, L, NB, t1)                         122                 V
B     (G, B, R, NB, t1)                         54                  S
C     (G, B, L, NB, t1)                         54                  S + V
D     (C, A, R, NB, t1)                         121                 F
E     (C, B, R, NB, t1)                         60                  F + S
F     (C, A, L, NB, t1)                         121                 F + V
G     (C, B, L, NB, t1)                         60                  F + S + V
H     (G, A, R, BF, t1)                         120                 B
I     (G, B, R, BF, t1)                         60                  S + B
J     (G, A, L, BF, t1)                         120                 V + B
K     (G, A/B, R, NB, t2)                       33                  T + S + C
L     (C, A/B, R, NB, t2)                       33                  F + T + S + C

(Surface: C—concrete, G—grass; Shoe: A/B; View: L/R; Carry: NB—no briefcase, BF—briefcase. Difference codes: V—view; S—shoe; F—surface; B—briefcase; C—clothing; T—time.)

[Fig. 4. The first three dimensions of the PFV representation of three individuals in the gallery set and their corresponding sequences in probe sets A–L.]
stage of space feature extraction (Section 4.2). All of these parameters are decided by the criterion of Eq. (10), and T_s is set to 0.9 in our experiments.

In DLA, parameter k_1 stands for the number of neighbor samples with the same class label and parameter k_2 stands for the number of neighbor samples with different class labels. If we set k_1 + k_2 = n, where n is the total number of gait sequences, then DLA is similar to LDA, because the global structure is considered; with this setting, DLA ignores the local geometry and performs poorly. Thus, by setting k_1 and k_2 suitably, DLA can capture both the local geometry and the discriminative information of the samples. In our experiments, k_1 is set to 4 and k_2 is set to 6.

5.2. Performance evaluation

Our comparison includes the following methods: the Baseline method [16], PCA+LDA [29], GEI+MFA-1 [26], GEI+TR1DA [27,28], Gabor+GTDA+LDA [25], and the PFV+PCA+DLA method of this paper; the results are shown in Table 2. Note that, for fairness, all methods are evaluated on the same gait database. The performance in Table 2 is reported as the Rank1 and Rank5 recognition rates.

Table 2. The Rank1 and Rank5 recognition rates (CCR, %) for human gait recognition.

Rank1          A    B    C    D    E    F    G    H    I    J    K    L
Baseline       73   78   48   32   22   17   17   61   57   36   3    3
PCA+LDA        87   85   76   31   30   18   21   63   59   54   3    6
GEI+MFA-1      89   89   83   38   43   25   29   58   59   56   9    18
GEI+TR1DA      85   88   71   19   23   15   14   49   47   45   7    7
Gabor+GTDA     91   93   86   32   47   21   32   95   90   68   16   19
PFV+DLA        94   92   85   46   51   28   34   68   66   62   13   20

Rank5          A    B    C    D    E    F    G    H    I    J    K    L
Baseline       88   93   78   66   55   42   38   85   78   62   12   15
PCA+LDA        93   92   89   58   60   36   43   90   81   79   12   12
GEI+MFA-1      95   96   93   64   67   44   53   89   88   81   24   27
GEI+TR1DA      100  97   95   52   52   34   45   47   71   70   25   25
Gabor+GTDA     98   99   95   58   64   41   52   98   99   87   31   37
PFV+DLA        99   97   94   66   69   52   59   92   91   84   28   35

Rank1 and Rank5 performance is the standard for estimating the correct recognition rate.
Rank1 performance means the percentage of the correct subject appearing in the first place of the retrieved rank list and Rank5 means the percentage of the correct subjects appearing in any of the first five places of the retrieved rank list. 5.3. Analysis of PFV representation To understand the property of dimensions of gait PFV representation better, Fig. 5 gives an intuitive explanation. Locations having the same periodicity characteristic are clustered in ellipses, like the background region, the leg swing region, the hand swing region, the body region and so on. The direction having the largest variance among all sample points is shown by the diagonal line labeled D1 which represents the first dimension of PFVs. D2 and D3 are directions having the second and third largest variance among all sample points. The first dimension represents the difference between foreground and background, and the rest dimensions represent the difference between body parts, such as leg, hand, body. Table 3 lists the first 10 eigenvalues in PFV training. The largest eigenvalue occupies nearly 70% of the total energy, and then it decreases rapidly to around 7.5% for the second eigenvalue. The rest eigenvalues still occupy small energy in contrast with the largest eigenvalue, but they decrease slowly. The same as gait averaged image [19], the PFV representation does not require phase estimation. However, the sequence length of gait used in DFT as described in Section 3.2 has certain impact on the recognition rate. If the length of gait sequence is exactly integer multiples of T which is the period of gait, then the corresponding GFV is called the expected GFV. To evaluate the impact of sequence length on recognition rate, first we get all the expected GFVs (one gait for each individual) which are denoted by zj. Then for each gait, we vary the sequence length ~ and get the corresponding GFV, which is denoted by z i,N and N represents the sequence length. 
The Euclidean distance between z̃_{i,N} and z_j, i = j, is called the within-class variation.

Fig. 5. Explanation of the dimensions of the gait PFV representation (regions with the same periodicity characteristic, such as background, leg swing, hand swing and body, cluster in the sample space).

Fig. 6. The change of the ratio of within-class variation to averaged between-class variation (vertical axis) as the sequence length in frames (horizontal axis) increases, for the ith gait sequence.

Table 3
The first ten eigenvalues in PFV training.

  Index       1       2       3       4       5       6       7       8       9       10
  Eigenvalue  1.0475  0.1149  0.0834  0.0497  0.0377  0.0262  0.0198  0.0160  0.0120  0.0100
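The threshold criterion of Section 5.1 (Ts = 0.9) can be illustrated on an eigenvalue spectrum like Table 3's. As a hedged sketch (using only the ten listed eigenvalues, so the energy fractions differ from those of the full spectrum, and the helper name is an assumption), the retained dimension d is the smallest one whose cumulative energy ratio reaches Ts:

```python
import numpy as np

# First ten PFV eigenvalues from Table 3 (the full spectrum has more
# entries, so the ratios here are illustrative only).
eigvals = np.array([1.0475, 0.1149, 0.0834, 0.0497, 0.0377,
                    0.0262, 0.0198, 0.0160, 0.0120, 0.0100])

def select_dim(eigvals, ts=0.9):
    """Smallest d such that the first d eigenvalues carry at least a
    fraction ts of the total energy (an Eq. (10)-style criterion)."""
    ratio = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(ratio, ts) + 1)

print(select_dim(eigvals, 0.9))   # → 4 (over these ten values)
```

The steep drop after the first eigenvalue is why only a handful of dimensions are needed: over these ten values the first component alone carries about 74% of the listed energy, and four components cross the 90% threshold.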
The Euclidean distance between z̃_{i,N} and z_j, i ≠ j, is called the between-class variation. For each gait sequence, the ratio of the within-class variation to the averaged between-class variation is an important measure of its discriminative power. For the ith gait sequence, Fig. 6 shows the change of this ratio as the sequence length increases. From this figure, we can see that the sequence length should be selected by two criteria: (1) it should be close to an integer multiple of T; (2) it should be long enough, typically more than 3 or 4 gait cycles, to minimize the within-class variation.

6. Conclusion and future work

In this paper, a new gait representation is proposed for individual recognition. By DFT and PCA, gait sequences are represented in the PFV form. Based on the PFV representation of gaits, PCA+DLA is then applied to extract the discriminative information for recognition. Gait data is finally compressed into the Gait Feature Vector (GFV), which shows competitive discriminative power. Our future work includes multimodal biometric recognition [33]: by integrating more individual modalities, e.g., face, gait and fingerprint, the recognition accuracy can be raised significantly.

References

[1] Xuelong Li, S.J. Maybank, Shuicheng Yan, Dacheng Tao, Dong Xu, Gait components and their application to gender recognition, IEEE Trans. Syst., Man, Cybern., Part C 38 (2) (2008) 145–155.
[2] M.S. Nixon, J.N. Carter, Advances in automatic gait recognition, Proc. Int. Conf. Autom. Face Gesture Recognition (2004) 139–146.
[3] D. Gavrila, The visual analysis of human movement: a survey, Comput. Vision Image Understanding 73 (1) (1999) 82–98.
[4] M. Turk, A. Pentland, Face recognition using eigenfaces, Proc. Conf. Comput. Vision Pattern Recognition (1991) 586–591.
[5] A.K. Jain, L. Hong, S. Pankanti, R. Bolle, An identity verification system using fingerprints, Proc. IEEE 85 (9) (1999) 1365–1388.
[6] J. Daugman, High confidence visual recognition of persons by a test of statistical independence, IEEE Trans. Pattern Anal. Mach. Intell. 15 (11) (1993) 1148–1161.
[7] Y. Qi, B.R. Hunt, A multiresolution approach to computer verification of handwritten signatures, IEEE Trans. Image Process. 4 (6) (1995) 870–874.
[8] C. BenAbdelkader, R. Cutler, L. Davis, Stride and cadence as a biometric in automatic person identification and verification, Proc. Int. Conf. Autom. Face Gesture Recognition (2002) 372–377.
[9] D. Cunado, M.S. Nixon, J.N. Carter, Automatic extraction and description of human gait models for recognition purposes, Comput. Vision Image Understanding 90 (1) (2003) 1–41.
[10] C.Y. Yam, M.S. Nixon, J.N. Carter, Automated person recognition by walking and running via model-based approaches, Pattern Recognition 37 (5) (2004) 1057–1072.
[11] D.K. Wagg, M.S. Nixon, On automated model-based extraction and analysis of gait, Proc. Int. Conf. Autom. Face Gesture Recognition (2004) 11–16.
[12] A. Kale, A. Sundaresan, A.N. Rajagopalan, N.P. Cuntoor, A.K. Roy-Chowdhury, V. Kruger, R. Chellappa, Identification of humans using gait, IEEE Trans. Image Process. 13 (2004) 1163–1173.
[13] A. Sundaresan, A.R. Chowdhury, R. Chellappa, A hidden Markov model based framework for recognition of humans from gait sequences, Proc. IEEE Int. Conf. Image Process. (2003) 143–150.
[14] L. Wang, T. Tan, H. Ning, W. Hu, Silhouette analysis-based gait recognition for human identification, IEEE Trans. Pattern Anal. Mach. Intell. 25 (12) (2003) 1505–1518.
[15] I.R. Vega, S. Sarkar, Statistical motion model based on the change of feature relationships: human gait-based recognition, IEEE Trans. Pattern Anal. Mach. Intell. 25 (10) (2003) 1323–1328.
[16] S. Sarkar, P.J. Phillips, Z. Liu, I.R. Vega, P. Grother, K.W. Bowyer, The humanID gait challenge problem: data sets, performance, and analysis, IEEE Trans. Pattern Anal. Mach. Intell. 27 (2) (2005) 162–177.
[17] S.D. Mowbray, M.S. Nixon, Automatic gait recognition via Fourier descriptors of deformable objects, Proc. Conf. Audio Visual Biometric Person Authentication (2003) 566–573.
[18] A. Veeraraghavan, A.R. Chowdhury, R. Chellappa, Matching shape sequences in video with applications in human movement analysis, IEEE Trans. Pattern Anal. Mach. Intell. 27 (12) (2005) 1896–1909.
[19] J. Han, B. Bhanu, Individual recognition using gait energy image, IEEE Trans. Pattern Anal. Mach. Intell. 28 (2) (2006) 316–322.
[20] A.F. Bobick, J.W. Davis, The recognition of human movement using temporal templates, IEEE Trans. Pattern Anal. Mach. Intell. 23 (3) (2001) 257–267.
[21] Tianhao Zhang, Dacheng Tao, Jie Yang, Discriminative locality alignment, Proc. ECCV, Part I (2008) 725–738.
[22] Dacheng Tao, Xuelong Li, X.D. Wu, S.J. Maybank, Geometric mean for subspace selection, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) 260–274.
[23] Wei Liu, Dacheng Tao, J.Z. Liu, Transductive component analysis, Proc. IEEE Int. Conf. Data Mining (ICDM) (2008) 433–442.
[24] Xuelong Li, S. Lin, Shuicheng Yan, Dong Xu, Discriminant locally linear embedding with high-order tensor data, IEEE Trans. Syst., Man, Cybern., Part B: Cybern. 38 (2) (2008) 342–352.
[25] Dacheng Tao, Xuelong Li, X.D. Wu, S.J. Maybank, General tensor discriminant analysis and Gabor features for gait recognition, IEEE Trans. Pattern Anal. Mach. Intell. 29 (10) (2007) 1700–1715.
[26] Dong Xu, Shuicheng Yan, S. Lin, H.J. Zhang, Marginal Fisher analysis and its variants for human gait recognition and content-based image retrieval, IEEE Trans. Image Process. 16 (11) (2007) 2811–2821.
[27] Dacheng Tao, Xuelong Li, X.D. Wu, S.J. Maybank, Tensor rank one discriminant analysis—a convergent method for discriminative multilinear subspace selection, Neurocomputing 71 (10–12) (2008) 1866–1882.
[28] Dacheng Tao, Xuelong Li, X.D. Wu, S.J. Maybank, Elapsed time in human gait recognition: a new approach, Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP) (2006) 177–180.
[29] P.S. Huang, C.J. Harris, M.S. Nixon, Recognizing humans by gait via parametric canonical space, Artificial Intell. Eng. 13 (1999) 359–366.
[30] A.M. Martinez, A.C. Kak, PCA versus LDA, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2) (2001) 228–233.
[31] I. Jolliffe, Principal Component Analysis, Springer-Verlag, 1986.
[32] R.A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics 7 (1936) 179–188.
[33] Tianhao Zhang, Xuelong Li, Dacheng Tao, Jie Yang, Multimodal biometrics using geometry preserving projections, Pattern Recognition 41 (3) (2008) 805–813.

Rong Hu received the BS degree in electronics and information engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2004, and the MS degree in electronics and information engineering from HUST in 2006. He is now a PhD candidate in the Digital Video and Communication Laboratory, HUST. His research interests include computer graphics, computer vision and pattern recognition.

Wei Shen received the BS degree in electronics and information engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2007. Currently, he is a PhD candidate in the Digital Video and Communication Laboratory, HUST.

Hongyuan Wang is a professor in the Department of Electronics and Information Engineering at Huazhong University of Science and Technology (HUST). From 1984 to 1985, he worked at the University of Oklahoma as a visiting scholar. His current research areas include digital video communication and digital signal processing.
