1. Face Localization from Video Streams
SAIT Technical Report

Hyun-Young Song
HCI Lab, Samsung Advanced Institute of Technology, South Korea
http://www.sait.samsung.co.kr/
E-mail: hy3.song@partner.samsung.com

Contents
1. Introduction .......... 2
1.1. Previous Face Database Acquisition .......... 2
1.2. Terminology and Conventions .......... 3
2. Video Capture and Image Localization Procedure .......... 4
2.1. System Setup & Overview .......... 4
2.2. Algorithms .......... 6
2.3. Database Organization .......... 8
3. Conclusion & Discussions .......... 10
3.1. Some Alternatives .......... 10
3.2. Strength: Advantages of Using this Database .......... 10
3.3. Weakness and Cure .......... 11
4. Acknowledgment .......... 11
5. References .......... 11
6. Appendices .......... 12
6.1. Intermediate Eye Coordinate Data File .......... 12
2. SAIT Internship Technical Report

1. Introduction

One of the hottest issues in the vision research area is face recognition and face verification. In face recognition and face verification, the goal is to recognize a person by the pattern of their face, and to verify someone against already registered images. A method used extensively in these two areas is the 'eigen-face' method [1]. By forming a basis of principal components, or eigenvectors, from a person's two-dimensional image through a process referred to as Principal Component Analysis (PCA), the eigenvectors carry information about whose face it is. To verify a person's face, the eigenvectors are aligned in the registration step according to whether they belong to one person or to another, with certain distinguishable features that are later used to determine a person's identity. However, the eigen-face method has the disadvantage of low performance under variation of pose, illumination, and facial expression. Accordingly, in order to minimize these defects, plenty of images with these three variances, and algorithms to reduce such defects, are indispensable. Several attempts have been made by CMU, Yale, KISA, and others to amass large image databases to devise and improve such algorithms, yet due to several restrictions, no current database is suitable solely for pose estimation research, for most of them have only 7~13 pose variations in their image databases. Motivated by the limitations of external face databases, we, the Human Vision technical group, HCI Lab, SAIT, collected 2D face images of 120 people, controlling the angle of face pose over a horizontal range of 180 degrees and a vertical range of 120 (60) degrees.

In eigen-face space, the eye positions of images have to be lined up before undergoing the PCA process (that is, images have to be cropped by a certain ratio with respect to the eye coordinates), for if this condition is not satisfied before the feature extraction and registration steps, the principal components can produce poor face recognition results. On the contrary, if the eye positions are sufficiently aligned, it is said [1] that the only factors that can affect the efficiency of the PCA method are variations of illumination, pose, and expression [2]. In addition to obtaining normal face images for ordinary pattern experiments, and considering this requirement of eigen-face space, "localization", referring to the eye-coordinate aligning procedure within the 2D image, seemed essential. Therefore, after acquiring face images from 120 people, the images were cropped into 56×56 pixel images, all consistent in their eye positions within the cropped figure.

1.1. Previous Face Database Acquisition

There are three methods of acquiring facial images for every Euler degree. The first method uses a laser scanner. After scanning the head of a subject, the input 3D object data is transformed into 3D mesh information. We apply a rotation of the wanted degree to the mesh and then project it onto a 2D plane. The advantage of this method is that it ensures the exact rotation degree of the face. However, the post-processing procedure should not be underestimated within this method, for the mesh errors that occur when modeling hairy parts are extensive.
Additionally, since the main concern with laser scanning is getting a fine mesh structure, the texture of the mesh is of no great importance, so camera options are limited, which accordingly results in poor picture quality; and in the case of poor texture quality, illumination control is very difficult. Secondly, 2D face images can be obtained from a specialized studio built entirely for the sake of face images. The three institutions below are such examples (see Table 2). They set up cameras and lights in different positions to simulate pose and light variances. Images obtained this way do not suffer from the mesh-control problems of hairy parts or from illumination-control problems; nevertheless, certain constraints should be taken into account. The total number of poses equals the actual number of cameras, and because the number of cameras can't be flexible, the total number of poses can't be either, and 7 to 13 pose images are far too few to perform experiments such as pose estimation. Additionally, what if we need images with the testee's head bent down or lifted? CMU obtained pictures of the testee's head bent down by positioning a camera at the ceiling facing downward. An image taken from a camera positioned on the ceiling is quite different from an image taken from a front-view camera with the testee bending the head down, because factors like gravitation, eye direction, etc., differ. We therefore used a video camera to solve the problem just mentioned, to get many different pose images, which was not easy when using a limited number of cameras. Also, we positioned this video camera only at the front of the testee. The video was taken by rotating the testee's chair while asking him/her to lift or bend down their head; we were able to obtain horizontal images at intervals of 1 degree and vertical images at intervals of 10 degrees, because the testees were asked to lift or bend down in 10-degree steps.
3. SAIT Internship Technical Report

Table 2 Institutions with Face Studio & Face DB

Face DB: KISA [3] | Yale [4] | CMU (PIE) [5]
Illumination Variance: 43 illumination combinations | 64 light cones | 21 flashes
Pose Variance (rotation degree of head):
- KISA: Total 7 poses: (0°, 0°), (±15°, 0°), (±30°, 0°), (±45°, 0°). Horizontally 15° gap; no cameras vertically.
- Yale: Total 9 poses: (0°, ±12°), (9°, ±9°), (12°, 0°), (17°, ±17°), (24°, 0°). Five poses at 12° (from cam's axis); three poses at 24° (from cam's axis).
- CMU (PIE): Total 13 poses: (0°, 0°), (±22.5°, 0°), (±45°, 0°), (±67.5°, 0°), (±90°, 0°), (0°, ±22.5°). Horizontally 22.5° gap in one row; 2 cameras vertically up and down; 2 cameras to simulate surveillance cameras.
DB size: 1000 people (52,000 images) | 10 people (4,500 images used) | 68 people (41,368 images)

1.2. Terminology and Conventions

• Vertical, Horizontal Degree: HV_N_M
In order to manage in-plane and out-of-plane rotations, we asked the testee to lift or bend down their head by a certain amount and then turned the chair around clockwise, so we were able to obtain horizontal images at intervals of 1 degree and vertical images at intervals of 10 degrees. The vertical degree is this lifted-up or bent-down degree, and the amount of rotation about the chair's axis will be referred to as the horizontal degree. See Figure 1 for detail.

Figure 1 Horizontal & Vertical degree definitions and notations (camera in front; HV_-90_0 right, HV_0_0 front, HV_90_0 left; horizontal rotation and vertical rotation)
The left figure illustrates the position of the camera and how horizontal rotation and vertical rotation are defined. If the subject's face is straight toward the camera, that angle is defined as horizontal 0°; the notation for this is HV_0_0. Accordingly, from now on, if the head is toward the left, this is denoted HV_90_0, and if the head is toward the right, this is denoted HV_-90_0.

• Localization
The main idea of this paper is to get well-localized images to enhance the performance of face recognition and face verification. This was done by making all 56×56 face images aligned by their eye coordinates. Therefore, when the paper refers to 'localization', it means making all images aligned according to their eye coordinates.
4. SAIT Internship Technical Report

2. Video Capture and Image Localization Procedure

2.1. System Setup & Overview

We used a Logitech QuickCam 6.0 and the Windows 2000 Professional operating system on a 1 GHz CPU with 512 MB RAM. Two testers were involved: one to control the video camera, and the other to rotate the testee's chair and to measure the vertical degree of the head with a normal graduated protractor. See Figure 2, Settings & Video Capture process. The tester rotated the testee from horizontal minus 120 degrees to plus 120 degrees (we only need the -90°~90° range of images). This redundant rotation was meant to keep the rotation speed constant within the horizontal -90°~90° range and to determine which frames should be marked as the horizontal -90° and 90° frames. (Three frame numbers are inserted as parameters, for horizontal -90°, 0°, and 90°.) After taking the video, the AVI video stream was converted into JPG image files, one image file per frame. Next, for localization, we had to retrieve the eye-coordinate ground truth data. Since the JPG image files were made from a video stream and there is only a slight change between frames, 12 key frames at equal intervals between -90°~90° were selected to mark the eye-coordinate pixel values. Then, these 12 quadruplets (x, y values for the left and right eye each) were used to interpolate eye-coordinate values between the key frames. After assigning eye-coordinate values to the entire JPG image set, the distance between the two eyes could be figured out. Although there are a few exceptions, the distance between the two eyes is proportional to the size of a person's face and the degree of rotation, so using the 'distance between the eyes' for image width and height, and the 'position of the two eye coordinates' for alignment, images extracted from the video stream were cropped according to these two factors into 56 by 56 pixel images. A sketch of the frame-to-degree assignment is shown below.
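As an illustration of the degree assignment just described, here is a minimal MATLAB sketch, assuming the three manually marked frame numbers. This is not the report's FrameToDegree.m; the variable names and example frame numbers are illustrative only (they follow the appendix sample).

```matlab
% Sketch (not the original FrameToDegree.m): assign a horizontal degree to
% every frame, given the manually marked frames for +90, 0 and -90 degrees.
fPlus90  = 61;    % example frame number of the horizontal +90 deg frame
f0       = 184;   % example frame number of the horizontal 0 deg frame
fMinus90 = 292;   % example frame number of the horizontal -90 deg frame

% Degrees are assigned linearly and separately on each half of the rotation,
% so the per-frame gap may differ slightly between the two halves.
degLeft  = linspace(90, 0, f0 - fPlus90 + 1);    % frames fPlus90 .. f0
degRight = linspace(0, -90, fMinus90 - f0 + 1);  % frames f0 .. fMinus90

frameDeg = [degLeft, degRight(2:end)];           % one degree per frame
```

Because the two halves are interpolated separately, the degree gap differs slightly between them, which is why the appendix data shows gaps of 0.726 on one side and 0.833 on the other.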
Table 3 Average time required for capturing images PER SUBJECT

Procedure | Elapsed Time | Module Name | Capture Environment & Setting
Video Capture | 15 min | [Program] QuickCam 6.0 | Two lamps were used to counterbalance shadow. The camera was positioned at eye level. A blue screen was used for the background. See Figure 2.
AVI to JPG Conversion | 5 min | MakeJPG.exe, MakeJPGall.exe | The AVI video stream was converted into JPG image files by a program written in C++ with the OpenCV library.
Ground Truth Data Extraction | 30 min | GetEyeCoordinateH.m, GetEyeCoordinateV.m, [Program] VirtualDub 1.4.5 | There were two manual jobs in this step: one was getting the frame numbers of the -90°, 0°, 90° images for the info file, and the other was extracting the eye-coordinate pixel values. The Ginput() function supported by Graphics in MATLAB was used to extract ground truth data, and the spap2() and spapi() functions supported by the Spline Toolbox in MATLAB were used for interpolating eye coordinates. For more details refer to 2.2.
Image Cropping | 5 min | CropH.m, CropV.m | The Resize() and Crop() functions supported by the Image Processing Toolbox in MATLAB were used for cropping, and the spap2() and spapi() functions supported by the Spline Toolbox in MATLAB were used for adjusting the crop image width and height. For more details refer to 2.2.
Image Name Conversion & PoseMosaic(1) | 5 min | FrameToDegree.m, PoseMosaic.m, VertMosaic.m | In this step, nothing particular is used. When converting from AVI format to JPG format, each image is named after its original frame number; in this step, image names are changed according to their horizontal and vertical degree.

(1) PoseMosaic: After cropping the image files, we made a new image file that supports an overall view of the cropped result.

An illustrative sketch of the AVI-to-JPG step follows.
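The report's conversion tool (MakeJPG.exe) was written in C++ with OpenCV; as a minimal sketch of the same step, here is an assumed MATLAB equivalent. The input file name is a hypothetical placeholder following the naming rule of Section 2.3.

```matlab
% Sketch of the AVI-to-JPG step (the report used a C++/OpenCV tool,
% MakeJPG.exe; this MATLAB equivalent is only illustrative).
vr = VideoReader('mhxxx_hp0n.avi');          % hypothetical file name
k = 0;
while hasFrame(vr)
    frame = readFrame(vr);                   % one 320x240 frame
    k = k + 1;
    % One JPG per frame, named after the original frame number.
    imwrite(frame, sprintf('mhxxx_hp0n(%d).jpg', k));
end
```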
5. SAIT Internship Technical Report

Figure 2 Settings & Video Capture process
We used a Logitech QuickCam 6.0 to capture the video stream and two electric lights to alleviate the side effects caused by shadow. A blue screen was used to support background substitution. Two testers were involved during the video capturing process: one to take control of the video camera program, and the other to rotate the testee's chair and to measure the vertical degree of the head with a normal graduated protractor.
6. SAIT Internship Technical Report

2.2. Algorithms

This section provides full detail of how the ground truth data for eye positions was extracted, the input and output of some modules, and the algorithm used in localization (i.e., the ratio that affects the scaling factor used during the cropping process).

2.2.1. Ground Truth Data Extraction

Although there is a lot of possible ground truth information, we decided to extract only eye positions, to leave out the redundant time that would be consumed if more ground truth information were extracted. Because there were so many images, it was almost impossible to extract eye positions manually from all of them one by one. Therefore, we decided to estimate the eye coordinates of some frames from already-known adjacent frames, since the images were made from video streams and changes are not radical between frames. First, we had to decide the number of key frames that had to be processed manually. Under empirical observation, it turned out that about 12 key frames were sufficient for estimating the eye coordinates between them. For each AVI file, twelve left and right (x, y) values were saved in a vector data structure, and these were used by the Hermite least-square approximation function supported by the MATLAB Spline Toolbox ( spap2() ) to estimate the eye-coordinate values in between. The left graph of Figure 3 shows the actual Y values that were obtained manually, and the right graph shows how they were approximated. We used a cubic spline to approximate the data in between, and after the spline function was made, the theta value of each frame was used to get the approximated eye coordinates. Theta values were obtained linearly; i.e., if there were frames 1 to 5 between horizontal -90°~90°, then the theta values would be -90°, -45°, 0°, 45°, 90°.

Figure 3 Hermite Least Square Approximated Spline: left eye Y value graph
The left graph is the result of plotting the 12 right-eye y-coordinate values from the key frames. This initial graph is smoothed with a Hermite least-square approximation spline. The eye-coordinate values (in this case, 'left eye y value') of the other frames are obtained using this graph.

Figure 4 Approximated Eye Coordinates
The approximated eye coordinates obtained from the splines are marked on the sample images with white circles. Notice that they are considerably accurate. The resulting information was written to a text file and saved with the JPG images. Refer to the Appendix in case you need details about the intermediate data. A minimal sketch of this manual extraction and spline fit appears below.
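The following MATLAB sketch is modeled on the report's description of this step (ginput() for the manual clicks, spap2() for the least-squares spline); the file names, the number of spline pieces, and the example frame numbers are assumptions, not the report's exact settings.

```matlab
% Sketch of the manual ground-truth step (cf. GetEyeCoordinateH.m).
nKey      = 12;
keyFrames = round(linspace(292, 61, nKey));   % 12 key frames, -90..+90 deg
keyTheta  = linspace(-90, 90, nKey);          % degree assigned to each key frame

leftY = zeros(1, nKey);
for i = 1:nKey
    imshow(imread(sprintf('mhxxx_hp0n(%d).jpg', keyFrames(i))));
    [~, y] = ginput(1);                       % click the left eye once
    leftY(i) = y;                             % store its pixel y value
end

% Least-squares cubic spline over the 12 samples (Spline Toolbox);
% 4 pieces of order 4 (cubic) is an assumed parameter choice.
sp = spap2(4, 4, keyTheta, leftY);

allTheta = linspace(-90, 90, 292 - 61 + 1);   % degree of every frame
leftYAll = fnval(sp, allTheta);               % approximated y for every frame
```

The same fit is repeated for the other three coordinates (left x, right x, right y), giving the per-frame quadruplets used in the cropping step.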
7. SAIT Internship Technical Report

2.2.2. Localization

Localization is essential for eigen-face space regularity, so after image acquisition the images were localized in preparation for experiments such as pose estimation. Basically, two rules were considered during the cropping step. The first rule concerns the eye positions within an image. For the frontal-view image, the two eye positions were fixed at (0.3, 0.32) and (0.7, 0.32), but when the head rotates clockwise from facing forward (horizontal 0°~90°, head toward the left direction), these coordinates can't be applied by rote. Therefore, when the head was toward the left direction, only the (0.7, 0.32) value was applied. Similarly, when the head was toward the right direction (horizontal degree -90°~0°), only the (0.3, 0.32) value was applied.

Figure 5 Localization Rule One (PoseMosaic HV_180_60; axes: horizontal -90°~90°, vertical -50°~30°; eye anchors (0.3, 0.32) and (0.7, 0.32))
Two rules were applied in the cropping process. When aligning the images, (0.3, 0.32) was the position of the left eye when horizontally minus 90°~0°, and (0.7, 0.32) was the position of the right eye when horizontally 0°~plus 90°. These two ratios are Localization Rule One. This tiled version of the images is also called the HV_180_60 PoseMosaic, which we explain later in 2.3.

Secondly, to determine the size of the image, the distance between the eyes and the horizontal rotation degree were used. It is quite explicit how the width and height of the HV_0_0 image are obtained: if the distance between the eyes is divided by 0.4 (the eye-distance proportion in the cropped image), this is roughly the length for cropping. However, this ratio has to vary according to the rotation of the head. An image is merely a projection of a 3D object onto the view plane, and this fact applies to the distance between the eyes as well. As you can see in Figure 6, when the imaginary line between the two eyes is projected onto the view plane, the projection degree equals the rotation degree of the head. Therefore, the ratio was extended to 0.4 × cosθ. Thus, the equation for the width of the image is:

1 : 0.4 × cosθ = Image Width : Distance between two eyes

Image Width = Distance between two eyes ÷ (0.4 × cosθ)

Figure 6 Localization Scaling Factor
This image is a figure of the subject's head, with the viewpoint located above the subject's head looking toward the subject. The triangles indicate the nose in two situations. When the head is toward horizontal 60 degrees, the distance between the eyes shortens at a cosθ rate, because the horizontal angle is the same as the angle between the projection plane and the imaginary line between the eyes. This is how the scaling factor 0.4 × cosθ is acquired.

Using the equation above, the approximated Image Width looks somewhat like the left graph in Figure 7, and the scaling factor is then 56 ÷ (Image Width). Intuitively, the graph should be a line, but since the assumptions made in Figure 6, such as a normal person's head being an exact sphere, can't be achieved, the graph comes out curved. The graph on the right is the approximated width by vertical degree. A MATLAB sketch of the two rules applied to a single frame follows.
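As a minimal sketch of Localization Rules One and Two for one frame with the head turned toward horizontal 0°~90° (so only the (0.7, 0.32) anchor applies), assuming the eye coordinates and eye distance are already interpolated. This is not the original CropH.m; all file names and input values are illustrative.

```matlab
% Sketch of Rules One and Two for a single frame (horizontal 0..90 deg).
img     = imread('mhxxx_hp0n(100).jpg');   % hypothetical input frame
ex      = 135.2;  ey = 118.0;              % anchored eye pixel position
theta   = 45;                              % horizontal degree of this frame
eyeDist = 21.3;                            % interpolated eye distance (pixels)

% Rule Two: crop size from eye distance and rotation degree.
w = eyeDist / (0.4 * cosd(theta));         % Image Width = dist / (0.4 cos theta)

% Rule One: place the anchored eye at (0.7, 0.32) of the crop square.
x0 = ex - 0.7 * w;                         % left edge of the crop square
y0 = ey - 0.32 * w;                        % top edge of the crop square

cropped = imcrop(img, [x0, y0, w - 1, w - 1]);
cropped = imresize(cropped, [56 56]);      % final 56x56 localized image
imwrite(cropped, 'crop_mhxxx_hp0n(100).jpg');
```

For the -90°~0° range, the same computation is anchored at (0.3, 0.32) instead.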
8. SAIT Internship Technical Report

Figure 7 Ratio-applied Image Width by Hor. Degree (left); Image Width by Vert. Degree (right)
If the equation (Image Width) = (Distance between two eyes) ÷ (0.4 × cosθ) is evaluated, the resulting graph looks like the left one. The image size for a certain horizontal degree is obtained from this graph. After 7 (13) graphs of the left form are evaluated, the mean values of these 7 (13) graphs are calculated and then used to approximate again, to smooth the image size vertically.

2.3. Database Organization

Although the total number of subjects was not large (about 120), database organization was no easy job because there were about 2,500 images per subject. Accordingly, the importance of the database schema and image naming rules is not to be underestimated. Table 4 shows how images were named. First, all images are identified by their [Name]. [Name] is made up of the subject's gender, where the subject belongs (HCI Lab or INetwork Lab), and the subject's initials. In case of identical initials, we occasionally abbreviated the middle part of the name into a 2-character initial. [Name] is followed by an underscore '_' and [Format]. [Format] indicates how the video stream was obtained. If the subject's head was rotated via the rotation of the chair, it is horizontal mode, and the number that follows indicates the bent amount of the subject's head. If the subject lifted their head up from facing downwards, it is called vertical mode and 'v' comes instead of 'h'; also, for vertical mode we only filmed the face facing front, so there is no number indicator for a vertical-mode stream. Lastly, 'g' and 'n' stand for 'glasses' and 'normal'. For example, consider the format 'hp30g': the image was obtained from a video stream with the subject's eyeglasses on, rotating the chair, with the head lifted up to 30 degrees. Every image was maintained in three styles: video-stream-type AVI files (videoStreams), 320×240 images (ImgFromVideo), and 56×56 images (ImgCropped), mostly in frame-number order. (In the ImgCropped directory, there also exists a set of images sorted according to their horizontal and vertical degree.) Table 5 shows the number of images per associated directory. Approximately 7 to 13 videos were taken per subject, and each video stream was processed individually into the appropriate directory. There are about 300 to 500 frames in a video stream, so the number of images obtained falls between these two numbers. Since images are also named according to their rotation degree and saved in that order in the localization step, there are exactly 180 more image files in the ImgCropped directory. Therefore, the number of images saved per stream rounds to 1,000, and the total per subject comes to roughly (7~13) × ((300~500) + (500~700)).

Table 4 Image Naming Rule

Directory name | Image Naming Rule
videoStreams | [Name]_[Format](FrameNum).jpg
ImgFromVideo | [Name]_[Format](FrameNum).jpg
ImgCropped | crop_[Name]_[Format](FrameNum).jpg; [Name]_hv_(horizontal degree)_(vertical degree).jpg
[Name] | (m|f)+(h|i)+(xxx|xxxx: name initial)
[Format] | (h|v)(p|m)(0|10|20|30|40|50|60)(g|n)

A sketch of decoding this convention is given below.
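As an illustration of the naming rule in Table 4, here is a small MATLAB sketch that decodes a file name with a regular expression. The report ships no such parser; the pattern and the example file name are assumptions built from the table.

```matlab
% Sketch: decode the [Name]_[Format](FrameNum).jpg convention (illustrative).
fname = 'mhxxx_hp30g(42).jpg';   % hypothetical file name

tok = regexp(fname, ...
    '^([mf][hi][a-z]+)_([hv])([pm])(\d+)([gn])\((\d+)\)\.jpg$', ...
    'tokens', 'once');

% tok{1} subject name, tok{2} h/v mode, tok{3} plus/minus,
% tok{4} vertical degree, tok{5} glasses/normal, tok{6} frame number
```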
9. SAIT Internship Technical Report

Table 5 Directory Structure of the Database & Image Number PER directory

videoStreams / [DATE] / [NAME] | 7~13 AVI files (per directory)
ImgFromVideo / [NAME] / [FORMAT] | 300~500 JPG files
ImgCropped / [NAME] / [FORMAT] | 500~700 JPG files

Table 6 shows the overall result of this project in detail. As we rotated every subject in 10-vertical-degree gaps, from -60° to +50° or from -30° to +30°, we could obtain a face image at every horizontal degree for each specific vertical degree. The same goes for the vertically incremented video stream. Subjects with glasses were asked to take the video twice, for horizontal- and vertical-axis rotation; the fourth and fifth rows of Table 6 indicate the image statistics for this case. During the image cropping process, it was difficult to see the entire set of localized images at a glance, so we made a kind of image tiled up from the images at every horizontal and vertical 10 degrees, and named them POSEMOSAIC and VERTMOSAIC. There are two types of POSEMOSAIC, as mentioned in Table 6: Hv_180_110 and Hv_180_60. Figure 5 is an example of an Hv_180_60 PoseMosaic and Figure 8 is an example of an Hv_180_110 PoseMosaic. VERTMOSAIC was made from the vertically incremented video stream, and an instance of it is Figure 9.

Table 6 Database Statistics

PoseMosaic or VertMosaic | Horizontal Range | Vertical Range | Total Num (with glasses)
Hv_180_110 | -90°~90° with 1° gap | -60°~50° with 10° gap | 44
Hv_180_60 | -90°~90° with 1° gap | -30°~30° with 10° gap | 84
- | -90°~90° with 1° gap | 0° only | 17 + 52 = 69 (30)
V_120 | 0° only | -60°~60° with 1° gap | 44 + 38 = 82 (24)

Figure 8 PoseMosaic HV_180_110
During the image cropping process, it was difficult to see the entire set of localized images at a glance, so we made a kind of image tiled up from the images at every horizontal and vertical 10 degrees and named it 'PoseMosaic'. HV_180_110 indicates the range of horizontal and vertical degrees. As this PoseMosaic shows localized images from horizontally -90° to +90° and vertically -60° to +50°, it is classified as an HV_180_110 PoseMosaic. A sketch of the tiling follows.
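As a minimal sketch of how such a mosaic can be tiled (cf. PoseMosaic.m, whose source the report does not show), assuming the degree-named 56×56 crops of Table 4 and a hypothetical subject name:

```matlab
% Sketch: tile an HV_180_60 PoseMosaic from 56x56 crops taken every
% 10 horizontal degrees (-90..90) at each vertical degree (-30..30).
mosaic = [];
for v = 30:-10:-30                     % one mosaic row per vertical degree
    row = [];
    for h = -90:10:90                  % 19 crops across each row
        row = [row, imread(sprintf('mhxxx_hv_%d_%d.jpg', h, v))];
    end
    mosaic = [mosaic; row];            % stack rows top to bottom
end
imwrite(mosaic, 'mhxxx_PoseMosaic_HV_180_60.jpg');
```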
10. SAIT Internship Technical Report

Figure 9 VertMosaic V_120
VertMosaic is a sort of PoseMosaic, except that it only covers images of the same horizontal degree, usually horizontal 0°. Each column is a picture at HV_0_-60, HV_0_-50, HV_0_-40, HV_0_-30, HV_0_-20, HV_0_-10, HV_0_0, HV_0_10, HV_0_20, HV_0_30, HV_0_40, HV_0_50, and HV_0_60, individually.

3. Conclusion & Discussions

3.1. Some Alternatives

'Localized width' values (Distance between two eyes ÷ (0.4 × cosθ)) around horizontal +80°~+90° and -90°~-80° are useless, because the cosine value around ±90 degrees approaches zero, and so does the distance between the eyes, which results in zero divided by zero. In order to improve the localization result, a certain portion of this range was discarded, and the discarded part was then extrapolated. If we had discarded more, or perhaps less, the recognition or verification performance might have changed; for now, we don't have any clue. The order of every approximation and interpolation spline function is 'cubic'. This was because the cubic spline is the most prevalent spline in use. If we changed this order to quadratic or quartic, the cropped image size or the eye coordinates of approximated frames might have changed. 'Localization Rules One and Two' are not suitable for images of extreme degree, such as vertical +50° or -60°. If you refer to the left-bottom or right-top images of Figure 8, these images are no better than images cropped randomly with no measure at all. Thus, some other localization measure or criterion is needed to improve localization.

3.2. Strengths of Using this Database

People have their own way of bending or lifting their head. Someone's head might tilt a little; someone might not be good at bending or lifting their head at all. Because we actually asked the subjects to position their head at a certain vertical angle, we could obtain all kinds of head configurations that reflect each subject's specific neck condition. If some future research topic concerns the actual movement of a human's neck, or actual pictures of every angular movement of the head, this database would be the first candidate to be used. Moreover, from the right graph of Figure 7, we can infer from the database that the center of the sphere of vertical rotation differs between vertical -60°~0° and 0°~50°, because if it were the same, the graph should look like a regular cosine function over -60°~50°. As mentioned in the introduction, an image taken by placing the camera at the ceiling with the subject looking straight forward and an image taken by setting the camera in front of the subject with the head bent down may be of the same horizontal and vertical degree, but they are quite different from one another. One factor that makes these two images different is gravity. Images taken by the former method (CMU PIE) are not suitable for experiments that have constraints on camera position, as in an ATM machine. It is difficult to get an eye-coordinate value if one of the eyes is hidden behind the nose. As we used an approximating method, there was no need to worry about these hidden eye coordinates, because the approximation step revised those values to correct ones.
11. SAIT Internship Technical Report

3.3. Weakness and Cure

Illumination: As illumination wasn't controlled, the light condition changed according to the daylight brightness. Moreover, when the intensity of illumination increased, shadow prevented getting pictures of fine quality. Therefore, a closed room is needed to settle this problem.

Focus: As the focal length between subject and camera was very close, the camera did not support an automatic-focus option, and the subject's rotation affected the focal length, the quality of the obtained images was not the finest. If the distance between subject and camera were far enough to ignore the difference made by turning the chair around, or if the camera supported an auto-focus function, this problem would be of no concern.

Overall time: There were two steps done manually during the whole process: chair rotating and ground truth data extraction. Manual jobs are time-consuming and hinder acquiring a large database. In this project, 2 man-months were consumed to obtain the 120-people face database; automation of these two processes would enable acquisition of a larger database.

Etc.: When the tester rotated the chair, the testee on the chair tended to bend his/her head in the direction opposite to the rotation.

4. Acknowledgment

The author would like to thank the members of the INetworking Lab and HCI Lab for volunteering to be the models of this face image database. Also, thanks to Baek MyeongSoon, who assisted with the project. Moreover, the author owes the most gratitude to Won Sook Lee, who provided guidance in every respect during the entire process of constructing the database.

5. References

[1] B. Moghaddam and A. Pentland, "Face Recognition using View-based and Modular Eigenspaces", Automatic Systems for the Identification and Inspection of Humans, SPIE Vol. 2277, July 1994.
[2] Kazuhiko Sumi, "The Evolution of Face Image Recognition Technology", Mitsubishi Electric Corp., Journal of the Institute of Electrical Engineers, March 2003, pp. 166-171.
[3] VirtualMedia Inc., "Construction of a Face DB for Research", Korea Information Security Agency (KISA), December 2002.
[4] A.S. Georghiades, P.N. Belhumeur, and D.J. Kriegman, "From few to many: Generative models for recognition under variable pose and illumination", in Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition, 2000.
[5] T. Sim, S. Baker, and M. Bsat, "The CMU Pose, Illumination, and Expression Database", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003.
12. SAIT Internship Technical Report

6. Appendices

6.1. Intermediate Eye Coordinate Data File

This is an intermediate eye-coordinate data file saved in the ImgFromVideo directory. Each column stands for the frame number, right eye x position, right eye y position, left eye x position, left eye y position, and horizontal rotated degree. The rotated degree was assigned with the MATLAB linspace function, for the -90°~0° and 0°~90° sections separately. When we changed the AVI file to JPG image files, it was easy to determine the frame numbers of the three benchmark horizontal degrees: 90°, 0°, -90°. For example, in the Table 7 sample, frame numbers 61, 184, and 292 were chosen to be the frames for 90°, 0°, and -90°. Then, 90° ÷ (184 - 61) was subtracted from the preceding frame's degree for each frame from 61 to 184. The same sequence goes for the other half. Consequently, the rotation degree gaps between 90°~0° and 0°~-90° are different, but as you can see in Table 7, the rotation degree gap values of 0.726 and 0.833 don't differ enough to be of great concern. A sketch of loading this file follows the table.

Table 7 [Name]_[Format].txt in ImgFromVideo Directories

61   148.275  117.532  147.804  117.496   90.000
62   148.000  117.537  147.956  117.502   89.274
63   147.726  117.542  148.108  117.508   88.548
...... (omitted) ......
122  135.340  117.951  156.399  117.785   45.726
123  135.200  117.960  156.520  117.788   45.000
124  135.063  117.969  156.640  117.791   44.274
...... (omitted) ......
182  131.889  118.613  161.585  117.907    1.452
183  131.930  118.626  161.640  117.908    0.726
184  132.024  118.653  161.750  117.909    0.000
185  132.085  118.668  161.812  117.910   -0.833
186  132.151  118.683  161.874  117.910   -1.667
...... (omitted) ......
237  143.383  118.806  165.535  117.927  -44.167
238  143.725  118.797  165.589  117.927  -45.000
239  144.069  118.786  165.640  117.927  -45.833
...... (omitted) ......
290  161.646  117.645  161.521  117.911  -88.333
291  161.927  117.611  161.249  117.910  -89.167
292  162.203  117.576  160.967  117.909  -90.000
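For convenience, a minimal MATLAB sketch of loading this six-column file; the file name is a hypothetical placeholder following the [Name]_[Format].txt rule.

```matlab
% Sketch: load the intermediate eye-coordinate file described above.
d = load('mhxxx_hp0n.txt');      % six numeric columns, whitespace-separated

frameNum = d(:, 1);
rightEye = d(:, 2:3);            % right eye (x, y) in pixels
leftEye  = d(:, 4:5);            % left eye (x, y) in pixels
hDegree  = d(:, 6);              % horizontal rotation degree of each frame

% Per-frame eye distance, as used by Localization Rule Two.
eyeDist = hypot(leftEye(:,1) - rightEye(:,1), ...
                leftEye(:,2) - rightEye(:,2));
```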
