Tablet gaze unconstrained appearance based gaze estimation in mobile tablets

Interaction Lab. Kumoh National Institute of Technology
TabletGaze : Unconstrained Appearance-based
Gaze Estimation in Mobile Tablets
:Computer Vision and Pattern Recognition 2016
Jeong JaeYeop

■Intro
■Rice TabletGaze dataset
■TabletGaze algorithms
■Results and analysis
■Discussion and conclusion
Agenda


■Gaze estimation in Mobile Tablets
 Commonplace connected mobile computing device
 User-tablet interaction
• Touch and sound
• Gaze is an emerging proxy of the user’s attention and intention
■ Hands-free human device interaction
■ Behavior studies
■ User authentication
Intro(1/4)

■Gaze estimation in Mobile Tablets
 Gaze estimation of tablets
• Without requiring any additional hardware
• Front-facing cameras
• Appearance-based methods
■ No calibration stage
■ Mapping from appearance of eye region to gaze direction
Intro(2/4)

■A key challenge in tablet gaze estimation
 No constraint on
• How people use the tablet
• What kind of body posture people have when using tablet
• The user of the tablet
Intro(3/4)

■Tablet gaze estimation problem in three steps
 Collect an unconstrained mobile gaze dataset on tablets
• 51 subjects
• Rice TabletGaze dataset
 TabletGaze Algorithms
• Feature extraction
• Dimensionality reduction
• Regression
 Analysis
Intro(4/4)


■Rice TabletGaze dataset
 Unique, unconstrained characteristics in the mobile environment
• 51 subjects, each with 4 different body postures
• Released online
 The learned model can be used for other devices
• Transfer learning, domain adaptation
Rice TabletGaze dataset(1/10)

■Data collection
 Setup
• Samsung Galaxy Tab S 10.5
■ Screen size of 22.62 x 14.14 cm (8.90 x 5.57 inches)
■ 35 gaze locations (points)
■ The raw data(videos) captured by the front-camera
■ Resolution – 1280 x 720
• 51 subjects
■ 12 female and 39 male
■ 26 of them wearing prescription glasses
■ 28 of the subjects are Caucasians, and the remaining 23 are Asians
■ The ages of subjects range from 20 to 40

■Data collection
 Four body postures
• Standing
• Sitting
• Slouching
• Lying
 Four recording sessions and four body postures
• 16 video sequences
• No restriction on
■ How the subject held the tablet
■ How they performed each body posture
 Naturally lit office environment

■Data collection
 One data collection session
• Front-facing camera of the tablet begins recording a video
• A beep sound notifies the subject that recording has started
• A dot changes its location every three seconds, and the subject focuses on it
■ Dot location is randomized among 35 possible points
■ Free to blink
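
The dot schedule for one session can be sketched as follows. This is a minimal illustration, assuming each of the 35 points is shown exactly once in shuffled order (the slides only say the location is randomized); `make_schedule` and its `seed` parameter are hypothetical names.

```python
import random

NUM_POINTS = 35        # gaze locations per session (from the slides)
DWELL_SECONDS = 3.0    # the dot moves every three seconds (from the slides)

def make_schedule(seed=None):
    """Return (start_time_s, point_index) pairs for one session.

    Assumption: every point is visited exactly once, in shuffled order.
    """
    rng = random.Random(seed)
    order = list(range(NUM_POINTS))
    rng.shuffle(order)
    return [(i * DWELL_SECONDS, point) for i, point in enumerate(order)]

schedule = make_schedule(seed=0)
```

Under this assumption one session lasts 35 x 3 = 105 seconds.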

■Observations on the Rice TabletGaze dataset
 The entire face may not be visible in most of the image frames
• To quantify the extent of facial visibility, label each video in the dataset
■ The whole face
■ From mouth and above
■ From nose and above
■ From eyes and above
■ Even the eyes are not visible
• Manually reviewed 4 images

 Body posture and facial visibility extent appear to be correlated

 Glasses can cause reflections, and in many instances the reflection can be significant

■Sub-dataset Labeling
 The total amount of raw data
• 51 x 16 = 816 video sequences
• A portion of the data is not usable
■ Loss of concentration of subjects
■ Eye detector failure
■ Involuntary eye blinks and large motion blur
 Sub-dataset of 41 subjects to be used in experiments

 Loss of concentration of subjects
• Refocus time
• Extract frames from 1.5 to 2.5 seconds after the dot appears at a new location
• For the 35 video chunks extracted from each video
■ If gaze drift is observed in more than 5 video chunks, abandon the data
• Hard to determine the true gaze location
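
The chunk extraction described above can be sketched as a frame-index calculation. The frame rate of 30 fps and the assumption that the first dot appears at time zero are illustrative; neither is stated in the slides, and `chunk_frame_ranges` is a hypothetical helper.

```python
def chunk_frame_ranges(num_points=35, dwell_s=3.0, start_s=1.5, end_s=2.5, fps=30):
    """Return half-open [first, last) frame ranges, one per dot location.

    fps=30 is an assumed frame rate; the first dot is assumed to appear
    at t = 0 in the video.
    """
    ranges = []
    for i in range(num_points):
        onset = i * dwell_s                      # time the dot appears
        first = round((onset + start_s) * fps)   # skip the refocus period
        last = round((onset + end_s) * fps)      # stop before the next move
        ranges.append((first, last))
    return ranges

ranges = chunk_frame_ranges()
```

With these assumed values each chunk covers 30 frames, for example frames 45 to 74 for the first dot location.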

 Eye detector failures
• Eyes are not visible in the image frame
• Strong reflection from glasses
• Occlusion from hair
• Poor illumination
 Eye blinks and motion blur: use the LoG (Laplacian of Gaussian) value
• Images of closed eyes
■ Higher mean intensity value given the disappearance of the dark pupil
• Blurred eye region image
■ Lower mean intensity value because motion blur weakens the edge information


■Overview
TabletGaze algorithms(1/6)

■Preprocessing
 Eye detector
• Two Haar feature CART-tree based cascade detectors
• False positive bounding boxes
■ Establish threshold for the size of the box (nostril)
■ Symmetric locations of the boxes (mouth)
• Eye images resized to 100 x 15 pixels
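
One way to read the two false-positive checks is sketched below: a size threshold drops small boxes such as nostrils, and a symmetry check rejects boxes that do not sit at roughly the same height as a partner, such as the mouth. The function `pick_eye_pair` and the thresholds `min_width` and `max_dy_ratio` are hypothetical and are not taken from the slides.

```python
from itertools import combinations

def pick_eye_pair(boxes, min_width=40, max_dy_ratio=0.5):
    """Pick the two candidate boxes (x, y, w, h) most likely to be the eyes.

    min_width and max_dy_ratio are illustrative thresholds only.
    """
    # Size check: drop small detections such as nostrils.
    large = [b for b in boxes if b[2] >= min_width]
    best, best_score = None, None
    for a, b in combinations(large, 2):
        # Symmetry check: the eyes sit at roughly the same height, so a box
        # far below the others (e.g. the mouth) cannot pair with an eye.
        dy = abs((a[1] + a[3] / 2) - (b[1] + b[3] / 2))
        if dy > max_dy_ratio * min(a[3], b[3]):
            continue
        score = dy + abs(a[2] - b[2])  # prefer aligned, equally sized boxes
        if best_score is None or score < best_score:
            best, best_score = (a, b), score
    if best is None:
        return None
    return tuple(sorted(best))  # box with the smaller x first

candidates = [
    (300, 200, 120, 40),  # eye
    (520, 205, 118, 40),  # eye
    (430, 330, 20, 15),   # nostril-sized false positive
    (380, 420, 150, 60),  # mouth-like false positive
]
eye_pair = pick_eye_pair(candidates)
```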

■Feature Calculation
 Feature extraction
• Contrast normalized pixel intensities
• LoG (Laplacian of Gaussian)
• LBP (Local Binary Pattern)
• HoG (Histogram of Oriented Gradients)
• mHoG (multilevel HoG)
■ Concatenate HoG features at different scales
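
The idea of concatenating HoG features at different scales can be sketched in plain Python. This is a simplified HoG (no block normalization), and the cell grids in `grids` are illustrative choices, not the configuration used in the paper.

```python
import math

def hog(image, cells_y, cells_x, bins=9):
    """Simplified HoG: per-cell histograms of unsigned gradient orientation,
    weighted by gradient magnitude. Block normalization is omitted."""
    h, w = len(image), len(image[0])
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]   # central differences
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            cy = min(y * cells_y // h, cells_y - 1)
            cx = min(x * cells_x // w, cells_x - 1)
            hist[cy][cx][b] += mag
    return [v for row in hist for cell in row for v in cell]

def mhog(image, grids=((1, 2), (2, 4))):
    """Concatenate HoG vectors computed on several cell grids (scales)."""
    features = []
    for cells_y, cells_x in grids:
        features.extend(hog(image, cells_y, cells_x))
    return features

# Toy 8 x 16 image with a horizontal intensity ramp.
image = [[float(x) for x in range(16)] for _ in range(8)]
features = mhog(image)
```

For the toy ramp every gradient points along the x axis, so all energy lands in the first orientation bin of each cell.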

 Dimensionality reduction
• Features are high-dimensional and corrupted by noise
• Mapping the features to a lower dimensional space
• CNN pooling
• LDA (Linear Discriminant Analysis)
• PCA (Principal Component Analysis)

 LDA (Linear Discriminant Analysis)
• Intra-class scatter is minimized
• Inter-class scatter is maximized
 PCA(Principal Component Analysis)
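
For reference, the textbook Fisher criterion behind LDA is given below (standard form, not copied from the slides). Here $S_W$ is the within-class (intra-class) scatter, $S_B$ the between-class (inter-class) scatter, $\mu_c$ and $n_c$ the mean and sample count of class $c$, and $\mu$ the overall mean. Treating each gaze location as a class is an assumption about how the labels are used.

```latex
S_W = \sum_{c=1}^{C} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^\top, \qquad
S_B = \sum_{c=1}^{C} n_c \, (\mu_c - \mu)(\mu_c - \mu)^\top, \qquad
W^\ast = \arg\max_{W} \frac{\lvert W^\top S_B W \rvert}{\lvert W^\top S_W W \rvert}
```

The projection $W^\ast$ keeps samples of the same class close together while pushing the class means apart.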

■Regression
 The gaze labels of the data include two parts
• Horizontal and vertical coordinates on the tablet screen (x, y)
 Methods
• k-NN (k-Nearest Neighbors)
• RF (Random Forest)
■ Set of weak binary tree regressors
■ 100 trees
• GPR (Gaussian Process Regression)
• SVR (Support Vector Regression)
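
As an illustration of regressing the two-part label (x, y), here is a minimal k-NN regressor in plain Python. The feature vectors and gaze labels are made-up toy values, and `knn_regress` is a hypothetical helper rather than the authors' implementation.

```python
import math

def knn_regress(train_features, train_gaze, query, k=3):
    """Predict (x, y) as the mean gaze label of the k nearest training samples."""
    nearest = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )[:k]
    x = sum(train_gaze[i][0] for i in nearest) / len(nearest)
    y = sum(train_gaze[i][1] for i in nearest) / len(nearest)
    return (x, y)

# Toy data: 2-D features and on-screen gaze labels in cm (made-up numbers).
train_features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
train_gaze = [(2.0, 2.0), (4.0, 2.0), (18.0, 12.0), (20.0, 12.0)]
prediction = knn_regress(train_features, train_gaze, (0.05, 0.0), k=2)
```

The horizontal and vertical coordinates are averaged over the same neighbors here; fitting a separate regressor per coordinate is an equally plausible design.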

■Error Metrics
 Previous works
• Angular error
■ Arctangent of the ratio between the on-screen error distance and the distance from the subject’s eyes to the screen
 This work
• Mean Error (ME)
■ 2D location on the tablet
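
Both metrics can be computed in a few lines. This is a sketch with made-up predictions; the 50 cm viewing distance is an assumed value used only to show how an on-screen error converts to an angular error.

```python
import math

def mean_error_cm(predicted, actual):
    """Mean Euclidean distance (cm) between predicted and true gaze points."""
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)

def angular_error_deg(error_cm, eye_to_screen_cm):
    """Angular error: arctangent of on-screen error over viewing distance."""
    return math.degrees(math.atan2(error_cm, eye_to_screen_cm))

predicted = [(3.0, 4.0), (10.0, 10.0)]   # made-up gaze estimates in cm
actual = [(0.0, 0.0), (10.0, 12.0)]      # made-up ground truth in cm
me = mean_error_cm(predicted, actual)    # (5 + 2) / 2 = 3.5 cm
angle = angular_error_deg(me, 50.0)      # assumed 50 cm viewing distance
```

ME needs only the 2D screen coordinates, whereas the angular error also needs the eye-to-screen distance.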
Results and analysis(1/8)

■Comparisons for different features + regressors
 Use 100,000 images from 41 subjects with cross validation
• GPR and SVR use data from only 15 subjects

■ Person-dependent and person-independent performance comparison
 Prior appearance-based gaze estimation methods
• Person and session dependent

■Comparison with prior results

■Effect of training data size
 Groups of different sizes 𝐾
• 𝐾 : 2 ~ 41
 Use 𝐾 − 1 subjects for training and one for testing
 Randomly selecting a subset of data
 Repeat the same process 5 times and average
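
The protocol above can be sketched as a loop over group sizes. It is assumed here that every subject in a group takes a turn as the test subject; `evaluate` is a hypothetical callback standing in for training and testing a gaze estimator, and the dummy evaluator exists only to make the sketch runnable.

```python
import random

def training_size_curve(subjects, evaluate, sizes, repeats=5, seed=0):
    """Average leave-one-subject-out error for random groups of each size K.

    evaluate(train_subjects, test_subject) is a hypothetical callback that
    trains on the first argument and returns the error on the second.
    """
    rng = random.Random(seed)
    curve = {}
    for k in sizes:
        errors = []
        for _ in range(repeats):                 # repeat and average
            group = rng.sample(subjects, k)      # random subset of K subjects
            for test_subject in group:
                train = [s for s in group if s != test_subject]  # K - 1 subjects
                errors.append(evaluate(train, test_subject))
        curve[k] = sum(errors) / len(errors)
    return curve

# Dummy evaluator: error shrinks as the training set grows (illustration only).
def dummy_evaluate(train_subjects, test_subject):
    return 1.0 / len(train_subjects)

curve = training_size_curve(list(range(41)), dummy_evaluate, sizes=[2, 11, 41])
```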

■Eyeglasses, race and posture
 Three experimental setups
• Experiment 1
■ The dataset was partitioned into 2 groups of wearing glasses and not
■ Training and testing are done separately for each group
• Experiment 2
■ Leave-one-subject-out cross validation for all data
■ ME is separated for each group
• Experiment 3
■ Combine half of the data from each group
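
Experiment 2 can be sketched as leave-one-subject-out cross validation followed by a per-group average of the error. The subject ids, group labels, and error values below are made up, and `evaluate` is a hypothetical callback standing in for training and testing.

```python
def loso_group_errors(subject_group, evaluate):
    """Leave-one-subject-out over all subjects, then average error per group.

    subject_group maps subject id -> group label (e.g. "glasses").
    evaluate(train_subjects, test_subject) is a hypothetical callback.
    """
    per_group = {}
    subjects = list(subject_group)
    for test_subject in subjects:
        train = [s for s in subjects if s != test_subject]
        error = evaluate(train, test_subject)
        per_group.setdefault(subject_group[test_subject], []).append(error)
    return {group: sum(errs) / len(errs) for group, errs in per_group.items()}

groups = {"s1": "glasses", "s2": "glasses", "s3": "no_glasses", "s4": "no_glasses"}
fake_errors = {"s1": 4.0, "s2": 3.0, "s3": 2.0, "s4": 3.0}  # made-up ME in cm
result = loso_group_errors(groups, lambda train, test: fake_errors[test])
```

Experiment 1 differs only in that the loop runs separately inside each group, so a model never sees training data from the other group.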

■Eyeglasses, race and posture

■Continuous gaze tracking from videos
 Use temporal information
• Bilateral filter
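
A one-dimensional bilateral filter over a sequence of per-frame estimates can be sketched as follows. Applying it independently to each screen coordinate, and the values of `radius`, `sigma_t`, and `sigma_v`, are assumptions made for illustration.

```python
import math

def bilateral_filter_1d(values, radius=2, sigma_t=1.0, sigma_v=1.0):
    """Smooth a 1-D sequence; weights fall off with time gap and value gap."""
    smoothed = []
    for i, center in enumerate(values):
        total, norm = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(values), i + radius + 1)):
            w_time = math.exp(-((i - j) ** 2) / (2 * sigma_t ** 2))
            w_value = math.exp(-((values[j] - center) ** 2) / (2 * sigma_v ** 2))
            total += w_time * w_value * values[j]
            norm += w_time * w_value
        smoothed.append(total / norm)
    return smoothed

# Horizontal gaze estimates (cm, made-up): jitter around 5, then a jump to 15.
xs = [5.0, 5.4, 4.6, 5.2, 15.0, 15.3, 14.7, 15.1]
smooth_xs = bilateral_filter_1d(xs)
```

The value-gap weight is what separates this from plain temporal averaging: jitter within a fixation is smoothed, while a large jump between fixations is preserved.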

■Discussion
 All of the evaluations of the algorithms are conducted on a desktop
 Pre-trained off-line and loaded onto the device
 RF + mHoG feature
• Real time gaze estimation
Discussion and conclusion(1/2)

■Conclusion
 Unconstrained mobile gaze estimation problem
• A large dataset is collected in an unconstrained environment
■ First dataset
■ Four different postures
• ME : 3.17±2.10 cm
Discussion and conclusion(2/2)

Q&A
