Gaze-Net: Appearance-Based Gaze
Estimation using Capsule Networks
Bhanuka Mahanama (@mahanama94)
Yasith Jayawardana (@yasithmilinda)
Sampath Jayarathna (@openmaze)
Department of Computer Science
Old Dominion University
Gaze-Net: Appearance Based Gaze Estimation@NirdsLab
Outline
● Introduction
● Related work
● Approach
● Proposed Architecture
● Experiments and Results
● Conclusion
Introduction
● Gaze Estimation Applications
○ Physiological studies
○ Human-computer interaction
● Modern methods
○ Convolutional Neural Networks
○ Facial Region
○ Ocular Region
Appearance-based multi-user eye tracking
(https://mgaze.nirds.cs.odu.edu/)
Related Work
● Estimation methods
○ Fixed head-pose - early methods (Sewell et al.[2010])
○ Variable head pose
■ Explicit pose data (Zhang et al.[2015])
■ Implicit pose (Zhang et al.[2016], Krafka et al.[2017])
● Training methods
○ Data driven (Zhang et al.[2015, 2016])
○ User specific (Kassner et al. [2014], Huang et al.[2014], Papoutsaki et al.[2016])
Approach
● Two-step approach
○ Classify
○ Estimate
● Classification
○ Convolutional NN
○ Capsule Network
● Estimation
○ Fully connected
● Regularization
○ Reconstruction
○ Estimation error
Figure labels: Left Top, Middle Top, Right Top, Left Bottom, Middle Bottom, Right Bottom
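The classification and regularization bullets above can be read as one combined training objective: a classification loss on the capsule outputs, plus weighted reconstruction and estimation error terms. A minimal sketch follows. It assumes the margin loss commonly used with capsule networks and uses placeholder weights (`w_recon`, `w_gaze`); the slides do not state which loss or weights Gaze-Net actually uses, so all names and values here are hypothetical.

```python
def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Capsule margin loss for one sample.

    lengths: output vector length of each class capsule.
    target:  index of the true gaze class.
    """
    total = 0.0
    for k, v in enumerate(lengths):
        if k == target:
            # Penalize a short capsule for the true class.
            total += max(0.0, m_pos - v) ** 2
        else:
            # Penalize long capsules for the wrong classes.
            total += lam * max(0.0, v - m_neg) ** 2
    return total


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mean_absolute_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def total_loss(lengths, target, recon, image, gaze_pred, gaze_true,
               w_recon=0.0005, w_gaze=1.0):
    """Classification loss plus two regularization terms.

    w_recon and w_gaze are placeholder weights, not values from the slides.
    """
    loss = margin_loss(lengths, target)
    loss += w_recon * mean_squared_error(recon, image)        # reconstruction
    loss += w_gaze * mean_absolute_error(gaze_pred, gaze_true)  # estimation error
    return loss
```

Setting `w_recon` or `w_gaze` to zero switches off the corresponding regularization term, so one function can cover each combination of the two terms.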
Training and Testing
● Training
○ MPIIGaze dataset (200,000+ images)
■ https://arxiv.org/abs/1711.09017
● Testing
○ MPIIGaze dataset
○ Columbia Gaze dataset (~5000 images)
MPIIGaze Dataset: Raw images
(https://www.mpi-inf.mpg.de/)
MPIIGaze Dataset: Processed images
(https://www.mpi-inf.mpg.de/)
Experiments
● Metrics
○ Accuracy - Gaze categorization
○ Mean Absolute Error - Gaze estimation
● Experiment conditions
○ No regularization
○ Gaze estimation regularization
○ Image Reconstruction
○ Estimation + Reconstruction
Figure 2: Comparison of MPIIGaze image reconstruction with the original images. The top row shows the reconstructed images, and the bottom row shows the original images.
Regularization method          Accuracy   MAE (Estimation)
No Regularization              67.15      -
Image Reconstruction           65.97      -
Gaze Error                     63.98      2.88
Gaze Error + Reconstruction    62.67      2.84
Table 1: Classification Accuracy (ACC) and Mean Absolute Error (MAE) of Gaze Estimation for each Regularization method.
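The two metrics on this slide can be computed as in the sketch below. It assumes class predictions are integer labels and gaze estimates are tuples of angle components; the function names and input layout are illustrative, and the slides do not state the unit of the MAE values.

```python
def accuracy(pred_classes, true_classes):
    """Percentage of samples whose predicted gaze class matches the label."""
    correct = sum(p == t for p, t in zip(pred_classes, true_classes))
    return 100.0 * correct / len(true_classes)


def mean_absolute_error(pred_gaze, true_gaze):
    """Mean absolute difference over all samples and all gaze components."""
    errors = [
        abs(p - t)
        for p_vec, t_vec in zip(pred_gaze, true_gaze)
        for p, t in zip(p_vec, t_vec)
    ]
    return sum(errors) / len(errors)


# Toy values for illustration only, not results from the slides.
acc = accuracy([0, 1, 2, 2], [0, 1, 1, 2])
mae = mean_absolute_error([(1.0, 2.0)], [(2.0, 4.0)])
```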
Transfer Learning
● Transfer Learning
○ Applying knowledge from one problem to another
● Dataset
○ Columbia Gaze Dataset
○ Ocular region extracted using PoseNet
■ PoseNet: Real-time pose estimation model
■ https://github.com/tensorflow/tfjs-models/tree/master/posenet
○ Per participant experiments
Processed images from Columbia Gaze Dataset
Transfer Learning - Experiments
● Conditions
○ No retraining
○ Retraining estimation network
Condition                        MAE (Estimation)
No Retraining                    10.04
Retraining Estimation Network    5.92
Table 2: Mean Absolute Error (MAE) of gaze estimation before and after training on Columbia Gaze Dataset.
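The "retraining estimation network" condition can be pictured as keeping the feature extractor fixed and refitting only a small estimation head on one participant's data. The sketch below illustrates that idea with a toy linear head trained by gradient descent. `frozen_features` is a hypothetical stand-in for the pretrained network, and the data, learning rate, and step count are made up for illustration; none of this is taken from the Gaze-Net implementation.

```python
def frozen_features(x):
    # Hypothetical stand-in for the pretrained, frozen feature extractor.
    return [x, x * x]


def predict(head, x):
    weights, bias = head
    return sum(w * f for w, f in zip(weights, frozen_features(x))) + bias


def mae(head, samples):
    return sum(abs(predict(head, x) - y) for x, y in samples) / len(samples)


def retrain_head(head, samples, lr=0.1, steps=500):
    """Refit only the estimation head; the feature extractor is never updated."""
    weights, bias = list(head[0]), head[1]
    n = len(samples)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in samples:
            err = predict((weights, bias), x) - y
            for i, f in enumerate(frozen_features(x)):
                grad_w[i] += 2 * err * f / n  # gradient of mean squared error
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return (weights, bias)


# Toy participant data (made up): target = 2 * x + 1.
participant = [(x, 2.0 * x + 1.0) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
general_head = ([1.0, 0.0], 0.0)          # head from the "general" model
personal_head = retrain_head(general_head, participant)

mae_before = mae(general_head, participant)
mae_after = mae(personal_head, participant)
```

On this toy data the error drops after retraining, mirroring the direction of the change reported in Table 2 without reproducing its numbers.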
Discussion
● Gaze estimation with ocular images
○ Decoding head pose
○ Decoding eye rotation
● Transfer learning for personalizing
○ Generalized model from larger dataset
○ Personalized from a smaller dataset
Figure 3: Dimension perturbations. Each row shows the reconstruction when one of the 16 dimensions in the GazeCaps output is tweaked by intervals of 0.125 in the range [−0.25, 0.25].
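The perturbation grid described in the Figure 3 caption can be generated as sketched below: five offsets from −0.25 to 0.25 in steps of 0.125, applied to one of the 16 dimensions at a time. The function names are illustrative, and the decoder that would turn each perturbed vector into an image is not shown because the slides do not describe it.

```python
def perturbation_offsets(low=-0.25, high=0.25, step=0.125):
    """Offsets from low to high (inclusive) in increments of step."""
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]


def perturb_capsule(vector, offsets):
    """For each dimension, return copies of vector with that dimension shifted.

    Result is indexed as rows[dimension][offset_index] -> perturbed vector.
    """
    rows = []
    for dim in range(len(vector)):
        row = []
        for offset in offsets:
            perturbed = list(vector)
            perturbed[dim] += offset
            row.append(perturbed)
        rows.append(row)
    return rows


offsets = perturbation_offsets()
# A zero vector stands in for a real 16-dimensional capsule output.
rows = perturb_capsule([0.0] * 16, offsets)
```

Each of the 16 rows holds 5 perturbed vectors; passing them through the reconstruction decoder would give one row of images per dimension, as in the figure.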
Questions?
● Ocular images are sufficient for
○ Decoding facial orientation
○ Eye rotation
○ Estimating gaze
● Transfer learning
○ Better performing personalized models
● More info
○ MGaze: https://mgaze.nirds.cs.odu.edu/
○ Research Group: @NirdsLab
○ Homepage: https://www.cs.odu.edu/~bhanuka/
○ Twitter: @mahanama94, @yasithmilinda, @openmaze
