Gaze-Net: Appearance-Based Gaze Estimation using Capsule Networks


This document describes a study that used capsule networks to estimate gaze from eye images. The researchers developed a two-step approach called Gaze-Net that first classifies gaze direction and then estimates gaze. Gaze-Net was trained on the MPIIGaze dataset (200,000+ images), where it reached a classification accuracy of 62.67 to 67.15 and a gaze-estimation mean absolute error (MAE) as low as 2.84, depending on the regularization used. The model was also personalized to new users through transfer learning: retraining the estimation network on the Columbia Gaze dataset reduced MAE from 10.04 to 5.92. The study demonstrated that ocular images contain sufficient information for decoding head pose and eye orientation to estimate gaze direction.


Gaze-Net: Appearance-Based Gaze Estimation using Capsule Networks
Bhanuka Mahanama (@mahanama94)
Yasith Jayawardana (@yasithmilinda)
Sampath Jayarathna (@openmaze)
Department of Computer Science
Old Dominion University

Gaze-Net: Appearance Based Gaze Estimation@NirdsLab
Outline
● Introduction
● Related work
● Approach
● Proposed Architecture
● Experiments and Results
● Conclusion

Introduction
● Gaze Estimation Applications
○ Physiological studies
○ Human-computer interaction
● Modern methods
○ Convolutional Neural Networks
○ Facial Region
○ Ocular Region
Appearance-based multi-user eye tracking
(https://mgaze.nirds.cs.odu.edu/)

Related Work
● Estimation methods
○ Fixed head pose - early methods (Sewell et al. [2010])
○ Variable head pose
■ Explicit pose data (Zhang et al. [2015])
■ Implicit pose (Zhang et al. [2016], Krafka et al. [2017])
● Training methods
○ Data driven (Zhang et al. [2015, 2016])
○ User specific (Kassner et al. [2014], Huang et al. [2014], Papoutsaki et al. [2016])

Approach
● Two-step approach
○ Classify
○ Estimate
● Classification
○ Convolutional NN
○ Capsule Network
● Estimation
○ Fully connected
● Regularization
○ Reconstruction
○ Estimation error
Figure labels (gaze direction grid): Left Top, Middle Top, Right Top, Left Bottom, Middle Bottom, Right Bottom
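
The slide lists a classification stage, an estimation stage, and two regularizers (reconstruction and estimation error) but does not state how they are combined. As a minimal sketch only, assuming a capsule-style margin loss for the six gaze classes and mean squared error for both regularizers, the training objective could look like the following. The function names, the weights w_recon and w_gaze, and the margin constants are illustrative assumptions, not values from the slides.

```python
from typing import Sequence


def margin_loss(lengths: Sequence[float], target: int,
                m_pos: float = 0.9, m_neg: float = 0.1,
                lam: float = 0.5) -> float:
    """Margin loss over class-capsule lengths (assumed classification loss)."""
    loss = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # Penalize the true class capsule for being shorter than m_pos.
            loss += max(0.0, m_pos - length) ** 2
        else:
            # Penalize other class capsules for being longer than m_neg.
            loss += lam * max(0.0, length - m_neg) ** 2
    return loss


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(lengths, target, recon, image, gaze_pred, gaze_true,
               w_recon: float = 0.0005, w_gaze: float = 1.0) -> float:
    """Classification loss plus the two regularizers named on the slide.

    w_recon and w_gaze are hypothetical weights; setting either to 0
    switches that regularizer off.
    """
    return (margin_loss(lengths, target)
            + w_recon * mse(recon, image)
            + w_gaze * mse(gaze_pred, gaze_true))
```

Setting w_recon or w_gaze to zero gives the four combinations of regularizers that a comparison of this kind needs.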

Training and Testing
● Training
○ MPIIGaze dataset (200,000+ images)
■ https://arxiv.org/abs/1711.09017
● Testing
○ MPIIGaze dataset
○ Columbia Gaze dataset (~5000 images)
MPIIGaze Dataset: Raw images (https://www.mpi-inf.mpg.de/)
MPIIGaze Dataset: Processed images (https://www.mpi-inf.mpg.de/)

Experiments
● Metrics
○ Accuracy - Gaze categorization
○ Mean Absolute Error - Gaze estimation
● Experiment conditions
○ No regularization
○ Gaze estimation regularization
○ Image Reconstruction
○ Estimation + Reconstruction
Table 1: Classification Accuracy (ACC) and Mean Absolute Error (MAE) of Gaze Estimation for each Regularization method.

Regularization                 Accuracy   MAE (Estimation)
No Regularization              67.15      -
Image Reconstruction           65.97      -
Gaze Error                     63.98      2.88
Gaze Error + Reconstruction    62.67      2.84

Figure 2: Comparison of MPIIGaze image reconstruction with the original images. The top row shows the reconstructed images, and the bottom row shows the original images.
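
The two metrics reported here can be computed as in the sketch below. It assumes accuracy is the percentage of correctly classified gaze categories and that MAE is averaged over every component of the predicted gaze vectors; the slides do not specify the exact averaging or units, so treat both as assumptions. The function names are illustrative.

```python
from typing import Sequence


def accuracy(pred_labels: Sequence[int], true_labels: Sequence[int]) -> float:
    """Percentage of samples whose predicted gaze category is correct."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return 100.0 * correct / len(true_labels)


def mean_absolute_error(pred: Sequence[Sequence[float]],
                        true: Sequence[Sequence[float]]) -> float:
    """Absolute error averaged over all samples and all gaze components."""
    total = 0.0
    count = 0
    for p_vec, t_vec in zip(pred, true):
        for p, t in zip(p_vec, t_vec):
            total += abs(p - t)
            count += 1
    return total / count
```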

Transfer Learning
● Transfer Learning
○ Knowledge from one problem on another
● Dataset
○ Columbia Gaze Dataset
○ Ocular region extracted using PoseNet
■ PoseNet: Real-time pose estimation model
■ https://github.com/tensorflow/tfjs-models/tree/master/posenet
○ Per participant experiments
Processed images from Columbia Gaze Dataset

Transfer Learning - Experiments
● Conditions
○ No retraining
○ Retraining estimation network
Table 2: Mean Absolute Error (MAE) of gaze estimation before and after training on Columbia Gaze Dataset.

Condition                        MAE (Estimation)
No Retraining                    10.04
Retraining Estimation Network    5.92
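
The "retraining estimation network" condition keeps the pretrained feature stage fixed and updates only the estimation layers on the new participant's data. The toy sketch below illustrates that idea with a stand-in frozen feature function and a single linear head trained by stochastic gradient descent on squared error. Everything in it (frozen_features, the linear head, the learning rate, the synthetic data) is a hypothetical illustration, not the network or data used in the study.

```python
from typing import List, Sequence, Tuple

Head = Tuple[List[float], float]  # (weights, bias) of the estimation head


def frozen_features(x: float) -> List[float]:
    """Stand-in for the pretrained stage; it is never updated."""
    return [x, x * x]


def predict(head: Head, feats: Sequence[float]) -> float:
    weights, bias = head
    return sum(w * f for w, f in zip(weights, feats)) + bias


def mae(head: Head, data: Sequence[Tuple[float, float]]) -> float:
    """Mean absolute error of the head on (input, target) pairs."""
    return sum(abs(predict(head, frozen_features(x)) - y)
               for x, y in data) / len(data)


def retrain_head(head: Head, data: Sequence[Tuple[float, float]],
                 lr: float = 0.05, epochs: int = 500) -> Head:
    """Update only the head parameters; the feature stage stays frozen."""
    weights, bias = list(head[0]), head[1]
    for _ in range(epochs):
        for x, y in data:
            feats = frozen_features(x)
            err = predict((weights, bias), feats) - y
            # Gradient step on squared error for the head only.
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


# Synthetic "new user" whose targets differ from what the generic head predicts.
new_user = [(x, 2.0 * x + 0.5) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
generic_head: Head = ([1.0, 0.0], 0.0)
personal_head = retrain_head(generic_head, new_user)
```

Comparing mae(generic_head, new_user) with mae(personal_head, new_user) mirrors the before and after comparison in Table 2, on toy data only.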

Discussion
● Gaze estimation with ocular images
○ Decoding head pose
○ Decoding eye rotation
● Transfer learning for personalizing
○ Generalized model from larger dataset
○ Personalized from a smaller dataset
Figure 3: Dimension perturbations. Each row shows the reconstruction when one of the 16 dimensions in the GazeCaps output is tweaked by intervals of 0.125 in the range [−0.25, 0.25].
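
The perturbation grid described in the Figure 3 caption can be generated as in the sketch below: for each of the 16 dimensions, one copy of the capsule vector is produced per offset in [−0.25, 0.25] at steps of 0.125. Feeding each perturbed vector to the reconstruction decoder is not shown, and the function name and return layout are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def perturbations(capsule: Sequence[float], step: float = 0.125,
                  low: float = -0.25, high: float = 0.25
                  ) -> Tuple[List[float], List[List[List[float]]]]:
    """Return the offsets and, per dimension, one perturbed vector per offset."""
    n_steps = round((high - low) / step) + 1
    offsets = [low + i * step for i in range(n_steps)]
    rows = []
    for dim in range(len(capsule)):
        row = []
        for offset in offsets:
            vec = list(capsule)
            vec[dim] += offset  # tweak a single dimension, keep the rest
            row.append(vec)
        rows.append(row)
    return offsets, rows


# A 16-dimensional capsule output (all zeros here, purely as a placeholder).
offsets, rows = perturbations([0.0] * 16)
```

Each entry of rows corresponds to one row of the figure, with one column per offset.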

Questions?
● Ocular images are sufficient for
○ Decoding facial orientation
○ Eye rotation
○ Estimating gaze
● Transfer learning
○ Better performing personalized models
● More info
○ MGaze: https://mgaze.nirds.cs.odu.edu/
○ Research Group: @NirdsLab
○ Homepage: https://www.cs.odu.edu/~bhanuka/
○ Twitter: @mahanama94, @yasithmilinda, @openmaze

We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.

20240605 QFM017 Machine Intelligence Reading List May 2024

Matthew Sinclair

みなさんこんにちはこれ何文字まで入るの？40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの？えこ...

名前です男

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf

Paige Cruz

Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack. While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack. I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:

Full-RAG: A modern architecture for hyper-personalization

Zilliz

Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.

Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...

James Anderson

Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management. The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM). Speakers: Bob Boule Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle. Gopinath Rebala Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. 
Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.

GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...

Neo4j

Leonard Jayamohan, Partner & Generative AI Lead, Deloitte This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...

Neo4j

Dr. Sean Tan, Head of Data Science, Changi Airport Group Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.

Communications Mining Series - Zero to Hero - Session 1

DianaGray10

This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered: • Communication Mining Overview • Why is it important? • How can it help today’s business and the benefits • Phases in Communication Mining • Demo on Platform overview • Q/A

Microsoft - Power Platform_G.Aspiotis.pdf

Uni Systems S.M.S.A.

Mind map of terminologies used in context of Generative AI

Kumud Singh

Monitoring Java Application Security with JDK Tools and JFR Events

Ana-Maria Mihalceanu

“I’m still / I’m still / Chaining from the Block”

Claudio Di Ciccio

Large Language Model (LLM) and it’s Geospatial Applications

Rohit Gautam

How to use Firebase Data Connect For Flutter

Daiki Mogmet Ito

Recently uploaded (20)

20 Comprehensive Checklist of Designing and Developing a Website

TrustArc Webinar - 2024 Global Privacy Survey

UiPath Test Automation using UiPath Test Suite series, part 6

Uni Systems Copilot event_05062024_C.Vlachos.pdf

Removing Uninteresting Bytes in Software Fuzzing

Artificial Intelligence for XMLDevelopment

20240605 QFM017 Machine Intelligence Reading List May 2024

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf

Full-RAG: A modern architecture for hyper-personalization

Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...

GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...

Communications Mining Series - Zero to Hero - Session 1

Microsoft - Power Platform_G.Aspiotis.pdf

Mind map of terminologies used in context of Generative AI

Monitoring Java Application Security with JDK Tools and JFR Events

“I’m still / I’m still / Chaining from the Block”

Large Language Model (LLM) and it’s Geospatial Applications

How to use Firebase Data Connect For Flutter

Gaze-Net: Appearance-Based Gaze Estimation using CapsuleNetworks

1. Gaze-Net: Appearance-Based Gaze Estimation using Capsule Networks
Bhanuka Mahanama (@mahanama94)
Yasith Jayawardana (@yasithmilinda)
Sampath Jayarathna (@openmaze)
Department of Computer Science
Old Dominion University

2. Outline
● Introduction
● Related work
● Approach
● Proposed Architecture
● Experiments and Results
● Conclusion

3. Introduction
● Gaze Estimation Applications
○ Physiological studies
○ Human-computer interaction
● Modern methods
○ Convolutional Neural Networks
○ Facial Region
○ Ocular Region
Figure: Appearance-based multi-user eye tracking (https://mgaze.nirds.cs.odu.edu/)

4. Related Work
● Estimation methods
○ Fixed head pose - early methods (Sewell et al. [2010])
○ Variable head pose
■ Explicit pose data (Zhang et al. [2015])
■ Implicit pose (Zhang et al. [2016], Krafka et al. [2017])
● Training methods
○ Data driven (Zhang et al. [2015, 2016])
○ User specific (Kassner et al. [2014], Huang et al. [2014], Papoutsaki et al. [2016])

5. Approach
● Two-step approach
○ Classify
○ Estimate
● Classification
○ Convolution NN
○ Capsule Network
● Estimation
○ Fully connected
● Regularization
○ Reconstruction
○ Estimation error
Figure labels (gaze classes): Left Top, Middle Top, Right Top, Left Bottom, Middle Bottom, Right Bottom
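
The six labels on the Approach slide suggest a 2 x 3 grid of coarse gaze classes for the classify step. The slides do not say how the classes are defined, so the following is only a minimal sketch under stated assumptions: gaze is given as yaw and pitch angles in degrees, negative yaw means "left", and the columns are split at a hypothetical `yaw_split` threshold. The function name `gaze_class` is illustrative and is not taken from Gaze-Net.

```python
def gaze_class(yaw_deg: float, pitch_deg: float, yaw_split: float = 10.0) -> str:
    """Map a gaze direction to one of six coarse classes (2 rows x 3 columns).

    Assumptions (not from the slides): angles are in degrees, negative yaw
    is "Left", and yaw_split is a hypothetical column boundary.
    """
    if yaw_deg < -yaw_split:
        column = "Left"
    elif yaw_deg > yaw_split:
        column = "Right"
    else:
        column = "Middle"
    # Non-negative pitch is treated as looking up.
    row = "Top" if pitch_deg >= 0.0 else "Bottom"
    return f"{column} {row}"


if __name__ == "__main__":
    print(gaze_class(-15.0, 5.0))
    print(gaze_class(0.0, -3.0))
```

In the two-step approach described on the slide, a coarse label like this would come from the classification network, and the fully connected estimation network would then produce the continuous gaze estimate.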

6. Training and Testing
● Training
○ MPIIGaze dataset (200,000+ images)
■ https://arxiv.org/abs/1711.09017
● Testing
○ MPIIGaze dataset
○ Columbia Gaze dataset (~5000 images)
Figure: MPIIGaze Dataset: Raw images (https://www.mpi-inf.mpg.de/)
Figure: MPIIGaze Dataset: Processed images (https://www.mpi-inf.mpg.de/)

7. Experiments
● Metrics
○ Accuracy - Gaze categorization
○ Mean Absolute Error - Gaze estimation
● Experiment conditions
○ No regularization
○ Gaze estimation regularization
○ Image Reconstruction
○ Estimation + Reconstruction

Table 1: Classification Accuracy (ACC) and Mean Absolute Error (MAE) of Gaze Estimation for each Regularization method.

Regularization                 Accuracy   MAE (Estimation)
No Regularization              67.15      -
Image Reconstruction           65.97      -
Gaze Error                     63.98      2.88
Gaze Error + Reconstruction    62.67      2.84

Figure 2: Comparison of MPIIGaze image reconstruction with the original images. The top row shows the reconstructed images, and the bottom row shows the original images.
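
The two metrics named on this slide can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the slides do not state whether MAE is averaged per angle or computed on an angular error, so the version below simply averages absolute differences over a flat list of values. The function names and sample values are made up for the example.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of gaze-class predictions that match the true class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between estimated and true gaze values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Illustrative values only; these are not results from the study.
    predicted_classes = ["Left Top", "Middle Top", "Right Bottom", "Left Bottom"]
    true_classes = ["Left Top", "Middle Bottom", "Right Bottom", "Left Bottom"]
    print(accuracy(predicted_classes, true_classes))
    print(mean_absolute_error([1.0, -2.0, 3.5], [0.0, 0.0, 3.0]))
```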

8. Transfer Learning
● Transfer Learning
○ Apply knowledge from one problem to another
● Dataset
○ Columbia Gaze Dataset
○ Ocular region extracted using PoseNet
■ PoseNet: Real-time pose estimation model
■ https://github.com/tensorflow/tfjs-models/tree/master/posenet
○ Per participant experiments
Figure: Processed images from Columbia Gaze Dataset

9. Transfer Learning - Experiments
● Conditions
○ No retraining
○ Retraining estimation network

Table 2: Mean Absolute Error (MAE) of gaze estimation before and after training on Columbia Gaze Dataset.

Condition                        MAE (Estimation)
No Retraining                    10.04
Retraining Estimation Network    5.92
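
To put the two values in Table 2 in proportion, the relative reduction in MAE can be worked out directly. The helper name `relative_reduction` is an illustrative assumption; only the two MAE values come from the slide.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which an error value dropped from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    mae_no_retraining = 10.04  # Table 2, "No Retraining"
    mae_retrained = 5.92       # Table 2, "Retraining Estimation Network"
    print(f"{relative_reduction(mae_no_retraining, mae_retrained):.1f}%")
```

With the reported numbers, retraining the estimation network corresponds to a reduction of roughly 41% in MAE on the Columbia Gaze Dataset.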

10. Discussion
● Gaze estimation with ocular images
○ Decoding head pose
○ Decoding eye rotation
● Transfer learning for personalizing
○ Generalized model from larger dataset
○ Personalized from a smaller dataset
Figure 3: Dimension perturbations. Each row shows the reconstruction when one of the 16 dimensions in the GazeCaps output is tweaked by intervals of 0.125 in the range [−0.25, 0.25].
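
The perturbation scheme in the Figure 3 caption can be sketched as below: each of the 16 output dimensions is shifted in turn by offsets from -0.25 to 0.25 in steps of 0.125. This is a minimal sketch under assumptions. The all-zero `capsule_output` is a placeholder for a real GazeCaps output vector, the helper names are hypothetical, and the decoder that would turn each perturbed vector into a reconstructed image is not shown.

```python
from typing import List


def perturbation_offsets(low: float = -0.25, high: float = 0.25,
                         step: float = 0.125) -> List[float]:
    """Evenly spaced offsets from low to high inclusive."""
    count = round((high - low) / step) + 1
    return [low + i * step for i in range(count)]


def perturb_dimension(vector: List[float], index: int,
                      offsets: List[float]) -> List[List[float]]:
    """Return one copy of `vector` per offset, with only `index` shifted."""
    variants = []
    for offset in offsets:
        changed = list(vector)
        changed[index] += offset
        variants.append(changed)
    return variants


if __name__ == "__main__":
    capsule_output = [0.0] * 16  # placeholder for a real 16-dimensional output
    offsets = perturbation_offsets()
    # One row per dimension, one perturbed vector per offset.
    rows = [perturb_dimension(capsule_output, d, offsets) for d in range(16)]
    print(offsets)
    print(len(rows), len(rows[0]))
```

Feeding every vector in a row through the reconstruction network would give one row of images like those in the figure, which makes it possible to see what each dimension encodes.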

11. Questions?
● Ocular images are sufficient for
○ Decoding facial orientation
○ Eye rotation
○ Estimating gaze
● Transfer learning
○ Better performing personalized models
● More info
○ MGaze: https://mgaze.nirds.cs.odu.edu/
○ Research Group: @NirdsLab
○ Homepage: https://www.cs.odu.edu/~bhanuka/
○ Twitter: @mahanama94, @yasithmilinda, @openmaze
