This document summarizes a study that compared algorithms for mapping eye-gaze data to on-screen objects. Nine algorithms from the literature were tested, including ones that expand object size, associate gaze with the nearest object, or accumulate "interest" over time. A new "competing algorithm" was also proposed, which adds interest to the gazed object while subtracting it from all others. Participants were shown words, icons, and dots of varying sizes while their eye movements were recorded, and the algorithms were then evaluated on correct-selection rates and selection times. A fractional mapping algorithm performed best overall but also had the highest incorrect-selection rate, while the new "dynamic competing algorithm" achieved the next-best results, likewise with a high incorrect rate.
Robust Clustering of Eye Movement Recordings for Quanti... (Giuseppe Fineschi)
Characterizing the location and extent of a viewer’s interest, in terms of eye movement recordings, informs a range of investigations in image and scene viewing. We present an automatic data-driven method for accomplishing this, which clusters visual point-of-regard (POR) measurements into gazes and regions-of-interest using the mean shift procedure. Clusters produced using this method form a structured representation of viewer interest, and at the same time are replicable and not heavily influenced by noise or outliers. Thus, they are useful in answering fine-grained questions about where and how a viewer examined an image.
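As a rough illustration of the clustering idea (a simplified 1-D, flat-kernel variant, not the authors' implementation), mean shift moves each point toward the local mean of its neighbours until it settles on a mode; points that share a mode form one gaze cluster:

```python
# Illustrative sketch (not the paper's implementation): flat-kernel
# mean shift on 1-D point-of-regard samples; points converging to the
# same mode are grouped into one gaze cluster.

def mean_shift_1d(points, bandwidth, iters=50, tol=1e-6):
    """Return the mode each point converges to under mean shift."""
    modes = []
    for p in points:
        x = p
        for _ in range(iters):
            window = [q for q in points if abs(q - x) <= bandwidth]
            new_x = sum(window) / len(window)  # flat-kernel mean
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes

def cluster(points, bandwidth):
    """Group points whose converged modes lie within one bandwidth."""
    modes = mean_shift_1d(points, bandwidth)
    labels, centers = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels

# Two tight groups of fixations separated by a large gap:
pors = [10.0, 11.0, 12.0, 50.0, 51.0]
print(cluster(pors, bandwidth=5.0))  # [0, 0, 0, 1, 1]
```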
This article presents a new algorithm for long-term tracking of moving objects. We tried to overcome several potential difficulties, first through a comparative study of methods for measuring the difference and similarity between the template and the source image. In the second part, an improvement of the best method allows us to follow the target robustly. This method also lets us effectively overcome the problems of geometric deformation, partial occlusion, and recovery after the target leaves the field of view. The originality of our algorithm lies in a new model that does not depend on a probabilistic process and does not require advance, data-based detection. Experimental results on several difficult video sequences demonstrate performance advantages over many recent trackers. The developed algorithm can be employed in several applications, such as video surveillance, active vision, or industrial visual servoing.
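The kind of template/source comparison measures such a study would weigh can be sketched as follows (a minimal illustration with made-up patch values, not the article's code): SSD is a dissimilarity measure, while NCC is a similarity measure that is invariant to brightness shifts:

```python
# Hypothetical sketch of two classic template-matching measures:
# sum of squared differences (SSD) and normalized cross-correlation
# (NCC), applied to flattened 1-D patches.

import math

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ncc(a, b):
    """Normalized cross-correlation: 1.0 for identical (non-constant) patches."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db)

template = [10, 20, 30, 40]
print(ssd(template, template))                    # 0
print(round(ncc(template, [15, 25, 35, 45]), 3))  # brightness shift: still 1.0
```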
Region-of-Interest detection has been recognised as an important means by which unimportant image content can be identified and excluded during image compression or image modelling. However, existing Region-of-Interest detection methods are computationally expensive and thus mostly unsuitable for managing large numbers of images, or for image compression in real-time video applications. This paper therefore proposes an unsupervised algorithm that takes advantage of the high computation speed offered by Speeded-Up Robust Features (SURF) and Features from Accelerated Segment Test (FAST) to achieve fast and efficient Region-of-Interest detection.
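A hedged sketch of the overall idea (helper names and parameters are assumed; the actual paper uses SURF/FAST keypoints): once a fast detector has produced keypoint coordinates, a Region-of-Interest can be taken as the bounding box of the grid cells where keypoints concentrate, which discards isolated outliers:

```python
# Illustrative sketch, not the paper's pipeline: assume a fast detector
# (SURF/FAST in the paper) already produced keypoint coordinates; the
# ROI is the bounding box over grid cells holding enough keypoints.

def roi_from_keypoints(keypoints, cell=10, min_count=2):
    """Bounding box (x0, y0, x1, y1) over keypoint-dense grid cells."""
    counts = {}
    for x, y in keypoints:
        counts.setdefault((x // cell, y // cell), []).append((x, y))
    dense = [p for pts in counts.values() if len(pts) >= min_count for p in pts]
    if not dense:
        return None
    xs = [p[0] for p in dense]
    ys = [p[1] for p in dense]
    return (min(xs), min(ys), max(xs), max(ys))

# A cluster of keypoints near (30, 30) plus one isolated outlier:
kps = [(30, 30), (32, 31), (35, 34), (90, 5)]
print(roi_from_keypoints(kps))  # the lone outlier cell is ignored
```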
Efficient Reversible Data Hiding Algorithms Based on Dual Predictions (sipij)
In this paper, a new reversible data hiding (RDH) algorithm that is based on the concept of shifting of
prediction error histograms is proposed. The algorithm extends the efficient modification of prediction
errors (MPE) algorithm by incorporating two predictors and using one prediction error value for data
embedding. The motivation for using two predictors is that different predictors have different
prediction accuracies, which directly affect the embedding capacity and the quality of the stego image. The
key feature of the proposed algorithm lies in using two predictors without the need to communicate
additional overhead with the stego image. Basically, the identification of the predictor that is used during
embedding is done through a set of rules. The proposed algorithm is further extended to use two and three
bins in the prediction errors histogram in order to increase the embedding capacity. Performance
evaluation of the proposed algorithm and its extensions showed the advantage of using two predictors in
boosting the embedding capacity while providing competitive quality for the stego image.
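To make the histogram-shifting mechanism concrete, here is a deliberately simplified single-predictor, single-bin variant (the paper's algorithm uses two predictors and up to three bins, and handles signed errors; this sketch covers nonnegative prediction errors only). Bits are embedded at zero-valued errors, all larger errors are shifted up to make room, and the process is fully reversible:

```python
# Simplified, hypothetical single-bin prediction-error histogram
# shifting.  Nonnegative errors only; the paper's dual-predictor
# selection rules are not reproduced here.

def embed(errors, bits):
    """Embed bits at zero-valued errors; shift positive errors right."""
    out, it = [], iter(bits)
    for e in errors:
        if e == 0:
            out.append(next(it))   # 0 carries bit 0, 1 carries bit 1
        else:
            out.append(e + 1)      # shift e >= 1 up to make room
    return out

def extract(stego):
    """Recover both the bits and the original errors (reversible)."""
    bits, errors = [], []
    for e in stego:
        if e in (0, 1):
            bits.append(e)
            errors.append(0)
        else:
            errors.append(e - 1)
    return bits, errors

errs = [0, 3, 0, 1, 0]
stego = embed(errs, [1, 0, 1])
print(stego)           # [1, 4, 0, 2, 1]
print(extract(stego))  # ([1, 0, 1], [0, 3, 0, 1, 0])
```

The capacity of this scheme is exactly the number of zero-valued errors, which is why predictor accuracy (a taller zero bin) directly drives embedding capacity.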
Disparity Estimation by a Real Time Approximation Algorithm (CSCJournals)
This paper presents a real-time approximation algorithm for estimating the disparity of stereo images. The approximation is achieved by shrinking the original left and right images. In this method, (i) the left and right images are shrunk three times, (ii) the disparity image is computed from the shrunken left and right images, and (iii) the disparity image is extrapolated back to the original image size. The computational time of the proposed algorithm is lower than that of existing methods, approximately real time, and it requires less memory. The method is applied to standard stereo images, and the results show that it reduces computational time by about 76.34% with no appreciable degradation of accuracy.
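The shrink-then-extrapolate idea can be sketched on 1-D scanlines (details assumed: one shrink step instead of three, and simple block matching): disparity is estimated on the shrunken signal, then scaled back to full resolution:

```python
# Toy coarse-to-fine disparity sketch on 1-D scanlines (assumed
# details, not the paper's implementation).

def sad_disparity(left, right, max_d):
    """Disparity minimizing mean absolute difference over the overlap."""
    def cost(d):
        pairs = list(zip(left[d:], right))
        return sum(abs(a - b) for a, b in pairs) / len(pairs)
    return min(range(max_d + 1), key=cost)

def shrink(signal, f=2):
    """Downsample by keeping every f-th sample."""
    return signal[::f]

left = [1, 1, 1, 1, 8, 8, 8, 8, 1, 1, 1, 1]
right = left[2:]                               # true disparity: 2 pixels
coarse = sad_disparity(shrink(left), shrink(right), max_d=3)
print(coarse * 2)                              # 2: recovered at full scale
```

Matching at half resolution searches half as many candidate disparities over half as many pixels, which is the source of the reported speedup.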
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking (ijsrd.com)
A visual prior learned from generic real-world images can be used to represent objects in a scene. Existing work presented an online tracking algorithm that transfers a visual prior, learned offline, to online object tracking: a complete dictionary representing the visual prior is learned from a collection of real-world images. This prior knowledge is generic, and the training image set contains no observation of the target object. The learned prior is transferred to construct the object representation using sparse coding and multiscale max pooling, and a linear classifier is learned online to distinguish the target from the background and to follow target and background appearance variations over time. Tracking is carried out within a Bayesian inference framework, with the learned classifier used to construct the observation model. A particle filter estimates the tracking result sequentially but works poorly in noisy scenes, and time-shift variance is not well suited to tracking the target from prior information about its structure. We propose an HMM-based Kalman filter to improve online target tracking in noisy sequential image frames: the covariance is measured to identify noisy scenes, and discrete time steps are evaluated to separate the target object from the background. Experiments were conducted on challenging scene sequences to evaluate the tracker in terms of tracking success rate, centre location error, number of scenes, learned object sizes, and tracking latency.
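The Kalman-filtering building block can be illustrated with a standard 1-D constant-position filter (a generic textbook form; the paper's HMM coupling and covariance-based noise detection are not reproduced here):

```python
# Standard 1-D Kalman filter, shown only as a hedged illustration of
# the filtering step such trackers build on.

def kalman_1d(measurements, q=0.01, r=4.0, x0=0.0, p0=1.0):
    """Smooth noisy positions; q/r are process/measurement noise variances."""
    x, p, out = x0, p0, []
    for z in measurements:
        p = p + q               # predict: uncertainty grows
        k = p / (p + r)         # Kalman gain
        x = x + k * (z - x)     # update: move toward the measurement
        p = (1 - k) * p
        out.append(x)
    return out

noisy = [10.3, 9.8, 10.1, 9.9, 10.2]     # jittery observations of 10.0
est = kalman_1d(noisy, x0=10.0)
print(round(est[-1], 2))                 # estimate stays close to 10.0
```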
Multiple Person Tracking with Shadow Removal Using Adaptive Gaussian Mixture ... (IJSRD)
Person detection is a major concern in real-world video surveillance systems, with applications such as abnormal event detection, congestion analysis, human gait characterization, fall detection, person identification, gender classification, and elderly care. In this work, we use the Gaussian Mixture Model (GMM), a popular background-subtraction method that gives real-time object detection, for multi-person detection. Because GMM alone is not fully robust, multi-person tracking with shadow removal fills this gap: a hybrid HOG-LBP approach combined with the GMM algorithm is presented for multi-person tracking with shadow removal.
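The simplest special case of the GMM background model, a single Gaussian per pixel with a running update (parameter values assumed), already shows the mechanism: a pixel is foreground when it deviates from its learned background statistics by more than a few standard deviations:

```python
# Single-Gaussian-per-pixel background model: the simplest special
# case of the GMM approach (assumed parameters, illustrative only).

def update(mean, var, pixel, alpha=0.05):
    """Running update of a pixel's background mean and variance."""
    d = pixel - mean
    mean += alpha * d
    var = (1 - alpha) * var + alpha * d * d
    return mean, var

def is_foreground(mean, var, pixel, k=2.5):
    """Foreground if the pixel deviates by more than k std-devs."""
    return abs(pixel - mean) > k * max(var, 1.0) ** 0.5

mean, var = 100.0, 4.0                # learned background statistics
print(is_foreground(mean, var, 101))  # False: small fluctuation
print(is_foreground(mean, var, 180))  # True: a person enters the pixel
mean, var = update(mean, var, 101)    # background absorbs small changes
```

A full GMM keeps several (mean, variance, weight) triples per pixel, which lets it model backgrounds that alternate between states (e.g. swaying foliage).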
Corner Detection Using Mutual Information (CSCJournals)
This work presents a new method of corner detection based on mutual information that is invariant to image rotation. Mutual information, a universal similarity measure, has the advantage of avoiding differentiation, which amplifies the effect of noise at high frequencies. In our work we use mutual information normalized by entropy. The tests are performed on grayscale images.
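A minimal histogram-based computation of mutual information between two gray-level patches (patch values are made up; the paper additionally normalizes by entropy) looks like this:

```python
# Sketch of mutual information between two flattened gray-level
# patches, the similarity measure the method is built on.

import math
from collections import Counter

def mutual_information(a, b):
    """I(A;B) in bits, from the joint gray-level histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with counts scaled by n
        mi += (c / n) * math.log2(c * n / (pa[x] * pb[y]))
    return mi

patch = [0, 0, 1, 1, 2, 2]
print(mutual_information(patch, patch))               # maximal: equals H(A)
print(mutual_information(patch, [0, 1, 0, 1, 0, 1]))  # independent: 0.0
```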
A New Algorithm for Tracking Objects in Videos of Cluttered Scenes (Zac Darcy)
This paper describes a novel algorithm for automatic video object tracking based on subtraction of successive frames, in which the direction of movement of the tracked object is predicted by analyzing the changing areas generated by the object’s motion, specifically within regions of interest defined inside the tracked object in both the current and the next frame. Simultaneously, a minimization process is initiated that seeks the location of the tracked object in the next frame using a function measuring the degree of dissimilarity between the region of interest defined inside the tracked object in the current frame and a moving region in the next frame. This moving region is displaced in the direction of motion predicted by the frame-subtraction process. The location of the moving region of interest in the next frame that minimizes the proposed dissimilarity function is taken as the predicted location of the tracked object in the next frame. In addition, a testing platform is designed for creating virtual scenarios that allow the performance of the proposed algorithm to be assessed. These virtual scenarios are exposed to heavily cluttered conditions in which the areas surrounding the tracked object present high variability. The results show that tracking was successfully carried out in a set of virtual scenarios under different challenging conditions.
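The frame-subtraction step can be sketched on 1-D "frames" (an illustrative toy, not the paper's implementation): pixels that change between consecutive frames localize the moving object, and the shift of their centroid predicts the direction of motion:

```python
# Toy frame-differencing sketch: changed pixels localize the object,
# and the centroid shift predicts its direction of motion.

def changed_region(prev, curr, thresh=5):
    """Indices where consecutive frames differ by more than thresh."""
    return [i for i, (a, b) in enumerate(zip(prev, curr)) if abs(a - b) > thresh]

def centroid(indices):
    return sum(indices) / len(indices)

f0 = [0, 0, 9, 9, 0, 0, 0, 0]   # object occupies positions 2-3
f1 = [0, 0, 0, 9, 9, 0, 0, 0]   # object moved one pixel right
diff = changed_region(f0, f1)
print(diff)                      # [2, 4]: trailing and leading edges
print("right" if centroid(diff) > centroid([2, 3]) else "left")
```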
A Novel Approach To Detection and Evaluation of Resampled Tampered Images (CSCJournals)
Most digital forgeries use an interpolation function, which affects the underlying statistical distribution of the image pixel values; when detected, this can be used as evidence of tampering. This paper provides a comparison of interpolation techniques, similar to Lehmann [1], using analyses of the Fourier transform of the image signal and a quantitative assessment of the interpolation quality after applying selected interpolation functions, alongside an appraisal of computational performance using runtime measurements. A novel algorithm is proposed for detecting locally tampered regions, taking the averaged discrete Fourier transform of the zero-crossings of the second difference of the resampled signal (ADZ). The algorithm was compared, using precision, recall and specificity metrics, against those found in the literature, with comparable results. The interpolation comparison results were similar to those of [1]. The detection algorithm performed well at identifying authentic images, and better than previously proposed algorithms at identifying tampered regions.
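The periodic trace that interpolation leaves in the second difference, which the ADZ detector exploits, can be demonstrated on a toy 1-D signal (a deliberately simplified view, not the paper's algorithm): after 2x linear interpolation, the second difference is exactly zero at every interpolated sample, producing a perfectly periodic pattern:

```python
# Toy demonstration of the resampling fingerprint: the second
# difference of a linearly up-sampled signal vanishes periodically.

def upsample2(signal):
    """Linear interpolation to roughly twice the length."""
    out = []
    for a, b in zip(signal, signal[1:]):
        out += [a, (a + b) / 2]
    return out + [signal[-1]]

def second_diff(s):
    return [s[i + 1] - 2 * s[i] + s[i - 1] for i in range(1, len(s) - 1)]

s = upsample2([3, 7, 2, 9, 4])
zeros = [i for i, d in enumerate(second_diff(s)) if d == 0]
print(zeros)  # [0, 2, 4, 6]: evenly spaced, the fingerprint of interpolation
```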
Efficient Similarity Computation for Collaborative Filtering in Dynamic Envir... (Olivier Jeunen)
Slides for our full paper presentation at RecSys '19 in Copenhagen, titled "Efficient Similarity Computation for Collaborative Filtering in Dynamic Environments".
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection Operator (QUESTJOURNAL)
ABSTRACT: Tracking of moving objects, called video tracking, is used for measuring motion parameters and obtaining a visual record of moving objects; it is an important area of application in image processing. In general there are two approaches to object tracking: recognition-based tracking and motion-based tracking. Video tracking systems open up wide possibilities in today’s society and are used in applications such as military, security, monitoring, and robotics, and nowadays in day-to-day applications. However, video tracking still has many open problems, and various research activities in video tracking are being explored. This paper presents an algorithm for video tracking of moving targets using a contour-based detection technique that depends on the Sobel operator. The proposed system is suitable for indoor and outdoor applications. Our approach has the advantage of extending the applicability of the tracking system and, as presented here, improves the performance of the tracker, making high-frame-rate video tracking feasible. The goal of the tracking system is to analyze the video frames and estimate the position of part of the input frame (usually a moving object); our approach can detect and track more than one object and calculate the positions of the moving objects. The aim of this paper is therefore to construct a motion-tracking system for moving objects. At the end of the paper, detailed outcomes and results are discussed using experiments with the proposed technique.
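The Sobel horizontal-gradient operator at the core of such a contour-based detector can be sketched in a few lines (a pure-Python toy, not the paper's system); it responds strongly at vertical step edges:

```python
# 3x3 Sobel horizontal-gradient response on a 2-D list image
# (illustrative sketch of the operator named by the paper).

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def sobel_x(img, y, x):
    """Horizontal gradient at interior pixel (y, x)."""
    return sum(SOBEL_X[j][i] * img[y - 1 + j][x - 1 + i]
               for j in range(3) for i in range(3))

# A vertical step edge between columns 1 and 2:
img = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
print(sobel_x(img, 1, 1))   # 36: strong response at the edge
print(sobel_x(img, 1, 2))   # 36: strong response at the edge
```

A full detector would combine this with the vertical kernel (its transpose) and threshold the gradient magnitude to obtain object contours.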
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE (ijcsa)
Mosaicing is the blending together of several arbitrarily shaped images to form one large balanced image such that the boundaries between the original images are not seen. Image mosaicing creates a large field of view of a scene, and the resulting image can also be used for texture mapping of a 3D environment. Blended images have become widely necessary for images captured from real-time sensor devices, bio-medical equipment, satellites, aerospace, security systems, brain mapping, genetics, etc. The idea behind this work is to automate the image mosaicing system so that blending is fast, easy, and efficient even when a large number of images is considered. This work also provides an analysis of blending over images containing different kinds of distortion and noise, which further enhances the quality of the system and makes it more reliable and robust.
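One common blending step such systems rely on is linear feathering across the overlap; a toy 1-D sketch (overlap width and pixel values assumed) shows how a hard seam becomes a smooth ramp:

```python
# Toy 1-D feather blending: weights ramp linearly across the overlap
# so the seam between the two images disappears.

def blend(left, right, overlap):
    """Feather-blend two rows that share `overlap` pixels."""
    out = list(left[:-overlap])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)       # ramps from left image to right
        out.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    out.extend(right[overlap:])
    return out

a = [10, 10, 10, 10]        # last 2 pixels overlap with b's first 2
b = [30, 30, 30, 30]
print(blend(a, b, 2))       # a smooth ramp instead of a hard 10 -> 30 seam
```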
A Hybrid Architecture for Tracking People in Real-Time Using a Video Surveill... (sipij)
This paper describes a novel method for tracking customers using images taken from video-surveillance cameras. The system analyzes the number of customers and their motions through the aisles of big-box stores (supermarkets) in real time. The originality of our approach is based on studying blob properties to manage splitting/merging issues using a mathematical morphology operator. On the other hand, in order to manage a high number of customers in real time, we combine the advantages of two tracking algorithms.
Usage of Shape From Focus Method For 3D Shape Recovery And Identification of ... (CSCJournals)
Shape from focus is a method of 3D shape and depth estimation of an object from a sequence of pictures taken with changing focus settings. In this paper we propose a novel method of shape recovery, originally created for shape and position identification of a glass pipette in a medical hybrid robot. In the proposed algorithm, the Sum of Modified Laplacian is used as the focus operator. Each step of the algorithm is tested in order to pick the operators with the best results. The reconstruction allows us not only to determine the shape but also to precisely define the position of the object. Results of the proposed method on real objects show the efficiency of this scheme.
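The Sum of Modified Laplacian focus operator can be written directly (a minimal version, without the threshold and summation window the full method may use): sharper, higher-contrast patches score higher, which is what lets the method pick the best-focused frame per pixel:

```python
# Minimal Sum of Modified Laplacian (SML) focus measure on a 2-D
# list image; illustrative sketch of the operator named by the paper.

def sml(img):
    """Sum of the modified Laplacian over all interior pixels."""
    h, w = len(img), len(img[0])
    total = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ml = (abs(2 * img[y][x] - img[y][x - 1] - img[y][x + 1])
                  + abs(2 * img[y][x] - img[y - 1][x] - img[y + 1][x]))
            total += ml
    return total

sharp = [[0, 9, 0], [9, 0, 9], [0, 9, 0]]    # high local contrast
blurred = [[4, 5, 4], [5, 4, 5], [4, 5, 4]]  # same pattern, low contrast
print(sml(sharp) > sml(blurred))             # True: the in-focus slice wins
```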
A colour-based particle filter can achieve effective target tracking, but it has drawbacks in situations such as a target and background with similar colours, occlusion in complex backgrounds, and deformation of the target. To deal with these problems, an improved particle-filter tracking system based on colour and moving-edge information is proposed in this study to provide more accurate results in long-term tracking. In this system, the moving-edge information is used to ensure that the target remains enclosed by the bounding box when the problems mentioned above are encountered, maintaining the correctness of the target model. Using 100 targets in 10 video clips captured indoors and outdoors as test data, the experimental results show that the proposed system tracks targets effectively, achieving an accuracy rate of 94.6%, higher than that of the colour-based particle-filter tracking system proposed by Nummiaro et al. (78.3%) [10]. For the case of occlusion, the former also achieves an accuracy rate of 91.8%, much higher than that of the latter (67.6%). The experimental results reveal that using the target’s moving-edge information can enhance the accuracy and robustness of a particle-filter tracking system.
In this report, Argus, a tool for generating visualizations for eye tracking data is presented. There are numerous ways to
visually present eye tracking data: heatmaps, scanpath, gaze stripes, eye clouds and AOI transition diagrams to name a few. On top
of that, there are multiple ways to interact with these visualizations like selecting users, stimuli and fixation points to compare these
features between the different visualizations. All of the aforementioned visualizations and interaction techniques are implemented into
this tool. This report describes these visualizations and interactions including their advantages and disadvantages and how they are
used in understanding eye tracking data. Furthermore, the report also looks at the structure of the dataset, how the tool runs on a
server, how data is stored and the design philosophy of the website. Finally, the tool is previewed by means of an application example
and the performance and limitations are discussed.
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...sipij
Efficient and efficient multiple object segmentation is an important task in computer vision and object recognition. In this work; we address a method to effectively discover a user’s concept when multiple objects of interest are involved in content based image retrieval. The proposed method incorporate a framework for multiple object retrieval using semi-supervised method of similar region merging and flood fill which models the spatial and appearance relations among image pixels. To improve the effectiveness of similarity based region merging we propose a new similarity based object retrieval. The users only need to roughly indicate the after which steps desired objects contour is obtained during the automatic merging of similar regions. A novel similarity based region merging mechanism is proposed to guide the merging process with the help of mean shift technique and objects detection using region labeling and flood fill. A region R is merged with its adjacent regions Q if Q has highest similarity with Q (using Bhattacharyya descriptor) among all Q’s adjacent regions. The proposed method automatically merges the regions that are initially segmented through mean shift technique, and then effectively extracts the object contour by merging all similar regions. Extensive experiments are performed on 12 object classes (224 images total) show promising results.
Kandemir Inferring Object Relevance From Gaze In Dynamic ScenesKalle
As prototypes of data glasses having both data augmentation and gaze tracking capabilities are becoming available, it is now possible to develop proactive gaze-controlled user interfaces to display information about objects, people, and other entities in real-world setups. In order to decide which objects the augmented information should be about, and how saliently to augment, the system needs an estimate of the importance or relevance of the objects of the scene for the user at a given time. The estimates will be used to minimize distraction of the user, and for providing efficient spatial management of the augmented items. This work is a feasibility study on inferring the relevance of objects in dynamic scenes from gaze. We collected gaze data from subjects watching a video for a pre-defined task. The results show that a simple ordinal logistic regression model gives relevance rankings of scene objects with a promising accuracy.
Detecting and Shadows in the HSV Color Space using Dynamic Thresholds IJECEIAES
The detection of moving objects in a video sequence is an essential step in almost all the systems of vision by computer. However, because of the dynamic change in natural scenes, the detection of movement becomes a more difficult task. In this work, we propose a new method for the detection moving objects that is robust to shadows, noise and illumination changes. For this purpose, the detection phase of the proposed method is an adaptation of the MOG approach where the foreground is extracted by considering the HSV color space. To allow the method not to take shadows into consideration during the detection process, we developed a new shade removal technique based on a dynamic thresholding of detected pixels of the foreground. The calculation model of the threshold is established by two statistical analysis tools that take into account the degree of the shadow in the scene and the robustness to noise. Experiments undertaken on a set of video sequences showed that the method put forward provides better results compared to existing methods that are limited to using static thresholds.
Combined cosine-linear regression model similarity with application to handwr...IJECEIAES
Abstract: the similarity or the distance measure have been used widely to calculate the similarity or dissimilarity between vector sequences, where the document images similarity is known as the domain that dealing with image information and both similarity/distance has been an important role for matching and pattern recognition. There are several types of similarity measure, we cover in this paper the survey of various distance measures used in the images matching and we explain the limitations associated with the existing distances. Then, we introduce the concept of the floating distance which describes the variation of the threshold’s selection for each word in decision making process, based on a combination of Linear Regression and cosine distance. Experiments are carried out on a handwritten Arabic image documents of Gallica library. These experiments show that the proposed floating distance outperforms the traditional distance in word spotting system.
Implementation of Object Tracking for Real Time VideoIDES Editor
Real-time tracking of object boundaries is an
important task in many vision applications. Here we propose
an approach to implement the level set method. This approach
does not need to solve any partial differential equations (PDFs),
thus reducing the computation dramatically compared with
optimized narrow band techniques proposed before. With our
approach, real-time level-set based video tracking can be
achieved.
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...IAEME Publication
Multiple human tracking based on object detection has been a challenge due to its
complexity. Errors in object detection would be propagated to tracking errors. In this
paper, we propose a tracking method that minimizes the error produced by object
detector. We use RetinaNet as object detector and Hungarian algorithm for tracking.
The cost matrix for Hungarian algorithm is calculated using the RetinaNet features,
bounding box center distances, and intersection of unions of bounding boxes. We
interpolate the missing detections in the last step. The proposed method yield 43.2
MOTA for MOT16 benchmark
The Extraction of Spatial Features from remotely sensed data and the use of this information as input into further decision making systems such as geographical information systems (GIS) has received considerable attention over the few decades. The successful use of GIS as a decision support tool can only be achieved, if it becomes possible to attach a quality label to the output of each spatial analysis operation. Thus the accuracy of Spatial Feature Extraction gained more attention as geographic features can hardly formulated in a certain pattern due to intra-class variation and inter-class similarity. Besides these Spatial Feature Extraction further include positional uncertainty, attribute uncertainty, topological uncertainty, inaccuracy, imprecision/inexactitude, inconsistency, incompleteness, repetition, vagueness, noisy, omittance, misinterpretation, misclassification, abnormalities and knowledge uncertainty. To control and reduce uncertainty in an acceptable degree, a Probabilistic shape model is described for Extracting Spatial Features from multi-spectral image. The advantages of this, as opposed to the conventional approaches, are greater accuracy and efficiency, and the results are in a more desirable form for most purposes.
Fast Computational Four-Neighborhood Search Algorithm For Block matching Moti...IJERA Editor
The Motion estimation is an effective method for removing temporal redundancyfound in video sequence compression. Block Matching algorithm has been widely used in motion estimation and a number of fast algorithms have proposed to reduce the computational complexity of BMA. In this paper we propose a new search strategy for fast block matching based on Four-Neighborhood Search (FNS) and fast computational strategy, this new algorithm can significantly speed up the computation of the block matching by reducing the number of checked points and the time computational. Results have been shown that 89% to 93% of operations can be saved while maintaining the quality of video relative to full search algorithm.
Similar to Spakov.2011.comparison of gaze to-objects mapping algorithms (20)
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Comparison of Gaze-to-Objects Mapping Algorithms
Oleg Špakov
University of Tampere
Kanslerinrinne 1 B 1063
33014, Tampere, Finland
+358 3 3551 8556
oleg@cs.uta.fi
ABSTRACT
Gaze data processing is an important and necessary step in gaze-based applications. This study compares several gaze-to-object mapping algorithms using various dwell times for selection, presenting targets of several types and sizes. Seven algorithms found in the literature were compared against two newly designed algorithms. The study revealed that a fractional mapping algorithm (known) produced the highest rate of correct selections and the fastest selection times, but also the highest rate of incorrect selections. The dynamic competing algorithm (newly designed) showed the next best result, but also a high rate of incorrect selections. The type of target had only a small impact on the calculated statistics. Strictly centered gazing increased the rate of correct selections for all algorithms and target types. Directions for further improvement of mapping algorithms and for future investigation are discussed.
Categories and Subject Descriptors
H.5.2 [User Interfaces]: Input devices and strategies
General Terms
Algorithms, Experimentation, Measurement, Performance,
Design.
Keywords
Gaze to object mapping, eye gaze pointing and selection,
algorithm design, gaze controlled applications.
1. INTRODUCTION
The development of gaze-controlled applications always involves finding the correspondence between gaze data and onscreen interactive objects. Solving this task is not as straightforward as it is when the input is mouse cursor positions: even the most accurate eye-tracking systems designed for computer-aided assistance measure the gaze point (the point on the screen where the gaze lands) with some noise. Therefore, the user’s focus cannot be estimated by naïve mapping, in which a gaze point reported by an eye-tracking system is simply associated with the object that geometrically contains it.
The noise present in gaze point measurements during fixations consists of several components. The two obvious components are a) technology-related: inaccuracy due to the imperfection of the calibration-based “camera-image to screen” transformation algorithms, and b) biology-related: noise due to various eye movements, such as tremor and microsaccades, and a natural random offset (due to the fuzzy dimensions of the fovea¹) between the vector of the actual attentive gaze direction and the optical axis of the eye. Both components introduce dynamic noise into the measured gaze point; however, over a short period, such as the duration of a fixation, the noise can be roughly approximated as consisting of a constant (offset) component and a dynamic (jittering) component (for illustration, Figure 1 shows targets and recorded gaze points).
Figure 1. Noise in gaze data: offset and jittering.
Various methods have been proposed to overcome the inaccuracy problem in associating gaze points with targets. It is worth noting that in many solutions fixations, rather than raw gaze points, are supposed to be used as the input for mapping algorithms. However, fixation detection algorithms are designed to classify all gaze points into “moving” (saccades) or “stable” (fixations), and to answer questions like “does this gaze point belong to the same fixation as the previous one?”, rather than to map points onto objects. Moreover, any fixation detection algorithm can be treated as a filter that outputs the mean of the collected gaze points, and can therefore be compared against gaze-to-object mapping algorithms. In the rest of this paper, “gaze-point-to-object” rather than “fixation-to-object” mapping is meant everywhere.
The simplest and most evident way to deal with the noise in gaze data is to make all interactive objects large enough that any inaccuracy in the output of a gaze tracking system is compensated by the object’s size (for example, see the GazeTalk application developed by Hansen, Hansen and Johansen [2]). The only design requirement for objects in this case is that their informative visual part be located at the very center, thus leaving a lot of empty (i.e., non-informative, but still included
¹ See more on this topic at http://webvision.med.utah.edu
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
NGCA’11, May 26–27, 2011, Karlskrona, Sweden.
Copyright 2011 ACM 978-1-4503-0680-5/11/05… $10.00.
in the visual object representation) space between the object’s content and its borders.
A similar solution was proposed by Jacob [4]: the objects are large as well, but the visible object representation is limited to its informative part, leaving an invisible “gaze-sensitive” area around it (i.e., the sensitive area exceeds the object’s visual border). This gives the effect that objects “attract” gaze points. However, the screen space around objects cannot be filled with any information and must stay blank. Miniotas, Špakov and MacKenzie [8] studied dynamic expansion of the gaze-sensitive area: if a gaze point that belongs to a fixation lands on an object, the object is recognized as being in focus for as long as the fixation continues (the so-called “grab-and-hold” effect). However, the requirement of having no objects around decreases the value of this solution. An effect similar to “grab-and-hold” was also studied by Zhang, Ren, and Zha [16], who implemented object force-feedback and gaze point speed reduction algorithms to “hold” the gaze over an object, thus preventing the dwell time from resetting.
Another mapping algorithm, described by Monden, Matsumoto, and Yamato [9], does not take the object’s size into account, but instead associates a gaze point with the nearest object (the distance being measured from the object’s center). In this case, however, the number of unintended selections may be very high, as a user can trigger a selection by looking at any point of the screen, even at the point farthest from any interactive object.
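The nearest-object rule can be sketched in a few lines. The object representation and names below are illustrative, not taken from [9]:

```python
import math

def map_to_nearest(gaze_point, objects):
    """Associate a gaze point with the object whose center is closest.

    `objects` maps an object id to its center (x, y); every gaze point
    is mapped to *some* object, however far away it lands -- which is
    exactly the weakness noted above.
    """
    gx, gy = gaze_point
    return min(objects, key=lambda oid: math.hypot(gx - objects[oid][0],
                                                   gy - objects[oid][1]))

# A gaze point far from every object is still mapped to one of them:
buttons = {"A": (100, 100), "B": (300, 100)}
print(map_to_nearest((900, 500), buttons))  # -> B
```

Because the mapping always succeeds, a selection can be triggered by a stray fixation anywhere on the screen, which is why the authors report a high rate of unintended selections.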
Graphical interfaces of applications nearly always consist of interactive elements (buttons, menus, web links, etc.) and passive ones (text, images, etc.). The necessity of making all interactive elements large reduces the space left to accommodate non-interactive elements, and gaze-controlled interfaces therefore sometimes contain specific interactive widgets to deal with this problem (for example, see [7] and [13]). It is also worth noting that non-interactive elements can (and probably should) also be designed so that their content does not take up all the available space, but leaves a gap between the informative content and the border. Therefore, in many cases all screen real estate is divided between visual interface elements (both interactive and passive), and algorithms that try to expand objects or split the “free” areas among the nearest elements will be of no use simply because there is no such “free” area, although they may still help in cases when a gaze point falls slightly off the screen.
The only known option for this kind of gaze point mapping is a dynamic, hidden resize of the gaze-sensitive area of interactive elements. The resize procedure uses some system and/or user state that it tracks. MacKenzie and Zhang [6] introduced this kind of intelligence into their onscreen keyboard for eye typing (operating with distances from fixation to object center rather than with object sizes). In their keyboard the keys have various “priorities” at a given moment, so fixations are associated not with the nearest key but with the key for which the product of distance and priority yields the highest value. A follow-up study revealed some (although very small) improvement in eye typing (fewer errors, higher entry speed) using this mapping algorithm.
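A minimal sketch of this priority-weighted association follows. Here the score combines a key’s priority with the inverse of its distance, which is one plausible reading; the exact formula in [6] is based on letter probabilities, and all names and values below are illustrative:

```python
import math

def map_with_priorities(fixation, keys, priorities):
    """Pick the key with the highest priority-weighted score.

    Instead of the plain nearest key, each key's score is its current
    priority divided by its distance to the fixation, so a likely next
    letter can win even when it is not the geometrically closest key.
    """
    fx, fy = fixation
    def score(key):
        kx, ky = keys[key]
        dist = math.hypot(fx - kx, fy - ky) or 1e-9  # avoid div by zero
        return priorities[key] / dist
    return max(keys, key=score)

keys = {"e": (0, 0), "q": (30, 0)}
# Fixation lands closer to "q", but "e" is far more probable:
print(map_with_priorities((20, 0), keys, {"e": 0.7, "q": 0.05}))  # -> e
```

With equal priorities the rule degenerates to plain nearest-key mapping, which matches the intuition that the priorities only bias, rather than replace, the geometric association.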
Another interesting proposal was introduced by Xu, Jiang, and Lau [15], who suggested a complex association between a single gaze point and a few nearest objects. In their study the authors calculated the overall time spent focusing on objects, and if a gaze point landed close to more than one object, its “time” (the eye-tracking system’s sampling interval) was divided between all these objects in proportion to the distance to them. This idea may prove useful when applied to gaze-based interaction using dwell time: the object that first accumulates a certain “amount of time”, or “interest”, wins the competition for the user’s attention (focus) and gets selected.
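The fractional idea can be sketched as follows. The inverse-distance weighting is an assumption about how the proportional split is meant (closer objects receiving a larger share), and the radius, names, and structure are illustrative rather than taken from [15]:

```python
import math

def share_sample(gaze_point, objects, radius=50.0):
    """Split one sample's dwell time among all nearby objects.

    Objects within `radius` of the gaze point share the sampling
    interval, with closer objects taking a larger (inverse-distance)
    share; the shares of one sample always sum to 1.
    """
    gx, gy = gaze_point
    weights = {}
    for oid, (ox, oy) in objects.items():
        d = math.hypot(gx - ox, gy - oy)
        if d < radius:
            weights[oid] = 1.0 / (d + 1.0)
    total = sum(weights.values())
    return {oid: w / total for oid, w in weights.items()} if total else {}

interest = {"A": 0.0, "B": 0.0}
objects = {"A": (0, 0), "B": (40, 0)}
for _ in range(10):  # ten samples landing between the two objects
    for oid, share in share_sample((10, 0), objects).items():
        interest[oid] += share
print(interest["A"] > interest["B"])  # True: the closer object wins
```

Selection then follows by dwell time: the first object whose accumulated interest crosses a threshold is selected.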
Other known gaze point mapping algorithms used for dwell-time-based selection, however, always treat input data as solid temporal units. Instead, these algorithms pay specific attention to how to preserve an object’s accumulated interest (i.e., the sum of gaze points multiplied by the sampling interval) when a few gaze points have landed outside the object. Hansen et al. [3] suggested simply ignoring such gaze points (i.e., not altering the accumulated interest). Tien and Atkins [14] suggested selecting an object once it had collected 30 of the last 40 gaze points, and then all of the next 20 gaze points (~700 ms). Almost the same proposal can be found in a study by Abe, Ohi, and Ohyama [1]: to be selected, an object must first (in the “initial” state) receive 5 consecutive gaze points (another option: more than half of the 5 most recent gaze points), and then (in the “continuous” state) more than half of the 5 or 10 following points.
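The counting rule of Tien and Atkins [14] can be sketched as a simple check over the recent mapping history. The window layout below is a direct reading of the description above; the function name and data layout are illustrative:

```python
def should_select(history, target):
    """Tien-and-Atkins-style rule: select `target` once it has received
    at least 30 of 40 gaze points and then all of the following 20
    points (~700 ms), judged over the last 60 mapped samples."""
    if len(history) < 60:
        return False
    window = list(history)[-60:]
    first40, last20 = window[:40], window[40:]
    return first40.count(target) >= 30 and all(p == target for p in last20)

history = ["B"] * 8 + ["A"] * 52      # 8 stray points, then 52 on "A"
print(should_select(history, "A"))    # True: 32 of 40, then 20 of 20
```

A single stray point inside the final 20-sample run is enough to postpone the selection, which is what makes this family of rules robust to outliers but comparatively slow.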
Laqua, Bandara, and Sasse [5] implemented a gaze-controlled application in which objects competed for the user’s attention. In their first solution, a gaze point contributing to one object’s accumulated interest did not affect the accumulated interest of the other objects. The accumulated interest values were reset only at the moment an object (either any object or the corresponding one; two options were available) was selected.
Their second solution applies a decay effect to the accumulated interest of all objects other than the one onto which the most recent gaze point was mapped. That is, each gaze point contributing to one object’s accumulated interest also affected the accumulated interest of the other objects, in contrast to the first solution. The accumulated interest was decreased by 50% for every 50 gaze points (1 second) mapped elsewhere. Decay of accumulated interest was also used by Ohno [10, 11] in his EyePrint application.
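The decay scheme of [5] (halving an object’s interest for every 50 samples, about one second, mapped elsewhere) can be sketched like this. The per-sample decay factor is derived so that 50 applications halve the value; the 50 Hz rate and all names are illustrative:

```python
SAMPLE_INTERVAL = 0.02          # 50 Hz tracker -> 20 ms per sample
DECAY = 0.5 ** (1.0 / 50.0)     # halves interest over 50 samples (~1 s)

def update_interest(interest, mapped_object):
    """Add one sample to the mapped object; decay everything else."""
    for oid in interest:
        if oid == mapped_object:
            interest[oid] += SAMPLE_INTERVAL
        else:
            interest[oid] *= DECAY
    return interest

interest = {"A": 1.0, "B": 0.0}
for _ in range(50):             # one second of gaze on "B"
    update_interest(interest, "B")
print(round(interest["A"], 2))  # -> 0.5 : "A" lost half its interest
print(round(interest["B"], 2))  # -> 1.0 : 50 samples of 20 ms each
```

Note how slowly a competing object loses its standing: this is the slowness criticized in the next section, which motivates the proposed competing algorithm.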
2. RESEARCH INTEREST
The variety of mapping algorithms that somehow deal with the inaccuracy of eye-tracking output naturally raises a question: which one is the best when applied in real conditions, where objects are located next to each other and there is no space to expand their gaze-sensitive areas? Another research interest is the object size and its impact on the performance of these algorithms.
In existing gaze-controlled applications, objects are usually presented as rectangles or circles with an appropriate pictogram. The impact of the pictogram type (text, icon, or blank) on the performance of mapping algorithms is one more interest of this study.
It is expected that the user’s focus, when looking at a target, lies at some distance from the target’s center. The impact of this offset on mapping performance is another subject of this study.
The comparison required that the same gaze data be streamed into all mapping algorithms; therefore a database of gaze points was collected during an experiment and then used to feed the compared algorithms. Only algorithms that can map gaze points onto multiple adjacent objects and that require no knowledge of the task, system, or user state were selected for this study (the complete list is in the “Analysis” section).
3. DESIGNING A NEW ALGORITHM
Protection of the accumulated interest against outlying gaze
points was recognized as a key feature of all the algorithms
described in the previous section. However, there is no
agreement on the effect that an outlying gaze point should have
on the accumulated interest. In some cases, such as the force-feedback
and speed-reduction mapping algorithms, outlying gaze points
reset the accumulated value. In other cases, like the one
described in [1], the interest remains unchanged, even if it is
evident that another object was focused. One more solution, like
the second algorithm described in [5], implies a slow decay of the
interest.
The decision to decay the accumulated interest in some way
seems the most natural: the decay protects the accumulated
interest against outlying gaze points, and resets it after some
time if another object receives the user’s focus. However, the
reported decay seems to be very slow, and a more natural solution
may sound as follows: add the sampling interval to the interest
of the object onto which a gaze point was mapped directly, and
subtract the same value from the interest of all other objects. The
proposed algorithm was called the “competing algorithm”.
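The proposed update rule can be sketched as follows. This is a minimal sketch assuming a 50 Hz tracker (S = 20 ms); the rectangle hit-test, the clamping of interest at zero, and all names are illustrative assumptions, not the original implementation.

```python
# Competing-algorithm sketch: each new gaze point adds the sampling
# interval S to the interest of the object it lands on and subtracts
# the same value from all other objects. Clamping at zero is an
# assumption; the paper does not specify a lower bound.

S = 20.0  # sampling interval, ms (50 Hz tracker assumed)

def hit(obj, x, y):
    """Axis-aligned rectangle containment test (illustrative)."""
    left, top, w, h = obj["rect"]
    return left <= x < left + w and top <= y < top + h

def competing_update(objects, x, y):
    for obj in objects:
        if hit(obj, x, y):
            obj["interest"] += S
        else:
            obj["interest"] = max(0.0, obj["interest"] - S)

objects = [{"rect": (0, 0, 100, 100), "interest": 0.0},
           {"rect": (100, 0, 100, 100), "interest": 0.0}]
for _ in range(5):
    competing_update(objects, 50, 50)  # five points on the first object
```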
When outlying gaze points are frequent, this kind of mapping
algorithm causes a delay in selection. Since outlying points
can be recognized as “outlying” rather than as a “switch of focus” only
after several further points have been collected, it is possible to re-estimate
their association with objects later. Thus, keeping a history of
gaze point mapping results and looking backward with every
new gaze point may serve to improve the proposed
algorithm. Naturally, the history should be limited in time and be
reset after each selection.
The improved algorithm, named the “dynamic competing
algorithm”, pulls each past gaze point toward the most recent gaze point
by a value inversely proportional to the distance and time
interval between them. This procedure uses the Gaussian
function; the parameters of its spatial and temporal
components (sigmas) were found empirically, but they may
depend on the peculiarities of the output of the eye-tracking system
used. The “pull-in” procedure resembles the magnetic effect of
the algorithms used in [16].
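The pull-in step can be sketched as below, using the sigma values reported in this study. The exact form of the Gaussian weight and the history format are assumptions reconstructed from the description, not the original code.

```python
import math

# Pull-in sketch for the dynamic competing algorithm: every stored gaze
# point is moved toward the newest one, the more so the closer it is in
# space and time. Sigma values follow the study; the exact weighting
# form and the [x, y, index] history format are assumptions.

SIGMA_S = 20.0  # spatial sigma, px
SIGMA_T = 80.0  # temporal sigma, samples

def pull_in(history, xt, yt, t):
    """history: list of [x, y, j] entries; modified in place."""
    for p in history:
        xj, yj, j = p
        w = math.exp(-((xt - xj) ** 2 + (yt - yj) ** 2) / (2 * SIGMA_S ** 2)
                     - (t - j) ** 2 / (2 * SIGMA_T ** 2))
        p[0] = xj + (xt - xj) * w
        p[1] = yj + (yt - yj) * w
```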
4. METHOD
4.1 Experiment design
Three types of targets were used in this study: short words
(3 letters), long words (9 letters), and icons. The monospaced
font “Consolas” was used to display the words, as it keeps
their graphical representations the same size. The lists of words
were created from the “Oxford 3000” keywords
(http://www.oxfordadvancedlearnersdictionary.com/oxford3000), the most
frequent 3000 words of the English language, in order to minimize the
cognitive load for the participants; the 3-letter list contained 177
words, and the 9-letter list contained 253 words. The list of icons
was constructed from various free icons (designed to be
displayed on buttons) available on the Internet, and contained
195 icons.
A fourth type of object (“small dot”) was introduced to serve
as the center of the targets emulated during the analysis step. This
was done to make a comparison of centered and free gazing
possible: the size of the emulated targets resembled the sizes of the other
shown targets.
The shown targets appeared in three sizes: 18, 24, and
30 points for words, and 35, 60, and 85 pixels for square-shaped
icons. The blank targets had a size corresponding to each of the
graphical targets. A “small dot” target was displayed as a 6-by-6-pixel
square. Words were displayed in black, blank targets
in grey, and the screen background was dark grey.
The screen was divided into 25 cells (5 by 5); targets were
presented in the center of each cell in random order. One block of
the experiment consisted of the presentation of 25 targets of the
same type: only words (either from the 3-letter or from the 9-letter
list), icons, or blank targets of the same size were displayed
within a block, but for each presentation a new random word or
icon was used. The order of blocks was random.
4.2 Procedure
The first step of the experiment was the calibration of the eye
tracker. Then a grid of 25 cells with circles (D = 20 px) at their centers
was displayed, and the gaze point was visible on the screen.
Participants were asked to quickly glance at each point. This
procedure was introduced to visually estimate the quality of
calibration. The displayed circles received a color between red and
green, depending on what percentage of the gaze points that landed on a
particular cell hit the central (invisible) square (D = 50 px). The
calibration procedure was repeated if more than half of the
circles were red or orange, until a satisfactory calibration quality
was reached.
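The red-to-green feedback could be implemented along these lines. This is a sketch; the linear color interpolation is an assumption, as the paper does not specify the exact mapping.

```python
# Sketch of the calibration-quality feedback: each circle's color is
# interpolated between red and green by the share of gaze points in the
# cell that hit its central (invisible) square. Linear interpolation is
# an illustrative assumption.

def circle_color(points_in_cell, points_in_square):
    """Return an (r, g, b) tuple between red and green."""
    ratio = points_in_square / points_in_cell if points_in_cell else 0.0
    return (int(255 * (1 - ratio)), int(255 * ratio), 0)

print(circle_color(100, 100))  # all points hit: green (0, 255, 0)
print(circle_color(100, 0))    # no points hit: red (255, 0, 0)
```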
Then the first block started: a home box was displayed at the screen
center for 2 seconds, and the experiment continued by
displaying the 25 targets one by one. Each target stayed visible for 2
seconds; targets were shown and hidden automatically. After a block
was finished, the participants had a chance to rest, if needed, and
then pressed a key to continue with the next block. This way all 20
blocks were completed: [3 types] x [3 sizes] x [graphical / blank]
+ 2 x [“small dot”]. Each participant gazed at 20 x 25 = 500
targets.
To avoid a visual search for small targets appearing
far from the previously shown target, a big white rectangle was
displayed for 150 ms at the target’s appearance location to produce a
“blink” effect.
The recorded data consisted of records, each containing a
gaze point (timestamp and location) and the displayed target’s
properties (location, size, and type).
(In particular, some of the icons located at
http://wefunction.com/2008/07/function-free-icon-set/ were
used in this study.)
4.3 Participants
Ten volunteers (aged 25–59 years, 5 male, 5 female) took
part in the test. They were students or staff at the University of
Tampere, and all had previously participated in other related eye
typing experiments. Prior to the experiment, participants were
informed about their rights and anonymity of the data in the
experiment.
4.4 Equipment
The experiment was conducted in the usability laboratory at
the University of Tampere. The Tobii T60 eye tracking system
was used to measure participants’ eye movements. Proprietary
software with an embedded ETU-Driver was developed
(C#, .NET 2.0, MS Visual Studio 2005) to display targets and
collect data. The setup consisted of adjustable chairs and tables.
The chair was set so that the participant’s eyes were at
approximately 60 cm from the 17-inch monitor.
5. ANALYSIS
Altogether, 500 x 10 = 5000 sets of gaze points were
recorded and used to feed the mapping algorithms.
The software dedicated to the first-step analysis was
developed in C#, .NET 2.0, MS Visual Studio 2005. It takes
the recorded data from a log file as input and outputs the results of selection
for each displayed target. The results depend on the chosen dwell
time, the mapping algorithm, and the chosen parameters of this
algorithm. The output of this software consisted of 950 records
when analyzing one log file (one participant’s data): 500 selections
of the displayed targets, and 450 selections of emulated objects
of all sizes when the target was presented as a “small dot”. Each
record contained the location, size, and type of the target, the movement
time, the selection time, and the selection result (correct selection,
incorrect selection, or no selection). These data were loaded into
MS Excel to summarize the results of the study.
Four dwell times DT were used as selection criteria: 300,
500, 850 and 1300 milliseconds.
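The dwell-time selection criterion can be sketched as follows, assuming the additive interest accumulation used by most of the tested algorithms; all names are illustrative, not from the original software.

```python
# Sketch of the dwell-time selection criterion on top of additive-only
# interest accumulation: the first object whose accumulated interest
# reaches the dwell time DT is reported as selected. Names are
# illustrative assumptions.

S = 20.0    # sampling interval, ms
DT = 500.0  # dwell time, ms

def run_selection(gaze_points, objects, contains):
    """gaze_points: iterable of (x, y); contains(obj, x, y) -> bool."""
    interest = {id(obj): 0.0 for obj in objects}
    for x, y in gaze_points:
        for obj in objects:
            if contains(obj, x, y):
                interest[id(obj)] += S
                if interest[id(obj)] >= DT:
                    return obj  # selection recognized
    return None  # no selection within the trial
```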
Most of the implemented algorithms use the same additive-only
interest calculation formula:

AT_t(O_i) = AT_{t-1}(O_i) + \begin{cases} S, & \text{if } (\tilde{x}_t, \tilde{y}_t) \in O_i \\ 0, & \text{otherwise} \end{cases}

Here AT_t(O_i) is the accumulated interest, in milliseconds, of the
i-th object O_i at time t, S is the sampling interval in milliseconds,
and (\tilde{x}_t, \tilde{y}_t) are the adjusted gaze point
coordinates at time t. The adjustment is algorithm-specific,
and is shown below as the corresponding formula (if any) in the
list of implemented algorithms (the interest calculation formula
is also shown, if it differs from the one above):
· Competing, C:

AT_t(O_i) = AT_{t-1}(O_i) + \begin{cases} S, & \text{if } (\tilde{x}_t, \tilde{y}_t) \in O_i \\ -S, & \text{otherwise} \end{cases}

where \tilde{x}_t = x_t and \tilde{y}_t = y_t.

(Footnotes: the ETU-Driver is available at http://www.cs.uta.fi/~oleg/;
the stimuli presentation and analysis software is available from
http://www.cs.uta.fi/~oleg/products/LookPoint.zip.)
· Dynamic competing, DC:

AT_t(O_i) = \sum_{j=0}^{t} \begin{cases} S, & \text{if } (\tilde{x}_j, \tilde{y}_j) \in O_i \\ 0, & \text{otherwise} \end{cases}

where

\tilde{x}_j = x_j + (x_t - x_j)\, e^{-\frac{(x_t - x_j)^2 + (y_t - y_j)^2}{2\sigma_s^2} - \frac{(t - j)^2}{2\sigma_t^2}},

\tilde{y}_j = y_j + (y_t - y_j)\, e^{-\frac{(x_t - x_j)^2 + (y_t - y_j)^2}{2\sigma_s^2} - \frac{(t - j)^2}{2\sigma_t^2}},

and \sigma_t = 80, \sigma_s = 20.
· Force feedback, FF [16]:

\tilde{x}_t = x_t + s\, \frac{D(\{x_{t-1}, y_{t-1}\}, \{x_t, y_t\})}{D(\{x_t, y_t\}, \{x_c, y_c\})}\, (x_c - x_t),

\tilde{y}_t = y_t + s\, \frac{D(\{x_{t-1}, y_{t-1}\}, \{x_t, y_t\})}{D(\{x_t, y_t\}, \{x_c, y_c\})}\, (y_c - y_t),

where D is the distance between the given points, (x_c, y_c) is the
center of object O_i, and s = 0.8.
· Speed reduction, SR [16]:

\tilde{x}_t = \begin{cases} r\, x_{t-1} + (1 - r)\, x_t, & \text{if } D(\{x_t, y_t\}, \{x_c, y_c\}) > D(\{x_{t-1}, y_{t-1}\}, \{x_c, y_c\}) \\ x_t, & \text{otherwise} \end{cases}

\tilde{y}_t = \begin{cases} r\, y_{t-1} + (1 - r)\, y_t, & \text{if } D(\{x_t, y_t\}, \{x_c, y_c\}) > D(\{x_{t-1}, y_{t-1}\}, \{x_c, y_c\}) \\ y_t, & \text{otherwise} \end{cases}

where D is the distance between the given points, (x_c, y_c) is the
center of object O_i, and r = 0.85.
· Fractional mapping, FM [15]:

AT_t(O_i) = AT_{t-1}(O_i) + S\, e^{-\frac{(x_t - x_c)^2}{2\sigma_x^2} - \frac{(y_t - y_c)^2}{2\sigma_y^2}},

where (x_c, y_c) is the center of object O_i, and \sigma_x = \sigma_y = 120.
· Accurate ending, AE: 75% of 67% period of the dwell
time, then 100% of the rest period [14].
· More-than-half, MTH: the “initial”state is 33% of the
dwell time, and the “continuous”state is the rest period
[1].
· Static interest accumulation, SIA [5].
· Dynamic interest decay, DID: the accumulated interest loses
0.5% for each 20 ms of gazing off-target [5]:

AT_t(O_i) = AT_{t-1}(O_i) + \begin{cases} S, & \text{if } (\tilde{x}_t, \tilde{y}_t) \in O_i \\ -\frac{S}{20}\, D\, AT_{t-1}(O_i), & \text{otherwise} \end{cases}

where \tilde{x}_t = x_t, \tilde{y}_t = y_t, and D = 0.005.
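As a worked example of a per-object interest increment, the fractional mapping weighting can be sketched as below, with the sigma values used in this study; the function name is illustrative, not from the original software.

```python
import math

# Fractional-mapping sketch: every object receives a share of the
# sampling interval S weighted by a Gaussian of the distance between
# the gaze point and the object's center (sigma_x = sigma_y = 120 px).
# Names are illustrative assumptions.

S = 20.0
SIGMA_X = SIGMA_Y = 120.0

def fm_increment(x, y, cx, cy):
    """Interest added to an object centered at (cx, cy)."""
    w = math.exp(-(x - cx) ** 2 / (2 * SIGMA_X ** 2)
                 - (y - cy) ** 2 / (2 * SIGMA_Y ** 2))
    return S * w

print(fm_increment(0, 0, 0, 0))  # gaze on center: full increment, 20.0
```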
To ensure the best performance of the DC, FF, SR, FM, and DID
mapping algorithms, a prior search for their optimal parameters
was completed (for illustration, the dependency of the average rate of
correct selections on the parameter s, recorded using the FF algorithm,
is shown in Figure 2). The search was organized so
that the lists of candidate values included those reported in the
literature. The best parameter is the one that results in the
highest rate of correct selections. However, if these rates were
about equal for several parameters (the observed difference was
less than 0.5%), then the selection time and the rate of incorrectly
selected objects were also taken into account to recognize the
best parameter. The optimization was completed using DT =
500 ms as the selection criterion.
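The parameter search can be sketched as a simple grid evaluation. The evaluate function is a placeholder for running an algorithm over the recorded data, and the tie-breaking order (incorrect rate, then selection time) is an illustrative reading of the 0.5% rule described above.

```python
# Parameter-search sketch: keep the candidate with the highest rate of
# correct selections; when rates are within 0.5%, prefer a lower rate
# of incorrect selections, then a shorter selection time. evaluate() is
# a placeholder; all names are illustrative.

def pick_best(candidates, evaluate):
    """evaluate(p) -> (correct_rate, incorrect_rate, selection_time)."""
    best, best_score = None, None
    for p in candidates:
        sc, sw, ts = evaluate(p)
        if best is None or sc > best_score[0] + 0.5:
            best, best_score = p, (sc, sw, ts)
        elif abs(sc - best_score[0]) <= 0.5 and (sw, ts) < (best_score[1], best_score[2]):
            best, best_score = p, (sc, sw, ts)
    return best
```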
No data filtering was applied, intentionally, to avoid reducing
the differences in the results between the tested
algorithms and to determine their own efficiency when dealing
with inaccuracies in gaze data.
Figure 2. The dependency of the rate of correct selections on the
parameter s of the FF algorithm.
The DC algorithm was the most computationally demanding,
resulting in slow data processing; the
MTH algorithm was the fastest.
6. RESULTS AND DISCUSSION
The overall rate of correct selections was rather low (from
20% to 60%, about 45% on average) due to the small size of
some targets and drifts in calibration. Therefore, a gaze data
correction was introduced, and the mapping results for both the
original and the corrected gaze data are presented next.
The correction was session- and location-based: the gaze data of
each of the 20 targets that appeared at a particular location during the same
session was shifted equally. The applied offsets were calculated as the
distance between the corresponding target’s center and the
average of (almost) all gaze points collected when a “small dot”
target was shown.
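The correction can be sketched as follows: the offset for a location is the vector from the mean small-dot gaze position to the target center, and all gaze points recorded at that location are shifted by it. A minimal sketch; names are illustrative.

```python
# Sketch of the session- and location-based gaze-data correction: gaze
# points recorded at a location are shifted by the offset between the
# target center and the mean gaze position measured while the "small
# dot" was shown there. Names are illustrative.

def correction_offset(target_center, dot_gaze_points):
    xs = [p[0] for p in dot_gaze_points]
    ys = [p[1] for p in dot_gaze_points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return target_center[0] - mx, target_center[1] - my

def apply_correction(gaze_points, offset):
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in gaze_points]

off = correction_offset((100, 100), [(95, 90), (105, 94)])  # (0.0, 8.0)
```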
The grand mean of the rate of correct selections Sc is 46.8%,
and the rate of incorrect selections Sw is 31.2%. The data
correction improved these values significantly: Sc = 65.9%
and Sw = 15.6% for the corrected data (p < 0.001).
The performance of the algorithms is expressed as Sc, Sw,
and the selection time Ts (computed only for the cases when a selection was
recognized): a higher Sc, a lower Sw, and a shorter Ts point to better
performance. The average Sc calculated for each algorithm using the
original data is shown in Figure 3 as colored solid bars. These
values are quite similar at DT = 300, but Grubbs’ test treats the
value of the MTH algorithm as an outlier, i.e. this algorithm
performs worse than the others at the given DT.
Figure 3. Correct selections, by algorithms and dwell time.
The increase of DT improves the Sc of the FM algorithm and
worsens this rate for the C and AE algorithms. The other algorithms
reach their highest Sc at DT = 500, but then show a steady
decrease as DT continues to increase and the time pressure (due to
approaching the duration of the trials) becomes stronger. Some
algorithms, like C, SIA, and DID, suffer more from this increase
than others, like MTH.
The average Sw calculated for each algorithm using the original
data is shown in Figure 4 as colored solid bars. The C and AE
algorithms are the most resistant to selections of incorrect
objects. The FM algorithm showed the worst resistance to
incorrect selections, but only when DT was 850 or 1300 ms.
The DC and MTH algorithms also produced a higher-than-average
Sw for all DTs.
Figure 4. Incorrect selections, by algorithms and dwell time.
The DC and FM algorithms showed the shortest selection
times independently of DT. The other algorithms had about equal
Ts for the same DT. These values are shown in Figure 5.
Figure 5. Selection time, by algorithms and dwell time.
The analysis of the algorithms’ performance reveals that the FM
algorithm could be considered the best because of its highest
Sc and shortest Ts, with DC as the second best. However, both
algorithms showed a high Sw, and therefore their superiority
is disputable. It is worth noting that the advantage in Sc at long
DTs also resulted in a proportional disadvantage in Sw.
The ratio between Sw and Sc may help in understanding the
algorithms’ resistance to incorrect selections. This ratio is
about equal for most of the algorithms, but is worse (greater) for
MTH and better (lower) for the C and AE algorithms. The latter
algorithms showed the worst Sc, and therefore they cannot be treated
as candidates for the best performance.
The discussed ratios are shown in Table 1. The table also
contains the Sc, Sw, and Ts values arranged by target type (rows) and
algorithm (columns). Values calculated from the original and
corrected data are shown for each algorithm separately (left and
right sub-columns, respectively).
The analysis of Sc and Sw computed from the corrected data reveals
a somewhat different picture. Although the FM algorithm still has
a much greater Sc when DT = 1300, its Sc value for DT = 300 is
the worst (see Figure 3, grey patterned bars). An even more significant
change was observed for its Sw: the improvement in data quality
resulted in very high values of Sw compared to those
produced by the other algorithms (see Figure 4, grey patterned bars),
although in absolute units it decreased by 5-12%. The other
algorithms showed about the same increase of Sc and decrease of
Sw relative to the original-data values. The ratio between
Sw and Sc remained the lowest for the AE and C algorithms, but the highest
ratio was obtained by the FM algorithm: it was about twice the
average.
The analysis of the dependency of the algorithms’ Sc on target size
reveals that the observed differences in this value between
algorithms are strongly expressed when the target size is small, and that
these differences vanish for the largest targets, as expected. For
example, the FM algorithm showed about twice as high an Sc as the
other algorithms when the targets were words displayed in the small
font. The rates of correct selection for all targets and algorithms
are shown in Figure 6: the colored solid bars denote the results of
the analysis using the original data, and the grey patterned bars denote the
results of the analysis using the corrected data. However, the
discrimination in Sw depends on the target size in a
similar way. A similar dependency was observed when analyzing
both selection rates using the corrected data.

Figure 6. Correct selections, by algorithms and targets.

Table 1. Rates of correct (Sc, %) and incorrect (Sw, %) selections and selection times (Ts, ms) by algorithm and target type;
for each algorithm, the left value is computed from the original data and the right value from the corrected data.

|                     | C           | DC          | FF          | SR          | FM          | AE          | MTH         | SIA         | DID         |
| Sc, shown graphical | 42.7 / 60.8 | 49.9 / 73.4 | 45.4 / 66.8 | 45.0 / 67.4 | 61.2 / 71.3 | 37.6 / 58.7 | 48.0 / 70.5 | 49.3 / 73.3 | 48.7 / 72.6 |
| Sc, shown blank     | 41.1 / 59.8 | 49.1 / 68.3 | 44.8 / 62.2 | 44.2 / 62.8 | 56.6 / 65.8 | 36.8 / 53.7 | 45.4 / 63.7 | 48.2 / 67.8 | 47.6 / 67.0 |
| Sc, shown mean      | 41.9 / 60.3 | 49.5 / 70.9 | 45.1 / 64.5 | 44.6 / 65.1 | 58.9 / 68.5 | 37.2 / 56.2 | 46.7 / 67.1 | 48.8 / 70.5 | 48.1 / 69.8 |
| Sc, emulated        | 44.3 / 78.8 | 52.7 / 87.0 | 48.4 / 77.6 | 48.0 / 81.6 | 60.6 / 73.1 | 39.1 / 71.8 | 46.5 / 76.9 | 51.4 / 85.4 | 50.6 / 84.7 |
| Sw, shown graphical | 23.9 / 8.5  | 34.4 / 14.4 | 30.7 / 12.6 | 28.8 / 11.5 | 38.3 / 27.9 | 19.6 / 6.7  | 37.1 / 17.0 | 33.7 / 13.6 | 32.6 / 13.2 |
| Sw, shown blank     | 24.4 / 11.5 | 34.2 / 17.6 | 29.6 / 15.1 | 28.1 / 13.9 | 42.7 / 33.3 | 21.0 / 9.7  | 36.8 / 21.2 | 33.2 / 16.8 | 32.4 / 16.3 |
| Sw, shown mean      | 24.2 / 10.0 | 34.3 / 16.0 | 30.2 / 13.8 | 28.5 / 12.7 | 40.5 / 30.6 | 20.3 / 8.2  | 37.0 / 19.1 | 33.4 / 15.2 | 32.5 / 14.7 |
| Sw, emulated        | 23.1 / 3.3  | 33.2 / 6.4  | 30.3 / 5.3  | 27.9 / 4.7  | 38.8 / 26.2 | 18.5 / 2.8  | 37.3 / 12.1 | 32.4 / 5.8  | 31.6 / 5.6  |
| Sw / Sc ratio       | 0.58 / 0.17 | 0.69 / 0.23 | 0.67 / 0.21 | 0.64 / 0.19 | 0.69 / 0.45 | 0.55 / 0.15 | 0.79 / 0.28 | 0.68 / 0.22 | 0.67 / 0.21 |
| Ts, shown graphical | 1293 / 1267 | 1140 / 1125 | 1310 / 1300 | 1351 / 1338 | 1025 / 1023 | 1275 / 1254 | 1149 / 1137 | 1237 / 1210 | 1239 / 1213 |
| Ts, shown blank     | 1332 / 1295 | 1170 / 1149 | 1340 / 1322 | 1375 / 1355 | 1034 / 1032 | 1315 / 1277 | 1165 / 1146 | 1266 / 1238 | 1267 / 1240 |
| Ts, shown mean      | 1312 / 1281 | 1155 / 1137 | 1325 / 1311 | 1363 / 1346 | 1030 / 1027 | 1295 / 1266 | 1157 / 1142 | 1252 / 1224 | 1253 / 1226 |
| Ts, emulated        | 1313 / 1241 | 1149 / 1112 | 1329 / 1311 | 1368 / 1330 | 1037 / 1034 | 1291 / 1228 | 1149 / 1127 | 1253 / 1194 | 1255 / 1196 |
The summarized impact of the target’s pictogram is shown in
Figure 7. The Sc rates are slightly lower for the blank targets:
46.0% versus 47.5% on average for the graphical targets using
the original data, and 63.5% versus 68.5% on average using the
corrected data; the differences are statistically significant (p
< 0.01 and p < 0.001, respectively). The type of target
affected the Sc and Sw values of each algorithm about equally.
A direct comparison between the results produced for the
trials with long and short words is not applicable, as they
occupied distinct amounts of screen space. However, visual
inspection of the values shown in Figure 6 does not reveal any
remarkable changes in the relative performance of the tested
algorithms.
Figure 7. Correct selections, by targets.
The grand mean of Sc for the emulated targets was slightly
higher than for the shown targets (47.9% versus 45.6%, p < 0.001)
in the analysis with the original data. This improvement
was about equal for most algorithms (but for the MTH algorithm
there was no improvement at all, p < 0.05) and types of targets;
therefore, a centralized gaze allocation over a target was
recognized as helping in its selection, but the benefit was only
about 5%. However, the increase of Sc for large images
compared to the Sc of small and middle-sized images was much
higher (5.7% against 0.8% and 0.3%) and statistically significant
(p < 0.001). The effect of the centralized gaze position was the
opposite when using the corrected data: the increase of Sc for
large images was only 3.8% against 19.4% and 7.2%; for the other
types of targets the improvement was about 15%.
The improvement of Sc is higher when the corrected data are
used, but this “artificial” increase was expected, as the same data were
used for the emulation and for the correction. Nevertheless, the differences
in the changes of Sc are still of interest. Indeed, the improvement
of Sc differed a lot among the algorithms: the average
improvement was 13.8%, but the FM algorithm improved its Sc
by only 4.6%, while the C algorithm improved its Sc by 18.5% (see
Figure 8). The weak decrease of Sw (about one percent) in the
analysis with the original data also changed a lot (to 7.6% on
average) when the corrected data were used.
Figure 8. Changes in rates of correct and incorrect
selections, by algorithm.
7. CONCLUSIONS
The fractional mapping algorithm FM was recognized as the
best when considering the rate of correct selections Sc
(especially when selecting small targets and when the time pressure
was high) and the selection time Ts: for this algorithm, these values
were the best in most of the tested conditions. However, its rate
of incorrect selections Sw was often the highest. The ratio
between Sw and Sc was similar to those of the other algorithms;
therefore, the behavior of this algorithm can be characterized as
simply leading to the highest rate of selections: this algorithm
made a selection even when all the other algorithms failed
to do so within the given time window. This observation does
not allow treating this algorithm as the ultimate winner of the
conducted competition.
The designed dynamic competing algorithm DC showed
good performance: its Sc and Ts were among the best in most of
the conditions, and it was less sensitive to the time pressure at long
dwell times than the other algorithms. However, it also showed a
high rate of incorrect selections.
The Sc averaged over all algorithms was slightly lower for the
blank targets, and slightly higher for the targets with central gaze
allocation. Only a few algorithms were affected by the type of
target distinctly from the majority. However, the study leads to the
conclusion that designing the graphical view of a target so that it
attracts gaze to its very center is advisable.
8. FUTURE WORK
The current study was conducted using an expensive high-end
eye tracking system. The low-cost eye tracking systems built with
free software and off-the-shelf hardware components available
nowadays are expected to produce data of lower quality, and
therefore the performance of the tested algorithms may differ
significantly. Future work includes testing the algorithms on
data gathered from systems like the ITU GazeTracker [12].
Another direction of this research is the investigation of
possibilities for further improvement of some of the algorithms. For
example, a combination of DC, FM, and some other algorithms
may lead to very high Sc rates with low Ts and moderate Sw.
Finally, the impact of prior data filtering and smoothing is
one more interest of the research in gaze-to-object mapping.
9. ACKNOWLEDGMENTS
This work was supported by the Academy of Finland (grant 256415).
10. REFERENCES
[1] Abe, K., Ohi, S., and Ohyama, M. 2007. An Eye-Gaze Input
System Using Information on Eye Movement History. In
Proceedings of the 4th international conference on
Universal access in human-computer interaction: ambient
interaction (UAHCI '07), Stephanidis, C. (Ed.). Springer-
Verlag, Berlin, Heidelberg, 721-729.
[2] Hansen, J. P., Hansen, D. W., and Johansen, A. S. 2001.
Bringing Gaze-based Interaction Back to Basics. In
Stephanidis, C. (Ed.), Universal Access in Human-
Computer Interaction, Lawrence Erlbaum Associates, 325-
328.
[3] Hansen, J.P., Johansen, A.S., Hansen, D.W., Itoh, K., and
Mashino, S. 2003. Command without a click: dwell time
typing by mouse and gaze selections. In Proceedings of
Human Computer Interaction (INTERACT '03), IOS Press,
Amsterdam, 121-128.
[4] Jacob, R. J. K. 1995. Eye tracking in advanced interface
design. In Virtual environments and advanced interface
design, Woodrow Barfield and Thomas A. Furness, III
(Eds.). Oxford University Press, Inc., New York, NY, USA,
258-288.
[5] Laqua, S., Bandara, S. U. and Sasse, M. A. 2007.
GazeSpace: eye gaze controlled content spaces. In
Proceedings of the 21st British HCI Group Annual
Conference on People and Computers: HCI...but not as we
know it (BCS-HCI '07), Vol. 2. British Computer Society,
Swinton, UK, 55-58.
[6] MacKenzie, I. S., and Zhang, X. 2008. Eye typing using
word and letter prediction and a fixation algorithm. In
Proceedings of the 2008 symposium on Eye tracking
research & applications (ETRA '08). ACM, New York, NY,
USA, 55-58.
[7] Miniotas, D., Špakov, O., and Evreinov, G. 2003. Symbol
Creator: An Alternative Eye-Based Text Entry Technique
with Low Demand for Screen Space. In Proceedings of IFIP
TC13 International Conference on Human-Computer
Interaction (INTERACT 2003), IOS Press, pp. 137-143.
[8] Miniotas, D., Špakov, O., and MacKenzie. I.S. 2004. Eye
gaze interaction with expanding targets. In Extended
abstracts on Human factors in computing systems (CHI
'04). ACM, New York, NY, USA, 1255-1258.
[9] Monden, A., Matsumoto, K., and Yamato, M. 2005.
Evaluation of gaze-added target selection methods suitable
for general GUIs. In International Journal on Computer
Applications in Technology, 24(1), 17-24.
[10] Ohno, T. 2001. EyePrint - An Information Navigation
Method with Gaze Position Trace. In Human-Computer
Interaction (INTERACT '01), Hirose M. (ed.), IOS Press,
743-744.
[11] Ohno, T. 2004. EyePrint: Support of Document Browsing
with Eye Gaze Trace, In Proceedings of the 6th
International Conference on Multimodal Interfaces
(ICMI’04), 16-23.
[12] San Agustin, J., Skovsgaard, H., Hansen, J. P., and Hansen,
D. W. 2009. Low-cost gaze interaction: ready to deliver the
promises. In Proceedings of the 27th international
conference extended abstracts on Human factors in
computing systems (CHI EA '09). ACM, New York, NY,
USA, 4453-4458.
[13] Špakov, O., and Majaranta, P. 2009. Scrollable Keyboard
for Casual Gaze Typing. In Boehme, M., Hansen, J.-P., and
Mulvey, F. (eds.) PsychNology Journal: Gaze Control for
Work and Play, ISSN 1720-7525, 7(2), pp. 159-173.
[14] Tien, G., and Atkins. M. S. 2008. Improving hands-free
menu selection using eyegaze glances and fixations. In
Proceedings of the 2008 symposium on Eye tracking
research & applications (ETRA '08). ACM, New York, NY,
USA, 47-50.
[15] Xu, S., Jiang, H., and Lau, F. C. M. 2008. Personalized
online document, image and video recommendation via
commodity eye-tracking. In Proceedings of the 2008 ACM
conference on Recommender systems (RecSys '08). ACM,
New York, NY, USA, 83-90.
[16] Zhang, X., Ren, X., and Zha. H. 2008. Improving eye
cursor’s stability for eye pointing tasks. In Proceeding of the
26th annual SIGCHI conference on Human factors in
computing systems, ACM, Florence, Italy, 525–534.