Kammerer Gaze Based Web Search The Impact Of Interface Design On Search Result Selection


This paper presents a study which examined the selection of Web search results with a gaze-based input device. A standard list interface was compared to a grid and a tabular layout with regard to task performance and subjective ratings. Furthermore, the gaze-based input device was compared to conventional mouse interaction. Test persons had to accomplish a series of search tasks by selecting search results. The study revealed that mouse users accomplished more tasks correctly than users of the gaze-based input device. However, no differences were found between input devices regarding the number of search results taken into account to accomplish a task. Regarding task completion time and ease of search result selection, gaze-based interaction was inferior to mouse interaction only in the list interface. Moreover, with a gaze-based input device, search tasks were accomplished faster in tabular presentation than in a standard list interface, suggesting a tabular interface as best suited for gaze-based interaction.


Gaze-based Web Search: The Impact of Interface Design on Search Result Selection

Yvonne Kammerer*
Knowledge Media Research Center, Tuebingen, Germany

Wolfgang Beinhauer†
Fraunhofer Institute for Industrial Engineering, Stuttgart, Germany

* e-mail: y.kammerer@iwm-kmrc.de
† e-mail: wolfgang.beinhauer@iao.fraunhofer.de

Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org. ETRA 2010, Austin, TX, March 22-24, 2010. © 2010 ACM 978-1-60558-994-7/10/0003 $10.00

Abstract

This paper presents a study which examined the selection of Web search results with a gaze-based input device. A standard list interface was compared to a grid and a tabular layout with regard to task performance and subjective ratings. Furthermore, the gaze-based input device was compared to conventional mouse interaction. Test persons had to accomplish a series of search tasks by selecting search results. The study revealed that mouse users accomplished more tasks correctly than users of the gaze-based input device. However, no differences were found between input devices regarding the number of search results taken into account to accomplish a task. Regarding task completion time and ease of search result selection, gaze-based interaction was inferior to mouse interaction only in the list interface. Moreover, with a gaze-based input device, search tasks were accomplished faster in tabular presentation than in a standard list interface, suggesting a tabular interface as best suited for gaze-based interaction.

CR Categories: H.5.2 [Information Interfaces and Presentation]: User Interfaces - Evaluation/methodology; Screen design; Input devices and strategies

Keywords: gaze-based interaction, search result selection, input devices, search results interfaces, Web search

1 Motivation

Gaze-based interaction has become a promising means of accessing computers when the user's hands are occupied or cannot be used for some other reason. Moreover, for some people with physical disabilities such as amyotrophic lateral sclerosis (ALS), the use of gaze-controlled interfaces is often the only possibility to interact with computers and thus to communicate with their environment. A survey conducted among gaze control users with ALS listed internet access, e-mailing and social communication among the most used applications [Donegan et al. 2005]. Enabling these favored tasks of maintaining social interaction and educating oneself via the internet is therefore a primary task of applied research in gaze control. Both tasks strongly depend on efficient user interfaces.

From a user's point of view, internet search is a two-step process, consisting of query formulation and the processing of result sets. While eye-typing is of particular importance for the query composition, efficient information retrieval strongly depends on the presentation of result sets.

In this paper, we concentrate on the task of discovering and accessing information on the Web, which usually starts with a general search engine. The aim of this study is to identify alternative designs for search results interfaces that overcome the problems induced by densely printed result lists, which are most common with popular search engines.

2 Related Work

While eye tracking has been widely used for the examination of Web search patterns, the optimization of search result presentation for gaze-controlled applications is rather new. Kumar et al. [2007] investigated a combination of eye gaze and keyboard input that was used for navigating through a series of Web pages. The study showed that gaze-based interaction resulted in longer click times and higher click errors than mouse interaction.

Other ways of overcoming the problem of too densely arranged search items are the magnification of parts of the screen, such as a gaze-contingent fish-eye lens [Ashmore et al. 2005], or the approach of graphical, multi-scaled information spaces [Mollenbach et al. 2008]. However, the initial case of easing conventional Web search by means of search engines has not been tackled. The experiment described in the following presents a direct approach towards optimizing search engine result pages (SERPs) for use by gaze control.

3 Improving Gaze-based Search Result Selection

Due to the inherent inaccuracy of gaze control and its detrimental properties such as the Midas Touch Problem [Jacob 1990], densely arranged result lists - like linear pull-down menus - seem to be poorly suited for gaze-based search result selection. The poor performance of linear pull-down menus [Kammerer et al. 2008] points towards the necessity of a new design approach that places all interactive elements (i.e., the hyperlinks of the search results) sufficiently apart from each other (design guideline 1). Furthermore, as the effect of inaccuracy due to calibration errors tends to increase towards the screen periphery [Beinhauer 2006], interaction elements should be placed more towards the center of the screen (design guideline 2). Finally, in order to avoid Midas Touch, interactive elements should be separated from the actual content (design guideline 3). Based on these considerations, two alternative layouts of search results interfaces were chosen in this study: a grid interface and a tabular interface.

In a grid interface, which has lately been used in some novel search engines, search results are presented in multiple rows and columns. Therefore, in line with guidelines 1 and 2, search results can be placed more in the center of the screen and with larger space in between the search results. In a tabular interface, which was used in a Web search study by Rele and Duchowski [2005], search results are listed from top to bottom while grouping the individual elements (title, summary, URLs) in columns. Additional information comprised in the summary and the URL of a search result is separated from the link (i.e., the title), in compliance with guideline 3.
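Design guidelines 1 and 2 can be illustrated with a small sketch that computes target centers for a grid kept away from the screen edges and reports the smallest distance between any two targets. This is only an illustration under stated assumptions: the function names, the 15% margin, and the even cell division are hypothetical and are not taken from the layouts used in the study; the 1280 x 1024 resolution is the one reported for the experimental setup.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def grid_centers(screen_w: int, screen_h: int, rows: int = 3, cols: int = 3,
                 margin_frac: float = 0.15) -> List[Point]:
    """Return target centers for a rows x cols grid.

    margin_frac (hypothetical parameter) is the fraction of each screen
    dimension left empty at every edge, which pushes the targets towards
    the screen center (guideline 2).
    """
    if not 0 <= margin_frac < 0.5:
        raise ValueError("margin_frac must be in [0, 0.5)")
    left = screen_w * margin_frac
    top = screen_h * margin_frac
    cell_w = (screen_w - 2 * left) / cols
    cell_h = (screen_h - 2 * top) / rows
    # Row-major order: index 0 is top-left, index rows*cols-1 is bottom-right.
    return [(left + (c + 0.5) * cell_w, top + (r + 0.5) * cell_h)
            for r in range(rows) for c in range(cols)]


def min_spacing(points: List[Point]) -> float:
    """Smallest Euclidean distance between any two target centers (guideline 1)."""
    return min(math.dist(p, q)
               for i, p in enumerate(points)
               for q in points[i + 1:])


if __name__ == "__main__":
    centers = grid_centers(1280, 1024)
    print(len(centers), "targets, minimum spacing:", round(min_spacing(centers), 1), "px")
```

For comparison, calling `grid_centers(1280, 1024, rows=9, cols=1)` stacks the nine targets in a single column within the same margins and yields one third of the vertical spacing of the 3 x 3 grid, which is the kind of dense arrangement guideline 1 argues against. Note that this column is horizontally centered, unlike a conventional left-aligned result list.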
In order to test the suitability of the different search results interfaces for gaze-based search result selection, an experiment was conducted in which participants successively had to use the three different search results interfaces. Additionally, the gaze-based input device was compared to conventional mouse interaction.

We expected the mouse to be superior to the gaze-based input device for search result selection, resulting in higher task performance and more positive subjective ratings. For instance, gaze-based search result selection should take longer and evoke more accidental selections. For gaze-based interaction, the standard list interface was expected to be the least suitable of the three search results interfaces, as the to-be-selected search results are vertically aligned next to each other in the screen periphery (to the left of the screen). In contrast, we hypothesized that the tabular interface would be most apt for a gaze-based input device: As the summary and URL of a search result can be read without the risk of accidental selections, the Midas Touch Problem should be reduced. Therefore, the tabular interface was expected to lead to higher task performance and more positive subjective ratings than the list interface. The suitability of the grid interface was supposed to be in between the two other search results interfaces because the likelihood of calibration errors should be reduced, but not the Midas Touch Problem.

4 Method

4.1 Experimental Setup

Thirty-six able-bodied university students (7 male; mean age: 23.33 years) participated in this experiment. All participants had normal or corrected-to-normal vision. Participants reported having intermediate or advanced computer and Web search skills, without any differences between the two experimental conditions (i.e., gaze-based input device vs. mouse). None of the participants had experience with gaze-based computer input.

The eye gaze data was collected with a Tobii 1750 remote eye tracker built into a 17" monitor set to a resolution of 1280 x 1024 pixels. Participants were seated on a height-adjustable seat with backrest. The viewing distance was approx. 65 cm, and recorded gaze data was smoothed by a filter algorithm.

4.2 Tasks and Material

In order to investigate a rather natural Web search situation, participants were requested to find the answers to specific questions by selecting search results presented by a search engine. Twenty-seven search tasks and associated result lists were created, covering a broad range of topics including sports, movies, travel, news, computers, literature, and automotive. Example tasks included: "When did Apollo 13 take off?" or "Who was the youngest world champion in chess?" Each search task started with a control page containing one of the 27 questions and a brief task description. By pressing the space bar, a Google SERP with pre-defined query terms (e.g., "take off Apollo 13") and nine search results appeared. The search results were manipulated such that for each task there was exactly one search result which led to the correct answer. The eight other results were distractors. Note that the correct search result did not contain the answer, but clearly indicated that the answer could be found on the corresponding Web page. The correct search result was displayed in one of the nine positions, allowing three tasks for each position. The search results were not linked to real Web pages, but to control pages that a) denoted that the correct answer could not be found on this page or b) presented the correct answer. In case of a) participants could navigate back to the SERP by clicking on a hyperlink "back to the Google page" placed in the center of the screen.

Apart from the experimental manipulation of the search results interfaces (see section 4.3), the SERPs were displayed in Google style because of people's familiarity with this search engine. However, ads and the hyperlinks "in cache" and "similar pages" were not included on the SERPs. In the experiment the SERPs were presented in full screen mode such that there was no browser task bar displayed. In each of the interfaces the nine search results fit on the screen, thus obviating the need for scrolling.

4.3 Experimental Design

The experiment was a 3 (within-subjects) x 2 (between-subjects) mixed-model factorial design.

As a first factor, the search results interface was varied within subjects by presenting search results in a list interface, a grid interface, or a tabular interface (see Figure 1). In the list interface the nine search results were listed from top to bottom to the left of the screen. In the grid interface, search results were arranged in three rows and three columns, towards the center of the screen. In the tabular interface every search result element was presented in a separate column. The titles (i.e., the hyperlinks) were presented in the left column, the summaries in the middle column, and the URLs in the right column. The nine search results were listed from top to bottom, with the hyperlinks being presented to the left of the screen.

As a second factor the input device was manipulated between subjects, who either used a computer mouse or a gaze-based input device for operating the search results interfaces. In the gaze-based input device, a dwell-time based selection mechanism was used. Because of the complexity of result selection, involving visual scan and decision processes, the dwell time was set to 750 ms. The hyperlinks indicated their interactivity by inverting their color when hovering over them in either interaction technique. Participants were randomly assigned to one of the two conditions, with 19 participants using the gaze-based input device and 17 the mouse.

Figure 1. SERP types, from back to front: List interface, grid interface, tabular interface.

4.4 Procedure

Participants were tested in individual sessions of approximately one hour. Before starting with the experiment participants were asked to provide some demographic and personal data, received some general instructions and were calibrated on the eye tracking system using a nine-point calibration. Subsequently, the first experimental run started with a training task to get acquainted with gaze control and the interaction with the search results interface. Then, participants performed 9 tasks (with the correct search result being located once at each of the nine positions). Participants were asked to accomplish each task as fast and with as few clicks as possible. They were informed that for each task only one of the nine search results presented on a SERP led to the correct answer. A search task was regarded as successfully accomplished if the correct search result was selected within a time limit of 90 seconds. Participants received feedback on their task accomplishment after each task and were then provided with the next task. After having processed all tasks, a questionnaire addressing participants' subjective ratings regarding the interface was administered. Afterwards, the eye tracker was recalibrated and the second experimental run started. All participants performed the same 27 search tasks. The order in which participants used the three interfaces was counterbalanced across participants as well as the order of the search tasks and the position of the correct search results.

4.5 Dependent Measures

To test the suitability of the three search results interfaces for accomplishing the fact-finding tasks, we examined participants' task performance and subjective ratings with either the mouse or the gaze-based input device.

Task performance. Task performance was determined by three dependent measures. First, the number of correctly accomplished tasks (with a minimum of 0 and a maximum of 9 tasks) was counted. Second, task completion time was recorded (in ms) from the start of the task with the pressing of the space bar until the selection of the correct search result. However, the time spent on wrong pages (i.e., the time from the moment of having selected a wrong search result until the return to the SERP) was not included in the time measurement. Only correctly accomplished tasks were included in the calculations. For gaze-based interaction the dwell time (750 ms/click) was included in the analysis of task completion time. Third, the number of search results selected per task was counted both for correctly accomplished tasks and for all tasks (i.e., also including failed tasks). The number of search results selected in correctly accomplished tasks comprised the number of false search results selected plus the selection of the correct search result per task (resulting in a minimum of 1).

Subjective ratings. Subjective ratings included the following measures: First, participants were asked to rate their mental demand during task processing on a scale ranging from 0=very low to 100=very high. Second, participants were presented three statements, which they were asked to rate on a five-point scale (5=highly agree). The statements addressed 1) how much they liked the layout of the interface, 2) how easy they found the search result selection from an interface, and 3) how satisfied they were with the interface.

5 Results

Tables 1 and 2 show the mean values of the seven dependent measures as a function of the two factors input device and interface. For statistical analyses, first, comparisons between mouse interaction and gaze-based interaction were made. Second, the suitability of the three different interfaces for gaze-based search results selection was analyzed. Because of space limitations, statistical values are only reported for significant results.

Table 1. Means and standard deviations (in parentheses) of task performance.

                         Mouse interaction                             Gaze-based interaction
                         List           Grid           Tab.            List           Grid           Tab.
# correct tasks          8.88 (0.33)    8.88 (0.33)    8.88 (0.33)     8.05 (1.13)    8.05 (1.31)    8.53 (0.70)
time (in s)              22.99 (6.03)   22.46 (5.77)   23.47 (5.72)    29.67 (9.42)   26.03 (9.34)   24.26 (6.41)
# clicks / corr. task    1.76 (0.60)    1.75 (0.70)    1.65 (0.52)     1.82 (0.68)    1.64 (0.62)    1.47 (0.57)
# clicks / all tasks     1.78 (0.60)    1.79 (0.72)    1.67 (0.51)     1.84 (0.63)    1.67 (0.65)    1.51 (0.57)

Table 2. Means and standard deviations (in parentheses) of subjective ratings.

                         Mouse interaction                             Gaze-based interaction
                         List           Grid           Tab.            List           Grid           Tab.
mental demand            43.53 (23.8)   51.76 (21.0)   46.76 (17.7)    34.74 (18.0)   40.53 (22.0)   32.37 (19.7)
layout                   3.24 (0.97)    2.82 (0.95)    2.82 (1.43)     3.32 (0.89)    3.53 (1.07)    3.58 (1.12)
ease of selection        3.88 (0.78)    3.29 (1.11)    3.59 (1.06)     2.79 (0.92)    3.00 (1.20)    3.47 (1.17)
satisfaction             3.47 (0.87)    3.06 (0.90)    3.41 (1.28)     3.42 (0.90)    3.37 (1.12)    3.63 (1.27)

5.1 Comparisons between Mouse- and Gaze-based Interaction

To compare task performance and subjective ratings between mouse-based and gaze-based interaction we conducted MANOVAs with input device as between-subjects factor.

Task performance. The MANOVA showed a significant effect of the input device on the number of correctly accomplished tasks (F(3, 32)=4.71, p=.01). Univariate analyses revealed that in all three result interfaces mouse users accomplished more tasks correctly than users of the gaze-based input device (list interface: F(1, 34)=8.50, p=.01; grid interface: F(1, 34)=6.42, p=.02; tabular interface: F(1, 34)=3.68, p=.06). The greatest differences between the two input devices appeared in the list interface and the least in the tabular interface. With regard to the task completion time, the MANOVA showed a marginally significant effect of the input device (F(3, 32)=2.50, p=.08). Univariate analyses revealed that this effect could be traced back to the list interface (F(1, 34)=6.25, p=.02). In the list interface tasks were accomplished significantly faster with the mouse than with the gaze-based device. For the grid interface and the tabular interface, however, there were no differences between input devices. Contrary to our expectations, the number of search results selected per task did not differ between input devices, irrespective of whether only correctly accomplished tasks were included in the analyses or all tasks.

Subjective ratings. The MANOVA showed no significant differences between input devices on participants' perceived mental demand. Although not reaching statistical significance, the MANOVA showed a statistical trend of input device on participants' ratings regarding the layout of the interfaces (F(3, 32)=2.30, p=.10). Univariate analyses revealed that this effect could be traced back to the grid interface (F(1, 34)=4.28, p=.05) and marginally to the tabular interface (F(1, 34)=3.16, p=.08). Users of the gaze-based input device tended to like the layout of these alternative interfaces more than mouse users. With regard to participants' ratings about the ease of selection of search results from a SERP, the MANOVA showed a significant effect of input device (F(3, 32)=4.90, p=.01). Again, univariate analyses revealed that this effect could be traced back to the list interface (F(1, 34)=14.62, p=.001). Users of the gaze-based device rated search result selection in the list interface less easy (i.e., more strenuous) than mouse users, whereas for the grid and the tabular interface ratings did not differ between input devices. Finally, the MANOVA showed no differences between input devices with regard to users' overall satisfaction.

5.2 Suitability of Search Results Interfaces for Gaze-based Interaction

To compare task performance and subjective ratings between the three search results interfaces, repeated-measures ANOVAs with interface as within-subjects factor were conducted.

Task performance. The repeated-measures ANOVA showed no significant differences between the three interfaces on the number of correctly accomplished tasks. However, with regard to task completion time, the ANOVA showed a marginally significant effect of interface (F(2, 36)=2.55, p=.09). Bonferroni-adjusted post hoc tests showed that in the tabular interface tasks were accomplished faster than in the list interface (p=.08). With regard to the number of search results selected per task, the ANOVA also showed a significant effect of interface (F(2, 36)=3.30, p=.05). Again, post hoc tests revealed that in the tabular interface participants selected fewer search results to accomplish the task than in the list interface (p=.03). When analyzing the number of clicks for all tasks, this effect becomes even stronger (p=.01). Furthermore, for both variables (time and clicks), values for the grid interface were in between, neither differing from the list interface nor from the tabular interface.

Subjective ratings. Although not reaching statistical significance, the ANOVA showed a statistical trend of interface on participants' perceived mental demand (F(2, 36)=2.51, p=.10). Post hoc tests revealed that in the grid interface participants tended to perceive a higher mental demand than in the tabular interface (p=.09). Besides this, no differences were found between interfaces on users' ratings regarding the layout of the interfaces, the ease of search result selection, and their overall satisfaction with the interfaces. In case of mouse operation, no significant differences were registered between the interfaces with regard to task performance and subjective ratings.

6 Conclusions

As expected, the study showed that mouse users accomplished more tasks correctly than users of the gaze-based input device. Of note, however, irrespective of the input device almost all tasks were accomplished correctly. No differences were found between input devices regarding the number of search results selected to accomplish a task, with very few wrong search results being selected. Thus, contrary to our expectations, gaze-based search result selection in general did not evoke more accidental selections. Furthermore, with regard to task completion time and ease of search result selection, gaze-based interaction was inferior to mouse interaction only in the list interface, but not in the two alternative interfaces. The layouts of the two alternative interfaces were also liked better when operated with the gaze-based input device than when operated by mouse. Moreover, when search results were presented in a tabular interface with a gaze-based input device, in line with our expectations, search tasks were accomplished faster and with fewer search results selected than in a standard list interface. One drawback of the grid interface might be that it is perceived as more mentally demanding than the tabular interface, which might be due to its unclear arrangement of the search results.

To conclude, even though not all of the experimental results reached statistical significance, the study quite clearly speaks against using conventional list interfaces for gaze-based search result selection. Rather, this study provides first indications that the tabular interface is best suited for gaze-based interaction among the given alternatives. Its suitability for a series of consecutive searches, in case the desired information could not be found among the presented results, or for browsing or navigational tasks has yet to be shown. Furthermore, one can assume that by including a separate activation link instead of using the title as link the advantage of a tabular layout might be further increased. Nonetheless, without further manipulation of search engines, the tabular interface as it was used in the current study presents a first step towards more efficient Web searching for situations in which the user's hands cannot be used, for instance, due to motor impairment.

References

ASHMORE, M., DUCHOWSKI, A.T., and SHOEMAKER, G. 2005. Efficient eye pointing with a fisheye lens. In Proceedings of Graphics Interface. GI '05. ACM International Conference Proceeding Series, vol. 112. Canadian Human-Computer Communications Society, 203-210.

BEINHAUER, W. 2006. A widget library for gaze-based interaction elements. In Proceedings of the 2006 Symposium on Eye Tracking Research & Applications. ETRA '06. ACM, New York, NY, 53-53.

DONEGAN, M. et al. 2005. User requirements report with observations of difficulties users are experiencing. COGAIN, IST-2003-511598: Deliverable 3.1.

JACOB, R.J.K. 1990. What you look at is what you get: eye movement-based interaction techniques. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Empowering. J.C. Chew and J. Whiteside, Eds. CHI '90. ACM, New York, NY, 11-18.

KAMMERER, Y., BEINHAUER, W., and SCHEITER, K. 2008. Looking my way through the menu: The impact of menu design and multimodal input on gaze-based menu selection. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications. ETRA '08. ACM, New York, NY, 213-220.

KUMAR, M., PAEPCKE, A., and WINOGRAD, T. 2007. EyePoint: Practical pointing and selection using gaze and keyboard. In Proceedings of ACM Conference on Human Factors in Computing Systems. CHI '07. ACM, New York, NY, 421-430.

MOLLENBACH, E., STEFANSSON, T., and HANSEN, J.P. 2008. All eyes on the monitor: gaze based interaction in zoomable, multi-scaled information-spaces. In Proceedings of the 13th International Conference on Intelligent User Interfaces. IUI '08. ACM, New York, NY, 373-376.

RELE, R.S., and DUCHOWSKI, A.T. 2005. Using Eye Tracking to Evaluate Alternative Search Results Interfaces. In Proceedings of the Human Factors and Ergonomics Society, Sep. 26-30, 2005, Orlando, FL.
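As an illustration of the dwell-time based selection mechanism described in Section 4.3, the following minimal sketch fires a selection once the gaze has rested on the same hyperlink for 750 ms. It is not the implementation used in the study: the class name, the sample format (timestamp in ms plus the id of the hyperlink under the gaze, or None), and the rule that a target fires only once per continuous dwell are assumptions made for this example. Hit-testing and gaze smoothing are assumed to happen upstream.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

DWELL_MS = 750  # dwell time reported in Section 4.3


@dataclass
class DwellSelector:
    """Hypothetical dwell-time selector fed with (timestamp, target) gaze samples."""
    dwell_ms: int = DWELL_MS
    _target: Optional[str] = None   # hyperlink currently under the gaze
    _since: int = 0                 # time at which the gaze entered that hyperlink
    _fired: bool = False            # True once this dwell has produced a selection

    def update(self, t_ms: int, target: Optional[str]) -> Optional[str]:
        """Process one gaze sample; return the target id if a selection fires."""
        if target != self._target:
            # Gaze moved to another element (or to plain content): restart timing.
            self._target, self._since, self._fired = target, t_ms, False
            return None
        if (target is not None and not self._fired
                and t_ms - self._since >= self.dwell_ms):
            self._fired = True  # assumption: fire only once per continuous dwell
            return target
        return None


def selections(samples: Iterable[Tuple[int, Optional[str]]],
               dwell_ms: int = DWELL_MS) -> List[Tuple[int, str]]:
    """Return (timestamp, target) for every selection triggered by the samples."""
    selector = DwellSelector(dwell_ms=dwell_ms)
    fired = []
    for t_ms, target in samples:
        hit = selector.update(t_ms, target)
        if hit is not None:
            fired.append((t_ms, hit))
    return fired


if __name__ == "__main__":
    gaze = [(0, "result-1"), (400, "result-1"), (500, "result-2"),
            (900, "result-2"), (1250, "result-2"), (1300, "result-2"), (1400, None)]
    print(selections(gaze))
```

In this sketch, samples with target None stand for gaze on non-interactive content such as the summary and URL columns of the tabular interface, which can be read for any length of time without triggering a selection.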
