Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

1,258 views

Published on

This is the slide set that described and summarized the Smart eBook research we published in VAST2006. (yes, 4 years ago).

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

  1. 1. Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card Palo Alto Research Center The user study portion of this research has been funded in part by ARDA NIMD program. Ed H. Chi VAST 2006 1
  2. 2. Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card, Michelle Gumbrecht Palo Alto Research Center The user study portion of this research has been funded in part by ARDA NIMD program. Ed H. Chi VAST 2006 2
  3. 3. Reading is an essential human activity. •  Giant leaps forward is marked by new and better ways to find, correlate, and comprehend information. Ed H. Chi VAST 2006 Copyright 2004 PARC 3
  4. 4. Many books digitized in the (goal) digital library effort. 90% (current) •  Amazon, Google, and RECALL 75% CMU’s Million Book Project. Intelligent Analysts spend an enormous amount of time 0% Reading! [Pirolli, Lee, Card] 0 TIME Ed H. Chi VAST 2006 Copyright 2004 PARC 4
  5. 5. Subject Indexes as a new reading “device” •  Invented in the 15th Century [Dewar98] •  Method of design refined thru centuries Peter Heylyn's 1652 Cosmographie in Four Bookes Ed H. Chi VAST 2006 Copyright 2004 PARC 5
  6. 6. Instead of generating new indexes using IR techniques, why not enhance them? Take advantage of centuries of experience in building subject indexes. Ed H. Chi VAST 2006 Copyright 2004 PARC 6
  7. 7. Ed H. Chi VAST 2006 Copyright 2004 PARC 7
  8. 8. Readers need help in directing their attention to the most relevant passages to their topic of interest. Idea: conceptually highlight passages and keywords that are related to user search keywords. Ed H. Chi VAST 2006 Copyright 2004 PARC 8
  9. 9. Conceptually highlight any relevant User first type search keywords: passages and keywords “anthrax symptoms” Draw user attention Ed H. Chi VAST 2006 Copyright 2004 PARC 9
  10. 10. Biohazard •  by Ken Alibek •  non-fiction retelling of his experiences working on biological weapons in the former Soviet Union. •  13 index pages in two columns, consisting 829 entries Ed H. Chi VAST 2006 Copyright 2004 PARC 10
  11. 11. Associated Entries underlined in red Exact Matches in red Ed H. Chi VAST 2006 Copyright 2004 PARC 11
  12. 12. Ed H. Chi VAST 2006 Copyright 2004 PARC 12
  13. 13. Ed H. Chi VAST 2006 Copyright 2004 PARC 13
  14. 14. Ed H. Chi VAST 2006 Copyright 2004 PARC 14
  15. 15. Ed H. Chi VAST 2006 Copyright 2004 PARC 15
  16. 16. Goal: Compare how users find, compare, and comprehend information using the ScentIndex and 3Book versus the physical book. It’s not clear to us that the ScentIndex would be better, because: •  Unfamiliarity with 3Book Interface (page turning, clicking on page numbers, use of search box) •  Inability to grasp the ScentIndex concept (reorganization might be confusing, harder to read the index page on screen) •  Readability of the Screen (hard to read a large number of pieces of text) •  Users might be very good at using the physical book index. Ed H. Chi VAST 2006 Copyright 2004 PARC 16
  17. 17. Study Design: •  Within-subject •  Interface condition (ScentIndex vs. Physical Book Index), and •  Task Type (find, compare, comprehend), •  with the order of the interface used and expertise level as the between-subject variables. Subjects: 16 subjects (8 experts on the content, 8 novices) Materials: subjects used PC with two LCD monitors, and the physical Alibek book. Ed H. Chi VAST 2006 Copyright 2004 PARC 17
  18. 18. Overall, the ScentIndex eBook performed better against the physical Book. Faster Speed: •  Subjects using the ScentIndex were faster in completing their tasks no matter whether they were experts or novices, F(1,12)=12.96, p<.01. More Accurate: •  Answers that they provided while using ScentIndex interface were more accurate, F(1,12)=3.991, p=.06. Ed H. Chi VAST 2006 Copyright 2004 PARC 18
  19. 19. Ed H. Chi VAST 2006 Copyright 2004 PARC 19
  20. 20. The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present-day interests, but rather that publication has been extended far beyond our present ability to make real use of the record.” --- V. Bush Indeed, our goal is to enhance current Browsing Interfaces for more productive reading. Ed H. Chi VAST 2006 Copyright 2004 PARC 20
  21. 21. The user study portion of this research has been funded in part by contract #MDA904-03-C-0404 to Stuart K. Card and Peter Pirolli from the Advanced Research and Development Activity, Novel Intelligence from Massive Data program. We thank Jock Mackinlay for some fruitful conversation about the interaction of the eBook; Michelle Gumbrecht and Tan Lee for running some of our experiments; Pam Desmond for help in the data analysis, and Brian Tramontana for the video production. Contact: Ed H. Chi (chi@acm.org) Ed H. Chi VAST 2006 Copyright 2004 PARC 21
  22. 22. Ed H. Chi VAST 2006 Copyright 2004 PARC 22
  23. 23. Ed H. Chi VAST 2006 Copyright 2004 PARC 23
  24. 24. Ed H. Chi VAST 2006 Copyright 2004 PARC 24
  25. 25. Ed H. Chi VAST 2006 Copyright 2004 PARC 25
  26. 26. Page Textures sample Renderer scan Page Images Words + extract Locations OCR Word Association Text compute Matrix Sentence Scent parse Structure Highlights Ed H. Chi VAST 2006 Copyright 2004 PARC 26
  27. 27. Ed H. Chi VAST 2006 Copyright 2004 PARC 27
  28. 28. Early proposal of an indexing system: Memex [Bush45] Electronic Books: Rocket eBook, SoftBook Reader, DigiPaper [Huttenlocher00], DjVu [DjVu Zone03], PDF [Adobe03], MS Reader [MS03]. 3D Electronic Books: SGI Demo Book [SGI93], WebBook [Card96], British Library Turning the Pages [BL03], 3Book [Card03]. Computer help search systems such as Apple or Microsoft. Google or AltaVista provide highlighting and searching Automatic Indexing in IR: use noun-phrases and parsers to create indexes [Wacholder01, Nevill-Manning99]. Scatter/Gather [Cutting91]. Concept similar to: Word Co-occurrence [Schuetze99], InfoScent and Spreading Activation [Chi01, Chi00]. Ed H. Chi VAST 2006 Copyright 2004 PARC 28
  29. 29. HyperText Book Systems: SuperBook [Remde87] provides a dynamic TOC with fisheye DOI. Ed H. Chi VAST 2006 Copyright 2004 PARC 29
  30. 30. Two issues: •  M is calculated using a 40 word window •  Caveat: Exact word matches do not always show up. •  Solution: Insert large values onto the diagonal Ed H. Chi VAST 2006 Copyright 2004 PARC 30
  31. 31. An experimenter without prior knowledge of how the ScentIndex system works devised a total of 12 tasks. The tasks were divided into two groups of six tasks each. •  Between the two sets of questions, half of the subjects received one set first; the other received the other set first. •  Tasks from one group were designed to be one-to-one equivalents of the other group. •  Of these six tasks, two were Simple Fact Retrieval questions, two were Dispersed Comparison questions, and two were Comprehension questions. Ed H. Chi VAST 2006 Copyright 2004 PARC 31
  32. 32. Initial Survey (on computing and search experiences.) 4 expert and 4 novice subjects used the Book interface first, and the other eight used the ScentIndex first. Subjects were trained to use the ScentIndex immediately before they needed to use it. Each task was given on a separate sheet of paper. Read, understand each question completely before they start the task. •  time limit for each task (simple retrieval=2min, comparison=4min, comprehension=6min) with one minute warnings. •  For each interface, subjects performed the simple fact retrievals first, the dispersed comparisons second, and the comprehension questions last. Post Survey (comments, preferences). Ed H. Chi VAST 2006 Copyright 2004 PARC 32
  33. 33. Simple Fact Retrieval: •  The last natural occurring case of WHICH virus occurred in Somalia in 1977? •  Who received a state award for developing a Q fever weapon? Dispersed Comparision: •  What is the death rate of smallpox and tularemia? Which virus has a higher death rate? •  What year did Russia open negotiations with Iraq for large fermentation vessels? What year did Vladimir Kryuchkov become chairman of the KGB? Which occurred first? Comprehension: •  Pasechnik’s defection to the West had grave implications for the Soviet biowarfare program. Match the person with the fact that describes how they’re involved: •  Persons: Frolov, Chernyayev, Karpov, Vinogradov •  Facts: A. First told Alibek about Pasechnik’s defection. B. Deputy minister who refused to sign formal diplomatic reply. C. Given demarche that said US have “new information”, presumably given by Pasechnik. D. Told American visitors that Pasechnik’s jetstream milling machine was for “salt”. Ed H. Chi VAST 2006 Copyright 2004 PARC 33
  34. 34. Time to Completion 6 Participants using the ScentIndex interface performed ln(completion time) in seconds 5.5 tasks faster than those using the 5 Book, F(1,12)=12.96, p<.01. Many tasks not completed in 4.5 the time allotted using the Book interface. 4 Book •  6/7 for simple retrieval, ScentIndex 3.5 •  7/8 for comparison, •  3/5 for comprehension. 3 Simple Dispersed Comprehension Ed H. Chi VAST 2006 Copyright 2004 PARC 34
  35. 35. Natural log transformation on the completion time 6 As predicted, experts performed ln(completion time) in seconds tasks faster than novices overall 5 5.435 •  (Expert Mean=4.85, S.D.=. 4 4.987 212, Novice Mean=4.58, S.D.=.212, F(1,12)=17.7, p<. 3 3.722 01.) 2 •  There were no interactions. Simple Retrieval < Dispersed 1 Comparison < Comprehension, F 0 (2,24)=204, p<.01. Sim ple Dispersed Com prehension Ed H. Chi VAST 2006 Copyright 2004 PARC 35
  36. 36. Converted the scores for each task to a percentage. (measured in Simple Dispersed Comprehen points) Retrieval Comparison sion ScentIndex M=1.88 M=1.88 M=1.77 eBook S.D.=.342 S.D.=.269 S.D.=.284 Book M=1.75 M=1.58 M=1.84 SD=.447 S.D.=.516 S.D.=.259 We found that users performed better using the ScentIndex, reaching marginal significance F(1,12)=3.991, p=.06. We found no difference between experts and novices. Ed H. Chi VAST 2006 Copyright 2004 PARC 36
  37. 37. Users overwhelmingly preferred the ScentIndex interface (15/16) •  “can search using keyword combinations” •  “clicking on page number to navigate” •  “highlighting enables faster scanning and skimming” •  “easier to compare index entries because it’s all on 1 page.” Some users mentioned that they prefer paper for extensive reading Potential Future work: •  Compare with search engines (organize results by relevancy). •  Understand difference between this technique and keyword finding. •  Limit the page number list of each relevant index entry to only the pages that are relevant to the keywords specified. Ed H. Chi VAST 2006 Copyright 2004 PARC 37

×