Slides for a presentation at the First International Workshop on Mining Scientific Publications @ JCDL 2012
Paper: http://dlib.org/dlib/july12/herrmannova/07herrmannova.html
Dasha Herrmannova, Research Scientist at Oak Ridge National Laboratory
Visual Search for Supporting Content Exploration in Large Document Collections
1. 1/48
Visual search for supporting content
exploration in large document collections
Drahomira Herrmannova and Petr Knoth
2. 2/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
4. 4/48
What do we do
• Improve search in (large) document collections
• Examples of collections:
– News articles
– Cultural heritage collection
– Collection of scientific papers
• Current search engines:
– Support for lookup
– Much less support for exploration
5. 5/48
Search tasks (Rose and Levinson, 2004)
• Undirected (or exploratory) queries make up a significant
portion of all searches (Rose and Levinson, 2004)
7. 7/48
How to support exploratory search
• One possible solution – information
visualisation
• Why?
– Easier to communicate structure, organisation and
relations in content
– Visually appealing
10. 10/48
Collection level visualisations
• Visualise attributes of the collection
• Typically aim at providing a general overview
of the collection content
• Examples
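As a rough illustration of what "visualising attributes of the collection" could involve, the sketch below counts how often each value of an attribute occurs across a toy collection. Counts like these could feed an overview such as a histogram or a tag cloud. The collection, the field names, and the `attribute_overview` helper are all hypothetical and do not come from the presentation.

```python
from collections import Counter

# Hypothetical toy collection; the field names are illustrative only.
collection = [
    {"title": "Paper A", "year": 2010, "topic": "visualisation"},
    {"title": "Paper B", "year": 2011, "topic": "search"},
    {"title": "Paper C", "year": 2011, "topic": "visualisation"},
]


def attribute_overview(docs, attribute):
    """Count how often each value of `attribute` occurs in the collection."""
    return Counter(doc[attribute] for doc in docs if attribute in doc)


# Aggregate counts that an overview visualisation could be drawn from.
year_counts = attribute_overview(collection, "year")
topic_counts = attribute_overview(collection, "topic")
```

The aggregation is kept separate from any drawing code so that the same counts can back different overview styles.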
21. 21/48
Browsing focused
• Exploration starts at a specific point in the
collection from which the user navigates
through the collection
• Usually the same starting point is used every
time
26. 26/48
Design principles (1/2)
• For visual search interfaces
• Should be considered when designing the
interface
• Related studies:
– Chen and Yu, 2000
– Sebrechts et al., 1999
27. 27/48
Design principles (2/2)
1. Added value
2. Simplicity
3. Visual legibility
4. Use of colours
5. Dimension
6. Fixed spatial location
29. 29/48
Considered types of collections
• Every document in a collection is defined
according to a set of dimensions
• Dimensions are typically of different types
• Document = set of properties expressing
values of dimensions
• Dimensions are always present
• Examples
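The document model described on this slide can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Document` class, the dimension names (`authors`, `year`, `topic`), and the helpers `has_all_dimensions` and `group_by` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical dimensions for a collection of scientific papers.
DIMENSIONS = ["authors", "year", "topic"]


@dataclass
class Document:
    """A document as a set of properties, one value per dimension."""
    doc_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


def has_all_dimensions(doc: Document, dimensions: List[str]) -> bool:
    """Check the assumption that every dimension is present in the document."""
    return all(name in doc.properties for name in dimensions)


def group_by(docs: List[Document], dimension: str) -> Dict[Any, List[str]]:
    """Group document ids by their value(s) along one dimension."""
    groups: Dict[Any, List[str]] = {}
    for doc in docs:
        value = doc.properties[dimension]
        # Dimensions may have different types: treat lists as multi-valued.
        values = value if isinstance(value, (list, tuple, set)) else [value]
        for v in values:
            groups.setdefault(v, []).append(doc.doc_id)
    return groups


docs = [
    Document("d1", {"authors": ["A. Author", "B. Author"], "year": 2011, "topic": "search"}),
    Document("d2", {"authors": ["B. Author"], "year": 2012, "topic": "search"}),
]

by_author = group_by(docs, "authors")
by_year = group_by(docs, "year")
```

Grouping along one dimension at a time is one simple way to move between views of the same collection, for example from authors to years.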
39. 39/48
Limitations
• Not restricted in theory; in practice the
limitations might be:
– the size and resolution of the screen
– the limitations of human perception
41. 41/48
Conclusion (1/2)
• Motivation:
1. Provide better support for exploratory search
than current textual interfaces
2. Interface that is conceptually applicable in any
document collection regardless of its type
3. Provide an added value by assisting in the
discovery of interesting connections that would
otherwise remain hidden
42. 42/48
Conclusion (2/2)
• Results:
1. Support for comparing and contrasting content.
2. Support for exploration across dimensions.
3. Universal approach to the visualised dimensions.
44. 44/48
References (1/4)
• G. Marchionini. Exploratory search: from finding to understanding.
Communications of the ACM - Supporting exploratory search. 2006.
• D. Rose & D. Levinson. Understanding user goals in web search.
Proceedings of the 13th conference on World Wide Web. 2004.
• Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as
visual information retrieval interfaces. In MERIDA, INSCIT2006
CONFERENCE. 2006.
• Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong
Qian, Lei Shi, Li Tan, and Qiang Zhang. Tiara: a visual exploratory text
analytic system. In Proceedings of the 16th ACM SIGKDD international
conference on Knowledge discovery and data mining. 2010.
45. 45/48
References (2/4)
• Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau.
Visualizing digital library search results with categorical and hierarchical
axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000.
• Marti A. Hearst. TileBars: Visualization of Term Distribution Information in
Full Text Information Access. In the Proceedings of the ACM SIGCHI
Conference on Human Factors in Computing Systems. 1995.
• David Milne, Ian Witten. A link-based visual search engine for Wikipedia.
Proceedings of the 11th annual international ACM/IEEE joint conference on
Digital libraries. 2011.
• Simon Lehmann, Ulrich Schwanecke, and Rolf Dorner. Interactive
visualization for opportunistic exploration of large document collections.
Information Systems. 2010.
46. 46/48
References (3/4)
• Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos.
Apolo: making sense of large network data by combining rich user
interaction and machine learning. In Proceedings of the 2011 annual
conference on Human factors in computing systems. 2011.
• Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and
Werner Klieber. Evaluating a system for interactive exploration of large,
hierarchically structured document repositories. In Proceedings of the IEEE
Symposium on Information Visualization. 2004.
• Christian Hirsch, John Hosking, and John Grundy. Interactive visualization
tools for exploring the semantic graph of large knowledge spaces.
Interfaces. 2009.
47. 47/48
References (4/4)
• Chaomei Chen and Yue Yu. Empirical studies of information visualization: a
meta-analysis. Int. J. Hum.-Comput. Stud. 2000.
• Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis,
and Michael S. Miller. Visualization of search results: a comparative
evaluation of text, 2d, and 3d interfaces. In Proceedings of the 22nd
annual international ACM SIGIR conference on Research and development
in information retrieval. 1999.