Note: download the original PowerPoint to view animations. Presentation at the 4th Int. Alexandria Workshop (19-20 October 2017) - Foundations for Temporal Retrieval, Exploration and Analytics in Web Archives.
Chaos&Order: Using visualization as a means to explore large heritage collections
1. Chaos & Order
University of Oslo Library
Using visualization as a means to
explore large heritage collections
Hugo Huurdeman @timelessfuture
4. Stream 2: Physical Interaction
• Stream 1 & 3 build on top of existing work and infrastructure
• Approach Stream 2: experiment with novel ways of interaction in physical space
• with library’s book collections
• experiments with a touch table (Science Library)
• Includes an INF2260 project & INF Master project Yaron Okun
[Diagram: Visualization (1), Physical interaction (2), Visual navigation prototypes]
Picture: Marina Tofting
5. Visual Navigation Project
University of Oslo Library
bit.ly/VisualNavigationProject
in collaboration with the Department of Informatics
with support from the National Library of Norway
Start: Sept. 2016; duration: two years
7. One motivation: ‘underuse’ of
Web archives
• Web archives preserve the fast-changing Web and by now contain petabytes of valuable Web data
• This could be a valuable resource. However, archives have not frequently been used for research [DoughertyMeyer14], e.g. due to access issues.
• Presentation focus: using
visualization as a means to explore
large heritage collections
9. Inf. seeking process 2.1
• Information seeking as a process of construction
• E.g. [Kuhlthau91, Vakkari01]
[Figure: stages of the information seeking process]
• Stages: Initiation, Selection, Exploration, Formulation, Collection, Presentation
• Feelings: uncertainty, optimism, confusion / doubt, clarity, confidence / direction, (dis)satisfaction
• Thoughts: vague → focused
• Actions: seeking general information (exploring) → seeking pertinent information (documenting)
11. Re/search as a constructive
process 2.3
• Mapping Kendall’s (2012) Research Process Model
• to Kuhlthau’s ISP Model (1991) [Huurdeman17b]
12. Research as a constructive process 2.3
• Today: look at the initial (prefocus) phases
• How does one get curious, inspired, interested? What support for this phase currently exists?
14. Search User Interfaces 3.1
• SUIs may aid users to:
• express needs, formulate queries, gain understanding & track progress [Hearst09]
• Complexity of designing effective SUIs [Shneiderman05]
• Many proposed interactive features:
• search suggestions [Niu14], facets [Tunkelang09], item trays [Donato10], ...
[Examples pictured: Ahlberg&Shneiderman94, Google Wonder Wheel, ClusterMap, Epicurious, Donato10, Hearst&Degler13, Proulx et al., 2006]
15. Few features have made it to the general search engines, however.
Some turned up in specific contexts, e.g. online shopping, analytics.
16. Access to heritage collections
3.2
• Some developments have been incorporated in
systems to access cultural heritage collections
• Libraries, Museums, Archives
• Web archives
17. Web Archives 3.3
• Wayback Machine: URL as starting point
• Search Systems: Query as starting point
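A minimal sketch of the two entry points named on this slide, assuming a toy in-memory archive. The `Capture` record, the sample data and both function names are hypothetical illustrations, not the interface of any actual web archive system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    url: str        # original URL of the archived page
    timestamp: str  # capture time as YYYYMMDDhhmmss
    text: str       # extracted page text


# Hypothetical toy archive, for illustration only.
CAPTURES = [
    Capture("http://example.org/", "20100301120000", "weather forecast for the coming week"),
    Capture("http://example.org/", "20090115120000", "welcome to our new weather site"),
    Capture("http://example.net/history", "20120610090000", "history of the local library"),
]


def by_url(captures, url):
    """URL as starting point: all captures of one URL, oldest first."""
    return sorted((c for c in captures if c.url == url), key=lambda c: c.timestamp)


def by_query(captures, query):
    """Query as starting point: captures whose text contains every query term."""
    terms = query.lower().split()
    return [c for c in captures if all(term in c.text.lower() for term in terms)]
```

Either way, the user must already have something to type in: a known URL or a set of keywords.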
19. Assumptions of search
3.5
• Searching (even exploratory) assumes that you have an initial idea of what you would like to look for, however rough
image:Google
20. Web archive Access
Issues 3.7
• Problems* of
• scale (large size)
• dimensions (temporal and hierarchical)
• Hence, the data is too much and too complex for
regular URL browsing & basic searching (e.g. how
to convey all this in 10 blue links?)
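The temporal dimension can be made browsable by drilling down from year to month to day. The sketch below assumes captures are identified by 14-digit `YYYYMMDDhhmmss` timestamps, as is common in web archive replay tools; the function name `drill_down` and the sample data are hypothetical, and the hierarchical (site structure) dimension is left out.

```python
from collections import Counter

# Length of the timestamp prefix at the current level -> length at the next level.
# "" (all years) -> YYYY -> YYYYMM -> YYYYMMDD
NEXT_LEVEL = {0: 4, 4: 6, 6: 8}


def drill_down(timestamps, prefix=""):
    """Count captures per year, month or day below the given timestamp prefix."""
    if len(prefix) not in NEXT_LEVEL:
        raise ValueError("prefix must be empty, YYYY or YYYYMM")
    end = NEXT_LEVEL[len(prefix)]
    return Counter(ts[:end] for ts in timestamps if ts.startswith(prefix))


# Hypothetical capture timestamps, for illustration only.
SAMPLE_TIMESTAMPS = [
    "20090115120000",
    "20090120080000",
    "20090301000000",
    "20100101000000",
]
```

Calling `drill_down(SAMPLE_TIMESTAMPS)` gives counts per year, `drill_down(SAMPLE_TIMESTAMPS, "2009")` counts per month within 2009, and `drill_down(SAMPLE_TIMESTAMPS, "200901")` counts per day within January 2009. Such counts could feed a visual overview rather than a ranked list.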
21. Towards Visualization? 3.8
• Any kind of visual representation of information
designed to enable exploration, discovery,
communication, etc. (Cairo, 2016)
• Visualization - can be used throughout (re)search
process
• initial exploration, get a grasp (exploration)
• as an artefact of ongoing research (discovery)
• as an end product (science communication)
22. Guiding Questions
3.9
• Can we devise alternatives* to the Query and
URL approach for web archive access?
• To what extent can we provide more visual
approaches for browsing web archives?
[Ahlberg&Shneiderman94]
[Pejtersen89]
30. Change rate (type of site)
[Chart: changes per UNESCO category (all per-quarter harvests, n≈600, 2009-2015); categories shown: Meteorology, Law & government, History, Sports, Agriculture]
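The slide does not state how the change rate was computed. One plausible definition, sketched below as an assumption, is the share of consecutive quarterly harvests of a site whose content digest differs, aggregated per category. The tuple layout, the function name `change_rates` and the sample data are all hypothetical.

```python
from collections import defaultdict


def change_rates(harvests):
    """Return, per category, the fraction of consecutive harvest pairs that changed.

    harvests: iterable of (category, site, quarter, digest) tuples, where
    quarter is a sortable label such as "2009Q1" and digest is a content hash.
    """
    by_site = defaultdict(list)
    for category, site, quarter, digest in harvests:
        by_site[(category, site)].append((quarter, digest))

    changed = defaultdict(int)
    pairs = defaultdict(int)
    for (category, _site), captures in by_site.items():
        captures.sort()  # chronological order, since quarter labels sort correctly
        for (_, previous), (_, current) in zip(captures, captures[1:]):
            pairs[category] += 1
            changed[category] += previous != current

    return {category: changed[category] / pairs[category] for category in pairs}


# Hypothetical harvest records, for illustration only.
SAMPLE_HARVESTS = [
    ("Meteorology", "weather.example", "2009Q1", "a"),
    ("Meteorology", "weather.example", "2009Q2", "b"),
    ("Meteorology", "weather.example", "2009Q3", "c"),
    ("History", "history.example", "2009Q1", "x"),
    ("History", "history.example", "2009Q2", "x"),
    ("History", "history.example", "2009Q3", "y"),
]
```

With this toy data the meteorology site changes between every pair of harvests, while the history site changes in one pair out of two. Sites harvested only once contribute no pairs and are therefore left out of the result.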
36. CollectionXplorer
Characteristics
• Using D3.js as a basis
• “Playful”, short-form development
• Different visualizations as a ‘lens’ to the archive
• As a starting point to rethink web archive access
• How to induce interest, inspiration & curiosity in
the context of web archives?
44. CollectionXplorer - some characteristics
• “Playful” - engage potential users, encourage them to interact
• Easy to add new types of visualizations
• Various modalities to explore
• Initial testing on touch table (swipe!)
• Next steps: further explore dimensions of the archive
• Develop a “design language”
• Infrastructural demands, user testing. Evaluation.
46. Conclusion
• Looking at initial stages of the complex
(re)search process - open-ended browsing
• Exploring temporal and hierarchical dimensions
• Short-form prototypes - how to visualize web
archive content in “engaging” ways?
• …further infrastructure, dev and testing is needed
47. Closing off: conveying
complexity
• “I want [people] to use the visualizations I provide
as a starting point for their own explorations”
• They should expose “the complexity, the inner
contradictions, the manifold nature of the
underlying phenomenon.” (Moritz Stefaner)
In a web archive context, a simple results list
hides a lot of complexities…
48. References
• Ben-David A. & Huurdeman H. (2014). Web Archive Search as Research: Methodological and Theoretical
Implications. Alexandria Journal, Volume 25, No. 1 (2014)
• Brügger, N. (2013). Historical Network Analysis of the Web. Social Science Computer Review, 31(3), 306–321
• Dougherty, M., & Meyer, E. T. (2014). Community, tools, and practices in web archiving: The state-of-the-art in relation to social
science and humanities research needs. Journal of the Association for Information Science and Technology, 65(11), 2195–
2209. http://doi.org/10.1002/asi.23099
• Hearst, M. A. (2009). Search User Interfaces. Cambridge University Press.
• Huurdeman, H. C. (2017). Dynamic Support for the Complex Dynamics of the Information Seeking Process, PhD thesis
(expected 2017)
• Huurdeman, H. C. (2017). Dynamic Compositions: Recombining Search User Interface Features for Supporting Complex Work
Tasks. In SCST@ CHIIR (pp. 21–24).
• Huurdeman, H. C., Wilson, M. L., & Kamps, J. (2016). Active and Passive Utility of Search Interface Features in Different
Information Seeking Task Stages. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and
Retrieval (pp. 3–12). New York, NY, USA: ACM. https://doi.org/10.1145/2854946.2854957
• Huurdeman, Samar, Kamps, De Vries (2016). Towards Multidimensional Web Archive Access. Presented at IIPC conference ‘16
• Hugo C. Huurdeman and Jaap Kamps (2015). Supporting the Process: Adapting Search Systems to Search Stages. In: S.
Kurbanoğlu, S. Špiranec, J. Boustany, E. Grassian, D. Mizrachi, & L. Roy (Eds.), Information Literacy: Moving towards
sustainability, Communication in Computer and Information Science series (Vol. 552, pp. 394-404).
• Huurdeman, H. (2015). Towards Research Engines: Supporting Search Stages in Web archives. In Two-day conference at
Aarhus University, Denmark.
• Huurdeman, H., & Kamps, J. (2014). From Multistage Information-seeking Models to Multistage Search Systems. In
Proceedings of the 5th Information Interaction in Context Symposium (pp. 145–154). New York, NY, USA: ACM.
• C. C. Kuhlthau. Inside the search process: Information seeking from the user’s perspective. JASIS, 42:361–371, 1991.
• B. Shneiderman and C. Plaisant. Designing the user interface: strategies for effective human-computer interaction. Pearson
Education, 2005.
• P. Vakkari. A theory of the task-based information retrieval process: a summary and generalisation of a longitudinal study.
Journal of Documentation, 57:44–60, 2001.
49. Acknowledgements
• Thaer Samar & Jaap Kamps & Arjen & others in WebART
• NWO grant
• Colleagues at University of Oslo (Science Lib)
• NB grant
• René Voorburg & Kees Teszelszky at the KB
Editor's Notes
Several underlying reasons exist (incl. data and legal issues). Here, we focus on access.
More and more systems intending to support the process.
Kendall: 1 defining research problem, 2 reviewing literature, 3 hypothesis formulation, 4 research design, 5 collecting and analyzing data, 6 drawing conclusions & reporting findings
{visualization at different moments in the process}
visual information retrieval. trigger new questions. visualization as a product. information access; enhancing the possibilities.
Donato: “research session detector”
“data is too much and too complex for searching” *** PLUS data issues such as incompleteness ***
(How to induce interest, inspiration & curiosity in the context of web archives?)
put into visual diagram (Steps)
year
month
day
Suitability data, visualization & screen size
some visualizations don’t ‘fit’ the data and screen
“Provide users with a structured way to explore a complex phenomenon on their own terms, in a sensually rich mosaic of media and facts rather than a pre-digested narrative with a surprise at the end.” (as quoted by Cairo, 2016)