Chaos & Order: Using visualization as a means to explore large heritage collections

Chaos & Order
University of Oslo Library
Using visualization as a means to
explore large heritage collections
Hugo Huurdeman @timelessfuture
Visual Navigation Project
University of Oslo Library
bit.ly/VisualNavigationProject
https://www.youtube.com/watch?v=0Ojz0jO8Moc
Stream 2: Physical Interaction
• Streams 1 & 3 build on top of existing work and infrastructure
• Approach of Stream 2: experiment with novel ways of interaction in physical space, with the library’s book collections
• Experiments with a touch table (Science Library)
• Includes an INF2260 project & the INF Master project of Yaron Okun
[Diagram labels: Visualization (1), Physical interaction (2), Visual navigation prototypes]
Picture: Marina Tofting
Visual Navigation Project
• In collaboration with the Department of Informatics
• Supported by the National Library of Norway
• Start: Sept. 2016; duration: two years
1 Introduction
One motivation: ‘underuse’ of Web archives
• Web archives preserve the fast-changing Web and by now contain petabytes of valuable Web data
• This could be a valuable resource. However, archives have not frequently been used for research [DoughertyMeyer14], e.g. due to access issues.
• Presentation focus: using visualization as a means to explore large heritage collections
2 Theoretical framework
• Information seeking as a process of construction
• E.g. [Kuhlthau91, Vakkari01]
Information seeking process 2.1
[Figure: stages of the information seeking process, with the level of uncertainty over time]
• Stages: Initiation, Selection, Exploration, Formulation, Collection, Presentation
• Feelings: uncertainty, optimism, confusion / doubt, clarity, confidence / direction, (dis)satisfaction
• Thoughts: vague → focused
• Actions: seeking general information (exploring) → seeking pertinent information (documenting)
Stage-based search support
• [Huurdeman&Kamps14/15, HuurdemanWilson&Kamps16, Huurdeman17a/b]
Re/search as a constructive process 2.3
• Mapping Kendall’s (2012) Research Process Model to Kuhlthau’s ISP Model (1991) [Huurdeman17b]
• Today: look at the initial (prefocus) phases
• How does one get curious, inspired, interested? What support for this phase currently exists?
3. Exploratory Interfaces
[Ahlberg&Shneiderman94]
[Google Wonder Wheel]
[ClusterMap]
[Epicurious]
[Donato10]
[Hearst&Degler13]
[Proulx et al., 2006]
Search User Interfaces 3.1
• SUIs may help users to:
• express needs, formulate queries, gain understanding & track progress [Hearst09]
• Complexity of designing effective SUIs [Shneiderman05]
• Many proposed interactive features:
• search suggestions [Niu14], facets [Tunkelang09], item trays [Donato10], ..
• However, few features have made it to the general search engines
• Some turned up in specific contexts, e.g. online shopping, analytics
Access to heritage collections 3.2
• Some developments have been incorporated in
systems to access cultural heritage collections
• Libraries, Museums, Archives
• Web archives
Web Archives 3.3
• Wayback Machine: URL as starting point
• Search Systems: Query as starting point
Assumptions of Wayback Machine 3.4
• Assumption that you know what you are looking for…
Assumptions of search 3.5
• Searching (even exploratory) assumes that you have an initial idea of what you would like to look for, however rough
image: Google
Web archive Access Issues 3.7
• Problems* of
• scale (large size)
• dimensions (temporal and hierarchical)
• Hence, there is too much data, and it is too complex, for regular URL browsing & basic searching (e.g. how to convey all this in 10 blue links?)
Towards Visualization? 3.8
• Visualization: any kind of visual representation of information designed to enable exploration, discovery, communication, etc. (Cairo, 2016)
• Visualization can be used throughout the (re)search process:
• to get an initial grasp (exploration)
• as an artefact of ongoing research (discovery)
• as an end product (science communication)
Guiding Questions 3.9
• Can we devise alternatives* to the Query and
URL approach for web archive access?
• To what extent can we provide more visual
approaches for browsing web archives?
[Ahlberg&Shneiderman94]
[Pejtersen89]
4. Initial explorations
[Part presented as HuurdemanSamarEtAl16 (IIPC)]
National Library of the Netherlands (KB): Web archive since 2007
Statistics (2016):
• 10,000+ websites
• 35,000+ harvests
• 16+ terabytes
• Categorized using UNESCO classification
Picture: Flickr, koninklijkebibliotheek
Data: extraction and processing 4.1
• extracting all homepages + 1 level deep
• matching with seedlist
• adding KB metadata
• cleaning, processing, data enrichment (e.g. NER)
• generating visualizations
~900K XML files
thanks: Thaer Samar
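The first two pipeline steps (selecting homepages plus one level, then matching against the seed list) could be sketched roughly as follows. This is only an illustration, not the project's actual code: "1 level deep" is interpreted here as URL path depth, which is an assumption (the original may have followed links instead), and the function names `path_depth`, `normalize_host` and `select_pages` are hypothetical.

```python
from urllib.parse import urlsplit


def path_depth(url: str) -> int:
    """Number of non-empty path segments: 0 for a homepage, 1 for one level deep."""
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])


def normalize_host(url: str) -> str:
    """Lower-cased host without a leading 'www.', so seeds and captures compare equal."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def select_pages(captured_urls, seed_urls, max_depth=1):
    """Keep captures that belong to a seed site and sit at most max_depth levels deep.

    Assumption: depth is measured on the URL path, not by following links.
    """
    seeds = {normalize_host(seed) for seed in seed_urls}
    return [
        url
        for url in captured_urls
        if normalize_host(url) in seeds and path_depth(url) <= max_depth
    ]
```

For example, with a seed list containing only `http://eyefilm.nl`, a capture of `http://eyefilm.nl/collectie` would be kept, while a deeper page or a page from an unlisted host would be dropped.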
[Figure: levels of analysis (page element, web page, web site, web sphere) over time, 2010 to 2015; focus: eyefilm.nl] [Brügger] [Huurdeman15]
Example: eyefilm.nl (2010-2015)
[Chart legend: content, links, images, overall; annotations: redesign, redesign]
Example: escherinhetpaleis.nl (2010-2015)
[Chart legend: content, links, images, overall]
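The slides do not state how change between harvests was measured for content, links and images. One possible way, shown here purely as an assumption, is a Jaccard-based change percentage per aspect, with "overall" taken as the plain average of the three. The function names and the dictionary layout are hypothetical.

```python
def jaccard_change(old: set, new: set) -> float:
    """Percentage of items not shared by two harvests (0 = identical, 100 = disjoint)."""
    union = old | new
    if not union:
        # Two empty sets are treated as unchanged.
        return 0.0
    return 100.0 * (1 - len(old & new) / len(union))


def harvest_change(old: dict, new: dict) -> dict:
    """Compare two harvests of one site, each given as a dict with the keys
    'content' (e.g. words), 'links' and 'images' (e.g. URLs).

    Assumption: 'overall' is the unweighted mean of the three aspects.
    """
    aspects = ("content", "links", "images")
    change = {a: jaccard_change(set(old[a]), set(new[a])) for a in aspects}
    change["overall"] = sum(change.values()) / len(aspects)
    return change
```

A site redesign would typically show up in such a measure as a simultaneous jump in several aspects between two consecutive harvests.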
[Figure: levels of analysis (page element, web page, web site, web sphere) over time, 2010 to 2015; focus: UNESCO classifications]
Change rate (type of site)
Changes per UNESCO category (all per-quarter harvests, n=~600, 2009-2015)
Categories shown: Meteorology, Law & government, History, Sports, Agriculture
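Aggregating per-site change into a per-category figure could look like the sketch below. It assumes that each site already has a single change percentage and a UNESCO category label; the function name and the input format are hypothetical, and no actual values from the archive are reproduced here.

```python
from collections import defaultdict
from statistics import mean


def average_change_by_category(records):
    """Average change percentage per category, highest first.

    records: iterable of (category, change_percent) pairs, one per site.
    """
    grouped = defaultdict(list)
    for category, change in records:
        grouped[category].append(change)
    averages = {category: mean(values) for category, values in grouped.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

The sorted list of (category, average) pairs maps directly onto a bar chart with one bar per category.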
[Figure: levels of analysis (page element, web page, web site, web sphere) over time, 2010 to 2015; focus: nu.nl]
Exploring content (news)
[Figure panels: 2014, 2015; monthly, Jan’13 through Dec’13; Daily (2012)]
5. ‘CollectionXplorer’
CollectionXplorer characteristics
• Using D3.js as a basis
• “Playful”, short-form development
• Different visualizations as a ‘lens’ to the archive
• As a starting point to rethink web archive access
• How to induce interest, inspiration & curiosity in
the context of web archives?
Clusters
color: representations of websites, size: number of crawls
Word Clouds
size: number of sites
Bar Charts
color: UNESCO category, size: avg change %
Network (Force-directed)
connections: UNESCO category, size: number of crawls
Scatterplots
horizontal: category, vertical: user rating (books)
So, lots of opportunities
distinct properties of each type of visualization
CollectionXplorer - some characteristics
• “Playful”: engage potential users, encourage them to interact
• Easy to add new types of visualizations
• Various modalities to explore
• Initial testing on touch table (swipe!)
• Next steps: further explore dimensions of the archive
• Develop a “design language”
• Infrastructural demands, user testing, evaluation
7. Conclusion
Conclusion
• Looking at initial stages of the complex
(re)search process - open-ended browsing
• Exploring temporal and hierarchical dimensions
• Short-form prototypes - how to visualize web
archive content in “engaging” ways?
• …further infrastructure, development and testing are needed
Closing off: conveying complexity
• “I want [people] to use the visualizations I provide as a starting point for their own explorations”
• They should expose “the complexity, the inner contradictions, the manifold nature of the underlying phenomenon.” (Moritz Stefaner)
In a web archive context, a simple results list
hides a lot of complexities…
References
• Ben-David, A., & Huurdeman, H. (2014). Web Archive Search as Research: Methodological and Theoretical Implications. Alexandria Journal, 25(1).
• Brügger, N. (2013). Historical Network Analysis of the Web. Social Science Computer Review, 31(3), 306–321.
• Dougherty, M., & Meyer, E. T. (2014). Community, tools, and practices in web archiving: The state-of-the-art in relation to social science and humanities research needs. Journal of the Association for Information Science and Technology, 65(11), 2195–2209. http://doi.org/10.1002/asi.23099
• Hearst, M. A. (2009). Search User Interfaces. Cambridge University Press.
• Huurdeman, H. C. (2017). Dynamic Support for the Complex Dynamics of the Information Seeking Process. PhD thesis (exp. 2017).
• Huurdeman, H. C. (2017). Dynamic Compositions: Recombining Search User Interface Features for Supporting Complex Work Tasks. In SCST@CHIIR (pp. 21–24).
• Huurdeman, H. C., Wilson, M. L., & Kamps, J. (2016). Active and Passive Utility of Search Interface Features in Different Information Seeking Task Stages. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval (pp. 3–12). New York, NY, USA: ACM. https://doi.org/10.1145/2854946.2854957
• Huurdeman, Samar, Kamps, De Vries (2016). Towards Multidimensional Web Archive Access. Presented at IIPC conference ‘16.
• Huurdeman, H. C., & Kamps, J. (2015). Supporting the Process: Adapting Search Systems to Search Stages. In S. Kurbanoğlu, S. Špiranec, J. Boustany, E. Grassian, D. Mizrachi, & L. Roy (Eds.), Information Literacy: Moving towards sustainability, Communications in Computer and Information Science series (Vol. 552, pp. 394–404).
• Huurdeman, H. (2015). Towards Research Engines: Supporting Search Stages in Web archives. In Two-day conference at Aarhus University, Denmark.
• Huurdeman, H., & Kamps, J. (2014). From Multistage Information-seeking Models to Multistage Search Systems. In Proceedings of the 5th Information Interaction in Context Symposium (pp. 145–154). New York, NY, USA: ACM.
• Kuhlthau, C. C. (1991). Inside the search process: Information seeking from the user’s perspective. JASIS, 42, 361–371.
• Shneiderman, B., & Plaisant, C. (2005). Designing the user interface: strategies for effective human-computer interaction. Pearson Education.
• Vakkari, P. (2001). A theory of the task-based information retrieval process: a summary and generalisation of a longitudinal study. Journal of Documentation, 57, 44–60.
Acknowledgements
• Thaer Samar & Jaap Kamps & Arjen & others in WebART
• NWO grant
• Colleagues at University of Oslo (Science Lib)
• NB grant
• René Voorburg & Kees Teszelsky at the KB
1 of 50

Recommended

Towards Multidimensional Web Archive Access (IIPC 2016) by
Towards Multidimensional Web Archive Access (IIPC 2016)Towards Multidimensional Web Archive Access (IIPC 2016)
Towards Multidimensional Web Archive Access (IIPC 2016)TimelessFuture
992 views48 slides
Webmapping: maps for presentation, exploration & analysis by
Webmapping: maps for presentation, exploration & analysisWebmapping: maps for presentation, exploration & analysis
Webmapping: maps for presentation, exploration & analysisTimelessFuture
141 views75 slides
2014_WWW_BTOR by
2014_WWW_BTOR2014_WWW_BTOR
2014_WWW_BTORDongpo Deng
695 views71 slides
One day workshop Linked Data and Semantic Web by
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebVictor de Boer
452 views129 slides
Linked Data: principles and examples by
Linked Data: principles and examples Linked Data: principles and examples
Linked Data: principles and examples Victor de Boer
3.6K views81 slides
Planning for big data (lessons from cultural heritage) by
Planning for big data (lessons from cultural heritage)Planning for big data (lessons from cultural heritage)
Planning for big data (lessons from cultural heritage)Mia
5K views40 slides

More Related Content

What's hot

The importance of the Web for the Semantic Web by
The importance of the Web for the Semantic WebThe importance of the Web for the Semantic Web
The importance of the Web for the Semantic WebAlexandre Monnin
2.4K views37 slides
Nilges Making The Metadata Work NISO Virtual Conference Ebooks by
Nilges Making The Metadata Work NISO Virtual Conference EbooksNilges Making The Metadata Work NISO Virtual Conference Ebooks
Nilges Making The Metadata Work NISO Virtual Conference EbooksNational Information Standards Organization (NISO)
943 views31 slides
Introduction to information visualisation for humanities PhDs by
Introduction to information visualisation for humanities PhDsIntroduction to information visualisation for humanities PhDs
Introduction to information visualisation for humanities PhDsMia
5K views100 slides
Another history of the Web from its architecture by
Another history of the Web from its architectureAnother history of the Web from its architecture
Another history of the Web from its architectureAlexandre Monnin
4.6K views64 slides
New tasks, new roles: Libraries in the tension between Digital Humanities, Re... by
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...Stefan Schmunk
846 views23 slides
JCDL 2015 Tutorial Opening Slides by
JCDL 2015 Tutorial Opening SlidesJCDL 2015 Tutorial Opening Slides
JCDL 2015 Tutorial Opening SlidesRobert H. McDonald
1.6K views23 slides

What's hot(20)

The importance of the Web for the Semantic Web by Alexandre Monnin
The importance of the Web for the Semantic WebThe importance of the Web for the Semantic Web
The importance of the Web for the Semantic Web
Alexandre Monnin2.4K views
Introduction to information visualisation for humanities PhDs by Mia
Introduction to information visualisation for humanities PhDsIntroduction to information visualisation for humanities PhDs
Introduction to information visualisation for humanities PhDs
Mia 5K views
Another history of the Web from its architecture by Alexandre Monnin
Another history of the Web from its architectureAnother history of the Web from its architecture
Another history of the Web from its architecture
Alexandre Monnin4.6K views
New tasks, new roles: Libraries in the tension between Digital Humanities, Re... by Stefan Schmunk
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
Stefan Schmunk846 views
From Structured Data to Linked Open Governmental Data by Dongpo Deng
From Structured Data to Linked Open Governmental DataFrom Structured Data to Linked Open Governmental Data
From Structured Data to Linked Open Governmental Data
Dongpo Deng1K views
TourPack: Packaging and Disseminating Touristic Services with Linked Data and... by Anna Fensel
TourPack: Packaging and Disseminating Touristic Services with Linked Data and...TourPack: Packaging and Disseminating Touristic Services with Linked Data and...
TourPack: Packaging and Disseminating Touristic Services with Linked Data and...
Anna Fensel794 views
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras... by Robert H. McDonald
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...
Robert H. McDonald888 views
The CSO Open Data Experience by Dublinked .
The CSO Open Data ExperienceThe CSO Open Data Experience
The CSO Open Data Experience
Dublinked .673 views
Webs of People, Webs of Data by Simon Price
Webs of People, Webs of DataWebs of People, Webs of Data
Webs of People, Webs of Data
Simon Price125 views
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O... by Fabien Gandon
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
Fabien Gandon1.6K views
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat... by Robert H. McDonald
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...
INVENiT II project presentation by bucurcristina
INVENiT II project presentationINVENiT II project presentation
INVENiT II project presentation
bucurcristina599 views
Austrian Experience in Building Data Value Chain by Anna Fensel
Austrian Experience in Building Data Value ChainAustrian Experience in Building Data Value Chain
Austrian Experience in Building Data Value Chain
Anna Fensel1.9K views
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and... by DeVonne Parks, CEM
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
DeVonne Parks, CEM1.3K views
Online Marketing with Schema.org and Multi-channel Communication by Anna Fensel
Online Marketing with Schema.org and Multi-channel CommunicationOnline Marketing with Schema.org and Multi-channel Communication
Online Marketing with Schema.org and Multi-channel Communication
Anna Fensel313 views

Similar to Chaos&Order: Using visualization as a means to
 explore large heritage collections

When Search becomes Research and Research becomes Search by
When Search becomes Research and Research becomes SearchWhen Search becomes Research and Research becomes Search
When Search becomes Research and Research becomes SearchJaap Kamps
836 views75 slides
Interrogating the Politics and Performativity of Web Archiving by
Interrogating the Politics and Performativity of Web ArchivingInterrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web ArchivingJessica Ogden
2K views23 slides
Towards Research Engines: Supporting Search Stages in Web Archives (2015) by
Towards Research Engines: Supporting Search Stages in Web Archives (2015)Towards Research Engines: Supporting Search Stages in Web Archives (2015)
Towards Research Engines: Supporting Search Stages in Web Archives (2015)TimelessFuture
1.8K views35 slides
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora... by
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...TimelessFuture
330 views28 slides
Observing Web Archives: The Case for an Ethnographic Study of Web Archiving by
Observing Web Archives: The Case for an Ethnographic Study of Web ArchivingObserving Web Archives: The Case for an Ethnographic Study of Web Archiving
Observing Web Archives: The Case for an Ethnographic Study of Web ArchivingJessica Ogden
817 views21 slides

Similar to Chaos&Order: Using visualization as a means to
 explore large heritage collections(20)

When Search becomes Research and Research becomes Search by Jaap Kamps
When Search becomes Research and Research becomes SearchWhen Search becomes Research and Research becomes Search
When Search becomes Research and Research becomes Search
Jaap Kamps836 views
Interrogating the Politics and Performativity of Web Archiving by Jessica Ogden
Interrogating the Politics and Performativity of Web ArchivingInterrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web Archiving
Jessica Ogden2K views
Towards Research Engines: Supporting Search Stages in Web Archives (2015) by TimelessFuture
Towards Research Engines: Supporting Search Stages in Web Archives (2015)Towards Research Engines: Supporting Search Stages in Web Archives (2015)
Towards Research Engines: Supporting Search Stages in Web Archives (2015)
TimelessFuture1.8K views
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora... by TimelessFuture
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora...
TimelessFuture330 views
Observing Web Archives: The Case for an Ethnographic Study of Web Archiving by Jessica Ogden
Observing Web Archives: The Case for an Ethnographic Study of Web ArchivingObserving Web Archives: The Case for an Ethnographic Study of Web Archiving
Observing Web Archives: The Case for an Ethnographic Study of Web Archiving
Jessica Ogden817 views
Dh presentation helig 2014 by HELIGLIASA
Dh presentation helig 2014Dh presentation helig 2014
Dh presentation helig 2014
HELIGLIASA698 views
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums by Jon Voss
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
Jon Voss3.4K views
WebART: Facilitating Scholarly Use of Web Archives (IIPC, Apr. 2013) by TimelessFuture
WebART: Facilitating Scholarly Use of Web Archives (IIPC, Apr. 2013)WebART: Facilitating Scholarly Use of Web Archives (IIPC, Apr. 2013)
WebART: Facilitating Scholarly Use of Web Archives (IIPC, Apr. 2013)
TimelessFuture902 views
Cultural Objects in the Age of Digital Access by Francesco Spagnolo
Cultural Objects in the Age of Digital AccessCultural Objects in the Age of Digital Access
Cultural Objects in the Age of Digital Access
Francesco Spagnolo496 views
Europeana Research Panel DH Benelux 2017 by Europeana
Europeana Research Panel DH Benelux 2017Europeana Research Panel DH Benelux 2017
Europeana Research Panel DH Benelux 2017
Europeana203 views
IIIF as an Enabler to Interoperability within a Single Institution by IIIF_io
IIIF as an Enabler to Interoperability within a Single InstitutionIIIF as an Enabler to Interoperability within a Single Institution
IIIF as an Enabler to Interoperability within a Single Institution
IIIF_io359 views
Lorna hughes 12 05-2013 NeDiMAH and ontology for DH by lorna_hughes
Lorna hughes 12 05-2013 NeDiMAH and ontology for DHLorna hughes 12 05-2013 NeDiMAH and ontology for DH
Lorna hughes 12 05-2013 NeDiMAH and ontology for DH
lorna_hughes1.3K views

More from TimelessFuture

Experiential Interfaces: 

3D reconstructions as entry points for exploration... by
Experiential Interfaces: 

3D reconstructions as entry points for exploration...Experiential Interfaces: 

3D reconstructions as entry points for exploration...
Experiential Interfaces: 

3D reconstructions as entry points for exploration...TimelessFuture
99 views38 slides
Step inside the Image: 

Interpretative Interfaces for 
3D Historical Content by
Step inside the Image: 

Interpretative Interfaces for 
3D Historical ContentStep inside the Image: 

Interpretative Interfaces for 
3D Historical Content
Step inside the Image: 

Interpretative Interfaces for 
3D Historical ContentTimelessFuture
77 views66 slides
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info... by
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...TimelessFuture
408 views82 slides
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ... by
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...TimelessFuture
488 views22 slides
Visualization Lecture - Clariah Summer School 2018 by
Visualization Lecture - Clariah Summer School 2018Visualization Lecture - Clariah Summer School 2018
Visualization Lecture - Clariah Summer School 2018TimelessFuture
559 views102 slides
Outcomes Visual Navigation Project by
Outcomes Visual Navigation ProjectOutcomes Visual Navigation Project
Outcomes Visual Navigation ProjectTimelessFuture
548 views96 slides

More from TimelessFuture(19)

Experiential Interfaces: 

3D reconstructions as entry points for exploration... by TimelessFuture
Experiential Interfaces: 

3D reconstructions as entry points for exploration...Experiential Interfaces: 

3D reconstructions as entry points for exploration...
Experiential Interfaces: 

3D reconstructions as entry points for exploration...
TimelessFuture99 views
Step inside the Image: 

Interpretative Interfaces for 
3D Historical Content by TimelessFuture
Step inside the Image: 

Interpretative Interfaces for 
3D Historical ContentStep inside the Image: 

Interpretative Interfaces for 
3D Historical Content
Step inside the Image: 

Interpretative Interfaces for 
3D Historical Content
TimelessFuture77 views
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info... by TimelessFuture
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...
The Multi-Stage Experience: the Simulated Work Task Approach to Studying Info...
TimelessFuture408 views
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ... by TimelessFuture
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...
Op Ontdekkingsreis door het KB Webarchief - Exploratieve Visualisatie in een ...
TimelessFuture488 views
Visualization Lecture - Clariah Summer School 2018 by TimelessFuture
Visualization Lecture - Clariah Summer School 2018Visualization Lecture - Clariah Summer School 2018
Visualization Lecture - Clariah Summer School 2018
TimelessFuture559 views
Outcomes Visual Navigation Project by TimelessFuture
Outcomes Visual Navigation ProjectOutcomes Visual Navigation Project
Outcomes Visual Navigation Project
TimelessFuture548 views
KNVI 2017: De collectie in een ander licht - Creatieve inzet van nieuwe techn... by TimelessFuture
KNVI 2017: De collectie in een ander licht - Creatieve inzet van nieuwe techn...KNVI 2017: De collectie in een ander licht - Creatieve inzet van nieuwe techn...
KNVI 2017: De collectie in een ander licht - Creatieve inzet van nieuwe techn...
TimelessFuture1.6K views
Workshop: Inspirational Journeys - Challenges and Solutions for Visual Naviga... by TimelessFuture
Workshop: Inspirational Journeys - Challenges and Solutions for Visual Naviga...Workshop: Inspirational Journeys - Challenges and Solutions for Visual Naviga...
Workshop: Inspirational Journeys - Challenges and Solutions for Visual Naviga...
TimelessFuture417 views
“More than Meets the Eye” - Analyzing the Success of User Queries in Oria by TimelessFuture
“More than Meets the Eye” - Analyzing the Success of User Queries in Oria“More than Meets the Eye” - Analyzing the Success of User Queries in Oria
“More than Meets the Eye” - Analyzing the Success of User Queries in Oria
TimelessFuture1.5K views
Not available, or not found? Lessons from user queries in the Oria catalog at... by TimelessFuture
Not available, or not found? Lessons from user queries in the Oria catalog at...Not available, or not found? Lessons from user queries in the Oria catalog at...
Not available, or not found? Lessons from user queries in the Oria catalog at...
TimelessFuture767 views
Webarchief & Wetenschap (Dutch) by TimelessFuture
Webarchief & Wetenschap (Dutch)Webarchief & Wetenschap (Dutch)
Webarchief & Wetenschap (Dutch)
TimelessFuture444 views
From Exploration to Construction
 - How to Support the Complex Dynamics of In... by TimelessFuture
From Exploration to Construction
 - How to Support the Complex Dynamics of In...From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...
TimelessFuture390 views
Active & Passive Utility of Search Interface Features in different Informatio... by TimelessFuture
Active & Passive Utility of Search Interface Features in different Informatio...Active & Passive Utility of Search Interface Features in different Informatio...
Active & Passive Utility of Search Interface Features in different Informatio...
TimelessFuture784 views
Supporting the Process - Adapting Search Systems To Search Stages (ECIL15) by TimelessFuture
Supporting the Process - Adapting Search Systems To Search Stages (ECIL15)Supporting the Process - Adapting Search Systems To Search Stages (ECIL15)
Supporting the Process - Adapting Search Systems To Search Stages (ECIL15)
TimelessFuture737 views
The Value of Multistage Search Systems for Book Search by TimelessFuture
The Value of Multistage Search Systems for Book SearchThe Value of Multistage Search Systems for Book Search
The Value of Multistage Search Systems for Book Search
TimelessFuture433 views
WebART: hoe maak je webarchieven bruikbaar voor de wetenschap? (Dutch) by TimelessFuture
WebART: hoe maak je webarchieven bruikbaar voor de wetenschap? (Dutch)WebART: hoe maak je webarchieven bruikbaar voor de wetenschap? (Dutch)
WebART: hoe maak je webarchieven bruikbaar voor de wetenschap? (Dutch)
TimelessFuture2K views
Finding Pages on the Unarchived Web (DL 2014) by TimelessFuture
Finding Pages on the Unarchived Web (DL 2014)Finding Pages on the Unarchived Web (DL 2014)
Finding Pages on the Unarchived Web (DL 2014)
TimelessFuture2.9K views
From multistage information seeking models to multistage search systems (IIiX... by TimelessFuture
From multistage information seeking models to multistage search systems (IIiX...From multistage information seeking models to multistage search systems (IIiX...
From multistage information seeking models to multistage search systems (IIiX...
TimelessFuture931 views
WebART - "Data Digging" - eHumanities Group 2013 by TimelessFuture
WebART - "Data Digging" - eHumanities Group 2013WebART - "Data Digging" - eHumanities Group 2013
WebART - "Data Digging" - eHumanities Group 2013
TimelessFuture1.4K views


Chaos&Order: Using visualization as a means to
 explore large heritage collections

  • 1. Chaos & Order University of Oslo Library Using visualization as a means to explore large heritage collections Hugo Huurdeman @timelessfuture
  • 2. Visual Navigation Project University of Oslo Library bit.ly/VisualNavigationProject
  • 4. Stream 2: Physical Interaction • Stream 1 & 3 build on top of existing work and infrastructure • Approach Stream 2: experiment with novel ways of interaction in physical space • with library’s book collections • experiments with a touch table (Science Library) • Includes an INF2260 project & INF Master project Yaron Okun Physical interaction (2) Visualization (1) Visual navigation prototypes Picture: Marina Tofting
  • 5. Visual Navigation Project University of Oslo Library bit.ly/VisualNavigationProject in collaboration with Department of Informatics by support of the National Library of Norway start: Sept. 2016. duration two years
  • 7. One motivation: ‘underuse’ of Web archives • Web archives preserve the fast-changing Web. By now they contain petabytes of valuable Web data • This could be a valuable resource; however, archives have not frequently been used for research [DoughertyMeyer14], e.g. due to access issues. • Presentation focus: using visualization as a means to explore large heritage collections
  • 9. Inf. seeking process 2.1 • Information seeking as a process of construction • E.g. [Kuhlthau91, Vakkari01] • [Figure: stages Initiation, Selection, Exploration, Formulation, Collection, Presentation; uncertainty axis (+ to -); feelings: uncertainty, optimism, confusion/doubt, clarity, direction/confidence, (dis)satisfaction; thoughts: vague to focused; actions: seeking general information (exploring), then seeking pertinent information (documenting)]
  • 10. Stage-based search support • Stage-based support [Huurdeman&Kamps14/15,HuurdemanWilson&Kamps16, Huurdeman17a/b]
  • 11. Re/search as a constructive process 2.3 • Mapping Kendall’s (2012) Research Process Model • to Kuhlthau’s ISP Model (1991) [Huurdeman17b]
  • 12. • Today: look at the initial (prefocus) phases • How does one get curious, inspired, interested? What support for this phase currently exists? Research as a constructive process 2.3
  • 14. [Ahlberg&Shneiderman94] [Google Wonder Wheel] [ClusterMap] [Epicurious] [Donato10] [Hearst&Degler13] [Proulx et al., 2006] • SUIs may aid users to: • express needs, formulate queries, provide understanding & to track progress [Hearst09] • Complexity of designing effective SUIs [Shneiderman05] • Many proposed interactive features: • search suggestions [Niu14], facets [Tunkelang09], item trays [Donato10], .. Search User Interfaces 3.1
  • 15. Few features have made it to the general search engines, however. Some turned up in specific contexts, e.g. online shopping, analytics
  • 16. Access to heritage collections 3.2 • Some developments have been incorporated in systems to access cultural heritage collections • Libraries, Museums, Archives • Web archives
  • 17. Web Archives 3.3 • Wayback Machine: URL as starting point • Search Systems: Query as starting point
  • 18. Assumptions of Wayback Machine 3.4 • Assumption that you know what you are looking for… !!!
  • 19. Assumptions of search 3.5 • Searching (even exploratory) assumes that you have an initial idea of what you would like to look for, however rough [image: Google]
  • 20. Web archive Access Issues 3.7 • Problems* of • scale (large size) • dimensions (temporal and hierarchical) • Hence, the data is too much and too complex for regular URL browsing & basic searching (e.g. how to convey all this in 10 blue links?)
  • 21. Towards Visualization? 3.8 • Any kind of visual representation of information designed to enable exploration, discovery, communication, etc. (Cairo, 2016) • Visualization - can be used throughout (re)search process • initial exploration, get a grasp (exploration) • as an artefact of ongoing research (discovery) • as an end product (science communication)
  • 22. Guiding Questions 3.9 • Can we devise alternatives* to the Query and URL approach for web archive access? • To what extent can we provide more visual approaches for browsing web archives? [Ahlberg&Shneiderman94] [Pejtersen89]
  • 23. 4. Initial explorations [Part presented as HuurdemanSamarEtAl16 (IIPC)]
  • 24. Flickr: koninklijkebibliotheek Statistics (2016): •10,000+ websites •35,000+ harvests •16+ Terabyte •Categorized using UNESCO classification National Library of the Netherlands: Web archive since 2007
  • 25. Data: extraction and processing 4.1 • extracting all homepages + 1 level deep • matching with seedlist • adding KB metadata • cleaning, processing, data enrichment (e.g. NER) • generate visualizations • ~900K XML files • thanks: Thaer Samar
  • 26. Web sphere Page element Web site Web page 2010 2015 eyefilm.nl [Brügger] [Huurdeman15]
  • 27. Example: eyefilm.nl (2010-2015) [Figure: change over time in content, links, images and overall, with two redesigns marked]
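The slide above tracks change in content, links, images and overall for one site. The deck does not say how these figures were computed, so the following is only a minimal sketch under stated assumptions: text change is taken as one minus a `difflib` similarity ratio, link and image change as Jaccard distance between sets, and "overall" as their plain average. The snapshot data and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def text_change(old: str, new: str) -> float:
    """Fraction of text that differs (0.0 = identical, 1.0 = nothing shared)."""
    return 1.0 - SequenceMatcher(None, old, new).ratio()


def set_change(old: set, new: set) -> float:
    """Jaccard distance: share of items that were added or removed."""
    union = old | new
    if not union:
        return 0.0
    return 1.0 - len(old & new) / len(union)


def compare_snapshots(old: dict, new: dict) -> dict:
    """Compare two harvests of the same page, per element type."""
    result = {
        "content": text_change(old["text"], new["text"]),
        "links": set_change(old["links"], new["links"]),
        "images": set_change(old["images"], new["images"]),
    }
    # Assumption: "overall" is the unweighted mean of the three measures.
    result["overall"] = sum(result.values()) / len(result)
    return result


# Hypothetical snapshots, invented for illustration only.
harvest_2010 = {
    "text": "Film programme and exhibitions",
    "links": {"/programme", "/about"},
    "images": {"logo.png"},
}
harvest_2015 = {
    "text": "Film programme, exhibitions and collection",
    "links": {"/programme", "/collection"},
    "images": {"logo.png", "banner.jpg"},
}

changes = compare_snapshots(harvest_2010, harvest_2015)
for element, value in changes.items():
    print(f"{element}: {value:.0%}")
```

A site redesign would typically show up as a spike in all measures at once, whereas routine updates mostly move the content measure.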
  • 29. Web sphere Page element Web site Web page 2010 2015 unesco classifications
  • 30. Changerate (type of site) Changes per unesco category (all p/quarter harvests, n=~600, 2009-2015) Meteorology Law & government History Sports Agriculture
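Aggregating change per UNESCO category, as on the slide above, could look roughly like the sketch below. It assumes each harvest comparison has already been reduced to a (category, change percentage) pair; the numbers are invented and the function name is hypothetical, not taken from the project.

```python
from collections import defaultdict
from statistics import mean


def average_change_by_category(records):
    """Group (category, change %) pairs and average them per category."""
    grouped = defaultdict(list)
    for category, change_pct in records:
        grouped[category].append(change_pct)
    return {category: mean(values) for category, values in grouped.items()}


# Invented per-quarter change percentages, for illustration only.
records = [
    ("Meteorology", 80.0),
    ("Meteorology", 90.0),
    ("Sports", 40.0),
    ("Sports", 60.0),
    ("History", 10.0),
]

averages = average_change_by_category(records)
# Sort from most to least changing, e.g. as input for a bar chart.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
for category, avg in ranked:
    print(f"{category}: {avg:.1f}%")
```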
  • 31. Web sphere Page element Web site Web page 2010 2015 nu.nl
  • 33. [Figure: monthly timeline, Jan’13 to Dec’13]
  • 36. CollectionXplorer Characteristics • Using d3js as a basis • “Playful”, short-form development • Different visualizations as a ‘lens’ to the archive • As a starting point to rethink web archive access • How to induce interest, inspiration & curiosity in the context of web archives?
  • 37. Clusters color: representations of websites, size: number of crawls
  • 38. Clusters color: representations of websites, size: number of crawls
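When circle size encodes the number of crawls, a common choice is to scale the radius with the square root of the value, so that circle area rather than radius is proportional to the count (in D3 this is what `d3.scaleSqrt` is for). Whether CollectionXplorer does this is not stated in the slides; the sketch below only illustrates the idea, with hypothetical names and a made-up maximum radius.

```python
import math


def radius_for(crawls: int, max_crawls: int, max_radius: float = 40.0) -> float:
    """Radius such that circle area is proportional to the crawl count."""
    if max_crawls <= 0:
        raise ValueError("max_crawls must be positive")
    return max_radius * math.sqrt(crawls / max_crawls)


# A site with a quarter of the crawls gets half the radius,
# and therefore a quarter of the area.
print(radius_for(25, 100))   # 20.0
print(radius_for(100, 100))  # 40.0
```

Scaling the radius linearly instead would make frequently crawled sites look far more dominant than their crawl counts justify.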
  • 40. Bar Charts color: unesco category, size: avg change %
  • 41. Bar Charts color: unesco category, size: avg change %
  • 42. Network (Force-directed) connections: unesco category, size: number of crawls
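One way to read "connections: unesco category" is that each website is linked to a node for its category. The sketch below builds the nodes-and-links structure that force-directed layouts commonly take as input, under that assumption. The two hostnames appear elsewhere in the deck, but their category labels and crawl counts here are invented, and `build_graph` is a hypothetical helper.

```python
def build_graph(sites):
    """Build a node-link structure: one node per site and per category."""
    nodes = []
    links = []
    seen_categories = set()
    for site in sites:
        category = site["category"]
        if category not in seen_categories:
            seen_categories.add(category)
            nodes.append({"id": category, "type": "category"})
        nodes.append({"id": site["host"], "type": "site", "crawls": site["crawls"]})
        # Each site is connected to its category node.
        links.append({"source": site["host"], "target": category})
    return {"nodes": nodes, "links": links}


# Category assignments and crawl counts are made up for illustration.
sites = [
    {"host": "eyefilm.nl", "category": "Culture", "crawls": 12},
    {"host": "nu.nl", "category": "News", "crawls": 30},
    {"host": "example.nl", "category": "News", "crawls": 5},
]

graph = build_graph(sites)
print(len(graph["nodes"]), "nodes,", len(graph["links"]), "links")
```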
  • 43. Scatterplots horizontal: category, vertical: user rating (books) • So, lots of opportunities: each type of visualization has distinct properties
  • 44. CollectionXplorer - some characteristics • “Playful” - engage potential users, encourage them to interact • Easy to add new types of visualizations • Various modalities to explore • Initial testing on touch table (swipe!) • Next steps: further explore dimensions of the archive • Develop a “design language” • Infrastructural demands, user testing, evaluation.
  • 46. Conclusion • Looking at initial stages of the complex (re)search process - open-ended browsing • Exploring temporal and hierarchical dimensions • Short-form prototypes - how to visualize web archive content in “engaging” ways? • …further infrastructure, dev and testing is needed
  • 47. Closing off: conveying complexity • “I want [people] to use the visualizations I provide as a starting point for their own explorations” • They should expose “the complexity, the inner contradictions, the manifold nature of the underlying phenomenon.” (Moritz Stefaner) • In a web archive context, a simple results list hides a lot of complexities…
  • 48. References • Ben-David, A., & Huurdeman, H. (2014). Web Archive Search as Research: Methodological and Theoretical Implications. Alexandria Journal, 25(1). • Brügger, N. (2013). Historical Network Analysis of the Web. Social Science Computer Review, 31(3), 306–321. • Dougherty, M., & Meyer, E. T. (2014). Community, tools, and practices in web archiving: The state-of-the-art in relation to social science and humanities research needs. Journal of the Association for Information Science and Technology, 65(11), 2195–2209. http://doi.org/10.1002/asi.23099 • Hearst, M. A. (2009). Search User Interfaces. Cambridge University Press. • Huurdeman, H. C. (2017). Dynamic Support for the Complex Dynamics of the Information Seeking Process, PhD thesis (exp. 2017). • Huurdeman, H. C. (2017). Dynamic Compositions: Recombining Search User Interface Features for Supporting Complex Work Tasks. In SCST@CHIIR (pp. 21–24). • Huurdeman, H. C., Wilson, M. L., & Kamps, J. (2016). Active and Passive Utility of Search Interface Features in Different Information Seeking Task Stages. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval (pp. 3–12). New York, NY, USA: ACM. https://doi.org/10.1145/2854946.2854957 • Huurdeman, Samar, Kamps, De Vries (2016). Towards Multidimensional Web Archive Access. Presented at IIPC conference ‘16. • Huurdeman, H. C., & Kamps, J. (2015). Supporting the Process: Adapting Search Systems to Search Stages. In S. Kurbanoğlu, S. Špiranec, J. Boustany, E. Grassian, D. Mizrachi, & L. Roy (Eds.), Information Literacy: Moving towards sustainability, Communication in Computer and Information Science series (Vol. 552, pp. 394–404). • Huurdeman, H. (2015). Towards Research Engines: Supporting Search Stages in Web archives. In Two-day conference at Aarhus University, Denmark. • Huurdeman, H., & Kamps, J. (2014). From Multistage Information-seeking Models to Multistage Search Systems. In Proceedings of the 5th Information Interaction in Context Symposium (pp. 145–154). New York, NY, USA: ACM. • Kuhlthau, C. C. (1991). Inside the search process: Information seeking from the user’s perspective. JASIS, 42, 361–371. • Shneiderman, B., & Plaisant, C. (2005). Designing the user interface: strategies for effective human-computer interaction. Pearson Education. • Vakkari, P. (2001). A theory of the task-based information retrieval process: a summary and generalisation of a longitudinal study. Journal of Documentation, 57, 44–60.
  • 49. Acknowledgements • Thaer Samar & Jaap Kamps & Arjen & others in WebART • NWO grant • Colleagues at University of Oslo (Science Lib) • NB grant • René Voorburg & Kees Teszelsky at the KB
  • 50. Chaos & Order University of Oslo Library Using visualization as a means to explore large heritage collections Hugo Huurdeman @timelessfuture

Editor's Notes

  1. Several underlying reasons exist (incl. data and legal issues). Here, we focus on access.
  2. More and more systems intend to support the process. Kendall: 1 defining research problem, 2 reviewing literature, 3 hypothesis formulation, 4 research design, 5 collecting and analyzing data, 6 drawing conclusions & reporting findings
  3. {visualization at different moments in the process} visual information retrieval. trigger new questions. visualization as a product. information access; enhancing the possibilities.
  4. Donato: “research session detector”
  5. “data is too much and too complex for searching” *** PLUS data issues such as incompleteness ***
  6. (How to induce interest, inspiration & curiosity in the context of web archives?)
  7. put into visual diagram (Steps)
  8. year
  9. month
  10. day
  11. Suitability of data, visualization & screen size: some visualizations don’t ‘fit’ the data and screen
  12. “Provide users with a structured way to explore a complex phenomenon on their own terms, in a sensually rich mosaic of media and facts rather than a pre-digested narrative with a surprise at the end.” (as quoted by Cairo, 2016)