Web Archive Research Skills and Tools Survey (WARST)
1. Web Archive Research Skills and Tools Survey (WARST): preliminary report
WARCnet Meeting, Thursday 4 November 2021
warcnet.eu
2. Meet the Team
Sharon Healy: PhD Candidate & GOIPG IRC Scholar in Digital Humanities, Maynooth University; Project Lead
Nicola Bingham: Lead Curator Web Archives, British Library (editor)
Helena Byrne: Curator, Web Archives, British Library
Olga Holownia: Senior Program Officer, IIPC
Michael Kurzmeier: PhD Candidate & GOIPG IRC Scholar in Digital Humanities, Maynooth University; Chair of WG3
Jason Webber: Web Archiving Engagement Manager, British Library
Research Supervision
Dr Joseph Timoney: Head of Department of Computer Science, Maynooth University
Prof Jane Winters: Professor of Digital Humanities, School of Advanced Study, University of London; Co-PI of the researcher network WARCnet
03/11/2021
WARCnet.eu
3. Why WARST?
Web Archives Researcher Skills & Tools Survey (WARST): a collaborative research study within the WARCnet Working Group 3: Digital research methods and tools
Outcome of the WARCnet networking meetings: decision to focus on the role of skills and knowledge to conduct web archives research
4. Purpose of the Project
→ to identify and document the skills and knowledge required to achieve a range of different research goals within web archives research.
Goals:
• investigate skills that are useful or important for curation, and for conducting research with web archives
• develop a list of the tools that survey participants use for web archives curation and research
• explore the challenges for the curators and users of web archives
• foster discussion for a better understanding of the barriers to entry for researchers and curators
• provide insights, from a variety of perspectives, into potential solutions for the challenges with web archiving and the use of web archives
5. Web Archives Research Skills and Tools Analysis (WARSA)
Phase 1: WARST-Survey-P1, July – December 2021
▪ Web Archives - Researcher Skills & Tools Survey
▪ Led by Sharon Healy
▪ WARCnet meeting, 3-6 November 2021: Survey Report - Preliminary Results
Phase 2: Workshop (WARSA-P2), 2022
▪ Researcher ‘Persona’ Workshop, using the data from the survey to develop personas
▪ Led by Jason Webber
Phase 3: Interviews (WARSA-P3), 2022
▪ Web Archives - Researcher/User Interviews, using insights from survey data to inform an interview study (to explore further the challenges brought up in the survey, or further training requirements)
▪ Led by Jason Webber
Final Report
6. Methodology
Survey Design
▪ anonymous survey questionnaire
▪ 28 questions (Tick box, Multiple choice, Likert scale and Comment box answers)
Timeline
▪ Tested in mid-March 2021 by academic/non-academic/cultural heritage colleagues to ensure the questions were clearly understood.
▪ The final draft of the research project, including information about the project, informed consent, a copy of the survey questions, and a data management plan, was submitted to Maynooth University Research Ethics Committee for approval (SRESC-2021-2436150).
Target audience
▪ archivists, librarians, curators, information managers, scholars, researchers, students, historians etc.
▪ Participants were not asked for any personal data such as Name / Contact Email / Date of Birth etc. Participants were only asked for a Country of residence, Age range, Gender range, and Position. The survey was designed to be anonymous by default, and no IP addresses were collected.
7. Methodology
Survey Recruitment
▪ recruitment emails to network lists for archivists, librarians, curators, digital humanities, internet studies (e.g. WARCnet members, IIPC Web Curators and Members list, the Archives Unleashed community, UK Legal Deposit Libraries Group, Google Groups Digital Curation group, FLAC Dig Lib group)
▪ social media posts for participation on Facebook and Twitter
Survey responses
▪ The survey was open from 21 July to 23 September 2021
▪ 50 participant responses (6 surveys were removed from the survey dataset due to response inconsistencies, leaving a dataset of N=44)
This is perhaps not unusual: other studies have also come across anomalies arising from confusion over the term "web archive" (Costea, 2018; Riley and Crookston, 2015; Healy, 2021)
10. WARST Survey – Tools & Software
Quantitative Data
JISC Online Surveys platform tools are used for collecting the data, and for filtering and aggregating the quantitative data. Tools within Microsoft Excel are used for generating charts, and GIMP software is used to export Excel charts and graphs as PNG files.
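The project itself relies on JISC Online Surveys and Excel for this step. Purely as an illustration of what "filtering and aggregating" a tick-box question involves, the following is a minimal Python sketch. The sample answers, option names, and function names are hypothetical and are not WARST survey data or part of the authors' workflow.

```python
from collections import Counter

# Hypothetical answers to a tick-box question; each respondent may select
# several options. These values are illustrative, not WARST survey data.
responses = [
    ["WARC files", "Screenshots"],
    ["WARC files"],
    ["Screenshots", "PDFs"],
    [],  # respondent skipped the question
]


def tally_tick_box(answers):
    """Count how many respondents selected each option."""
    counts = Counter()
    for selected in answers:
        # set() guards against one respondent listing an option twice
        counts.update(set(selected))
    return counts


def as_percentages(counts, n_respondents):
    """Express each count as a percentage of all respondents (1 decimal place)."""
    return {option: round(100 * c / n_respondents, 1) for option, c in counts.items()}


counts = tally_tick_box(responses)
shares = as_percentages(counts, len(responses))
for option, count in counts.most_common():
    print(f"{option}: {count} ({shares[option]}%)")
```

Percentages here are taken over all respondents rather than over all ticks, which is the usual choice for multi-select questions because the shares can then legitimately sum to more than 100%.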
11. WARST Survey – Tools & Software
Qualitative Data
MAXQDA, a software package for computer-assisted qualitative data analysis (CAQDAS), is being used to code and analyse the qualitative data.
12. WARST Survey – Tools & Software
Results for Tools and Resources
We are using Zotero to develop a WARCnet ToyChest, containing all of the tools and resources that were mentioned by participants in the WARST survey.
We would also like to add the WG3 Tools and Methods Bibliography to the ToyChest.
13. WARST Survey – Qualitative Data Analysis
Dataset: N=44
Workflow (with an iterative feedback loop): Research Questions → Read and interpret the text → Code segments of text → Organise the codes into categories → Analyse → Present results
(A "We are HERE" marker on the slide indicates the current stage.)
Coding Qualitative Data?
“Coding is the process of organizing and sorting your data. Codes serve as a way to label, compile and organize your data. They also allow you to summarize and synthesize what is happening in your data. In linking data collection and interpreting the data, coding becomes the basis for developing the analysis. It is generally understood, then, that ‘coding is analysis’.” (Centre for Evaluation and Research, Tobacco Control Evaluation Centre, 2012)
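The "code segments of text" and "organise the codes into categories" steps can be pictured as a small data structure. The sketch below is only an illustration in Python under stated assumptions: the response texts, code names, category names, and the `CodedSegment` type are all hypothetical, and it does not reproduce how MAXQDA stores a project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSegment:
    """One passage of a response, labelled with a code (hypothetical structure)."""
    respondent_id: int
    text: str
    code: str


# Hypothetical coded segments; not taken from the WARST dataset.
segments = [
    CodedSegment(1, "I never learned how to read WARC files", "lack of training"),
    CodedSegment(2, "access is limited to the reading room", "restricted access"),
    CodedSegment(3, "there is no documentation for the tool", "lack of training"),
]

# Hypothetical mapping from codes to higher-level categories.
code_to_category = {
    "lack of training": "Skills",
    "restricted access": "Access",
}


def group_by_category(coded, mapping):
    """Organise coded segments under the category their code belongs to."""
    grouped = defaultdict(list)
    for segment in coded:
        # Codes without a category yet are kept visible rather than dropped.
        grouped[mapping.get(segment.code, "Uncategorised")].append(segment)
    return dict(grouped)


grouped = group_by_category(segments, code_to_category)
for category, items in grouped.items():
    print(f"{category}: {len(items)} segment(s)")
```

Keeping the code-to-category mapping separate from the segments reflects the iterative nature of the workflow: categories can be revised without recoding the text.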
20. MAXQDA Interface – In-vivo Coding
In-vivo coding
the term in-vivo also comes from grounded theory and means that words or terms used by the interviewees are so remarkable that they should be taken as codes. In-vivo coding adds these terms of the respondents as codes and codes the text passage at the same time.
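The two effects described above (the respondent's own term becomes the code, and the passage is coded in the same step) can be sketched in a few lines of Python. This is a conceptual illustration only: the function name, the dictionary-based code system, and the sample response are hypothetical and do not represent MAXQDA's API or internals.

```python
def code_in_vivo(response, term, code_system):
    """Use the respondent's own wording as the code name and code the passage.

    `code_system` maps each code name to a list of (start, end) character
    offsets. This structure is an assumption made for the example.
    """
    start = response.find(term)
    if start == -1:
        raise ValueError(f"term {term!r} not found in response")
    # One step does both jobs: create the code if it is new, and attach
    # the selected passage to it.
    code_system.setdefault(term, []).append((start, start + len(term)))
    return code_system


# Hypothetical survey response, not taken from the WARST dataset.
response = "The archive felt like a black box to me."
code_system = {}
code_in_vivo(response, "black box", code_system)

start, end = code_system["black box"][0]
print(f"code 'black box' -> {response[start:end]!r}")
```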
21. Example of WARST Survey Coding System
(Screenshot of the MAXQDA interface showing the Document Browser and Code System panels.)
22. Example of WARST Survey Coding System
(Screenshot of the MAXQDA interface showing the Document Browser and Code System panels.)
23. MAXQDA software (CAQDAS)
PROS
• Substantial manual for working through the interface. Good starting point for a novice, for coding interview transcriptions. Unclear yet how useful it would be for other media formats, but look forward to trying.
• MAXQDA automatically backs up your work and will overwrite the previous save. But you can also save your project manually with “Save Project As” and apply a date for each version saved.
• You can use MAXQDA as a tool for the analysis of things like YouTube, websites, forum discussions, images, and archived websites.
• 24 month license available for €72 in Ireland, which is affordable for students who receive some financial assistance towards MA or PhD costs.
CONS
• Steep learning curve for an individual with limited technical ability.
• Not as popular as ATLAS.ti or NVivo, thus may not be a sought-after tool to have in the skill-set for employment in the wider academic community, or private sector organisations.
• Students with no funding or minimal access to financial assistance may not be in a position to afford the license.
24. WARST Survey: Some Preliminary Findings
Participant responses for position (N=44), coded into 2 main categories
• Position (Theme): Library / Archive / Web Archive environment
  Description: This refers to a participant who identifies with working in a Library / Archive / Web Archive environment.
  Responses: 30
• Position (Theme): Scholar / Academic / Lecturer / Student / IT / Web Design
  Description: This refers to a participant who identifies themselves as a Scholar / Academic / Lecturer; Post-graduate / PhD student, or working in an IT or web design environment.
  Responses: 14
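As a small worked example, the two counts reported above (30 and 14, N=44) can be converted to shares of the dataset. The Python snippet below uses only those reported figures; the variable names are illustrative.

```python
# Counts as reported for the two position themes (N=44).
position_counts = {
    "Library / Archive / Web Archive environment": 30,
    "Scholar / Academic / Lecturer / Student / IT / Web Design": 14,
}

total = sum(position_counts.values())

# Share of each theme as a percentage of all responses, to 1 decimal place.
position_shares = {
    theme: round(100 * n / total, 1) for theme, n in position_counts.items()
}

for theme, share in position_shares.items():
    print(f"{theme}: {position_counts[theme]} of {total} ({share}%)")
```

On these figures, roughly two thirds of respondents (68.2%) identify with a library, archive, or web archive environment, and just under one third (31.8%) with the scholar, student, IT, or web design group.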
26. WARST Survey: Some Preliminary Findings
Representation of the types of data collected by respondents (N=44)
Five participants entered free text for the "other" option; the data types they listed are as follows:
• social media content gathered via APIs
• Software
• CDX index files
• derivative crawl reports
• Cascading Style Sheets
• .json output from APIs
• JavaScript
27. Thank you!
Go raibh míle maith agaibh go léir
(Irish for "thank you all very much")