What’s in a URL? Analysing COVID-19 web archive collections
November 2021, Aarhus
Karin de Wild, Friedel Geeraert
Methodology
• Datathon (2)
• Test hypotheses (WARCnet interviews)
• Test data quality
Online press
Data quality + scope
(Inter)national scope
Social media
“All initiatives except for the IIPC had a national focus.”
Hypothesis (from WARCnet interviews with Web archivists)
(Inter)national scope
(Inter)national scope of Covid-19 special collections
Countries represented in Covid-19 special collections.
Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021.
BnF : Bibliothèque nationale de France
BnL : Bibliothèque nationale du Luxembourg
IIPC : International Internet Preservation Consortium
KB : Koninklijke Bibliotheek, The Netherlands
NSL : National Széchényi Library, Hungary
RDL : Royal Danish Library
UKWA : UK Web archive
The (inter)national focus of Web archives differs.
Size Covid-19 special collections
Total unique domain names collected by each Web archive.
National websites in Covid-19 special collections
Percentage of national websites that are represented in Covid-19 special collections in each Web archive.
Note: A national website is a unique domain name with the corresponding country code top-level domain (ccTLD).
For example: of the 1,482 unique domain names in UKWA, 1,051 end with “.uk” (the country code for the United Kingdom).
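The share described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the dataset: the function name `national_share` and the sample domains are hypothetical, and only the UKWA figures (1,482 and 1,051) come from the slide.

```python
def national_share(domains, cctld):
    """Return the fraction of unique domain names ending in the given ccTLD."""
    # Normalise case and trailing dots so duplicates collapse to one entry.
    unique = {d.strip().lower().rstrip(".") for d in domains if d.strip()}
    if not unique:
        return 0.0
    suffix = "." + cctld.lower().lstrip(".")
    national = [d for d in unique if d.endswith(suffix)]
    return len(national) / len(unique)


# Hypothetical sample, for illustration only.
sample = ["www.webarchive.org.uk", "example.com", "bl.uk", "BL.uk"]
print(f"{national_share(sample, 'uk'):.1%}")

# The UKWA figures quoted on the slide: 1,051 of 1,482 unique domain names.
ukwa_share = 1051 / 1482
print(f"{ukwa_share:.1%}")  # → 70.9%
```

Counting unique names before computing the share mirrors the definition in the note, where a national website is a unique domain name rather than a single captured URL.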
National websites in Covid-19 special collections (relative)
National websites represented in each Web archive.
Most Web archives do have a national focus...
… The IIPC is the only collection without a national focus.
Data quality + scope
Critical reflections on the data quality to further improve the
dataset and refine (the scope of the) research questions.
www.webarchive.org.uk (ccTLD .uk = United Kingdom)
Incomplete data
Countries represented in the IIPC collection (using variable ‘Country’, derived from ccTLD).
Not all countries of publication can be derived from the ccTLD.
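Why the ccTLD does not always yield a country can be illustrated with a short sketch. It assumes a small hand-written lookup table (`CCTLD_TO_COUNTRY`, hypothetical and covering only the six national archives named in this deck) and a helper `country_from_url` that is not part of the dataset. Generic TLDs such as .com or .org return `None`.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: only the countries of the national archives in this deck.
CCTLD_TO_COUNTRY = {
    "fr": "France",
    "lu": "Luxembourg",
    "nl": "The Netherlands",
    "hu": "Hungary",
    "dk": "Denmark",
    "uk": "United Kingdom",
}


def country_from_url(url):
    """Derive a country from the last label of the host name, or None if unknown."""
    # urlsplit only fills in the host when a scheme or '//' is present.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    tld = host.rstrip(".").rsplit(".", 1)[-1]
    return CCTLD_TO_COUNTRY.get(tld)


# Hypothetical URLs, apart from the UK Web Archive address shown in the deck.
urls = ["www.webarchive.org.uk", "https://example.com/covid", "http://example.org"]
derived = [country_from_url(u) for u in urls]
missing = derived.count(None) / len(derived)
print(derived, f"{missing:.0%} without a country")
```

In this sketch the two generic-TLD URLs stay without a country, which is the kind of gap the ‘Country’ variable has whenever a site is not published under a country code.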
Size of national collections in our dataset (excl. IIPC collection).
6 of the 44 European countries are represented within our dataset
BnF (France) collected the most unique domain names.
Size of national collections in the IIPC collection.
The IIPC collection is also included in our dataset.
Do we have information about who contributed to this collection?
International scope of the dataset
“Those who do not see themselves reflected in national heritage are excluded from it.”
Stuart Hall
More data is needed to gain further insights into cultural bias.
“The BnL collections should hold many news articles.”
Hypothesis (from WARCnet interviews with Web archivists)
Online press
[Figure: Seed list of BnL: media versus other actor categories. Legend: News Media, International News Media, Media; Other categories. Values shown: 96.38% and 3.62%.]
Seed list of BnL: categories other than media
Twitter; Government; Non-profit, mutual aid; Blog; Commercial, Business; Facebook; Healthcare; Youtube, Vimeo; Public Institution; Local Government; Festival; Religion; Political Party; University; Livetickers; Education; Professional Chamber; Scientific Research; Foundation; Advisory commission; Foreign government; European Union; Podcast; Economic Interest Group; Ombudsman; Soundcloud; Wikipedia
Data quality + scope
Changes over time
Another limitation of the data is that there is no
variable ‘timestamp’, while both the collections and
the websites themselves change over time.
[Figure: Evolution of seeds added to the collection at the BnL, March to August; three unlabelled series.]
Lack of timestamp
“Social media were excluded from the IIPC collaborative
collection.”
Hypothesis (from WARCnet interviews with Web archivists)
Social media
[Figure: Social media in IIPC collaborative COVID-19 collection. Bar chart comparing Interview, Seed list and Archive-It for Youtube, Twitter and Facebook; data labels as extracted: 0, 166, 172; 0, 37, 44; 0, 26, 26.]
Different levels to analyse
Crawl scope
1 = Full seed host or directory
2 = Crawl one page only
3 = Seed page plus 1 click of all links on seed page
Number of seeds on Youtube, Twitter and Facebook.com domains
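A count like the one in the caption above could be produced along these lines. This is a sketch under assumptions: the seed records, the `CrawlScope` enum and the `platform_of` helper are hypothetical, and only the three scope codes and the three platform domains come from the slide.

```python
from collections import Counter
from enum import IntEnum
from urllib.parse import urlsplit


class CrawlScope(IntEnum):
    """Crawl scope codes as listed on the slide."""
    FULL_HOST_OR_DIRECTORY = 1
    ONE_PAGE_ONLY = 2
    PAGE_PLUS_ONE_CLICK = 3


PLATFORMS = ("youtube.com", "twitter.com", "facebook.com")


def platform_of(seed_url):
    """Return the platform domain a seed belongs to, or None for other hosts."""
    host = (urlsplit(seed_url).hostname or "").lower()
    for platform in PLATFORMS:
        if host == platform or host.endswith("." + platform):
            return platform
    return None


# Hypothetical seed records: (seed URL, crawl scope code).
seeds = [
    ("https://www.youtube.com/watch?v=abc", 2),
    ("https://twitter.com/example", 3),
    ("https://www.facebook.com/example", 1),
    ("https://www.youtube.com/channel/xyz", 3),
    ("https://example.org/covid", 1),
]

# Count seeds per (platform, crawl scope), skipping seeds on other hosts.
counts = Counter(
    (platform_of(url), CrawlScope(scope).name)
    for url, scope in seeds
    if platform_of(url) is not None
)
for (platform, scope_name), n in sorted(counts.items()):
    print(platform, scope_name, n)
```

Keeping the crawl scope next to each seed matters because a seed crawled as one page only captures far less than a seed crawled as a full host or directory, even though both count as a single seed.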
Data quality + scope
Crawl scope
The seed list is a starting point and does not offer an
exhaustive overview of the collection itself.
Next steps:
• Improve figures and tables
• WARCnet paper
Karin de Wild, Friedel Geeraert, Jane Winters, Nicola Bingham, Niels Brügger, Frédéric Clavert,
Sophie Gebeil, Federico Nanni, Caroline Nyvang, Valérie Schafer, Helle Strandgaard Jensen,
Katherina Schmid
Thank you!

Legislation And Regulations For Import, Manufacture,.pptx
Charmi13
 
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPEACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
Charmi13
 
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
vfuvxao
 
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
mamekyn
 
Genesis chapter 3 Isaiah Scudder.pptx
Genesis    chapter 3 Isaiah Scudder.pptxGenesis    chapter 3 Isaiah Scudder.pptx
Genesis chapter 3 Isaiah Scudder.pptx
FamilyWorshipCenterD
 
Cybersecurity Presentation PowerPoint!!!
Cybersecurity Presentation PowerPoint!!!Cybersecurity Presentation PowerPoint!!!
Cybersecurity Presentation PowerPoint!!!
arichardson21686
 
Proposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP IncProposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP Inc
Raheem Muhammad
 
Call Girls In Bangalore 7339748667 available hotel and home full enjoy
Call Girls In Bangalore 7339748667  available hotel and home full enjoyCall Girls In Bangalore 7339748667  available hotel and home full enjoy
Call Girls In Bangalore 7339748667 available hotel and home full enjoy
akbard9823
 
2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report
UkraineCrisisMediaCenter
 
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
gfysze
 
Bridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarshipBridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarship
Inesm9
 
Prsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptxPrsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptx
prafulpawar29
 
2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance
UkraineCrisisMediaCenter
 
Kalyan chart satta matka guessing result
Kalyan chart satta matka guessing resultKalyan chart satta matka guessing result
Kalyan chart satta matka guessing result
sanammadhu484
 
Public Art Is (Re)connection: people, heritage and spaces
Public Art Is (Re)connection: people, heritage and spacesPublic Art Is (Re)connection: people, heritage and spaces
Public Art Is (Re)connection: people, heritage and spaces
Marta Pucciarelli
 
2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report
UkraineCrisisMediaCenter
 
Data Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 PoznańData Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 Poznań
Norbert Orzechowicz
 

Recently uploaded (20)

Presentation agenda of three-day conference
Presentation agenda of three-day conferencePresentation agenda of three-day conference
Presentation agenda of three-day conference
 
一比一原版(vancouver学位证书)加拿大温哥华岛大学毕业证如何办理
一比一原版(vancouver学位证书)加拿大温哥华岛大学毕业证如何办理一比一原版(vancouver学位证书)加拿大温哥华岛大学毕业证如何办理
一比一原版(vancouver学位证书)加拿大温哥华岛大学毕业证如何办理
 
Gamify it until you make it Improving Agile Development and Operations with ...
Gamify it until you make it  Improving Agile Development and Operations with ...Gamify it until you make it  Improving Agile Development and Operations with ...
Gamify it until you make it Improving Agile Development and Operations with ...
 
Legislation And Regulations For Import, Manufacture,.pptx
Legislation And Regulations For Import, Manufacture,.pptxLegislation And Regulations For Import, Manufacture,.pptx
Legislation And Regulations For Import, Manufacture,.pptx
 
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPEACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
 
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
一比一原版多伦多都会大学毕业证(TMU毕业证书)学历如何办理
 
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
一比一原版昆士兰大学毕业证(UQ毕业证书)学历如何办理
 
Genesis chapter 3 Isaiah Scudder.pptx
Genesis    chapter 3 Isaiah Scudder.pptxGenesis    chapter 3 Isaiah Scudder.pptx
Genesis chapter 3 Isaiah Scudder.pptx
 
Cybersecurity Presentation PowerPoint!!!
Cybersecurity Presentation PowerPoint!!!Cybersecurity Presentation PowerPoint!!!
Cybersecurity Presentation PowerPoint!!!
 
Proposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP IncProposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP Inc
 
Call Girls In Bangalore 7339748667 available hotel and home full enjoy
Call Girls In Bangalore 7339748667  available hotel and home full enjoyCall Girls In Bangalore 7339748667  available hotel and home full enjoy
Call Girls In Bangalore 7339748667 available hotel and home full enjoy
 
2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report
 
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
一比一原版(unc毕业证书)美国北卡罗来纳大学教堂山分校毕业证如何办理
 
Bridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarshipBridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarship
 
Prsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptxPrsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptx
 
2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance
 
Kalyan chart satta matka guessing result
Kalyan chart satta matka guessing resultKalyan chart satta matka guessing result
Kalyan chart satta matka guessing result
 
Public Art Is (Re)connection: people, heritage and spaces
Public Art Is (Re)connection: people, heritage and spacesPublic Art Is (Re)connection: people, heritage and spaces
Public Art Is (Re)connection: people, heritage and spaces
 
2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report
 
Data Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 PoznańData Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 Poznań
 

What’s in a URL? Analysing COVID-19 web archive collections

  • 1. What’s in a URL? Analysing COVID-19 web archive collections November 2021, Aarhus Karin de Wild, Friedel Geeraert
  • 2. Methodology • Datathon (2) • Test hypotheses (WARCnet interviews) • Test data quality
  • 3. Online press Data quality + scope (Inter)national scope Social media
  • 4. “All initiatives except for the IIPC had a national focus.” Hypothesis (from WARCnet interviews with Web archivists) (Inter)national scope
  • 5. (Inter)national scope of Covid-19 special collections. Countries represented in Covid-19 special collections. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. BnF: Bibliothèque nationale de France; BnL: Bibliothèque nationale du Luxembourg; IIPC: International Internet Preservation Consortium; KB: Koninklijke Bibliotheek, The Netherlands; NSL: National Széchényi Library, Hungary; RDL: Royal Danish Library; UKWA: UK Web Archive.
  • 12. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. The (inter)national focus of Web archives differs. BnF: Bibliothèque nationale de France; BnL: Bibliothèque nationale du Luxembourg; IIPC: International Internet Preservation Consortium; KB: Koninklijke Bibliotheek, The Netherlands; NSL: National Széchényi Library, Hungary; RDL: Royal Danish Library; UKWA: UK Web Archive. (Inter)national scope of Covid-19 special collections. Countries represented in Covid-19 special collections.
  • 13. Size of Covid-19 special collections: total unique domain names collected by each Web archive. National websites in Covid-19 special collections: percentage of national websites that are represented in Covid-19 special collections in each Web archive. Note: A national website is a unique domain name with the corresponding country code top-level domain (ccTLD). For example, of the 1,482 unique domain names in UKWA, 1,051 end with “.uk” (the country code for the United Kingdom). Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021.
  • 14. National websites in Covid-19 special collections (relative): national websites represented in each Web archive. Size of Covid-19 special collections: total unique domain names collected by each Web archive. Most Web archives do have a national focus... Note: A national website is a unique domain name with the corresponding country code top-level domain (ccTLD). For example, of the 1,482 unique domain names in UKWA, 1,051 end with “.uk” (the country code for the United Kingdom). Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021.
  • 15. National websites in Covid-19 special collections (relative): national websites represented in each Web archive. Size of Covid-19 special collections: total unique domain names collected by each Web archive. Note: A national website is a unique domain name with the corresponding country code top-level domain (ccTLD). For example, of the 1,482 unique domain names in UKWA, 1,051 end with “.uk” (the country code for the United Kingdom). Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. … The IIPC is the only collection without a national focus.
  • 16. Data quality + scope Critical reflections on the data quality to further improve the dataset and refine (the scope of the) research questions.
  • 17. www.webarchive.org.uk (ccTLD .uk = United Kingdom)
  • 18. Incomplete data. Countries represented in the IIPC collection (using the variable ‘Country’, derived from the ccTLD). Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021.
  • 19. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. Incomplete data. Countries represented in the IIPC collection (using the variable ‘Country’, derived from the ccTLD).
  • 20. Not all countries of publication can be derived from the ccTLD. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. Incomplete data. Countries represented in the IIPC collection (using the variable ‘Country’, derived from the ccTLD).
  • 21. Incomplete data. Size of national collections in our dataset (excl. IIPC collection). Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021.
  • 22. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. 6 of the 44 European countries are represented within our dataset. Incomplete data. Size of national collections in our dataset (excl. IIPC collection).
  • 23. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. BnF (France) collected the most unique domain names. Incomplete data. Size of national collections in our dataset (excl. IIPC collection).
  • 24. Incomplete data. Size of national collections in the IIPC collection. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. The IIPC is also included in our dataset.
  • 25. Incomplete data. Size of national collections in the IIPC collection. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. Do we have information about who contributed to this collection?
  • 26. International scope of the dataset. Source: Susan Aasman, Nicola Bingham, Niels Brügger, Karin de Wild, Sophie Gebeil and Valérie Schafer, "Dataset COVID-19 Special Collections", 2021. “Those who do not see themselves reflected in national heritage are excluded from it.” (Stuart Hall) More data is needed to gain further insight into cultural bias.
  • 27. “The BnL collections should hold many news articles.” Hypothesis (from WARCnet interviews with Web archivists) Online press
  • 28. Seed list of BnL: media versus other actor categories. [Chart: two shares, 96.38% and 3.62%; legend: News Media, International News Media, Media; Other categories.]
  • 29. Seed list of BnL: categories other than media Twitter Government Non-profit, mutual aid Blog Commercial, Business Facebook Healthcare Youtube, Vimeo Public Institution Local Government Festival Religion Political Party University Livetickers Education Professional Chamber Scientific Research Foundation Advisory commission Foreign government European Union Podcast Economic Interest Group Ombudsman Soundcloud Wikipedia
  • 30. Data quality + scope Changes over time Another limitation of the data is that there is no variable ‘timestamp’, while both the collections and the websites themselves change over time.
  • 31. Lack of timestamp. [Chart: Evolution of seeds added to the collection at the BnL, March to August; vertical axis 0 to 7; series labelled Series1, Series2, Series3.]
  • 32. “Social media were excluded from the IIPC collaborative collection.” Hypothesis (from WARCnet interviews with Web archivists) Social media
  • 33. Social media in IIPC collaborative COVID-19 collection. Different levels to analyse. [Chart: Number of seeds on Youtube, Twitter and Facebook.com domains, compared across Interview, Seed list and Archive-It; data labels as extracted: 0, 166, 172; 0, 37, 44; 0, 26, 26; axis 0 to 200.] Crawl scope: 1 = Full seed host or directory; 2 = Crawl one page only; 3 = Seed page plus 1 click of all links on seed page.
  • 34. Data quality + scope Crawl scope The seed list is a starting point and does not offer an exhaustive overview of the collection itself.
  • 35. Next steps: • Improve figures and tables • WARCnet paper
  • 36. Karin de Wild, Friedel Geeraert, Jane Winters, Nicola Bingham, Niels Brügger, Frédéric Clavert, Sophie Gebeil, Federico Nanni, Caroline Nyvang, Valérie Schafer, Helle Strandgaard Jensen, Katherina Schmid Thank you!
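
The ccTLD-based measure described in slides 13 to 20 can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the dataset: the function names, the small `CCTLD_TO_COUNTRY` table and the sample URLs are assumptions made for this example, and full hostnames stand in for "unique domain names". It also shows why the country of publication cannot always be derived: a generic TLD such as ".com" maps to no country.

```python
from urllib.parse import urlsplit

# Hypothetical, partial mapping for illustration; a real analysis would
# need the complete list of country code top-level domains.
CCTLD_TO_COUNTRY = {
    "uk": "United Kingdom",
    "fr": "France",
    "lu": "Luxembourg",
    "nl": "Netherlands",
    "hu": "Hungary",
    "dk": "Denmark",
}


def hostname(url):
    """Return the lower-cased host of a URL, tolerating a missing scheme."""
    if "://" not in url and not url.startswith("//"):
        url = "//" + url  # lets urlsplit treat the first part as the host
    return (urlsplit(url).hostname or "").rstrip(".")


def country_from_cctld(url):
    """Return the country for the URL's ccTLD, or None if it has none."""
    tld = hostname(url).rsplit(".", 1)[-1]
    return CCTLD_TO_COUNTRY.get(tld)


def national_share(urls, cctld):
    """Percentage of unique hosts that end with the given ccTLD."""
    unique_hosts = {hostname(u) for u in urls}
    unique_hosts.discard("")  # drop entries with no recognisable host
    if not unique_hosts:
        return 0.0
    national = sum(1 for h in unique_hosts if h.endswith("." + cctld))
    return 100.0 * national / len(unique_hosts)


# Made-up sample seeds, not taken from the dataset.
sample = [
    "https://www.webarchive.org.uk/en/ukwa/",
    "www.webarchive.org.uk",
    "https://example.com",
    "http://www.bnf.fr/",
]
print(country_from_cctld(sample[0]))   # United Kingdom
print(country_from_cctld(sample[2]))   # None: ".com" carries no country
print(round(national_share(sample, "uk"), 1))

# The UKWA figures quoted on slide 13: 1,051 of 1,482 unique domain names.
print(round(100 * 1051 / 1482, 1))     # 70.9
```

Deduplicating on the hostname before counting mirrors the slides' use of unique domain names, so a site that appears in the seed list several times is counted once.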