1. Digging into national web domains across web archives:
how far have we got?
Yves Maurer, Niels Brügger, Anne Helmond
2. How much of a nation’s ccTLD is found in publicly available
web archives or web collections?
Niels: looking at hosts (domain names) and ccTLDs
Anne: looking at HTTP & HTTPS, .js (JavaScript), and multimedia
3. Number of hosts, year by year
(One ccTLD at a time: the case of .frl)
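A minimal sketch of how hosts could be counted year by year for a single ccTLD. The input format (pairs of capture year and URL) and the function name `hosts_per_year` are assumptions for illustration, not the format of any particular web collection; the example records are made up.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def hosts_per_year(records, cctld):
    """Count distinct hosts under one ccTLD for each capture year.

    records: iterable of (year, url) pairs (hypothetical input format).
    cctld:   the ccTLD to keep, with or without a leading dot, e.g. "frl".
    """
    suffix = "." + cctld.lstrip(".").lower()
    hosts = defaultdict(set)
    for year, url in records:
        host = urlsplit(url).hostname  # lowercased; None if the URL has no host
        if host and host.endswith(suffix):
            hosts[year].add(host)
    # Several captures of the same host in one year count once.
    return {year: len(names) for year, names in sorted(hosts.items())}


# Made-up records, for illustration only.
records = [
    (2015, "http://example.frl/"),
    (2015, "http://example.frl/about"),
    (2016, "https://example.frl/"),
    (2016, "https://shop.example.frl/"),
    (2016, "https://example.nl/"),
]
print(hosts_per_year(records, "frl"))  # {2015: 1, 2016: 2}
```

Subdomains are counted as separate hosts here; grouping them under their registered domain would be a different design choice.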
8. How many of the same hosts can one expect to find throughout the years under study in
the same web collection? Illustrated with .frl in the Common Crawl collection
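One way to put the question into numbers, sketched under assumptions: given a set of hosts per year, intersect the sets to find hosts present in every year, and track what share of the first year's hosts is still found later. The function names and the host names below are hypothetical and not taken from the collection.

```python
def persistent_hosts(hosts_by_year):
    """Return the hosts that appear in every year under study."""
    year_sets = [set(hosts) for hosts in hosts_by_year.values()]
    return set.intersection(*year_sets) if year_sets else set()


def survival_from_first_year(hosts_by_year):
    """For each year, the share of first-year hosts still present."""
    years = sorted(hosts_by_year)
    if not years:
        return {}
    base = set(hosts_by_year[years[0]])
    return {
        year: (len(base & set(hosts_by_year[year])) / len(base)) if base else 0.0
        for year in years
    }


# Made-up host sets, for illustration only.
hosts_by_year = {
    2015: {"a.frl", "b.frl", "c.frl", "d.frl"},
    2016: {"a.frl", "b.frl", "e.frl"},
    2017: {"a.frl", "e.frl"},
}
print(persistent_hosts(hosts_by_year))          # {'a.frl'}
print(survival_from_first_year(hosts_by_year))  # {2015: 1.0, 2016: 0.5, 2017: 0.25}
```

A host missing from one year may simply not have been crawled that year, so low overlap reflects the collection as much as the national web itself.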
9. Diagnosing a national web using MIME types
● HTTP & HTTPS (Hypertext Transfer Protocol Secure) → indicative of the
increasing security of a national web
○ Authentication of the accessed website (is the owner who they claim to be?)
○ Protection of the privacy and integrity of the exchanged data (secure data transfer)
● .js (JavaScript) → indicative of the increasing ‘interactivity’ and first-party
tracking of a national web
○ The included (first-party) scripts in a website
● Multimedia → indicative of the ‘multimodality’ of a website
○ Images, audio, video, PDF
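The three indicators above could be computed along the following lines. This is a sketch under assumptions: the input is a list of (URL, MIME type) pairs, the set of JavaScript MIME types is a hand-picked selection, and the names `classify` and `diagnose` are hypothetical. It does not distinguish first-party from third-party scripts.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumed selection of MIME types that servers use for JavaScript.
JS_TYPES = {"application/javascript", "text/javascript", "application/x-javascript"}


def classify(mime):
    """Map a MIME type to 'javascript', 'multimedia', or 'other'."""
    mime = mime.split(";")[0].strip().lower()  # drop parameters such as charset
    if mime in JS_TYPES:
        return "javascript"
    if mime.startswith(("image/", "audio/", "video/")) or mime == "application/pdf":
        return "multimedia"
    return "other"


def diagnose(captures):
    """Summarise HTTPS share and content kinds for (url, mime) captures."""
    schemes = Counter()
    kinds = Counter()
    for url, mime in captures:
        schemes[urlsplit(url).scheme] += 1
        kinds[classify(mime)] += 1
    web = schemes["http"] + schemes["https"]
    return {
        "https_share": schemes["https"] / web if web else 0.0,
        "kinds": dict(kinds),
    }


# Made-up captures, for illustration only.
captures = [
    ("http://example.frl/", "text/html; charset=utf-8"),
    ("https://example.frl/app.js", "application/javascript"),
    ("https://example.frl/logo.png", "image/png"),
    ("https://example.frl/report.pdf", "application/pdf"),
]
print(diagnose(captures))
# {'https_share': 0.75, 'kinds': {'other': 1, 'javascript': 1, 'multimedia': 2}}
```

Running this per year would show whether the HTTPS share, the amount of JavaScript, and the multimedia content of a national web grow over time.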