Presented at the International Conference on Web Archives and Electronic Legal Deposit (Jornada Internacional sobre Archivos Web y Depósito Legal Electrónico), held at the Biblioteca Nacional de España (BNE) on 9 July 2013.
2. www.bl.uk 2
Web Archiving Timeline
• 2001-2002: Explore
– Launch Domain.UK project
– No public access
• 2003-2008: Collaborate
– Establish Web Archiving Programme
– Lead UK Web Archiving Consortium
– Launch UK Web Archive
• 2008-2011: Build capacity
– People, systems and processes
– Curatorial expertise
– Technical know-how
• From 2011: BAU
– Web Archiving as operational unit
– Implement non-print Legal Deposit since April 2013
3. www.bl.uk 3
Before (6 April 2013)
• Selective archiving of websites that
– reflect the diversity of lives, interests and activities throughout the UK
– contain research value or are of research interest
– feature political, cultural, social and economic events of national interest
– demonstrate innovative use of the web
– Also prioritise websites at risk and web-only content
• Permission based
– Permission to archive, to provide online access and to preserve. Also ask for 3rd party rights clearance
– 30% success rate, 5% explicit refusal (mostly due to 3rd party rights)
• Online access through UK Web Archive
4. www.bl.uk 4
Toolset
• Selection and Permission Tool
– selection and permission management
– Integrated with the Web Curator Tool
• Web Curator Tool
– Job scheduling
– Metadata
– Access control
– Harvesting (uses Heritrix)
– QA
• Indexing and SIP generation – scripts and SOLR (for full-text index)
• Wayback – rendering tool for WARCs
• UK Web Archive – web-based end user interface
5. www.bl.uk 5
Access
• Currently 3 ways to access the web archive
– Online through the UK Web Archive
– Catalogue records (of special collections)
– Keyword search through Primo (corporate resource discovery system)
• Conduct researcher survey / research projects to understand requirements
8. www.bl.uk 8
UK Web Archive
• 14,118 websites, 60,482 instances, 17.6TB WARCs
• Over 182,761 unique visits, 1st April ‘12 – 31st March ‘13
• Key websites include videos
• Full-text, N-gram, title and URL search
• Browse by subject / special collection, visual browsing
• Analytical access
http://www.webarchive.org.uk
9. www.bl.uk 9
Analytical access
• Shift of focus from the level of single webpages or websites to the entire web archive collection
• Use web archives as datasets, with access to metadata and knowledge about websites
• Support survey, annotation, contextualisation and visualisation
• Allows discovery of patterns, trends and relationships in inter-linked web pages
• Helps address a number of challenging issues
– Scalability
– Accessibility of individual websites
– Components missed by crawlers
10. www.bl.uk 10
After (6 April 2013)
• Government introduced Non-print Legal Deposit Regulations 2013
• Apply to material published digitally and online, including articles, books and websites
• 6 UK Legal Deposit Libraries
• Deposited content accessible “on library premises controlled by the deposit library”
– after 7 days of collection or deposit
– Single concurrent access
– Catalogue records allowed to be searchable online
– Digital copying not permitted
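The access conditions above can be read as a small set of checks. The following is only an illustrative sketch, not how the Legal Deposit Libraries implement access: the class and function names are hypothetical, and it assumes that "after 7 days" means an item opens on the seventh day after collection.

```python
from datetime import date, timedelta

EMBARGO_DAYS = 7  # from the slide: "after 7 days of collection or deposit"


def available_from(collected_on: date) -> date:
    """First day an item may be shown (assumption: embargo is inclusive of day 7)."""
    return collected_on + timedelta(days=EMBARGO_DAYS)


class DepositedItem:
    """Hypothetical model of one deposited item with single concurrent access."""

    def __init__(self, collected_on: date):
        self.collected_on = collected_on
        self.current_reader = None  # at most one reader at a time

    def open(self, reader: str, today: date, on_premises: bool) -> bool:
        # Access is limited to library premises controlled by a deposit library.
        if not on_premises:
            return False
        # The embargo period must have passed.
        if today < available_from(self.collected_on):
            return False
        # Single concurrent access: refuse a second, different reader.
        if self.current_reader is not None and self.current_reader != reader:
            return False
        self.current_reader = reader
        return True

    def close(self, reader: str) -> None:
        # Only the reader holding the item can release it.
        if self.current_reader == reader:
            self.current_reader = None
```

For example, an item collected on 8 April 2013 would be refused on 14 April and opened on 15 April, and a second reader would be refused until the first one closes it.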
11. www.bl.uk 11
Legal Deposit of UK websites
• In scope
– Sites that use a .uk or other UK geographic top-level domain
– Sites where part of the publishing process takes place in the UK
• Will not archive
– sites concerning film and recorded sound where the audio-visual content predominates
– private intranets and emails
• Over 10 million .uk registered domains
– 4th largest TLD after .com, .de and .net
– UK organisations also use non .uk domain names (e.g. .com or .org); scale unknown
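A rough sketch of how the two in-scope rules above might be applied to a list of URLs. This is an assumption-laden illustration rather than the British Library's crawler configuration: `UK_TLDS` and `UK_PUBLISHED_HOSTS` are hypothetical names, and the second set stands in for whatever manually maintained evidence shows that a non-.uk site is published in the UK. The exclusions (audio-visual sites, intranets, email) cannot be decided from a URL alone and are left out.

```python
from urllib.parse import urlsplit

# Assumption: only .uk is listed here; other UK geographic TLDs would be added.
UK_TLDS = {"uk"}

# Hypothetical curated list of non-.uk hosts known to be published in the UK.
UK_PUBLISHED_HOSTS = {"example.com"}


def in_scope(url: str) -> bool:
    """Return True if the URL's host matches either in-scope rule."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False  # no scheme or no host: cannot be classified
    # Rule 1: the site uses a UK geographic top-level domain.
    if host.rsplit(".", 1)[-1] in UK_TLDS:
        return True
    # Rule 2: the host (or a parent domain) is on the UK-published list.
    return any(host == h or host.endswith("." + h) for h in UK_PUBLISHED_HOSTS)


seeds = ["http://www.bl.uk/", "https://news.example.com/a", "http://example.org/"]
selected = [u for u in seeds if in_scope(u)]
```

The second rule is the hard part in practice, since, as the slide notes, the scale of UK publishing on non-.uk domains is unknown.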
12. www.bl.uk 12
Collecting strategy
[Diagram: Domain Crawl, with Key sites, Events, News and Special collections within it]
• Domain crawl:
– Broad sweep of UK domain
– Once or twice a year
• Events & key sites and news:
– Events of UK interest
– High value, high impact sites
– National & regional news
• Special Collection:
– Focused, thematic collections
– Support priority subjects
13. www.bl.uk 13
Access strategy
• Deposited content cannot be accessed outside the reading rooms.
• Online access can be provided to metadata and selected content to showcase the Legal Deposit web archive of the UK
– Bibliographic metadata
– Analysis and visualisation of aggregated content
– Statistical and contextual data
– Copy of deposited content with direct permission
• For sites from outside the UK, permission both to harvest and for public access will be required
14. www.bl.uk 14
Before and after: what has changed
• Everything!
Aspect | BEFORE | AFTER
Scale | 14,000 | 4 – 5 million
Purpose | Advocacy, demonstrating benefits | Legal Deposit
Workflow (and tools) | Selection prior to harvesting | Selection / curation can happen post harvesting
Permission to archive | Required | Can collect in-scope material without permission
Access | Online | Reading rooms only (unless with direct permission for online access)
Nature of QA | Quality control leading to deselection | Flagging up quality issues
Ownership | British Library | Legal Deposit Libraries
15. www.bl.uk 15
Progress
• Experimental domain crawl in August-December 2012, no access
– Started with 4.8 million seeds
– Collected 27TB data +1TB of crawl logs
• 1st Legal Deposit domain crawl started in April
– Started with 3.8 million seeds
– Ran from 8th April to 21st June and collected over 31TB data
• Focused collection on National Health Service Reform
– Showcase end-to-end processes including ingest and access in reading room in early July
• Selecting key sites, news sites and events
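To put the crawl figures above in proportion, here is a back-of-the-envelope calculation. It is a sketch only: it assumes decimal units (1 TB = 1,000,000 MB), takes the crawl year as 2013 (inferred from the date of the talk), and treats "over 31TB" as exactly 31 TB, so the results are lower bounds.

```python
from datetime import date

TB_IN_MB = 1_000_000  # assumption: decimal units


def mb_per_seed(terabytes: float, seeds: int) -> float:
    """Average data collected per seed, in MB."""
    return terabytes * TB_IN_MB / seeds


# Experimental crawl: 4.8 million seeds, 27 TB of data (crawl logs excluded).
experimental = mb_per_seed(27, 4_800_000)

# First Legal Deposit crawl: 3.8 million seeds, over 31 TB of data.
first_crawl = mb_per_seed(31, 3_800_000)

# Duration and average daily volume of the first Legal Deposit crawl.
start, end = date(2013, 4, 8), date(2013, 6, 21)  # year is an assumption
days = (end - start).days
tb_per_day = 31 / days

print(f"{experimental:.2f} MB/seed, {first_crawl:.2f} MB/seed")
print(f"{days} days, {tb_per_day:.2f} TB/day")
```

On these assumptions the experimental crawl averaged roughly 5.6 MB per seed and the first Legal Deposit crawl at least 8.2 MB per seed, collected at about 0.42 TB per day over 74 days.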