Netarchive Suite at the BNE. Juan Carlos García Arratia and Mar Pérez Morillo


Description of the web archiving system at the Biblioteca Nacional de España


Usage Rights

CC Attribution-ShareAlike License


    Presentation Transcript

    • NetarchiveSuite at the BNE
      Juan Carlos García Arratia, Chief of IT Development Service, NLS
      Mar Pérez Morillo, Chief of Web Service, NLS
      IIPC GA Paris 2014, 22nd May
    • Table of contents
        1. Starting point
        2. Non-print Legal Deposit regulation
        3. Agreement with Red.es
        4. Why NAS?
        5. Installation of test environment
        6. First test crawls
        7. Specifications and needs
        8. What we expect
    • 1. Starting point
      With Internet Archive (2009-2013):
        o 8 domain crawls
        o 2 selective crawls:
            • General Elections 2011
            • Humanities
        o Total: ± 100 TB
      Collection delivery (Red.es)
    • Título de la presentación Allegations answered • Ministery of Education and Culture • Stakeholders (publishers and content providers) • Public information Last step of the process Enactment expected by the end of the year 4 2. Non-print Legal Deposit Regulation
    • 3. Red.es
      Strong investment: Network Storage Servers
    • 4. Why Netarchive Suite?
        o Other tools considered
        o Whole lifecycle covered
        o First workshop on NAS in Vienna
        o BnF, as a model:
            • Legal deposit law
            • Similar starting point with IA
            • National domain
            • Size of the French web
        o Ability to share tools and experiences
        o Community of users and developers (Denmark, Austria)
        o Modularity
    • 5. Installation of test environment
      Summer 2013: pilot installation
        o 6 servers
      1st crawl: internal administrative network
      Lack of documentation
      Problems understanding configuration profiles
      Need for strong network security
    • 6. First test crawls
        o Mining Historical Archive (Archivo Histórico Minero)
        o Death of Adolfo Suárez
        o Death of Gabriel García Márquez
      Harvest                 Unique URLs   Size (GB)
      ADOLFO SUAREZ               322,722       37.66
      ADOLFO ESPECIALES           344,418        9.65
      ADOLFO INTERNACIONAL         19,509        0.25
      ADOLFO QA                    32,326        1.86
      TOTAL                       718,975       49.42
      Harvest                 Unique URLs   Size (GB)
      García Márquez              455,581        7.38
    • 6.1. Archivo Histórico Minero: www.archivohistoricominero.org
    • 6.2. Death of Adolfo Suárez
    • 6.3. Ongoing crawls
        o Regional Governments proposals
        o European Elections
    • NAS at the BNE
      http://bns08.bne.local/HarvestDefinition/Definitions-selective-harvests.jsp
    • 7. Specifications and needs
    • 7.1. National cooperation
        o Designing a workflow in a collaborative environment
        o Legal deposit purposes
        o Using the administrative network
        o Need for a common interface for proposals
    • 7.2. Web content curators at the Library
        o Internal and external curators
        o Interface to share
        o Easier to use for librarians
      New reason to choose NAS…
    • BCWeb
    • 8. What we expect
      Better understanding of templates and configuration
      Dashboard for Quality Assurance:
        o to show big figures to monitor crawl status
      Modularity
        o Fine tuning