Archivematica
Community update
Dan Gillean
SAA 2016
Atlanta, GA
Meet Archivematica
(hello world!)
What is Archivematica?
Archivematica is a web- and standards-based, open-source application that allows your institution to preserve long-term access to trustworthy, authentic, and reliable digital content.
Standards based
Open source
Customizable
Integrated with third-party systems
Active community
Broad and active user community
Community-driven development
…and many others
Archivematica’s development (timeline graphic, 2008 to 2014)
2007: UNESCO report: Bradley, K., Lei, J., Blackall, C., Towards An Open Source Archival Repository and Preservation System (2007)
Planning and development begin. Initial funding via UNESCO MotW Subcommittee, IMF Archives, City of Vancouver Archives
Releases shown: 0.1-alpha, 0.6-alpha, 0.7, 0.8, 0.9, 0.10, 1.0 (released!)
Dates shown: February 2010, May 2010, February 2011, February 2012, August 2012, April 2013, January 2014
Milestones shown: Dashboard introduced; PREMIS in METS; Storage Service 0.2
CURRENT RELEASE
Archivematica 1.5, June 9, 2016
• AIP re-ingest:
  • AIP versioning: add metadata or generate a DIP from an AIP after processing is complete
• ArchivesSpace integration:
  • Send DIP object metadata to ArchivesSpace
• AtoM integration enhancement:
  • Hierarchical DIP upload
• DIP storage revision:
  • Allow users to store a DIP after it (or its metadata) has been sent to the access system
NEXT RELEASE
Archivematica 1.5.1, August 2016
• CentOS packages!
  • Adding RPM packages for Archivematica and its dependencies
  • This support will continue through future releases
• Bug fixes
  • Problem fetching AtoM levels of description for hierarchical DIP upload with AtoM 2.3
  • AIP re-ingest causing an incorrect file count in Archival Storage
  • Files with diacritics failing at "Assign file UUIDs and checksums"
WHAT’S NEXT?
Okay, so…
UPCOMING RELEASE
Archivematica 1.6, late 2016?
• New Appraisal tab:
  • For analysis of transfer contents and SIP arrangement; advanced ArchivesSpace integration
  • File visualizations, content tagging, and Bulk Extractor report analysis
• Transfer Backlog enhancements:
  • Search transfers, download transfers, or select files from the Archival Storage tab
  • Perform transfer delete requests from the Archival Storage tab
• Full AIP re-ingest
  • Send AIPs back to re-run all major preservation micro-services
  • Update the AIP and version the METS file; define a different processing configuration for re-ingest
• Support for multiple checksum algorithms
  • Allow users to select different checksum algorithms per pipeline
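The checksum item above could look roughly like the following in practice. This is only a minimal sketch, not Archivematica's code: it assumes the algorithm is a per-pipeline setting chosen from the four named in the speaker notes (MD5, SHA1, SHA256, SHA512), and the names `SUPPORTED_ALGORITHMS` and `file_checksum` are hypothetical.

```python
import hashlib

# Hypothetical per-pipeline choices; the four algorithms come from the
# speaker notes, the setting names themselves are an assumption.
SUPPORTED_ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path` using `algorithm`."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError("Unsupported checksum algorithm: %s" % algorithm)
    digest = hashlib.new(algorithm)
    # Read in chunks so large preservation files are never loaded whole.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A pipeline configured for SHA-512 would then call `file_checksum(path, "sha512")`, while another pipeline could keep the default.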
Archivematica 1.6 – Appraisal tab
FUTURE RELEASES
Post 1.6 / SS 0.9
• DIP upload to AtoM enhancements
  • Metadata-only DIP upload
  • REST API integration; show AtoM levels of description in the Appraisal tab
• WARC file ingest
  • Analyze WARC header info and prepare metadata mapping to the AIP METS file
• Automated DIP generation workflow
  • Add support for re-ingesting uncompressed AIPs to AIP re-ingest
  • Add pre-configuration options for DIP upload actions
  • Enhance SS callbacks to notify third-party apps when a DIP is ready to be used
FUTURE RELEASES
Post 1.6 / SS 0.9
• METS parsing tools and REST API development
  • Add JSON-LD responses to the public-facing REST API
  • Add a Python-based METSreader library that would live behind the REST service
  • Add API endpoints for the number of files in storage, formats, etc.
• Enhanced PRONOM integration
  • Provide a report of non-identified files in a SIP or AIP, with access to the file identification tool output
  • Re-run file identification on files that fail in the first analysis
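To make the METS parsing item above more concrete, here is a hedged sketch of the kind of lookup a METS reader might perform. It is not the planned METSreader library or its API: it uses only Python's standard `xml.etree.ElementTree`, the function names are hypothetical, and the embedded sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard METS and XLink namespace URIs.
NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def list_mets_files(mets_xml):
    """Return one dict per mets:file element found in the fileSec."""
    root = ET.fromstring(mets_xml)
    entries = []
    for group in root.iterfind(".//mets:fileGrp", NS):
        use = group.get("USE", "unknown")
        for file_el in group.iterfind("mets:file", NS):
            flocat = file_el.find("mets:FLocat", NS)
            path = flocat.get(XLINK_HREF) if flocat is not None else None
            entries.append({"id": file_el.get("ID"), "use": use, "path": path})
    return entries


def count_by_use(entries):
    """Count files per fileGrp USE value (e.g. original, preservation)."""
    return Counter(entry["use"] for entry in entries)


# Invented sample, only to exercise the functions above.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-1"><mets:FLocat LOCTYPE="OTHER" xlink:href="objects/a.tif"/></mets:file>
      <mets:file ID="file-2"><mets:FLocat LOCTYPE="OTHER" xlink:href="objects/b.wav"/></mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-3"><mets:FLocat LOCTYPE="OTHER" xlink:href="objects/b.mkv"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""

if __name__ == "__main__":
    files = list_mets_files(SAMPLE)
    print(count_by_use(files))
```

Counts like these are the sort of aggregate figure (number of files, grouped by use) that the proposed API endpoints could expose.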
FUTURE RELEASES
Post 1.6 / SS 0.9
• MediaConch integration for better AV preservation
  • Validate AV files with MediaConch prior to ingest
  • Validate normalization copies produced for the AIP
• Better documentation
  • Better end-user documentation for the Fixity app
  • Documentation for the REST API
  • Automation tools documentation
• JISC development for better support of research data
• ????
ArchivematiCamp!
• August 24-26, 2016, Ann Arbor, MI
• Hosted by the University of Michigan School of Information
• Registration closed!
• No actual camping involved (sorry)
• 2017: UK/Europe (TBD)
• 2017/18: Canada (TBD)
Want an AMCamp near you? If you can offer event hosting, let’s talk!
Photo: Frank William Micklethwaite / Library and Archives Canada, PA-163493
https://en.wikipedia.org/wiki/Camping#/media/File:Unidentified_group_of_men_camping.jpg
ArchivesCanada Digital Preservation Service
• Canadian hosted Archivematica service
• Launched in partnership with the Canadian Council on Archives
• Cloud hosting via Microsoft Azure
• Supported by a Documentary Heritage Communities Program grant from Library and Archives Canada
• Will bring translations to the Archivematica interface!
  • Translations to be managed via Transifex
  • Will rely on community volunteer translators
  • Should be available in a public release ca. first quarter of 2017?
Webinar updates
• Automation tools webinar recording
  • Problems with the recording, sorry!
  • Working to make a screencast available
• For discussion:
  • Are webinars the most useful way to disseminate new feature information?
  • Would you prefer edited screencasts made internally? What other formats might be useful?
  • What topics would you like to see covered in upcoming events?
RESOURCES
AM homepage: https://www.archivematica.org
AM demo: http://sandbox.archivematica.org
Wiki: https://wiki.archivematica.org
Documentation: https://www.archivematica.org/docs/
Roadmap: https://wiki.archivematica.org/Development_roadmap:_Archivematica
Issue tracker: https://projects.artefactual.com/projects/Archivematica
Code repo: https://github.com/artefactual/Archivematica
Forum: https://groups.google.com/forum/#!forum/archivematica
QUESTIONS?
info@artefactual.com
Thanks!

Editor's Notes

  • #4
    Standards based: OAIS, PREMIS, METS, BagIt, Dublin Core
    Open source: AGPLv3 license, free to study, use, modify, etc.
    Customizable: Add/change/remove FPR rules as needed
    Integrated: DSpace, CONTENTdm, Islandora, LOCKSS, AtoM, DuraCloud, OpenStack, Archivists’ Toolkit, Arkivum, ArchivesSpace… etc.
    Active community:
  • #8 AIP Re-ingest:
  • #11
    Backlog: New backlog tab!
    Full AIP re-ingest: really about long-term preservation planning; ability to reprocess for new preservation formats, new FPR rules, etc.
    New checksums: MD5, SHA1, SHA256, SHA512
  • #13
    DIP automation: the primary use case is for users who store their DIPs in another system or repository rather than uploading them to an access system. Most access systems (ASpace, AtoM, Archivists’ Toolkit) still need a target provided by the user (e.g. a slug). This will be handled via the automation tools rather than in AM itself.
    DIP SS callbacks: essentially for third-party integration. Endpoints would make it possible to query the SS to see whether a DIP already exists for an AIP and, if not, to trigger re-ingest for access and notify the user via the API when the new DIP is ready.
  • #14 REST API: more integration to allow querying of the SS, or of AIP specifics via METSreader. This would allow users to build their own Binder-style application for aggregate data and management.
  • #15
    PREFORMA Project / MediaConch integration:
    • Integrating MediaConch into AM as a validation tool for A/V materials
    • Will be able to check, for example, that an .mkv file passes all checks as a valid .mkv file, in both the source transfer and the normalized files
    • Working on adding preservation policy rules for verifying normalized copies by comparing them to originals (e.g. frame size, bitrate, color space)
    • Definitely useful for preservation normalization, but probably more useful for access normalization, to customize DIP objects for the access environment (e.g. specifying size, tailoring for devices)
    • Still in the design stages (search the AM wiki for MediaConch to see the analysis done so far)
    • Will definitely make AM stronger for A/V materials
    JISC:
    • Beyond 1.6 there is no roadmap planning as of now; waiting to see what will happen with JISC
    • We’ve been identified as an approved vendor for this process, alongside Preservica
    • All higher education institutes indicate digital preservation as a high priority
    • Other applications mentioned in different areas include Islandora, Hydra,
    Previous and ongoing work on research data management:
    • Development of core preservation functionality for preservation of datasets: University of Alberta Libraries and University of British Columbia Library (2011-2014)
    • Integration with Dataverse: Ontario Council of University Libraries (2015-2016)
    • Jisc Research Data Spring analysis and software enhancements: Universities of York and Hull (2014-2016)
    • Development of national research data preservation platform: Compute Canada and Canadian Association of Research Libraries (2015-)
  • #16
    • August 24th to 26th, in Ann Arbor, Michigan
    • Tentatively thinking of pursuing an ArchivematiCamp in Vancouver next year
    • Strong interest in holding one in the UK around April 2017
    • If there are people who want to help us plan a future camp near them, we are open to discussing this. Being able to offer space/resources would be necessary for us to do this (space is the biggest cost)
    • Definitely want to see more ArchivematiCamps!
    • The current AMCamp has one stream for curators (e.g. archivists, librarians) and one for technologists
    • Not many folks signed up in the tech stream; will have to feel it out
    • Day 3 will be split into 2 groups:
      • Hydra/Fedora/Islandora integration discussion / hackathon
      • AM Unconference: lightning talks, presentations, etc.