Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities
Gretchen Gueguen, Mark A. Matienzo, Simon Wilson, and Peter Chan
Session 502, 27 August 2011
Society of American Archivists Annual Meeting
AIMS Project
"Born-Digital Collections: An Inter-Institutional Model for Stewardship"
 - Two-year project to create a framework for stewardship of born-digital archival records in collecting repositories
 - Funded by the Andrew W. Mellon Foundation
Partners
Grant Goals
 - Processing of Hybrid Collections
 - Software Development
 - Community Development
    - Unconference (May 2011, Charlottesville, VA)
    - UK Symposium (June 2011, London, England)
    - Workshop (August 2011, Chicago, IL)
 - White Paper and Project Report
Framework Development
A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.
AIMS Framework
[Diagram of framework functions; surviving labels: Accessioning, Discovery and Access]
Collection Development
Gretchen Gueguen, University of Virginia
What is Collection Development?
Actions and policies of institutions to bring in material for end users (both current and future); includes prioritizing, developing relationships with creators, assessments, negotiating agreements, and preparing for accessioning.
Within the AIMS framework
A viable, practical method to capture and process born-digital material from hybrid collections requires sound work at the beginning (i.e. policies, practices, agreements with donors, etc.) to set up later work.
Elements of Collection Development
 - Prerequisites
 - Establish relationship with donor
 - Analyze Feasibility
 - Negotiate Agreements
 - Prepare for Accessioning
Prerequisites…
Neil Beagrie, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections," D-Lib Magazine (June 2005)
Blagofaire. http://xkcd.com/239/
Donor Relationship…
Enhanced Curation
Analyzing Feasibility…
Negotiate Agreements…
[Image: all rights reserved by Chevrolet UK]
Prepare for Accessioning...
 - Scope and extent determined?
 - Coordination with acquisition of analog material?
 - Method and time determined?
 - Pre-acquisition appraisal performed?
 - Enhanced curation carried out?
 - Test capture if needed?
 - Development of new methodologies undertaken as needed/possible?
Accessioning
Mark A. Matienzo, Yale University
What is Accessioning?
Archival institution takes physical and legal custody of a group of records from a donor and documents the transfer in a register or other representation of the institution’s holdings
Within AIMS Framework
Processes which establish physical, administrative and intellectual control over transferred records; assessment and documentation of future needs; documentation of actions taken; beginning of safe storage and maintenance
Elements of Accessioning
 - Prerequisites
 - Transfer records and gain administrative control
 - Physical control and stabilization
 - Intellectual control and documentation to support further processes
 - Maintain accessioned records
Case Study: Re-Accessioning at Yale
 - Collaborative capacity building across two repositories
    - Manuscripts and Archives
    - Beinecke Rare Book and Manuscript Library
 - Addressing previously received accessions containing electronic records on media
 - Still in testing phase, but working towards implementing in production
Types of Records and Media
 - Wide variety of records creators
    - Literary authors
    - University faculty
    - University offices
    - Architectural firms
 - Common types of media
    - Floppy disks: 5.25” and 3.5”
    - Optical media: CD-ROM, CD-R, DVD-R, etc.
    - Zip disks
    - USB flash drives
Goals of Re-Accessioning
 - Identify, document, and register media
 - Mitigate risk of media deterioration and obsolescence
 - Extract basic metadata from filesystems on media and files contained on filesystems
Re-Accessioning Workflow
Disk Imaging
 - Using “forensic” (bit-level) imaging process
 - Uses write-protection to ensure data on media is not manipulated
 - Uses software to acquire images
 - Includes hash-based verification process
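The hash-based verification step can be illustrated with a minimal sketch. It assumes the imaging software recorded a digest when the image was acquired; the function names, the choice of SHA-256, and the chunk size are illustrative assumptions, not the tooling used in the Yale workflow.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for large disk images.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path, recorded_digest, algorithm="sha256"):
    """Compare a freshly computed digest with the one recorded at imaging time."""
    return file_digest(image_path, algorithm) == recorded_digest.strip().lower()
```

A result of False from verify_image would suggest the image changed after acquisition or the recorded digest was produced with a different algorithm.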
Media Log
 - Using SharePoint list
 - Contains unique identifier of media
 - Records physical/logical characteristics of media
 - Documents success, failure, or status of various processes and additional notes
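As a rough illustration of what one media log entry might hold, the sketch below models a record and writes entries out as CSV. The slides describe a SharePoint list; the field names, default values, and CSV export here are hypothetical stand-ins chosen only to mirror the characteristics listed above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class MediaLogEntry:
    """One row of a media log (all field names are hypothetical)."""
    media_id: str                     # unique identifier of the medium
    media_type: str                   # physical characteristic, e.g. 3.5" floppy
    filesystem: str = "unknown"       # logical characteristic
    imaging_status: str = "pending"   # success, failure, or current status
    metadata_status: str = "pending"
    notes: str = ""


def entries_to_csv(entries):
    """Serialize log entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(MediaLogEntry)]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Keeping process statuses as separate fields makes it easy to filter for media that still need imaging or metadata extraction.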
Metadata Extraction
 - Can be repurposed for descriptive, administrative, and technical metadata
 - Uses command-line tools (Sleuthkit, fiwalk)
 - Outputs XML document
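A minimal sketch of how the XML output might be repurposed is shown below. It assumes the document contains fileobject elements with filename, filesize, and hashdigest children, which is how fiwalk's output is commonly structured; the sample XML, its namespace URI, and its values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; the namespace URI is a placeholder, not the real schema URI.
SAMPLE_XML = """<dfxml xmlns="http://example.org/placeholder-namespace">
  <volume>
    <fileobject>
      <filename>LETTERS/DRAFT1.WPD</filename>
      <filesize>2048</filesize>
      <hashdigest type="md5">0123456789abcdef0123456789abcdef</hashdigest>
    </fileobject>
    <fileobject>
      <filename>NOTES.TXT</filename>
      <filesize>512</filesize>
    </fileobject>
  </volume>
</dfxml>"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_fileobjects(xml_text):
    """Return one dict per fileobject with filename, size, and hashes."""
    records = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "fileobject":
            continue
        record = {"filename": None, "filesize": None, "hashes": {}}
        for child in element:
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name == "filename":
                record["filename"] = text
            elif name == "filesize" and text:
                record["filesize"] = int(text)
            elif name == "hashdigest":
                record["hashes"][child.get("type", "").lower()] = text
        records.append(record)
    return records
```

Matching on local names rather than a fixed namespace is a deliberate simplification so the sketch tolerates output from different tool versions.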
Packaging and Transfer
 - Using BagIt packages/Bagger application
 - Packages contain disk images, extracted metadata, imaging logs, and high-level accession information
 - Transfer to storage is verified by comparison against manifest
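Verification against a manifest can be sketched with the standard library alone. The sketch assumes a BagIt-style payload manifest named manifest-md5.txt (or another algorithm) in which each line holds a checksum followed by a path relative to the bag directory; the function name and the shape of the returned problem list are assumptions, and this is not how Bagger itself is implemented.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir, algorithm="md5"):
    """Return a list of (path, problem) tuples; an empty list means all files match."""
    bag_dir = Path(bag_dir)
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Each manifest line: "<checksum> <relative/path/to/file>"
        expected, relative_path = line.split(None, 1)
        payload = bag_dir / relative_path
        if not payload.is_file():
            problems.append((relative_path, "missing"))
            continue
        digest = hashlib.new(algorithm)
        with open(payload, "rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            problems.append((relative_path, "checksum mismatch"))
    return problems
```

Running the check after transfer, against the copy in storage, is what confirms the package arrived intact.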
Arrangement & Description
Simon Wilson, Hull University Archives
Purpose of Arrangement & Description
The general objectives for Arrangement & Description are:
 - to preserve context
 - to establish intellectual control of the material
 - to provide a means of discovery
SAA definition, emphasis on minimizing the amount of handling
Within the AIMS framework
Processes which establish intellectual control of the material including implementation of policies and agreements with donors etc. to enable subsequent discovery and access
Elements of Arrangement and Description
1. Prerequisites
2. Plan for processing
 - gather supporting information; files captured from media (accessioning); convert files (for viewing); appraisal strategy; assess arrangement options; consider preservation issues
3. Processing
 - implement arrangement strategy; add descriptive metadata and wider context (e.g. Collection Level Description); copyright & other legal considerations
4. Prepare for Discovery & Access
 - remove restricted access to born-digital material during processing
Case Study - Stephen Gallagher
Background:
 - 2005: 42 boxes paper archives
 - 2010: born-digital material: 14,320 files (13.6 GB) transferred to us via external hard drive and a box of Amstrad disks
 - Create integrated catalogue to accommodate paper, born-digital and future accruals
Case Study - Stephen Gallagher
Approach:
 - current work higher priority in filing system
 - considered each work a distinct ‘project’
 - structure reflects his way of working & the archival principles of control that creator, archivist & user can all understand
Series level was most logical solution
 - all related files placed in the series
 - reasonable return for our effort
Case Study - Stephen Gallagher
 - 300 files created using Final Draft screenwriting software
 - view file (as created) to identify appropriate format for long-term preservation
Other issues:
 - copyright/third-party content
 - commercial implications: access via repository = publication?
 - re-purposing of work from one (unsuccessful) project to another
Challenges faced
Each collection is unique, approach will vary:
 - integrate born-digital material with existing material/arrangement?
 - one-off collection (e.g. project) or likely to be subsequent accruals?
 - collection type; differs for personal papers & organisational records
 - same personnel work on paper and born-digital components?
 - can we appraise without knowing the contents? similar to paper material that is in a different language?
Challenges faced
Volume of material:
 - depositor perception that 'storage is cheap' - does this mean we shouldn’t appraise the material we receive?
 - wide range of file types encountered
 - not practical to describe each and every file
 - risk management - if you don’t check every file for sensitive information
 - we need to automate as much of the processing as possible
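One small piece of automation that speaks to the range of file types is a tally of files by extension, sketched below. The function name is hypothetical, and extensions are only a rough proxy for format; this is an illustration, not a substitute for a format identification tool.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files under root by lower-cased extension ('' means no extension)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

Calling count_extensions(root).most_common(10) on an accession directory gives a quick view of which formats dominate before any appraisal decisions are made.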
Hypatia
Digital archivists identified a gap in current tools and used their experiences to define the requirements for a new tool
Key features identified:
 - need an intuitive (for archivists) graphical interface
 - drag'n'drop to create the intellectual arrangement
 - ability to return to original order of the material
 - view some file types, add descriptive metadata etc.
 - high level of granularity when applying rights & permissions
Technical (acquired at accessioning) and descriptive metadata - Discovery & Access process
Discovery and Access
Peter Chan, Stanford University
What is Discovery & Access?
Discovery and Access refers to the systems and workflows that make processed or unprocessed material, and the metadata that supports it, available to users.
Goals of D&A
 - To make material available to user communities by ensuring that they can:
    - find out about material
    - understand whether it is available for consultation and if so, how
    - access material.
 - To apply appropriate access restrictions in order to protect private and sensitive information as well as intellectual property.
 - To provide access to material in a format and/or environment that presents the original’s significant properties.
Case Study - Stephen Jay Gould Papers
 - Analog component: 550 linear feet of papers (789 boxes, 119 cartons, 30 flat boxes, and 14 map folders)
 - File size and number: 59.7 MB and 2,567 files
 - Media formats: 98 3 ½” floppy diskettes; 61 5.25” floppy diskettes; 4 sets of punch cards*; 3 computer tapes
 - File Types: Computer Programs; Data sets; Documents; Spreadsheets
 - File Formats: ASCII Text; WordPerfect 4.2, 5.0, 5.1, 6.0, 6.1; Microsoft Word 2.0, 6.0, 97, 2000; Microsoft RTF; Microsoft Excel 4.0; Lotus 1-2-3 2.0, etc.
* During processing of the “analog” papers in 2011, another 21 sets of punch cards and more floppy diskettes were found.
D&A – EAD
D&A – Facet Browsing

AIMS Workshop Case Study 3: Arrangement and Description Case Study - Stephen ...
 
AIMS Workshop Case Study 2: Accessioning Evolution
AIMS Workshop Case Study 2: Accessioning EvolutionAIMS Workshop Case Study 2: Accessioning Evolution
AIMS Workshop Case Study 2: Accessioning Evolution
 
AIMS Workshop pt .1: Collection Development
AIMS Workshop pt .1: Collection DevelopmentAIMS Workshop pt .1: Collection Development
AIMS Workshop pt .1: Collection Development
 
AIMS workshop: Introduction
AIMS workshop: IntroductionAIMS workshop: Introduction
AIMS workshop: Introduction
 

Recently uploaded

Zaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdfZaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdf
AmandaCheung15
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
Baishakhi Ray
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
BrainSell Technologies
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
SynapseIndia
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Networks
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
Ivanti
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
SynapseIndia
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
Priyanka Aash
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
Safe Software
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Kunal Gupta
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
SAI KAILASH R
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
bellared2
 
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and CitiesThe Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
Arpan Buwa
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
313mohammedarshad
 

Recently uploaded (20)

Zaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdfZaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdf
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
 
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and CitiesThe Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 

SAA Session 502: Born-Digital Archives in Collecting Repositories

  • 1. Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities. Gretchen Gueguen, Mark A. Matienzo, Simon Wilson, and Peter Chan. Session 502, 27 August 2011, Society of American Archivists Annual Meeting.
  • 2. AIMS Project: "Born-Digital Collections: An Inter-Institutional Model for Stewardship." Two-year project to create a framework for stewardship of born-digital archival records in collecting repositories. Funded by the Andrew W. Mellon Foundation.
  • 4. Grant Goals: Processing of Hybrid Collections; Software Development; Community Development: Unconference (May 2011, Charlottesville, VA), UK Symposium (June 2011, London, England), Workshop (August 2011, Chicago, IL); White Paper and Project Report.
  • 5. Framework Development: A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.
  • 6. AIMS Framework: Discovery and Access; Accessioning
  • 8. What is Collection Development? Actions and policies of institutions to bring in material for end users (both current and future); includes prioritizing, developing relationships with creators, assessments, negotiating agreements and preparing for accessioning. Within the AIMS framework: a viable, practical method to capture/process born-digital material from hybrid collections requires sound work at the beginning (e.g. policies, practices, agreements with donors) to set up later work.
  • 9. Elements of Collection Development: Prerequisites; Establish relationship with donor; Analyze Feasibility; Negotiate Agreements; Prepare for Accessioning
  • 10. Prerequisites… Neil Beagrie, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections," D-Lib Magazine (June 2005). Blagofaire. http://xkcd.com/239/
  • 14. Negotiate Agreements… All rights reserved by Chevrolet UK
  • 15. Prepare for Accessioning... Scope and extent determined? Coordination with acquisition of analog material? Method and time determined? Pre-acquisition appraisal performed? Enhanced curation carried out? Test capture if needed? Development of new methodologies undertaken as needed/possible?
  • 17. What is Accessioning? Archival institution takes physical and legal custody of a group of records from a donor and documents the transfer in a register or other representation of the institution’s holdings. Within AIMS Framework: processes which establish physical, administrative and intellectual control over transferred records; assessment and documentation of future needs; documentation of actions taken; beginning of safe storage and maintenance.
  • 18. Elements of Accessioning: Prerequisites; Transfer records and gain administrative control; Physical control and stabilization; Intellectual control and documentation to support further processes; Maintain accessioned records
  • 19. Case Study: Re-Accessioning at Yale. Collaborative capacity building across two repositories: Manuscripts and Archives; Beinecke Rare Book and Manuscript Library. Addressing previously received accessions containing electronic records on media. Still in testing phase, but working towards implementing in production.
  • 20. Types of Records and Media. Wide variety of records creators: literary authors; university faculty; university offices; architectural firms. Common types of media: floppy disks (5.25” and 3.5”); optical media (CD-ROM, CD-R, DVD-R, etc.); Zip disks; USB flash drives
  • 21. Goals of Re-Accessioning: Identify, document, and register media; Mitigate risk of media deterioration and obsolescence; Extract basic metadata from filesystems on media and files contained on filesystems
  • 23. Disk Imaging: Using “forensic” (bit-level) imaging process; Ensure data on media is not manipulated, using write-protection; Uses software to acquire images; Includes hash-based verification process
  • 25. Media Log: Using SharePoint list; Contains unique identifier of media; Records physical/logical characteristics of media; Documents success, failure, or status of various processes and additional notes
  • 28. Metadata Extraction: Can be repurposed for descriptive, administrative, and technical metadata; Uses command-line tools (Sleuthkit, fiwalk); Outputs XML document
  • 29. Packaging and Transfer: Using BagIt packages/Bagger application; Packages contain disk images, extracted metadata, imaging logs, and high-level accession information; Transfer to storage is verified by comparison against manifest
  • 31. Arrangement & Description. Simon Wilson, Hull University Archives
  • 32. Purpose of Arrangement & Description. The general objectives for Arrangement & Description are: to preserve context; to establish intellectual control of the material; to provide a means of discovery. SAA definition: emphasis on minimizing the amount of handling. Within the AIMS framework: processes which establish intellectual control of the material, including implementation of policies and agreements with donors etc., to enable subsequent discovery and access.
  • 33. Elements of Arrangement and Description: 1. Prerequisites. 2. Plan for processing: gather supporting information; files captured from media (accessioning); convert files (for viewing); appraisal strategy; assess arrangement options; consider preservation issues. 3. Processing: implement arrangement strategy; add descriptive metadata and wider context (e.g. Collection Level Description); copyright & other legal considerations. 4. Prepare for Discovery & Access: remove restricted access to b-d material during processing.
  • 34. Case Study - Stephen Gallagher. Background: 2005: 42 boxes of paper archives; 2010: born-digital material: 14,320 files (13.6 GB) transferred to us via external hard drive and a box of Amstrad disks. Create integrated catalogue to accommodate paper, born-digital and future accruals.
  • 35. Case Study - Stephen Gallagher. Approach: current work higher priority in filing system; considered each work a distinct ‘project’; structure reflects his way of working & the archival principles of control that creator, archivist & user can all understand. Series level was most logical solution: all related files placed in the series; reasonable return for our effort.
  • 36. Case Study - Stephen Gallagher. 300 files created using Final Draft screenwriting software: view file (as created) to identify appropriate format for long-term preservation. Other issues: copyright/third-party content
  • 37. commercial implications: access via repository = publication?; re-purposing of work from one (unsuccessful) project to another
  • 38. Challenges faced. Each collection is unique, approach will vary: integrate born-digital material with existing material/arrangement?
  • 39. one-off collection (e.g. project) or likely to be subsequent accruals?
  • 40. collection type: differs for personal papers & organisational records
  • 41. same personnel work on paper and born-digital components?
  • 42. can we appraise without knowing the contents? similar to paper material that is in a different language?
  • 43. Challenges faced. Volume of material: depositor perception that ‘storage is cheap’; does this mean we shouldn’t appraise the material we receive?; wide range of file types encountered; not practical to describe each and every file; risk management if you don’t check every file for sensitive information; we need to automate as much of the processing as possible
  • 44. Hypatia. Digital archivists identified a gap in current tools and used their experiences to define the requirements for a new tool. Key features identified: need an intuitive (for archivists) graphical interface
  • 45. drag'n'drop to create the intellectual arrangement
  • 46. ability to return to original order of the material
  • 47. view some file types, add descriptive metadata etc.
  • 48. high level of granularity when applying rights & permissions. Technical (acquired at accessioning) and descriptive metadata - Discovery & Access process
  • 49. Discovery and Access. Peter Chan, Stanford University
  • 50. What is Discovery & Access? Discovery and Access refers to the systems and workflows that make processed or unprocessed material and the metadata that support it available to users.
  • 51. Goals of D&A: To make material available to user communities by ensuring that they can:
  • 52. find out about material
  • 53. understand whether it is available for consultation and if so, how
  • 55. To apply appropriate access restrictions in order to protect private and sensitive information as well as intellectual property.
  • 56. To provide access to material in a format and/or environment that presents the original’s significant properties. Case Study - Stephen Jay Gould Papers. Analog component: 550 linear feet of papers (789 boxes, 119 cartons, 30 flat boxes, and 14 map folders). File size and number: 59.7 MB and 2,567 files. Media formats: 98 3 ½” floppy diskettes; 61 5.25” floppy diskettes; 4 sets of punch cards*; 3 computer tapes. File Types: Computer Programs; Data sets; Documents; Spreadsheets. File Formats: ASCII Text; WordPerfect 4.2, 5.0, 5.1, 6.0, 6.1; Microsoft Word 2.0, 6.0, 97, 2000; Microsoft RTF; Microsoft Excel 4.0; Lotus 1-2-3 2.0, etc. * During processing of the “analog” papers in 2011, another 21 sets of punch cards and more floppy diskettes were found.
  • 58. D&A – Facet Browsing
  • 59. D&A – Full text search
  • 60. D&A – See Contents on Web
  • 61. D&A – Tag & Annotation by Invited Persons / Public. Annotation:
  • 62. Impacts from Collection Development. File formats: no restriction. Computer medium: no restriction (punch card, open reel tape, 5.25 inch floppy, 3.5 inch floppy). File type: no restriction (computer program, data set, document, spreadsheet). Agreement: permission to post contents online.
  • 63. Impacts from Accessioning: Built 5.25 inch floppy capture station; Asked Computer History Museum to read punch cards; Open reel tapes: still outstanding
  • 64. Impacts from Processing. AccessData FTK was used to search files with restricted information, annotate files with appropriate descriptive metadata (book title, articles, etc.) and rights metadata (access restriction), and generate technical metadata for the delivery platform to act upon. Transit Solution was used to transform files to HTML format for display on the web. An XSLT program was written to transform the XSL-FO output from FTK to an XML content document. A Ruby program was written to ingest the XML content document, original files, and the display derivatives into Fedora.
  • 65. FTK – Bookmark and Label
  • 66. FTK – Full Text, Pattern Search & Fuzzy Hash
  • 68. Network Diagram for 50,000 Creeley Emails
  • 71. Want to know more? http://born-digital-archives.blogspot.com. Gretchen Gueguen: gmg2n@virginia.edu; Mark Matienzo: mark.matienzo@yale.edu; Simon Wilson: s.wilson@hull.ac.uk; Peter Chan: pchan3@stanford.edu
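
The Disk Imaging slide (23) mentions a hash-based verification process without saying which tool or algorithm was used. As a minimal sketch of the general idea, assuming SHA-256 and a digest string recorded at acquisition time (both assumptions, as are the function names), a later check of an image file might look like this:

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so a large disk image need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path, recorded_digest, algorithm="sha256"):
    """Return True when the image still matches the digest recorded at acquisition.

    recorded_digest is assumed to be a hex string, for example copied from an
    imaging log; surrounding whitespace and letter case are ignored.
    """
    return file_digest(image_path, algorithm) == recorded_digest.strip().lower()
```

A mismatch only shows that the bytes changed since the digest was recorded; it does not say when or why, which is one reason to log the result alongside the media identifier.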

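The Packaging and Transfer slide (29) says that transfer to storage is verified by comparison against the manifest. The sketch below illustrates that kind of check for a BagIt package, whose payload manifest is a text file named `manifest-<algorithm>.txt` with one `checksum filepath` pair per line. It is not the Bagger application's own logic: the function names are hypothetical, SHA-256 is an assumed default, and percent-encoded path characters are not handled.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_bag_payload(bag_dir, algorithm="sha256"):
    """Compare each file listed in the bag's payload manifest with what is on disk.

    Returns a dict mapping each listed path to 'ok', 'missing', or 'mismatch'.
    Files present in the bag but absent from the manifest are not detected here.
    """
    bag_dir = Path(bag_dir)
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    results = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum><whitespace><path relative to bag root>".
        expected, relative_path = line.split(None, 1)
        relative_path = relative_path.strip()
        target = bag_dir / relative_path
        if not target.is_file():
            results[relative_path] = "missing"
        elif hash_file(target, algorithm) == expected.lower():
            results[relative_path] = "ok"
        else:
            results[relative_path] = "mismatch"
    return results
```

Running `check_bag_payload` on the copy in storage after transfer, and treating any value other than `'ok'` as a failed transfer, is one way to express the comparison described on the slide.
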
Editor's Notes

  1. Hello and welcome to session 502: Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities. My name is Gretchen Gueguen and I’m Digital Archivist at the University of Virginia. This morning, along with my colleagues Mark Matienzo from Yale, Simon Wilson from the University of Hull, and Peter Chan from Stanford, I’m going to talk with you about the AIMS project.
  2. AIMS is the short title for a Mellon-funded grant project entitled Born-Digital Collections: An Inter-Institutional Model for Stewardship. This two-year project set out to create a framework for stewardship of born-digital archival records in collecting repositories.
  3. As I’ve mentioned, the grant partners include UVA, Stanford, Hull, and Yale, with Virginia serving as the PI.
  4. The grant set out to achieve its goal through four different areas of activity. The first was the processing of several hybrid collections, which you are going to hear about later this morning. The digital archivists at each institution, the four of us here this morning, were funded by the grant to carry out this processing. To facilitate this stewardship, the partners also sought to develop some software solutions. You won’t hear as much about these this morning, but they include Rubymatica, a Ruby-based reworking of Archivematica for the creation of Submission Information Packages, and functional requirements for a software tool to facilitate arrangement, description and access to born-digital archival materials. These requirements led to work on developing Hypatia, which is what is known as a “Hydra Head,” or a module for the Fedora/Solr/Blacklight Hydra stack, for access to born-digital materials. The partners also hosted several events to garner feedback and to encourage communication among the archival community, including a workshop that took place here in Chicago earlier this week. The final project deliverables will include a White Paper that synthesizes the research done during the project and a project report to the Mellon Foundation.
  5. A large part of the White Paper focuses on what we are currently referring to as the AIMS framework: “A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.” This is really a high-level look at the tools, strategies, methodologies, and practices needed to effectively manage b-d content.
  6. The framework is characterized by four main functions of stewardship: Collection Development; Accessioning; Arrangement and Description; Discovery and Access. You’ll notice that we do not include “preservation” as an explicit function here. That is an intentional omission because we believe that preservation is implicit in all of these functions. In addition, aspects such as developing a preservation repository or undertaking preservation activities are outside of this scope because they are larger institutional initiatives. They are mentioned as prerequisites to being able to do work in many steps, but since there are many guidelines out there we didn’t feel the need to reiterate them here. We are going to focus the rest of our presentation this morning on these four areas and share with you some of the work we have done. If you are interested in more on the background of the project, I will encourage you to check out our project blog, called Born Digital Archives, and I’ll put a URL up for the blog at the end of the presentation.
  7. We are starting our model with activities related to Collection Development. These are the activities undertaken in order to bring material in to the institution. These include activities we may be very familiar with, like prioritizing, developing relationships with creators, doing assessments and negotiating agreements. Within the concept of the AIMS model, which is primarily a hybrid collection environment, this work will be necessary to develop sound capturing and processing activities later.
  8. We’ve defined collection development as having five distinct stages, which I’m going to go over with you this morning: Prerequisites; Establish relationship with donor; Analyze Feasibility; Negotiate Agreements; Prepare for Accessioning.
  9. The first step is going through some prerequisites, like having an appraisal process: how will you assess or evaluate materials? How will you be able to determine value? Also, you need to evaluate your storage capacity: Do you have enough space to keep this material in both the short and long term? What about future transfers? Do you have a sound data preservation strategy or methodology? One of the most important prerequisites is establishing collection policies. Defining what it is that we want to collect takes on a couple of different questions. The first might be what types of material we are interested in, in the traditional collecting sense: prominent people, organizational records, etc. Next, we need to consider what part of those figures’ lives we are collecting. We use our digital devices for private activities, as well as more public ones…which are we interested in collecting? The next logical step then is to think about where this information might be on digital devices: stored files probably yes, but do we also need software, operating systems, hardware, internet activity or cloud material? All of these factors, and more, come together in a collection development policy, and it can be very difficult to write, especially when you are just starting and don’t know
  10. Assuming that you have the needed prerequisites in place or have the capacity to work on them, you can move on to the actual work of collection development. The first step is establishing a relationship with the donor. In many ways this is parallel to existing analog work, but when dealing with born-digital materials you should start thinking early about how digital archive staff need to be involved. This is potentially going to be very different from access to physical materials and now is the time to discuss options. Now is also the time to discuss the creation of the data with the donor and capture any documentation that will help with later processing and access. But how comfortable is your donor with digital concepts and access to digital materials? As an example of the difficulty that this can cause, I’d like to show some work that the AIMS project did in this regard. This is a digital donor survey that the AIMS project created based on one created for the PARADIGM workbook. The original intention was that a donor could fill this out before accessioning. This is the first page…and this is the second…and the third…and the fourth…and this is part two! We quickly realized that this would be overwhelming to potential donors, especially ones who hadn’t really thought much about things like their online persona or email preservation. We changed tactics and now recommend that this survey be used as a prompt sheet for the archivist in an interview.
  11. Such an interview may be part of a program of enhanced curation, something Jeremy Leighton John at the British Library describes as not only collect[ing] the original archive but add[ing] value to it.“enhanced curation” techniques include things like documenting the creator’s workspace with high-resolution digital photography, creating a digital film of an oral history interviewwith donors about their computers and their computing habits, perhaps capturing video of screencasts of the donor describing the organization on their computer. This type of information can be invaluable as materials are accessioned and processed as the level of abstraction or unfamiliarity with a new system can make it difficult to gain intellectual control.
  12. Okay, so you are ready to move on to considering whether or not you even *can* acquire this material or more likely whether it is worth the costs. What is the cost analysis and risk analysis? Try a test capture…how does it work? Do you have the needed infrastructure and policies or can you create them? Can you even view files in order to appraise them? Do you need these guys to accomplish this? Or maybe these guys?It’s very easy to say “analyze costs” or “evaluate your home institution infrastructure” but if you’ve never encountered a particular software or hardware it’s difficult to be prepared for them. This is where having technologists or digital archivists involved early in the process can help. If possible during a test capture they can do a triage to determine if there are serious preservation concerns, if any forensic processing might be needed to recover damaged or deleted files. Etc.
  13. Moving on then, the next step is negotiating agreements. One of the big problems here is that there is a lack of models for agreements and appraisals. Many elements of standard agreements remain applicable in the hybrid or born-digital archive, but have different implications. It’s not the same to provide unrestricted access to paper documents in a reading room and unrestricted access to digital materials online. Furthermore, you have a much larger potential for capturing and inadvertently exposing sensitive electronic information like financial and health information, passwords and other personal data.The legal agreement with the donor needs to specify:An Agreement about copyright – either transferred to repository/institution or remain with creator/heirsUnderstanding that collecting repository will be “sole” repository of b-d material Understanding of capabilities/limits for capturing b-d material (currently)Understanding of preservation strategies and capabilitiesUnderstanding of delivery capabilities and limits (current)Understanding of what/how files will be restricted or deleted & how this will be confirmed Understanding of capabilities/limits of appraisal, viewing, description/processing of b-d materialUnderstanding of the creative process and relationship with b-d materials, computers, hand-held devices, cloud computing, etc.
  14. The final step in collection development is to prepare for processing. This may seem a little odd in a traditional sense, but what we are alluding to here is making sure that all of your technical steps for transfer, which may not be in the agreement, are planned ahead of time. Specifically:
      - Scope and extent determined
      - Method and time determined
      - Pre-acquisition appraisal performed
      - Test capture if needed
      - Development of new methodologies undertaken as needed/possible
      - Enhanced curation carried out
      - Coordination with acquisition of analog material
      This is really the “action” step, where many of the activities you have been planning are carried out. Overall, the steps in Collection Development help to set up later activities. By the end of the collection development step, the institution should be ready to take legal and physical custody of material. Doing this in a forward-thinking, planful manner will help later processes go much more smoothly. You’ve made it to the finish line of collection development, but now we need to move on to Accessioning.
  15. Accessioning is generally understood as the set of processes wherein a repository takes physical and legal custody of records from a donor and formally documents, or "registers," the transfer. The processes have clear links to both collection development and arrangement and description, and in some cases, institutions may view them as part of those processes. However, we have situated accessioning as a primary function within the AIMS framework. Within our framework, accessioning serves a vital role in allowing a collecting repository to establish physical, administrative, and intellectual control over records that have been transferred. The accessioning processes allow archivists to gather a wide variety of information that will inform and prioritize other processes, such as arrangement and description, further appraisal, and requirements for access. Accessioning also provides an environment in which archivists can document their actions and ultimately transfer the accessioned records into an environment for their storage and maintenance. The goals of accessioning therefore reflect the need to establish control over, and ensure the authenticity and reliability of, transferred records. Archivists must therefore be diligent during accessioning and understand the potential impact of the actions they take during these processes. If a collecting repository is unable to establish an adequate level of control over transferred electronic records, then it is likely that it has not successfully accessioned them. Accordingly, archivists with "legacy" accessions of electronic records, such as those containing computer media, may want to consider "reaccessioning" those transfers to establish a suitable level of control.
  16. The prerequisites, like the other areas of the AIMS model, broadly fall into several categories; in this case, they are policies, procedures, and infrastructure. There are many policies required to support accessioning properly. These may range from departmental preferences to requirements set at the institutional level. Procedures may account for a number of different options, such as minimal processing, accessioning of born-digital materials with paper records, deferment of digital accessioning, accessioning as resources allow, and retrospective accessioning of previously received electronic records. Infrastructure to support accessioning includes a wide variety of software, hardware, and expertise. This infrastructure will take resources to build, and archivists are urged to consider collaborative partnerships to allow for the better sharing of knowledge. The transfer and administrative control processes in the AIMS framework are very similar to those for other formats of records. Archivists working with electronic records should be familiar with the various types of transfers and their implications. Types of transfers can include receipt of retired media formerly in use by a creator, records copied to media only used for transfer (such as external hard drives, CDs, or DVDs), or a direct transfer using disk imaging software or by copying files across a network. Once the records are under administrative control, archivists should focus their efforts on gaining physical control over records and media. Much of this work concerns identifying and potentially addressing preservation threats in the records, such as viruses, unknown file formats, and the physical condition of media if appropriate. Archivists next need to establish intellectual control and gather documentation that will enable further work necessary to process, maintain, or use the records. For some transfers, a listing of directories or files may be repurposed for archival description if the existing arrangement appears to be of value. Finally, the archivist should prepare the records to be maintained over time. This may include actions such as normalizing to preservation formats. Ultimately, the records should also be transferred to a secure storage location that can be monitored by the collecting repository.
  17. At Yale University, we have worked on a reaccessioning project that has allowed us to develop our thinking about how accessioning of electronic records could best be realized for us going forward. Two repositories, Manuscripts and Archives and the Beinecke Rare Book and Manuscript Library, have worked in collaboration to implement software, hardware, and procedures that can be shared to support accessioning. In our reaccessioning project, we are working to establish better control over previously transferred accessions that contain electronic records on media such as floppy disks and CD-ROMs. These pieces of media were often received as part of a hybrid accession that also contained paper records, but in some cases we have received accessions of boxes containing only media.
  18. The goals of our reaccessioning project are fairly straightforward and relate to the three types of control discussed previously. First, we seek to establish administrative control of the media by identifying what it is and documenting its physical and logical characteristics and by assigning a unique identifier to each piece. Secondly, we are working towards gaining physical control of the media, which will allow us to mitigate the risks of media deterioration and obsolescence. Finally, we are trying to establish a basic level of intellectual control by extracting metadata about the filesystems and files contained on the media, such as file names, directory structures, and creation, access, and modification dates.
  19. Our reaccessioning workflow roughly looks like the following. We begin by retrieving the media and bringing it to the electronic records workstation, documenting its change in location within the Archivists’ Toolkit. We then assign unique identifiers to each of the media. We establish the best means by which to write-protect the media for imaging and record its identifying characteristics in a media log. We then put the media in the appropriate drive and create a forensic bit-level disk image, which includes all the files, the filesystem metadata, unused space – in other words, the entirety of the data on the media. We verify the image against the raw contents of the media and extract metadata from the disk image. Finally, we package the images and metadata and transfer the package into storage and complete the rest of the documentation.
  20. To acquire the data off media, we are using a forensic imaging process that extracts the entirety of the data off the media at the lowest level possible. To ensure that we do not intentionally or accidentally manipulate any of the data on the original media, we write-protect the media or reader. For floppy disks, we can use physical write-protect tabs. For USB flash media, hard drives, and the like, we connect the drive or reader to a write-blocker, which is a piece of hardware connected to the computer that blocks low-level write signals from a computer. We use a variety of software to acquire the images, such as FTK Imager. The imaging software extracts the data from the media and calculates a cryptographic hash of the data on the media and of the data within the image file. If the two hashes match, the imaging is viewed as successful.
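The verification step described above can be sketched in a few lines of Python. This is only an illustration of the hash comparison, not what FTK Imager does internally; the choice of SHA-256 and the function names `sha256_of` and `image_matches_source` are assumptions made for this example.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source_path, image_path):
    """Imaging counts as successful only if both digests are identical."""
    return sha256_of(source_path) == sha256_of(image_path)
```

Reading in chunks keeps memory use flat, which matters when the source is a full hard-drive image rather than a floppy disk.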
  21. This is a screenshot of FTK Imager, which we use to image media and to inspect disk images. You can see that the file listing includes regular files, slack or unused space on the disk, and deleted files, as denoted by the red X on the file icons.
  22. Our media log is a SharePoint list that contains identifying characteristics and physical and logical information about the media, such as the type of media, when it was imaged, the text of a label or writing on the media, and the type of filesystem or filesystems it contains. We assign each piece of media a unique identifier, which is a combination of the accession number and an incremental number. The media log also contains the workflow status of the accessioning process for each piece of media and whether processes succeeded or failed.
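As a small illustration of the identifier scheme just described, the sketch below joins an accession number with a zero-padded incremental number. The underscore separator, the four-digit padding, and the sample accession number are assumptions for the example; the notes do not specify the actual format used at Yale.

```python
def media_identifier(accession_number, sequence, width=4):
    """Combine an accession number with an incremental number.

    The separator and zero-padding width are hypothetical choices.
    """
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    return f"{accession_number}_{sequence:0{width}d}"


# Identifiers for the first three pieces of media in a made-up accession.
identifiers = [media_identifier("2011-M-001", n) for n in range(1, 4)]
```

Zero-padding keeps identifiers sorting in the order the media were logged.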
  23. The first screenshot is an overview for several pieces of media. You can see the unique media identifiers, the media format, and the workflow status.
  24. This expanded view shows all the fields, including further documentation about the disk image, the filesystem contained, and additional notes.
  25. If imaging is successful, we then extract metadata from the filesystem and files within the image. This is a software-based process that provides metadata such as file names, directory structures, creation and modification times, and approximate categorization of the types of files. This metadata can be repurposed in a variety of ways and provides a basic level of intellectual control that is comparable to a box list or other type of inventory for paper records. We are using open source software such as Sleuthkit and fiwalk to perform this extraction, but occasionally we need to rely on other tools for older or less common types of file systems.
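To give a sense of what such an inventory looks like, here is a rough Python sketch that walks a directory tree and records names, sizes, modification times, and a guessed file type. It is a stand-in only: it reads an ordinary directory (for example, files exported from an image), not a disk image, and it does not use Sleuthkit or fiwalk. The function name `inventory` and the field names are assumptions.

```python
import mimetypes
import os
from datetime import datetime, timezone


def inventory(root):
    """List every file under root with basic, box-list-style metadata."""
    rows = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(directory, name)
            info = os.stat(path)
            # Approximate categorization based on the file extension only.
            guessed_type, _encoding = mimetypes.guess_type(name)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "path": os.path.relpath(path, root),
                "size": info.st_size,
                "modified": modified.isoformat(),
                "type": guessed_type or "unknown",
            })
    return rows
```

Unlike this sketch, forensic tools read the filesystem inside the image directly, so they can also report deleted files and timestamps that the operating system would not expose.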
  26. Finally, we create a transfer package using the BagIt specification as developed by the Library of Congress and the California Digital Library. To create the packages, we are using the Library of Congress-developed Bagger application. These packages contain the disk images, extracted metadata, and logs generated by the disk imaging software during the acquisition process. The BagIt packages also contain high-level information about the accession. For the time being, we are making a rough connection of one bag per accession, but we realize we may need to modify this depending on the size of the accessions.
  27. This is an overview of a sample bag, showing the structure and high-level metadata. Once packaged, we transfer the package to storage and verify the success of the transfer using procedures for the BagIt specification, which compare the contents of the package against its manifest. If successful, we complete the rest of the documentation and record the success in the media log. We also record the storage location of the transferred package within the Archivists’ Toolkit and add the date of completion.
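The manifest comparison mentioned above can be sketched as follows. A BagIt payload manifest (for example `manifest-sha256.txt`) lists one checksum and one relative file path per line; the sketch recomputes each checksum and reports any file that is missing or altered. It is a partial check under stated assumptions, not a full BagIt validator: it does not confirm that every payload file appears in the manifest, and the function name `verify_manifest` is hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images do not fill memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(bag_dir, algorithm="sha256"):
    """Return a list of (path, problem) pairs; an empty list means success."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, relative_path = line.split(None, 1)
        relative_path = relative_path.strip()
        payload_file = bag / relative_path
        if not payload_file.is_file():
            failures.append((relative_path, "missing"))
        elif file_digest(payload_file, algorithm) != expected.lower():
            failures.append((relative_path, "checksum mismatch"))
    return failures
```

An empty result would correspond to recording a successful transfer in the media log; anything else would prompt a re-transfer or investigation.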
  28. The SAA definition of description puts emphasis on minimizing the amount of handling; it needs to be updated to consider preservation actions due to file format obsolescence, etc.
  29. - reasonable return for our effort for us to describe the ‘project’ and indicative content that we held
  30. What is sensitive will vary from collection to collection. It may be personal information (social security number, personal e-mail address, mobile number, etc.), but it could also be the discussion behind a decision (Larkin 25 funding).
  31. As a result of their experiences tackling arrangement and description, the AIMS digital archivists defined the requirements for a new tool:
      - designed to work with technical and professional standards
      - use drag'n'drop to create an intellectual arrangement; this changes a relationship between digital assets (the asset doesn’t move) using Fedora "sets"
      - rights and permissions applied to a single file, a discrete series, or the entire collection
  32. “the systems and workflows that make processed or unprocessed material and the metadata that support it available to users.” Discovery and access is also not possible without completion of many of the prior steps described in this model. The outcomes of those steps have a significant impact on what is either appropriate or achievable in terms of discovery and access. Given the impact of these prior steps on discovery and access, it is crucial to consider the desired outcomes for discovery and access as early as possible, ideally during the Collection Development phase, and to continue to update and revise these plans as work on the collection progresses.
  33. Overall though, we have three major goals in discovery and access. The first is to make material available to user communities. This includes ensuring that the users can find the material, understand if it’s available, and get access to it if possible. However, that access must follow guidelines for access restrictions related to privacy and intellectual property. An overarching goal of all three is to ensure that the significant properties of the material are inherent in whichever form the delivery takes.
  34. We plan to deliver the Stephen Jay Gould papers in the Hypatia platform. Hypatia is a Fedora platform. In Hypatia, we have one EAD for the hybrid collection; Series 6 is for born-digital material. We provide a link for people to go to an interface where they can browse and perform full-text search on the born-digital material of the papers.
  35. Facets: Subjects, Teaching materials, Books, Articles, NSF reports
  36. Convert files in obsolete file formats, such as WordPerfect, to HTML. Otherwise, people have to download the files and either find a viewer or create an emulated environment to view them.
  37. Discovery and access is also not possible without completion of many of the prior steps described in this model. Some institutions accept only certain file formats.
  38. Researchers may also need to bookmark or label the files they found.
  39. In addition to Hypatia, mentioned above, Stanford is also trying to use FTK (the software we use for processing) to deliver born-digital materials. One of the features of FTK that I believe will interest researchers is the ability to generate fuzzy hashes. Files with the same hash are the same in content. What about similar files? A fuzzy hash gives you information about how close files are. Full-text search. How many characters are misspelt. Fuzzy hashing is a tool which provides the ability to compare two different files and determine a fundamental level of similarity. This similarity is expressed as a score from 0 to 100. The higher the score reported, the more similar the two pieces of data. A score of 100 would indicate that the files are close to identical. Alternatively, a score of 0 would indicate no meaningful common sequence of data between the two files.
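To make the idea of a 0-100 similarity score concrete, here is a minimal Python sketch. It is not fuzzy hashing and not what FTK computes: it swaps in the standard library's `difflib.SequenceMatcher`, which compares the two byte sequences directly rather than comparing compact hashes. The function name `similarity_score` is an assumption.

```python
from difflib import SequenceMatcher


def similarity_score(data_a, data_b):
    """Score two byte strings from 0 (nothing shared) to 100 (identical)."""
    matcher = SequenceMatcher(None, data_a, data_b, autojunk=False)
    return round(matcher.ratio() * 100)


draft = b"The quick brown fox jumps over the lazy dog."
revision = b"The quick brown fox jumped over the lazy dogs."
score = similarity_score(draft, revision)  # high, but below 100
```

Direct comparison like this is slow on large files, which is one reason a real tool compares short fuzzy hashes instead of the full contents.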
  40. I mentioned before that the goal of D&A is to ensure that the significant properties of the material are inherent in whichever form the delivery takes. For design files, I believe a virtual machine (VM) is the appropriate platform. I have built a virtual machine containing some design files with the associated fonts. People want to know the exact fonts, font spacing, etc. used. They don’t have the fonts, so even if they download the file, they cannot recreate its appearance. The virtual machine was created using Parallels Desktop.
  41. How to deliver 50,000 emails? I worked with a colleague at Stanford to produce a network graph of 50,000 emails. Name of the network software: Gephi, an open-source software for visualizing and analyzing large network graphs.
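A network graph like this starts from an edge list: who wrote to whom, and how often. The sketch below counts sender-recipient pairs and writes them to a CSV file with `Source`, `Target`, and `Weight` columns, a layout that Gephi's spreadsheet import can read as an edge table. The input shape (a list of sender and recipient-list pairs), the function names, and the sample addresses are assumptions; the notes do not describe how the Stanford graph was actually prepared.

```python
import csv
from collections import Counter


def build_edges(messages):
    """Count how many messages went from each sender to each recipient."""
    weights = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            weights[(sender, recipient)] += 1
    return weights


def write_edge_list(weights, csv_path):
    """Write a Source/Target/Weight edge table, one row per correspondent pair."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target", "Weight"])
        for (source, target), weight in sorted(weights.items()):
            writer.writerow([source, target, weight])


# Hypothetical sample data standing in for parsed email headers.
sample = [
    ("alice@example.org", ["bob@example.org", "carol@example.org"]),
    ("alice@example.org", ["bob@example.org"]),
    ("bob@example.org", ["alice@example.org"]),
]
edges = build_edges(sample)
```

Collapsing repeated messages into a weight keeps the file small even for tens of thousands of emails, and the weight can then drive edge thickness in the visualization.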
  42. I was very lucky to meet a Computer Science candidate at Stanford, Sudheendra Hangal. Email visualization tool for sentiment analysis. Psychology literature is used to define what words constitute happiness, love, etc. Topic analysis using software.
  43. Annotation, see individual email