Approaching Archival Authenticity when “Records” become “Data”
Rebecca Grant, Digital Archivist, Digital Repository of Ireland
Dolores Grant, DRI-IRL Digital Archivist, Digital Repository of Ireland
Dr. Sharon Webb, Knowledge Transfer Manager, Digital Arts & Humanities PhD
Programme
Dr. Sandra Collins, Director, Digital Repository of Ireland
The Digital Repository of Ireland
DRI is a trusted digital repository for Humanities and
Social Sciences data – launched June 2015
Linking and preserving the rich collections held by Irish
institutions (archives, museums, libraries, galleries,
universities, research projects etc)
Focal point for the development of national guidelines and
policy for digital preservation and access.
repository.dri.ie
Irish Record Linkage project 1864-1913
Irish Record Linkage is an Irish Research Council funded project running
from 2014 – September 2015
Collaboration between the University of Limerick (medical historians),
the Digital Repository of Ireland at the Royal Irish Academy (archivists!),
and Insight@NUI Galway (knowledge engineers, Linked Data experts)
Constructing a Knowledge Platform – Linked Data based on Vital
Registration Data (digitised registers of Births, Marriages and Deaths) in
order to answer research questions around infant and maternal
mortality
Irish Record Linkage and Linked Data Queries
• How many women died within 42 days following childbirth due to
complications related to labour and how does that figure correspond
with the official reports?
• Which women died of causes that can be attributed to maternal death,
but for which no corresponding birth certificate exists?
• How did various socio-economic conditions affect maternal and infant
mortality rates?
The General Register Office (GRO) – civil registry responsible for recording
information on births, deaths and marriages.
Records of 6,009,781 births (from 1864 to 1912), 4,314,963 deaths (from 1864 to
1912) and 1,443,110 marriages (from 1845 to 1912) transferred to the project
team with strict terms and conditions.
Events were captured on register pages (up to 10 for births and deaths, and up to 4
for marriages) divided by district and sent to the GRO where volumes were then
created and an index compiled.
Database dump of the GRO's database with digitised versions of the
register pages and indexes (TIFFs)
General Register Office records
Data (eg. database records and TIFFs) are only stored for the duration of the
project, and must be destroyed following its completion
Data can only be accessed by the IRL project team after an access agreement has
been signed
Records cannot be duplicated, downloaded or brought off-site
Personal, identifying information cannot be published
Copyright and related rights remain vested in the General Register Office.
Terms of transfer
Birth records with redactions
The IRL project team are not the data
owners.
The security and authenticity of the
dataset were critical to the success
of the project.
The Linked Data Concept
A method of publishing structured data on the Web,
allowing it to be connected and enriched, and facilitating
linking between related resources.
Linked Data standards such as RDF allow semantic
definitions to be applied to information, using statements
called ‘triples’ in the form subject, predicate, object.
A key principle of Linked Data is that HTTP URIs are used to
name the semantic elements of the dataset
The Linked Data Concept
The example above describes the subject (James Joyce) and his
relationship (predicate) to an object (Dublin). By semantically
separating the elements of the information (that James Joyce was
born in Dublin) datasets stored in this way can be easily queried.
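As a rough illustration of the triple model described above, the sketch below stores the James Joyce statement as a (subject, predicate, object) tuple and answers a simple pattern query. The `http://example.org/` URIs and the `match` helper are hypothetical and chosen for illustration only; a production system would more likely use an RDF library and a triplestore.

```python
# Hypothetical URIs, for illustration only (not the project's vocabulary).
EX = "http://example.org/"

triples = [
    (EX + "James_Joyce", EX + "bornIn", EX + "Dublin"),
    (EX + "James_Joyce", EX + "name", "James Joyce"),
    (EX + "Dublin", EX + "name", "Dublin"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the given pattern.

    A value of None acts as a wildcard for that position.
    """
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Who was born in Dublin?" corresponds to the pattern (?, bornIn, Dublin).
born_in_dublin = [
    s for (s, _, _) in match(triples, predicate=EX + "bornIn", obj=EX + "Dublin")
]
print(born_in_dublin)
```

Because each element of the statement is held separately, the same store can answer questions by subject, by predicate or by object without any change to the data.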
Competency questions for ontology construction
ID   Competency Question
C01  Women died within 41 days after giving birth (the date of birth counted as day 1 and day 41 is included)
C02  Women died within 41 days after giving birth AND in their death certificate ‘complication 1’ is mentioned.
C03  Women died within 41 days after giving birth AND in their death certificate ‘complication 2’ is mentioned.
C04  Women having official maternal death reports including “XXXX”
C05  Women having official maternal death reports including “cause 1”
C06  Women having official maternal death reports including “cause 2 and cause 3 together”
C07  For each record in C04 find the ones with corresponding birth record (the date of death counted as day 1 and day 41 is included)
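The day-counting rule in C01 can be made concrete with a small sketch. It assumes only what the question states: the date of birth is day 1 and day 41 is still inside the window. The function names and the example dates are invented for illustration and are not taken from the project's code or data.

```python
from datetime import date

WINDOW_DAYS = 41  # from C01; the date of birth counts as day 1


def day_number(birth: date, death: date) -> int:
    """Position of the death date relative to the birth, with the birth date as day 1."""
    return (death - birth).days + 1


def died_within_window(birth: date, death: date, window: int = WINDOW_DAYS) -> bool:
    """True when the death falls on day 1 through day `window`, inclusive."""
    return 1 <= day_number(birth, death) <= window


# Invented example dates, not drawn from the registers.
birth = date(1870, 3, 1)
print(died_within_window(birth, date(1870, 3, 1)))   # death on the day of birth
print(died_within_window(birth, date(1870, 4, 10)))  # last day inside the window
print(died_within_window(birth, date(1870, 4, 11)))  # first day outside the window
```

Stating the inclusive bounds explicitly avoids the off-by-one ambiguity that "within N days" otherwise carries.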
A General Register Office Birth Record, 1870
Linked Data (logainm.ie)
Register TIFF | Index TIFF | System Pre 1900 | System Post 1900
Superintendent Registrar’s District
Registrar’s District | Registration district | District | District
Union
County | County | County
Province | Province
Number in register | Entry number
Date & place of birth | Year of event | Date of birth, year of event
Name (if any) | Name | Forename, Surname | Forename, Surname
Sex | Sex
Name, surname & dwelling place of father
Name & surname & maiden surname of mother | Mother’s maiden name
Rank or profession of father
Signature, qualification, and residence of informant
When Registered | Returns year | Returns year
Returns quarter | Returns quarter
Signature of Registrar
DRI Presentation
Archival authenticity
The quality of being genuine, not a counterfeit, and free from tampering,
and is typically inferred from internal and external evidence, including its
physical characteristics, structure, content, and context.
The presence of a signature serves as a fundamental test for authenticity;
the signature identifies the creator and establishes the relationship between
the creator and the record.
The style and language of the document must be consistent with
other, related documents that are accepted as authentic.
Society of American Archivists
http://www2.archivists.org/glossary/terms/a/authenticity
Archival authenticity
Only records that are complete can ensure accountability and protect
personal rights […] Individual records must be complete; they must contain
all the information they had when they were created. They must also
maintain their original structure and context. (Hirtle)
An authentic record is one that is what it purports to be and has not
been tampered with or otherwise corrupted. (InterPARES 2)
For a record to be considered trustworthy […] it must accurately
reflect the event it records and be uncontaminated by the distorting
influence of time, bias, interpretation, or unwarranted opinion on
the part of the record-maker (MacNeil)
Approaching authenticity for the IRL project
The dataset cannot provide evidence of structure, context, standardised
style, signatures – therefore the data “record” must always be linked to
the TIFF
The “records” transcribed must be complete – all data must be
transcribed, even if it is not currently used to answer our research
questions
The “records” should not be biased by interpretation – each piece of
data should be transcribed faithfully.
Initial data preparation
Final dataset comprises death records from 2 districts in Dublin (South
City no. 1 and South City no. 3)
Separate database constructed to enable the encoding of the IRL records
Tables represent both the register pages and the records (“record” =
historical event)
The register page and record are linked to the index page
The fields created reflect the original record information, and the
structure enables transformation to RDF
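One way to picture the structure described above is a small relational schema in which every record points to its register page, and every register page points to its index page and carries the path of its TIFF. The sketch below uses SQLite purely for illustration; the table names, column names, file paths and values are all hypothetical placeholders, not the project's actual database design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the links
conn.executescript(
    """
    CREATE TABLE index_page (
        id INTEGER PRIMARY KEY,
        tiff_path TEXT NOT NULL
    );
    CREATE TABLE register_page (
        id INTEGER PRIMARY KEY,
        district TEXT NOT NULL,
        tiff_path TEXT NOT NULL,
        index_page_id INTEGER NOT NULL REFERENCES index_page(id)
    );
    CREATE TABLE record (
        id INTEGER PRIMARY KEY,
        register_page_id INTEGER NOT NULL REFERENCES register_page(id),
        entry_number TEXT,
        notes TEXT
    );
    """
)

# Placeholder rows, invented for the example.
conn.execute("INSERT INTO index_page (id, tiff_path) VALUES (1, 'index/0001.tif')")
conn.execute(
    "INSERT INTO register_page (id, district, tiff_path, index_page_id) "
    "VALUES (1, 'South City no. 1', 'register/0001.tif', 1)"
)
conn.execute("INSERT INTO record (id, register_page_id, entry_number) VALUES (1, 1, '17')")


def tiffs_for_record(connection, record_id):
    """Return (register TIFF path, index TIFF path) for one record."""
    return connection.execute(
        """
        SELECT rp.tiff_path, ip.tiff_path
        FROM record AS r
        JOIN register_page AS rp ON rp.id = r.register_page_id
        JOIN index_page AS ip ON ip.id = rp.index_page_id
        WHERE r.id = ?
        """,
        (record_id,),
    ).fetchone()


def can_insert_unlinked_record(connection):
    """Try to store a record whose register page does not exist."""
    try:
        connection.execute("INSERT INTO record (id, register_page_id) VALUES (2, 999)")
    except sqlite3.IntegrityError:
        return False
    return True


print(tiffs_for_record(conn, 1))
print(can_insert_unlinked_record(conn))
```

With the foreign keys enforced, a record cannot be stored without a register page, which mirrors the requirement that the data "record" always remains linked to its TIFF.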
• Whole, authentic record maintained to represent the original
record and preserve context of creation
• Every database record linked to the TIFF image – TIFFs stored in
semi-meaningful arrangement
• Consistent cataloguing practices (dates, square brackets, [sic],
notes field to capture anomalies)
• Paleography
• Controlled vocabulary of death terms and professions
• Archiving databases: preserving content, structure and processes
(RODA toolkit (Repository of Authentic Digital Objects), SIARD
(Software Independent Archiving of Relational Databases))
Data challenges
GRO Triplestore
Triplestore 2 Data Analysis
Transformation from one model to another
• SPIN – SPARQL Inference
• SWRL / RuleML
• SPARQL Construct
• …
Separation of concerns
GRO Records annotation vs. Data Analysis
Separation of concerns – transcription vs interpretation
Variance in how subject names and places were recorded (initials,
shorthand, name of a building versus street name)
might imply something of which we are currently unaware.
Transcription of the register pages captures exactly what was written
down.
Some interpretation is necessary in order to use the data, however – e.g. street
names changing over time, new insights into medical conditions, adoption
of new social theory (e.g. class distinctions)
Data captured in two separate ontologies – one for transcription, one for
interpretation. For example, a death recorded in days in the first database
can be interpreted/queried as hours in the second.
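The days-to-hours example can be sketched as two layers: an immutable transcription that keeps the value and unit as written, and a derived interpretation that is computed from it without ever overwriting it. The class, field names and conversion table below are hypothetical and only illustrate the separation; the project itself expresses this with two ontologies rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscribedDuration:
    """A duration exactly as written on the register page (hypothetical model)."""

    value: int
    unit: str  # the unit as transcribed, e.g. "days"


# Conversion factors used only by the interpretation layer.
HOURS_PER_UNIT = {"hours": 1, "days": 24, "weeks": 24 * 7}


def interpret_as_hours(transcribed: TranscribedDuration) -> int:
    """Derive an hours value without modifying the transcription."""
    return transcribed.value * HOURS_PER_UNIT[transcribed.unit]


# Invented example value, not taken from the registers.
transcription = TranscribedDuration(value=3, unit="days")

# The interpretation keeps a reference back to its source transcription.
interpretation = {
    "source": transcription,
    "duration_hours": interpret_as_hours(transcription),
}
print(interpretation["duration_hours"])
```

Keeping the derived value in a separate structure means an interpretation can be revised later, for example if a conversion rule changes, while the transcription stays exactly as it was captured.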
Register page as EAD (database crosswalk)
Thinking about archival authenticity
Archivist encoded entire register pages rather than lines of data regarding an
individual (eg. a single life event such as a death)
Database records refer back to digitised TIFFs created by General Register Office
Interpretation of the dataset occurs separately – all records are transcribed exactly
including typos, blank fields, details crossed out, Xs etc.
TIFFs can be preserved with EAD metadata, and associated databases preserved
separately and linked
Querying of the data occurs only on an obfuscated dataset with personal names
excluded; linked data can contain outbound links but is protected by a firewall
Authenticity of the dataset
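The idea of querying only an obfuscated dataset can be sketched as a view that leaves out personal-name fields while the full record stays untouched elsewhere. The field names and values here are hypothetical placeholders; the slides do not say how the project's obfuscation was implemented.

```python
# Hypothetical names of the fields that identify a person.
PERSONAL_FIELDS = {"forename", "surname", "informant"}


def obfuscate(record: dict) -> dict:
    """Return a copy of the record with personal-name fields left out."""
    return {key: value for key, value in record.items() if key not in PERSONAL_FIELDS}


# Placeholder record, invented for the example.
full_record = {
    "record_id": 1,
    "forename": "PLACEHOLDER",
    "surname": "PLACEHOLDER",
    "district": "South City no. 1",
    "cause_of_death": "placeholder cause",
}

query_view = obfuscate(full_record)
print(sorted(query_view))
```

Because `obfuscate` builds a new dictionary, the complete transcription is not altered; only the copy exposed to queries omits the names.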
Bibliography
Hirtle, Peter. “Archival Authenticity in a Digital Age”. Authenticity in a Digital
Environment, 2000.
Lee, Brent. Authenticity, Accuracy and Reliability: Reconciling Arts-related and
Archival Literature, 2005.
MacNeil, Heather. “Trusting Records in a Postmodern World”. Archivaria 51,
2001.
Pearce-Moses, Richard. A Glossary of Archival and Records Terminology, 2005.
SIARD Suite:
http://www.bar.admin.ch/dienstleistungen/00823/01911/index.html?lang=en
@beck_grant
@dri_ireland
r.grant@ria.ie
http://repository.dri.ie
The content of this presentation is licensed as CC-BY. Please attribute to Rebecca Grant, Digital
Archivist, Digital Repository of Ireland, 2015.
https://irishrecordlinkage.wordpress.com/

Ā 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
Ā 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
Ā 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
Ā 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
Ā 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
Ā 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Ā 
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Ā 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
Ā 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
Ā 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
Ā 
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļøcall girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
Ā 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
Ā 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
Ā 
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf
Ā 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
Ā 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
Ā 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
Ā 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
Ā 

Rebecca Grant - Approaching Archival Authenticity: when 'Records' become 'Data'

  • 1. Approaching Archival Authenticity when “Records” become “Data”. Rebecca Grant, Digital Archivist, Digital Repository of Ireland; Dolores Grant, DRI-IRL Digital Archivist, Digital Repository of Ireland; Dr. Sharon Webb, Knowledge Transfer Manager, Digital Arts & Humanities PhD Programme; Dr. Sandra Collins, Director, Digital Repository of Ireland.
  • 2. The Digital Repository of Ireland. DRI is a trusted digital repository for Humanities and Social Sciences data – launched June 2015. Linking and preserving the rich collections held by Irish institutions (archives, museums, libraries, galleries, universities, research projects etc.). Focal point for the development of national guidelines and policy for digital preservation and access. repository.dri.ie
  • 3. Irish Record Linkage project 1864-1913. Irish Record Linkage is an Irish Research Council funded project running from 2014 – September 2015. Collaboration between the University of Limerick (medical historians), the Digital Repository of Ireland at the Royal Irish Academy (archivists!), and Insight@NUI Galway (knowledge engineers, Linked Data experts). Constructing a Knowledge Platform – Linked Data based on Vital Registration Data (digitised registers of Births, Marriages and Deaths) in order to answer research questions around infant and maternal mortality.
  • 4. Irish Record Linkage and Linked Data Queries. • How many women died within 42 days following childbirth due to complications related to labour and how does that figure correspond with the official reports? • Which women died of causes that can be attributed to maternal death, but for which no corresponding birth certificate exists? • How did various socio-economic conditions affect maternal and infant mortality rates?
  • 5. General Register Office records. The General Register Office (GRO) – civil registry responsible for recording information on births, deaths and marriages. Records of 6,009,781 births (from 1864 to 1912), 4,314,963 deaths (from 1864 to 1912) and 1,443,110 marriages (from 1845 to 1912) transferred to the project team with strict terms and conditions. Events were captured on register pages (up to 10 for births and deaths, and up to 4 for marriages) divided by district and sent to the GRO, where volumes were then created and an index compiled. Database dump of the GRO's database, with digitised versions of the register pages and indexes (TIFFs).
  • 6. Terms of transfer. Data (e.g. database records and TIFFs) are only stored for the duration of the project, and must be destroyed following its completion. Data can only be accessed by the IRL project team after an access agreement has been signed. Records cannot be duplicated, downloaded or brought off-site. Personal, identifying information cannot be published. Copyright and related rights remain vested in the General Register Office.
  • 7. Birth records with redactions. The IRL project are not data owners. The security and authenticity of the dataset were critical to the success of the project.
  • 8. The Linked Data Concept. A method of publishing structured data on the Web, allowing it to be connected and enriched, and facilitating linking between related resources. Linked Data standards such as RDF allow semantic definitions to be applied to information, using statements called ‘triples’ in the form subject, predicate, object. A key principle of Linked Data is that HTTP URIs are used to name the semantic elements of the dataset.
  • 9. The Linked Data Concept. The example above describes the subject (James Joyce) and his relationship (predicate) to an object (Dublin). By semantically separating the elements of the information (that James Joyce was born in Dublin), datasets stored in this way can be easily queried.
  • 10. Competency questions for ontology construction (ID: Competency Question). C01: Women died within 41 days after giving birth (the date of birth counted as day 1 and day 41 is included). C02: Women died within 41 days after giving birth AND in their death certificate ‘complication 1’ is mentioned. C03: Women died within 41 days after giving birth AND in their death certificate ‘complication 2’ is mentioned. C04: Women having official maternal death reports including “XXXX”. C05: Women having official maternal death reports including “cause 1”. C06: Women having official maternal death reports including “cause 2 and cause 3 together”. C07: For each record in C04 find the ones with corresponding birth record (the date of death counted as day 1 and day 41 is included).
  • 11. A General Register Office Birth Record, 1870
  • 13. Register TIFF Index TIFF System Pre 1900 System Post 1900 Superintendent Registrar’s District Registrar’s District Registration district District District Union County County County Province Province Number in register Entry number Date & place of birth Year of event Date of birth, year of event Name (if any) Name Forename, Surname Forename, Surname Sex Sex Name, surname & dwelling place of father Name & surname & maiden surname of mother Mother’s maiden name Rank or profession of father Signature, qualification, and residence of informant When Registered Returns year Returns year Returns quarter Returns quarter Signature of Registrar Name & surname & maiden surname of mother Rank or profession of father Signature, qualification, and residence of informant Signature of Registrar
  • 14. DRI Presentation. Archival authenticity: The quality of being genuine, not a counterfeit, and free from tampering, and is typically inferred from internal and external evidence, including its physical characteristics, structure, content, and context. The presence of a signature serves as a fundamental test for authenticity; the signature identifies the creator and establishes the relationship between the creator and the record. The style and language of the document must be consistent with other, related documents that are accepted as authentic. (Society of American Archivists, http://www2.archivists.org/glossary/terms/a/authenticity)
  • 15. Archival authenticity. Only records that are complete can ensure accountability and protect personal rights […] Individual records must be complete; they must contain all the information they had when they were created. They must also maintain their original structure and context. (Hirtle) An authentic record is one that is what it purports to be and has not been tampered with or otherwise corrupted. (InterPARES 2) For a record to be considered trustworthy […] it must accurately reflect the event it records and be uncontaminated by the distorting influence of time, bias, interpretation, or unwarranted opinion on the part of the record-maker. (MacNeil)
  • 16. Approaching authenticity for the IRL project. The dataset cannot provide evidence of structure, context, standardised style, signatures – therefore the data “record” must always be linked to the TIFF. The “records” transcribed must be complete – all data must be transcribed, even if it is not currently used to answer our research questions. The “records” should not be biased by interpretation – each piece of data should be transcribed faithfully.
  • 17. Initial data preparation. Final dataset comprises death records from 2 districts in Dublin (South City no. 1 and South City no. 3). Separate database constructed to enable the encoding of the IRL records. Tables represent both the register pages and the records (“record” = historical event). The register page and record are linked to the index page. Fields created reflect the original record information and structure, which enables transformation to RDF.
  • 18. Data challenges. • Whole, authentic record maintained to represent the original record and preserve context of creation • Every database record linked to the TIFF image – TIFFs stored in semi-meaningful arrangement • Consistent cataloguing practices (dates, square brackets, [sic], notes field to capture anomalies) • Paleography • Controlled vocabulary of death terms and professions • Archiving databases: preserving content, structure and processes (RODA toolkit (Repository of Authentic Digital Objects), SIARD (Software Independent Archiving of Relational Databases))
  • 19. Separation of concerns: GRO records annotation vs. data analysis. [Diagram labels: GRO Triplestore, Triplestore 2, Data Analysis] Transformation from one model to another: • SPIN – SPARQL Inference • SWRL / RuleML • SPARQL Construct • …
  • 20. Separation of concerns – transcription vs interpretation. Variance in how subject names and places were recorded (initials, shorthand, name of a building versus street name) might imply something of which we are currently unaware. Transcription of the register pages records exactly what was written down. Some interpretation is necessary in order to use the data, however – e.g. street names changing over time, new insights into medical conditions, adoption of new social theory (e.g. class distinctions). Data is captured in two separate ontologies – one for transcription, one for interpretation. For example, a death recorded in days in the first database can be interpreted/queried as hours in the second.
  • 21. Register page as EAD (database crosswalk)
  • 23. Authenticity of the dataset: thinking about archival authenticity. Archivist encoded entire register pages rather than lines of data regarding an individual (e.g. a single life event such as a death). Database records refer back to digitised TIFFs created by the General Register Office. Interpretation of the dataset occurs separately – all records are transcribed exactly, including typos, blank fields, details crossed out, Xs etc. TIFFs can be preserved with EAD metadata, and associated databases preserved separately and linked. Querying of the data occurs only on an obfuscated dataset with personal names excluded; linked data can contain outbound links but is protected by a firewall.
  • 24. Bibliography. Hirtle, Peter. “Archival Authenticity in a Digital Age”. Authenticity in a digital environment, 2000. Lee, Brent. Authenticity, Accuracy and Reliability: Reconciling Arts-related and Archival Literature, 2005. MacNeil, Heather. “Trusting Records in a Postmodern World”. Archivaria 51, 2001. Pearce-Moses, Richard. A Glossary of Archival and Records Terminology, 2005. SIARD Suite: http://www.bar.admin.ch/dienstleistungen/00823/01911/index.html?lang=en
  • 25. @beck_grant @dri_ireland r.grant@ria.ie http://repository.dri.ie The content of this presentation is licensed as CC-BY. Please attribute to Rebecca Grant, Digital Archivist, Digital Repository of Ireland, 2015. https://irishrecordlinkage.wordpress.com/
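
The subject, predicate, object model from slides 8 and 9 and competency question C01 from slide 10 can be sketched together in a few lines of Python. This is only an illustration under stated assumptions, not the project's ontology or query code: the `http://example.org/irl/` namespace, the `gaveBirthOn` and `diedOn` predicates, the `woman/N` identifiers and all dates are invented for the example. Subjects are opaque identifiers rather than names, in the spirit of the obfuscated dataset described on slide 23.

```python
from datetime import date

# Hypothetical namespace and predicates: placeholders, not the IRL ontology.
EX = "http://example.org/irl/"
GAVE_BIRTH_ON = EX + "gaveBirthOn"
DIED_ON = EX + "diedOn"

# Each statement is a (subject, predicate, object) triple.
# All identifiers and dates below are invented for illustration.
TRIPLES = [
    (EX + "woman/1", GAVE_BIRTH_ON, date(1870, 3, 1)),
    (EX + "woman/1", DIED_ON, date(1870, 4, 10)),       # day 41: included
    (EX + "woman/2", GAVE_BIRTH_ON, date(1870, 3, 1)),
    (EX + "woman/2", DIED_ON, date(1870, 4, 11)),       # day 42: excluded
    (EX + "woman/3", GAVE_BIRTH_ON, date(1870, 5, 5)),  # no death recorded
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def died_within(triples, limit=41):
    """C01: women who died within `limit` days after giving birth.

    The date of birth counts as day 1 and day `limit` is included.
    """
    matches = set()
    for subject, predicate, birth in triples:
        if predicate != GAVE_BIRTH_ON:
            continue
        for death in objects(triples, subject, DIED_ON):
            day_number = (death - birth).days + 1  # date of birth is day 1
            if 1 <= day_number <= limit:
                matches.add(subject)
    return sorted(matches)


print(died_within(TRIPLES))
```

Because the day-counting rule lives in the query rather than in the stored triples, the transcribed dates stay untouched, which is the same split between transcription and interpretation that slide 20 describes.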