The “obsession” with
checksums
Helen Hockx-Yu
Office of Information Technologies
University of Notre Dame
Fixity and Checksums
● Fixity refers to the property of a digital file / object being fixed,
or unchanged
● Checksums are a means for determining fixity, or the sameness
between two copies of a digital object or the same object at
different times, and/or before and after certain events
● Checksums feature prominently in digital preservation
● Checksum-based fixity checking workflows are commonly used by
libraries and archives
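
As a minimal sketch of the idea above, the following compares two copies of a file by digest. It assumes SHA-256 as the checksum algorithm; the slides do not name one. The helper names `file_digest` and `same_content` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first: Path, second: Path) -> bool:
    """True if both files yield the same digest, i.e. fixity holds between them."""
    return file_digest(first) == file_digest(second)
```

The same function covers the "same object at different times" case: store the digest once, recompute it later, and compare the two strings.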
IT use cases
● 5 years at ND, 3 roles
○ Strategy development for digital asset management (including digital archiving
and preservation, in collaboration with the ND Libraries and Archives)
○ Implementation of enterprise Digital Asset Management (DAM) service
○ Now an Enterprise Architect
● Verify software packages before distributing to many endpoints
● Data migration - correctness
● Mostly rely on built-in fault-tolerance features of storage systems
or data management systems for data storage and transfer
integrity
○ Isilon, Globus
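
One way the data-migration correctness check might look, as a sketch only: build a manifest of digests for the source and destination trees, then diff them. The function names and the report layout are assumptions for illustration, not something described in the talk.

```python
import hashlib
import os
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # read_bytes() loads the whole file; acceptable for a sketch,
            # chunked reading would be preferable for large files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(source: dict[str, str], destination: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from, unexpected in, or changed at the destination."""
    shared = set(source) & set(destination)
    return {
        "missing": sorted(set(source) - set(destination)),
        "unexpected": sorted(set(destination) - set(source)),
        "changed": sorted(name for name in shared if source[name] != destination[name]),
    }
```

A migration would count as correct under this sketch when all three lists come back empty.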
Digital Preservation risks
● Bit flips are not a leading cause of
data loss
● “The real culprits are a
combination of human error,
viruses, bugs in application
software, and malicious
employees or intruders. Almost
everyone has accidentally
erased or overwritten a file.”
● 13 threats in Requirements for
Digital Preservation Systems: A
Bottom-Up Approach
A matter of where to spend energy
● Format migration - another prominently featured digital preservation task, on which our thoughts have evolved
○ relaxed approach to format obsolescence - preserving the bits and dealing with format obsolescence if and
when it happens
● We still need to preserve the bits - but does this mean we need to run checksums on everything by ourselves?
○ Can we make use of the hashes that cloud services store for our data? (questions of trust and egress)
○ Pass hash values with upload
● More importantly, what other things have we overlooked that can lead to data loss?
○ Not collecting content
○ Dependency on storage technology (storage intermediary) that
■ Challenges the notion of “redundancy” or “lots of copies”
■ Introduces a single point of failure
■ Holds the unique knowledge of file to object mapping
■ May inadvertently add hidden and invisible files to the storage location
● Scalability and automation
● Work together!
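
The "pass hash values with upload" idea can be sketched as follows: the sender computes a digest locally and sends it alongside the payload, and the receiver recomputes and compares before accepting. Both functions are hypothetical stand-ins; a real cloud service would define its own upload API and accepted algorithms, and SHA-256 is an assumption here.

```python
import hashlib
import hmac


def digest_for_upload(payload: bytes) -> str:
    """Sender side: compute the hex digest to send along with the payload."""
    return hashlib.sha256(payload).hexdigest()


def receive_upload(payload: bytes, claimed_digest: str) -> bool:
    """Hypothetical receiver: accept the upload only if the digests match."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, claimed_digest)
```

If the service stores the verified digest, later fixity checks could in principle compare against it without downloading the data, subject to how far that stored value is trusted.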
