The “obsession” with checksums
Helen Hockx-Yu
Office of Information Technologies
University of Notre Dame
Fixity and Checksums
● Fixity refers to the property of a digital file / object being fixed,
or unchanged
● Checksums are a means for determining fixity, or the sameness
between two copies of a digital object or the same object at
different times, and/or before and after certain events
● Checksums feature prominently in digital preservation
● Checksum-based fixity checking workflows are commonly used by
libraries and archives
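A minimal sketch of a checksum-based fixity check, assuming SHA-256 as the algorithm (the slides do not name one). The function names `file_sha256` and `is_unchanged` are hypothetical and used only for illustration.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """Compare the current digest of a file with a previously recorded one."""
    return file_sha256(path) == recorded_digest.strip().lower()
```

Recording the digest at ingest and calling `is_unchanged` later (or on a second copy) covers the "same object at different times" and "two copies" cases described above.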
IT use cases
● 5 years at ND, 3 roles
○ Strategy development for digital asset management (including digital archiving
and preservation, in collaboration with the ND Libraries and Archives)
○ Implementation of enterprise Digital Asset Management (DAM) service
○ Now an Enterprise Architect
● Verify software packages before distributing to many endpoints
● Data migration - correctness
● Mostly rely on built-in fault-tolerance features of storage systems
or data management systems for data storage and transfer
integrity
○ Isilon, Globus
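One way to check data migration correctness is to compare per-file digests of the source and target locations. The sketch below assumes both are reachable as local directories and uses SHA-256; `build_manifest` and `compare_manifests` are hypothetical names, and the whole-file read is a simplification for small files.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Simplification: reads the whole file into memory at once.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(source, target):
    """Report files missing from, added to, or changed in the target."""
    shared = set(source) & set(target)
    return {
        "missing": sorted(set(source) - set(target)),
        "extra": sorted(set(target) - set(source)),
        "changed": sorted(p for p in shared if source[p] != target[p]),
    }
```

The `extra` list is also where files that were added to the target without being part of the source would show up.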
Digital Preservation risks
● Bit flip is not a top reason for
data loss
● “The real culprits are a
combination of human error,
viruses, bugs in application
software, and malicious
employees or intruders. Almost
everyone has accidentally
erased or overwritten a file.”
● 13 threats in Requirements for
Digital Preservation Systems: A
Bottom-Up Approach
A matter of where to spend energy
● Format migration - another prominently featured digital preservation task, on which our thoughts have evolved
○ relaxed approach to format obsolescence - preserving the bits and dealing with format obsolescence if and
when it happens
● We still need to preserve the bits - but does this mean we need to run checksums on everything by ourselves?
○ Can we make use of the hashes that cloud services store for our data? Considerations: trust and egress
○ Pass hash values with upload
● More importantly, what other things have we overlooked that can lead to data loss?
○ Not collecting content
○ Dependency on storage technology (storage intermediary) that
■ Challenges the notion of “redundancy” or “lots of copies”
■ Introduces a single point of failure
■ Holds the unique knowledge of the file-to-object mapping
■ Hidden and invisible files could be added to the storage location inadvertently
● Scalability and automation
● Work together!
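The "pass hash values with upload" idea can be sketched as follows. The HTTP `Content-MD5` header carries the base64-encoded MD5 digest of the body, and some object storage services accept it so that the receiving side can reject a corrupted upload. Whether a given service honours this header is an assumption to verify against its documentation; `content_md5` and `upload_headers` are hypothetical helper names, and no network request is made here.

```python
import base64
import hashlib


def content_md5(data):
    """Return the base64-encoded MD5 digest of the payload bytes."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def upload_headers(data):
    """Build request headers that carry the payload's digest with the upload."""
    return {
        "Content-Length": str(len(data)),
        "Content-MD5": content_md5(data),
    }
```

Sending the digest along with the data lets the storage service do the comparison at write time, which is one way to avoid downloading content again later just to checksum it.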
