Preservation Capability Miscellany
By Ross Spencer
Twitter: @beet_keeper
A brief ‘provenance’ note…
2014-06-20: Play It Again Conference Report:
http://bit.ly/2d8Bnw0
(playitagain.org)
2014-11-25: The Reality of Digital Transfer:
http://bit.ly/2ctxocQ
(slideshare.net)
We (Archives NZ) have got quite far… But
there's still a lot more to do…
So let's remind ourselves: What is the point?
● Work in concert with agencies and their consultants.
● Generate better information and records management
● Cleaner transfers...
● Create a more open and transparent government where the digital record is
concerned...
● DIA’s line... Support New Zealanders to build strong communities by providing
access to trusted information and knowledge.
And! Digital Preservation
● At this point in time, idiomatic methods of preservation are still forming...
● Whatever the future of archival custodianship...
● Or the future of digital preservation...
● Techniques need to be developed to support agencies with information and records
management, and memory institutes with long-term custodianship.
● Don't fall into the processing trap...
What can we identify as important?
● Infrastructure/team, supported by the organisation
● Some things work, some don’t; some change... be flexible.
● Work iteratively...
● Look at what you can do...
● Continue to develop... evidence, real use-cases
Is it all there for us..?
No, but we have a good foundation…
Policy...
●Has been a constant in my time here.
●Was a draw to me starting in NZ
●Sets the rules by which we can play…
●Literally, play: bend don’t break
● Achieved through careful stakeholder consultation and consideration of
impact.
●Sign-off process at director level.
●Two favourite policies: checksum and pre-conditioning.
Team...
●We could always do with more people…
●But we recognise that we've been allowed more folk dedicated to this
than some places.
●The team is supported in their decision making and their skills.
●Breakdown: Curious; driven; up-to-date; drive to ‘solve’ born-digital
transfer; different but complementary skills… *passion*!
●(And opinionated! ;-) )
●It doesn’t always look that way but there is a certain amount of leeway
from IT support too...
Technology...?
Rosetta by Ex Libris is our long-term preservation system. It allows us to manage some
quite complex bits 'n' pieces… but:
●Does not yet enable transfer from Agency-to-Archives (it supports)
●Is not a clearing house for records
●Does not spot preservation risks up-front
●Doesn't 'do' sentencing…
●Does not build ingest packages…
●Does not 'do' archival description...
●Does not contain every tool under the sun to handle all the file formats…
Machine Learning: http://nautil.us/blog/the-fundamental-limits-of-machine-learning
The processes we need are biased toward transfer
and ingest…
Rosetta can only help so much…
||----------------||---------------------------------------------------------------------------------------------------||
Creation Transfer (Life of a record ~25 years) Life of an archive ~∞
The other processes we will still need will be
about (active) long term custodianship…
Rosetta is still only beginning that journey...
The miscellany in this presentation...
A story about the tools that can help us...
● Technical Registries (of practice)
● DROID/Siegfried Analysis Report
● Fuzzy Hashes
With everything we need to do…
We cannot action it all at the same time...
Knowledge needs to remain alive and accessible, record it:
Source: https://commons.wikimedia.org/wiki/Category:Kanban#/media/File:Simple_Task_Kanban.jpg
Trello: is one option...
Features...
● Kanban
● Teams
● Ownership
● Visibility
● Accessibility
● Reduce transitory records
● Create temporality
● Centralize knowledge
● Invite external colleagues
DROID/Siegfried Analysis Report
● Example of changing needs and capability
● Initially a plain-text reporting tool
● Evolved into a 'team' tool…
● Evolving into an organisation’s tool…
● Hopefully a community tool…
● Our first port of call for any transfer...
* Marriage of DROID and Siegfried: http://bit.ly/2ddS0IP
* A little bit more about the tool: http://bit.ly/2dii3jP
DROID/Siegfried Analysis Report
● Available to all the community (December 2013): http://bit.ly/2cB8gFY
● Maps DROID and Siegfried output to an SQLite database for querying power and speed.
● Aside from Python, ZERO-dependencies – user needs to be able to download it and go...
● Complete flexibility over output.
● TXT, HTML, Rogues, Heroes… Normalization via database layer – write your own!
● Normalization via database layer – abstracted for multiple ID tools
● The tools each do what they're supposed to well, the dissection of output can be left to others.
* Marriage of DROID and Siegfried (OPF Blog): http://bit.ly/2ddS0IP
* A little bit more about the tool (OPF Blog): http://bit.ly/2dii3jP
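To illustrate why mapping identification output to SQLite helps, here is a minimal sketch using only the Python standard library. The CSV rows, the trimmed set of columns and the function names are assumptions made for illustration; they are not the tool's actual schema or code.

```python
import csv
import io
import sqlite3

# Hypothetical, heavily trimmed DROID-style CSV export (illustrative rows only).
DROID_CSV = """NAME,SIZE,EXT,PUID,FORMAT_NAME
report.pdf,20480,pdf,fmt/18,Acrobat PDF 1.4
minutes.doc,35840,doc,fmt/40,Microsoft Word Document
notes.txt,512,txt,x-fmt/111,Plain Text File
scan.pdf,81920,pdf,fmt/18,Acrobat PDF 1.4
mystery.bin,1024,bin,,
"""


def load_report(csv_text):
    """Load identification results into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files "
        "(name TEXT, size INTEGER, ext TEXT, puid TEXT, format_name TEXT)"
    )
    rows = [
        # Empty identification fields are stored as NULL so they are easy to query.
        (r["NAME"], int(r["SIZE"]), r["EXT"], r["PUID"] or None, r["FORMAT_NAME"] or None)
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def format_frequency(conn):
    """Count files per PUID, most common first."""
    return conn.execute(
        "SELECT puid, COUNT(*) AS n FROM files WHERE puid IS NOT NULL "
        "GROUP BY puid ORDER BY n DESC, puid"
    ).fetchall()


def unidentified(conn):
    """List files that were not assigned a PUID."""
    cursor = conn.execute("SELECT name FROM files WHERE puid IS NULL ORDER BY name")
    return [name for (name,) in cursor]


conn = load_report(DROID_CSV)
print(format_frequency(conn))
print(unidentified(conn))
```

Once the results sit in a database, each new report section is just another query, which is one way to read "write your own!".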
● Plain-text example...
● HTML Example…
Let’s have a look…
http://bit.ly/2dircst
Benefits...
● Sets a baseline for a lingua franca… beginners and experts
alike...
● Definitions contributed by our archivists!
● Easier on the eye
● Re-factored to be more flexible
● Give it a try! Let us know how it goes!
Checksums
● Look like:
– MD5: d41d8cd98f00b204e9800998ecf8427e
– SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709
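The two values above are the MD5 and SHA-1 digests of zero-length input, so they are easy to reproduce. A minimal sketch with Python's standard `hashlib` module (the helper name `checksums` is made up for this example):

```python
import hashlib


def checksums(data: bytes) -> dict:
    """Return the MD5 and SHA-1 hex digests of a byte string."""
    return {
        "MD5": hashlib.md5(data).hexdigest(),
        "SHA1": hashlib.sha1(data).hexdigest(),
    }


# Hashing empty input reproduces the values shown on the slide.
for name, digest in checksums(b"").items():
    print(f"{name}: {digest}")
```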
Checksums
● Looking to be unique
– De-duplication
– Fixity
● No connection between the checksums of similar files
– Security function
– Cannot reverse
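The points above can be sketched in a few lines of Python. The file names and contents are hypothetical, and the helper functions are illustrative only. Identical content gives identical checksums (de-duplication), a stored checksum confirms that nothing has changed (fixity), and a one-character edit gives an unrelated-looking digest.

```python
import hashlib
from collections import defaultdict


def sha1(data: bytes) -> str:
    """Return the SHA-1 hex digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical in-memory "files": two identical copies and one near-duplicate.
FILES = {
    "a/report.txt": b"Annual report 2016",
    "b/report-copy.txt": b"Annual report 2016",
    "c/report-draft.txt": b"Annual report 2015",
}


def duplicates(files):
    """Group paths whose content has exactly the same checksum."""
    groups = defaultdict(list)
    for path, data in files.items():
        groups[sha1(data)].append(path)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)


def fixity_ok(data: bytes, recorded_checksum: str) -> bool:
    """Check content against the checksum recorded at transfer."""
    return sha1(data) == recorded_checksum


print(duplicates(FILES))

recorded = sha1(FILES["a/report.txt"])
print(fixity_ok(FILES["a/report.txt"], recorded))
print(fixity_ok(FILES["c/report-draft.txt"], recorded))

# The near-duplicate differs by one character, yet the digests look unrelated.
print(sha1(FILES["a/report.txt"]))
print(sha1(FILES["c/report-draft.txt"]))
```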
But every file has a connection...
● Binary
● File Format
● Textual Content
● Embedded Content
● Template
● Author
● Like DNA, with many different strands to dissect...
● Fuzzy Hashing!
Fuzzy Hashing: SSDEEP
Source: https://github.com/KLDavies/ssdeep/
Fuzzy Hashing: tlsh
Source: https://github.com/trendmicro/tlsh
And they look like...
● aad371039d588b43e02887f87e570f6d2b1a7f1da89667ef11227d9b3e706610d8e12d
● 0dc36013dd088b43e02983f87e534e6d2b1a7f1da88627ef11267d8b3e716610d9e16d
● Not that different from regular checksums!
● But help us to demonstrate a closer relationship between files…
● “The sum of the parts is greater than the whole.”
~ Arist!otle
Which we're about to find out!
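As a rough illustration of that closer relationship, the sketch below counts the positions at which the two digests shown above agree. Their length (70 hex characters) is consistent with TLSH output, though the slides do not say which tool produced them. Counting shared positions is a deliberately naive stand-in: it is not how TLSH or ssdeep score similarity. The MD5 inputs are hypothetical and are included only for contrast.

```python
import hashlib

# The two digests from the slide, with the line breaks removed.
DIGEST_A = "aad371039d588b43e02887f87e570f6d2b1a7f1da89667ef11227d9b3e706610d8e12d"
DIGEST_B = "0dc36013dd088b43e02983f87e534e6d2b1a7f1da88627ef11267d8b3e716610d9e16d"


def shared_positions(a: str, b: str) -> int:
    """Count positions where two equal-length hex digests agree (naive measure)."""
    if len(a) != len(b):
        raise ValueError("digests must be the same length")
    return sum(x == y for x, y in zip(a, b))


fuzzy_shared = shared_positions(DIGEST_A, DIGEST_B)
print(f"fuzzy digests agree at {fuzzy_shared} of {len(DIGEST_A)} positions")

# Contrast: conventional checksums of two inputs that differ by one character.
md5_a = hashlib.md5(b"Annual report 2016").hexdigest()
md5_b = hashlib.md5(b"Annual report 2015").hexdigest()
md5_shared = shared_positions(md5_a, md5_b)
print(f"MD5 digests agree at {md5_shared} of {len(md5_a)} positions")
```

In practice the comparison would be left to the fuzzy hashing tool itself, which defines its own distance or match score.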
Workshop!
Results!
How can we use this?
● Sentencing... while still teaching our machines, we can still close
the net while looking at records manually…
● Discovery: Amazon-like results: “You might also like this record!”
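A "you might also like" lookup can be sketched as a ranking of records by similarity to a chosen record. The example below uses Jaccard similarity over character 4-grams as a simple stand-in for a fuzzy hash comparison. The record names, their contents and the scoring method are all assumptions for illustration, not the method used in the workshop.

```python
# Hypothetical records: two versions of the same minutes and an unrelated invoice.
RECORDS = {
    "minutes-v1.txt": "Minutes of the records management committee meeting held in Wellington",
    "minutes-v2.txt": "Minutes of the records management committee meeting held in Auckland",
    "invoice.txt": "Invoice number 4471 for scanning equipment and storage media",
}


def shingles(text: str, size: int = 4) -> set:
    """Break text into the set of overlapping character n-grams it contains."""
    text = text.lower()
    return {text[i:i + size] for i in range(len(text) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of n-grams the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_records(name: str, records: dict) -> list:
    """Rank every other record by similarity to the named one, best first."""
    target = shingles(records[name])
    scores = [
        (other, jaccard(target, shingles(text)))
        for other, text in records.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


for other, score in similar_records("minutes-v1.txt", RECORDS):
    print(f"{other}: {score:.2f}")
```

The same ranking could support sentencing: records that score highly against one already appraised by hand are good candidates to look at next.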
The experiment continues...
● Matches are relative to themselves...
● Algorithms make a difference...
● And perhaps, like genetics... some traits are more dominant than
others...
● Consider working with content in different ways...
– Utilize format bias... normalize
– Separate content from structure and analyse?
● Keep trying things, but at minimum cost... (another agile concept:
minimum viable product)
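The idea of separating content from structure can be sketched with the standard library. Here the same hypothetical text exists as HTML and as plain text. Comparing the raw files is skewed by the markup, while comparing the extracted text shows the content is the same. The regular-expression tag stripping and the use of `difflib` are simplifications chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical example: one piece of content stored in two formats.
HTML_VERSION = (
    "<html><body><h1>Transfer report</h1>"
    "<p>All 120 files were identified.</p></body></html>"
)
TEXT_VERSION = "Transfer report\nAll 120 files were identified.\n"


def extract_text(document: str) -> str:
    """Crudely strip tags and collapse whitespace to leave only the text."""
    without_tags = re.sub(r"<[^>]+>", " ", document)
    return " ".join(without_tags.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a, b).ratio()


raw_score = similarity(HTML_VERSION, TEXT_VERSION)
normalised_score = similarity(extract_text(HTML_VERSION), extract_text(TEXT_VERSION))

print(f"raw files:      {raw_score:.2f}")
print(f"extracted text: {normalised_score:.2f}")
```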
Conclusion: A bit more miscellany
●Keyword: Interim
●Our needs change constantly, and there's a lot to do…
●Don't suffer paralysis by analysis.
●Do a requirements analysis
●Look at what you can do (minimum viable product) and iterate...
Conclusion: A bit more miscellany
●Lots of hints to bits 'n' pieces I haven't been able to talk about:
●Role of the community… (They/We're here to help! Same problems!)
●Communication and sharing… (Do it!)
●Software development skills… (There are other ways to be involved)
What's the point? (OPF Blog): http://bit.ly/2ddXnaY
●Maybe also a seed for discussion.
Thank you!

ASA Trial Workshop Slides for Archives NZ [2016-09-28]
