This document provides an overview of digital preservation planning and implementation tools. It begins by arguing that digital preservation poses many of the same challenges as traditional preservation, such as planning, risk assessment, and materials quality. It then outlines several tools for planning and assessment, including TRAC and DRAMBORA, as well as file format identification tools. Finally, it discusses geographic distribution, repository platforms, and records management, and stresses that inaction is the greatest threat, so the most important thing is to act.
Lecture for LIS 644 "Digital Trends, Tools, and Debates." Not my strong point, so I won't swear there are no errors. If you reuse, please respect the CC-BY-NC-SA license on the photo.
Taming the Monster: Digital Preservation Planning and Implementation Tools
1. Taming
the Monster
Digital Preservation Planning
and Implementation Tools
Dorothea Salo
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
One System, One Library
WorldIslandInfo.com / CC-BY 2.0
2 June 2011
2. Why is this
so scary?
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
3. Isn’t this just
as scary?
Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
4. Yet we
persevere.
Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
5. DIGITAL IS NO
DIFFERENT.
Photo: “559 - The Matrix - Seamless Texture”
http://www.flickr.com/photos/zooboing/4335531915/
Patrick Hoesly / CC-BY 2.0
6. Many of the same ideas apply...
• Planning and policy
• Risk assessment
• Risk management
• (knowing that we can’t save everything)
• Materials quality matters!
• Problem discovery and remediation
• Crisis management
• Chief problems: staff, $$$, organizational
commitment
Photo: “Where I Teach”
http://www.flickr.com/photos/eklektikos/2541408630/
Todd Ehlers / CC-BY 2.0
7. Planning and
assessment
tools
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
8. Scene-setting
• Rosenthal, David. “Requirements for Digital
Preservation Systems: A Bottom-Up Approach.”
• http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html
• If you’re new to this, or trying to find your
feet, this is the best short introduction I
know.
• The list of threats is outstanding.
Photo: “Bottoms Up! - Duck; San Anton Gardens, Malta”
http://www.flickr.com/photos/foxypar4/3123113762/
John Haslam / CC-BY 2.0
9. TRAC
• “Trustworthy Repositories Audit & Certification: Criteria and Checklist”
• Despite the name, covers a LOT more than
the technology!
• Budget
• Staffing
• “designated communities”
• CRL will audit you, if you like
• (don’t, unless you’re really serious!)
• http://catalog.crl.edu/record=b2212602~S1
10. DRAMBORA
• Digital Repository Audit Method Based on
Risk Assessment
• A “self-test,” if you will.
• DRAMBORA is equally good as a pre- or post-test.
• Personally, I prefer DRAMBORA to TRAC,
especially for those just starting out.
• http://www.repositoryaudit.eu/
• (registration required for toolkit access)
11. Coping with
file formats
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
12. The one acronym you
need to know: FITS
• “File Information Tool Set”
• (you need to know this; otherwise it’s hard to Google)
• Wrapper for several file-format detector
software packages
• Intended to be baked into other software
• It’s early days yet!
• (This means you can’t always trust what the tools tell
you, especially when they’re telling you about errors.)
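The wrapper idea can be illustrated with a toy sketch. This is not FITS's code or its interface; the two detector functions below are hypothetical stand-ins for real format-identification tools, and the lookup tables are deliberately tiny. It shows only the general pattern: ask several detectors, then flag disagreement rather than trusting any single answer.

```python
from collections import Counter

def detect_by_extension(name, data):
    """Hypothetical detector: guesses from the file extension alone."""
    table = {".pdf": "PDF", ".png": "PNG", ".txt": "plain text"}
    for ext, fmt in table.items():
        if name.lower().endswith(ext):
            return fmt
    return None

def detect_by_signature(name, data):
    """Hypothetical detector: guesses from the leading bytes."""
    if data.startswith(b"%PDF-"):
        return "PDF"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    return None

def identify(name, data, detectors=(detect_by_extension, detect_by_signature)):
    """Run every detector and report whether their answers agree."""
    answers = {d.__name__: d(name, data) for d in detectors}
    votes = Counter(a for a in answers.values() if a is not None)
    return {
        "answers": answers,
        # Only commit to a format when every detector that answered agrees.
        "format": next(iter(votes)) if len(votes) == 1 else None,
        "conflict": len(votes) > 1,
    }
```

A file named `scan.pdf` whose bytes begin with the PNG signature comes back with `conflict` set, which is the cue for a human to look at it.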
13. What’s this file?
• wotsit.org “The Programmer’s File and
Data Resource”
• Directory of file extensions
• When in doubt: open in a browser or text
editor and see what you get.
• N.b.: Microsoft Word is NOT a text editor!
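The "open it and see what you get" check can also be done from a script. A minimal sketch using only the Python standard library; the function name `peek` and the 16-byte default are my own choices, not part of any tool named here. It returns the first bytes of a file as hex and as text, with unprintable bytes shown as dots.

```python
def peek(path, n=16):
    """Return the first n bytes of a file as (hex string, printable text)."""
    with open(path, "rb") as f:
        head = f.read(n)
    hex_view = " ".join(f"{b:02x}" for b in head)
    # Show printable ASCII as-is and everything else as a dot.
    text_view = "".join(chr(b) if 32 <= b < 127 else "." for b in head)
    return hex_view, text_view
```

A file whose text view starts with `%PDF-` is very likely a PDF, whatever its extension claims.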
14. Solving the
geographic
distribution
problem
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
15. What problem, now?
• The “all your eggs in one basket” problem.
• If all your bits are on one server, and the server room
is flooded, or your town is nuked—oops.
• Not the same as backups!
• Don’t get me wrong, backups are important!
• Backups are SHORT-TERM, and usually LOCAL.
Geographic distribution (plus associated auditing) is
intended for the long term.
• Don’t forget auditing!
Photo: “Nido”
http://www.flickr.com/photos/italintheheart/3679974298/
Jorge Elías / CC-BY 2.0
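The auditing half of this can be sketched in a few lines. This illustrates the general idea only, not the protocol of any particular system: it assumes each site can hand back the bytes of its copy, hashes them with SHA-256, and treats the most common digest as the presumed-good one. The site names and the `audit_copies` function are hypothetical.

```python
import hashlib
from collections import Counter

def audit_copies(copies):
    """copies maps a site name to the bytes of that site's copy.

    Returns (majority_digest, sites_that_disagree). With no strict
    majority, majority_digest is None and every site is suspect.
    """
    if not copies:
        return None, []
    digests = {site: hashlib.sha256(data).hexdigest()
               for site, data in copies.items()}
    top, top_count = Counter(digests.values()).most_common(1)[0]
    if top_count * 2 <= len(copies):
        return None, sorted(digests)
    return top, sorted(s for s, d in digests.items() if d != top)
```

With three copies, one damaged copy is outvoted by the other two. With only two copies, a mismatch tells you something is wrong but not which copy to trust, which is one argument for keeping more than two.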
16. LOCKSS
• Lots of Copies Keeps Stuff Safe!
• (There is also Portico, but Portico only works with
e‑journal content.)
• Open-source software that handles replication and
(some) auditing.
• “Private LOCKSS network”
• A group of institutions agrees to build a LOCKSS
network just for the stuff they’re interested in.
• ASERL does this for ETDs. Many institutions
(including UW-Madison) participate in a PLN for
govdocs.
17. “The cloud”
• Typical cloud-based storage services make
NO promises they won’t lose your stuff.
• And for large quantities of data, bandwidth can become
an issue.
• And can they look at your stuff? Should they be able to?
• Some early movers in this market are fading
• Iron Mountain had to kill their service.
• DuraCloud
• trying to finesse this issue by negotiating tougher SLAs
with cloud-storage providers
Photo: “Sky View From Humboldt Park”
http://www.flickr.com/photos/purpleslog/2589612577/
Purple Slog / CC-BY 2.0
18. Repository
and digital-library
platforms
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
19. Friendly word
of advice:
PICK
SOFTWARE
LAST.
Photo: “Briana Calderon; future educator of america.”
http://www.flickr.com/photos/46132085@N03/4703617843/
Arielle Calderon / CC-BY 2.0
20. Another friendly word of
advice:
DON’T CHASE
THE SHINY.
Photo: “Sparkle Texture”
http://www.flickr.com/photos/abbylanes/3214921616/
Abby Lane / CC-BY 2.0
21. Digital-library software
• Is almost always VERY BAD at digital
preservation!
• (most packages don’t even try!)
• So if a file gets corrupted on the server, or whatever...
no warnings, no restore, nothing. Also, provenance?
Who needs provenance? Event tracking? What’s that?
• I’m not saying don’t use it. I’m saying that
it doesn’t solve this problem.
• In fact, if you’re using this software, you need to solve
this problem FOR IT.
Photo: “National DIGITAL Library”
http://www.flickr.com/photos/schex/193912573/
Jesse Schexnayder / CC-BY 2.0
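One way to solve this problem for the software is to keep your own record outside it. A minimal sketch, assuming the files sit in an ordinary directory tree; the function names and the shape of the event records are my own, not any package's format. It records a checksum per file at ingest, then re-checks later and produces a dated event for every file, including the ones that have gone missing or changed.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def check_fixity(root, manifest):
    """Compare files on disk with the manifest; return event records."""
    root = Path(root)
    now = datetime.now(timezone.utc).isoformat()
    events = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            outcome = "missing"
        elif sha256_of(target) != expected:
            outcome = "changed"
        else:
            outcome = "ok"
        events.append({"file": rel, "event": "fixity check",
                       "outcome": outcome, "when": now})
    return events
```

Storing the manifest and the accumulated events somewhere other than the server being checked gives you both the warning and a rudimentary event history.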
23. Institutional-repository
software
• Is SHOCKINGLY bad at digital preservation!
• (Though sometimes better than most DL software.)
• Examples
• Hosted/commercial: Digital Commons (BePress),
ContentDM, DigiTool
• If you go hosted, you’d better ask about their
digital-preservation practices!
• Open-source: EPrints, DSpace, Fedora
Photo: “IMG_0668”
http://www.flickr.com/photos/12967790@N00/66531124
Robert / CC-BY 2.0
24. A new approach:
curation
microservices
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
25. Do we really need
THE BLOB?
Photo: “giant crystal blob”
http://www.flickr.com/photos/a_of_doom/527905701/
A of DooM / CC-BY 2.0
26. How about a jigsaw
puzzle instead?
• Break the digital-preservation problem
down into parts.
• Code up each part, making sure that it
plays nicely with other parts.
• lots of nice APIs!
• which means other software can adopt/adapt
microservices as well!
• Put parts together as you need them.
Photo: “Lapsana Apogonoides Puzzle”
http://www.flickr.com/photos/gdesigneralex/2313092112/
gdesigneralex / CC-BY 2.0
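As a toy illustration of the jigsaw idea (not any project's actual code; every function name here is hypothetical), each part below does one job and takes and returns the same simple record, so parts can be combined in whatever order a workflow needs.

```python
import hashlib

def add_checksum(record):
    """Part: fixity. Adds a SHA-256 digest of the content."""
    return {**record, "sha256": hashlib.sha256(record["content"]).hexdigest()}

def add_size(record):
    """Part: basic technical metadata."""
    return {**record, "size": len(record["content"])}

def add_format_guess(record):
    """Part: a deliberately crude format guess from the leading bytes."""
    fmt = "PDF" if record["content"].startswith(b"%PDF-") else "unknown"
    return {**record, "format": fmt}

def run(record, parts):
    """Put parts together as you need them."""
    for part in parts:
        record = part(record)
    return record
```

Calling `run(record, [add_checksum, add_format_guess])` uses two of the three parts and skips the third; because every part shares the same plain input and output, other software could call any one of them on its own.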
27. California Digital Library
• Pioneering this approach
• Has open-sourced code for microservices
• Has added microservices together to build
its “Merritt” storage/repository service
28. Escaping the silos:
Fedora Commons
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
29. What is Fedora Commons?
• Blueprints and foundation, not the whole
house (analogy credit to Peter Gorman)
• You build the house you want!
• Or you build condominiums on the same
foundation.
• Need different user interfaces for different materials?
• Need different structures and behaviors?
• No problem! Fedora can handle that.
• (have I run this analogy into the ground yet?)
32. E-records
management
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
33. Axioms
• Records management is
about policy and
procedures.
• If your policy doesn’t fit with
their procedures, guess what
wins? Choose battles wisely.
• There is never enough
storage space.
• Nobody cares until
there’s a crisis.
• Software will not save
you... but it might help!
Photo: “The Never Ending Math Problem”
http://www.flickr.com/photos/acidwashphotography/2967752733/
d3 Dan / CC-BY 2.0
34. Duke Data Accessioner
• Accessioning tool for digital data
• use case: J. Important Scholar dumps her hard drive
on your desk, expects you to cope
• File migrator, metadata manager, GUI,
plugins (e.g. for file-format detection)
• Bit rough, but in production use.
• http://library.duke.edu/uarchives/about/tools/data-accessioner.html
35. Archivematica
• Soup-to-nuts records management and
digital preservation tool.
• Evaluation and accessioning all the way through
preservation actions. (Oddly, they seem to be
missing disposal... but they’re in alpha, so...)
• Open source
• Runs on a Linux server; RMs and archivists log in to
GUI application remotely.
• Normally I hate and fear silos, but this one
is smartly built on microservices.
36. Practical E-Records
• Weblog by Chris Prom and protégés
• Tool evaluations, conference-session
writeups, essays on praxis
• Best reading out there for the
do-it-yourselfer
• If you’re not reading it, why not?
• http://e-records.chrisprom.com/
37. Last thoughts
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
38. If you can’t do everything...
that’s okay. Who can?
Image: “Confused”
http://www.flickr.com/photos/kristiand/3223044657/
Kristian D. / CC-BY 2.0
39. DO SOMETHING.
Photo: “Came hame háááá!”
http://www.flickr.com/photos/kristiand/3223044657/
Guirí R. Reyes / CC-BY 2.0
40. The worst threat?
INACTION.
Photo: “Fatty’s role model”
http://www.flickr.com/photos/cloudzilla/4910616774/
cloudzilla / CC-BY 2.0
41. Thank you!
This presentation is available
under a Creative Commons 3.0
United States license.
Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0