SlideShare a Scribd company logo
1 of 69
Download to read offline
Dan Gillean – Artefactual Systems
ACA Conference 2016 – Montréal, QC
Technologie Proche:
Envisioning the Archival Systems of
Tomorrow with the Tools of Today
https://en.wikipedia.org/wiki/Hermes_%28missile_progra
m%29#/media/File:Hermes_A-1_Test_Rockets_-_GPN-
2000-000063.jpg
This is
going to be
a bit of an
experiment
….
https://en.wikipedia.org/wiki/Nikola_Tesla#/media/File:Nikola_Tesla,_with_his_equipment_Wellcome_M0014782.jpg
What projects, technologies,
or initiatives from the last
couple of years have inspired
you around archives and
technology? What are you
anticipating?
Group Brainstorm:
Google doc: http://bit.ly/tech-Proche
Twitter: #techProche
aboutmodafinil.com via https://commons.wikimedia.org/wiki/File:Brain_01.jpg
http://knowyourmeme.com/memes/i-have-no-idea-what-i-m-doing
Change is inevitable
https://en.wikipedia.org/wiki/Analog_computer#/media/File:Analog_Computing_Machine_GPN-2000-000354.jpg
We can’t future-proof… so let’s be strategic
•Open
•Collaborative
•Decentralized
•Agnostic
•Interoperable
Building systems to be:
data
metadata
https://commons.wikimedia.org/wiki/File:Go_Board,_Hoge_Rielen,_BelgiumEdit_Fcb981.jpg
Centralized, Decentralized, and Distributed
http://www.truthcoin.info/blog/measuring-decentralization/
•Open source
•Open standards
•Existing communities
We need to
leverage what
already exists
http://www.totallylocalvc.com/art-of-teamwork/
The Tools of Today
Rusty tools; Biser Todorov via https://commons.wikimedia.org/wiki/File:Rusty_tools.JPG
Distributed version control and git
Version control:
A system for managing changes to
source data (documents, code, etc)
over time.
Version control systems help you track
who made what change when, and
why. They also allow you to roll back
changes to your data to a previous
state.
https://git-scm.com/
Local version control
•Changes are tracked
on a local computer
•Simple – great for
working alone
•Not effective for
collaboration
https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
Centralized version control
•Single server with all
versioned files; users
“check out” files
•Allows for
collaboration
•Single point of failure
https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
Distributed version control
•Full mirroring of
source with each user
•Broadest potential for
collaboration
•Complex merging,
branching, etc.
possible
https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
Git
•Open source
•Distributed model
•Strong User Community, lots
of free documentation
•Widely used and supported
across platforms
https://xkcd.com/1597/
Git
•Hashes all files in the
repository
•Produces a diff of changes 
•Commit messages explain who
did what when and why
•Supports parallel workflows
with a method for resolving
conflicts
https://github.com/artefactual/atom-
docs/commit/6d8fe4ab9803ec3421b04bcabe215e3dbb4a77dc
Git
•Essentially a (NoSQL) database
– a key/value store
• Objects DB and Refs DB
•Libgit2: git implementation in C
• Supports different backends
•Might git provide a model for
auditable, versioned data
management? https://github.com/artefactual/atom-
docs/commit/6d8fe4ab9803ec3421b04bcabe215e3dbb4a77dcVincent Driessen via http://nvie.com/posts/a-successful-git-branching-model/
Git for Archives?
•Description systems:
• Maintain transparent provenance and manage collaboration
and concurrency – e.g. authority records
•Collaborative preservation environments
• Track transfer/ingest, preservation actions
•Distributed, version-controlled backups
• Reconcile changes via branch management
Peer to Peer computing and BitTorrent
P2P:
A distributed application
architecture that directly links
users as peers with equal
privileges, instead of relying on
a centralized server for
communication and exchange
Early mainframe computing
Lawrence Livermore National Laboratory. IBM 704 Mainframe, via
https://en.wikipedia.org/wiki/Mainframe_computer#/media/File:IBM_704_mainframe.gif
The promise of the web
ARPANET MARCH 1977 via
https://commons.wikimedia.org/wiki/World_Wide_Web#/media/File:Arpanet_logical_map,_march_1977.png
The rise of the cloud
https://pixabay.com/en/network-iot-internet-of-things-782707/
The rise of the cloud
https://pixabay.com/en/network-iot-internet-of-things-782707/
The rise of the cloud
https://pixabay.com/en/network-iot-internet-of-things-782707/
P2P file sharing
http://www.nextinpact.com/dossier/hadopi-nicolas-sarkozy-sacd-snep/1.htm http://popjoust.com/jousts/400
BitTorrent Protocol
•A communications protocol for peer-to-peer file sharing
•Works with any kind/size of files
•Increases speed with increases in users – each user
becomes a potential download source
•Uses cryptographic hashing to segment files into pieces,
so different pieces can be downloaded from different
users and verified against the torrent tracker
BitTorrent Protocol
BitTorrent client .torrent file
BitTorrent Protocol
• Torrent users who share content
are called peers
• A group of peers is called a swarm
• Users who download without also
sharing are called leeches
• Early BT implementations used a
centralized tracker to discover and
connect peers, before they could
start sharing directly
BitTorrent Protocol
DHT:
Distributed Hash Table. A decentralized
lookup system (similar to a hash table)
with key/value pairs.
• Pairs are stored in the DHT
• Any peer can look up the value associated with
a key
• Responsibility for maintaining the mapping
from keys to values is algorithmically distributed
among the nodes to minimize disruption when
nodes appear/depart/fail
BitTorrent Protocol
•Central trackers and DHT
can be used together with
other methods to increase
efficiency
•Other features (e.g.
encryption) can be added
on top
Archival uses for BitTorrent?
https://archive.org/details/bittorrent
Archival uses for BitTorrent?
https://archive.org/details/toronto
Archival uses for P2P and data?
https://www.lockss.org/about/how-it-works/
Archival uses for P2P and data?
http://dat-data.com/
https://github.com/CfABrigadePhiladelphia/jawn
Archival uses for P2P and data?
• Description and access: a networking and data exchange
protocol
• Exchanging authority records between repositories
• Updating portal sites
• Offering content to the public
• Preservation: Sharing supporting technology images
• Operating systems, codecs, tools
• LOCKSS-style approach to preservation and real-time access despite
service interruptions at any one point – distributed copies
Archival uses for P2P and data?
Linkedopendata(LOD)
LinkingOpenDataclouddiagram2014,byMax
Schmachtenberg,ChristianBizer,AnjaJentzschandRichard
Cyganiak.http://lod-cloud.net/
The WWW
The WWW
The WWW ?
https://www.youtube.com/watch?v=cf7Ie6PMZJk
Linked data
A method of expressing and
publishing structured data so
that it can be interlinked and
queried semantically by both
humans and machines.
http://linkeddata.org/
• Turning the web of documents into a
machine-readable web of data
• From the World Wide Web to the Giant
Global Graph
Linked data
Tim Berners-Lee. John S. and James L. Knight Foundation via
https://commons.wikimedia.org/wiki/File:Tim_Berners-Lee-Knight-crop.jpg
4 Principles:
• Use URIs to name (identify) things.
• Use HTTP URIs so that these things can be
looked up (interpreted, "dereferenced").
• Provide useful information about what a
name identifies when it's looked up, using
open standards such as RDF, SPARQL, etc.
• Refer to other things using their HTTP URI-
based names when publishing data on the
Web.
RDF
• Makes statements about resources as expressions
• Uses subject  predicate  object structure to form triples
A standard model for data
interchange on the web
RDF
A standard model
for data interchange
on the web
• Uses URIs to express each
element of the triple
• Can be serialized into
many different formats
RDF
https://www.w3.org/TR/2014/NOTE-rdf11-primer-20140225/
5-star deployment
scheme for open
data
Linked open data
★ Make your stuff available on the Web
(whatever format) under an open license
★ ★ Make it available as structured data (e.g.,
Excel instead of image scan of a table)
★ ★ ★ Make it available in a non-proprietary
open format (e.g., CSV instead of Excel)
★ ★ ★ ★ Use URIs to denote things, so that people
can point at your stuff
★ ★ ★ ★ ★ Link your data to other data to provide
context
http://5stardata.info/en/
Linked open data and archives
Lots of discussion, many existing useful ontologies, and
some implementation in larger portal sites. Few
institutional implementers out there.
• Europeana Data Model (EU), Archives Hub (UK)
• LoC, W3C, Getty; SKOS, ViAF, GeoNames, etc.
• PREMIS  creating an OWL ontology
• PRONOM ontology? Work on hold
• ICA EGAD developing an entity relationship model that can be
expressed semantically?
Linked open data and archives
https://linkedjazz.org/
Linked open data and archives
https://linkedjazz.org/tools/
Combining linked open data, machine learning, and
crowdsourcing into a suite of user friendly tools
Linked open data and archives
• Description and Access
• Our systems should connect users to other related information –
linked data is the way
• Relationship visualizations, queries across institutions, domains
• Crowd-sourced description and reconciling from communities of
interest
• Preservation
• PREMIS ontology + PRONOM ontology + W3C PROV= building
blocks for a community Format Policy Registry (FPR)
• Ability to generate community-wide preservation stats
Blockchain:
A distributed ledger that uses
cryptographic hashing to maintain a
continuously growing record of
transactions, divided into linked,
time-stamped blocks.
Blockchain technology
Jorge González, “Chains reloaded.”July 29, 2011.
https://www.flickr.com/photos/aloriel/6292261464
Bitcoin
• An online, anonymous cryptocurrency
• Allows peer-to-peer transactions without
an intermediary
• Created 2008, released open-source in
January 2009
• Relies on the blockchain:
• Transactions are verified by nodes on the
network
• Nodes are rewarded for verification
• Once transactions are verified, they are added
to a block every 10mins
https://www.flickr.com/photos/jason_benjamin/12795733984
Bitcoin
• A wallet can have as many BTC addresses as desired
• An address is created via public/private key cryptography
User installs a
Bitcoin client (a
wallet), and uses
it to create a
bitcoin address
for a transaction
Bitcoin
Bitcoin
H(a)
H(b)
H(c)
H(d)
H(AB)
H(CD)
HABCD
00002393k242l284q34qrsw
Previous block’s
hash
2016-06-03 09:55:01:003 UTC
Timestamp
55k4903rgw4004j48f2l2024j5
Transactions’
root hash
N(x)
nonce
Merkle tree
Block (candidate)
Previous hash + transaction hash + nonce = new block hash (0000000…)
Blockchain
• Miners (peers) try to solve the
cryptographic puzzle of the
nonce
• First to solve is rewarded with
Bitcoin
• Other miners verify the solution
• When there is consensus, the
block is added to the blockchain
• New blocks are created every 10
minutes
https://medium.com/@nik5ter/explain-bitcoin-like-im-five-73b4257ac833
Blockchain
• An immutable, distributed, public ledger
• No single point of failure
• Removes central mediators in transactions
• Errors and fakes are resolved via consensus
against all versions
• Entire history of transactions available to all
Graph - by Rocchini - Own work, CC BY 3.0,
https://commons.wikimedia.org/w/index.php?curid=19859789
Blockchain
Oleh Zaiats. “Blockhain mind map.” FinTech, Jan 12, 2016.
http://fintechinfo.com/blockchain-mind-map/
Blockchain
https://www.ethereum.org/
Blockchain
http://etherscripter.com/what_is_ethereum.html
Smart contracts
• Contracts are essentially
agreements – much like
traditional law – often with
economic impacts
• Terms and conditions, time
frames and consequences, all
lend themselves well to code
• Ethereum provides digital
currency and a programming
language to create
distributed, enforceable
agreements
Blockchain http://dapps.ethercasts.com/
Archival Blockchain uses?
Chain of custody
• An archival blockchain could support the maintenance provenance and authenticity
• Register acquisitions, donor agreements, transfers, preservation actions
• LOCKSS-style P2P backups – use ledger + smart contracts to track who is preserving
what where, under what terms – no need for centralized coordination
Archival Services Network
• Institutions donate storage space, compute power, or services
• Are rewarded with “ArchCoin” for use on the network
• These are used to cover network costs, use of other services
• DAOs and smart contracts allow automation of decentralized preservation services
Bringing it all together
Thomas Hawk via https://commons.wikimedia.org/wiki/File:Fireworks_4.jpg
Bringing it all together
https://ipfs.io/
Bringing it all together
http://brewster.kahle.org/2015/08/11/locking-the-web-open-a-call-for-a-distributed-web-2/
Bringing it all together
http://www.decentralizedweb.net/
Bringing it all together
https://medium.com/on-archivy/decentralized-autonomous-collections-ff256267cbd6
“…I believe that the emergence of blockchain technology and the
concept of Decentralized Autonomous Organizations, alongside the
maturation of peer-to-peer networks, open library technology
architectures, and open-source software practices offers a new approach
to the issues of control, privilege, and sustainability that are inherent to
many centralized information collections.”
-- Peter Van Garderen, Decentralized Autonomous Collections
“A Decentralized Autonomous Collection is a set
of digital information objects stored for ongoing
re-use with the means and incentives for
independent parties to participate in the
contribution, presentation, and curation of the
information objects outside the control of an
exclusive custodian.”
Will we be
a part of
this future?
https://commons.wikimedia.org/wiki/File:GT056-Antigua_C-ruins.jpeg
• What is the most interesting part of this technology in terms of archives?
What potential do you see?
• Where do you think these ideas fall short? What might be some of the
challenges?
• How might the ideas shared around this technology impact your institution if
it existed now? How might it change the way you carry out these
workflows/activities?
• If it doesn’t seem like it would solve a challenge you face, why not? What
commonalities can you find in terms of challenges with others in your group?
Group questions
Google doc: http://bit.ly/tech-Proche
Twitter: #techProche
Thanks!
Dan Gillean – Artefactual Systems
ACA Conference 2016 – Montréal, QC

More Related Content

What's hot

Digital Curation using Archivematica and AtoM: DLF Forum 2015
Digital Curation using Archivematica and AtoM: DLF Forum 2015Digital Curation using Archivematica and AtoM: DLF Forum 2015
Digital Curation using Archivematica and AtoM: DLF Forum 2015Artefactual Systems - AtoM
 
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...Building the Future Together: AtoM3, Governance, and the Sustainability of Op...
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...Artefactual Systems - AtoM
 
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Artefactual Systems - Archivematica
 
Linked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageLinked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageNoreen Whysel
 
Digital Infrastructure: Storage and Content Management
Digital Infrastructure: Storage and Content ManagementDigital Infrastructure: Storage and Content Management
Digital Infrastructure: Storage and Content ManagementNoreen Whysel
 
Getting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessGetting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessArtefactual Systems - Archivematica
 
ROTLD DNSSEC Implementation
ROTLD DNSSEC ImplementationROTLD DNSSEC Implementation
ROTLD DNSSEC ImplementationKevin Meynell
 
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...Andrea Bollini
 

What's hot (20)

Workshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and ArchivematicaWorkshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and Archivematica
 
Digital Curation using Archivematica and AtoM: DLF Forum 2015
Digital Curation using Archivematica and AtoM: DLF Forum 2015Digital Curation using Archivematica and AtoM: DLF Forum 2015
Digital Curation using Archivematica and AtoM: DLF Forum 2015
 
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...Building the Future Together: AtoM3, Governance, and the Sustainability of Op...
Building the Future Together: AtoM3, Governance, and the Sustainability of Op...
 
Introduction to Archivematica
Introduction to ArchivematicaIntroduction to Archivematica
Introduction to Archivematica
 
AtoM Community Update: 2019-05
AtoM Community Update: 2019-05AtoM Community Update: 2019-05
AtoM Community Update: 2019-05
 
Archivematica presentation to SJSU iSchool Colloquia series
Archivematica presentation to SJSU iSchool Colloquia seriesArchivematica presentation to SJSU iSchool Colloquia series
Archivematica presentation to SJSU iSchool Colloquia series
 
Archivematica and the digital archival chain of custody
Archivematica and the digital archival chain of custodyArchivematica and the digital archival chain of custody
Archivematica and the digital archival chain of custody
 
Archivematica Community Update - SAA 2016
Archivematica Community Update - SAA 2016Archivematica Community Update - SAA 2016
Archivematica Community Update - SAA 2016
 
Digital Preservation with Archivematica: An Introduction
Digital Preservation with Archivematica: An IntroductionDigital Preservation with Archivematica: An Introduction
Digital Preservation with Archivematica: An Introduction
 
PREMIS in METS in Archivematica
PREMIS in METS in ArchivematicaPREMIS in METS in Archivematica
PREMIS in METS in Archivematica
 
Your Digital Preservation Cookbook
Your Digital Preservation CookbookYour Digital Preservation Cookbook
Your Digital Preservation Cookbook
 
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
 
Linked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageLinked Open Data for Cultural Heritage
Linked Open Data for Cultural Heritage
 
Digital Infrastructure: Storage and Content Management
Digital Infrastructure: Storage and Content ManagementDigital Infrastructure: Storage and Content Management
Digital Infrastructure: Storage and Content Management
 
Report: Archivematica hosting in the cloud
Report: Archivematica hosting in the cloudReport: Archivematica hosting in the cloud
Report: Archivematica hosting in the cloud
 
ION Bucharest - ISOC & Deploy360 overview
ION Bucharest - ISOC & Deploy360 overviewION Bucharest - ISOC & Deploy360 overview
ION Bucharest - ISOC & Deploy360 overview
 
IETF Update: Making the Internet Work Better
IETF Update: Making the Internet Work BetterIETF Update: Making the Internet Work Better
IETF Update: Making the Internet Work Better
 
Getting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessGetting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and Access
 
ROTLD DNSSEC Implementation
ROTLD DNSSEC ImplementationROTLD DNSSEC Implementation
ROTLD DNSSEC Implementation
 
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...
DSpace-CRIS: an open source solution - Cineca euroCRIS membership meeting Por...
 

Viewers also liked

AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...
AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...
AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...Artefactual Systems - AtoM
 
Новые тенденции в отечественном и мировом делопроизводстве и архивном деле
Новые тенденции в отечественном и мировом делопроизводстве и архивном делеНовые тенденции в отечественном и мировом делопроизводстве и архивном деле
Новые тенденции в отечественном и мировом делопроизводстве и архивном делеNatasha Khramtsovsky
 
Деятельность ИСО по стандартизации в сфере управления документами и информаци...
Деятельность ИСО по стандартизации в сфере управления документами и информаци...Деятельность ИСО по стандартизации в сфере управления документами и информаци...
Деятельность ИСО по стандартизации в сфере управления документами и информаци...Natasha Khramtsovsky
 
Блокчейн и PKI: друзья или конкуренты?
Блокчейн и PKI: друзья или конкуренты?Блокчейн и PKI: друзья или конкуренты?
Блокчейн и PKI: друзья или конкуренты?Natasha Khramtsovsky
 
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...Artefactual Systems - AtoM
 
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...Natasha Khramtsovsky
 
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...Законодательство и оценка рисков при автоматизации работы с документами. Вызо...
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...Natasha Khramtsovsky
 

Viewers also liked (7)

AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...
AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...
AtoM and Vagrant: Installing and Configuring the AtoM Vagrant Box for Local T...
 
Новые тенденции в отечественном и мировом делопроизводстве и архивном деле
Новые тенденции в отечественном и мировом делопроизводстве и архивном делеНовые тенденции в отечественном и мировом делопроизводстве и архивном деле
Новые тенденции в отечественном и мировом делопроизводстве и архивном деле
 
Деятельность ИСО по стандартизации в сфере управления документами и информаци...
Деятельность ИСО по стандартизации в сфере управления документами и информаци...Деятельность ИСО по стандартизации в сфере управления документами и информаци...
Деятельность ИСО по стандартизации в сфере управления документами и информаци...
 
Блокчейн и PKI: друзья или конкуренты?
Блокчейн и PKI: друзья или конкуренты?Блокчейн и PKI: друзья или конкуренты?
Блокчейн и PKI: друзья или конкуренты?
 
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...
Project Documentation with Sphinx (or, How I Learned to Stop Worrying and Lov...
 
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...
«Назначение ГОСТ Р/ИСО 18128 «Информация и документация. Оценка рисков для до...
 
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...Законодательство и оценка рисков при автоматизации работы с документами. Вызо...
Законодательство и оценка рисков при автоматизации работы с документами. Вызо...
 

Similar to Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools of Today

Social Semantic (Sensor) Web
Social Semantic (Sensor) WebSocial Semantic (Sensor) Web
Social Semantic (Sensor) WebDavid Crowley
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic WebJohn Breslin
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked DataAdrian Stevenson
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...Martin Klein
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...openminted_eu
 
New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...Derek Keats
 
Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and TechniquesBernhard Haslhofer
 
WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair" OpenAIRE
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data GenerationFilip Radulovic
 
New ICT Trends and Issues of Librarianship
New ICT Trends and Issues of LibrarianshipNew ICT Trends and Issues of Librarianship
New ICT Trends and Issues of LibrarianshipLiaquat Rahoo
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us? Andrea Volpini
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataversevty
 
Lessons learned from Semantic Wiki
Lessons learned from Semantic WikiLessons learned from Semantic Wiki
Lessons learned from Semantic WikiJie Bao
 

Similar to Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools of Today (20)

Social Semantic (Sensor) Web
Social Semantic (Sensor) WebSocial Semantic (Sensor) Web
Social Semantic (Sensor) Web
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic Web
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked Data
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...
 
Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
 
Sgci esip-7-20-18
Sgci esip-7-20-18Sgci esip-7-20-18
Sgci esip-7-20-18
 
KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
 
New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...
 
Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and Techniques
 
WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair"
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data Generation
 
New ICT Trends and Issues of Librarianship
New ICT Trends and Issues of LibrarianshipNew ICT Trends and Issues of Librarianship
New ICT Trends and Issues of Librarianship
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us?
 
Webware Webinar
Webware WebinarWebware Webinar
Webware Webinar
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
 
CAEPIA 2011
CAEPIA 2011CAEPIA 2011
CAEPIA 2011
 
Lessons learned from Semantic Wiki
Lessons learned from Semantic WikiLessons learned from Semantic Wiki
Lessons learned from Semantic Wiki
 

More from Artefactual Systems - AtoM

Things I wish I'd known - AtoM tips, tricks, and gotchas
Things I wish I'd known - AtoM tips, tricks, and gotchasThings I wish I'd known - AtoM tips, tricks, and gotchas
Things I wish I'd known - AtoM tips, tricks, and gotchasArtefactual Systems - AtoM
 
Creating your own AtoM demo data set for re-use with Vagrant
Creating your own AtoM demo data set for re-use with VagrantCreating your own AtoM demo data set for re-use with Vagrant
Creating your own AtoM demo data set for re-use with VagrantArtefactual Systems - AtoM
 
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...Introducing Binder: A Web-based, Open Source Digital Preservation Management ...
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...Artefactual Systems - AtoM
 
Introducing the Digital Repository for Museum Collections (DRMC)
Introducing the Digital Repository for Museum Collections (DRMC)Introducing the Digital Repository for Museum Collections (DRMC)
Introducing the Digital Repository for Museum Collections (DRMC)Artefactual Systems - AtoM
 

More from Artefactual Systems - AtoM (17)

CSV import in AtoM
CSV import in AtoMCSV import in AtoM
CSV import in AtoM
 
Things I wish I'd known - AtoM tips, tricks, and gotchas
Things I wish I'd known - AtoM tips, tricks, and gotchasThings I wish I'd known - AtoM tips, tricks, and gotchas
Things I wish I'd known - AtoM tips, tricks, and gotchas
 
Creating your own AtoM demo data set for re-use with Vagrant
Creating your own AtoM demo data set for re-use with VagrantCreating your own AtoM demo data set for re-use with Vagrant
Creating your own AtoM demo data set for re-use with Vagrant
 
Searching in AtoM
Searching in AtoMSearching in AtoM
Searching in AtoM
 
AtoM Implementations
AtoM ImplementationsAtoM Implementations
AtoM Implementations
 
AtoM Data Migrations
AtoM Data MigrationsAtoM Data Migrations
AtoM Data Migrations
 
Contributing to the AtoM documentation
Contributing to the AtoM documentationContributing to the AtoM documentation
Contributing to the AtoM documentation
 
Installing AtoM with Ansible
Installing AtoM with AnsibleInstalling AtoM with Ansible
Installing AtoM with Ansible
 
AtoM feature development
AtoM feature developmentAtoM feature development
AtoM feature development
 
Constructing SQL queries for AtoM
Constructing SQL queries for AtoMConstructing SQL queries for AtoM
Constructing SQL queries for AtoM
 
Creating custom themes in AtoM
Creating custom themes in AtoMCreating custom themes in AtoM
Creating custom themes in AtoM
 
Installing and Upgrading AtoM
Installing and Upgrading AtoMInstalling and Upgrading AtoM
Installing and Upgrading AtoM
 
Get to Know AtoM's Codebase
Get to Know AtoM's CodebaseGet to Know AtoM's Codebase
Get to Know AtoM's Codebase
 
AtoM's Command Line Tasks - An Introduction
AtoM's Command Line Tasks - An IntroductionAtoM's Command Line Tasks - An Introduction
AtoM's Command Line Tasks - An Introduction
 
Command-Line 101
Command-Line 101Command-Line 101
Command-Line 101
 
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...Introducing Binder: A Web-based, Open Source Digital Preservation Management ...
Introducing Binder: A Web-based, Open Source Digital Preservation Management ...
 
Introducing the Digital Repository for Museum Collections (DRMC)
Introducing the Digital Repository for Museum Collections (DRMC)Introducing the Digital Repository for Museum Collections (DRMC)
Introducing the Digital Repository for Museum Collections (DRMC)
 

Recently uploaded

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 

Recently uploaded (20)

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 

Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools of Today

  • 1. Dan Gillean – Artefactual Systems ACA Conference 2016 – Montréal, QC Technologie Proche: Envisioning the Archival Systems of Tomorrow with the Tools of Today https://en.wikipedia.org/wiki/Hermes_%28missile_progra m%29#/media/File:Hermes_A-1_Test_Rockets_-_GPN- 2000-000063.jpg
  • 2. This is going to be a bit of an experiment …. https://en.wikipedia.org/wiki/Nikola_Tesla#/media/File:Nikola_Tesla,_with_his_equipment_Wellcome_M0014782.jpg
  • 3. What projects, technologies, or initiatives from the last couple of years have inspired you around archives and technology? What are you anticipating? Group Brainstorm: Google doc: http://bit.ly/tech-Proche Twitter: #techProche aboutmodafinil.com via https://commons.wikimedia.org/wiki/File:Brain_01.jpg
  • 6. We can’t future-proof… so let’s be strategic •Open •Collaborative •Decentralized •Agnostic •Interoperable Building systems to be: data metadata https://commons.wikimedia.org/wiki/File:Go_Board,_Hoge_Rielen,_BelgiumEdit_Fcb981.jpg
  • 7. Centralized, Decentralized, and Distributed http://www.truthcoin.info/blog/measuring-decentralization/
  • 8. •Open source •Open standards •Existing communities We need to leverage what already exists http://www.totallylocalvc.com/art-of-teamwork/
  • 9. The Tools of Today Rusty tools; Biser Todorov via https://commons.wikimedia.org/wiki/File:Rusty_tools.JPG
  • 10. Distributed version control and git Version control: A system for managing changes to source data (documents, code, etc) over time. Version control systems help you track who made what change when, and why. They also allow you to roll back changes to your data to a previous state. https://git-scm.com/
  • 11. Local version control •Changes are tracked on a local computer •Simple – great for working alone •Not effective for collaboration https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
  • 12. Centralized version control •Single server with all versioned files; users “check out” files •Allows for collaboration •Single point of failure https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
  • 13. Distributed version control •Full mirroring of source with each user •Broadest potential for collaboration •Complex merging, branching, etc. possible https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
  • 14. Git •Open source •Distributed model •Strong User Community, lots of free documentation •Widely used and supported across platforms https://xkcd.com/1597/
  • 15. Git •Hashes all files in the repository •Produces a diff of changes  •Commit messages explain who did what when and why •Supports parallel workflows with a method for resolving conflicts https://github.com/artefactual/atom- docs/commit/6d8fe4ab9803ec3421b04bcabe215e3dbb4a77dc
  • 16. Git •Essentially a (NoSQL) database – a key/value store • Objects DB and Refs DB •Libgit2: git implementation in C • Supports different backends •Might git provide a model for auditable, versioned data management? https://github.com/artefactual/atom- docs/commit/6d8fe4ab9803ec3421b04bcabe215e3dbb4a77dcVincent Driessen via http://nvie.com/posts/a-successful-git-branching-model/
  • 17. Git for Archives? •Description systems: • Maintain transparent provenance and manage collaboration and concurrency – e.g. authority records •Collaborative preservation environments • Track transfer/ingest, preservation actions •Distributed, version-controlled backups • Reconcile changes via branch management
  • 18. Peer to Peer computing and BitTorrent P2P: A distributed application architecture that directly links users as peers with equal privileges, instead of relying on a centralized server for communication and exchange
  • 19. Early mainframe computing Lawrence Livermore National Laboratory. IBM 704 Mainframe, via https://en.wikipedia.org/wiki/Mainframe_computer#/media/File:IBM_704_mainframe.gif
  • 20. The promise of the web ARPANET MARCH 1977 via https://commons.wikimedia.org/wiki/World_Wide_Web#/media/File:Arpanet_logical_map,_march_1977.png
  • 21. The rise of the cloud https://pixabay.com/en/network-iot-internet-of-things-782707/
  • 22. The rise of the cloud https://pixabay.com/en/network-iot-internet-of-things-782707/
  • 23. The rise of the cloud https://pixabay.com/en/network-iot-internet-of-things-782707/
  • 25. BitTorrent Protocol •A communications protocol for peer-to-peer file sharing •Works with any kind/size of files •Increases speed with increases in users – each user becomes a potential download source •Uses cryptographic hashing to segment files into pieces, so different pieces can be downloaded from different users and verified against the torrent tracker
  • 27. BitTorrent Protocol • Torrent users who share content are called peers • A group of peers is called a swarm • Users who download without also sharing are called leeches • Early BT implementations used a centralized tracker to discover and connect peers, before they could start sharing directly
  • 28. BitTorrent Protocol DHT: Distributed Hash Table. A decentralized lookup system (similar to a hash table) with key/value pairs. • Pairs are stored in the DHT • Any peer can look up the value associated with a key • Responsibility for maintaining the mapping from keys to values is algorithmically distributed among the nodes to minimize disruption when nodes appear/depart/fail
  • 29. BitTorrent Protocol •Central trackers and DHT can be used together with other methods to increase efficiency •Other features (e.g. encryption) can be added on top
  • 30. Archival uses for BitTorrent? https://archive.org/details/bittorrent
  • 31. Archival uses for BitTorrent? https://archive.org/details/toronto
  • 32. Archival uses for P2P and data? https://www.lockss.org/about/how-it-works/
  • 33. Archival uses for P2P and data? http://dat-data.com/
  • 35. • Description and access: a networking and data exchange protocol • Exchanging authority records between repositories • Updating portal sites • Offering content to the public • Preservation: Sharing supporting technology images • Operating systems, codecs, tools • LOCKSS-style approach to preservation and real-time access despite service interruptions at any one point – distributed copies Archival uses for P2P and data?
  • 40. Linked data A method of expressing and publishing structured data so that it can be interlinked and queried semantically by both humans and machines. http://linkeddata.org/ • Turning the web of documents into a machine-readable web of data • From the World Wide Web to the Giant Global Graph
  • 41. Linked data Tim Berners-Lee. John S. and James L. Knight Foundation via https://commons.wikimedia.org/wiki/File:Tim_Berners-Lee-Knight-crop.jpg 4 Principles: • Use URIs to name (identify) things. • Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced"). • Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc. • Refer to other things using their HTTP URI- based names when publishing data on the Web.
  • 42. RDF • Makes statements about resources as expressions • Uses subject  predicate  object structure to form triples A standard model for data interchange on the web
  • 43. RDF A standard model for data interchange on the web • Uses URIs to express each element of the triple • Can be serialized into many different formats
  • 45. 5-star deployment scheme for open data Linked open data ★ Make your stuff available on the Web (whatever format) under an open license ★ ★ Make it available as structured data (e.g., Excel instead of image scan of a table) ★ ★ ★ Make it available in a non-proprietary open format (e.g., CSV instead of Excel) ★ ★ ★ ★ Use URIs to denote things, so that people can point at your stuff ★ ★ ★ ★ ★ Link your data to other data to provide context http://5stardata.info/en/
  • 46. Linked open data and archives Lots of discussion, many existing useful ontologies, and some implementation in larger portal sites. Few institutional implementers out there. • Europeana Data Model (EU), Archives Hub (UK) • LoC, W3C, Getty; SKOS, ViAF, GeoNames, etc. • PREMIS  creating an OWL ontology • PRONOM ontology? Work on hold • ICA EGAD developing an entity relationship model that can be expressed semantically?
  • 47. Linked open data and archives https://linkedjazz.org/
  • 48. Linked open data and archives https://linkedjazz.org/tools/ Combining linked open data, machine learning, and crowdsourcing into a suite of user friendly tools
  • 49. Linked open data and archives • Description and Access • Our systems should connect users to other related information – linked data is the way • Relationship visualizations, queries across institutions, domains • Crowd-sourced description and reconciling from communities of interest • Preservation • PREMIS ontology + PRONOM ontology + W3C PROV= building blocks for a community Format Policy Registry (FPR) • Ability to generate community-wide preservation stats
  • 50. Blockchain: A distributed ledger that uses cryptographic hashing to maintain a continuously growing record of transactions, divided into linked, time-stamped blocks. Blockchain technology Jorge González, “Chains reloaded.”July 29, 2011. https://www.flickr.com/photos/aloriel/6292261464
  • 51. Bitcoin • An online, anonymous cryptocurrency • Allows peer-to-peer transactions without an intermediary • Created 2008, released open-source in January 2009 • Relies on the blockchain: • Transactions are verified by nodes on the network • Nodes are rewarded for verification • Once transactions are verified, they are added to a block every 10mins https://www.flickr.com/photos/jason_benjamin/12795733984
  • 52. Bitcoin • A wallet can have as many BTC addresses as desired • An address is created via public/private key cryptography User installs a Bitcoin client (a wallet), and uses it to create a bitcoin address for a transaction
  • 54. Bitcoin H(a) H(b) H(c) H(d) H(AB) H(CD) HABCD 00002393k242l284q34qrsw Previous block’s hash 2016-06-03 09:55:01:003 UTC Timestamp 55k4903rgw4004j48f2l2024j5 Transactions’ root hash N(x) nonce Merkle tree Block (candidate) Previous hash + transaction hash + nonce = new block hash (0000000…)
  • 55. Blockchain • Miners (peers) try to solve the cryptographic puzzle of the nonce • First to solve is rewarded with Bitcoin • Other miners verify the solution • When there is consensus, the block is added to the blockchain • New blocks are created every 10 minutes https://medium.com/@nik5ter/explain-bitcoin-like-im-five-73b4257ac833
  • 56. Blockchain • An immutable, distributed, public ledger • No single point of failure • Removes central mediators in transactions • Errors and fakes are resolved via consensus against all versions • Entire history of transactions available to all Graph - by Rocchini - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=19859789
  • 57. Blockchain Oleh Zaiats. “Blockhain mind map.” FinTech, Jan 12, 2016. http://fintechinfo.com/blockchain-mind-map/
  • 59. Blockchain http://etherscripter.com/what_is_ethereum.html Smart contracts • Contracts are essentially agreements – much like traditional law – often with economic impacts • Terms and conditions, time frames and consequences, all lend themselves well to code • Ethereum provides digital currency and a programming language to create distributed, enforceable agreements
  • 61. Archival Blockchain uses? Chain of custody • An archival blockchain could support the maintenance provenance and authenticity • Register acquisitions, donor agreements, transfers, preservation actions • LOCKSS-style P2P backups – use ledger + smart contracts to track who is preserving what where, under what terms – no need for centralized coordination Archival Services Network • Institutions donate storage space, compute power, or services • Are rewarded with “ArchCoin” for use on the network • These are used to cover network costs, use of other services • DAOs and smart contracts allow automation of decentralized preservation services
  • 62. Bringing it all together Thomas Hawk via https://commons.wikimedia.org/wiki/File:Fireworks_4.jpg
  • 63. Bringing it all together https://ipfs.io/
  • 64. Bringing it all together http://brewster.kahle.org/2015/08/11/locking-the-web-open-a-call-for-a-distributed-web-2/
  • 65. Bringing it all together http://www.decentralizedweb.net/
  • 66. Bringing it all together https://medium.com/on-archivy/decentralized-autonomous-collections-ff256267cbd6 “…I believe that the emergence of blockchain technology and the concept of Decentralized Autonomous Organizations, alongside the maturation of peer-to-peer networks, open library technology architectures, and open-source software practices offers a new approach to the issues of control, privilege, and sustainability that are inherent to many centralized information collections.” -- Peter Van Garderen, Decentralized Autonomous Collections “A Decentralized Autonomous Collection is a set of digital information objects stored for ongoing re-use with the means and incentives for independent parties to participate in the contribution, presentation, and curation of the information objects outside the control of an exclusive custodian.”
  • 67. Will we be a part of this future? https://commons.wikimedia.org/wiki/File:GT056-Antigua_C-ruins.jpeg
  • 68. • What is the most interesting part of this technology in terms of archives? What potential do you see? • Where do you think these ideas fall short? What might be some of the challenges? • How might the ideas shared around this technology impact your institution if it existed now? How might it change the way you carry out these workflows/activities? • If it doesn’t seem like it would solve a challenge you face, why not? What commonalities can you find in terms of challenges with others in your group? Group questions Google doc: http://bit.ly/tech-Proche Twitter: #techProche
  • 69. Thanks! Dan Gillean – Artefactual Systems ACA Conference 2016 – Montréal, QC

Editor's Notes

  1. I’m Dan Gillean. I work for Artefactual Systems as the AtoM Program Manager; these views do not necessarily reflect my employer This is a new session type the ACA wanted to try – a “working sessions” that aims to encourage interactivity. Its success depends on you. Feedback welcome. We spend a lot of time at conferences recapping our challenges, or sharing our updates. We should also make time to think laterally, creatively, to brainstorm together about what’s possible Instead of just consuming presentations, the hope is to encourage discussion and exchange in future working sessions, should this format be continued The conference theme is Futur Proche. Today, let’s think about ways in which we might start shaping that future with tools and technologies that already exist now.
  2. Before we start, I want to be clear: I am not an expert on any of the things I am about to talk about. My interests are around archives and technology, and these are some of the things that have caught my attention recently. Ask me too many specific questions and I might not be able to answer them. However, my hope in leading this session is to make the case that this should not stop us from paying attention to what’s out there and dreaming big for our own future. We may not be technology experts, but our perspective as a profession on archival theory and digital preservation should play an important role in imagining the future, if we want to be a part of it. Our conferences are often dedicated to discussing challenges or sharing particular institutional workflows – and these are important - but I think we also need blue sky sessions, collaborative working sessions, ways to share our excitement about what’s possible. It’s rare to get so many of us in the room together – often this is the one chance we have a year to see so many Canadian archivists together. Let’s see what happens when so many brilliant minds dream together.
  3. We live in a constantly shifting technological landscape. Digital preservation is a moving target – no matter how well we plan, we’re not going to “solve” digital preservation. No matter how long-lived our media, some disruptive technology will inevitably come along that forces to reassess our methods of preservation, storage, and retrieval. Additionally, we work in an extremely under-resourced sector, especially in Canada. It is highly unlikely that we are going to achieve long-term digital preservation if we are all going it alone, each of us coming up with bespoke solutions, re-inventing the wheel and confronting the same challenges in isolation.
  4. That doesn’t mean that we can’t be strategic. These are some of the high-level values that I believe will help us ensure that the investments now we make are still useful to us as we move into an uncertain future. Today I’m going to focus on possibilities for next-generation archival systems, but these values might equally apply to all 3 components. We should aim for the archival systems we build today to be: Open: open for study, reuse, exchange, modification, and repurposing. Collaborative: given the under-resourced nature of our profession, we want solutions that will allow us to share what resources we have, solve problems collaboratively, and build upon the successes of our peers. Decentralized: When human and economic resources can shift suddenly, we should be wary of solutions with a single point of control, or failure. Decentralized architectures are more resilient, and support collaboration. Agnostic: our solutions should not be institution-specific, nor should they be platform, format, or vendor-specific. Interoperable: Even across systems and platforms, we want underlying commonalities that allow for exchange, re-use, and reinterpretation. I want to say a bit more about openness before moving on.
  5. I believe that, in our under-resourced community, collaborative solutions will be critical. Open source enables this by allowing us to share the solutions we create or invest in, and build upon the work of our colleagues. Digital preservation requires us to maintain access to data and metadata into uncertain futures – we already know that closed, proprietary formats and platforms make this difficult. Why, as a profession, would we ever choose to build our own tools in such a way that they might not be understandable or usable in the future? If a vendor started selling an archival Hollinger box that could not be opened in 10 or 100 years without them, would any of us buy these boxes? Similarly, we want the tools and systems, the data, and the metadata we use and create to be useful even if their original authors are no longer available. Open source and open standards allow us to take an iterative approach. One institution can build upon the work of another, or fork it if the use cases are irreconcilably different. Vendors have a role to play in this ecosystem, but we should not be dependent on them, and we should require them to use open standards – we can create service-based economies around open source projects. For us to tackle the challenges of long-term preservation and access in an under-resourced sector, we need an ecosystem where we can all contribute to the solution as we are able, and we need our tools to be as preservation-ready as our data and metadata. Open source and open standards help move us closer to these aims.
  6. So let’s take a look at some of the existing technologies that we might leverage in imagining the archival systems of tomorrow… In each of the following cases, I’m going to introduce a technology, which really represents a computing paradigm – and a specific example of that computing paradigm. The specific examples might not actually be the technologies that will give us everything we need – but they solve problems which are very related to ours, and thus should serve as inspiration, and possible entry points to building the tools we need.
  7. The first possibilities I want to consider are distributed version control systems, such as git.
  8. Version control systems generally have 3 approaches. A local VCS tracks changes directly on your computer. Files are checked out, and changes are tracked. This is the simplest model, and it’s great for an isolated environment, but the second I want to collaborate with you, I run into problems integrating your real-time changes with mine.
  9. We might instead have a centralized VCS – like subversion. Here a central server manages our resources, we check them out from the VCS and conflicts are reconciled against the master repository. This allows for collaboration, but with a limited point of entry. If Alice and Bob are working on the same file, and then Bob wants to send his version to Carol, things will fall apart if Carol’s not part of the central repository. Equally, if our VCS server goes down, everyone’s work grinds to a halt, or we end up with no way to reconcile our independent changes.
  10. Finally, there are distributed VCS’s like git. In this case, each user maintains a local mirror of the entire repository. If Bob wants to include Carol, Carol ends up with the entire repo as well, and can maintain her changes as an independent fork or reconcile them at any time with the master, or with just Bob or just Alice. If the main repository goes down, every user becomes a potential point of backup.
  11. Git was created by the Linux community to streamline broad collaboration in distributed open-source development environments. There are many other alternative distributed VCSs out there, but git has become a de facto standard among developers.
  12. Essentially, git applies a cryptographic SHA1 hash against all files in a repository, and will produce a diff, shown in a graphical form from GitHub here on the right, of the additions, deletions, and modifications made. Every change submitted by a contributor includes a commit message with a timestamp, which includes a short, human-readable summary of the changes and an option for a longer narrative description or justification. Platforms like GitHub, which add a web-based user interface to git repositories and workflows, also offer the ability to discuss changes on a per-line basis via comments and issues. Artefactual uses git to manage all of its software projects – but we also use it to manage our documentation. In this image, we’re actually looking at an image from a diff on GitHub, made when I submitted changes to the Access to Memory documentation.
  13. For data: Git is traditionally used to manage code, while our data tends to be stored in databases. However, as Brandon Keepers of GitHub has pointed out, git is essentially a noSQL database, in that it uses 2 different key/value stores (the objects db and the refs db) to track and manage changes. Git’s implementation in the C programming language (libgit2) further supports swapping these with many other backends, including relational databases – and as such, we can consider using it as such to manage our data. There are some serious limits to scaling this approach, but it opens a wide avenue of possibilities for consideration.
  14. Early computers relied on a central mainframe – and were accessed via thin clients, which didn’t have the computational resources (memory, disk space) to perform calculations without the mainframe.
  15. As computers gained capacity (disk, memory, CPU) and prices fell, the web allowed for decentralized networks to arise, where users could exchange information directly.
  16. Essentially, with the cloud we’ve moved back to the centralized client server model – only these servers are often owned by private companies with opaque practices and a strong business incentive. Will this model really meet our needs moving forward? At Code4Lib 2014, Matt Zumwalt pointed out that according to the Cisco Global Cloud Index, by 2019 the data created by Internet of Everything devices alone will be 49 times higher than all the data that moved through data centers in 2014. (source: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/Cloud_Index_White_Paper.html) That’s an estimated 507.5 ZB per year of data created by this new connected reality – centralized models may not scale. More importantly, much of this data is highly personal – our phone metadata, our smart homes, our fitness trackers – do we want this data to be held in corporate data centers? We need a model where each user curates the data that matters to them, and shares what data they want with other users who care about that data. We need take out the middle man.
  17. Essentially, with the cloud we’ve moved back to the centralized client server model – only these servers are often owned by private companies with opaque practices and a strong business incentive. Will this model really meet our needs moving forward? At Code4Lib 2014, Matt Zumwalt pointed out that according to the Cisco Global Cloud Index, by 2019 the data created by Internet of Everything devices alone will be 49 times higher than all the data that moved through data centers in 2014. (source: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/Cloud_Index_White_Paper.html) That’s an estimated 507.5 ZB per year of data created by this new connected reality – centralized models may not scale. More importantly, much of this data is highly personal – our phone metadata, our smart homes, our fitness trackers – do we want this data to be held in corporate data centers? We need a model where each user curates the data that matters to them, and shares what data they want with other users who care about that data. We need take out the middle man.
  18. Essentially, with the cloud we’ve moved back to the centralized client server model – only these servers are often owned by private companies with opaque practices and a strong business incentive. Will this model really meet our needs moving forward? At Code4Lib 2014, Matt Zumwalt pointed out that according to the Cisco Global Cloud Index, by 2019 the data created by Internet of Everything devices alone will be 49 times higher than all the data that moved through data centers in 2014. (source: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/Cloud_Index_White_Paper.html) That’s an estimated 507.5 ZB per year of data created by this new connected reality – centralized models may not scale. More importantly, much of this data is highly personal – our phone metadata, our smart homes, our fitness trackers – do we want this data to be held in corporate data centers? We need a model where each user curates the data that matters to them, and shares what data they want with other users who care about that data. We need take out the middle man.
  19. Of course, LOCKSS has been providing a P2P-based preservation environment for some time now. In the LOCKSS model, ingested content is replicated to other participating institutions and packaged with usage rights. If the original source is unavailable, any other peer with the content can provide it to end-users. LOCKSS also offers what they call “migration on access” – when a format can’t be displayed, it is migrated on the fly, based on a series of rules, to an equivalent format the end user can access. I’ll return to talking about preservation services, and how they might fit into distributed archival systems, later.
  20. Dat is a grant funded, open source, decentralized data tool for versioning and distributing datasets. It incorporates best-of-breed ideas from the BitTorrent protocol and git’s model of distributed version control. The focus has been on scientific data so far: the pilot users have so far been in the fields of astronomy, bioinformatics, and neuroscience. However, the underlying versioning and sharing tool, Hyperdrive, is generalized for any binary data, including audio, video, linux containers, and so forth. By using the hashed diffs and merkle trees (which we’ll talk about later) generated in versions, the tool also deduplicates data between versions to reduce the bandwidth required for sharing.
  21. Matt Zumwalt, one of the original creators of the Hydra repository project, has been involved in an module for dat that focuses specifically on tabular data use cases, called dat jawn. The project works via the Code4Philly, building a strong model around mentorship, group learning, and open collaborative coding.
  22. To a computer, the text on a web page is just a pile of keywords. Humans, on the other hand, can understand how different pieces of information relate to each other, and it is these relationships that add meaning to data. In order for computers to understand these relationships, we need to add structure to the data.
  23. To make a transaction, you need several pieces of information. Every transaction has a set of inputs and a set of outputs. The inputs identify which bitcoins are being spent, and the outputs assign those bitcoins to their new owners. Each input is just a digitally signed reference to some output from a previous transaction. Once an output is spent by a subsequent input, no other transaction can spend that output again. That’s what makes bitcoins impossible to copy.
  24. To make a transaction, you need several pieces of information. Every transaction has a set of inputs and a set of outputs. The inputs identify which bitcoins are being spent, and the outputs assign those bitcoins to their new owners. Each input is just a digitally signed reference to some output from a previous transaction. Once an output is spent by a subsequent input, no other transaction can spend that output again. That’s what makes bitcoins impossible to copy.
  25. Many use cases, such as registering documents and tracking provenance, providing proof of ownership, managing rights, and much more.
  26. Examples: Augur – blockchain based crowd-sourced predictions Storj and MaidSafe – blockchain managed distributed, encrypted storage
  27. Creator Juan Benet describes it as “really just an integration of old ideas.” P2P architectures BitTorrent protocol  BitSwap in IPFS DHT Distributed version control Support for blockchain marketplaces – partnership with FileCoin