125 Databases for the
Year 2080
A technology challenge and how it can be met
Dr. Kai Naumann – Landesarchiv Baden-Württemberg (Germany)
WADL Workshop on IJDC 2020, Wuhan (China)
Landesarchiv Baden-Württemberg at a glance
• knowledge centre about the past of
the state of Baden-Württemberg
• key research infrastructure
• saves records of all kinds as cultural
heritage, preserves them and makes
them accessible
• provides transparency of
governmental, administrative, and
judicial decision-making
• has archived government websites and
other sites with relevance to Baden-
Württemberg since 2006: about
300 URLs, twice a year
• 9 sites throughout the country
• 11 million EUR overall budget
• 308 employees
• 1207 years: oldest dated charter
• 10,138 consultations per year
• 152,284 meters of occupied shelves
• 2,095,106 photographs
• 13,226,262 pages of scanned
documents
• 290,783,182 dataset rows
• ∞ eternal survival as a task
Our Oldest Database – the 1961 census
• Conceived at Statistical Offices of Germany in 1960
• Populated in 1961 on rented IBM machines
• 6 million individual punched cards destroyed in 1968
by a flood
• Surviving part: calculated sums on ca. 1,592,821
punched cards
• Migrated to magnetic tape in the 1960s
• Migrated to CD-ROM in the 1990s
• Transferred to the State Archives in 2006
• Can we do better?!
LABW StAL E 258 II Bü 214
http://www.landesarchiv-bw.de/plink/?f=2-335336
Why we set up the challenge
• Emulation as a service - enormous progress since 2010
• SIARD - method of long-term database normalization – efforts to
establish SIARD as a European Union standard
The challenge
• How do you preserve 125 databases of diverse origin for future use
from the year 2080 onwards?
• Prepare them in such a way that they can be used in as many ways as
possible in 2080.
• In the following 60 years
• a) no costs should be incurred apart from secure storage
• b) the database contents must not be publicly accessible.
How to preserve?
Pictures taken by the author
Political and legislative issues
Global Intellectual Property (IP) legislation is poorly prepared for
obsolescence.
Orphaned books (author and editor unknown) may freely be copied and
disseminated in most parts of the world.
The status of orphaned software is unclear, with risks looming from
unclear IP claims.
In most countries of the world, no agency is responsible for preserving
software.
The European DSM directive has recently moved in a good direction, but
work has to continue in order to ensure a risk-free environment for
software emulation approaches.
CSV solution
• Choose the most important tables or prepare archival tables.
• Export them to CSV.
• Make an XML description of the fields and relations.
• Take screenshots of the graphical user interface (GUI).
• Add handbooks and tutorials for the database.
• Wait.
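The export and description steps above can be sketched as follows. This is a minimal illustration, not the archive's actual workflow: it assumes the source database is reachable through Python's built-in sqlite3 module, and the function names, the two demo tables, and the element names in description.xml (database, table, field, relation) are all hypothetical.

```python
import csv
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def export_table(conn, table, out_dir):
    """Write one table to <table>.csv and return SQLite's column metadata."""
    # PRAGMA table_info yields (cid, name, type, notnull, default, pk) per column.
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    names = [column[1] for column in columns]
    select_list = ", ".join(f'"{name}"' for name in names)
    csv_path = Path(out_dir) / f"{table}.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(names)  # header row with the field names
        writer.writerows(conn.execute(f'SELECT {select_list} FROM "{table}"'))
    return columns


def export_with_description(conn, tables, out_dir):
    """Export the chosen tables and write description.xml next to them."""
    root = ET.Element("database")
    for table in tables:
        columns = export_table(conn, table, out_dir)
        table_el = ET.SubElement(root, "table", name=table, file=f"{table}.csv")
        for _cid, name, col_type, _notnull, _default, pk in columns:
            ET.SubElement(table_el, "field", name=name, type=col_type,
                          primaryKey="true" if pk else "false")
        # PRAGMA foreign_key_list yields (id, seq, table, from, to, ...).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            ET.SubElement(table_el, "relation", field=fk[3],
                          refTable=fk[2], refField=fk[4] or "")
    ET.ElementTree(root).write(str(Path(out_dir) / "description.xml"),
                               encoding="utf-8", xml_declaration=True)
    return root


def build_demo_db():
    """Hypothetical two-table database standing in for a real one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE municipality (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT,
            municipality_id INTEGER REFERENCES municipality(id)
        );
        INSERT INTO municipality VALUES (10, 'Stuttgart');
        INSERT INTO person VALUES (1, 'Anna', 10);
    """)
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        export_with_description(build_demo_db(), ["municipality", "person"], out_dir)
        print(sorted(path.name for path in Path(out_dir).iterdir()))
```

Note that csv.writer writes NULL values as empty fields, so a real export would have to record how NULL is distinguished from an empty string.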
XML Solution
• Choose the most important tables or prepare archival tables.
• Export them to an XML Schema containing the most important
features of the DBMS (e.g. SIARD Schema).
• Take screenshots of the graphical user interface (GUI).
• Add handbooks and tutorials for the database.
• Wait.
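As a rough illustration of the XML export step, the sketch below serializes one table into a self-describing XML element. It is deliberately simplified and does not produce a SIARD-conformant file; the element names (table, columns, column, row, cell), the function name, and the use of sqlite3 as the source are assumptions made for this example only.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Return an XML element holding the column list and all rows of one table."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    names = [description[0] for description in cursor.description]
    root = ET.Element("table", name=table)
    columns_el = ET.SubElement(root, "columns")
    for name in names:
        ET.SubElement(columns_el, "column", name=name)
    for record in cursor:
        row_el = ET.SubElement(root, "row")
        for name, value in zip(names, record):
            # The column name goes into an attribute because database column
            # names are not always valid XML element names.
            cell = ET.SubElement(row_el, "cell", column=name)
            if value is None:
                cell.set("null", "true")  # keep NULL distinct from an empty string
            else:
                cell.text = str(value)
    return root


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE census (district TEXT, inhabitants INTEGER)")
    demo.execute("INSERT INTO census VALUES ('A', 120)")
    print(ET.tostring(table_to_xml(demo, "census"), encoding="unicode"))
```

Binary columns would need separate handling, since str() on a bytes value does not give a usable archival representation.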
Disk image solution
• Take a disk image of the client hardware.
• Take a disk image of the server hardware.
• Preserve necessary Operating System environments.
• Add handbooks or tutorials for the database.
• Regularly check performance of emulative software stack.
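Taking a disk image amounts to a byte-for-byte copy of the storage device. The sketch below shows the idea under stated assumptions: the source is any readable path (a raw device usually needs elevated privileges and should be unmounted or read-only), the function names are hypothetical, and recording a SHA-256 checksum is an addition for later integrity checks that the slides do not mention. In practice a dedicated imaging tool would normally be used.

```python
import hashlib
import os
import tempfile


def copy_image(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image byte for byte; return the SHA-256 of the bytes written."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while chunk := source.read(chunk_size):
            image.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Recompute the SHA-256 of a stored image for comparison with the recorded value."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        # An ordinary file stands in for the client or server disk.
        source = os.path.join(work_dir, "device.bin")
        image = os.path.join(work_dir, "device.img")
        with open(source, "wb") as handle:
            handle.write(os.urandom(3 * 1024 * 1024))
        recorded = copy_image(source, image)
        print("image intact:", sha256_of(image) == recorded)
```

Storing the checksum with the image lets a later check confirm that the stored bytes are unchanged without needing the original hardware.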
Docker image solution
• Take a Docker image of the client software.
• Take a Docker image of the server software.
• Preserve necessary Operating System environments.
• Add handbooks or tutorials for the database.
• Regularly check performance of emulative software stack.
Web Crawler solution
• This only works for databases with a full web-based frontend
displaying a complete list of their objects.
• Let a crawler capture all database content as rendered HTML/JavaScript
pages in a web archive container (e.g. WARC file).
• Regularly visit the crawl to test accessibility.
• In order to make quality assessments:
• Let Archive.org crawl the server as well
• Also use the CSV solution on the data
Solutions and their cost forecast
[Chart: forecast cost of each solution from 01.01.2020 to 01.01.2080 in two-year steps. Vertical axis: 0 to 250, unit not stated. Series: CSV Solution, XML Solution, Disk Image Solution, Docker Image Solution, Web Crawler Solution. The plotted values are not recoverable from the text.]
Any questions? Want to join the quest?
• Further ideas, business models welcome!
• I will try to continue collecting answers at #WeMissiPRES
• Feel invited to a workshop on the issue at Stuttgart (Germany) in
2021!
• Contact me:
• Dr. Kai Naumann, Landesarchiv Baden-Württemberg
• kai <dot> naumann <at> la-bw <dot> de
• Twitter @Naumann_Kai
• Phone 0049 711 212 4284

125 Databases for the Year 2080

  • 1. 125 Databases for the Year 2080 A technology challenge and how it can be met Dr. Kai Naumann – Landesarchiv Baden-Württemberg (Germany) WADL Workshop on IJDC 2020, Wuhan (China)
  • 2. Landesarchiv Baden-Württemberg at a glance • knowledge centre about the past of the state of Baden-Württemberg • key research infrastructure • saves records of all kinds as cultural heritage, preserves them and makes them accessible • provides transparency of governmental, administrative, and judicial decision-making • archives government websites and other sites with relevance to Baden-Württemberg since 2006 (about 300 URLs twice a year) • 9 sites throughout the country • 11 million EUR overall budget • 308 employees • 1207 years: oldest dated charter • 10,138 consultations per year • 152,284 meters of occupied shelves • 2,095,106 photographs • 13,226,262 pages of scanned documents • 290,783,182 dataset rows • ∞ eternal survival as a task
  • 3. Our Oldest Database – the 1961 census • Conceived at the Statistical Offices of Germany in 1960 • Populated in 1961 on rented IBM machines • 6 million individual punched cards destroyed in 1968 by a flood • Surviving part: calculated sums on ca. 1,592,821 punched cards • Migrated to magnetic tape in the 1960s • Migrated to CD-ROM in the 1990s • Transferred to the State Archives in 2006 • Can we do better?! LABW StAL E 258 II Bü 214 http://www.landesarchiv-bw.de/plink/?f=2-335336
  • 4. Why we set up the challenge • Emulation as a service: enormous progress since 2010 • SIARD: a method of long-term database normalization, with efforts under way to establish SIARD as a European Union standard
  • 5. The challenge • How do you preserve 125 databases of diverse origin for future use from the year 2080 onwards? • Prepare them in such a way that they can be used in as many ways as possible in 2080. • In the following 60 years • a) no costs should be incurred apart from secure storage • b) the database contents must not be publicly accessible.
  • 6. How to preserve? Pictures taken by the author
  • 7. Political and legislative issues Global Intellectual Property (IP) legislation is poorly prepared for obsolescence. Orphaned books (author and editor unknown) may freely be copied and disseminated in most parts of the world. The status of orphaned software is unclear, with risks looming from unclear IP claims. In most countries of the world, no agency is responsible for preserving software. The European DSM directive has recently moved in a good direction, but work has to continue in order to assure a risk-free environment for software emulation approaches.
  • 8. CSV solution • Choose the most important tables or prepare archival tables. • Export them to CSV. • Make an XML description of the fields and relations. • Take screenshots of the graphical user interface (GUI). • Add handbooks and tutorials for the database. • Wait.
  • 9. XML Solution • Choose the most important tables or prepare archival tables. • Export them to an XML Schema containing the most important features of the DBMS (e.g. SIARD Schema). • Take screenshots of the graphical user interface (GUI). • Add handbooks and tutorials for the database. • Wait.
  • 10. Disk image solution • Take a disk image of the client hardware. • Take a disk image of the server hardware. • Preserve necessary Operating System environments. • Add handbooks or tutorials for the database. • Regularly check the performance of the emulation software stack.
  • 11. Docker image solution • Take a Docker image of the client software. • Take a Docker image of the server software. • Preserve necessary Operating System environments. • Add handbooks or tutorials for the database. • Regularly check the performance of the emulation software stack.
  • 12. Web Crawler solution • This only works for databases with a full web-based frontend displaying a complete list of their objects. • Let a crawler translate all database content into an HTML/JavaScript Container (e.g. WARC file). • Regularly visit the crawl to test accessibility. • In order to make quality assessments: • Let Archive.org crawl the server as well • Also use the CSV solution on the data
  • 13. Solutions and their cost forecast [Chart: cost forecast for the CSV, XML, Disk Image, Docker Image, and Web Crawler solutions; vertical axis 0 to 250, horizontal axis 01.01.2020 to 01.01.2080]
  • 14. Any questions? Want to join the quest? • Further ideas, business models welcome! • I will try to continue collecting answers at #WeMissiPRES • Feel invited to a workshop on the issue at Stuttgart (Germany) in 2021! • Contact me: • Dr. Kai Naumann, Landesarchiv Baden-Württemberg • kai <dot> naumann <at> la-bw <dot> de • Twitter @Naumann_Kai • Phone 0049 711 212 4284
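
As a rough illustration of the CSV solution on slide 8 (export the chosen tables to CSV and describe their fields and relations in XML), the steps might be sketched as follows. This is only a sketch under stated assumptions: SQLite stands in for whatever DBMS actually holds the data, and the function name `export_for_archive` and the layout of `description.xml` are hypothetical, not taken from the slides and not the SIARD schema.

```python
import csv
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path


def export_for_archive(conn, tables, out_dir):
    """Write one CSV per table plus an XML description of fields and relations.

    Hypothetical helper: SQLite is an assumed stand-in for the source DBMS.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    root = ET.Element("database")  # ad-hoc layout for illustration, not SIARD

    for table in tables:
        # Table names come from a curated list chosen by the archivist.
        cur = conn.execute(f'SELECT * FROM "{table}"')
        header = [col[0] for col in cur.description]
        with open(out / f"{table}.csv", "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(cur)

        t = ET.SubElement(root, "table", name=table, file=f"{table}.csv")
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, ctype, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            ET.SubElement(
                t, "field", name=name, type=ctype,
                nullable=str(not notnull).lower(),
                primaryKey=str(bool(pk)).lower(),
            )
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            ET.SubElement(
                t, "relation", field=row[3],
                refTable=row[2], refField=row[4] or "",
            )

    ET.ElementTree(root).write(
        str(out / "description.xml"), encoding="utf-8", xml_declaration=True
    )
    return out
```

Calling `export_for_archive(conn, ["town", "person"], "archive_package")` on an open `sqlite3` connection would leave `town.csv`, `person.csv`, and `description.xml` in the target folder; the screenshots, handbooks, and tutorials mentioned on the slide would be added to the same package by hand.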