Towards an Infrastructure of Migration
Dirk Roorda
History of MIXED
•   history
•   defining
•   developing
•   using
•   exploiting
what is it?
MIXED is a file format converter
plus a set of formats, called SDFP, i.e.
  Standard Data Formats for Preservation
founding idea
National Archive (NL): testbed
testbed: spreadsheets
XML is an appropriate choice for the
long-term preservation of
spreadsheets. XML can be used to
specify the context, content and
structure of spreadsheets.
testbed: databases
At present, XML is the most
effective strategy for the
durable preservation of
databases. XML is highly
capable of representing the
context, content, and structure
of databases.
This strategy can be
implemented using a number
of different methods.
what do repositories want
Conversion to preservable formats.
  Automatically.
  At most once.
  Faithfully.
preservation strategy
Migration and emulation are complementary
strategies. Migration is best for offering
usable content. Emulation is best for
invoking the original experience.

Migration to XML is
normalised migration,
hence we call it smart migration.
Ingredients
suitable xml formats for your data
software to convert
  legacy data to xml
  ingest data to xml
  xml to dissemination data
connectors to your repository workflow
MIXED - snapshot
timeline
defining MIXED
•   history
•   defining
•   developing
•   using
•   exploiting
XML
XML sounds great

what is MIXED’s XML?
Data kinds
Data comes in kinds, defined by the typical
applications that manipulate it.
Spreadsheets, databases, rich
text, images, audio, video, drawings, ...
The need for these applications is the
basic reason for the threat of data loss
caused by software obsolescence.
standards for data kinds
binary vendor formats (doc)
ascii vendor formats (rtf)
open formats (HTML export)
interchange formats (ad-hoc XML)
standard formats (defined XML: OOXML)
preservation formats (selected XML: SDFP)
SDFP
Standard Data Formats for Preservation
Spreadsheets: ODF subset
Databases: e-David-XML
Statistical Data: DDI
SDFP as umbrella
Datatypes
numbers: ISO 6093
date-time: ISO 8601
characters: UNICODE
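
A minimal sketch of what normalising to such datatypes could look like, written in Python for brevity. The function names are hypothetical and are not part of MIXED. The number helper only produces a locale-independent decimal string and does not check conformance to ISO 6093; the date-time helper emits an ISO 8601-style string in UTC; the text helper applies Unicode NFC normalisation.

```python
import unicodedata
from datetime import datetime, timezone
from decimal import Decimal


def normalise_number(raw: str) -> str:
    """Return a locale-independent decimal string.

    Assumption: a comma in the input is a decimal separator,
    never a thousands separator.
    """
    return str(Decimal(raw.strip().replace(",", ".")))


def normalise_datetime(value: datetime) -> str:
    """Return an ISO 8601-style string in UTC.

    Assumption: `value` is timezone-aware.
    """
    return value.astimezone(timezone.utc).isoformat()


def normalise_text(raw: str) -> str:
    """Return the text in Unicode Normalization Form C (NFC)."""
    return unicodedata.normalize("NFC", raw)
```

A converter could apply helpers like these to every cell or field before writing it out, so that the preserved values no longer depend on the locale settings of the original application.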
Scope (kinds)
initially
  tabular data
       spreadsheets and databases
later
  statistical data
and then
  text, still images, ...
Scope (aspects)
Content semantics
databases
  data model
  data itself
spreadsheets
  cell positions
  values
  formulas
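
To make the spreadsheet aspects concrete, here is a small Python sketch that records a cell's position, value and formula as XML. The element and attribute names (`cell`, `row`, `col`, `value`, `formula`) are invented for illustration; they are not the actual SDFP or ODF vocabulary.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def cell_to_xml(row: int, col: int, value: object,
                formula: Optional[str] = None) -> ET.Element:
    """Build a hypothetical <cell> element holding position, value and formula."""
    cell = ET.Element("cell", {"row": str(row), "col": str(col)})
    ET.SubElement(cell, "value").text = str(value)
    if formula is not None:
        # The formula is kept as plain text; no attempt is made to parse it.
        ET.SubElement(cell, "formula").text = formula
    return cell


# Serialise one cell that has both a computed value and its formula.
xml_text = ET.tostring(cell_to_xml(2, 3, 42, "=A1+B1"), encoding="unicode")
```

Presentation details such as fonts are deliberately absent from the sketch, in line with the scope described here.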
Aspects that didn’t make it
presentation details
  fonts
  forms
action details
  update, insert, delete
  stored procedures
  triggers
developing MIXED
•   history
•   defining
•   developing
•   using
•   exploiting
design principles
building block in workflows
no built-in user interface
easily extensible / updatable
use and produce open source code
framework and plugins
framework
  managing plugins
  managing execution
  administration
plugins
  for each conversion
   from/to SDFP
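
The split between framework and plugins can be sketched as follows. This is an illustration in Python under stated assumptions, not MIXED's actual design or API: the class name `Framework`, the format labels, and the representation of data as plain strings are all hypothetical. The one idea taken from the slide is that every plugin converts either from or to SDFP, so a conversion between two other formats passes through SDFP.

```python
from typing import Callable, Dict, Tuple

SDFP = "sdfp"  # hypothetical label for the preservation format
Converter = Callable[[str], str]


class Framework:
    """Manages plugins and executes conversions (illustrative only)."""

    def __init__(self) -> None:
        self._plugins: Dict[Tuple[str, str], Converter] = {}

    def register(self, source: str, target: str, plugin: Converter) -> None:
        # Every plugin must have SDFP on one side of the conversion.
        if SDFP not in (source, target):
            raise ValueError("every plugin must convert from or to SDFP")
        self._plugins[(source, target)] = plugin

    def convert(self, source: str, target: str, data: str) -> str:
        # A direct step if SDFP is involved, otherwise two steps via SDFP.
        if SDFP in (source, target):
            steps = [(source, target)]
        else:
            steps = [(source, SDFP), (SDFP, target)]
        for step in steps:
            if step not in self._plugins:
                raise LookupError(f"no plugin for {step[0]} -> {step[1]}")
            data = self._plugins[step](data)
        return data


# Toy plugins standing in for real converters.
framework = Framework()
framework.register("legacy", SDFP, str.upper)
framework.register(SDFP, "dissemination", lambda text: text + "!")
result = framework.convert("legacy", "dissemination", "data")
```

With this hub arrangement, supporting a new format takes at most two plugins (to and from SDFP) rather than one per pair of formats.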
issues
how loose/tight are the components
  connected?
pure own Java code / borrow existing
  programs in other languages?
modularity of file type recognition (JHOVE)
Using MIXED
•   history
•   defining
•   developing
•   using
•   exploiting
Data archives
         collect

         preserve

         re-use
improvements for repositories
• users can select format most usable to
  them, irrespective of producer
• users can select the preservation
  format, in case usable formats are not
  supported
• fewer uncertainties in interpretation, either
  by humans or by software
further improvements
combine data from heterogeneous sources
• different formats (straightforward)
• different data models (advanced)
• different data kinds
Exploiting MIXED
•   history
•   defining
•   developing
•   using
•   exploiting
Research Infrastructures
Data on an Infrastructure
•   higher demand for interoperability
•   more need for standards
•   more opportunities for re-use
•   more scope for digital preservation tools
Conversions needed
lots of them ...
Conversion as a service
• a uniform resource
  • yielding uniform results
• easily accessible
• product of community effort
  • a good conversion requires a lot of intelligent
    work
  • quality is reached in an iterative manner
MIXED as Infrastructure
• provides a standard for preservation
  formats
• implements the tools to maintain the
  standard
• accumulates the shared wisdom of data
  formats
when software vendors realize that there should always be
an im/export to a preservation format,
it means ... The End of MIXED

2009 PLANETS Vienna - MIXED migration to XML

  • 1. Towards an Infrastructure of Migration Dirk Roorda
  • 3. History of MIXED • history • defining • developing • using • exploiting
  • 4. what is it? MIXED is a file format converter plus a set of formats, called SDFP, i.e. Standard Data Formats for Preservation
  • 6. testbed: spreadsheets XML is an appropriate choice for the long-term preservation of spreadsheets. XML can be used to specify the context, content and structure of spreadsheets.
  • 7. testbed: databases At present, XML is the most effective strategy for the durable preservation of databases. XML is highly capable of representing the context, content, and structure of databases. This strategy can be implemented using a number of different methods.
  • 8. what do repositories want Conversion to preservable formats. Automatically at most once Faithfully.
  • 9. preservation strategy Migration and emulation are complementary strategies. Migration is best for offering usable content. Emulation is best for invoking the original experience. Migration to XML is normalised migration, hence we coin it smart migration.
  • 10. Ingredients suitable xml formats for your data software to convert legacy data to xml ingest data to xml xml to dissemination data connectors to your repository workflow
  • 13. defining MIXED • history • defining • developing • using • exploiting
  • 14. XML XML sounds great what is MIXED’s XML?
  • 15. Data kinds Data comes in kinds, defined by the typical applications that manipulate it. Spreadsheets, databases, rich text, images, audio, video, drawings, ... The need for these applications is the basic reason for the threat of data loss caused by software obsolescence.
  • 16. standards for data kinds binary vendor formats (doc) ascii vendor formats (rtf) open formats (HTML export) interchange formats (ad-hoc XML) standard formats (defined XML: OOXML) preservation formats (selected XML: SDFP)
  • 17. SDFP Standard Data Formats for Preservation Spreadsheets: ODF subset Databases: e-David-XML Statistical Data: DDI
  • 19. Datatypes numbers: ISO 6093 date-time: ISO 8601-3 characters: UNICODE
  • 20. Scope (kinds) initially tabular data spreadsheets and databases later statistical data and then text, still images, ...
  • 21. Scope (aspects) Content semantics databases cell positions data model values data itself formulas spreadsheets
  • 22. Aspects that didn’t make it presentation details action details fonts update, insert, delete forms stored procedures triggers
  • 23. developing MIXED • history • defining • developing • using • exploiting
  • 24. design principles building block in workflows no built-in user interface easily extensible / updatable use and produce open source code
  • 25. framework and plugins framework managing plugins managing execution administration plugins for each conversion from/to SDFP
  • 26. issues how loose/tight are the components connected? pure own Java code / borrow existing programs in other languages? modularity of file type recognition (JHOVE)
  • 27. Using MIXED • history • defining • developing • using • exploiting
  • 28. Data archives collect preserve re-use
  • 29. improvements for repositories • users can select format most usable to them, irrespective of producer • users can select the preservation format, in case usable formats are not supported • fewer uncertainties in interpretation, either by humans or by software
  • 30. further improvements combine data from heterogeneous sources • different formats (straightforward) • different data models (advanced) • different data kinds
  • 31. Exploiting MIXED • history • defining • developing • using • exploiting
  • 33. Data on an Infrastructure • higher demand for interoperability • more needs for standards • more opportunities for re-use • more scope for digital preservation tools
  • 34. Conversions needed lots of them ...
  • 35. Conversion as a service • a uniform resource • yielding uniform results • easily accessible • product of community effort • a good conversion requires a lot of intelligent work • quality is reached in an iterative manner
  • 36. MIXED as Infrastructure • provides a standard for preservation formats • implements the tools to maintain the standard • accumulates the shared wisdom of data formats
  • 37. when software vendors realize that there should always be an im/export to a preservation format, it means ........... The End of MIXED
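
The framework-and-plugins design from slide 25, combined with the ingest and dissemination steps from slide 10, can be sketched roughly as follows. This is a minimal illustration only, not the MIXED implementation: the class `Framework`, the functions `csv_to_xml` and `xml_to_tsv`, and the toy `<table>/<row>/<cell>` XML are all hypothetical stand-ins, and the toy XML is not an actual SDFP format. The sketch shows one idea: every conversion goes through the preservation format, so each plugin only needs to know one vendor format plus the preservation format.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

# A converter maps one textual representation to another (assumption: text in, text out).
Converter = Callable[[str], str]


class Framework:
    """Hypothetical framework: manages plugins and routes conversions via the preservation format."""

    def __init__(self) -> None:
        self._to_preservation: Dict[str, Converter] = {}
        self._from_preservation: Dict[str, Converter] = {}

    def register(self, fmt: str,
                 to_preservation: Optional[Converter] = None,
                 from_preservation: Optional[Converter] = None) -> None:
        """Register a plugin for one format, in either or both directions."""
        if to_preservation is not None:
            self._to_preservation[fmt] = to_preservation
        if from_preservation is not None:
            self._from_preservation[fmt] = from_preservation

    def convert(self, source: str, target: str, data: str) -> str:
        """Ingest to the preservation format, then disseminate to the target format."""
        preserved = self._to_preservation[source](data)
        return self._from_preservation[target](preserved)


def csv_to_xml(text: str) -> str:
    """Toy ingest plugin: CSV text to a simple XML table (a stand-in, not real SDFP)."""
    table = ET.Element("table")
    for row in csv.reader(io.StringIO(text)):
        row_el = ET.SubElement(table, "row")
        for value in row:
            ET.SubElement(row_el, "cell").text = value
    return ET.tostring(table, encoding="unicode")


def xml_to_tsv(xml_text: str) -> str:
    """Toy dissemination plugin: the simple XML table to tab-separated text."""
    table = ET.fromstring(xml_text)
    lines = ["\t".join(cell.text or "" for cell in row) for row in table]
    return "\n".join(lines)


framework = Framework()
framework.register("csv", to_preservation=csv_to_xml)
framework.register("tsv", from_preservation=xml_to_tsv)
```

With these two plugins registered, `framework.convert("csv", "tsv", data)` turns comma-separated input into tab-separated output without either plugin knowing about the other. Under this hub design, supporting a new format means adding one plugin rather than one converter per format pair.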

Editor's Notes

  1. I first want to express my delight that you have made it to Scheveningen, to this consultation workshop for MIXED.
  2. This is what MIXED is about, according to the White Paper. I wonder whether in the future I should just write Wordles instead of White Papers.
  3. This is an overview of my talk.
  4. Let us now state very briefly what MIXED is, at least, the tangible project result.
  5. Really, MIXED is not so surprising. The idea is quite natural. There have been attempts to put it on the agenda of digital preservation.
  6. They made explicit statements about various data kinds: spreadsheets
  7. The verdict on XML for databases is also positive
  8. So the existing tools for converting to XML leave something to be desired. That leads to the question:
  9. Let us talk a bit about preservation strategy, because you cannot effectively build tools if you do not have a strategy
  10. What are the ingredients for this to work out as desired?
  11. Let's zoom in on the processes of smart migration. First, let us ignore the passage of time; we take that into account after this slide.
  12. Now, considering time, the question is, how long do we have to maintain parts of the system?
  13. Let us have a closer look at what defines MIXED
  14. The first characteristic is a buzz word: XML
  15. The question is: what do we want to model with our XML?
  16. File formats are evolving in a positive direction. As an illustration, look at the formats that Microsoft Word can deal with.
  17. So now we are approaching MIXED again, by means of the selection of a few XML schemas.
  18. SDFP is open and eXtensible; the objective is to gather the best preservation formats for each data kind under one umbrella.
  19. Below XML there is still some structure left: basic data types. XML schema can define them. We chose the ISO definitions.
  20. MIXED is far from complete. It is an experiment, so we have limited our scope in two ways: one is just a matter of choice, the other may be a more intrinsic limitation.
  21. Here are the more intrinsic limitations. This is what we do
  22. and this is what we don’t
  23. By way of introducing the presentation of Jan, I want to say a few words about the MIXED software
  24. When we were designing