Roberto Rosselli Del Turco - Università di Torino - roberto.rossellidelturco@unito.it
Florentina Armaselu - CVCE - florentina.armaselu@cvce.eu
Chiara Di Pietro - Università di Pisa - dipi.chiara@gmail.com
Lars Wieneke - CVCE - lars.wieneke@cvce.eu
Raffaele Masotti - Università di Pisa - raffaele.masotti@gmail.com
www.cvce.eu
Europe’s Beginnings through the Looking Glass: Publishing Historical Documents on the Web Using EVT
The CVCE
Summary 2
1. Overview of the WEU-DIPLO project
2. Experiments with Web publication platforms
3. EVT adaptation
• experiments
• publication framework overview
4. Future work
5. Conclusion
6. References
Summary
Overview of the WEU-DIPLO project: document structure. ©WEU-UEO
Overview WEU-DIPLO 4
[Figure labels: Header, Content, Footer]
1. Goal: XML-TEI encoding, corpus analysis and Web publication of institutional documents of the W.E.U. (Western European Union):
• Topics: armament production, standardisation and control in the period from 1954 to 1982;
• Source: Archives nationales de Luxembourg, W.E.U. collection.
2. Initial format:
• digitized versions (JPEG) of typewritten materials (one file per page).
3. Size:
Category | Documents | Documents EN | Documents FR | Documents FR proc.* | Pages | Pages EN | Pages FR | Pages FR proc.*
Note | 89 | 43 | 46 | 37 | 395 | 191 | 204 | 155
Minutes | 30 | 15 | 15 | 15 | 256 | 138 | 118 | 118
Memorandum | 3 | 1 | 2 | 2 | 16 | 7 | 9 | 9
Study | 2 | 0 | 2 | 1 | 12 | 0 | 12 | 8
Discourse | 1 | 0 | 1 | 0 | 4 | 0 | 4 | 0
Draft protocol | 2 | 1 | 1 | 0 | 4 | 2 | 2 | 0
Total | 127 | 60 | 67 | 55 | 687 | 338 | 349 | 290
*proc. = processed
Overview of the WEU-DIPLO project
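The figures in the size table can be cross-checked with a short script. This is only a sketch: the numbers are copied from the table, while the names `ROWS`, `TOTAL`, `column_sums`, `languages_add_up` and `processed_share` are introduced here for illustration and are not part of the project.

```python
# Rows copied from the size table, in the order:
# (category, docs, docs_en, docs_fr, docs_fr_proc,
#  pages, pages_en, pages_fr, pages_fr_proc)
ROWS = [
    ("Note", 89, 43, 46, 37, 395, 191, 204, 155),
    ("Minutes", 30, 15, 15, 15, 256, 138, 118, 118),
    ("Memorandum", 3, 1, 2, 2, 16, 7, 9, 9),
    ("Study", 2, 0, 2, 1, 12, 0, 12, 8),
    ("Discourse", 1, 0, 1, 0, 4, 0, 4, 0),
    ("Draft protocol", 2, 1, 1, 0, 4, 2, 2, 0),
]

# The "Total" row, without the category label.
TOTAL = (127, 60, 67, 55, 687, 338, 349, 290)


def column_sums(rows):
    """Sum each numeric column over all category rows."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 9))


def languages_add_up(row):
    """Check that EN + FR equals the overall count for documents and pages."""
    _, docs, docs_en, docs_fr, _, pages, pages_en, pages_fr, _ = row
    return docs_en + docs_fr == docs and pages_en + pages_fr == pages


def processed_share(total):
    """Fraction of French pages that have been processed."""
    pages_fr, pages_fr_proc = total[6], total[7]
    return pages_fr_proc / pages_fr


if __name__ == "__main__":
    print("column sums match the Total row:", column_sums(ROWS) == TOTAL)
    print("EN + FR adds up in every row:", all(languages_add_up(r) for r in ROWS))
    print(f"share of French pages processed: {processed_share(TOTAL):.1%}")
```

Keeping the rows as plain tuples makes it easy to rerun the check whenever more French documents are processed.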
Overview of the WEU-DIPLO project: workflow
Microsoft Word Styling (headers, footers) – WEU-DIPLO
Microsoft Word Styling (headings, line breaks, paragraphs) – WEU-DIPLO
XML-TEI Encoding: WEU-DIPLO – metadata, header. ©WEU-UEO
[Figure labels: @@hAuthor, @@hArchNum, @@hStampConfid, @@hDocRef, @@hOrigDate, @@hOrigLang, @@hVersion]
XML-TEI Encoding: WEU-DIPLO – Headings, paragraphs, line breaks. ©WEU-UEO
[Figure labels: @@Heading2, @@Paragraph, @@LineBreak]
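The slides do not show how labels such as @@Heading2, @@Paragraph and @@LineBreak are turned into TEI markup. The sketch below assumes they behave like style names and maps each one to a TEI element; the mapping (`head`, `p`, `lb`), the function `build_div` and the sample text are all assumptions made for illustration, not the project's actual conversion.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_div(styled_runs):
    """Turn (style_name, text) pairs into a TEI <div>.

    Assumed mapping (hypothetical, not taken from the project):
      Heading2  -> <head>
      Paragraph -> <p>
      LineBreak -> <lb/> inside the current paragraph, followed by the text
    """
    div = ET.Element(tei("div"))
    current_p = None
    for style, text in styled_runs:
        if style == "Heading2":
            ET.SubElement(div, tei("head")).text = text
            current_p = None
        elif style == "Paragraph":
            current_p = ET.SubElement(div, tei("p"))
            current_p.text = text
        elif style == "LineBreak":
            if current_p is None:
                # A line break with no open paragraph starts a new one.
                current_p = ET.SubElement(div, tei("p"))
            lb = ET.SubElement(current_p, tei("lb"))
            lb.tail = text
        else:
            raise ValueError(f"unmapped style: {style}")
    return div


if __name__ == "__main__":
    # Placeholder content, not taken from a WEU document.
    runs = [
        ("Heading2", "I. INTRODUCTION"),
        ("Paragraph", "First line"),
        ("LineBreak", "second line"),
    ]
    print(ET.tostring(build_div(runs), encoding="unicode"))
```

Raising an error on an unmapped style keeps unexpected labels from being dropped silently during conversion.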
INTRODUCTION TO EVT
EVT FOR DIPLOMATIC DOCUMENTS
EVT experiments
Experiments 14
(Partial) customisation:
• General layout: folder structure, image renaming.
• EVT Transformer: builder pack (XSLT)
o added/modified templates for transforming specific patterns (headers, footers, paragraphs); layout not fully supported (e.g. sections, subsections, paragraph indentation).
• EVT Viewer: CSS
o added/modified rules to support visualisation in the browser of specific patterns (alignment, text decoration, colour of headers and footers, etc.).
• Manual modification
o XML-TEI input: page breaks linked to the facsimile images;
o transformation output: HTML output changed by hand to support particular features (Text-Link, HotSpot); this should not occur in the real workflow.
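Linking page breaks to the facsimile images is listed above as a manual step. A possible way to automate it is sketched below, using the TEI `facs` and `n` attributes on `pb`. The function `link_page_breaks`, the sample markup and the image file names are hypothetical, and the sketch makes no claim about the naming convention EVT itself expects.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Placeholder transcription with two page breaks (not a WEU document).
SAMPLE = (
    f'<text xmlns="{TEI_NS}"><body>'
    "<pb/><p>Page one text</p><pb/><p>Page two text</p>"
    "</body></text>"
)


def link_page_breaks(root, image_names):
    """Set @n and @facs on each <pb>, in document order.

    Raises ValueError when the number of page breaks and images differ,
    since a silent mismatch would pair pages with the wrong scans.
    """
    page_breaks = list(root.iter(f"{{{TEI_NS}}}pb"))
    if len(page_breaks) != len(image_names):
        raise ValueError(
            f"{len(page_breaks)} page breaks but {len(image_names)} images"
        )
    for number, (pb, image) in enumerate(zip(page_breaks, image_names), start=1):
        pb.set("n", str(number))
        pb.set("facs", image)
    return root


if __name__ == "__main__":
    # Hypothetical file names, one JPEG per page as in the source material.
    linked = link_page_breaks(ET.fromstring(SAMPLE), ["page_001.jpg", "page_002.jpg"])
    print(ET.tostring(linked, encoding="unicode"))
```

Because the source material is one JPEG per page, a sorted directory listing could supply `image_names`, provided the file names sort in page order.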
EVT experiments – facsimile/transcription page side-by-side view (title page). ©WEU-UEO
1. Goal:
• publishing different types of documents on the history of European integration on the CVCE’s Web site.
2. Types of documents (for most of them, high-quality multilingual transcriptions are available in TXT, RTF or SRT format):
• treaties;
• administrative documents (minutes, notes, memoranda);
• press articles;
• handwritten notes;
• letters;
• video and audio archives.
3. Types of features to be implemented, marked (r) for required or (o) for optional:
• side-by-side facsimile/transcription (replicating the original with more or less fidelity) (r);
• multipanel alignment (r);
• text-image link (o);
• zooming (r);
• HotSpot (o), etc.
EVT adaptation – towards a TEI-based publication framework – types of documents/features
EVT adaptation 17
EVT adaptation – towards a TEI-based publication framework – manuscript note (Werner corpus)
EVT adaptation/combination with other tools – towards a TEI-based publication framework – general layout
EVT adaptation – towards a TEI-based publication framework – architecture, workflow
[Figure panels: General architecture; General workflow]
1. Identification of features to be implemented in the digital editions:
• visualisation;
• search.
2. Publication framework design:
• core / plugin;
• optional / project specific.
3. Implementation of the module for XML-TEI conversion (potential adaptation of OxGarage for batch processing).
4. Implementation/integration into existing CVCE architecture:
• Back End;
• Front End.
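The core / plugin split in step 2 could take many forms. The sketch below shows one minimal reading of it, in which required features are always applied and optional ones are enabled per project. The class `PublicationFramework`, its methods and the placeholder features are assumptions made for illustration and do not describe the CVCE design.

```python
from typing import Callable, Dict, Iterable

# A feature takes a document identifier and returns an HTML fragment.
Feature = Callable[[str], str]


class PublicationFramework:
    """Registry that separates core features from optional plugins."""

    def __init__(self) -> None:
        self._core: Dict[str, Feature] = {}
        self._plugins: Dict[str, Feature] = {}

    def register(self, name: str, feature: Feature, required: bool) -> None:
        """Add a feature to the core (required) or plugin (optional) set."""
        if name in self._core or name in self._plugins:
            raise ValueError(f"feature already registered: {name}")
        target = self._core if required else self._plugins
        target[name] = feature

    def build(self, doc_id: str, enabled: Iterable[str] = ()) -> Dict[str, str]:
        """Apply every core feature, then the plugins this project enabled."""
        enabled = list(enabled)
        unknown = [name for name in enabled if name not in self._plugins]
        if unknown:
            raise KeyError(f"unknown plugins: {unknown}")
        active = dict(self._core)
        active.update({name: self._plugins[name] for name in enabled})
        return {name: feature(doc_id) for name, feature in active.items()}


def make_framework() -> PublicationFramework:
    """Register placeholder features; names follow the (r)/(o) feature list."""
    framework = PublicationFramework()
    framework.register("side_by_side", lambda d: f"<div>side by side: {d}</div>", True)
    framework.register("zoom", lambda d: f"<div>zoom: {d}</div>", True)
    framework.register("hotspot", lambda d: f"<div>hotspot: {d}</div>", False)
    return framework


if __name__ == "__main__":
    framework = make_framework()
    print(sorted(framework.build("doc1")))
    print(sorted(framework.build("doc1", enabled=["hotspot"])))
```

Keeping optional features out of the core registry lets a project switch them on without touching the shared code, which matches the gradual development the slides aim for.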
Future work
Future work 21
EVT framework:
• flexible enough to support different types of documents in European integration history;
• possibility to compare original / transcription (of interest for researchers in European integration studies);
• different degrees of fidelity to the original can be envisaged (balance manual / automatic processing).
EVT adaptation:
• minimise the number of manual interventions in the XML-TEI documents;
• publication framework with modular architecture to allow gradual development and customisation according to the needs of the projects.
Conclusion
DEMO
THANKS A LOT FOR YOUR ATTENTION
• EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
• KILN: http://kiln.readthedocs.org/en/latest/#
• TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
• TEI (Text Encoding Initiative): http://www.tei-c.org
• Versioning Machine: http://v-machine.org/
• XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/
References
References 25
