Working digitally with
Historical Documents
Georg Vogeler
@gvogeler
http://www.i-d-e.de
http://informationsmodellierung.uni-graz.at
Napoli, 25.9.2018
Bridge the distance between modern use and
historical production
„Digital Scholarly Edition“ „Historical Analysis“
People in the Past and their Activities
Humanities Scholars, in particular Historians
Archival Document
lists, databases, spreadsheets, reference works, ...
scholarly edition
index, regesta
Texts
Word, PDF, HTML,
SVG, ...
csv, xlsx, SQL, ...
TEI
Digital Images
EAD/ RiC
RDFs, OWL
RDF
CIDOC-CRM, …
Interpretation
Presentation
Data analysis
Annotation
Scan /
Photographs
Description
Transformation
Conceptualisation
Data creation
OCR/HTR,
Transcription
Metasource
(J.-Ph. Genet 1994)
Bridge the distance between modern use and
historical production
„Digital Scholarly Edition“
• Select object
• Digitise the archival document
• Create full text
• Structure text
• Annotate / enrich with external
knowledge
• Convert text into structured data
„Historical Analysis“
• Modelling research question and
the data needed to answer the
question
• Select data
• Evaluating algorithms / tools to
process data
• Visualise / organise data in a
meaningful way
Computational Methods are advancing …
Digitisation
Human
• Selection of objects to be
digitised
• Decision on the appropriate
method
• Quality control
Machine
• Pixel representation
• OCR/HTR
• Make the documents available on
the internet
Digitisation: „Confidence“
50%: http://prhlt-kws.prhlt.upv.es/himanis/?q=bavarie&t=50&r=
48%: http://prhlt-kws.prhlt.upv.es/himanis/?q=bavarie&t=48&r=
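The two query URLs above differ only in the `t` value (50 versus 48). Assuming `t` is a confidence threshold in percent, the effect of lowering it can be sketched as follows. The `Hit` class and the sample hits are invented for illustration and are not real Himanis output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One hypothetical keyword-spotting result (not real Himanis data)."""
    page: str
    word: str
    confidence: float  # in percent, 0-100


# Invented sample hits for the query "bavarie".
HITS = [
    Hit("page-01", "bavarie", 91.0),
    Hit("page-07", "bavarie", 49.2),
    Hit("page-12", "bavarie", 48.5),
    Hit("page-30", "bavarie", 31.0),
]


def filter_hits(hits, threshold):
    """Keep only hits whose confidence is at least `threshold` percent."""
    return [h for h in hits if h.confidence >= threshold]


at_50 = filter_hits(HITS, 50)
at_48 = filter_hits(HITS, 48)

# Hits that only appear once the threshold drops from 50 to 48.
extra = [h for h in at_48 if h not in at_50]

print(len(at_50), "hits at 50%;", len(at_48), "hits at 48%")
for h in extra:
    print("additional candidate:", h.page, h.confidence)
```

A small change of threshold changes which candidates a scholar gets to see at all, which is why the choice of threshold belongs to human quality control.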
Digitisation
Human
• Selection of objects to be
digitised
• Decision on the appropriate
method
• Quality control
• Integrate into scholarly
discourse
Machine
• Pixel representation
• Suggestions for layout
• Suggestions for transcriptions
(trained on human
transcriptions)
• Publish
Information Extraction?
Human Machine
• Named Entity Recognition
• „Topic Modelling“
Screenshot from the ChartEx annotation tool
ChartEx Annotation process (Brat)
Information Extraction
Human
• „semantic“ annotation
• „If you have my name, you still don’t
know me.“
• Manual annotation
• Identifying (Imported / exported)
• Classification schemes
• Integrate into scholarly discourse
Machine
• Named Entity Recognition
• In modern texts
• Linguistic method
• „Topic Modelling“
• groups of words typical for a
specific text chunk
• Linguistic “surface”
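What "groups of words typical for a specific text chunk" means on the linguistic surface can be illustrated with plain TF-IDF weighting. This is a deliberately simpler stand-in, not a topic model, and the three text chunks are invented for the example.

```python
import math
from collections import Counter

# Invented sample chunks; real input would be transcribed documents.
CHUNKS = {
    "charter": "the abbot grants the mill and the meadow to the monastery",
    "account": "paid to the miller for grain and paid to the carter for wine",
    "letter": "the duke writes to the abbot about the war and the truce",
}


def tokenize(text):
    """Split on whitespace only: a pure surface view of the text."""
    return text.lower().split()


def typical_words(chunks, top_n=3):
    """Rank the words of each chunk by TF-IDF and return the top ones."""
    tokens = {name: tokenize(text) for name, text in chunks.items()}
    n_chunks = len(tokens)

    # Document frequency: in how many chunks does each word occur?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    result = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores = {
            w: (c / len(words)) * math.log(n_chunks / doc_freq[w])
            for w, c in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[name] = [w for w, _ in ranked[:top_n]]
    return result


result = typical_words(CHUNKS)
for name, words in result.items():
    print(name, words)
```

The ranking knows nothing about what a mill or a payment is. It only counts strings, so the meaning of each word group still has to be supplied by a human reader.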
The Human in the Loop
Digitisation
• Sensory representation
• Algorithmic conversion
• On the „linguistic surface“
Digital Edition
• Reflecting on the text production
and transmission
• Enrichment with human knowledge
• As part of scholarly discourse
The assertive edition …
… is a scholarly edition which includes a formal representation of the
assertions on the historical reality made by a document in the
interpretation of the editor.
• Assertion: a proposition / statement
• historical reality: what scholars think that people in the past did and suffered
• Made by a document: a physical object carrying text as a means of
communication (made in the past)
• Interpretation of the editor: as only the editors are part of the current
scholarly discourse
• Formal representation: RDF triples linked to a digital representation of the
document
Vogeler 2018
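A minimal sketch of what "RDF triples linked to a digital representation of the document" could look like, written out as N-Triples with the standard RDF reification vocabulary. This is one possible modelling, not necessarily the one used in the cited article. The `http://example.org/` namespace, the `assertedBy` and `interpretedBy` properties and the sample assertion are all placeholders invented for illustration.

```python
from dataclasses import dataclass

EX = "http://example.org/"  # placeholder namespace, not a real vocabulary
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass(frozen=True)
class Assertion:
    """A statement about historical reality as read by an editor."""
    subject: str
    predicate: str
    obj: str
    source: str  # identifier of the digital representation of the document
    editor: str  # who is responsible for this interpretation


def uri(local):
    """Wrap a local name in the placeholder namespace."""
    return f"<{EX}{local}>"


def to_ntriples(assertion, index):
    """Serialise one assertion as a reified statement in N-Triples."""
    node = uri(f"assertion/{index}")
    return [
        f"{node} <{RDF}type> <{RDF}Statement> .",
        f"{node} <{RDF}subject> {uri(assertion.subject)} .",
        f"{node} <{RDF}predicate> {uri(assertion.predicate)} .",
        f"{node} <{RDF}object> {uri(assertion.obj)} .",
        # Hypothetical properties linking the statement to its evidence.
        f"{node} {uri('assertedBy')} {uri(assertion.source)} .",
        f"{node} {uri('interpretedBy')} {uri(assertion.editor)} .",
    ]


# Invented example: a charter asserting a grant.
sample = Assertion(
    subject="person/abbot",
    predicate="grants",
    obj="place/mill",
    source="document/charter-1#line3",
    editor="editor/jane-doe",
)

lines = to_ntriples(sample, 1)
print("\n".join(lines))
```

Keeping the source and the editor on every statement makes both parts of the definition explicit: the assertion is made by a document, and it is recorded in the interpretation of an editor.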
Humans!
Feed the machine
and you will get great insights.
?
Humans!
Integrate the machine into your discourse
and you will get great insights.
Georg Vogeler
georg.vogeler@uni-graz.at
http://www.i-d-e.de
http://informationsmodellierung.uni-graz.at
References
• Himanis: http://himanis.org/
• ChartEx: https://chartex.org/
• Vogeler, Georg (2018). “The ‘assertive edition’”. In: International
Journal of Digital Humanities 1. Forthcoming.
This work is licensed under a Creative Commons Attribution 4.0
International License.
All works by other authors cited here remain their intellectual property and
are used for academic teaching purposes only.

More Related Content

What's hot

[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...
[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...
[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...Digital Classicist Seminar Berlin
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
Micah Altman
 
MDST 3703 F10 Seminar 11
MDST 3703 F10 Seminar 11MDST 3703 F10 Seminar 11
MDST 3703 F10 Seminar 11Rafael Alvarado
 
Towards a Graph of Ancient World Data & an Ecosystem of Gazetteers
Towards a Graph of Ancient World Data & an Ecosystem of GazetteersTowards a Graph of Ancient World Data & an Ecosystem of Gazetteers
Towards a Graph of Ancient World Data & an Ecosystem of Gazetteers
aboutgeo
 
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
Micah Altman
 
Pieterjan Deckers - Medea an online platform for recording metal-detected finds
Pieterjan Deckers - Medea an online platform for recording metal-detected findsPieterjan Deckers - Medea an online platform for recording metal-detected finds
Pieterjan Deckers - Medea an online platform for recording metal-detected finds
ariadnenetwork
 
PhD Projects in Text Mining Research Topics With Source Code
PhD Projects in Text Mining Research Topics With Source CodePhD Projects in Text Mining Research Topics With Source Code
PhD Projects in Text Mining Research Topics With Source Code
PhD Services
 
Text Analysis Methods for Digital Humanities
Text Analysis Methods for Digital HumanitiesText Analysis Methods for Digital Humanities
Text Analysis Methods for Digital Humanities
Helen Bailey
 
Linked Data: principles and examples
Linked Data: principles and examples Linked Data: principles and examples
Linked Data: principles and examples
Victor de Boer
 
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES: A METHODOLOGICAL, ...
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES:  A METHODOLOGICAL, ...(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES:  A METHODOLOGICAL, ...
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES: A METHODOLOGICAL, ...
4Science
 
One day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic Web
Victor de Boer
 
Data curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research processData curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research process
Andrea Scharnhorst
 
(Un)writing the histories of Humanities Computing(s)
(Un)writing the histories of Humanities Computing(s)(Un)writing the histories of Humanities Computing(s)
(Un)writing the histories of Humanities Computing(s)
Edward Vanhoutte
 
Deploy of CENIEH’s new institutional repository
Deploy of CENIEH’s new institutional repositoryDeploy of CENIEH’s new institutional repository
Deploy of CENIEH’s new institutional repository
ariadnenetwork
 
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage 14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
Gravitate Project
 
Dariah Advisory Board June 2009 Peter
Dariah Advisory Board June 2009 PeterDariah Advisory Board June 2009 Peter
Dariah Advisory Board June 2009 Peterpkdoorn
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
Andrea Bollini
 
Archaeological Heritage in the management and information system of the Andal...
Archaeological Heritage in the management and information system of the Andal...Archaeological Heritage in the management and information system of the Andal...
Archaeological Heritage in the management and information system of the Andal...
ariadnenetwork
 

What's hot (18)

[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...
[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...
[DCSB] Dr Gabriel Bodard (KCL) “A View on Digital Classics Collaboration: fro...
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
 
MDST 3703 F10 Seminar 11
MDST 3703 F10 Seminar 11MDST 3703 F10 Seminar 11
MDST 3703 F10 Seminar 11
 
Towards a Graph of Ancient World Data & an Ecosystem of Gazetteers
Towards a Graph of Ancient World Data & an Ecosystem of GazetteersTowards a Graph of Ancient World Data & an Ecosystem of Gazetteers
Towards a Graph of Ancient World Data & an Ecosystem of Gazetteers
 
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J...
 
Pieterjan Deckers - Medea an online platform for recording metal-detected finds
Pieterjan Deckers - Medea an online platform for recording metal-detected findsPieterjan Deckers - Medea an online platform for recording metal-detected finds
Pieterjan Deckers - Medea an online platform for recording metal-detected finds
 
PhD Projects in Text Mining Research Topics With Source Code
PhD Projects in Text Mining Research Topics With Source CodePhD Projects in Text Mining Research Topics With Source Code
PhD Projects in Text Mining Research Topics With Source Code
 
Text Analysis Methods for Digital Humanities
Text Analysis Methods for Digital HumanitiesText Analysis Methods for Digital Humanities
Text Analysis Methods for Digital Humanities
 
Linked Data: principles and examples
Linked Data: principles and examples Linked Data: principles and examples
Linked Data: principles and examples
 
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES: A METHODOLOGICAL, ...
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES:  A METHODOLOGICAL, ...(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES:  A METHODOLOGICAL, ...
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES: A METHODOLOGICAL, ...
 
One day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic Web
 
Data curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research processData curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research process
 
(Un)writing the histories of Humanities Computing(s)
(Un)writing the histories of Humanities Computing(s)(Un)writing the histories of Humanities Computing(s)
(Un)writing the histories of Humanities Computing(s)
 
Deploy of CENIEH’s new institutional repository
Deploy of CENIEH’s new institutional repositoryDeploy of CENIEH’s new institutional repository
Deploy of CENIEH’s new institutional repository
 
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage 14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
14th EUROGRAPHICS Workshop on Graphics and Cultural Heritage
 
Dariah Advisory Board June 2009 Peter
Dariah Advisory Board June 2009 PeterDariah Advisory Board June 2009 Peter
Dariah Advisory Board June 2009 Peter
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
 
Archaeological Heritage in the management and information system of the Andal...
Archaeological Heritage in the management and information system of the Andal...Archaeological Heritage in the management and information system of the Andal...
Archaeological Heritage in the management and information system of the Andal...
 

Similar to Working digitally with Historical Documents

From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
4Science
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
IMPACT Centre of Competence
 
Digital Humanities Workshop
Digital Humanities WorkshopDigital Humanities Workshop
Wikipedia as source of collaboratively created Knowledge Organization Systems
Wikipedia as source of collaboratively created Knowledge Organization SystemsWikipedia as source of collaboratively created Knowledge Organization Systems
Wikipedia as source of collaboratively created Knowledge Organization Systems
Jakob .
 
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
OpenEdition
 
AHRC CDP Digital Humanities 101
AHRC CDP Digital Humanities 101  AHRC CDP Digital Humanities 101
AHRC CDP Digital Humanities 101
Digital Research and Curator Team @ British Library
 
Data Mining Newspapers Metadata
Data Mining Newspapers MetadataData Mining Newspapers Metadata
Data Mining Newspapers Metadata
Jean-Philippe Moreux
 
2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation
ssteuer
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspapers
cneudecker
 
Nemeth Marton - Widening the limits of cognitive reception with online digita...
Nemeth Marton - Widening the limits of cognitive reception with online digita...Nemeth Marton - Widening the limits of cognitive reception with online digita...
Nemeth Marton - Widening the limits of cognitive reception with online digita...
BOBCATSSS 2017
 
Dh presentation helig 2014
Dh presentation helig 2014Dh presentation helig 2014
Dh presentation helig 2014
HELIGLIASA
 
Festival of publishing 2013 slides, London
Festival of publishing 2013 slides, LondonFestival of publishing 2013 slides, London
Festival of publishing 2013 slides, London
Helen K Jeffrey
 
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
Helen K Jeffrey
 
Design challenges, content and tools for cultural heritage
Design challenges, content and tools for cultural heritageDesign challenges, content and tools for cultural heritage
Design challenges, content and tools for cultural heritage
ISMB
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsJon Voss
 
Global Media Monitor - Marko Grobelnik
Global Media Monitor - Marko GrobelnikGlobal Media Monitor - Marko Grobelnik
Global Media Monitor - Marko Grobelnik
Marko Grobelnik
 
Irish Digital Libraries Summit
Irish Digital Libraries SummitIrish Digital Libraries Summit
Irish Digital Libraries Summit
Sebastian Ryszard Kruk
 
Widening the limits of cognitive reception with online digital library graph ...
Widening the limits of cognitive reception with online digital library graph ...Widening the limits of cognitive reception with online digital library graph ...
Widening the limits of cognitive reception with online digital library graph ...
Marton Nemeth
 
Transkribus | Günter Mühlberger
Transkribus | Günter MühlbergerTranskribus | Günter Mühlberger
Transkribus | Günter Mühlberger
Netwerk Oorlogsbronnen
 
Class 5-introto dl
Class 5-introto dlClass 5-introto dl
Class 5-introto dlmadhuvardhan
 

Similar to Working digitally with Historical Documents (20)

From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
From Digital Records to Digital Cultural Landscapes. Beyond Digital Library b...
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
 
Digital Humanities Workshop
Digital Humanities WorkshopDigital Humanities Workshop
Digital Humanities Workshop
 
Wikipedia as source of collaboratively created Knowledge Organization Systems
Wikipedia as source of collaboratively created Knowledge Organization SystemsWikipedia as source of collaboratively created Knowledge Organization Systems
Wikipedia as source of collaboratively created Knowledge Organization Systems
 
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
Du Literary and linguistic computing aux Digital Humanities : retour sur 40 a...
 
AHRC CDP Digital Humanities 101
AHRC CDP Digital Humanities 101  AHRC CDP Digital Humanities 101
AHRC CDP Digital Humanities 101
 
Data Mining Newspapers Metadata
Data Mining Newspapers MetadataData Mining Newspapers Metadata
Data Mining Newspapers Metadata
 
2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspapers
 
Nemeth Marton - Widening the limits of cognitive reception with online digita...
Nemeth Marton - Widening the limits of cognitive reception with online digita...Nemeth Marton - Widening the limits of cognitive reception with online digita...
Nemeth Marton - Widening the limits of cognitive reception with online digita...
 
Dh presentation helig 2014
Dh presentation helig 2014Dh presentation helig 2014
Dh presentation helig 2014
 
Festival of publishing 2013 slides, London
Festival of publishing 2013 slides, LondonFestival of publishing 2013 slides, London
Festival of publishing 2013 slides, London
 
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
Digital Research Conference 2012, Oxford: Re-imagining the literary essay for...
 
Design challenges, content and tools for cultural heritage
Design challenges, content and tools for cultural heritageDesign challenges, content and tools for cultural heritage
Design challenges, content and tools for cultural heritage
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
 
Global Media Monitor - Marko Grobelnik
Global Media Monitor - Marko GrobelnikGlobal Media Monitor - Marko Grobelnik
Global Media Monitor - Marko Grobelnik
 
Irish Digital Libraries Summit
Irish Digital Libraries SummitIrish Digital Libraries Summit
Irish Digital Libraries Summit
 
Widening the limits of cognitive reception with online digital library graph ...
Widening the limits of cognitive reception with online digital library graph ...Widening the limits of cognitive reception with online digital library graph ...
Widening the limits of cognitive reception with online digital library graph ...
 
Transkribus | Günter Mühlberger
Transkribus | Günter MühlbergerTranskribus | Günter Mühlberger
Transkribus | Günter Mühlberger
 
Class 5-introto dl
Class 5-introto dlClass 5-introto dl
Class 5-introto dl
 

More from Georg Vogeler

Standing-off Trees and Graphs : on the affordance of technologies for the edi...
Standing-off Trees and Graphs : on the affordance of technologies for the edi...Standing-off Trees and Graphs : on the affordance of technologies for the edi...
Standing-off Trees and Graphs : on the affordance of technologies for the edi...
Georg Vogeler
 
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über PersonenVon IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
Georg Vogeler
 
Digitising charter images : benefits and pitfalls
Digitising charter images : benefits and pitfallsDigitising charter images : benefits and pitfalls
Digitising charter images : benefits and pitfalls
Georg Vogeler
 
Transformationen: Zum Übergang aus langfristigen Editionsprojekten in die dig...
Transformationen:Zum Übergang aus langfristigen Editionsprojekten in die dig...Transformationen:Zum Übergang aus langfristigen Editionsprojekten in die dig...
Transformationen: Zum Übergang aus langfristigen Editionsprojekten in die dig...
Georg Vogeler
 
Digital diplomatics - Defining a new scope of interpretation of historical do...
Digital diplomatics - Defining a new scope of interpretation of historical do...Digital diplomatics - Defining a new scope of interpretation of historical do...
Digital diplomatics - Defining a new scope of interpretation of historical do...
Georg Vogeler
 
Vernetzung Zum Verhältnis von klassischen Formen der Archiverschließung und I...
VernetzungZum Verhältnis von klassischen Formen der Archiverschließung und I...VernetzungZum Verhältnis von klassischen Formen der Archiverschließung und I...
Vernetzung Zum Verhältnis von klassischen Formen der Archiverschließung und I...
Georg Vogeler
 
Encoding Text About Things (Georg Vogeler)
Encoding Text About Things (Georg Vogeler)Encoding Text About Things (Georg Vogeler)
Encoding Text About Things (Georg Vogeler)
Georg Vogeler
 
Results of “Digital Diplomatics” for the research with medieval documents
Results of “Digital Diplomatics” for the research with medieval documentsResults of “Digital Diplomatics” for the research with medieval documents
Results of “Digital Diplomatics” for the research with medieval documents
Georg Vogeler
 
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Georg Vogeler
 
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Georg Vogeler
 
Possibilities of Digital Analysis of Charter corpora
Possibilities of Digital Analysis of Charter corpora Possibilities of Digital Analysis of Charter corpora
Possibilities of Digital Analysis of Charter corpora
Georg Vogeler
 
Medieval and Early Modern Accounts in the Digital Age
Medieval and Early Modern Accounts in the Digital AgeMedieval and Early Modern Accounts in the Digital Age
Medieval and Early Modern Accounts in the Digital Age
Georg Vogeler
 
Why not edit medieval account books digitally?
Why not edit medieval account books digitally?Why not edit medieval account books digitally?
Why not edit medieval account books digitally?
Georg Vogeler
 
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
Georg Vogeler
 
Charter encoding
Charter encodingCharter encoding
Charter encoding
Georg Vogeler
 

More from Georg Vogeler (15)

Standing-off Trees and Graphs : on the affordance of technologies for the edi...
Standing-off Trees and Graphs : on the affordance of technologies for the edi...Standing-off Trees and Graphs : on the affordance of technologies for the edi...
Standing-off Trees and Graphs : on the affordance of technologies for the edi...
 
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über PersonenVon IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
Von IIIF zu IPIF? Ein Vorschlag für den Datenaustausch über Personen
 
Digitising charter images : benefits and pitfalls
Digitising charter images : benefits and pitfallsDigitising charter images : benefits and pitfalls
Digitising charter images : benefits and pitfalls
 
Transformationen: Zum Übergang aus langfristigen Editionsprojekten in die dig...
Transformationen:Zum Übergang aus langfristigen Editionsprojekten in die dig...Transformationen:Zum Übergang aus langfristigen Editionsprojekten in die dig...
Transformationen: Zum Übergang aus langfristigen Editionsprojekten in die dig...
 
Digital diplomatics - Defining a new scope of interpretation of historical do...
Digital diplomatics - Defining a new scope of interpretation of historical do...Digital diplomatics - Defining a new scope of interpretation of historical do...
Digital diplomatics - Defining a new scope of interpretation of historical do...
 
Vernetzung Zum Verhältnis von klassischen Formen der Archiverschließung und I...
VernetzungZum Verhältnis von klassischen Formen der Archiverschließung und I...VernetzungZum Verhältnis von klassischen Formen der Archiverschließung und I...
Vernetzung Zum Verhältnis von klassischen Formen der Archiverschließung und I...
 
Encoding Text About Things (Georg Vogeler)
Encoding Text About Things (Georg Vogeler)Encoding Text About Things (Georg Vogeler)
Encoding Text About Things (Georg Vogeler)
 
Results of “Digital Diplomatics” for the research with medieval documents
Results of “Digital Diplomatics” for the research with medieval documentsResults of “Digital Diplomatics” for the research with medieval documents
Results of “Digital Diplomatics” for the research with medieval documents
 
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
 
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
Warum werden mittelalterliche und frühneuzeitliche Rechnungsbücher eigentlich...
 
Possibilities of Digital Analysis of Charter corpora
Possibilities of Digital Analysis of Charter corpora Possibilities of Digital Analysis of Charter corpora
Possibilities of Digital Analysis of Charter corpora
 
Medieval and Early Modern Accounts in the Digital Age
Medieval and Early Modern Accounts in the Digital AgeMedieval and Early Modern Accounts in the Digital Age
Medieval and Early Modern Accounts in the Digital Age
 
Why not edit medieval account books digitally?
Why not edit medieval account books digitally?Why not edit medieval account books digitally?
Why not edit medieval account books digitally?
 
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
Semantic Technologies in the Scholarly Edition of Medieval and Early Modern A...
 
Charter encoding
Charter encodingCharter encoding
Charter encoding
 

Working digitally with Historical Documents

  • 1. Working digitally with Historical Documents Georg Vogeler @gvogeler http://www.i-d-e.de http://informationsmodellierung.uni-graz.at Napoli, 25.9.2018
  • 2. Bridge the distance between modern use and historical production „Digital Scholarly Edition“ „Historical Analysis“
  • 3. People in the Past and their Activities Humanities Scholars, in particular Historians Archival Document lists, databases, spreadsheets, reference works, ... scholarly edition index, regesta Texts Word, PDF, HTML, SVG, ... csv, xlsx, SQL ... TEI Digital Images EAD/RiC RDFS, OWL RDF CIDOC-CRM, … Interpretation Presentation Data analysis Annotation Scan / Photographs Description Transformation Conceptualisation Data creation OCR/HTR, Transcription
  • 4. Metasource (J.-Ph. Genet 1994) Bridge the distance between modern use and historical production „Digital Scholarly Edition“ • Select object • Digitise the archival document • Create full text • Structure text • Annotate / enrich with external knowledge • Convert text into structured data „Historical Analysis“ • Model the research question and the data needed to answer it • Select data • Evaluate algorithms / tools to process data • Visualise / organise data in a meaningful way
  • 6. Digitisation Human • Selection of objects to be digitized • Decision on the appropriate method • Quality control Machine • Pixel representation • OCR/HTR • Make the documents available on the internet
  • 8. Digitisation Human • Selection of objects to be digitized • Decision on the appropriate method • Quality control • Integrate into scholarly discourse Machine • Pixel representation • Suggestions for layout • Suggestions for transcriptions (by training with human transcriptions) • Publish
  • 9. Information Extraction? Human Machine • Named Entity Recognition • „Topic Modelling“
  • 10. Screenshot from the ChartEx annotation tool
  • 12. Information Extraction Human • „semantic“ annotation • „If you have my name, you still don't know me.“ • Manual annotation • Identifying (imported / exported) • Classification schemes • Integrate into scholarly discourse Machine • Named Entity Recognition • In modern texts • Linguistic method • „Topic Modelling“ • groups of words typical for a specific text chunk • Linguistic „surface“
  • 13. The Human in the Loop Digitization • Sensoric representation • Algorithmic conversion • On the „linguistic surface“ Digital Edition • Reflecting on the text production and transmission • Enrichment with human knowledge • As part of scholarly discourse
  • 14. The assertive edition … … is a scholarly edition which includes a formal representation of the assertions on the historical reality made by a document in the interpretation of the editor. • Assertion: a proposition / statement • Historical reality: what scholars think that people in the past did and suffered • Made by a document: a physical object carrying text as a means of communication (made in the past) • Interpretation of the editor: as only the editors are part of the current scholarly discourse • Formal representation: RDF triples linked to a digital representation of the document (Vogeler 2018)
  • 15. Humans! Feed the machine and you will get great insights. ?
  • 16. Humans! Integrate the machine into your discourse and you will get great insights. Georg Vogeler georg.vogeler@uni-graz.at http://www.i-d-e.de http://informationsmodellierung.uni-graz.at
  • 17. References • Himanis: http://himanis.org/ • ChartEx: https://chartex.org/ • Vogeler, Georg (2018). “The ‘assertive edition’”. In: International Journal of Digital Humanities 1. Forthcoming.
  • 18. This work is licensed under a Creative Commons Attribution 4.0 International License. All works by other authors cited here remain their intellectual property and are used for academic teaching purposes only.
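
The "formal representation" named on slide 14 (RDF triples linked to a digital representation of the document) can be illustrated with a small sketch. It is not taken from the slides or from Vogeler 2018: the `http://example.org/` namespace, the property names (`subject`, `predicate`, `object`, `assertedBy`, `interpretedBy`), the function names, and the sample charter are all hypothetical, and the reification-style pattern is only one possible way to attach an assertion to its source document and its editor. It uses plain Python rather than an RDF library.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement; all three parts are URIs in this sketch."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace, invented for illustration only.
EX = "http://example.org/"


def assertion_triples(doc_uri: str, assertion_id: str, subject: str,
                      predicate: str, obj: str, editor: str) -> List[Triple]:
    """Describe one assertion and link it to its document and its editor."""
    a = EX + "assertion/" + assertion_id
    return [
        # What the document is read as claiming about the past.
        Triple(a, EX + "subject", subject),
        Triple(a, EX + "predicate", predicate),
        Triple(a, EX + "object", obj),
        # Which document makes the claim, and whose interpretation it is.
        Triple(a, EX + "assertedBy", doc_uri),
        Triple(a, EX + "interpretedBy", editor),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise URI-only triples in N-Triples syntax, one per line."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in triples)


if __name__ == "__main__":
    # Hypothetical example: a charter read as recording a sale of land.
    sample = assertion_triples(
        doc_uri=EX + "charter/1",
        assertion_id="a1",
        subject=EX + "person/seller",
        predicate=EX + "sells",
        obj=EX + "land/field",
        editor=EX + "editor/1",
    )
    print(to_ntriples(sample))
```

Keeping the document and the editor as explicit parts of each assertion mirrors the slide's point that the statement is made by the document and exists only in the editor's interpretation.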