Introduction to digital epigraphy
Emmanuelle Morlock
CNRS, HISoMA (UMR 5189)
French-American program ‘Visible Words’
Information day, Ecole Française d’Athènes, May 4th, 2015
Outline
● Digital Scholarly Editions (DSE)
● Digital epigraphy examples
● How does it work?
What is a DSE*?
*Digital Scholarly Edition
Why digital editions?
1. to facilitate the pooling and exchange of resources
2. for wider dissemination of resources:
○ as web pages
○ multimodal distribution: one single source (XML) => several outputs (HTML, PDF, Word, EPUB, XML, etc.)
3. to overcome the material constraints and limits of print editions
4. to enable new kinds of use (statistics, visualizations, semantic web, big data…)
Text?
Patrick Sahle
A critical representation
● Representation:
○ re-creation, re-presentation of a text
○ model, data structure(s)
● Critical:
○ enhancement of the material with scholarly knowledge:
■ facsimile != digital scholarly edition
● A scholarly edition is about a research question...
○ Research objectives determine what it is necessary to annotate
cf. P. Sahle, Criteria for Reviewing Scholarly Digital Editions, version 1.1
<http://ride.i-d-e.de/reviewers/catalogue-criteria-for-reviewing-scholarly-digital-editions/>
“Model” of the Brandenburg Gate with Lego blocks
Digital epigraphy?
Digital Epigraphy: community-driven from the beginning
● Since 1999-2000
○ 1st draft of EpiDoc as guidelines for the application of TEI
● Today:
○ a mechanism for the creation of complete digital editions
○ a framework maintained by an active community
“The collaborators were seeking a digital encoding method that
preserved the time-tested combination of flexibility and rigor in
editorial expression to which classical epigraphers were
accustomed in print, while bringing to both the creator and the
reader of epigraphic editions the power and reusability of XML.”
a TEI file structure
Digital Epigraphy: What is EpiDoc?
● EpiDoc
○ a subset of TEI tags
○ specific structural constraints:
■ re-expression of the epigraphic lemma in the metadata of the
transcription file (teiHeader)
■ transcription part (text) divided into the conventional parts of a traditional edition: edition, apparatus, bibliography, commentary, translation
○ guidelines for their use, dedicated to epigraphy
○ tools (XSLT transformation files from XML to .html and .txt, ODD schema)
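As a rough illustration of the structural constraints listed above, the following Python sketch parses a heavily simplified EpiDoc-like file and lists its parts. The sample document, its title, and the function name `describe` are assumptions made for this example; a real EpiDoc file carries far more metadata in its teiHeader and should be validated against the EpiDoc schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, heavily simplified EpiDoc-like file (not a complete edition).
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample inscription</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="edition"><ab>...</ab></div>
      <div type="apparatus"/>
      <div type="translation"/>
      <div type="commentary"/>
      <div type="bibliography"/>
    </body>
  </text>
</TEI>"""


def describe(xml_text):
    """Return the title from the header and the types of the text divisions."""
    ns = {"tei": TEI_NS}
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=ns
    )
    div_types = [d.get("type") for d in root.findall("tei:text/tei:body/tei:div", ns)]
    return title, div_types


title, div_types = describe(SAMPLE)
print(title)
print(div_types)
```

The metadata lives under teiHeader, while each conventional part of the printed edition becomes a typed div under text.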
div[@type="commentary"]
critical apparatus entry
preferred reading
alternative
regularized version
diplomatic version
other reading
end of critical apparatus entry
What new interfaces
will you design?
inscriptions as ‘communication devices’...
How does it work?
very basic principle of web page production
the formula
HTML + CSS
=
web page
in a browser
<html>
(...)
<h1>Visible Words</h1>
<p>Editer &amp; Etudier les
inscriptions dans un
environnement numérique :
méthodes, outils,
ressources</p>
(...)
</html>
body {
font-family:Times;
}
h1 {
font-size: 200%;
color: green;
font-weight: bold;
}
p {
color: black;
font-size: 100%;
margin-top:10%;
}
Visible Words
Editer & Etudier les inscriptions dans
un environnement numérique :
méthodes, outils, ressources
h1
(title level 1)
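The formula above can be sketched in a few lines of Python. The function name `build_page` is an assumption made for this example, and embedding the CSS in a style element is only one of several ways to attach a stylesheet to a page.

```python
from html import escape

# CSS rules as shown on the slide.
CSS = """
body { font-family: Times; }
h1 { font-size: 200%; color: green; font-weight: bold; }
p { color: black; font-size: 100%; margin-top: 10%; }
"""


def build_page(heading, paragraph, css=CSS):
    """Return one HTML document: structure (HTML) plus appearance (CSS)."""
    return (
        "<html><head><style>" + css + "</style></head><body>"
        + "<h1>" + escape(heading) + "</h1>"
        + "<p>" + escape(paragraph) + "</p>"
        + "</body></html>"
    )


page = build_page("Visible Words", "Editer & Etudier les inscriptions")
print(page)
```

Opening the resulting string in a browser as an .html file would show a green, bold heading above a black paragraph. Note that the ampersand is escaped as `&amp;` in the HTML source.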
How do you do it?
XML file => transformation (XSLT, XQuery) => HTML file
many XML files => transformation (XSLT, XQuery) => Index, TOC, RDF, etc.
edition as the design of information artifacts
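The two transformations in this diagram can be imitated in Python. This is a sketch only: it uses the standard library's ElementTree rather than XSLT or XQuery, and the element names (`inscription`, `title`, `text`, `name`), the sample files, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source files, keyed by document identifier.
DOCS = {
    "doc1": "<inscription><title>Dedication</title>"
            "<text>To <name>Apollo</name></text></inscription>",
    "doc2": "<inscription><title>Decree</title>"
            "<text><name>Artemis</name> and <name>Apollo</name></text></inscription>",
}


def to_html(xml_text):
    """One XML file => one HTML fragment."""
    src = ET.fromstring(xml_text)
    div = ET.Element("div")
    ET.SubElement(div, "h1").text = src.findtext("title")
    # itertext() flattens the text, dropping the <name> tags.
    ET.SubElement(div, "p").text = "".join(src.find("text").itertext())
    return ET.tostring(div, encoding="unicode")


def build_index(xml_texts):
    """Many XML files => one index of names, with the documents citing each."""
    index = {}
    for doc_id, xml_text in xml_texts.items():
        for name in ET.fromstring(xml_text).iter("name"):
            index.setdefault(name.text, []).append(doc_id)
    return dict(sorted(index.items()))


print(to_html(DOCS["doc1"]))
print(build_index(DOCS))
```

The same source files feed both outputs, which is the point of the diagram: the edition is encoded once and the reading text, index, table of contents, and so on are derived from it.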
Why XML?
the basics
XML in short
1. XML doesn’t do anything. It only describes, by means of tags (delimiters).
In the context of text representation, it describes text structures in particular (book, section, chapter, paragraph, etc.).
2. XML tags are not predefined.
You can freely create your own tags (according to your research interests, for example).
3. But a grammar for the tags can be defined (DTD or schema).
This provides some rigour and a means of sharing a common language between projects.
4. XML is designed to be self-descriptive and is easy to read.
You can open any XML file with any text editor and read the tag names (it’s English!)
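Points 2 and 3 can be illustrated with a small Python sketch. The tags (`book`, `chapter`, `title`, `paragraph`) are invented for the example, and the `GRAMMAR` dictionary is only a toy stand-in for a real DTD or schema: it records which child tags each tag may contain and nothing more.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar: which child tags each tag may contain.
GRAMMAR = {
    "book": {"chapter"},
    "chapter": {"title", "paragraph"},
    "title": set(),
    "paragraph": set(),
}


def violations(element, grammar=GRAMMAR):
    """Return messages for every tag that the grammar does not allow."""
    problems = []
    if element.tag not in grammar:
        problems.append(f"unknown tag <{element.tag}>")
    allowed = grammar.get(element.tag, set())
    for child in element:
        if child.tag in grammar and child.tag not in allowed:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        problems.extend(violations(child, grammar))
    return problems


good = ET.fromstring(
    "<book><chapter><title>One</title><paragraph>Text</paragraph></chapter></book>"
)
bad = ET.fromstring("<book><paragraph>Text</paragraph><poem/></book>")

print(violations(good))
print(violations(bad))
```

Both documents are well-formed XML, so any parser will read them. Only the second one breaks the shared grammar, which is what makes a common schema useful between projects.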
Descriptive markup - 1
★ chunks of text (of all sizes) delimited by start tag and
end tag
★ description of nature or function in tag name
<tagX>My content</tagX>
start tag: <tagX>
chunk of text: My content
end tag: </tagX>
Descriptive markup - 2
★ Attributes: additional information
<handNote xml:id="EP" medium="red-ink">
Ezra Pound's annotations.
</handNote>
value
attribute
name
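A minimal Python sketch shows how a program sees the tag name, the attributes, and the content of the element above. It assumes the standard library's ElementTree parser, which exposes the reserved `xml:` prefix under its full namespace URI. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The reserved "xml:" prefix expands to this namespace when parsed.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SOURCE = """<handNote xml:id="EP" medium="red-ink">
Ezra Pound's annotations.
</handNote>"""

note = ET.fromstring(SOURCE)
tag = note.tag                   # the tag name
attributes = dict(note.attrib)   # attribute name => value
content = note.text.strip()      # the chunk of text between the tags

print(tag)
print(attributes)
print(content)
```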
Descriptive markup - 3
★ descriptive markup says what things are.
○ not what is to be done with the data (procedural information)
○ not how they are to be displayed (presentational information).
○ The objective is to describe the function, not the final appearance.
★ Separation of form and content
★ Compare:
<author>Louise Labé</author> (descriptive: says what the text is)
<span class="small-caps">Louise Labé</span> (presentational: says how it looks)
★ More flexibility:
○ same underlying data for multiple presentations
○ presentation easy to change through stylesheets, etc.
○ facilitates the addition of multiple annotations and re-use
More specifically
XML file:
<author><forename>Louise</forename>
<surname>Labé</surname></author>
CSS file:
surname { font-variant: small-caps; font-family: Times; }
Web page in
browser:
Louise LABÉ
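The same separation of form and content can be sketched in Python: one descriptive source, several presentations chosen at output time. The style names `caps` and `index` and the function `render` are assumptions made for this example, and upper-casing is used here only as a plain-text approximation of small caps.

```python
import xml.etree.ElementTree as ET

SOURCE = "<author><forename>Louise</forename> <surname>Labé</surname></author>"


def render(xml_text, style):
    """Render the same descriptive markup in different presentations."""
    author = ET.fromstring(xml_text)
    forename = author.findtext("forename")
    surname = author.findtext("surname")
    if style == "caps":
        # Plain-text approximation of the small-caps rule in the CSS.
        return f"{forename} {surname.upper()}"
    if style == "index":
        # Inverted form, as in an index of authors.
        return f"{surname}, {forename}"
    raise ValueError(f"unknown style: {style}")


print(render(SOURCE, "caps"))
print(render(SOURCE, "index"))
```

Neither presentation could be derived reliably from `<span class="small-caps">`, because that markup does not say which part of the string is the surname.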
Advantages of TEI/EpiDoc markup
Expressiveness
Exploitability
Upgradability
Reusability
EpiDoc encoding example: abbreviation
<expan>
<abbr>a</abbr>
<ex>bc</ex>
</expan>
<expan>
<abbr>
<supplied reason="lost" cert="low">F</supplied>el
</abbr>
<ex cert="low">icitati</ex>
</expan>
a(bc)
Default (Panciera) style: [F?]el(icitati?)
Duke Databank style: [F(?)]el(icitati(?))
London style: [F?]el(icitati?)
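How one encoding yields several Leiden renderings can be sketched in Python. This is not the EpiDoc stylesheets themselves, which are written in XSLT and cover many more cases. The function name `to_leiden` and the style labels `panciera` and `ddbdp` are chosen for this example, the TEI namespace is omitted, and the samples are written on one line so that indentation whitespace does not leak into the output.

```python
import xml.etree.ElementTree as ET

SIMPLE = "<expan><abbr>a</abbr><ex>bc</ex></expan>"
UNCERTAIN = (
    '<expan><abbr><supplied reason="lost" cert="low">F</supplied>el</abbr>'
    '<ex cert="low">icitati</ex></expan>'
)


def to_leiden(element, style="panciera"):
    """Render expan/abbr/ex/supplied as Leiden-style text."""
    # The mark for low certainty is the only difference between the two styles here.
    mark = "(?)" if style == "ddbdp" else "?"
    inner = (element.text or "") + "".join(
        to_leiden(child, style) + (child.tail or "") for child in element
    )
    doubt = mark if element.get("cert") == "low" else ""
    if element.tag == "ex":
        return f"({inner}{doubt})"      # editorial expansion
    if element.tag == "supplied" and element.get("reason") == "lost":
        return f"[{inner}{doubt}]"      # restored lost text
    return inner                        # expan and abbr add no brackets


for source in (SIMPLE, UNCERTAIN):
    root = ET.fromstring(source)
    print(to_leiden(root), "|", to_leiden(root, style="ddbdp"))
```

The encoding records that the restoration and the expansion are uncertain. How that uncertainty is printed is decided only at output time.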
Tools: Oxygen Editor
Schema &
documentation
Wrap up - 1
● Digitized vs digital
○ if you can reproduce your edition in print without substantial loss, you’re not really doing a digital scholarly edition…
● Encoding text allows you to:
○ publish texts electronically
○ capture semantic distinctions
○ produce multiple outputs from a single input
○ interchange with other projects
■ federated searches
■ linked data
○ reuse the data
○ ensure long-term sustainability
Wrap up - 2
● Markup may be an intellectual activity:
○ there is no such thing as a neutral markup
○ the editor’s job: deciding what markup to apply and how this represents his understanding
● It’s not difficult: Philology is encoding
