2. DEFINITIONS
Source: a run of plain text or text + markup.
Standoff: markup or annotations which occur away from the
source they deal with and which are not referenced directly by
that source.
Annotation: markup that adds ancillary information to a source
or part of a source.
3. STANDOFF
The TEI Guidelines mostly use the term “stand-off” in one sense:
markup that takes source text in one form and reconstructs it
into a different form.
[Diagram: a source (text structured in pages) with standoff markup
(text structured in chapters and paragraphs): restructuring markup.]
4. STANDOFF
But people often also use the term “standoff” to refer to
annotations on the source text that associate new information
and analysis with it.
[Diagram: a source (text) with standoff markup (notes, analysis,
additional information): associative markup.]
5. WHERE IS IT?
Many kinds of associative annotation can occur in multiple
contexts. Notes, for example, can appear inline, at the point
where the note is anchored, or can use the @target attribute to
point at the thing they are annotating.
The source itself can point outward to additional information, for
example a <persName> with a @ref pointing to a <person>
elsewhere: “This string is a name for the person defined over
there”.
So annotation can be inline, referenced, or standoff.
6. SOMETIMES THERE’S A CHOICE
Some annotations can work in all three ways: e.g. <note>
<p>Some text<note>with an inline note</note>.</p>
<p xml:id="id">Some text.</p>
...
<note target="#id">with a standoff note</note>
<p>Some text.<ptr target="#id"/></p>
...
<note xml:id="id">with a referenced note</note>
“Here’s some more information about this paragraph.”
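The standoff form only works if something follows the pointer back to its anchor. A minimal sketch of that lookup, assuming un-namespaced elements and only local "#id" pointers (the sample document, the ids p1 and p2, and the function name resolve_notes are illustrative, not from any TEI tool):

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predeclared XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Illustrative sample; real TEI would be in the TEI namespace.
SAMPLE = """<body>
  <p xml:id="p1">Some text.</p>
  <p xml:id="p2">Other text.</p>
  <note target="#p1">with a standoff note</note>
</body>"""

def resolve_notes(root):
    """Pair each standoff <note> with the element its @target points at."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    pairs = []
    for note in root.iter("note"):
        # @target is a space-separated list of pointers;
        # only local "#id" pointers are handled in this sketch.
        for pointer in note.get("target", "").split():
            if pointer.startswith("#") and pointer[1:] in by_id:
                pairs.append((by_id[pointer[1:]], note))
    return pairs

root = ET.fromstring(SAMPLE)
resolved = [(target.get(XML_ID), note.text) for target, note in resolve_notes(root)]
print(resolved)
```

The referenced form runs the same lookup in the other direction: start from the <ptr> in the source and resolve its @target to the <note>.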
7. BUT
Other types of annotation really only work one way:
<p><seg>When the Alexandrian war flared up,
<persName ref="#JC">Caesar</persName> summoned
every fleet from Rhodes and Syria and Cilicia;
from Crete he raised archers, and cavalry from
Malchus, king of the Nabataeans, and ordered
artillery to be procured, corn despatched, and
auxiliary troops mustered from every quarter.</seg>...</p>
“The enclosed string is a personal name, which refers to
the person defined in the element with the id ‘JC’”.
8. A DIGRESSION ON WORKFLOWS
Why would you do standoff markup?
1. To have it both ways: e.g. mark the source up by pages, but have a version
with the same text and chapters/paragraphs.
2. As a step in the construction of an edition, e.g. having collaborators
identify persons and places without changing the source yet.
3. Adding information to a source you don’t own or can’t modify (but which
is ideally stable).
4. Adding a new category of information to an already complex, highly-
structured source.
9. THREE TYPES OF STANDOFF
Restructuring standoff: virtually rewrites the structure of the
source being annotated; operates on big chunks of text, not really
fragments.
Associative standoff: juxtaposes some part of the source with
a note or other piece of markup; fine-grained, but limited to
attaching one bit of information to another.
Assertive standoff: would make an assertion about some part
of the source, e.g. “This string is a place name.” BUT: how to do
it?
10. HOW; SOME IDEAS
Use restructuring standoff: rewrite the source with the personal names
identified.
Adopt a convention, e.g.:
<p><seg>When the Alexandrian war flared up, Caesar
summoned every fleet from Rhodes and Syria and
Cilicia;...</seg>...</p>
...
<person xml:id="Caesar">Julius Caesar</person>
...
<link
target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>
Note: Not the same thing as <persName ref="#Caesar">Caesar</persName>.
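To make the convention concrete, here is a rough sketch of how software might read such a <link>: split the match(...) pointer into a path and a string, find the element, and locate the string in its text. The regular expression, the function name resolve_link, and the rewriting of "//" to ".//" (ElementTree accepts only relative paths) are assumptions for this sketch, not part of any specification.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<text>
  <p><seg>When the Alexandrian war flared up, Caesar summoned every fleet
  from Rhodes and Syria and Cilicia;</seg></p>
  <person xml:id="Caesar">Julius Caesar</person>
  <link target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>
</text>"""

# Assumed shape of the pointer: #match(PATH,'STRING')
MATCH = re.compile(r"#match\((?P<path>[^,]+),'(?P<string>[^']*)'\)")

def resolve_link(root, link):
    """Return the matched string, its offset in the element text, and the other pointers."""
    target = link.get("target", "")
    m = MATCH.match(target)
    if m is None:
        return None
    # ElementTree rejects absolute paths, so "//p[1]/seg[1]" becomes ".//p[1]/seg[1]".
    element = root.find("." + m["path"])
    if element is None:
        return None
    text = "".join(element.itertext())
    offset = text.find(m["string"])  # first occurrence only; -1 if absent
    # Whatever follows the match pointer is the rest of the space-separated list.
    others = target[m.end():].split()
    return {"string": m["string"], "offset": offset, "others": others, "text": text}

root = ET.fromstring(SAMPLE)
hit = resolve_link(root, root.find("link"))
print(hit["string"], hit["offset"], hit["others"])
```

All the result says is that a stretch of text and #Caesar are linked. What kind of link it is stays outside the markup.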
12. HOW; ASSOCIATION
<link> with @target, which contains a space-separated list of
pointers understood to be associated.
<span> with @from and @to, specifying the start and end of the
thing being annotated, or (somewhat confusingly) with @target.
<note> with @target
All of these require some additional knowledge outside the
markup, because all they do is connect things up (they're
associative).
13. HOW; ASSERTION
Our example using restructuring is assertive. It clearly says “here is a
reading of this passage with personal names identified”, but it has some
drawbacks:
It requires that the whole passage be remade—it can’t target just the
names.
Annotations can have overlap problems, so restructuring runs into the
usual difficulties.
They may have interdependencies (if name x refers to person A, then y
is probably her brother, person B; if not, then y is probably person Z).
14. DOES TEI HAVE ASSERTIVE ANNOTATIONS?
A critical apparatus, or apparatus criticus if you’re being snooty, is
a set of annotations that record textual variants an editor wants
the reader to know about.
<p n="1" xml:id="p1">
<seg n="1" xml:id="seg-1.1">Bello Alexandrino
conflato Caesar <app>
<lem>Rhodo</lem>
<rdg wit="#S" ana="#orthographical">Ordo</rdg>
</app> atque ex Syria Ciliciaque omnem classem
arcessit; ...</seg>
...
</p>
<lem>: the lemma (what’s in the editor’s text).
<rdg>: a reading, from S (Florence, BML Ashburnham 33).
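Each <app> asserts "the editor reads X; witness S reads Y", so one document can yield more than one text. A small sketch of that idea, assuming un-namespaced elements and a trimmed copy of the passage above (the function name reading_text is illustrative):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<p n="1" xml:id="p1">
<seg n="1" xml:id="seg-1.1">Bello Alexandrino conflato Caesar <app>
<lem>Rhodo</lem>
<rdg wit="#S" ana="#orthographical">Ordo</rdg>
</app> atque ex Syria Ciliciaque omnem classem arcessit;</seg>
</p>"""

def reading_text(el, witness=None):
    """Rebuild the text of el, taking the given witness's reading at each
    <app> where it has one, and the lemma everywhere else."""
    parts = [el.text or ""]
    for child in el:
        if child.tag == "app":
            chosen = child.find("lem")
            for rdg in child.findall("rdg"):
                # @wit is a space-separated list of witness pointers.
                if witness in (rdg.get("wit") or "").split():
                    chosen = rdg
            if chosen is not None:
                parts.append(reading_text(chosen, witness))
        else:
            parts.append(reading_text(child, witness))
        parts.append(child.tail or "")
    return "".join(parts)

root = ET.fromstring(SAMPLE)
editor_text = " ".join(reading_text(root).split())
witness_text = " ".join(reading_text(root, "#S").split())
print(editor_text)
print(witness_text)
```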
15. WHAT? WHY WOULD YOU DO SUCH A THING?
Takes the form of inline or standoff notes on the text.
Expressly for making assertive annotations in the form “version x
reads ‘B’ rather than ‘A’ here”.
Can accommodate differences in markup as well as text.
Can cope reasonably well with overlap.
Can handle dependencies / conflicts between annotations.
16. CRITICAL APPARATUS
Can also report prior editors’ emendations of the text or
speculative emendations by the current editor.
Can even record alternate ways of punctuating the text.
So it’s not too far-fetched to consider using it for emendations to
the markup.
18. OK, FINE, BUT YOU SAID SOMETHING ABOUT IMPLEMENTING IT...
We need:
1. A way to identify persons, places, etc.
2. A way to turn that into a usable data source.
3. A way to actually do things with it.
19. RECOGITO (#1)
https://recogito.pelagios.org/
Developed mainly by Rainer Simon of the Austrian Institute of
Technology for the Pelagios Network (https://pelagios.org/)
Designed for, and has most support for, place annotations, but
handles people and organizations too.
Exports to CSV, JSON-LD, RDF, GeoJSON, ...and TEI
Pretty much covers #1. #2 needs a bit of work.
20. #3 TURNS OUT TO BE EASY(ISH)
Given a TEI document and annotations like:
<listApp>
<app from="#match(seg-1.1,'Caesar')">
  <rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg>
</app>
<app from="#match(seg-1.1,'Rhodo')">
  <rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg>
</app>
<app from="#match(seg-1.1,'Syria')">
  <rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/1306">Syria</placeName></rdg>
</app>
<app from="#match(seg-1.1,'Cilicia')">
  <rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/628957">Cilicia</placeName></rdg>
</app>
<app from="#match(seg-1.1,'Creta')">
  <rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/991373">Creta</placeName></rdg>
</app>
...
we can (e.g.) turn the standoff annotations into links.
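One way this step could look in practice is sketched below. It assumes un-namespaced elements, a <seg> that contains only text, and first-occurrence string matching. The wrapper <TEI> sample, the regular expression, and the function name link_segment are illustrative; only three of the annotations are used, because only those strings occur in the stretch of the segment quoted earlier.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# Assumed shape of @from: #match(ID,'STRING')
MATCH = re.compile(r"#match\(([^,]+),'([^']*)'\)")

SAMPLE = """<TEI>
<seg n="1" xml:id="seg-1.1">Bello Alexandrino conflato Caesar Rhodo atque ex Syria Ciliciaque omnem classem arcessit;</seg>
<listApp>
<app from="#match(seg-1.1,'Caesar')"><rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg></app>
<app from="#match(seg-1.1,'Rhodo')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg></app>
<app from="#match(seg-1.1,'Syria')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/1306">Syria</placeName></rdg></app>
</listApp>
</TEI>"""

def link_segment(root, seg_id):
    """Return the segment's text as HTML, with each annotated string wrapped in a link."""
    seg = next(el for el in root.iter("seg") if el.get(XML_ID) == seg_id)
    text = seg.text or ""  # sketch assumes the seg has no child elements
    spans = []
    for app in root.iter("app"):
        m = MATCH.fullmatch(app.get("from", ""))
        if m is None or m.group(1) != seg_id:
            continue
        name = app.find("rdg/*")  # the <persName> or <placeName> in the reading
        start = text.find(m.group(2))
        if name is None or start < 0:
            continue
        spans.append((start, start + len(m.group(2)), name.get("ref", "")))
    out, pos = [], 0
    for start, end, ref in sorted(spans):
        if start < pos:
            continue  # skip annotations overlapping one already emitted
        out.append(escape(text[pos:start]))
        out.append('<a href="%s">%s</a>' % (escape(ref), escape(text[start:end])))
        pos = end
    out.append(escape(text[pos:]))
    return "".join(out)

root = ET.fromstring(SAMPLE)
html_out = link_segment(root, "seg-1.1")
print(html_out)
```

The source segment is never modified. The links exist only in the generated output, so a different set of annotations, or a different editor's, could be applied to the same text.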
21. THE HARD PART
#2, the boring, standards-making part of deciding what TEI standoff
annotations actually look like, is hard. Export is easy (Recogito will
basically already do it), but what does the export look like?
There is a proposal underway for a new TEI <standoff> element
that could contain, e.g., the output of an annotation session.
Maybe later this year we'll be done yelling at each other and be
able to actually define it. I hope there's a place in it for assertive
annotations, even if they don't look precisely like critical apparatus.