IMPLEMENTING TEI STANDOFF
ANNOTATION IN THE BROWSER
Hugh Cayless, Duke University
@hcayless
DEFINITIONS
Source: a run of plain text or text + markup.
Standoff: markup or annotations which occur away from the
source they deal with and which are not referenced directly by
that source.
Annotation: markup that adds ancillary information to a source
or part of a source.
STANDOFF
The TEI Guidelines mostly use the term “stand-off” in one sense:
referring to markup that takes source text in one form and
reconstructs it into a different form.
[Diagram: a source (text structured in pages) and standoff markup (text structured in chapters and paragraphs), connected by restructuring markup.]
STANDOFF
But people also often use the term “standoff” to refer to
annotations on the source text that associate new information
and analysis with it.
[Diagram: a source (text) and standoff markup (notes, analysis, additional information), connected by associative markup.]
WHERE IS IT?
Many kinds of associative annotation can occur in multiple
contexts. Notes, for example, can appear inline, at the point
where the note is anchored, or can use the @target attribute to
point at the thing they are annotating.
The source itself can point outward to additional information, for
example a <persName> with a @ref pointing to a <person>
elsewhere: “This string is a name for the person defined over
there.”
So annotation can be inline, referenced, or standoff.
SOMETIMES THERE’S A CHOICE
Some annotations can work in all three ways: e.g. <note>
<p>Some text<note>with an inline note</note>.</p>

<p xml:id="id">Some text.</p>
...
<note target="#id">with a standoff note</note>

<p>Some text.<ptr target="#id"/></p>
...
<note xml:id="id">with a referenced note</note>
“Here’s some more information about this paragraph.”
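The standoff and referenced forms differ only in which side holds the pointer. A minimal Python sketch of following those pointers, assuming bare "#id" fragments inside a single document; the helper names index_by_id and resolve are hypothetical, and the two ids are made distinct here because both examples share one document:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = """<body>
  <p xml:id="p1">Some text.</p>
  <note target="#p1">with a standoff note</note>
  <p>Some text.<ptr target="#n1"/></p>
  <note xml:id="n1">with a referenced note</note>
</body>"""


def index_by_id(root):
    """Map every xml:id value in the tree to its element."""
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}


def resolve(pointer, ids):
    """Resolve a bare '#id' pointer; return None for anything else."""
    if not pointer.startswith("#"):
        return None
    return ids.get(pointer[1:])


root = ET.fromstring(DOC)
ids = index_by_id(root)

# Standoff: the note points at the paragraph it annotates.
standoff_note = root.find("note[@target]")
annotated = resolve(standoff_note.get("target"), ids)

# Referenced: the paragraph points at its note through <ptr>.
ptr = root.find(".//ptr")
referenced_note = resolve(ptr.get("target"), ids)
```

Either way the same lookup table does the work; only the direction of the pointer changes.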
BUT
Other types of annotation really only work one way:
<p><seg>When the Alexandrian war flared up,
<persName ref="#JC">Caesar</persName> summoned
every fleet from Rhodes and Syria and Cilicia;
from Crete he raised archers, and cavalry from
Malchus, king of the Nabataeans, and ordered
artillery to be procured, corn despatched, and
auxiliary troops mustered from every quarter.</seg>...</p>
“The enclosed string is a personal name, which refers to
the person defined in the element with the id ‘JC’”.
A DIGRESSION ON WORKFLOWS
Why would you do standoff markup?
1. To have it both ways: e.g. mark the source up by pages, but have a version
with the same text and chapters/paragraphs.
2. As a step in the construction of an edition, e.g. having collaborators
identify persons and places without changing the source yet.
3. Adding information to a source you don’t own or can’t modify (but which
is, ideally, stable).
4. Adding a new category of information to an already complex, highly-
structured source.
THREE TYPES OF STANDOFF
Restructuring standoff: virtually rewrites the structure of the
source being annotated; operates on big chunks of text, not really
fragments.
Associative standoff: juxtaposes some part of the source with
a note or other piece of markup; fine-grained, but limited to
attaching one bit of information to another.
Assertive standoff: would make an assertion about some part
of the source, e.g. “This string is a place name.” BUT: how to do
it?
HOW; SOME IDEAS
Use restructuring standoff: rewrite the source with the personal names
identified.
Adopt a convention, e.g.:
<p><seg>When the Alexandrian war flared up, Caesar
summoned every fleet from Rhodes and Syria and
Cilicia;...</seg>...</p>
...
<person xml:id="Caesar">Julius Caesar</person>
...
<link target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>
Note: Not the same thing as <persName ref="#Caesar">Caesar</persName>.
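To make the convention concrete, here is a minimal Python sketch that splits a <link> @target into its pointers and locates the matched string. It treats the quoted argument as a literal string (an assumption; a fuller implementation might treat it as a pattern), does not evaluate the path, and uses the hypothetical helper names parse_match, locate and split_link_target:

```python
import re

# Hypothetical parser for the "#match(path,'string')" convention above.
MATCH_POINTER = re.compile(r"^#match\((?P<path>[^,]+),\s*'(?P<text>[^']*)'\)$")

SEG = ("When the Alexandrian war flared up, Caesar summoned every fleet "
       "from Rhodes and Syria and Cilicia;")


def parse_match(pointer):
    """Split a match() pointer into (path, literal string)."""
    m = MATCH_POINTER.match(pointer)
    if m is None:
        raise ValueError(f"not a match() pointer: {pointer!r}")
    return m.group("path"), m.group("text")


def locate(text, needle):
    """Return (start, end) of the first occurrence of needle, or None."""
    start = text.find(needle)
    if start == -1:
        return None
    return start, start + len(needle)


def split_link_target(target):
    """A <link> @target is a whitespace-separated list of pointers."""
    return target.split()


pointers = split_link_target("#match(//p[1]/seg[1],'Caesar') #Caesar")
path, needle = parse_match(pointers[0])
span = locate(SEG, needle)
```

The link itself still only says that the two pointers belong together; nothing in it states what kind of relationship that is.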
HOW; RESTRUCTURING
<p><seg>When the Alexandrian war flared up,
Caesar summoned every fleet from Rhodes and Syria
and Cilicia; ...</seg>...</p>
<p><seg><join target="#string-range(//p[1]/seg[1],0,36)"/>
<persName ref="#Caesar"><join target="#string-range(//p[1]/seg[1],37,42)"/></persName>
<join target="#string-range(//p[1]/seg[1],43,98)"/> ...</seg>...</p>
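The ranges for such a restructuring can be computed rather than counted by hand. A minimal Python sketch, assuming zero-based (offset, length) arguments for string-range(); the figures in the example above may follow a different counting convention, and the helper names three_way_split and string_range are hypothetical:

```python
def three_way_split(text, needle):
    """Return (offset, length) pairs for the text before, at, and after needle."""
    start = text.find(needle)
    if start == -1:
        raise ValueError(f"{needle!r} not found")
    end = start + len(needle)
    return [(0, start), (start, len(needle)), (end, len(text) - end)]


def string_range(path, offset, length):
    """Format one pointer in the string-range(path, offset, length) shape."""
    return f"#string-range({path},{offset},{length})"


# A shortened version of the passage, used only for illustration.
SEG = "When the Alexandrian war flared up, Caesar summoned every fleet"
ranges = three_way_split(SEG, "Caesar")
pointers = [string_range("//p[1]/seg[1]", off, n) for off, n in ranges]
```

Every name in the passage adds two more ranges, which is why this approach means remaking the whole passage.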
HOW; ASSOCIATION
<link> with @target, which contains a space-separated list of
pointers understood to be associated.
<span> with @from and @to, specifying the start and end of the
thing being annotated, or (somewhat confusingly) with @target.
<note> with @target.
All of these require some additional knowledge outside the
markup, because all they do is connect things up (they're
associative).
HOW; ASSERTION
Our example using restructuring is assertive. It clearly says “here is a
reading of this passage with personal names identified”, but it has some
drawbacks:
It requires that the whole passage be remade—it can’t target just the
names.
Annotations can have overlap problems, so restructuring runs into the
usual difficulties.
They may have interdependencies (if name x refers to person A, then y
is probably her brother, person B; if not, then y is probably person Z).
DOES TEI HAVE ASSERTIVE
ANNOTATIONS?
A critical apparatus, or apparatus criticus if you’re being snooty, is
a set of annotations that record textual variants an editor wants
the reader to know about.
<p n="1" xml:id="p1">
  <seg n="1" xml:id="seg-1.1">Bello Alexandrino
    conflato Caesar <app>
      <lem>Rhodo</lem>
      <rdg wit="#S" ana="#orthographical">Ordo</rdg>
    </app> atque ex Syria Ciliciaque omnem classem
    arcessit; ...</seg>
  ...
</p>
<lem>: the lemma (what’s in the editor’s text).
<rdg>: a reading; from S (Florence, BML Ashburnham 33).
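Because each <app> records who reads what, either version of the text can be rebuilt from it. A minimal Python sketch, assuming un-namespaced elements and a trimmed copy of the segment above; the helper names readings and text_with are hypothetical:

```python
import xml.etree.ElementTree as ET

SEG_XML = """<seg n="1">Bello Alexandrino conflato Caesar <app>
  <lem>Rhodo</lem>
  <rdg wit="#S" ana="#orthographical">Ordo</rdg>
</app> atque ex Syria Ciliciaque omnem classem arcessit;</seg>"""


def readings(app):
    """Return the lemma text and a list of (witness, reading) pairs."""
    lem = app.findtext("lem")
    variants = [(rdg.get("wit"), rdg.text) for rdg in app.findall("rdg")]
    return lem, variants


def text_with(seg, choose):
    """Flatten a <seg>, replacing each <app> with choose(app)."""
    parts = [seg.text or ""]
    for child in seg:
        if child.tag == "app":
            parts.append(choose(child))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


def reading_of(witness):
    """Build a chooser that picks the reading attributed to one witness."""
    def choose(app):
        for rdg in app.findall("rdg"):
            if rdg.get("wit") == witness:
                return rdg.text
        return app.findtext("lem")  # fall back to the lemma
    return choose


seg = ET.fromstring(SEG_XML)
lem, variants = readings(seg.find("app"))
editors_text = text_with(seg, lambda app: app.findtext("lem"))
witness_s = text_with(seg, reading_of("#S"))
```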
WHAT? WHY WOULD YOU DO
SUCH A THING?
Takes the form of inline or standoff notes on the text.
Expressly for making assertive annotations in the form “version x
reads ‘B’ rather than ‘A’ here”.
Can accommodate differences in markup as well as text.
Can cope reasonably well with overlap.
Can handle dependencies / conflicts between annotations.
CRITICAL APPARATUS
Can also report prior editors’ emendations of the text or
speculative emendations by the current editor.
Can even record alternate ways of punctuating the text.
So it’s not too far-fetched to consider using it for emendations to
the markup.
Given:
<p><seg>When the Alexandrian war flared up, Caesar
summoned every fleet from Rhodes and Syria and
Cilicia;...</seg>...</p>
...
<person xml:id="Caesar">Julius Caesar</person>

Instead of:
<link target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>

why not:
<app from="#match(//p[1]/seg[1],'Caesar')">
  <rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg>
</app>
Says, explicitly: “Damon says this is a personal name
referring to Julius Caesar.”
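Read this way, a standoff <app> carries everything needed to state the assertion as data. A minimal Python sketch, assuming un-namespaced elements; the Assertion tuple and the helper names assertions and describe are hypothetical, and the @from pointer is kept as a string rather than resolved:

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Assertion(NamedTuple):
    who: str    # value of rdg/@source
    where: str  # value of app/@from, left unresolved here
    what: str   # name of the element wrapped around the string
    ref: str    # value of that element's @ref
    text: str   # the annotated string


APP = """<app from="#match(//p[1]/seg[1],'Caesar')">
  <rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg>
</app>"""


def assertions(app):
    """Yield one Assertion per element found directly inside each <rdg>."""
    for rdg in app.findall("rdg"):
        for claim in rdg:
            yield Assertion(rdg.get("source"), app.get("from"),
                            claim.tag, claim.get("ref"), claim.text)


def describe(a):
    """Spell the assertion out as a sentence."""
    return f"{a.who} says {a.text!r} at {a.where} is a {a.what} referring to {a.ref}."


found = list(assertions(ET.fromstring(APP)))
```

A second <rdg> with a different @source would yield a second, possibly conflicting, assertion about the same string.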
OK, FINE, BUT YOU SAID SOMETHING
ABOUT IMPLEMENTING IT...
We need:
1. A way to identify persons, places, etc.
2. A way to turn that into a usable data source.
3. A way to actually do things with it.
RECOGITO (#1)
https://recogito.pelagios.org/
Developed mainly by Rainer Simon of the Austrian Institute of
Technology for the Pelagios Network (https://pelagios.org/)
Designed for, and has most support for, place annotations, but
handles people and organizations too.
Exports to CSV, JSON-LD, RDF, GeoJSON, ...and TEI.
Pretty much covers #1. #2 needs a bit of work.
#3 TURNS OUT TO BE EASY(ISH)
Given a TEI document and annotations like:
<listApp>
  <app from="#match(seg-1.1,'Caesar')"><rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg></app>
  <app from="#match(seg-1.1,'Rhodo')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg></app>
  <app from="#match(seg-1.1,'Syria')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/1306">Syria</placeName></rdg></app>
  <app from="#match(seg-1.1,'Cilicia')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/628957">Cilicia</placeName></rdg></app>
  <app from="#match(seg-1.1,'Creta')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/991373">Creta</placeName></rdg></app>
  ...
</listApp>
we can (e.g.) turn the standoff annotations into links.
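One way to picture that step is the following minimal sketch. It is written in Python rather than browser JavaScript and is not the implementation the talk refers to. It assumes un-namespaced elements, a shortened segment text, first-occurrence matching of each string, and the hypothetical helper names spans_for and linkify:

```python
import re
import xml.etree.ElementTree as ET
from html import escape

SEG_ID = "seg-1.1"
# Shortened segment text, used only for illustration.
SEG_TEXT = "Bello Alexandrino conflato Caesar Rhodo atque ex Syria Ciliciaque"

LIST_APP = """<listApp>
  <app from="#match(seg-1.1,'Caesar')"><rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg></app>
  <app from="#match(seg-1.1,'Rhodo')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg></app>
  <app from="#match(seg-1.1,'Syria')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/1306">Syria</placeName></rdg></app>
</listApp>"""

MATCH = re.compile(r"^#match\(([^,]+),\s*'([^']*)'\)$")


def spans_for(list_app, seg_id, text):
    """Collect (start, end, href) for each annotation aimed at seg_id."""
    spans = []
    for app in list_app.iter("app"):
        m = MATCH.match(app.get("from", ""))
        if m is None or m.group(1) != seg_id:
            continue
        start = text.find(m.group(2))  # first occurrence only
        claim = app.find("rdg/*")
        if start == -1 or claim is None or claim.get("ref") is None:
            continue
        spans.append((start, start + len(m.group(2)), claim.get("ref")))
    return sorted(spans)


def linkify(text, spans):
    """Wrap each non-overlapping span in an HTML <a>, escaping the rest."""
    out, cursor = [], 0
    for start, end, href in spans:
        if start < cursor:  # skip anything overlapping an earlier link
            continue
        out.append(escape(text[cursor:start]))
        out.append(f'<a href="{escape(href)}">{escape(text[start:end])}</a>')
        cursor = end
    out.append(escape(text[cursor:]))
    return "".join(out)


html_out = linkify(SEG_TEXT, spans_for(ET.fromstring(LIST_APP), SEG_ID, SEG_TEXT))
```

The source text is never modified: the links exist only in the rendered output, and a different annotation set would produce different links over the same text.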
THE HARD PART
#2, the boring, standards-making part of deciding what TEI standoff
annotations actually look like, is hard. Export is easy (Recogito will
basically already do it), but what does the export look like?
There is a proposal underway for a new TEI <standoff> element
that could contain, e.g., the output of an annotation session.
Maybe later this year we'll be done yelling at each other and be
able to actually define it. I hope there's a place in it for assertive
annotations, even if they don't look precisely like critical apparatus.
QUESTIONS?
