Wisneski TeI workshop 2009-2010

Introduction to text encoding and TEI

Published in: Education, Technology

  1. Introduction to Text Encoding and the Text Encoding Initiative (TEI). Richard Wisneski, Head, Bibliographic/Metadata Services, Kelvin Smith Library, Case Western Reserve University, 2009-2010
  2. First, Some Ground Rules
     • This stuff can get difficult.
     • It takes time, practice, and patience to learn.
     • We can only cover so much in this session, but there are further resources to consult afterward.
  3. Sources to Consult
     • P5 Guidelines, PDF link (the current “Bible” for text encoding): http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html
     • P5 Guidelines, esp. Appendix C: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html
     • Women Writers Project Guide to Scholarly Encoding: http://www.wwp.brown.edu/encoding/guide/index.html
  4. PART 1: Overview of Text Encoding
  5. What Is Text Encoding?
     • Text encoding marks up a document in XML to capture metadata (administrative, descriptive, technical, preservation) AND to represent textual features important for research.
     • Examples:
         • The Poetess Archive
         • Women Writers Online
         • The Dolley Madison Digital Edition
         • The Walt Whitman Archive
  6. Quick Example
     <lg>
       <head>After <del>an</del><add>the <del>unsolv’d</del></add> argument</head>
       <l><del>The</del><add><del>Coming in,</del> A group of</add> little children, and their <lb/>ways and chatter, flow in <del>upon me</del></l>
       <l>Like <add>welcome</add> rippling water o'er my <lb/>heated <add>nerves and</add> flesh.</l>
     </lg>
  7. What Text Encoding Is NOT
     • Text encoding does NOT attempt to provide one unique, authoritative version of a work. It often pairs the document with interpretation (markup and metadata).
     • Text encoding does NOT provide one static, permanent markup for a document. Alternative markup is acceptable in certain instances, but markup can also be simply incorrect.
     • Text encoding (TEI) is NOT meant to supply an encoding recommendation for every possibility; it is intended to be customized and modified within the TEI Guidelines.
  8. Why Do Text Encoding?
     • To give researchers access to an electronic text that does not require special-purpose software or hardware
     • To analyze information: provide a standard text-encoding scheme and metadata language that accommodates searching, retrieval, etc.
     • To share information: have a standard format for data interchange in humanities research
  9. Text Encoding Allows Users To…
     • Tailor searching under specific genres (e.g. verse, drama, prose)
     • Search different formats (e.g. chronicle, diary)
     • Search across collections
     • Search by mode (e.g. satire, pastoral)
     • Search by historical or geographic period
     • Search by title, author, and subject headings
     • Search via structural features of the text itself, including:
         • Sections
         • Headings
         • Paragraphs
         • Quotations
         • Highlighting
         • Footnotes
         • Captions
  10. Who Does Text Encoding? Where Is It Found?
      • Digital libraries and digital archives
      • Anthropology and social sciences
      • Literary and cultural materials
      • Scholarly editions
      • Manuscript collections and descriptions
      • Dictionaries
      • Language corpora
      • Historical documents
      • Authoring
      • Linguistics
  11. What Is the Text Encoding Initiative (TEI)?
      • Technically: a standards organization for humanities text encoding
      • Organizationally: an international membership consortium
      • Socially: a community of people and projects
      • Web site: http://www.tei-c.org/
  12. PART 2: Text Encoding and XML
  13. Text Encoding and XML
      • Texts are encoded using eXtensible Markup Language (XML).
      • XML is:
          • Easy to understand
          • Non-proprietary plain text: human readable, software independent, hardware independent
          • (Relatively) easy to write a parser for
          • Widespread: well supported by commercial and open-source software
  14. XML Documents Must Be:
      • Well-formed: free of syntax errors and conforming to the XML specification. For example:
          <title>Little Memoirs of the Nineteenth Century</title>
          <author>George Paston</author>
      • Valid: satisfying the rules of a DTD, XML Schema, or RELAX NG schema.
      • If the DTD or schema says that the author name must come before the title, the content above would be rejected.
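A minimal sketch of the well-formedness half of this slide, using Python's standard-library parser. The `<bibl>` wrapper is an assumption added for illustration, because a parser needs a single root element around the slide's two elements; `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The slide's two elements, wrapped in a hypothetical <bibl> root.
good = (
    "<bibl><title>Little Memoirs of the Nineteenth Century</title>"
    "<author>George Paston</author></bibl>"
)
# Same content, but <title> is never closed, so the tags no longer nest.
bad = (
    "<bibl><title>Little Memoirs of the Nineteenth Century"
    "<author>George Paston</author></bibl>"
)

print(is_well_formed(good))
print(is_well_formed(bad))
```

This only tests syntax. Whether the author must precede the title is a validity question, which requires a schema.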
  15. XML Vocabulary
      • Elements, Content, Attributes, Values
          <titleStmt>
            <title type="m">Little Memoirs</title>
          </titleStmt>
      • Element: <title>. Attribute: type. Value: "m". Content: Little Memoirs.
      • Nested: <titleStmt> is the PARENT ELEMENT; <title> is the CHILD ELEMENT of <titleStmt>.
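The four vocabulary terms can be read straight off a parsed document. This sketch parses the slide's fragment with Python's standard library; the dictionary keys are illustrative labels only, not TEI terminology.

```python
import xml.etree.ElementTree as ET

stmt = ET.fromstring('<titleStmt><title type="m">Little Memoirs</title></titleStmt>')
title = stmt[0]  # first (and only) child element of <titleStmt>

parts = {
    "parent": stmt.tag,                  # the enclosing element
    "element": title.tag,                # the child element
    "attribute": list(title.attrib)[0],  # the attribute name
    "value": title.get("type"),          # the attribute value
    "content": title.text,               # the text between the tags
}

for label, item in parts.items():
    print(f"{label}: {item}")
```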
  16. Quick Example
      <biblStruct>
        <titleStmt>
          <title level="m">Early history of the Cleveland Public Schools</title>
          <author><persName>Freese, Andrew</persName></author>
        </titleStmt>
        <extent>128 p. : ill. ; 23 cm.</extent>
        <publicationStmt>
          <!-- groups information concerning publisher, place of publication, and date of the text -->
          <pubPlace>Cleveland, Ohio</pubPlace>
          <publisher>Robison, Savage &amp; Co., Book Printers</publisher>
          <date>1876</date>
          <!-- contains a date in any format, with normalized value in the value attribute, of bibliographic item's original publication -->
        </publicationStmt>
        <notesStmt>
          <note>by Andrew Freese ; Published by order of the Board of Education.</note>
        </notesStmt>
      </biblStruct>
  17. Validity
      • A valid TEI document follows the rules of a schema that describes it.
      • The schema (or DTD) ensures that all required elements are present in the document.
      • The schema may prevent undefined elements from being used.
      • The schema may enforce a specific data structure.
      • The schema may specify the use of attributes and define their possible values.
      • The schema may define default values for attributes.
      • An XML document can be well-formed but NOT valid.
      • An XML document can never be valid without being well-formed.
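To make "well-formed but NOT valid" concrete, here is a toy stand-in for a schema rule. It is not a schema validator: `children_in_order` is a hypothetical helper that checks only child order, and the `<bibl>` wrapper and the "author before title" rule are assumptions carried over from the earlier slide. A real project would validate against the TEI schema with a schema-aware tool.

```python
import xml.etree.ElementTree as ET


def children_in_order(parent, expected_tags):
    """Toy rule check: the parent's children must be exactly these tags, in order."""
    return [child.tag for child in parent] == list(expected_tags)


# Parsing succeeds, so the document is well-formed.
doc = ET.fromstring(
    "<bibl><title>Little Memoirs of the Nineteenth Century</title>"
    "<author>George Paston</author></bibl>"
)

# Hypothetical rule: the author must come before the title.
rule = ["author", "title"]
valid = children_in_order(doc, rule)
print(valid)
```

The document parses without error, yet it fails the rule, which is the distinction the slide draws.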
  18. Schema Examples
      Instance:
      <book measure="centimeters">21</book>
      Schema:
      <xs:element name="book">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="measure" type="xs:string"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      Instance:
      <book bookISBN="152-32-29359535">Go Tell It on the Mountain</book>
      <authorLastName>Baldwin</authorLastName>
      <authorFirstName>James</authorFirstName>
      Schema:
      <xs:element name="book">
        <xs:complexType>
          <xs:sequence>
            <xs:element ref="authorLastName"/>
            <xs:element ref="authorFirstName"/>
          </xs:sequence>
          <xs:attribute ref="bookISBN" use="required"/>
        </xs:complexType>
      </xs:element>
  19. PART 3: Levels of TEI Encoding
  20. Five Levels
      • The latest iteration of TEI is Proposal 5 (a.k.a. P5).
      • The current TEI Consortium Best Practices Group (BPG, formed in Summer 2008) has been establishing best practices and standards for:
          • TEI headers
          • Level One: Fully Automated Conversion and Encoding
          • Level Two: Minimal Encoding
          • Level Three: Simple Analysis
          • Level Four: Basic Content Analysis
          • Level Five: Scholarly Encoding Projects
      • The BPG will present its work at the Digital Library Federation conference in early May, get feedback, and publish a final document later in 2009.
  21. Level 1 Encoding: Fully Automated Conversion and Encoding
      • Purpose: to create electronic text primarily for keyword searching and linking to page images.
      • The text is subordinate to the page image and is not intended to stand alone as an electronic text (without page images).
      • Most suitable when:
          • A large volume of material is to be made available online quickly
          • A digital image of each page is desired
          • No manual intervention is performed in the text creation process
          • The material is of interest to a large community of users who wish to read texts that allow keyword searching
          • Sophisticated search and display capabilities based on the structure of the text are not necessary
  22. Level 1 Encoding: Characteristics
      • <div1> or <div>: There should be only one child of <body>: a single <div> (or <div1>).
      • <ab>: There should be only one child of the <div> (or <div1>): a single <ab> wrapping all OCR text. If the text is ever “upgraded” to Level 3 or higher, the <ab> element will be replaced by structural elements like <p> and <table>.
      • <pb>: Required in Level 1. Page images can be linked to the text by specifying a JPEG or other image file as the value of the facs attribute. The n attribute records the number printed on the page.
      • The Task Force sees the use of METS here as having a tremendous advantage. METS/TEI page-turning documentation will be included in the near future.
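Because Level 1 text exists mainly to link to page images, a page-turner needs the (page number, image file) pairs. The sketch below pulls them from the `n` and `facs` attributes of `<pb>`. The sample is a trimmed fragment written for illustration (no `<teiHeader>`, so not a complete TEI document), and `page_images` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI namespace in ElementTree's {uri}tag form

# Trimmed Level 1 fragment: one <div>, one <ab>, <pb/> milestones inside.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div><ab>
    <pb n="113" facs="00000001.tif"/>first page of OCR text
    <pb n="114" facs="00000002.tif"/>second page of OCR text
  </ab></div></body></text>
</TEI>"""


def page_images(tei_xml):
    """Return (page number, image file) pairs in document order."""
    root = ET.fromstring(tei_xml)
    return [(pb.get("n"), pb.get("facs")) for pb in root.iter(TEI + "pb")]


pages = page_images(sample)
for number, image in pages:
    print(number, image)
```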
  23. Level 2 Encoding: Minimal Encoding
      • Purpose: to create electronic text for full-text searching, linking to page images, and identifying simple structural hierarchy to improve navigation. (For example, you can create a table of contents from such encoding.)
      • The text is mainly subordinate to the page image, though navigational markers (textual divisions, headings) are captured. However, the text could stand alone as electronic text (without page images).
      • Requires some human intervention to identify each textual division and heading.
      • Most suitable when:
          • A large volume of material is to be made available online quickly
          • A digital image of each page is desired
          • The material is of interest to a large community of users who wish to read texts that allow keyword searching
          • Rudimentary search and display capabilities based on the large structures of the text are desired
          • Each text is checked to ensure that divisions and headers are properly identified
  24. Level 2 Encoding: Characteristics
      All elements specified in Level 1, plus the following:
      • <front>, <back>: Optional.
      • <div1> or <div>: If no type attribute is specified, a type value of "section" should be presumed.
      • <head>: Required if present.
      • <ab>: At least one container element is required.
      • <fw>: Running heads; can be automatically generated.
  25. P5 Level 2 Encoding Template
      <?oxygen RNGSchema="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="xml"?>
      <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <teiHeader type="text"> [stuff] </teiHeader>
        <text>
          <front>
            [title page information, table of contents, prefaces, etc.]
            [optional]
          </front>
          <body>
            <div type="section">
              <pb n="1" facs="[URI of page 1 image]"/>
              <head>[heading of section 1]</head>
              <ab>[entire contents of section 1 here, with interspersed <pb/> elements pointing to page images; in this example there are 26 more pages to section 1]</ab>
            </div>
            <div type="section">
              <pb n="27" facs="[URI of page 27 image]"/>
              <div type="subsection">
                <head>[heading of section 2 subsection 1]</head>
                <ab>[all the paragraphs of subsection one go here with page breaks inserted]</ab>
              </div>
            </div>
          </body>
          <back> [optional] </back>
        </text>
      </TEI>
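Level 2 captures divisions and headings so that navigation aids such as a table of contents can be generated. A minimal sketch of that idea, assuming unnumbered `<div>` elements as in the template; the sample headings are invented placeholders and `toc` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
  <div type="section"><head>Section 1</head><ab>text</ab></div>
  <div type="section">
    <div type="subsection"><head>Section 2, subsection 1</head><ab>text</ab></div>
  </div>
</body></text></TEI>"""


def toc(element, depth=0):
    """Collect (nesting depth, heading text) for every <div> that has a <head>."""
    entries = []
    for div in element.findall(TEI + "div"):  # direct child divisions only
        head = div.find(TEI + "head")
        if head is not None:
            entries.append((depth, "".join(head.itertext()).strip()))
        entries.extend(toc(div, depth + 1))   # descend into nested divisions
    return entries


body = ET.fromstring(sample).find(f"{TEI}text/{TEI}body")
contents = toc(body)
for depth, heading in contents:
    print("  " * depth + heading)
```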
  26. P5 Level 2 Encoding Example
      <?oxygen RNGSchema="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="xml"?>
      <TEI xml:id="someid" xmlns="http://www.tei-c.org/ns/1.0">
        <teiHeader> [Source and processing information goes here] </teiHeader>
        <text>
          <body>
            <div1>
              <pb n="113" facs="00000001.tif"/>
              <head>POINT VIII: BECAUSE OF UNLAWFUL SURVEILLANCE, PETITIONER'S CONVICTION SHOULD BE VACATED; ALTERNATIVELY, DISCOVERY AND A HEARING SHOULD BE ORDERED.</head>
              <ab> POINT VIII. BECAUSE OF UNLAWFUL SURVEILLANCE, PETITIONER'S CONVICTION SHOULD BE VACATED; …
                <pb n="114" facs="00000002.tif"/> on the Hiss mail in 1945, …
                <pb n="115" facs="00000003.tif"/> occurred from December 13, 1945 until the Hisses moved from Washington, D.C. to New York City on September 13, 1947. …
              </ab>
            </div1>
          </body>
        </text>
      </TEI>
  27. Level 3 Encoding: Simple Analysis
      • Purpose: to create text that can stand alone as electronic text.
      • Identifies hierarchy (logical structure) and typography without content analysis being of primary importance.
      • Features to be encoded are determined by the logical structure and appearance of the text.
      • The text can stand alone without page images.
      • Most suitable when:
          • Some sophistication of display, delivery, and searching based on the structure of the text is desired
          • Texts will be checked to ensure that encoding decisions have been made appropriately
          • The material is of interest to a large community of users who wish to read texts that allow keyword searching
  28. Level 3 Encoding: Characteristics
      All elements specified in Levels 1 and 2, plus the following:
      • <front>, <back>: Required if present.
      • <div>: Required if present; type attribute is recommended.
      • <floatingText>: Recommended if present.
      • <p>: Required for paragraph breaks in prose.
      • <lg> and <l>: Required for identifying groups of lines and lines, respectively.
      • <list> and <item>: May be used in this level to indicate ordered and unordered list structures.
      • <table>, <row>, and <cell>: May be used to indicate table structures.
      • <figure>: Required to indicate figures other than page images.
      • <hi>: Required to indicate changes in typeface; rend attribute is optional.
      • <note>: All notes must be encoded. It is also recommended that notes that extend beyond one page be combined into one <note> element. Marginal notes, without reference, should occur at the beginning of the paragraph to which they refer, with the value of the place attribute as "margin".
  29. Level 3 Encoding: General Recommendations
      • Front matter
          • <div type="contents">: Use lists to mark up the table of contents, with the <ptr> tag used to reference the starting page number. The <ptr> tag can reference the <pb> identifier OR an identifier (e.g., @xml:id) placed in the corresponding division of text.
      • Body
          • <note>: Inline. The note is inserted at the point of reference. An n attribute records the value of the note reference if there is one.
      • Back
          • <div type="index">: Use lists to mark up index entries, with the <ref> tag used to reference the corresponding page number. Add the target attribute (@target) to reference the <pb> identifier and so generate links from the index into the text proper.
      • Running heads, catch words, and other such forme work information should NOT be included in Level 3, with the exception of page numbers, which are recorded using <pb>.
  30. Level 3 Encoding: Prose Example
      <TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
        <teiHeader> [stuff] </teiHeader>
        <text>
          <front>
            <div type="frontispiece">[figure]</div>
            <titlePage>[text]</titlePage>
            <div type="dedication">[text]</div>
            <div type="contents">[text]</div>
          </front>
          <body>
            <div type="book">
              <head>[book title]</head>
              <div type="chapter"><pb n="3" xml:id="freear-p03"/>[text]</div>
              <div type="chapter"><pb n="12" xml:id="freear-p12"/>[text]</div>
              <div type="chapter">[text]</div>
            </div>
          </body>
          <back>
            <div type="appendix">[text]</div>
            <div type="index">[text]</div>
          </back>
        </text>
      </TEI>
      Table of Contents:
      <!-- @target references page break identifier -->
      <div type="contents">
        <head>CONTENTS</head>
        <list type="simple">
          <item>I. A Boy and His Dog <hi rend="right">3</hi> <ptr target="#freear-p03"/></item>
          <item>II. Romance <hi rend="right">12</hi> <ptr target="#freear-p12"/></item>
        </list>
      </div>
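Table-of-contents links work only if every `<ptr target="#…">` matches an `xml:id` somewhere in the file. The sketch below checks for dangling pointers. The third list item, with its `#freear-p99` target, is a deliberately broken entry invented for the demonstration, and `unresolved_pointers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree names xml:id

sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text>
  <front><div type="contents"><list type="simple">
    <item>I. A Boy and His Dog <ptr target="#freear-p03"/></item>
    <item>II. Romance <ptr target="#freear-p12"/></item>
    <item>III. Missing <ptr target="#freear-p99"/></item>
  </list></div></front>
  <body><div type="book">
    <div type="chapter"><pb n="3" xml:id="freear-p03"/>text</div>
    <div type="chapter"><pb n="12" xml:id="freear-p12"/>text</div>
  </div></body>
</text></TEI>"""


def unresolved_pointers(tei_xml):
    """Return every <ptr> target that matches no xml:id in the document."""
    root = ET.fromstring(tei_xml)
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    return [
        ptr.get("target")
        for ptr in root.iter(TEI + "ptr")
        if ptr.get("target", "").lstrip("#") not in ids
    ]


dangling = unresolved_pointers(sample)
print(dangling)
```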
  31. Level 3 Encoding: Verse Example
      <TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
        <teiHeader> [stuff] </teiHeader>
        <text>
          <front>
            <titlePage>[text]</titlePage>
            <div type="dedication">[text]</div>
            <div type="contents">[text]</div>
          </front>
          <body>
            <div type="book">
              <head>[book title]</head>
              <div type="part">
                <head>[section title]</head>
                <div type="poem">
                  <head>THE DAYS GONE BY.</head>
                  <lg>
                    <l n="1">O the days gone by! O the days gone by!</l>
                    <l n="2">The apples in the orchard, and the pathway through the rye;</l>
                    <l n="3">The chirrup of the robin, and the whistle of the quail</l>
                    <l n="4">As he piped across the meadows sweet as any nightingale;</l>
                  </lg>
                  <lg>[lines of poetry]</lg>
                  <lg>[lines of poetry]</lg>
                </div>
              </div>
            </div>
          </body>
        </text>
      </TEI>
  32. Level 4 Encoding: Basic Content Analysis
      • Purpose: to create text that can stand alone as electronic text.
      • Identifies hierarchy and typography.
      • Specifies the function of textual and structural elements.
      • Describes the nature of the content and not merely its appearance.
      • Features of the text that may contribute to meaning, such as indentation of verse lines and typographic change, are preserved.
      • Most suitable when:
          • Sophisticated search and retrieval capabilities are desired
          • Texts will be used for textual analysis
          • Users of the texts may have limited storage or display capabilities
  33. Level 4 Encoding: Characteristics
      All elements specified in Levels 1, 2, and 3, plus the following:
      • <titlePage> and child elements: Required if present.
      • <group>: Required to encode a collection of independent texts that are regarded as a single group for processing or other purposes.
      • <emph>, <foreign>, <gloss>, <term>, or <title>: Recommended to identify typographically distinct text.
      • <epigraph>, <quote>, <said>, <mentioned>, or <soCalled>: Recommended to represent speech, thought, quotation, etc.
      • <sic>, <corr>, or <choice>: Recommended to encode errors or typos.
      • <add>, <del>, <gap>, and <unclear>: Recommended to encode material that is omitted, added, marked for deletion, or is illegible, invisible, or inaudible.
      • <opener>, <dateline>, <salute>, <closer>, <signed>, <postscript>: Required to indicate specific parts of letters.
      • <sp>, <speaker>, and <stage>: Required to encode different dramatic structures.
      • <sp> and <speaker>: Required to encode oral history interviews.
      • Et cetera; see the TEI BPG Guidelines.
  34. Example of Level 4 Encoding
      <p> But it is well authenticated by the observation of every one, that
        <del rend="overstrike" hand="JHL"> their manner </del>
        <add rend="sup" hand="JHL"> this way—i.e. the above </add>
        of writing influences the style of compos. of those who practise it considerably, when they grow up to years of manhood; for their productions,
        <del hand="JHL" rend="overstrike"> instead </del>
        far from being terse, argumentative, convincing, are without head or tail &amp; are generally an incongruous mass mixed up in the most disgusting manner, without divisions or heads &amp; in short without a subject (so to speak).
      </p>
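One payoff of tagging `<add>` and `<del>` is that software can reconstruct a particular state of the text. The sketch below derives the revised reading by keeping additions and dropping deletions. It uses a shortened, namespace-free version of the passage for brevity; `final_reading` is a hypothetical helper, not part of any TEI toolkit.

```python
import xml.etree.ElementTree as ET


def final_reading(element):
    """Text as revised: keep <add> content, skip <del> content."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != "del":            # deleted words are left out
            parts.append(final_reading(child))
        parts.append(child.tail or "")    # text after a child is always kept
    return "".join(parts)


# Shortened excerpt of the slide's paragraph.
p = ET.fromstring(
    '<p>it is well authenticated that '
    '<del rend="overstrike" hand="JHL">their manner</del> '
    '<add rend="sup" hand="JHL">this way</add> '
    'of writing influences the style</p>'
)

revised = " ".join(final_reading(p).split())  # collapse leftover whitespace
print(revised)
```

Swapping the test to skip `add` instead of `del` would give the pre-revision reading from the same encoded file.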
  35. Level 5 Encoding: Scholarly Encoding Projects
      • Level 5 texts are those that require subject knowledge and that encode semantic, linguistic, prosodic, or other elements beyond a basic structural level.
  36. Example: Variant Readings in Level 5
      <l>So hath myn
        <app>
          <lem wit="#msB #msC">herte</lem>
          <rdg wit="#msA">hert</rdg>
          <rdg wit="#msD">minde</rdg>
          <rdg wit="#msE">mynde</rdg>
        </app>
      Caught in remembraunce</l>
      • <app>: apparatus (critical apparatus)
      • <lem>: lemma, or base text
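With variants encoded this way, the line as it stands in any one witness can be regenerated. A minimal sketch, assuming one `<app>` per line and no namespace; `readings_by_witness` and `text_for` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

line = ET.fromstring(
    '<l>So hath myn <app>'
    '<lem wit="#msB #msC">herte</lem>'
    '<rdg wit="#msA">hert</rdg>'
    '<rdg wit="#msD">minde</rdg>'
    '<rdg wit="#msE">mynde</rdg>'
    '</app> Caught in remembraunce</l>'
)


def readings_by_witness(app):
    """Map each witness siglum (without '#') to the reading it attests."""
    result = {}
    for reading in app:  # <lem> and each <rdg>
        for wit in reading.get("wit", "").split():
            result[wit.lstrip("#")] = reading.text
    return result


def text_for(verse_line, witness):
    """Rebuild the line as it reads in one witness."""
    app = verse_line.find("app")
    chosen = readings_by_witness(app)[witness]
    return " ".join((verse_line.text + chosen + app.tail).split())


print(text_for(line, "msB"))
print(text_for(line, "msD"))
```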
  37. General Recommendations
      • An encoding project should strive for internal consistency and for use of standards so that the data can be modified or enhanced in the future with ease.
      • When reformatting to digital media using any level of encoding, the electronic text should begin with the transcription of the first word on the first leaf of the original work.
      • Certain features of the text, such as publisher's advertisements or indexes, should be included as links to page images.
      • Any omissions of material found in the original work should be noted in the <editorialDecl> in the TEI header.
      • An encoding project should use only numbered divisions (i.e., <div1>, <div2>, etc.) or unnumbered divisions (i.e., <div>), but not both.
      • Whether numbered or unnumbered divisions are used, the @type attribute of the division element is not recommended at Level 1, is optional at Level 2, is recommended at Level 3, and is required at Levels 4 and 5.
      • Page breaks should be encoded using the <pb> element, which should mark the top of a page (i.e., the text of page seven should immediately follow <pb n="7"/>), and should always be contained within a div for ease of retrieval with indexing software.
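The "numbered or unnumbered, but not both" recommendation is easy to check mechanically. A sketch under stated assumptions: `division_styles` is a hypothetical helper, and the two sample documents are invented fragments rather than complete TEI files.

```python
import re
import xml.etree.ElementTree as ET


def division_styles(tei_xml):
    """Report which division styles ('numbered', 'unnumbered') a document uses."""
    styles = set()
    for el in ET.fromstring(tei_xml).iter():
        local = el.tag.rsplit("}", 1)[-1]        # strip any {namespace} prefix
        if local == "div":
            styles.add("unnumbered")
        elif re.fullmatch(r"div[1-7]", local):   # <div1> through <div7>
            styles.add("numbered")
    return styles


consistent = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div><div/></div></body></text></TEI>'
mixed = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div1><div/></div1></body></text></TEI>'

for name, doc in (("consistent", consistent), ("mixed", mixed)):
    styles = division_styles(doc)
    print(name, "OK" if len(styles) <= 1 else "mixes numbered and unnumbered divisions")
```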
  38. 38. PART 4: Short Practice in Text Encoding
  39. 39. <ul><li>Author: James Wallen </li></ul><ul><li>Title: Cleveland’s Golden Story </li></ul><ul><li>Publishing Place and Publisher : [Cleveland, OH]: Wm. Taylor Son & Co. </li></ul><ul><li>Year : 1920 </li></ul><ul><li>93 pp. </li></ul><ul><li>CONTENTS </li></ul>TEXT ETC. Chapter 1. The Kingdom of Gold 1 Chapter 2. Lincoln-Hearted Men 9 Chapter 3. Taming the Wilderness 19
  40. 40. <ul><li>Chapter 1: The Kingdom of Gold </li></ul><ul><li>Gold is the symbol of adventure—the unresting urge that stirs men’s souls. Francois de Orlenna, who crossed the South American continent from ocean to ocean in 1540, wrote, “Having eaten our boots and saddles, boiled with a few wild herbs, we set out to reach the Kingdom of Gold.” </li></ul><ul><li>His catalog of iritations included: </li></ul><ul><li>1. The weather </li></ul><ul><li>2. The peacocks </li></ul><ul><li>3. His meagre grasp of Hamlet, Prince of Denmark </li></ul>Chapter Heading and Paragraph
  41. 41. <ul><li><?oxygen RNGSchema=&quot;http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng&quot; type=&quot;xml&quot;?> <TEI xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot; xmlns=&quot;http://www.tei-c.org/ns/1.0&quot;> </li></ul><ul><li><teiHeader type=&quot;text&quot;> [stuff goes here] </teiHeader> </li></ul><ul><li><text> </li></ul><ul><li><front> </li></ul><ul><li><list> </li></ul><ul><li><item> Chapter 1. The Kingdom of Gold. 1 </item> </li></ul><ul><li><item> Chapter 2. Lincoln-Hearted Men. 9 </item> [ETC.] </li></ul><ul><li></list> </li></ul><ul><li></front> </li></ul><ul><li><body> </li></ul><ul><li><div type=&quot;section&quot;> </li></ul><ul><li><pb n=&quot;1&quot; facs=&quot;p1.jpg&quot;/> </li></ul><ul><li><head> The Kingdom of Gold </head> </li></ul><ul><li><ab> </li></ul><ul><li>[a whole section is contained within this anonymous block tag, interspersed with <pb> elements pointing to page images] <pb xml:id=&quot;p21198-zz0002mpwb&quot; n=&quot;2&quot;/> </li></ul><ul><li></ab> </li></ul><ul><li></div> </li></ul><ul><li></body> </li></ul><ul><li><back> [optional] </back> </li></ul><ul><li></text> </li></ul><ul><li></TEI> </li></ul>P5 Level 2 Encoding
  42. 42. <ul><li><?oxygen RNGSchema=&quot;http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng&quot; type=&quot;xml&quot;?> <TEI xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot; xmlns=&quot;http://www.tei-c.org/ns/1.0&quot;> </li></ul><ul><li><teiHeader type=&quot;text&quot;>[stuff goes here]</teiHeader> </li></ul><ul><li><text> </li></ul><ul><li><front> </li></ul><ul><li><pb n=&quot;1&quot; xml:id=&quot;walcle01-00&quot; /> </li></ul><ul><li><div type=&quot;contents&quot;> </li></ul><ul><li><list> </li></ul><ul><li><item>Chapter 1. The Kingdom of Gold. <hi rend=&quot;right&quot;> 1 </hi> <ptr target=&quot;#walcle01-p1&quot;/> </item> </li></ul><ul><li><item>Chapter 2. Lincoln-Hearted Men. <hi rend=&quot;right&quot;> 9 </hi><ptr target=&quot;#walcle01-p1&quot;/> </item> [ETC.] </li></ul><ul><li></list> </li></ul><ul><li></div> </li></ul><ul><li></front> </li></ul><ul><li><body> </li></ul><ul><li><div type=&quot;chapter&quot;> </li></ul><ul><li><pb n=&quot;1&quot; xml:id=&quot;walcle01-p1&quot;/> </li></ul><ul><li><head type=&quot;main&quot;> Chapter 1 </head> </li></ul><ul><li><head type=&quot;subtitle&quot;> The Kingdom of Gold </head> </li></ul><ul><li><p> [FIRST PARAGRAPH GOES HERE] </p> </li></ul><ul><li></div> </li></ul><ul><li></body> </li></ul><ul><li><back> [optional] </back> </text></TEI> </li></ul>P5 Level 3 Encoding
  43. 43. <ul><li><p> Gold is the symbol of adventure—the unresting urge that stirs men’s souls. Francois de Orlenna, who crossed the South American continent from ocean to ocean in 1540, wrote, “Having eaten our boots and saddles, boiled with a few wild herbs, we set out to reach the Kingdom of Gold.” </p> </li></ul><ul><li><p> His catalog of iritations included: </li></ul><ul><li><list> </li></ul><ul><li><item> 1. The weather </item> </li></ul><ul><li><item> 2. The peacocks </item> </li></ul><ul><li><item> 3. His meagre grasp of <hi> Hamlet, Prince of Denmark </hi> </item> </li></ul><ul><li></list> </li></ul><ul><li></p> </li></ul>P5 Level 3 Continued
  44. 44. <ul><li><p>Gold is the symbol of adventure—the unresting urge that stirs men’s souls. <name type=&quot;person&quot; key=&quot;FDO1&quot;> Francois de Orlenna </name> , who crossed the South American continent from ocean to ocean in <date when=&quot;1540&quot;> 1540 </date> , wrote, <q> Having eaten our boots and saddles, boiled with a few wild herbs, we set out to reach the Kingdom of Gold. </q> </p> </li></ul><ul><li><p>His catalog of <choice><sic> iritations </sic><corr> irritations </corr></choice> included: </li></ul><ul><li><list> </li></ul><ul><li><item> 1. The weather</item> </li></ul><ul><li><item> 2. The peacocks </item> </li></ul><ul><li><item> 3. His meagre grasp of <hi> <bibl><title ref=&quot;#hamlet1&quot;> Hamlet, Prince of Denmark </title></bibl> </hi> </item> </li></ul><ul><li></list> </li></ul><ul><li></p> </li></ul><ul><li>… </li></ul><ul><li><biblStruct xml:id=&quot;hamlet1&quot;> </li></ul><ul><li><monogr> </li></ul><ul><li><author> Shakespeare, William </author> </li></ul><ul><li><title> Hamlet, Prince of Denmark </title> </li></ul><ul><li><date/> </li></ul><ul><li></monogr> </biblStruct> </li></ul>P5 Level 4 Encoding
  45. 45. PART 5: TEI Header
  46. 46. <ul><li>Provides administrative, descriptive, and preservation metadata </li></ul><ul><ul><li>Administrative : who created the metadata? When was it created? Where is the original item located? Etc. </li></ul></ul><ul><ul><li>Descriptive : title, author, publication info, subject headings, number of pages, etc. </li></ul></ul><ul><ul><li>Preservation : file size, identifier, format, etc. </li></ul></ul>TEI Header
  47. 47. <ul><li>Electronic Version Information </li></ul><ul><ul><li>Information about the ELECTRONIC version of the work(s) </li></ul></ul><ul><li>Electronic Distributor Information </li></ul><ul><ul><li>Information about the publisher of the ELECTRONIC version of the work(s) </li></ul></ul><ul><ul><li>E.g. William Taylor & Co. published the original work, but Kelvin Smith Library is publishing the electronic version </li></ul></ul><ul><li>Original Document Bibliographic Information </li></ul><ul><ul><li>Bibliographic information about the text from which the electronic version was derived. May be generated from a MARC record (but does not have to be). </li></ul></ul><ul><li>Encoding Description </li></ul><ul><ul><li>Includes project description, encoding level declaration, what classification structure is used (e.g. LCSH), etc. </li></ul></ul><ul><li>Profile Description </li></ul><ul><ul><li>Includes text language, subject terms </li></ul></ul><ul><li>Revision Description </li></ul><ul><ul><li>If any revision was made to the TEI document, this is where that information is recorded, including revision details, party(ies) involved, and date(s) </li></ul></ul>Basic Components of TEI Header
  48. 48. <ul><li>Can reflect a text center’s standards and serve as the basis for other types of metadata system records </li></ul><ul><li>Can function in detached form as records in a catalog, as a title page inherent to the document, or as a source for index displays </li></ul><ul><li>May describe a collection of documents, a single item, or a portion of an item </li></ul><ul><li>A TEI header may NOT necessarily have a one-to-one correspondence with a MARC record. One TEI header may have multiple MARC analytic records, or one MARC record may be used to describe a collection of TEI documents with individual headers </li></ul><ul><li>May contain historical background on how the file has been treated and extend the information of a classic catalog record </li></ul><ul><li>There is no ONE header template. Modification is needed depending on the project and text type. </li></ul>TEI Header (continued)
  49. 49. Example: MARC to TEI Header <ul><li>LEADER 00000nam 2200000Ia 4500 </li></ul><ul><li>001 49237829 </li></ul><ul><li>003 OCoLC </li></ul><ul><li>005 20020305071435.0 </li></ul><ul><li>008 020305s1905 ohu r 000 0 eng d </li></ul><ul><li>040 CWR|cCWR </li></ul><ul><li>049 CWRR </li></ul><ul><li>090 BJ1161|b.G6 1905a </li></ul><ul><li>100 1 Given, Charles Stewart </li></ul><ul><li>245 12 A fleece of gold :|bfive lessons from the fable of Jason and the Golden Fleece /|cby Charles Stewart Given </li></ul><ul><li>260 Cincinnati [Ohio] :|bJennings and Graham ;|aNew York [N.Y.] :|bEaton and Mains,|cc1905 </li></ul><ul><li>300 103 p. ;|c18 cm </li></ul><ul><li>533 Photocopy.|bLaCrosse, Wis. :|cBrookhaven Press : digital production by Northern Micrographics, Inc.,|d2001.|e18 cm </li></ul><ul><li>650 0 Success </li></ul><ul><li>650 0 Conduct of life </li></ul><ul><li>650 0 Jason (Greek mythology) </li></ul><ul><li><sourceDesc> <biblStruct> <titleStmt> <title type=&quot;main&quot;>A fleece of gold</title> <title type=&quot;sub&quot;>five lessons from the fable of Jason and the Golden Fleece</title> <!-- subheading [if applicable] --> <author> </li></ul><ul><li><persName>Given, Charles Stewart </li></ul><ul><li></persName> </li></ul><ul><li></author> </titleStmt> <extent>103 p.</extent> <publicationStmt> <!-- groups information concerning publisher, place of publication, and date of the text --> <pubPlace>Cincinnati [Ohio]</pubPlace> <publisher>Jennings and Graham</publisher> <date>1905</date> <idno>BJ1161 .G6 1905a</idno> </publicationStmt> </biblStruct> </sourceDesc> </li></ul><ul><li>… </li></ul><ul><li><profileDesc> <textClass> <keywords scheme=&quot;LCSH&quot;> <!-- if the keywords come from a controlled vocabulary, it can be identified by the scheme attribute --> <term>Success</term> <term>Conduct of life</term> <term>Jason (Greek mythology)</term> </keywords> </textClass> </profileDesc> </li></ul>
  50. 50. Session 2: Text Encoding and the Text Encoding Initiative (TEI) Richard Wisneski Head, Bibliographic/Metadata Services Kelvin Smith Library Case Western Reserve University 2009-2010
  51. 51. PART 6: Some Common Practices in Text Encoding
  52. 52. <ul><li>1. Make sure you are linked to a schema </li></ul><ul><li><?oxygen RNGSchema=&quot;http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng&quot; type=&quot;xml&quot;?> </li></ul><ul><li>2. Include as many of the header elements as possible, including <editorialDecl> </li></ul><ul><li><editorialDecl> <hyphenation eol=&quot;none&quot;> <p>Hyphens in words broken at the end of a line have been removed</p> </hyphenation> </li></ul><ul><li></editorialDecl> </li></ul><ul><li>3. Make use of spell-check: F4 key </li></ul><ul><li>4. Delete end-of-line hyphens that split a word, except in special cases (e.g. poetry, dramatic verse) </li></ul><ul><li>Institu-tion becomes Institution </li></ul>Using oXygen
  53. 53. <ul><li>5. Include <respStmt> in the header for any emendations or corrections you later make to a text that has been previously encoded </li></ul><ul><li><respStmt> <name xml:id=&quot;rlw54&quot;>Richard Wisneski</name> </li></ul><ul><li><!-- one resp per respStmt --> <resp>TEI Header creator</resp> </li></ul><ul><li><!-- OR TEI Header and document creator --> </respStmt> </li></ul><ul><li>6. Page breaks must be inserted, using <pb n=&quot;[page number]&quot; xml:id=&quot;&quot; /> </li></ul><ul><li><pb xml:id=&quot;fleboo-032&quot; n=&quot;33&quot;/> </li></ul><ul><li>OR, if you want the page break to reference the specific page image: </li></ul><ul><li><pb facs=&quot;fleboo-032.jp2&quot; n=&quot;33&quot;/> </li></ul><ul><li>The xml:id attribute MUST be (a) unique and (b) start with a letter </li></ul><ul><li>The facs attribute MAY link to a permanent URL or URI </li></ul>Common Practices (continued)
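The two xml:id rules on the slide above (unique, and starting with a letter) are easy to check with a short script. This is a minimal sketch, not a replacement for schema validation in oXygen: the function name `check_xml_ids` and the sample fragment are illustrative assumptions, and the regular expression is a simplified ASCII approximation of the XML name rule.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree expands the xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# Simplified ASCII approximation: a letter or underscore, then name characters.
VALID_ID = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")


def check_xml_ids(xml_text):
    """Return a list of problems found with xml:id values (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    ids = [el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None]
    problems = []
    for value, count in sorted(Counter(ids).items()):
        if count > 1:
            problems.append(f"duplicate xml:id: {value}")
        if not VALID_ID.match(value):
            problems.append(f"invalid xml:id: {value}")
    return problems


# Invented sample: one repeated identifier and one that starts with a digit.
SAMPLE = """<div>
  <pb xml:id="fleboo-032" n="33"/>
  <pb xml:id="fleboo-032" n="34"/>
  <pb xml:id="033" n="35"/>
</div>"""

id_problems = check_xml_ids(SAMPLE)
print(id_problems)
```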
  54. 54. <ul><li>As part of a page: </li></ul><ul><li><figure> </li></ul><ul><li><figDesc>An illuminated page from the de Brailes Hours, containing a historiated initial with a signed self-portrait of William de Brailes </li></ul><ul><li></figDesc> </li></ul><ul><li><graphic url=&quot;./gfx/debrailes_ms.jpg&quot; height=&quot;600px&quot;/> </li></ul><ul><li></figure> </li></ul>Inserting Images <figDesc> is not required in Level 3, but we are using it to capture either an image caption or to describe the image if a caption is not present
  55. 55. Footnotes
  56. 56. <ul><li><p>FIRST ANNUAL REPORT<note place=&quot;foot&quot; rend=&quot;*&quot;>This was the first report made after the schools were regularly organized under the ordinance. The Bethel School mentioned in the opening paragraph had existed through a part of the previous year, the year 1836, and a Board appointed to look after its interests, had made an informal— probably oral — report.</note> OF THE BOARD OF MANAGERS OF COMMON SCHOOLS.</p> </li></ul>Footnote Encoded (and Marginalia) If this note were in the MARGIN of the page, it would be encoded, for example: <note type=&quot;auth&quot; place=&quot;margin-left&quot;> text, text, text </note> The type= and rend= attributes are optional
  57. 57. <ul><li>In the body of the text: </li></ul><ul><li><p>He liked to eat pie<ptr target=&quot;#note1&quot; rend=&quot;1&quot;/>.</p> </li></ul><ul><li>OR… </li></ul><ul><li><p>See <ref target=&quot;#note1&quot;>Note 30</ref></p> </li></ul><ul><li>At the end of the text: </li></ul><ul><li><div type=&quot;endnotes&quot;> <pb n=&quot;130&quot; /> <p xml:id=&quot;note1&quot;>Pie is a dessert often eaten.</p> </div> </li></ul>Endnotes
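Pointers such as <ptr target="#note1"/> only work if some element carries the matching xml:id. As a rough sketch of how that link could be checked, the Python below lists targets that resolve to nothing. The function name `unresolved_targets` is an assumption, and the second reference (to `#note2`) is invented here to show a broken link; it is not from the slide.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Adapted from the slide; the reference to #note2 is a made-up broken link.
DOC = """<text>
  <body>
    <p>He liked to eat pie<ptr target="#note1" rend="1"/>.</p>
    <p>See <ref target="#note2">Note 2</ref></p>
  </body>
  <back>
    <div type="endnotes">
      <p xml:id="note1">Pie is a dessert often eaten.</p>
    </div>
  </back>
</text>"""


def unresolved_targets(xml_text):
    """List local pointer targets with no matching xml:id (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    known = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    missing = []
    for el in root.iter():
        if el.tag in ("ptr", "ref"):
            for target in el.get("target", "").split():
                # Only local targets ("#id") are checked; URLs are skipped.
                if target.startswith("#") and target[1:] not in known:
                    missing.append(target)
    return missing


broken_links = unresolved_targets(DOC)
print(broken_links)
```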
  58. 58. Another Method
  59. 59. <ul><li><l>(Diff'rent our parties, but with equal grace</l> </li></ul><ul><li><l>The Goddess smiles on Whig and Tory race, </li></ul><ul><li><ptr rend=&quot;unmarked&quot; target=&quot;#note3.284&quot;/> </li></ul><ul><li></l> </li></ul><ul><li><l>'Tis the same rope at sev'ral ends they twist,</l> </li></ul><ul><li><l>To Dulness, Ridpath is as dear as Mist)</l> </li></ul><ul><li><note xml:id=&quot;note3.284&quot; type=&quot;imitation&quot; place=&quot;foot&quot; anchored=&quot;false&quot;> <bibl>Virg. Æn. 10.</bibl> <quote> <l>Tros Rutulusve fuat; nullo discrimine habebo.</l> <l>—— Rex Jupiter omnibus idem.</l> </quote> </li></ul><ul><li></note> </li></ul>Encode with Pointer and Link
  60. 60. PART 7: Encoding References
  61. 61. <ul><li>Definition : Things we know about the content of the text that we want to be able to state explicitly to add value to the text or assist the reader in understanding it better, such as: </li></ul><ul><ul><li>Authority control: information about the identity of things named in the text: people, places, books, etc. </li></ul></ul><ul><ul><li>Additional information about: birthdates, geographical locations, date published, etc. </li></ul></ul><ul><ul><li>Interpretive information: themes, keywords </li></ul></ul><ul><ul><li>Normalization of measurements, dates, etc. </li></ul></ul>Encoding Contextual Information
  62. 62. <ul><li>Names </li></ul><ul><li><persName>Baron Olivier of Brighton</persName> </li></ul><ul><li><placeName>New York</placeName> </li></ul><ul><li><orgName>Podunk Sewing Club</orgName> </li></ul><ul><li>Linguistic: <foreign>, <distinct>, <soCalled>, <mentioned>, <term>, <emph> </li></ul><ul><li><distinct>dinna ken</distinct> why that </li></ul><ul><li><foreign xml:lang=&quot;fr&quot;>soi-disant</foreign> <soCalled>expert</soCalled> </li></ul><ul><li>must be <emph>so</emph> particular about pronouncing </li></ul><ul><li><mentioned xml:lang=&quot;cy&quot;>Llandaff</mentioned> using a </li></ul><ul><li><term>voiceless lateral fricative</term> </li></ul>Common Tags for Contextual Information
  63. 63. <ul><li>’Ographies: </li></ul><ul><ul><li>prosopography (personography) </li></ul></ul><ul><ul><li>gazetteers (placeography) </li></ul></ul><ul><ul><li>orgography, bibliography </li></ul></ul><ul><li>These are like local authority lists that you create </li></ul><ul><li>Keywords applied to the text as a whole </li></ul><ul><li>Thematic or interpretive information applied to specific places in the text </li></ul>Types of Contextual Information
  64. 64. <ul><li>Like a local name authority file </li></ul><ul><li>Can be simple or very detailed </li></ul><ul><li>Can be kept in your encoded file or externally </li></ul><ul><li>Includes specific elements for the most common data </li></ul><ul><li>Also includes general elements for the unforeseen </li></ul>Personography
  65. 65. <ul><li><teiHeader> </li></ul><ul><li><!-- ... --> </li></ul><ul><li><particDesc> </li></ul><ul><li><listPerson> </li></ul><ul><li><person xml:id=&quot;andrew_j_steere&quot;> </li></ul><ul><li><persName>Steere, Andrew J.</persName> </li></ul><ul><li><birth when=&quot;1844&quot;> </li></ul><ul><li><placeName ref=&quot;#l_scituate&quot;>Scituate, RI</placeName> </li></ul><ul><li></birth> </li></ul><ul><li><death notBefore=&quot;1918&quot;/> </li></ul><ul><li></person> </li></ul><ul><li><person xml:id=&quot;george_pope_morris&quot;> </li></ul><ul><li><persName>Morris, George Pope</persName> </li></ul><ul><li><birth when=&quot;1802&quot;> </li></ul><ul><li><placeName>Philadelphia, PA</placeName> </li></ul><ul><li></birth> </li></ul><ul><li><death when=&quot;1864&quot;/> </li></ul><ul><li></person> </li></ul><ul><li></listPerson> </li></ul><ul><li></particDesc> </li></ul><ul><li></teiHeader> </li></ul><ul><li><text> </li></ul><ul><li><p>...However, the plea of Woodman spare that tree and the patriotic pride </li></ul><ul><li>of the owner, <persName ref=&quot;#andrew_j_steere&quot;>Mr. Andrew J. Steere</persName>, </li></ul><ul><li>had guaranteed its safety from the woodsman’s axe. </li></ul><ul><li></p> </li></ul><ul><li></text> </li></ul>Personography Encoding TEI header Participation description listPerson person
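The value of a personography is that every name in the text can be looked up in the local authority list. Here is a minimal sketch of that lookup, assuming a trimmed, namespace-free version of the slide's example; `name_mentions` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed version of the slide's example, without the TEI namespace.
DOC = """<TEI>
  <teiHeader>
    <particDesc>
      <listPerson>
        <person xml:id="andrew_j_steere">
          <persName>Steere, Andrew J.</persName>
          <birth when="1844"/>
          <death notBefore="1918"/>
        </person>
      </listPerson>
    </particDesc>
  </teiHeader>
  <text>
    <p>the owner, <persName ref="#andrew_j_steere">Mr. Andrew J. Steere</persName>,
    had guaranteed its safety.</p>
  </text>
</TEI>"""


def name_mentions(xml_text):
    """Pair each name in the text with its authority record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    people = {p.get(XML_ID): p for p in root.iter("person")}
    mentions = []
    for name in root.find("text").iter("persName"):
        person = people.get(name.get("ref", "").lstrip("#"))
        if person is not None:
            birth = person.find("birth")
            birth_year = birth.get("when") if birth is not None else None
            mentions.append((name.text, person.findtext("persName"), birth_year))
    return mentions


resolved_names = name_mentions(DOC)
print(resolved_names)
```

Each result pairs the name as written in the text with the authorized form and birth year from the header.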
  66. 66. <ul><li>Very similar to personography...but for places </li></ul><ul><li>Can be linked to maps via geographic information data </li></ul>Placeography (Gazetteer)
  67. 67. <ul><li><body> </li></ul><ul><li><p>The tree stood about a mile east of <placeName ref=&quot;#l_chepachet&quot;>Chepachet</placeName> and a mile north of <placeName ref=&quot;#l_spring_grove&quot;>Spring </li></ul><ul><li>Grove</placeName> ... </p> </li></ul><ul><li><!-- ... --> </li></ul><ul><li></body> </li></ul><ul><li><back> </li></ul><ul><li><div type=&quot;editorial&quot;> </li></ul><ul><li><listPlace> </li></ul><ul><li><place type=&quot;state&quot; xml:id=&quot;l_rhode_island&quot;> </li></ul><ul><li><placeName>The State of Rhode Island and Providence Plantations</placeName> </li></ul><ul><li><country>United States of America</country> </li></ul><ul><li><region>New England</region> </li></ul><ul><li></place> </li></ul><ul><li><place type=&quot;settlement&quot; xml:id=&quot;l_chepachet&quot;> </li></ul><ul><li><placeName>Chepachet</placeName> </li></ul><ul><li><region ref=&quot;#l_rhode_island&quot;/> </li></ul><ul><li><location> </li></ul><ul><li><geo>41.915131 -71.671397</geo> </li></ul><ul><li></location> </li></ul><ul><li></place> </li></ul><ul><li><place type=&quot;settlement&quot; xml:id=&quot;l_spring_grove&quot;> </li></ul><ul><li><placeName>Spring Grove</placeName> </li></ul><ul><li><region ref=&quot;#l_rhode_island&quot;/> </li></ul><ul><li><location> </li></ul><ul><li><geo>41.905583 -71.656219</geo> </li></ul><ul><li></location> </li></ul><ul><li></place> </li></ul><ul><li></listPlace> </li></ul><ul><li></div> </li></ul><ul><li></back> </li></ul>Placeography Encoding back div place
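Because each <geo> element holds a latitude and longitude, a gazetteer like the one above can feed a map. The sketch below pulls the coordinates out as numbers. It assumes <geo> contains two whitespace-separated decimal values, as on the slide, and uses a trimmed, namespace-free sample; `place_coordinates` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed version of the slide's gazetteer, without the TEI namespace.
GAZETTEER = """<listPlace>
  <place type="state" xml:id="l_rhode_island">
    <placeName>The State of Rhode Island and Providence Plantations</placeName>
  </place>
  <place type="settlement" xml:id="l_chepachet">
    <placeName>Chepachet</placeName>
    <location><geo>41.915131 -71.671397</geo></location>
  </place>
  <place type="settlement" xml:id="l_spring_grove">
    <placeName>Spring Grove</placeName>
    <location><geo>41.905583 -71.656219</geo></location>
  </place>
</listPlace>"""


def place_coordinates(xml_text):
    """Map each place's xml:id to (latitude, longitude); skip places without <geo>."""
    root = ET.fromstring(xml_text)
    coordinates = {}
    for place in root.iter("place"):
        geo = place.find("location/geo")
        if geo is not None and geo.text:
            latitude, longitude = (float(value) for value in geo.text.split())
            coordinates[place.get(XML_ID)] = (latitude, longitude)
    return coordinates


coords = place_coordinates(GAZETTEER)
print(coords)
```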
  68. 68. <ul><li>To associate a keyword or interpretive concept with a word, phrase, or passage of text: </li></ul><ul><li><body> </li></ul><ul><li><div type=&quot;section&quot;> </li></ul><ul><li><p>However, the plea of <quote>Woodman spare that tree</quote> and the <seg ana=&quot;#patriotism&quot;>patriotic pride of the owner</seg>, <persName>Mr. Andrew J. Steere</persName>, had <seg ana=&quot;#conservation&quot;>guaranteed its safety from </li></ul><ul><li>the woodsman’s axe</seg>...</p> </li></ul><ul><li></div> </li></ul><ul><li></body> </li></ul><ul><li><back> </li></ul><ul><li><div type=&quot;editorial&quot;> </li></ul><ul><li><interpGrp> </li></ul><ul><li><interp xml:id=&quot;ri_history&quot;>Rhode Island local history</interp> </li></ul><ul><li><interp xml:id=&quot;patriotism&quot;>Patriotism and references to the war effort</interp> </li></ul><ul><li><interp xml:id=&quot;commercial&quot;>References to commercial harvesting and use of </li></ul><ul><li>trees</interp> </li></ul><ul><li><interp xml:id=&quot;conservation&quot;>Conservation efforts and protection of trees</interp> </li></ul><ul><li><interp xml:id=&quot;arboriculture&quot;>References to tree species and their cultivation</interp> </li></ul><ul><li></interpGrp> </li></ul><ul><li></div> </li></ul><ul><li></back> </li></ul>Interpretative Keywords and Themes
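Once passages carry an ana attribute, they can be gathered by theme. The following is a minimal sketch of that grouping, using a shortened, namespace-free version of the slide's example; the function name `passages_by_theme` is an assumption.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened version of the slide's example, without the TEI namespace.
DOC = """<text>
  <body>
    <p>the <seg ana="#patriotism">patriotic pride of the owner</seg> had
    <seg ana="#conservation">guaranteed its safety from
    the woodsman's axe</seg>...</p>
  </body>
  <back>
    <interpGrp>
      <interp xml:id="patriotism">Patriotism and references to the war effort</interp>
      <interp xml:id="conservation">Conservation efforts and protection of trees</interp>
    </interpGrp>
  </back>
</text>"""


def passages_by_theme(xml_text):
    """Group <seg> passages under the label of each theme they point to."""
    root = ET.fromstring(xml_text)
    labels = {i.get(XML_ID): i.text for i in root.iter("interp")}
    themes = {}
    for seg in root.iter("seg"):
        # Join nested text and collapse line breaks into single spaces.
        passage = " ".join("".join(seg.itertext()).split())
        for pointer in seg.get("ana", "").split():
            label = labels.get(pointer.lstrip("#"), pointer)
            themes.setdefault(label, []).append(passage)
    return themes


themes = passages_by_theme(DOC)
print(themes)
```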
  69. 69. Conclusion
  70. 70. <ul><li>More and better documentation </li></ul><ul><li>More use (and support for use) by individuals </li></ul><ul><li>More discipline-specific customizations </li></ul>Future Trends in TEI
  71. 71. <ul><li>Historical Event Markup Language (HEML): http://www.heml.org/heml-cocoon/ </li></ul><ul><li>Music Markup Language: http://www.musicmarkup.info/ </li></ul><ul><li>Multi-Element Coding System: http://helmer.hit.uib.no/claus/mecs/mecs.htm </li></ul>Other Encoding Possibilities
  72. 72. <ul><li>WWP Guide to Scholarly Text Encoding: http://www.wwp.brown.edu/encoding/guide/index.html </li></ul><ul><li>TEI web site: http://www.tei-c.org/index.xml </li></ul><ul><li>The TEI listserv (TEI-L) </li></ul><ul><li>TEI Wiki: http://www.tei-c.org/wiki/index.php/Main_Page </li></ul><ul><li>Teach Yourself TEI: http://www.tei-c.org/Support/Learn/tutorials.xml </li></ul><ul><li>Guidelines for Text Encoding and Interchange: http://quod.lib.umich.edu/t/tei/ </li></ul><ul><li>A Gentle Introduction to XML: http://www.tei-c.org/release/doc/tei-p4-doc/html/SG.html </li></ul><ul><li>A Companion to Digital Literary Studies: http://www.digitalhumanities.org/companion/DLS/ </li></ul>References
