Introduction to Text Encoding and the Text Encoding Initiative (TEI) Richard Wisneski Head, Bibliographic/Metadata Services Kelvin Smith Library Case Western Reserve University 2009-2010
First, Some Ground Rules
Sources to Consult
PART 1: Overview of Text Encoding
What Is Text Encoding?
Quick Example
<lg>
  <head>After <del>an</del><add>the <del>unsolv’d</del></add> argument</head>
  <l><del>The</del><add><del>Coming in,</del> A group of</add> little children, and their <lb/>ways and chatter, flow in <del>upon me</del></l>
  <l>Like <add>welcome</add> rippling water o'er my <lb/>heated <add>nerves and</add> flesh.</l>
</lg>
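One practical payoff of marking deletions and additions is that a program can derive the final reading of a passage. The following is a minimal sketch, not part of the original slides: it assumes Python's standard `xml.etree.ElementTree`, copies the heading and first line of the example into a hypothetical `SAMPLE` string, and uses two illustrative helper names (`final_reading`, `normalized`).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: heading and first line of the example above,
# with every <lb/> written as an empty element so the XML is well-formed.
SAMPLE = (
    "<lg>"
    "<head>After <del>an</del><add>the <del>unsolv'd</del></add> argument</head>"
    "<l><del>The</del><add><del>Coming in,</del> A group of</add> little children, "
    "and their <lb/>ways and chatter, flow in <del>upon me</del></l>"
    "</lg>"
)

def final_reading(elem):
    """Return the text of elem with every <del> dropped and every <add> kept."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "del":
            parts.append(final_reading(child))
        # Tail text follows the child's closing tag, so it belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)

def normalized(text):
    """Collapse the extra whitespace left behind where deletions were removed."""
    return " ".join(text.split())

root = ET.fromstring(SAMPLE)
head = normalized(final_reading(root.find("head")))
line = normalized(final_reading(root.find("l")))
print(head)
print(line)
```

Keeping `<del>` and dropping `<add>` instead would approximate the earliest state of the text; a real project would also need to decide how to treat deletions nested inside additions.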
What Text Encoding Is NOT
Why Do Text Encoding?
Text Encoding Allows Users To…
Who Does Text Encoding? Where Is It Found?
What Is the Text Encoding Initiative (TEI)?
PART 2: Text Encoding and XML
Text Encoding and XML
XML Documents Must Be:
XML Vocabulary
Diagram labels: Element, Attribute, Value, Content, Nested.
<titleStmt> is the parent element; <title> is the child element of <titleStmt>.
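The vocabulary terms can be seen by parsing a tiny fragment. This is an illustrative sketch only: the `type="main"` attribute and the title text are made up for the example, and the parsing uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet; the attribute and the title text are illustrative only.
SNIPPET = '<titleStmt><title type="main">A Sample Title</title></titleStmt>'

parent = ET.fromstring(SNIPPET)  # <titleStmt> is the parent element
child = parent[0]                # <title> is nested inside it: the child element

vocabulary = {
    "element": child.tag,                # the tag name
    "attribute": list(child.attrib)[0],  # the attribute name
    "value": child.attrib["type"],       # the attribute's value
    "content": child.text,               # the text between the tags
}
print(parent.tag, vocabulary)
```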
Quick Example
Validity
Schema Examples
PART 3: Levels of TEI Encoding
Five Levels
Level 1 Encoding: Fully Automated Conversion and Encoding
Level 1 Encoding: Characteristics
<div1> or <div>: There should be only one child of <body>: a single <div> (or <div1>).
<ab>: There should be only one child of the <div> (or <div1>): a single <ab> wrapping all OCR text. If the text is ever "upgraded" to Level 3 or higher, the <ab> element will be replaced by structural elements such as <p> and <table>.
<pb>: Required in Level 1. Page images can be linked to the text by specifying a JPEG or other image file as the value of the facs= attribute. Page numbers can be supplied with the n= attribute to record the number printed on the page.
The Task Force sees the use of METS here as having a tremendous advantage. METS/TEI page-turning documentation will be included in the near future.
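The structural rules above (one <div> under <body>, one <ab> under that <div>, and at least one <pb>) can be checked mechanically. The sketch below is an assumption-laden illustration, not a validator: the sample document, its file names, and the `check_level1` helper are hypothetical, and it covers only these three rules using Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical Level 1 document; image file names and page numbers are made up.
LEVEL1 = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><!-- header omitted in this sketch --></teiHeader>
  <text>
    <body>
      <div>
        <ab>
          <pb n="1" facs="page0001.jpg"/>Raw OCR text of page one.
          <pb n="2" facs="page0002.jpg"/>Raw OCR text of page two.
        </ab>
      </div>
    </body>
  </text>
</TEI>"""

def check_level1(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{TEI_NS}text/{TEI_NS}body")
    if body is None:
        return ["no <body> found"]
    body_children = list(body)
    allowed = (TEI_NS + "div", TEI_NS + "div1")
    if len(body_children) != 1 or body_children[0].tag not in allowed:
        return ["<body> must have exactly one child, a <div> or <div1>"]
    div_children = list(body_children[0])
    if len(div_children) != 1 or div_children[0].tag != TEI_NS + "ab":
        return ["the <div> must have exactly one child, an <ab>"]
    problems = []
    if div_children[0].find(TEI_NS + "pb") is None:
        problems.append("no <pb> page break found")
    return problems

print(check_level1(LEVEL1))
```

A check like this complements schema validation rather than replacing it; it only confirms the Level 1 shape described on the slide.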
Level 2 Encoding: Minimal Encoding
Level 2 Encoding: Characteristics
All elements specified in Level 1 plus the following:
<front>, <back>: Optional.
<div1> or <div>: If no type= attribute is specified, a type= value of "section" should be presumed.
<head>: Required if present.
<ab>: At least one container element is required.
<fw>: Running heads; can be automatically generated.
P5 Level 2 Encoding Template
P5 Level 2 Encoding Example
Level 3 Encoding: Simple Analysis
Level 3 Encoding: Characteristics
All elements specified in Levels 1 and 2 plus the following:
<front>, <back>: Required if present.
<div>: Required if present; type attribute is recommended.
<floatingText>: Recommended if present.
<p>: Required for paragraph breaks in prose.
<lg> and <l>: Required for identifying groups of lines and lines, respectively.
<list> and <item>: May be used in this level to indicate ordered and unordered list structures.
<table>, <row>, and <cell>: May be used to indicate table structures.
<figure>: Required to indicate figures other than page images.
<hi>: Required to indicate changes in typeface; rend attribute is optional.
<note>: All notes must be encoded. It is also recommended that notes that extend beyond one page be combined into one <note> element. Marginal notes without a reference should occur at the beginning of the paragraph to which they refer, with "margin" as the value of the place attribute.
Level 3 Encoding: General Recommendations
Level 3 Encoding: Prose Example
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
  <teiHeader> [stuff] </teiHeader>
  <text>
    <front>
      <div type="frontispiece">[figure]</div>
      <titlePage>[text]</titlePage>
      <div type="dedication">[text]</div>
      <div type="contents">[text]</div>
    </front>
    <body>
      <div type="book">
        <head>[book title]</head>
        <div type="chapter">
          <pb n="5" xml:id="freear-p03"/>[text]
        </div>
        <div type="chapter">
          <pb n="12" xml:id="freear-p12"/>[text]
        </div>
        <div type="chapter">[text]</div>
      </div>
    </body>
    <back>
      <div type="appendix">[text]</div>
      <div type="index">[text]</div>
    </back>
  </text>
</TEI>
Table of Contents:
<!-- @target references page break identifier -->
<div type="contents">
  <head>CONTENTS</head>
  <list type="simple">
    <item>I. A Boy and His Dog <hi rend="right">3</hi> <ptr target="#freear-p03"/></item>
    <item>II. Romance <hi rend="right">12</hi> <ptr target="#freear-p12"/></item>
  </list>
</div>
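Because each <ptr> in the table of contents points at the xml:id of a page break, broken links can be found by comparing the two sets. The following is a rough sketch under stated assumptions: the fragment is a namespace-free, trimmed-down stand-in for the example, the third entry ("III. Missing chapter") is invented to show a failure, and `unresolved_targets` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment modeled on the example; the third item is made up.
FRAGMENT = """<text>
  <front>
    <div type="contents">
      <list type="simple">
        <item>I. A Boy and His Dog <ptr target="#freear-p03"/></item>
        <item>II. Romance <ptr target="#freear-p12"/></item>
        <item>III. Missing chapter <ptr target="#freear-p99"/></item>
      </list>
    </div>
  </front>
  <body>
    <div type="chapter"><pb n="5" xml:id="freear-p03"/>[text]</div>
    <div type="chapter"><pb n="12" xml:id="freear-p12"/>[text]</div>
  </body>
</text>"""

def unresolved_targets(xml_text):
    """Return every <ptr> target that matches no xml:id in the document."""
    root = ET.fromstring(xml_text)
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    missing = []
    for ptr in root.iter("ptr"):
        target = ptr.get("target", "")
        # A local pointer is written as "#" followed by the identifier.
        identifier = target[1:] if target.startswith("#") else target
        if identifier not in ids:
            missing.append(target)
    return missing

print(unresolved_targets(FRAGMENT))
```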
Level 3 Encoding: Verse Example
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
  <teiHeader> [stuff] </teiHeader>
  <text>
    <front>
      <titlePage>[text]</titlePage>
      <div type="dedication">[text]</div>
      <div type="contents">[text]</div>
    </front>
    <body>
      <div type="book">
        <head>[book title]</head>
        <div type="part">
          <head>[section title]</head>
          <div type="poem">
            <head>THE DAYS GONE BY.</head>
            <lg>
              <l n="1">O the days gone by! O the days gone by!</l>
              <l n="2">The apples in the orchard, and the pathway through the rye;</l>
              <l n="3">The chirrup of the robin, and the whistle of the quail</l>
              <l n="4">As he piped across the meadows sweet as any nightingale;</l>
            </lg>
            <lg>[lines of poetry]</lg>
            <lg>[lines of poetry]</lg>
          </div>
        </div>
      </div>
    </body>
  </text>
</TEI>
Level 4 Encoding: Basic Content Analysis
Level 4 Encoding: Characteristics
All elements specified in Levels 1, 2 and 3 plus the following:
<titlePage> and child elements: Required if present.
<group>: Required to encode a collection of independent texts that are regarded as a single group for processing or other purposes.
<emph>, <foreign>, <gloss>, <term>, or <title>: Recommended to identify typographically distinct text.
<epigraph>, <quote>, <said>, <mentioned>, or <soCalled>: Recommended to represent speech, thought, quotation, etc.
<sic>, <corr>, or <choice>: Recommended to encode errors or typos.
<add>, <del>, <gap>, and <unclear>: Recommended to encode material that is omitted, added, marked for deletion, or is illegible, invisible, or inaudible.
<opener>, <dateline>, <salute>, <closer>, <signed>, <postscript>: Required to indicate specific parts of letters.
<sp>, <speaker>, and <stage>: Required to encode different dramatic structures.
<sp> and <speaker>: Required to encode oral history interviews.
Et cetera; see the TEI BPG Guidelines.
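Pairing <sic> and <corr> inside <choice> lets the same file yield either the text as printed or a corrected text. The sketch below illustrates that idea under assumptions: the sentence and its typo are invented, `reading` is a hypothetical helper, and it handles only <choice> elements whose children are <sic> and <corr>.

```python
import xml.etree.ElementTree as ET

# Hypothetical sentence with one typo encoded as a <choice> of <sic> and <corr>.
PARAGRAPH = "<p>The <choice><sic>quikc</sic><corr>quick</corr></choice> brown fox.</p>"

def reading(elem, keep):
    """Flatten elem to text, keeping only the `keep` child of each <choice>."""
    inside_choice = elem.tag == "choice"
    # Whitespace directly inside <choice> belongs to neither reading.
    parts = [] if inside_choice else [elem.text or ""]
    for child in elem:
        if not inside_choice or child.tag == keep:
            parts.append(reading(child, keep))
        if not inside_choice:
            parts.append(child.tail or "")
    return "".join(parts)

root = ET.fromstring(PARAGRAPH)
as_printed = reading(root, "sic")
corrected = reading(root, "corr")
print(as_printed)
print(corrected)
```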
Example of Level 4 Encoding
Level 5 Encoding: Scholarly Encoding Projects
Example: Variant Readings in Level 5
Diagram labels: apparatus (critical apparatus); lemma (base text).
General Recommendations
PART 4: Short Practice in Text Encoding
TEXT ETC.
Chapter 1. The Kingdom of God. 1
Chapter 2. Lincoln-Hearted Men 9
Chapter 3. Taming the Wilderness 19
Chapter Heading and Paragraph
P5 Level 2 Encoding
P5 Level 3 Encoding
P5 Level 3 Continued
P5 Level 4 Encoding
PART 5: TEI Header
TEI Header
Basic Components of TEI Header
TEI Header (continued)
Example: MARC to TEI Header
Session 2: Text Encoding and the Text Encoding Initiative (TEI) Richard Wisneski Head, Bibliographic/Metadata Services Kelvin Smith Library Case Western Reserve University 2009-2010
PART 6: Some Common Practices in Text Encoding
Using oXygen
Common Practices (continued)
Inserting Images
<figDesc> is not required in Level 3, but we use it either to capture an image caption or to describe the image when no caption is present.
Footnotes
Footnote Encoded (and Marginalia)
If this note were in the margin of the page, it would be encoded, for example:
<note type="auth" place="margin-left">text, text, text</note>
The type= and rend= attributes are optional.
Endnotes
Another Method
Encode with Pointer and Link
PART 7: Encoding References
Encoding Contextual Information
Common Tags for Contextual Information
Types of Contextual Information
Personography
Personography Encoding
Diagram labels: TEI header; participation description; listPerson; person.
Placeography (Gazetteer)
Placeography Encoding
Diagram labels: back; div; place.
Interpretative Keywords and Themes
Conclusion
Future Trends in TEI
Other Encoding Possibilities
References

Wisneski TEI workshop 2009-2010

  • 1. Introduction to Text Encoding and the Text Encoding Initiative (TEI). Richard Wisneski, Head, Bibliographic/Metadata Services, Kelvin Smith Library, Case Western Reserve University, 2009-2010
  • 4. PART 1: Overview of Text Encoding
  • 6. Quick Example
        <lg>
          <head>After <del>an</del><add>the <del>unsolv’d</del></add> argument</head>
          <l><del>The</del><add><del>Coming in,</del> A group of</add> little children, and their <lb/>ways and chatter, flow in <del>upon me</del></l>
          <l>Like <add>welcome</add> rippling water o'er my <lb/>heated <add>nerves and</add> flesh.</l>
        </lg>
  • 12. PART 2: Text Encoding and XML
  • 19. PART 3: Levels of TEI Encoding
  • 22. Level 1 Encoding: Characteristics
      • <div1> or <div>: There should be only one child of <body>: a single <div> (or <div1>).
      • <ab>: There should be only one child of the <div> (or <div1>): a single <ab> wrapping all OCR text. If the text is ever “upgraded” to Level 3 or higher, the <ab> element will be replaced by structural elements like <p> and <table>.
      • <pb>: Required in Level 1. Page images can be linked to the text by specifying a JPEG or other image file as the value of the facs= attribute. Page numbers can be supplied with the n= attribute to record the number that is on the page.
      • The Task Force sees the use of METS here as having a tremendous advantage. METS/TEI page turning documentation will be included in the near future.
  • 24. Level 2 Encoding: Characteristics. All elements specified in Level 1 plus the following:
      • <front>, <back>: Optional.
      • <div1> or <div>: If no type= attribute is specified, a type= value of "section" should be presumed.
      • <head>: Required if present.
      • <ab>: At least one container element is required.
      • <fw>: Running heads; can be automatically generated.
  • 28. Level 3 Encoding: Characteristics. All elements specified in Levels 1 and 2 plus the following:
      • <front>, <back>: Required if present.
      • <div>: Required if present; type attribute is recommended.
      • <floatingText>: Recommended if present.
      • <p>: Required for paragraph breaks in prose.
      • <lg> and <l>: Required for identifying groups of lines and lines, respectively.
      • <list> and <item>: May be used in this level to indicate ordered and unordered list structures.
      • <table>, <row>, and <cell>: May be used to indicate table structures.
      • <figure>: Required to indicate figures other than page images.
      • <hi>: Required to indicate changes in typeface; rend attribute is optional.
      • <note>: All notes must be encoded. It is also recommended that notes that extend beyond one page be combined into one <note> element. Marginal notes, without reference, should occur at the beginning of the paragraph to which they refer, with the value of the place attribute as "margin".
  • 30. Level 3 Encoding: Prose Example
        <TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
          <teiHeader> [stuff] </teiHeader>
          <text>
            <front>
              <div type="frontispiece">[figure]</div1>
              <titlePage>[text]</titlePage>
              <div type="dedication">[text]</div1>
              <div type="contents">[text]</div1>
            </front>
            <body>
              <div type="book">
                <head>[book title]</head>
                <div type="chapter">
                  <pb n="5" xml:id="freear-p03" />[text]
                </div2>
                <div type="chapter">
                  <pb n="12" xml:id="freear-p12" />[text]
                </div2>
                <div type="chapter">[text]</div2>
              </div>
            </body>
            <back>
              <div type="appendix">[text]</div1>
              <div type="index">[text]</div1>
            </back>
          </text>
        </TEI>
        Table of Contents:
        <!--@target references page break identifier-->
        <div type="contents">
          <head>CONTENTS</head>
          <list type="simple">
            <item>I. A Boy and His Dog <hi rend="right">3</hi> <ptr target="#freear-p03"/> </item>
            <item>II. Romance <hi rend="right">12</hi> <ptr target="#freear-p12"/> </item>
          </list>
        </div>
  • 31. Level 3 Encoding: Verse Example
        <TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
          <teiHeader> [stuff] </teiHeader>
          <text>
            <front>
              <titlePage>[text]</titlePage>
              <div type="dedication">[text]</div1>
              <div type="contents">[text]</div1>
            </front>
            <body>
              <div type="book">
                <head>[book title]</head>
                <div type="part">
                  <head>[section title]</head>
                  <div type="poem">
                    <head>THE DAYS GONE BY.</head>
                    <lg>
                      <l n="1">O the days gone by! O the days gone by!</l>
                      <l n="2">The apples in the orchard, and the pathway through the rye;</l>
                      <l n="3">The chirrup of the robin, and the whistle of the quail</l>
                      <l n="4">As he piped across the meadows sweet as any nightingale;</l>
                    </lg>
                    <lg>[lines of poetry]</lg>
                    <lg>[lines of poetry]</lg>
                  </div>
                </div>
              </div>
            </body>
          </text>
        </TEI>
  • 33. Level 4 Encoding: Characteristics. All elements specified in Levels 1, 2 and 3 plus the following:
      • <titlePage> and child elements: Required if present.
      • <group>: Required to encode a collection of independent texts that are regarded as a single group for processing or other purposes.
      • <emph>, <foreign>, <gloss>, <term>, or <title>: Recommended to identify typographically distinct text.
      • <epigraph>, <quote>, <said>, <mentioned>, or <soCalled>: Recommended to represent speech, thought, quotation, etc.
      • <sic>, <corr>, or <choice>: Recommended to encode errors or typos.
      • <add>, <del>, <gap>, and <unclear>: Recommended to encode material that is omitted, added, marked for deletion, or is illegible, invisible, or inaudible.
      • <opener>, <dateline>, <salute>, <closer>, <signed>, <postscript>: Required to indicate specific parts of letters.
      • <sp>, <speaker>, and <stage>: Required to encode different dramatic structures.
      • <sp> and <speaker>: Required to encode oral history interviews.
      • Et cetera; see TEI BPG Guidelines.
  • 38. PART 4: Short Practice in Text Encoding
  • 45. PART 5: TEI Header
  • 50. Session 2: Text Encoding and the Text Encoding Initiative (TEI). Richard Wisneski, Head, Bibliographic/Metadata Services, Kelvin Smith Library, Case Western Reserve University, 2009-2010
  • 51. PART 6: Some Common Practices in Text Encoding
  • 60. PART 7: Encoding References

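The table of contents on slide 30 points at page breaks through `<ptr target="#…"/>` values that are meant to match an `xml:id` on a `<pb>`. A minimal Python sketch of how that link could be checked is given here. It assumes a cut-down variant of the slide 30 example, rewritten with matching `</div>` closers so that it parses; the function name `unresolved_targets` and the `SAMPLE` string are hypothetical and are not part of the workshop materials.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree expands the reserved xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Cut-down variant of the slide 30 example (attribute values copied from
# the slide; closers normalised to </div> so that the sample parses).
SAMPLE = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="VAA2383">
  <text>
    <front>
      <div type="contents">
        <head>CONTENTS</head>
        <list type="simple">
          <item>I. A Boy and His Dog <hi rend="right">3</hi> <ptr target="#freear-p03"/></item>
          <item>II. Romance <hi rend="right">12</hi> <ptr target="#freear-p12"/></item>
        </list>
      </div>
    </front>
    <body>
      <div type="book">
        <div type="chapter"><pb n="5" xml:id="freear-p03"/>[text]</div>
        <div type="chapter"><pb n="12" xml:id="freear-p12"/>[text]</div>
      </div>
    </body>
  </text>
</TEI>
"""


def unresolved_targets(tei_xml):
    """Return the <ptr> targets that match no xml:id in the document."""
    root = ET.fromstring(tei_xml)
    # Collect every xml:id in the document, whatever element carries it.
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    missing = []
    for ptr in root.iter("{%s}ptr" % TEI_NS):
        target = ptr.get("target", "")
        # Only local "#id" references are checked in this sketch.
        if target.startswith("#") and target[1:] not in ids:
            missing.append(target)
    return missing
```

Calling `unresolved_targets(SAMPLE)` returns an empty list, since both targets match a `<pb>` identifier. This only checks local references; it says nothing about whether the document is valid against a TEI schema.
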
Editor's Notes

  1. Akin to learning a new language
  2. Show search features of each. For Whitman, click “manuscripts”, then “clicking here” (under “poetry manuscripts”).
  3. The TEI Consortium was founded in 2000. Members pay an annual fee, which pays for editorial work, outreach, and workshops. KSL-CWRU is a member.
  4. Text encoding was born out of New Criticism, but is more structuralist in nature. Regarding the 1st point, think of text encoding as akin to an edition of a text. Regarding the 2nd point, there is no one right answer, but there do exist wrong answers. Regarding the 3rd point, it is expected that individual projects will remove elements, constrain attribute values, add new elements, and even import schemas from other namespaces.
  5. Regarding the 1st point: text encoding uses XML because it’s non-proprietary, requires no specialized software or hardware, and is meant to be long-lasting. 2nd point: have an agreed-upon metadata and markup language that will work across collections and projects. 3rd point: these texts are not static, but rather meant to be built upon by a community of scholars.
  6. TEI grew out of a need to create international standards for textual markup in 1987. Members pay an annual fee, which pays for editorial work, outreach, and workshops. KSL-CWRU is a member. TEI is intended to serve an international community: a broad range of methods and approaches; participation from member institutions around the world; support for multilingual versions of the TEI Guidelines (Chinese, French, German, Japanese, Spanish, others in the future).
  7. Code specifications include: each element has a start and end tag; no elements overlap; the document has a single root element (e.g. book; see upcoming slide).
  8. NOTES: Element names ARE case-sensitive. Elements are also known as “tags”. Attributes are to elements as adjectives are to nouns. Elements have an open and close tag, except for empty elements, such as <pb />. Elements must be properly nested.
  9. We’ll use the Roma tool for this later on
  10. Not too important to understand all of this. GO TO PRACTICE
  11. Began in 1994. A major shift occurred in 2002 with P4 encoding. LEVEL 1: Texts at Level 1 can be created and encoded by fully automated means, using uncorrected OCR of page images ("dirty OCR"), exporting from existing electronic text files, or actually not including any text at all. Level 1 texts are not intended to be adequate for textual analysis; they are more likely to be suited to the goals of a preservation unit or mass digitization initiative. LEVEL 2: Level 2 encoding requires some human intervention to identify each textual division and heading. Level 2 texts do not require any specialist knowledge or manual intervention below the section level. Neither Level 1 nor Level 2 is meant to have the text stand apart from the page images. LEVEL 3: The first attempt to have the text stand alone from the page images.
  12. <ab> = anonymous block
  13. <ab> = anonymous block. <fw> = forme work.
  14. <front>[titlepage information, table of contents, prefaces, etc.][optional]</front>. <ab> = anonymous block, NOT <p> tags. No <p> tags. The facs attribute is used without a METS record; the xml:id attribute is used WITH a METS document.
  15. <front>[titlepage information, table of contents, prefaces, etc.][optional]</front>. <ab> = anonymous block, NOT <p> tags. No <p> tags. It is not a good idea to use full file paths for the facs= attribute.
  16. This is the level KSL is using
  17. N.B. You can also use numbered divs. The maximum is 7. The example to the left is invalid; the <div1> and <div2> tags are there just to show that the option exists.
  18. The n= attribute for <l> is optional.
  19. This is the level KSL is using
  20. Click the link to see the full example. HAND OUT “SOME COMMON P5 TAGS”
  21. Ask: what do you think would need to be encoded here?
  22. Ask: what do you think would need to be encoded here?
  23. <front>[titlepage information, table of contents, prefaces, etc.][optional]</front>. <ab> = anonymous block, NOT <p> tags. <fw> = forme work. No <p> tags. It is not good practice to use file paths for the facs= attribute.
  24. <pb> comes after the <div>. <fw> removed. xml:id is used with a METS document; facs= is used without a METS document.
  25. <hi rend="italics">: the rend attribute is optional.
  26. <biblStruct> can be in the TEI header or in a separate TEI file, referenced in this TEI document (it makes more sense to do the latter). Take note of <q> (it can be missed in this example). GO TO PRACTICE
  27. In the local context, a TEI Header gives metadata about the TEI document, its source, and its provenance. The TEI Header may be used for metadata exchange, to automatically create indexes (author lists, title lists) for a collection of TEI documents, and to aid in browsing heterogeneous TEI documents. TEI Headers may also be used as a basis for other metadata records (such as MARC or Dublin Core), though generation of other formats may require human intervention because they often are more granular, or have different granularity, than TEI Headers.
  28. In the local context, a TEI Header gives metadata about the TEI document, its source, and its provenance. The TEI Header may be used for metadata exchange, to automatically create indexes (author lists, title lists) for a collection of TEI documents, and to aid in browsing heterogeneous TEI documents. TEI Headers may also be used as a basis for other metadata records (such as MARC or Dublin Core), though generation of other formats may require human intervention because they often are more granular, or have different granularity, than TEI Headers.
  29. Distribute spreadsheet
  30. Show how I got to the MARC display. Be aware that other components may have to go into the header, depending on your project (e.g. working with verse). This also requires appropriate schema elements and attributes. GO TO PRACTICE TO CREATE A TEI HEADER
  31. We are using xml:id. n= is the page number printed on the page.
  32. GO TO REFERENCE PRACTICE
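
The rules in notes 7 and 8 (matching start and end tags, no overlapping elements, a single root, case-sensitive names) can be tried out with any XML parser. The sketch below is one hedged way to do that in Python; the helper name `well_formedness_error` and the sample strings are hypothetical illustrations, with element names borrowed from the slides. It also shows why a `<div>` closed by `</div2>`, as mentioned in note 17, is rejected. Note that this tests well-formedness only, not validity against a TEI schema.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text parses as XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Properly nested, with an empty element written as <pb n="5"/>.
GOOD = '<div type="book"><div type="chapter"><pb n="5"/>[text]</div></div>'

# A <div> closed by </div2>: start and end tags do not match.
MISMATCHED = '<div type="book"><div type="chapter"><pb n="5"/>[text]</div2></div>'

# <del> opens inside <add> but closes outside it: elements overlap.
OVERLAP = '<l><add>nerves <del>and</add> flesh</del></l>'

# Two top-level elements: there is no single root element.
TWO_ROOTS = '<l>one</l><l>two</l>'

# Element names are case-sensitive, so <Div> is not closed by </div>.
CASE = '<Div>text</div>'

if __name__ == "__main__":
    samples = [("good", GOOD), ("mismatched", MISMATCHED),
               ("overlap", OVERLAP), ("two roots", TWO_ROOTS), ("case", CASE)]
    for label, sample in samples:
        print(label, "->", well_formedness_error(sample) or "well-formed")
```

Only the first sample parses; the other four are rejected by the parser before any TEI-specific checking could take place.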