Craig Bellamy, Conal Tuohy, VeRSI
30 September 2010
The Text Encoding Initiative
(TEI) and Humanities Research
What is Text Encoding?
“Before they can be studied with the aid of machines,
texts must be encoded in a machine readable form.
Methods for this transcription are called, generally,
‘text encoding schemes’; such schemes must provide
mechanisms for representing the characters of a text
and its logical and physical structure...ancillary
information achieved by analysis or interpretation
(may also be added)...”
Michael Sperberg-McQueen (1990)
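As a rough illustration of the quotation above, the sketch below parses a tiny TEI-style fragment with Python's standard library and pulls out its logical structure (a heading, a paragraph) and one piece of interpretive mark-up (a tagged personal name). The sample text, the function names and the choice of elements (div, head, p, persName) are assumptions made for this example; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document, written for this illustration only.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="chapter" n="1">
        <head>Chapter One</head>
        <p>The concordance was begun by <persName>Roberto Busa</persName> in 1949.</p>
      </div>
    </body>
  </text>
</TEI>
""".strip()

root = ET.fromstring(SAMPLE)


def headings(tree):
    """Return the text of every <head> element (logical structure)."""
    return [h.text for h in tree.iterfind(".//tei:head", TEI_NS)]


def person_names(tree):
    """Return every tagged personal name (interpretive mark-up)."""
    return ["".join(p.itertext()) for p in tree.iterfind(".//tei:persName", TEI_NS)]


def paragraphs(tree):
    """Return each paragraph as plain text, with tags removed and whitespace normalised."""
    return [
        " ".join("".join(p.itertext()).split())
        for p in tree.iterfind(".//tei:p", TEI_NS)
    ]


print(headings(root))
print(person_names(root))
print(paragraphs(root))
```

The same encoded file serves both readers: a person can read the paragraph text, while a program can ask structured questions such as "which names are tagged?" without guessing from the wording.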
History of Text Encoding/Mark-up
• (Machine-readable texts: creating, archiving, and sharing textual data)
• GML (Generalised Mark-up Language), 1960s
• SGML (1986)
• HTML (Hypertext Mark-up Language), Berners-Lee, 1991
• TEI (Text Encoding Initiative), 1987: a tag schema for marking up humanities texts (ordered hierarchy text model)
• XML (1998): re-worked from SGML, in part by Michael Sperberg-McQueen; a metalanguage for defining descriptive mark-up languages
Why TEI? (computers de-code, people read)
• Since 1949, when Roberto Busa began the first literary text encoding project (a concordance of Thomas Aquinas, 1225-1274), there has been tension over standardisation
• TEI was developed from 1987 in conjunction with the Association for Computing in the Humanities (ACH)
• Nearly every major text-based digital humanities project uses TEI
• TEI is the product of a large international collaboration and has been tested on numerous projects (a standard that is extensible and flexible)
• Opens up new methods of analysis!
Roberto Busa
TEI Case Studies
• HESTIA: Herodotus
• Fine Rolls of Henry III
• SETIS (basic library TEI)
• Transcribe Bentham project (crowd-sourcing)
• Daisy Bates
Fine Rolls of Henry III (1216-1272)
• Latin on parchment
• A fine was essentially a promise of money to the king in return for a concession or favour (key to understanding Magna Carta and the development of the parliamentary state)
• Complex associations of people encoded
• http://www.finerollshenry3.org.uk/home.html
• Facilitates scholarly analysis
• http://www.methodsnetwork.ac.uk/resources/casestudy10.html
HESTIA
• Texts of Herodotus's books held by the Perseus Project
• Used TEI-encoded texts of Herodotus’ ‘Histories’ (place and time)
• http://www.open.ac.uk/Arts/hestia/outcomes/index.html
• Herodotus time-map
• http://www.open.ac.uk/Arts/hestia/herodotus/basic.html
SETIS
• 350 Australian texts (Text encoded...Federation, ‘Classics of Taxation’, Australia Studies collection)
• Rendered as print on demand and .pdf (a library environment, so delivery is important)
• http://setis.library.usyd.edu.au/oztexts/index.html
Transcribe Bentham
• Founder of UCL (philosopher, social reformer)
• 150 manuscripts to transcribe
• Transcribing all the details (including deletions, marginalia etc.)
• http://www.transcribe-bentham.da.ulcc.ac.uk/td/Transcribe_Bentham
• (transcription desk)
• http://www.transcribe-bentham.da.ulcc.ac.uk/td/Help:Transcription_Input_Form
• (guidelines)
Jeremy Bentham (1748-1832)
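To show why recording deletions and marginalia matters, here is a minimal sketch that reads one encoded sentence two ways: as first drafted and as finally revised. It assumes the standard TEI elements del, add and note; the slides do not say which tags the Transcribe Bentham project itself uses, and the sample sentence is invented for this example rather than taken from a Bentham manuscript.

```python
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's "{uri}tag" form.
TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical transcription: one deletion, one addition, one marginal note.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">'
    'The <del>greatest</del><add place="above">general</add> happiness '
    'is the end<note place="margin">check wording</note> of law.'
    '</p>'
)


def reading_text(elem, skip):
    """Collect the text under elem, leaving out any element whose tag is in skip."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag not in skip:
            parts.append(reading_text(child, skip))
        # Tail text follows the child but belongs to the parent, so always keep it.
        parts.append(child.tail or "")
    return "".join(parts)


para = ET.fromstring(SAMPLE)

# Final reading: drop what the writer struck out, keep what was added.
final = reading_text(para, skip={TEI + "del", TEI + "note"})
# First draft: keep the struck-out words, drop the later additions.
draft = reading_text(para, skip={TEI + "add", TEI + "note"})
# Marginal notes, listed separately from the running text.
marginalia = [n.text for n in para.iter(TEI + "note")]

print(final)
print(draft)
print(marginalia)
```

Because the revisions are tagged rather than silently resolved, one transcription can support both a clean reading text and a study of how the writer changed the passage.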
Daisy Bates (1859-1951)
• Left Ireland in 1882 (aged 23) after a ‘sordid sexual affair’
• Devoted more than 35 years of her life to studying Aboriginal life, history, culture, rites, beliefs and customs
• Wrote millions of words while living in tents on the edge of the Nullarbor Plain and in WA (in strict Edwardian attire!)
• The NLA holds Bates's ‘word lists’, which transcribe the common words of numerous Aboriginal tribes (she did not name the tribes): English words with numerous Aboriginal translations
Further Research/Training
• Digital Humanities projects, methods, tools
• http://www.arts-humanities.net/
• TEI by example
• http://tbe.kantl.be/TBE/
• TEI Summer School (Oxford)
• Digital Humanities Summer Institute (Canada)
• http://www.dhsi.org/
The TEI Community
• TEI Consortium (join & discussion list)
• http://www.tei-c.org/index.xml
• TEI Journal
• http://journal.tei-c.org/journal/index
• TEI Members Meeting (evolving research community)
• http://ling.unizd.hr/~tei2010/
• Digital Humanities Answers
• http://digitalhumanities.org/answers/
End
• Evaluation forms
