Poster displayed at the 2014 Text Encoding Initiative Conference and Members Meeting (October 22-24), hosted by Northwestern University (Evanston, IL). The poster discusses the work Digital Humanities Quarterly has done to create a centralized bibliography of material cited by the journal's contributors. Poster by Jim McGrath (on Twitter @JimMc_Grath). The poster abstract can be found here:
http://tei.northwestern.edu/files/2014/04/Mcgrath_TEI_Poster_Abstract-pqtd57.pdf
Digital Humanities Quarterly: A Case Study In Bibliographic Development
Jim McGrath (Northeastern University)
mcgrath.ja@husky.neu.edu
Introduction
Digital Humanities Quarterly (DHQ) is building a centralized bibliography of
material cited by the journal’s contributors. This poster describes the
project’s aims, challenges, and practices, in the hope of highlighting key
issues in creating, preserving, and assessing bibliographic data.
DHQ is particularly interested in:
- Using encoding to examine the bibliographic and citational practices of DHQ’s contributors.
- Creating an indexed database of works cited that makes it easier for DHQ contributors to add citational data to articles.
- Providing a potential case study for other scholars interested in bibliographic markup and/or broader questions about markup languages and practices.
Current project collaborators: Julia Flanders, Wendell Piez, Jim McGrath.
Special thanks to Julia and Wendell for their help in encouraging the
creation of this poster and for offering feedback on its content!
Workflow
1. Create a schema that defines categories of bibliographic records (and the
elements each category may contain). The team has taken a “bottom-up”
and iterative approach to schema development. Wendell Piez created a
schema after inferring patterns in the bibliographic data DHQ had previously
captured; refinements to the schema were made after identifying edge
cases while examining this data more closely.
2. Using a transformation that extracts bibliographic records from each DHQ
article, apply the schema to these records, validate them, and organize the data.
3. Review this data, checking for duplicates, incomplete records, and errors.
4. Send data to DHQ authors to flag potential gaps and oversights.
5. Create and implement controlled vocabularies for publisher names, etc.
6. Export and publish data for the creation of visualizations, analysis, etc.
7. Create an interface that allows future DHQ contributors to reuse bibliographic records.
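The review in step 3 could be partly automated. The following Python sketch is illustrative only and is not DHQ's actual tooling: the <Bibliography> wrapper element, the sample records other than the first, and the matching rule (same author family name plus same normalized title) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: the <Bibliography> wrapper and the second and
# third records are invented for illustration.
SAMPLE = """
<Bibliography>
  <ConferencePaper ID="piez2014" issuance="monographic">
    <author><givenName>Wendell</givenName><familyName>Piez</familyName></author>
    <title>Towards Strategic Reading: Graphical Maps and Renditions of TEI Data</title>
  </ConferencePaper>
  <ConferencePaper ID="piez2014a" issuance="monographic">
    <author><givenName>Wendell</givenName><familyName>Piez</familyName></author>
    <title>Towards strategic reading:  graphical maps and renditions of TEI data</title>
  </ConferencePaper>
  <JournalArticle ID="untitled1">
    <author><givenName>A.</givenName><familyName>Author</familyName></author>
  </JournalArticle>
</Bibliography>
"""


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join((text or "").lower().split())


def review(root):
    """Return (groups of IDs that look like duplicates, IDs lacking a title)."""
    groups = defaultdict(list)
    incomplete = []
    for record in root:
        title = normalize(record.findtext("title"))
        if not title:
            # A record without a title is treated as incomplete (assumed rule).
            incomplete.append(record.get("ID"))
            continue
        family = normalize(record.findtext("author/familyName"))
        groups[(family, title)].append(record.get("ID"))
    duplicates = [ids for ids in groups.values() if len(ids) > 1]
    return duplicates, incomplete


duplicates, incomplete = review(ET.fromstring(SAMPLE))
print("Possible duplicates:", duplicates)
print("Incomplete records:", incomplete)
```

A check like this only flags candidates; a human reviewer would still need to confirm whether two similar records really describe the same work.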
Example
<ConferencePaper ID="piez2014" issuance="monographic">
  <author>
    <givenName>Wendell</givenName>
    <familyName>Piez</familyName>
  </author>
  <title>Towards Strategic Reading: Graphical Maps and Renditions of TEI Data</title>
  <conference>
    <name>Text Encoding Initiative Conference and Members Meeting</name>
    <date>2014</date>
    <sponsor>Northwestern University</sponsor>
  </conference>
</ConferencePaper>
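Because each record uses specific, predictable element names, pulling fields out of it is straightforward. As a minimal sketch (using Python's standard xml.etree.ElementTree module, which is our choice for illustration and not something the project is known to use), the example record can be read like this:

```python
import xml.etree.ElementTree as ET

# The record from the poster's example, reproduced as a string.
RECORD = """<ConferencePaper ID="piez2014" issuance="monographic">
  <author>
    <givenName>Wendell</givenName>
    <familyName>Piez</familyName>
  </author>
  <title>Towards Strategic Reading: Graphical Maps and Renditions of TEI Data</title>
  <conference>
    <name>Text Encoding Initiative Conference and Members Meeting</name>
    <date>2014</date>
    <sponsor>Northwestern University</sponsor>
  </conference>
</ConferencePaper>"""

record = ET.fromstring(RECORD)

# The element name itself carries the genre of the cited work.
summary = {
    "genre": record.tag,
    "id": record.get("ID"),
    "author": "{}, {}".format(
        record.findtext("author/familyName"),
        record.findtext("author/givenName"),
    ),
    "title": record.findtext("title"),
    "conference": record.findtext("conference/name"),
    "year": record.findtext("conference/date"),
}

for field, value in summary.items():
    print(f"{field}: {value}")
```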
Why Not Use the TEI?
The TEI has a means to encode bibliographic data (<bibl>, <biblStruct> and <biblFull>): why doesn’t DHQ use it for this project?
The TEI benefits from general elements that can describe a wide range of things; DHQ benefits from more specific elements that say
more precise things about our data.
So why not create new elements in the TEI that let you document bibliographic data in more precise ways?
On this particular project, DHQ is less interested in creating a schema that attends to the widest range of possible bibliographic
considerations imagined by TEI practitioners: our schema is tailored to the citational records created by our authors. Similarly,
we want to call specific attention to the genres of publication favored by DHQ’s present and future contributors.
Many of the elements the TEI makes available for customization lack semantics specific to bibliographic data. Our own schema, by
contrast, lets us validate and track the elements in our bibliographic records that we deem essential: ensuring that all “WebSite”
records include a URL, for instance, or that “JournalArticle” records refer back to particular journals.
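To make the validation idea concrete, here is a hedged Python sketch of a required-element check. It is not DHQ's schema (which the poster does not reproduce): the child element names "url" and "journal", the <Bibliography> wrapper, and all sample records are hypothetical and chosen only to mirror the two rules mentioned above.

```python
import xml.etree.ElementTree as ET

# Assumed rules: the child element names "url" and "journal" are hypothetical
# stand-ins for whatever the real schema requires.
REQUIRED = {
    "WebSite": ["url"],
    "JournalArticle": ["journal"],
}

# Invented sample records for illustration.
SAMPLE = """<Bibliography>
  <WebSite ID="site1"><title>Example Site</title><url>http://example.org/</url></WebSite>
  <WebSite ID="site2"><title>Site Without Address</title></WebSite>
  <JournalArticle ID="art1"><title>Example Article</title></JournalArticle>
</Bibliography>"""


def missing_required(root):
    """List (record ID, genre, missing child) for every unmet requirement."""
    problems = []
    for record in root:
        for child_name in REQUIRED.get(record.tag, []):
            child = record.find(child_name)
            # Treat an absent element and an element with no text content
            # (including in its descendants) as equally missing.
            if child is None or not "".join(child.itertext()).strip():
                problems.append((record.get("ID"), record.tag, child_name))
    return problems


problems = missing_required(ET.fromstring(SAMPLE))
for record_id, genre, child_name in problems:
    print(f"{record_id}: <{genre}> is missing <{child_name}>")
```

In practice a schema language would express such constraints declaratively; the loop above only shows why genre-specific element names make these checks easy to state.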
While it will require additions and revisions for as long as DHQ continues to publish, our database of bibliographic records will
provide interested researchers with detailed and indexed citational data. This record is arguably more complete and consistent than the
citations found in individual articles, which show gaps in bibliographic records, differences in citational practices
(depending on field of study), and human errors.
These records may be of particular interest to researchers curious about the investments our authors have in specific genres,
authors, publishing houses, and dates of publication. For example, the perceived need for dedicated “ConferencePaper” and
“BlogEntry” elements suggests the investments these authors have in these modes of academic discourse. More broadly, the creation,
examination, and redistribution of this bibliographic data raise several questions about the value of bibliographic records to the larger field
of the digital humanities. What intellectual labor is intentionally or unwittingly erased in bibliographic records? How do debates and tensions
about the value of various forms of intellectual labor reveal themselves in citational records and the guidelines that shape and validate
them?
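As a small illustration of the genre-level analysis described in this section, the following Python sketch tallies records by element name. The <Bibliography> wrapper and the sample records are invented for the example and do not reflect actual DHQ citation counts.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample: only the element names come from the poster.
SAMPLE = """<Bibliography>
  <ConferencePaper ID="a"/>
  <ConferencePaper ID="b"/>
  <ConferencePaper ID="c"/>
  <BlogEntry ID="d"/>
  <BlogEntry ID="e"/>
  <JournalArticle ID="f"/>
</Bibliography>"""


def genre_counts(root):
    """Count records per genre, using each record's element name as its genre."""
    return Counter(record.tag for record in root)


counts = genre_counts(ET.fromstring(SAMPLE))
for genre, total in counts.most_common():
    print(f"{genre}: {total}")
```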
Northeastern University
Digital Scholarship Commons