An Annotated Bibliography of Language Documentation
Peter K. Austin
Endangered Languages Academic Programme
Department of Linguistics
SOAS, University of London
2012-06-24
1. Introduction [1]
Language documentation, also known as documentary linguistics, is the sub-field of
linguistics that deals with creating multi-purpose records of languages through audio and
video recording of speakers and signers, and annotation, translation, preservation and
distribution of the resulting materials. It is by its nature multidisciplinary (see Section
6.7) and draws on theoretical concepts and methods from linguistics, ethnography, folklore
studies, psychology, information and library science, archiving and museum studies, digital
humanities, media and recording arts, pedagogy, ethics, and other research areas. Its major
goal is the creation of well-organized long-lasting corpora that can be used for a variety of
purposes, including theoretical research and practical needs such as language and cultural
revitalization (see Section 9). Another prominent feature is attention to the rights and desires
of language speakers and communities and collaboration with them (see Section 7.2) in the
recording, analysis, archiving, dissemination and support of their own languages. The term
“language documentation” historically has been used in Linguistics to refer to the creation of
grammars, dictionaries and text collections for undescribed languages; however, works
defining language documentation as a distinct sub-field of Linguistics emerged around 1995
as a response to the crisis facing the world’s endangered languages, about half of which could
disappear in the 21st century, and the urgent need to record and analyse languages and
speakers’ linguistic knowledge while they continue to be spoken, and to work with
communities on supporting threatened languages before opportunities to do so become
reduced. It was also prompted by developments in information, media, communication, and
archiving technologies which make possible the collection, analysis, preservation and
dissemination of documentary records in ways which were not feasible previously. It was
also facilitated by substantial research funding from three main sources: the
DOBES (Documentation of Endangered Languages) program sponsored by the Volkswagen
Foundation in Germany (2000-2013), the Endangered Languages Documentation Programme
(ELDP) supported by Arcadia Trust in the United Kingdom (2002-2016), and the
Documenting Endangered Languages (DEL) interagency initiative of the United States
National Science Foundation and the National Endowment for the Humanities (2005
onwards). Language documentation concerns itself with principles and methods for the
recording and analysis of primary language and cultural materials, and metadata about them,
in ways that are transparent and accountable, and that can be archived and disseminated for
current and future generations to use. Some researchers have emphasised standardization of
data and analysis and ‘best practices’ while others have argued for a diversity of approaches
which recognize the unique and particular social, cultural and linguistic contexts within
which individual languages are used. Methods and practices for training in language
documentation have also been explored.
[1] This is a pre-publication version of Peter K. Austin. 2012. Language Documentation. In Mark Aronoff (ed.)
Linguistics Bibliography. New York: Oxford Bibliographies Online. It covers works published up to June 2011.
2. Reference works
Since the development of language documentation as a separate sub-field of Linguistics is
relatively new, there are only a few reference works that deal with theoretical and practical
issues. Gippert et al. 2006 covers definitional concepts and the practicalities of data
collection, analysis and archiving. Many of the authors are researchers associated with the
DOBES (Documentation of Endangered Languages) program funded by the Volkswagen
Foundation. Chapters vary in complexity but most will be useful for beginning researchers. A
Spanish translation of the volume is available. Gippert et al. 2006 is critically reviewed by
Evans 2008, who argues that the approach it takes, which excludes grammar writing, is
detrimental to the field. Austin 2010 is a series of lectures from the 3L Summer School 2009
and is aimed at beginning students. Grenoble and Furbee 2010 originated in discussions at a
series of meetings of concerned researchers in 2004-2006, and a conference at Harvard
University in 2005. It addresses praxis and values in documentation, measures of
documentary adequacy, technologies, collaboration models, and training needs. Its audience
is more advanced practitioners. Austin and Sallabank 2011 deals with a wide range of
endangered languages issues and is intended for students; Part II and Part IV of the book have
seven chapters on language documentation. The edited series Language Documentation and
Description, published since 2003 by the Hans Rausing Endangered Languages Project at
SOAS, University of London, contains articles on language documentation theory and
practice, mostly arising from workshops organized by the project.
Austin, Peter K., ed. Language Documentation and Description, Volume 7. London: SOAS,
2010.
Contains chapters on documentation issues and methods, archiving, audio recording,
sign languages, ethics, language policy, typology, linguistic theory, and applying for a
research grant.
Austin, Peter K. and Julia Sallabank, eds. The Cambridge Handbook of Endangered
Languages. Cambridge, UK: Cambridge University Press, 2011.
Contains chapters on defining documentation, the roles of speakers, data types and
structures, archiving, and digital archiving, as well as training and project management.
Evans, Nicholas. “Review of Gippert, Jost, Nikolaus P. Himmelmann and Ulrike Mosel, eds.
Essentials of language documentation (Trends in Linguistics. Studies and Monographs, 178).
Berlin: Mouton de Gruyter, 2006.” Language Documentation and Conservation, 2(2008):
340-350.
A critical review arguing for more attention to grammar writing within language
documentation.
Gippert, Jost, Nikolaus P. Himmelmann and Ulrike Mosel, eds. Essentials of language
documentation (Trends in Linguistics. Studies and Monographs, 178). Berlin: Mouton de
Gruyter, 2006.
An essential reference for principles and practices in documentation, highly influenced
by the models developed in the DOBES program.
Grenoble, Lenore A. and N. Louanna Furbee, eds. Language Documentation: Practice and
values. Amsterdam: John Benjamins Publishing Company, 2010.
A collection of position papers and case studies on practices and values, measures of
adequacy, technology, collaboration, and training.
Language Documentation and Description, London: SOAS, University of London
Annual or semi-annual volumes of peer-reviewed articles on language documentation
(edited by Peter K. Austin, and guest editors), mostly arising from workshops held at
SOAS, University of London.
3. Anthologies and collections
The four-volume collection of reprinted journal articles and book chapters in Austin and
McGill 2011 is intended to cover the essential published articles on endangered languages
and includes material on language documentation. Harrison et al. 2008 is a collection written
by members of research teams within the Volkswagen Foundation-funded DOBES project
covering linguistic, ethical and social outcomes of their documentation research, and is quite
technical in content. Lameen Souag of SOAS, University of London has curated an excellent
collection of up-to-date web links to relevant materials called OREL: Online Resources for
Endangered Languages.
Austin, Peter K. and Stuart McGill, eds. Endangered Languages: Critical concepts in
Linguistics. 4 vols. Oxford: Routledge, 2011.
Volume II contains 14 chapters on defining documentation, data and metadata,
archiving, and documentation methods. The introduction to this volume discusses the
significance of each chapter. The general introduction (Volume I) offers a detailed
discussion of issues and challenges in endangered languages documentation.
Harrison, K. David, David S. Rood and Arienne Dwyer, eds. Lessons from documented
endangered languages. Amsterdam: John Benjamins, 2008.
A collection of research papers by scholars working within the DOBES project model
covering various outcomes of their work from linguistic, social and ethical
perspectives.
OREL: Online Resources for Endangered Languages
[http://www.hrelp.org/languages/resources/orel/]
A set of annotated links to online resource materials; the section on Technology and
Techniques [http://www.hrelp.org/languages/resources/orel/tech.html] contains many
useful links for language documenters.
4. Conference and workshop proceedings
The NSF-funded Electronic Metastructure for Endangered Languages Data (E-MELD) project
held a series of seven workshops between 2001 and 2007. Proceedings of the E-MELD
Workshops 2001-2007 are available online. The Language Documentation and Linguistic
Theory (LDLT) conference has been held biennially since 2007; the conference aims to bring
together researchers working on linguistic theory and language documentation and
description, with a particular focus on innovative work on underdescribed or endangered
languages. The Proceedings of Language Documentation and Linguistic Theory Conference
are published in book form for conference attendees and online. The International
Conference on Language Documentation and Conservation (ICLDC) has been held
biennially since 2009 with a particular focus on communities, linguists, and other academics
working in close collaboration. Audio recordings and PowerPoint slides from the
Proceedings of International Conference on Language Documentation and Conservation are
available online for download. Although not proceedings as such, the handouts and course
slides from Infield Workshops 2008 and Infield Workshops 2010 contain much valuable
material.
Proceedings of International Conference on Language Documentation and Conservation
[http://scholarspace.manoa.hawaii.edu/handle/10125/5961]
Recordings and PowerPoint slides from the biennial ICLDC.
Proceedings of the E-MELD Workshops 2001-2007 [http://emeld.org/documents/index.cfm]
Position papers and presentations at these annual workshops.
Proceedings of Language Documentation and Linguistic Theory Conference
[http://www.hrelp.org/publications/]
The edited proceedings of the biennial LDLT conference.
Infield Workshops 2008, University of California, Santa Barbara
[http://www.linguistics.ucsb.edu/faculty/infield/workshops/index.html]
Training materials on many practical aspects of language documentation.
Infield Workshops 2010 University of Oregon
[http://logos.uoregon.edu/infield2010/workshops/index.php]
Course materials from the Infield training course.
5. Journals
Language Documentation and Conservation is a free online journal published by the University
of Hawaii Press since 2007. Linguistic Discovery is a free online journal published by
Dartmouth College since 2002 that occasionally contains articles related to language
documentation.
Language Documentation and Conservation, Hawaii: University of Hawaii Press
[http://nflrc.hawaii.edu/ldc/]
Free peer-reviewed online journal dealing with language documentation and
revitalization issues.
Linguistic Discovery, Dartmouth: Dartmouth College
[http://linguistic-discovery.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/xmlpage/1/issue]
Free peer-reviewed online journal that has occasional papers dealing with language
documentation and revitalization issues.
6. Theory and practice
Language documentation is a relatively new field, and there have been a number of
publications aimed at defining its scope and establishing theoretical principles (see Section
6.1), research methods (see Section 6.6), and the tools to be used in data collection and
analysis, including computer software (see Section 6.5). Also discussed are the nature and
organization of the data and corpus collected (see Section 6.2). Another continuing concern is
with metadata, data about the documentary data, such as the identity of speakers, recorders,
locations, equipment used, languages/dialects and genres. Metadata is important because it
enables the management, identification, retrieval and understanding of the documentary
material. Two standardized sets of metadata have been developed for documentary
linguistics: (1) the very general Open Language Archives Community (OLAC) set, and (2) the
ISLE Metadata Initiative (IMDI) set, a more complex and expressive set of metadata
terms designed specifically for language engineering research (see Section 6.4). The DOBES
project funded by the Volkswagen Foundation uses the IMDI set. Language archives use IMDI,
OLAC or their own particular metadata categorizations. A further recent concern has been
meta-documentation, that is, documentation of the goals, processes and outcomes of language
documentation projects in order to understand the particular histories, biographies,
relationships, commitments and results achieved by researchers and communities as they go
about the work of language documentation.
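As a rough illustration of the metadata categories mentioned in this section (speakers, recorders, locations, equipment, languages/dialects and genres), the following Python sketch models one recording session as a small record and retrieves sessions by genre. The class name, field names and sample values are hypothetical; they are not drawn from the OLAC or IMDI specifications, which define their own element sets.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMetadata:
    """Hypothetical metadata record for one recording session.

    Field names are illustrative only; they are not OLAC or IMDI elements.
    """
    session_id: str
    speakers: list[str]
    recorder: str
    location: str
    equipment: str
    language: str
    genre: str
    notes: dict[str, str] = field(default_factory=dict)


def find_by_genre(sessions: list[SessionMetadata], genre: str) -> list[str]:
    """Return the ids of all sessions tagged with the given genre."""
    return [s.session_id for s in sessions if s.genre == genre]


# Placeholder values, invented purely for illustration.
sessions = [
    SessionMetadata("S001", ["Speaker A"], "Researcher X", "Village 1",
                    "audio recorder", "Language L", "narrative"),
    SessionMetadata("S002", ["Speaker A", "Speaker B"], "Researcher X",
                    "Village 1", "video camera", "Language L", "conversation"),
    SessionMetadata("S003", ["Speaker C"], "Researcher Y", "Village 2",
                    "audio recorder", "Language L", "narrative"),
]

narrative_ids = find_by_genre(sessions, "narrative")
print(narrative_ids)
```

Even a flat record of this kind supports the management and retrieval functions described above; real archive schemas add controlled vocabularies and persistent identifiers on top.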
6.1 Definition
The scope and goals of language documentation are addressed in introductory works by
Furbee 2010, Himmelmann 1998 and 2002, Lehmann 2001 and Woodbury 2003 and 2011.
Himmelmann and Lehmann make a strong demarcation between language documentation and
language description, a position addressed by Austin and Grenoble 2007. Furbee 2010
discusses documentary linguistics as a practice-based field; Grenoble 2010 is a critical
assessment of the current state of the field and the challenges it faces for the future.
Austin, Peter K. and Lenore Grenoble. “Current trends in language documentation”. In Language
Documentation and Description, vol. 4. Edited by Peter K. Austin, 12-25. London: SOAS, University
of London, 2007.
The authors discuss what it means to make a comprehensive record of a language, determining
the quality of a language documentation, the boundaries between documentation and
description, and interdisciplinarity and cross-discipline collaboration.
Furbee, Louanna. “Language documentation: theory and practice”. In Language Documentation:
Practice and values. Edited by Grenoble, Lenore A. and N. Louanna Furbee, 3-24. Amsterdam: John
Benjamins Publishing Company, 2010.
Discusses language documentation as a practice-based field, developing principles that could
lead to theory construction in the future.
Grenoble, Lenore. “Language documentation and field linguistics: The state of the field”. In
Language Documentation: Practice and values. Edited by Grenoble, Lenore A. and N. Louanna
Furbee, 289-309. Amsterdam: John Benjamins Publishing Company, 2010.
Critical assessment of the current state of the art in documentary linguistics, including
identification of challenges for the future concerning scope and outcomes of documentation,
collaboration and teamwork, and technical expectations of documentary corpora.
Himmelmann, Nikolaus P. “Documentary and descriptive linguistics”. Linguistics 36 (1998): 161-195.
A foundational article defining documentary linguistics and emphasizing its distinction from
language description.
Himmelmann, Nikolaus P. “Documentary and descriptive linguistics”. In Lectures on Endangered
Languages, vol. 5. Edited by Osamu Sakiyama and Fubito Endo, 37-83. Kyoto: Endangered
Languages of the Pacific Rim, 2002.
A fuller version of Himmelmann’s foundational article.
Lehmann, Christian. “Language Documentation. A Program”. In Aspects of Typology and Universals.
Edited by Walter Bisang, 83-97. Berlin: Akademie Verlag, 2001.
Lehmann’s seminal paper on language documentation and description argues that the primary
purpose of language documentation is to represent the language for those who do not have
direct access to it.
Woodbury, Anthony C. “Defining documentary linguistics”. In Language Documentation and
Description, vol. 1. Edited by Peter K. Austin, 25-51. London: SOAS, University of London, 2003.
A seminal article setting out a view of language documentation as a sub-field of linguistics.
Woodbury, Anthony C. “Language documentation”. In The Cambridge Handbook of Endangered
Languages. Edited by Peter K. Austin and Julia Sallabank, 159-186. Cambridge, UK: Cambridge
University Press, 2011.
Critical analysis of defining issues. Argues for a broad and inclusive approach to language
documentation and more attention to corpus theorization and its implications for overall project
design.
6.2 Data
From the beginning, language documentation has been concerned with the nature of the
language data collected and the kinds and structures of the analysis applied to it in corpus
creation, as discussed in Lehmann 2004, Austin 2006, and Good 2011. Nathan 2010 argues
that linguists’ approaches to collection of audio data have paid insufficient attention to
recording methodology and goals. Dobrin et al. 2009 takes a critical view of counting and
quantification in documentary linguistics, arguing that it fails to recognize the particular
social, cultural and linguistic contexts within which individual projects are carried out. The
distinction between data and metadata is challenged by Nathan and Austin 2004, who argue
that all value adding to original recordings is metadata. Bird and Simons 2003 is a seminal
discussion of issues such as transparency, transferability and preservation of documentary
records.
Austin, Peter K. “Data and language documentation”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus Himmelmann and Ulrike Mosel, 87-112.
Berlin: Mouton de Gruyter, 2006.
An introductory overview of the different kinds of data documenters collect, how they
process it, the contexts of data analysis and use, and examples of tools used in data
manipulation and presentation.
Bird, Steven and Gary Simons. “Seven dimensions of portability for language
documentation and description”. Language 79, no. 3 (2003): 557-582.
Seminal article on issues in the transparency, transferability, identification and
preservation of digital language data.
Dobrin, Lise M., Peter K. Austin and David Nathan. “Dying to be counted: the
commodification of endangered languages in documentary linguistics”. In Language
Documentation and Description, vol. 6. Edited by Peter K. Austin, 37-52. London: SOAS,
University of London, 2009.
A critical analysis of ‘archivism’, the inclination to treat quantifiable properties such as
recording hours, data volume, and file parameters, and technical desiderata like
‘archival quality’ and ‘portability’ as primary criteria for assessing the aims and
outcomes of language documentation.
Good, Jeff. “Data and language documentation”. In The Cambridge Handbook of
Endangered Languages. Edited by Peter K. Austin and Julia Sallabank, 212-234. Cambridge,
UK: Cambridge University Press, 2011.
Overview of the main data types and representations of data structures employed in
documentary linguistics.
Lehmann, Christian. “Data in linguistics”. The Linguistic Review 21 (2004): 175-210.
Seminal paper exploring the kinds of data linguists employ and how they attempt to
analyse it.
Nathan, David. “Sound and unsound practices in documentary linguistics: towards an
epistemology for audio”. In Language Documentation and Description, vol. 7. Edited by
Peter K. Austin, 262-284. London: SOAS, University of London, 2010.
A critical discussion of the role of sound recordings in language documentation and the
importance of attention to equipment choice, environmental factors, recording
methodology, and research goals in capturing and representing sound.
Nathan, David and Peter K. Austin. “Reconceiving metadata: language documentation
through thick and thin”. In Language Documentation and Description, vol. 2. Edited by Peter
K. Austin, 179-187. London: SOAS, University of London, 2004.
Argues that metadata should be understood as all knowledge representations added to
audio and video recordings, not just information about participants and contexts.
6.3 Analysis
Language documentation involves collection of primary data via audio and video recording,
and analysis of it by adding information of various types, including summaries, indexes,
transcriptions (Himmelmann 2006), translations (Woodbury 2007, Evans and Sasse 2007),
and annotations (Bickel et al. 2008, Lehmann 2004, Schultze-Berndt 2006).
Bickel, Balthasar, Bernard Comrie and Martin Haspelmath. Leipzig glossing rules, 2008.
[http://www.eva.mpg.de/lingua/resources/glossing-rules.php]
A set of recommendations about the format of interlinear morpheme-by-morpheme
glossing and a list of abbreviated category labels.
Evans, Nick and Hans-Juergen Sasse. “Searching for meaning in the library of Babel: field
semantics and the problems of digital archiving”. In Language Documentation and
Description, vol. 4. Edited by Peter K. Austin, 58-99. London: SOAS, University of London,
2007.
Wide-ranging discussion of the role of translation and exegesis in language
documentation, arguing for a multi-layered hypertextual approach to commentary on
recorded materials.
Himmelmann, Nikolaus P. “The challenges of segmenting spoken language”. In Essentials of
Language Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike
Mosel, 253-274. Berlin: Mouton de Gruyter, 2006.
Discusses transcription of spoken language data and issues raised by determining units
at the word and higher levels (intonation units, clauses, sentences and paragraphs).
Complements Schultze-Berndt 2006.
Lehmann, Christian. “Interlinear morphemic glossing”. In Morphologie. Ein internationales
Handbuch zur Flexion und Wortbildung. 2. Halbband. Edited by Booij, Geert, Christian
Lehmann, Joachim Mugdan & Stavros Skopeteas, 1834-1857. Berlin: Mouton de Gruyter,
2004.
Presentation of a model of morpheme-by-morpheme glossing for use in documentation
research.
Schultze-Berndt, Eva. “Linguistic annotation”. In Essentials of Language Documentation.
Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 213-251. Berlin:
Mouton de Gruyter, 2006.
Detailed discussion of principles and practices for transcription and notation of
translation and the addition of morphosyntactic information to recordings of speech
events.
Woodbury, Anthony C. “On thick translation in language documentation”. In Language
Documentation and Description, vol. 4. Edited by Peter K. Austin, 120-135. London: SOAS,
University of London, 2007.
Problematises translation within language documentation, arguing for different types
(word-for-word, simultaneous, free, literary) and multiple layers of linked translations
serving different functions.
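The morpheme-by-morpheme glossing conventions cited in this section can be illustrated with a small consistency check. The Leipzig glossing rules call for word-by-word alignment and for the same number of hyphens in each object-language word and its gloss; the Python sketch below tests only those two properties. The function name and the English sample line are assumptions made for illustration, and the check ignores other conventions such as clitic boundaries.

```python
def check_alignment(text_line: str, gloss_line: str) -> list[str]:
    """Return a list of alignment problems between a segmented text line
    and its gloss line (an empty list means no problems were found).

    Only two checks are made: equal word counts, and equal hyphen
    counts within each aligned word pair.
    """
    problems = []
    words = text_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        problems.append(
            f"word count differs: {len(words)} words, {len(glosses)} glosses"
        )
        return problems
    for word, gloss in zip(words, glosses):
        if word.count("-") != gloss.count("-"):
            problems.append(f"hyphen count differs: {word!r} vs {gloss!r}")
    return problems


# Illustrative English example, not taken from any documentation corpus.
good = check_alignment("dog-s bark-ed", "dog-PL bark-PST")
bad = check_alignment("dog-s bark-ed", "dog-PL bark")
print(good)
print(bad)
```

A check of this kind catches only mechanical misalignment; whether a gloss label is appropriate remains an analytical judgment.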
6.4 Metadata
Documentary linguists must attend not only to data and analysis but also to metadata, the data
about the data which enables it to be managed, identified, retrieved and understood. Austin
2006 includes introductory discussion of types of metadata, and Good 2002 presents an
introduction to metadata for documentary linguists, while details of ISLE Metadata Initiative
(IMDI) and Open Language Archives Community (OLAC) metadata sets are available on the
internet. Nathan and Austin 2004 argue that all value adding in documentary research is a
kind of metadata. Farrar and Lewis 2007 present a proposal to standardize morpheme glosses
to the General Ontology for Linguistic Description (GOLD).
Austin, Peter K. “Data and language documentation”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus Himmelmann and Ulrike Mosel, 87-112.
Berlin: Mouton de Gruyter, 2006.
An introductory overview of the different kinds of data and metadata documenters
collect.
Farrar, Scott and William D. Lewis. “The GOLD Community of Practice: An Infrastructure
for Linguistic Data on the Web”. Language Resources and Evaluation 41 (2007): 45-60.
Proposal to establish a standardized ontology for morpheme-by-morpheme glossing in
annotations for language documentation materials.
Good, Jeff. A gentle introduction to metadata, 2002.
[http://www.language-archives.org/documents/gentle-intro.html]
Elementary overview of metadata and its use in language documentation.
ISLE Metadata Initiative (IMDI) [http://www.mpi.nl/IMDI/]
Reference materials for the IMDI metadata set.
Nathan, David and Peter K. Austin. “Reconceiving metadata: language documentation
through thick and thin”. In Language Documentation and Description, vol. 2. Edited by Peter
K. Austin, 179-187. London: SOAS, University of London, 2004.
Argues for a broad conception of metadata encompassing all additions of knowledge
representations to original audio and video recordings of speech events.
Open Language Archives Community (OLAC) metadata
[http://www.language-archives.org/OLAC/metadata.html]
Reference materials for the OLAC metadata set.
6.5 Technologies
Language documentation has been influenced and assisted by the development of powerful
software tools that enable and support data manipulation and analysis. Good 2010 argues that
it is important for linguists to understand this technology, while Boynton et al. 2010 discusses
good practices. Albright and Hatton 2007 presents an innovative tool that enables native
speakers to document their language with minimal outside support. Bowe et al. 2003 surveys
published materials to establish a generalized format for interlinear glossing. Drude 2002
presents the implementation of a complex multi-tiered model for interlinear glossing using
the Shoebox software tool. A comprehensive list of software used by linguists to annotate
recordings can be found at the Annotation Tools website.
Albright, Eric and John Hatton. “Chapter 10. WeSay, a tool for engaging communities in
dictionary building”. In Documenting and Revitalizing Austronesian Languages. Edited by D.
Victoria Rau and Margaret Florey, 189–201. Hawaii: University of Hawaii Press, 2007.
An innovative software tool designed for community members to document their own
languages, with minimal input from outside linguists.
Annotation Tools [http://annotation.exmaralda.org/index.php/Tools]
A comprehensive listing of computer software that can be used for linguistic
annotation.
Bowe, Cathy, Baden Hughes and Steven Bird. “Towards a general model for interlinear text.”
Proceedings of EMELD 2003. [http://emeld.org/workshop/2003/bowbadenbird-paper.html]
Review of interlinear glossing format based on existing publications and proposal for
an XML model of interlinear text.
Boynton, Jessica, Steven Moran, Helen Aristar-Dry and Anthony Aristar. “Using the E-MELD
School of Best Practices to create lasting digital documentation”. In Language
Documentation: Practice and values. Edited by Grenoble, Lenore A. and N. Louanna Furbee,
133-146. Amsterdam: John Benjamins Publishing Company, 2010.
Introduction to the Electronic Metastructure for Endangered Languages Data (E-MELD)
project sponsored by the National Science Foundation that began in 2000 and
established a School of Best Practices to promulgate and popularise standards for
documentation research and to present case studies and searchable databases of
software tools and reference bibliographies.
Drude, Sebastian. Advanced Glossing — a language documentation format and its
implementation with Shoebox. [http://www.mpi.nl/lrec/2002/papers/lrec-pap-10-ag.pdf]
Proposal for a comprehensive multi-tier annotation format for glossing text materials.
Good, Jeff. “Valuing technology: Finding the linguist’s place in a new technological
universe”. In Language Documentation: Practice and values. Edited by Grenoble, Lenore A.
and N. Louanna Furbee, 111-131. Amsterdam: John Benjamins Publishing Company, 2010.
A position paper on the role of technology in language documentation arguing for
greater technological literacy on the part of linguists and attention to broad issues rather
than technical details such as file formats or software tools.
6.6 Methods
Language documentation research has involved discussion of methodological aspects for data
collection, analysis, archiving and dissemination. Lüpke 2010 surveys different
documentary methods, evaluating their strengths and weaknesses. Reiman 2010 presents an
alternative documentation method that does not use text or symbolic representations but relies
on second-order oral annotation. Hill 2006 discusses how responding to a researcher does not
produce a ‘normal’ type of speech event, a fact that has consequences for the nature of the
data collected and the interaction itself. Haviland 2006 discusses documentation of lexical
knowledge while Lehmann 2004 focuses on grammar. Ashmore 2008 explores the role of
video in documentation while Nathan 2010 looks at sound recording; both argue that use and
recording of media by linguists is unscientific. Schembri 2010 covers the challenges
particular to the documentation of sign languages.
Ashmore, Louise. “The role of digital video in language documentation”. In Language
Documentation and Description, vol. 5. Edited by Peter K. Austin, 77-102. London: SOAS,
University of London, 2008.
Discusses the place of digital video in language documentation, for recording, capturing
visual and spatial aspects of interaction, assisting with transcription, and for community
use. It argues that researchers need to develop appropriate goals, methods and
evaluative criteria for using video.
Haviland, John. “Documenting lexical knowledge”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 129-
162. Berlin: Mouton de Gruyter, 2006.
A thorough and richly illustrated discussion about documenting the meanings and uses
of words.
Hill, Jane. “The ethnography of language and language documentation”. In Essentials of
Language Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike
Mosel, 113-128. Berlin: Mouton de Gruyter, 2006.
Calls for an ethnographic approach to the documentation of speech that proceeds in
light of both culture and language structure. Emphasizes the peculiar nature of the
linguistic fieldwork encounter and its consequent impact on the data produced.
Lehmann, Christian. “Documentation of grammar”. In Lectures on endangered languages: 4.
From Kyoto Conference 2001. Edited by Sakiyama, Osamu, Fubito Endo, Honore Watanabe,
and Fumiko Sasama, 61-74. Osaka: Osaka Gakuin University (Endangered Languages of the
Pacific Rim Publication Series, C-004), 2004.
Detailed discussion of how grammatical information should be represented in a
documentary corpus.
Lüpke, Friederike. “Research methods in language documentation”. In Language
Documentation and Description, vol. 7. Edited by Peter K. Austin, 55-104. London: SOAS,
University of London, 2010.
Outlines and illustrates methods that can be used for data collection (e.g., participant
observation, use of stimuli, experiments, games, elicitation), evaluating the strengths
and weaknesses of each.
Nathan, David. “Sound and unsound practices in documentary linguistics: towards an
epistemology for audio”. In Language Documentation and Description, vol. 7. Edited by
Peter K. Austin, 262-284. London: SOAS, University of London, 2010.
Critical discussion of the role of sound recordings in language documentation and the
need for linguists to pay more attention to equipment choice, environmental factors,
recording methodology, and research goals so that their approach to sound is more
systematic and effective.
Reiman, D. Will. “Basic Oral Language Documentation”. Language Documentation and
Conservation 4(2010): 254-268.
Presentation of a novel approach to documentation that involves recording materials in
their social and cultural context and then recording under laboratory conditions the
respeaking of them slowly or their translation, to provide a second-order oral
representation.
Schembri, Adam. “Documenting sign languages”. In Language Documentation and
Description, vol. 7. Edited by Peter K. Austin, 105-143. London: SOAS, University of
London, 2010.
Discusses the challenges involved in creating a language documentation corpus for
British Sign Language, some of which differ in kind or degree from those for spoken languages.
6.7 Multidisciplinarity
The value of multidisciplinarity for creating multi-purpose records of speech events in their
social and cultural contexts has long been recognized in language documentation. Coelho
2005 discusses the mutually beneficial nature of ecological and linguistic documentation;
Harrison 2005 shows why linguistic documentation must be ethnographically informed.
Barwick 2005 urges linguistic researchers to document basic aspects of musical forms and
expression. Methods and challenges of multidisciplinary research are presented in Franchetto
2006 for ethnography, Kendon 2004 for gesture (though it is not specifically targeted at
language documentation), Gaenszle 2010 for oral literature, Eisenbeiss 2005 for
psycholinguistics, and Roche et al. 2010 for cultural documentation.
Barwick, Linda. “A musicologist's wish list: some issues, practices and practicalities in
musical aspects of language documentation”. In Language Documentation and Description,
vol. 3. Edited by Peter K. Austin, 53-62. London: SOAS, University of London, 2005.
Argues that most language documentation includes some musical material and presents
an overview of what a musicologist would like to see recorded and documented.
Coelho, Gail. “Language documentation and ecology: areas of interaction”. In Language
Documentation and Description, vol. 3. Edited by Peter K. Austin, 63-74. London: SOAS,
University of London, 2005.
Discusses ways in which linguists and ecologists can collaborate in documenting
traditional ecological knowledge.
Eisenbeiss, Sonja. “Psycholinguistic contributions to language documentation”. In Language
Documentation and Description, vol. 3. Edited by Peter K. Austin, 106-140. London: SOAS,
University of London, 2005.
Discusses how concepts from psycholinguistic research, including child language
acquisition, can be applied in documentary linguistics.
Franchetto, Bruna. “Ethnography in language documentation”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 183-
211. Berlin: Mouton de Gruyter, 2006.
Examines what kinds of information ethnographers might look for in a language corpus
and illustrates one method of recording such information as applied to research in
Brazil.
Gaenszle, Martin. “Documenting Ceremonial Dialogues: An in vitro performance and the
problem of textualisation”. In Language Documentation and Description, vol. 8. Edited
by Imogen Gunn and Mark Turin, 70-87. London: SOAS, University of London, 2010.
Discusses the advantages and disadvantages of documenting staged performances
versus ‘naturally occurring’ events, arguing that both can provide a starting point for
creating ethnographic text.
Harrison, K. David. “Ethnographically informed language documentation”. In Language
Documentation and Description, vol. 3. Edited by Peter K. Austin, 22-41. London: SOAS,
University of London, 2005.
Argues for the need to pay attention to cultural factors in language documentation with
examples from Siberian languages showing how phonology, verb semantics, colour terminology
and noun phrase structure can be better understood by collecting data in an
ethnographically informed way.
Kendon, Adam. Gesture: Visible Action as Utterance. Cambridge: Cambridge University
Press, 2004.
Comprehensive treatment of the study of gesture from historical, linguistic and cultural
perspectives. Recommendations on transcriptional conventions and analytical methods
will be of use to language documenters.
Roche, Gerald, Ban+de mkhar, Bkra shis bzang po, G.yu lha, Snying dkar skyid, Tshe ring
rnam gyal, Zla ba sgrol ma, and Charles Kevin Stuart. “Participatory Culture Documentation
on the Tibetan Plateau”. In Language Documentation and Description, vol. 8. Edited by
Imogen Gunn and Mark Turin, 147-165. London: SOAS, University of London, 2010.
Examines the benefits of participatory approaches to cultural documentation in the
context of fieldwork on the Tibetan Plateau.
6.8 Corpus adequacy and representativeness
Language documentation is intended to create a corpus representing the range of ways
language is used within a community. But ensuring the adequacy and representativeness of
such a corpus is challenging. Foley 2003 identifies genre as an important variable that can
have unexpected effects on the validity of data included. Berge 2010 proposes parameters for
adequacy of documentation while Seifart 2008 approaches representativeness in terms of
sampling.
Berge, Anna. “Adequacy in documentation”. In Language Documentation: Practice and
values. Edited by Grenoble, Lenore A. and N. Louanna Furbee, 51-66. Amsterdam: John
Benjamins Publishing Company, 2010.
Proposal for general principles to be followed in language documentation concerning
description, diversity and the roles played by different participants.
Foley, William A. “Genre, register and language documentation in literate and preliterate
communities”. In Language Documentation and Description, vol. 1. Edited by Peter K.
Austin, 85-98. London: SOAS, University of London, 2003.
Argues for the need to pay attention to genre in documentation and that using stimuli,
such as picture books, can result in language forms and use that differ greatly from
naturally occurring narrative text.
Seifart, Frank. “The representativeness of language documentations”. In Language
Documentation and Description, vol. 5. Edited by Peter K. Austin, 60-76. London: SOAS,
University of London, 2008.
Addresses what a “representative sample” might mean for a documentary corpus and
how sampling procedures can serve the goal of achieving representativeness.
7. Ethics, speakers and collaboration
The rights and needs of the language speakers who participate in documentation projects are
of primary importance. As a result, there is an increasing emphasis on collaborative models in
documentary linguistics. Issues of research ethics, the roles of speakers in recording and
documenting languages, and the relationships between researchers and communities have
been particularly prominent in the language documentation literature.
7.1 Ethics
Overviews of ethical issues in language documentation can be found in Austin 2010, Dwyer
2006, Rice 2006, Thieberger and Musgrave 2006, and Macri 2010, all intended for
researchers beginning a documentation project. Dobrin 2008 argues for understanding,
analyzing and responding to the different values and moral positions adopted by
researchers and by the people with whom they work. A range of issues in linguistic fieldwork
are covered in Newman and Ratliff 2001.
Austin, Peter K. “Communities, ethics and rights in language documentation”. In Language
Documentation and Description, vol. 7. Edited by Peter K. Austin, 34-54. London: SOAS,
University of London, 2010.
Elementary introduction to ethical issues, intellectual property rights, copyright, moral
rights, and community views for students.
Dobrin, Lise M. “From linguistic elicitation to eliciting the linguist: lessons in community
empowerment from Melanesia”. Language 84/2(2008): 300–324.
Argues that linguists working on endangered languages need to understand and
respond to the different values and moral positions adopted by communities,
especially those that are different from a western perspective.
Dwyer, Arienne M. “Ethics and practicalities of cooperative fieldwork and analysis”. In
Essentials of Language Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann
and Ulrike Mosel, 31-66. Berlin: Mouton de Gruyter, 2006.
Inventory of the major ethical issues to be addressed in language documentation,
including ethical principles, rights and legal responsibilities, and the practical problems
associated with fieldwork.
Macri, Martha. “Language documentation: whose ethics?”. In Language Documentation:
Practice and values. Edited by Grenoble, Lenore A. and N. Louanna Furbee, 37-47.
Amsterdam: John Benjamins Publishing Company, 2010.
Argues that an ethical approach to documenting a language should aim not only at
recording the language but also at supporting its continued use within heritage communities.
Newman, Paul, and Martha Ratliff (eds.) Linguistic Fieldwork. Cambridge: Cambridge
University Press, 2001.
A collection of personal reflections on fieldwork from some of the leading practitioners
in the discipline.
Rice, Keren. “Ethical issues in fieldwork: an overview”. Journal of Academic Ethics 4(2006):
123-155.
Exploration of ethical models for fieldwork and the responsibilities of researchers.
Thieberger, Nicholas and Simon Musgrave. “Documentary linguistics and ethical issues”. In
Language Documentation and Description, vol. 4. Edited by Peter K. Austin, 26-37. London:
SOAS, University of London, 2006.
Review of some of the most pressing ethical issues and challenges linguistic
researchers face in carrying out language documentation.
7.2 Speakers and collaboration
The kinds of speakers that exist in communities where languages are being documented, and
the roles they can play in documentation projects, have been discussed by Grinevald 2003 and
Dobrin and Berson 2011. Czaykowska-Higgins 2009 and Mosel 2006 discuss collaboration
between researchers and communities; Wilkins 1992 is a seminal account of collaborative
research under the control of the community members. Glenn 2009 focuses on collaboration
between different kinds of researchers. Leonard and Haynes 2010 is a recent critical discussion
of collaboration as a research approach.
Czaykowska-Higgins, Ewa. “Research Models, Community Engagement, and Linguistic
Fieldwork: Reflections on Working within Canadian Indigenous Communities”. Language
Documentation and Conservation 3(2009): 15-50.
Reflections on research models in linguistic fieldwork and on different levels of
engagement in and with language-speaking communities, focusing on the Canadian
context.
Dobrin, Lise and Josh Berson. “Speakers and language documentation”. In The Cambridge
Handbook of Endangered Languages. Edited by Peter K. Austin and Julia Sallabank, 187-
211. Cambridge, UK: Cambridge University Press, 2011.
Critical overview of the practical, moral, and cultural issues surrounding speakers in
language documentation, and how these follow from recognizing speakers as fellow
humans and collaborators, rather than as mere sources of data.
Glenn, Akiemi. “Five Dimensions of Collaboration: Toward a Critical Theory of
Coordination and Interoperability in Language Documentation” Language Documentation
and Conservation 3(2009): 149–160.
Discusses differences across disciplinary cultures and their implications for
collaborative language documentation projects.
Grinevald, Colette. “Speakers and documentation of endangered languages”. In Language
Documentation and Description, vol. 1. Edited by Peter K. Austin, 52-72. London: SOAS,
University of London, 2003.
Presents a typology of language speakers and research approaches in documentary
linguistics.
Leonard, Wesley Y. and Erin Haynes. “Making ‘collaboration’ collaborative: An
examination of perspectives that frame linguistic field research.” Language Documentation
and Conservation 4(2010): 268-293.
A critical study of ‘collaboration’ in recent linguistic literature, arguing that true
collaboration necessitates a collaborative approach in the initial project stages of
establishing research roles and goals. Case studies from US Native American
communities are presented.
Mosel, Ulrike. “Fieldwork and community language work”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 67-85.
Berlin: Mouton de Gruyter, 2006.
Discusses differences of motivation and approach among documentary researchers and
communities and how these affect collaborative projects, with illustrations from the
Pacific.
Wilkins, David P. “Linguistic research under Aboriginal control: a personal account of
fieldwork in central Australia”. Australian Journal of Linguistics, 12/1(1992): 171–200.
Seminal article presenting a vision for linguistic research where the language
community rather than the researcher determines the goals and outcomes.
8. Archiving
Language documentation emphasises the long-term preservation of documentary corpora and
hence archiving plays an important role in the field. The basic principles and approach taken
by archivists are covered by Conathan 2011, while the organisation of a corpus to make
archiving easier is discussed by Johnson 2004. Audio archiving principles are covered by
Bradley 2009. Nathan 2011 reviews the particular challenges of digital archiving, where
changing software and data formats require the continual refreshing of archival materials; he
also discusses in detail ‘protocol’ requirements, that is, specification and management of
restrictions on access and use of archived data. Trilsbeek and Wittenburg 2006 covers similar
ground from the perspective of archivists for the Volkswagen-funded DOBES project.
Nathan 2010 discusses new directions in language documentation archiving, especially
application of Web 2.0 social networking models.
Bradley, Kevin. Guidelines on the Production and Preservation of Digital Audio Objects.
IASA-TC04, Second Edition. Sydney: International Association of Sound and Audiovisual
Archives, 2009.
Authoritative reference on audio archiving published by the International Association of
Sound and Audiovisual Archives. Also online at
[http://www.iasa-web.org/tc04/audio-preservation]
Conathan, Lisa. “Archiving and language documentation”. In The Cambridge Handbook of
Endangered Languages. Edited by Peter K. Austin and Julia Sallabank, 235-254. Cambridge,
UK: Cambridge University Press, 2011.
Discussion of the policies, principles and practices of archives, with particular
relevance to curation, cataloguing and preservation of language materials.
Johnson, Heidi. “Language documentation and archiving, or how to build a better corpus”. In
Language Documentation and Description, vol. 2. Edited by Peter K. Austin, 140-153.
London: SOAS, University of London, 2004.
Introduction to basic principles of archiving and corpus management to ensure
preservation and usability of the documentary materials collected by researchers.
Nathan, David. “Archives 2.0 for endangered languages: from disk space to MySpace”.
International Journal of Humanities and Arts Computing 4/1-2(2010): 111-124.
Presentation of an approach to language archiving developed at the Endangered
Languages Archive at SOAS, University of London that builds upon Web 2.0 social
computing models that allow the archive to serve as a communication channel between
users and depositors.
Nathan, David. “Digital archiving”. In The Cambridge Handbook of Endangered Languages.
Edited by Peter K. Austin and Julia Sallabank, 255-273. Cambridge, UK: Cambridge
University Press, 2011.
Overview of archiving issues raised by digital materials, with particular attention to the
need to control access to and use of deposited data and analysis for endangered language
communities.
Trilsbeek, Paul and Peter Wittenburg. “Archiving challenges”. In Essentials of Language
Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 311-
335. Berlin: Mouton de Gruyter, 2006.
Introduction to the role of archiving in language documentation with particular
reference to the approach taken for DOBES projects.
9. Mobilization and revitalization
Documentary linguistic corpora are not only collected for research purposes, but may also be
organized in such a way that they can be mobilized for practical uses such as the creation of
multimedia and teaching materials. Amery 2009 argues that failure to collect data on speech
formulas and routines, neologisms and non-traditional conversation makes corpora less useful
to both researchers and those wishing to revitalize languages. The role of interfaces in
mobilization is discussed by Nathan 2006, while Francis and Gómez 2009 argue that
mobilization nicely complements more narrowly focused research agendas. There is an
enormous literature on language revitalization; Hinton 2011 serves as a good general
introduction.
Amery, Rob. “Phoenix or Relic? Documentation of Languages with Revitalization in Mind.”
Language Documentation and Conservation 3/2(2009): 138-148.
Argues that a documentary corpus that takes into consideration the possibility of
revitalization would include data on speech formulas, neologisms, and conversation in
non-traditional as well as traditional contexts.
Francis, Norbert and Pablo Rogelio Navarrete Gómez. “Documentation and Language
Learning: Separate Agendas or Complementary Tasks?” Language Documentation and
Conservation 3(2009): 176–191.
Drawing on experiences with Nahuatl speakers in Mexico, the authors argue that the
research interests of documenters do not necessarily conflict with those of communities
concerning the goals and outcomes of documentation and language maintenance.
Hinton, Leanne. “Revitalization of endangered languages”. In The Cambridge Handbook of
Endangered Languages. Edited by Peter K. Austin and Julia Sallabank, 291-311. Cambridge,
UK: Cambridge University Press, 2011.
Introductory overview of issues and methods in language revitalization.
Nathan, David. “Thick interfaces: Mobilizing language documentation with multimedia”. In
Essentials of Language Documentation. Edited by Jost Gippert, Nikolaus P. Himmelmann
and Ulrike Mosel, 363-379. Berlin: Mouton de Gruyter, 2006.
Discusses how multimedia products based on documentary materials can serve the
purpose of language maintenance, strengthening and revitalization.
10. Training
Language documentation is a new field within linguistics and requires a more complex range
of skills and knowledge than traditionally included in training programs in linguistics, such as
those typically associated with recording arts, data management, information and
communications technologies, ethnography etc. Austin 2008 examines how these skills are
taught at SOAS, University of London. Woodbury and England 2006 discuss experiences
training indigenous Latin American students at the University of Texas at Austin, while Jukes
2011 gives a broad overview of training for language documentation. The training of native
speakers is discussed in Himmelmann and Florey 2010, while Maxwell 2010 covers both
groups. Course materials are available online from DOBES Training Courses, ELDP Grantee
Training, Workshops of Infield 2008, and Workshops of Infield 2010.
Austin, Peter K. “Training for language documentation: Experiences at the School of
Oriental and African Studies”. In Documenting and Revitalising Austronesian Languages.
Edited by Margaret Florey and Victoria Rau, 25-41. Language Documentation and
Conservation Special Publication No. 1. Hawaii: University of Hawaii Press, 2008.
Discussion of skills and training needs for language documenters with examples of how
these are provided in post-graduate degree courses, workshops, and grantee training at
SOAS.
DOBES Training Courses [http://www.mpi.nl/DOBES/training_courses/]
Information and materials from training courses for the recipients of DOBES grants
from the Volkswagen Foundation.
ELDP Grantee Training [http://www.hrelp.org/events/workshops/eldp2010_9/index.html]
Schedule, syllabus and training materials used at SOAS, University of London to train
grant recipients before their projects begin.
Himmelmann, Nikolaus and Margaret Florey. “New directions in field linguistics: Training
strategies for language documentation in Indonesia”. In Endangered languages of
Austronesia. Edited by Margaret Florey, 121-140. Oxford: Oxford University Press, 2010.
Discussion of an innovative training course for native speakers and local researchers in
Indonesia sponsored by the Volkswagen Foundation.
Jukes, Anthony. “Researcher training and capacity development in language
documentation”. In The Cambridge Handbook of Endangered Languages. Edited by Peter K.
Austin and Julia Sallabank, 423-445. Cambridge, UK: Cambridge University Press, 2011.
Overview of the goals, methods and structures of training courses for language
documenters drawing on examples from training of various types.
Maxwell, Judith M. “Training graduate students and community members for native
language documentation”. In Language Documentation: Practice and values. Edited by
Grenoble, Lenore A. and N. Louanna Furbee, 255-274. Amsterdam: John Benjamins
Publishing Company, 2010.
Broad-ranging study of the skills needed by students and community members for
language documentation, arguing that co-constructed projects can lead to greater
satisfaction in goals, outcomes and dissemination.
Woodbury, Anthony C. and Nora England. “Training speakers of indigenous languages of
Latin America at a US university”. Linguistic Discovery 4/1(2006)
[http://journals.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/2/xmlpage/1/issue/26]
Workshops of Infield, University of California Santa Barbara. 2008
[http://www.linguistics.ucsb.edu/faculty/infield/workshops/index.html]
Course materials from the 2008 Infield workshops; includes handouts intended for
native speakers and language activists.
Workshops of Infield, University of Oregon. 2010
[http://logos.uoregon.edu/infield2010/workshops/index.php]
Course materials from the 2010 Infield training course with a particular emphasis on
collaborative methods.