LSA 2010 Austin


Published in: Education, Technology

  1. 1. Language Documentation: An Overview Prof Peter K. Austin Endangered Languages Academic Programme Department of Linguistics, SOAS LSA Annual Meeting, Baltimore, January 2010
  2. 2. Outline <ul><li>What is language documentation? </li></ul><ul><li>How does it differ from language description (and from linguistic theory)? </li></ul><ul><li>Reshaping 'the science of language' </li></ul><ul><li>Some challenges </li></ul>
  3. 3. Language documentation <ul><li>“concerned with the methods, tools, and theoretical underpinnings for compiling a representative and lasting multipurpose record of a natural language or one of its varieties” (Himmelmann 1998) </li></ul><ul><li>has developed over the last decade in large part in response to the urgent need to make an enduring record of the world’s many endangered languages and to support speakers of these languages in their desire to maintain them, fuelled also by developments in information and communication technologies </li></ul><ul><li>essentially concerned with the roles of language speakers and their rights and needs </li></ul>
  4. 4. What documentary linguistics is not <ul><li>it's not about collecting stuff to preserve it without analysing it </li></ul><ul><li>it's not = description + technology </li></ul><ul><li>it's not necessarily about endangered languages per se </li></ul><ul><li>it's not a fad </li></ul>
  5. 5. The level of interest is very high <ul><li>Graduate student interest </li></ul><ul><li>62 students graduated from the SOAS MA in Language Documentation and Description 2004-09 – 17 are currently enrolled </li></ul><ul><li>7 graduates from the PhD in Field Linguistics – 12 are currently enrolled </li></ul><ul><li>other documentation programmes, e.g. UT Austin, report similar experience </li></ul>
  6. 6. Interest in training <ul><li>3L Summer School 2009 – 100 attendees </li></ul><ul><li>3L Summer School 2008 – 80 attendees </li></ul><ul><li>SOAS fieldwork seminars – 70 attendees </li></ul>
  7. 7. InField 2008 – 75 attendees
  8. 8. And more good news ... <ul><li>Research funding $$$ </li></ul><ul><li>ELDP has so far funded 250 documentation research projects on endangered languages worth GBP 8 million </li></ul><ul><li>Volkswagen DoBeS has funded 60 projects worth EUR 30 million </li></ul><ul><li>NSF-NEH DEL: 60 projects worth US$ 10 million </li></ul><ul><li>ESF EuroBABEL project: EUR 8 million </li></ul><ul><li>and ELF, FEL, GfBS, UNESCO ... </li></ul>
  9. 9. DoBeS projects
  10. 10. ELDP funding Projects 2003-2007
  11. 11. Books and journals <ul><li>Gippert et al 2006 Essentials of Language Documentation. Mouton </li></ul><ul><li>Tsunoda 2006 Language endangerment and language revitalization: an introduction </li></ul><ul><li>Language Documentation and Description – 6 issues (1,500 copies sold) </li></ul><ul><li>Language Documentation and Conservation – 6 issues (on-line only) </li></ul><ul><li>Cambridge Handbook of Endangered Languages </li></ul><ul><li>Routledge Essential Readings </li></ul>
  12. 12. <ul><li>back to Language Documentation </li></ul>
  13. 13. Main features (Himmelmann 2006:15) <ul><li>Focus on primary data – collection and analysis of an array of primary language data to be made available for a wide range of users; </li></ul><ul><li>Explicit concern for accountability – access to primary data and representations of it makes evaluation of linguistic analyses possible and expected; </li></ul><ul><li>Concern for long-term storage and preservation of primary data – includes a focus on archiving in order to ensure that documentary materials are made available to potential users now and into the distant future; </li></ul>
  14. 14. Main features (cont.) <ul><li>Diversity – of contexts, languages, cultures, communities, individuals, projects </li></ul><ul><li>Work in interdisciplinary teams – documentation requires input and expertise from a range of disciplines and is not restricted to mainstream (“core”) linguistics alone </li></ul><ul><li>Close cooperation with and direct involvement of the speech community – active and collaborative work with community members both as producers of language materials and as co-researchers </li></ul>
  15. 15. The documentation record <ul><li>core of a language documentation is a corpus of audio and/or video materials with transcription, annotation, translation into a language of wider communication, and relevant metadata on context and use of the materials </li></ul><ul><li>the corpus will ideally be large, cover a diverse range of genres and contexts, be expandable, opportunistic, portable, transparent, ethical and preservable </li></ul><ul><li>lexico-grammatical analysis (description) and theory construction is contingent on and emergent from the documentation corpus (Woodbury 2003, 2010) </li></ul>
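As a rough illustration of the record described on this slide, the following Python sketch models one recording session with its transcription, translation, annotation and metadata. Every name here (`Annotation`, `Session`, `missing_parts`, the file name and the metadata key) is hypothetical and chosen only for illustration; it is not a format used by any archive or tool mentioned in the talk, and the utterance text is a placeholder rather than real language data.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One time-aligned stretch of a recording (hypothetical structure)."""
    start_s: float          # start time in seconds
    end_s: float            # end time in seconds
    transcription: str      # utterance in the documented language
    translation: str = ""   # rendering in a language of wider communication
    notes: str = ""         # free-form annotation, e.g. glosses or comments


@dataclass
class Session:
    """A recording plus the value added to it (hypothetical structure)."""
    media_file: str                                   # audio or video file name
    metadata: dict = field(default_factory=dict)      # context and use of the recording
    annotations: list = field(default_factory=list)   # list of Annotation objects

    def missing_parts(self):
        """Return which parts of the record are still absent."""
        missing = []
        if not self.metadata:
            missing.append("metadata")
        if not self.annotations:
            missing.append("transcription")
        if not any(a.translation for a in self.annotations):
            missing.append("translation")
        return missing


# Placeholder data only: a session with metadata and one untranslated utterance.
session = Session(media_file="story_01.wav", metadata={"genre": "narrative"})
session.annotations.append(Annotation(0.0, 2.5, "example utterance"))
print(session.missing_parts())
```

A check like `missing_parts` only reports whether each component exists; it says nothing about the quality of the transcription or translation.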
  16. 16. Components of documentation <ul><li>Recording – of media and text (including metadata) in context </li></ul><ul><li>Transfer – to data management environment </li></ul><ul><li>Adding value – transcription, translation, annotation, notation and linking of metadata </li></ul><ul><li>Archiving – creating archival objects, assigning access and usage rights </li></ul><ul><li>Mobilisation – creation, publication and distribution of outputs </li></ul>
  17. 17. An example – Stuart McGill <ul><li>4-year PhD project at SOAS </li></ul><ul><li>documentation of Cicipu (Niger-Congo, north-west Nigeria) in collaboration with native speaker researchers </li></ul><ul><li>outcomes: </li></ul><ul><ul><li>a corpus of texts (video, ELAN, Toolbox) </li></ul></ul><ul><ul><li>2,000-item lexicon </li></ul></ul><ul><ul><li>archive (956 files, 50 GB) </li></ul></ul><ul><ul><li>overview grammar (134 pages) </li></ul></ul><ul><ul><li>analysis of agreement (158 pages) </li></ul></ul><ul><ul><li>website, cassette tapes, books, orthography proposal and workshop </li></ul></ul>
  18. 18. <ul><li>Documentation and Description </li></ul>
  19. 19. Documentation and description <ul><li>language documentation: systematic recording, transcription, translation and analysis of the broadest possible variety of spoken (and written) language samples collected within their appropriate social and cultural context </li></ul><ul><li>language description: grammar, dictionary, text collection, typically written for linguists </li></ul><ul><li>Ref: Himmelmann 1998, Woodbury 2003, 2010 </li></ul>
  20. 20. Documentation and description <ul><li>documentation projects must rely on application of theoretical and descriptive linguistic techniques, to ensure that they are usable (i.e. have accessible entry points via transcription, translation and annotation) as well as to ensure that they are comprehensive </li></ul><ul><li>only through linguistic analysis can we discover that some crucial speech genre, lexical form, grammatical paradigm or sentence construction is missing or under-represented in the documentary record </li></ul>
  21. 21. Documentation and description <ul><li>without good analysis, recorded audio and video materials do not serve as data for any community of potential users. Similarly, linguistic description without documentary support risks being sterile, opaque and untestable (not to mention non-preservable for future generations and useless for language support) </li></ul>
  22. 22. Workflow [diagram comparing two workflows; labels as recovered from the slide] Description: something happened; something inscribed; you applied knowledge, made decisions; representations, lists, summaries, analyses; cleaned up, selected, analysed; presented, published. Documentation: something happened; recording; you applied knowledge, techniques; representations, e.g. transcription, annotation; made decisions, applied linguistic & other knowledge; archived, mobilised. Other labels: FOCUS OF INTEREST (three times), NOT OF INTEREST, recapitulates.
  23. 23. <ul><li>Language documentation gives linguistics an opportunity to reassert itself as 'the science of human language' </li></ul>
  24. 24. Linguistics – the science of language <ul><li>documentation requires a scientific approach to information capture, paying proper attention to environmental factors including spatial layouts, equipment choice etc., requiring knowledge and skills more often found in music or film than in descriptive/theoretical linguistics </li></ul><ul><li>documentation requires a scientific approach to data structuring, processing and analysis, paying attention to, e.g., data modelling and knowledge representation, requiring skills more often found in computer science than in descriptive/theoretical linguistics </li></ul>
  25. 25. <ul><li>documentation requires a scientific approach to data archiving and preservation, with proper attention to metadata, data formats, corpus structure, workflows, and protocols (access and usage rights), requiring knowledge and skills more often found in archiving theory and practice than in descriptive/theoretical linguistics </li></ul><ul><li>documentation demands a scientific approach to mobilisation, with proper attention to pedagogy, applied linguistics, human-computer interaction (interface design etc.) </li></ul>
  26. 26. Challenges <ul><li>Determining quality </li></ul><ul><li>Interdisciplinarity </li></ul><ul><li>Recruitment, training and sustainability </li></ul>
  27. 27. Quality <ul><li>tendency among some researchers to equate documentation outcomes with properties of archival objects (part of what Nathan has termed ‘archivism’), e.g. number and volume of recorded digital audio and/or video files and their related transcription, annotation, translation and metadata </li></ul><ul><li>quantity of objects is not a good proxy for quality of research </li></ul><ul><li>some would argue that outcomes which contribute to language maintenance and revitalization are the true measure of the quality of a documentation project (what better success for an endangered language project than that the language continues to be used?) </li></ul><ul><li>So how could we measure ‘quality’ of a documentary corpus? What parameters might be included? </li></ul>
  28. 28. Interdisciplinarity <ul><li>multidisciplinary perspective in language documentation draws in researchers, theories and methods from a wide range of areas, including anthropology, musicology, oral history, psychology, ecology, pedagogy, applied linguistics, computer science etc. </li></ul><ul><li>true interdisciplinary research is difficult to achieve, both because of different theoretical orientations and because of practical differences in approach that can make communication and understanding complex and difficult </li></ul><ul><li>mainstream linguistics has tended to turn away from other disciplines and to emphasise its ‘independence’ by concentrating on theoretical concerns that are of internal interest primarily to linguists alone (Libermann 2007) </li></ul><ul><li>language documentation opens new doors to interdisciplinary collaboration but we need to work out how to achieve it </li></ul>
  29. 29. Sustainability <ul><li>we need to work out how to recruit new contributors to the discipline, how to train them, and how to sustain them through fulfilling career paths </li></ul><ul><li>we understand sustainability of archived data but how do we sustain projects and relationships beyond the typical 3-5 year academic life cycle? </li></ul><ul><li>how can documentation contribute to sustaining endangered languages and the communities who want to maintain and develop them? </li></ul>
  30. 30. Conclusion <ul><li>language documentation is an exciting development in terms of research goals, methods and outcomes that offers the potential to reshape the scientific and humanistic basis of linguistics now and into the future </li></ul>
  31. 31. <ul><li>Thank you </li></ul>