Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project

    Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project - Presentation Transcript

    1. Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
      • Baden Hughes¹, David Penton¹, Steven Bird¹, Catherine Bow¹, Gillian Wigglesworth¹, Patrick McConvell² and Jane Simpson³
      • ¹ University of Melbourne, ² AIATSIS, ³ University of Sydney
    2. Overview
      • Introduction
      • Requirements
      • Data Model
      • Implementation
        • Data Entry
        • Reports, Queries and Searches
        • Exports
        • Synchronisation
        • Administration
      • Conclusion
    3. Introduction
      • A metadata creation and management tool for a multiple fieldworker, longitudinal, child language acquisition research project
      • Addressing the need for principled metadata creation as well as best practice data creation
      • Challenging deployment scenario which is typical of numerous field-oriented linguistic research and language data collection projects
    4. Requirements
      • Data Management
        • Metadata for complex multimodal data
        • Relational data for participants
        • Delineation between participant roles
        • Not just collection, but reports and queries
      • Research Methodology
        • Integration with tool of choice for analysis
        • Two-stage enquiry process: metadata first, then data
        • Extensible controlled vocabularies
        • User defined fields (particularly lists)
      • Technology
        • Full support for data entry and enquiry in both online and offline modes
        • Metadata collection with maximum utility to the project, without precluding other renderings, e.g. as an OLAC or IMDI catalogue
        • Easy to install and use on multiple platforms
    5. Data Model
      • Tools for modelling
        • DBDesigner (open source, XML based, multi-platform)
      • Challenges for modelling
        • Multiple interlinked media, sessions, and transcripts
        • Differentiating between participants and focus children in multiple contexts
        • Incomplete personal data, e.g. no date of birth
        • Non-linear progression through educational system
        • Multiple types of anthropological relations
        • Non-standardised linguistic classification and nomenclature
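The modelling challenges above can be illustrated with a minimal sketch. It is not the ACLA schema: the class names, field names and role labels are assumptions made for illustration. It shows two of the listed points: a date of birth that may be missing, and a role (such as focus child) that belongs to a participant's appearance in a session rather than to the participant.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    code: str                               # hypothetical identifier
    name: str
    date_of_birth: Optional[date] = None    # may be unknown in the field


@dataclass
class SessionRole:
    participant: Participant
    role: str                               # assumed labels, e.g. "focus_child"


@dataclass
class Session:
    session_id: str
    recorded_on: date
    roles: List[SessionRole] = field(default_factory=list)
    media: List[str] = field(default_factory=list)        # linked recordings
    transcripts: List[str] = field(default_factory=list)  # linked transcripts

    def focus_children(self) -> List[Participant]:
        """Return the participants acting as focus child in this session."""
        return [r.participant for r in self.roles if r.role == "focus_child"]
```

Because the role is stored per session, the same child can be the focus child in one recording and an ordinary participant in another without duplicating the participant record.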
    6. Implementation
      • Architecture
        • (fully independent) networked client-server
        • A single line of code differs between the client and server installations
        • Underlying requirement to provide full functionality in both online and offline environments
      • Technology Platform
        • PHP scripting language with PEAR libraries
        • MySQL database engine
        • Apache HTTP server
        • fundamentally open source, cross-platform
    7.  
    8. Data Entry
      • Forms based data entry
        • Participant Form
        • Session Form
      • Both forms feature a “build your own list” interface, which lets the end user construct a list of parameters and then apply instances of those parameters within the parent form
        • educational progress
        • session-media-transcript
    9. Reports, Queries and Searches
      • Simple Reports
        • for frequently used two-dimensional queries
          • e.g. participants by fieldworker
          • e.g. participants by gender
      • Advanced Reports
        • design your own query interface
      • Full Text Query
        • Boolean support
        • full database index query
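A simple report of the kind listed above ("participants by fieldworker", "participants by gender") amounts to a grouped count. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for the project's MySQL database, and the `participant` table, its columns and the sample rows are assumptions.

```python
import sqlite3


def participants_by(conn, column):
    """Count participants grouped by one whitelisted column."""
    allowed = {"fieldworker", "gender"}   # assumed report dimensions
    if column not in allowed:
        raise ValueError(f"unsupported report column: {column}")
    # The column name is safe to interpolate because it was checked above.
    query = (
        f"SELECT {column}, COUNT(*) FROM participant "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participant (code TEXT PRIMARY KEY, gender TEXT, fieldworker TEXT)"
)
conn.executemany(
    "INSERT INTO participant VALUES (?, ?, ?)",
    [("P001", "F", "fw_a"), ("P002", "M", "fw_a"), ("P003", "F", "fw_b")],
)

by_fieldworker = participants_by(conn, "fieldworker")
by_gender = participants_by(conn, "gender")
```

Restricting the report dimension to a fixed set keeps the frequently used queries simple while leaving free-form questions to the advanced report interface.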
    10.  
    11. Exports
      • Generate headers for CLAN
        • e.g. @participants
      • Generate Physical Media Labels
        • e.g. FM025.A.DV, FM025.A.MD
      • Generate File Names for Transcriptions
        • e.g. DEV00012004049.trn
      • XML-based database dump
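Two of the exports above can be sketched as plain string builders. This is an illustration, not the project's export code. The header follows the general shape of a CHAT `@Participants` line (speaker code and role, comma-separated, after a tab). The media label assumes that an example such as FM025.A.DV is a session code, a part letter and a media type joined by dots, which the slides do not actually specify. Function and parameter names are hypothetical.

```python
def participants_header(participants):
    """Build a CHAT-style @Participants line.

    participants: list of (speaker_code, role) pairs, e.g. ("CHI", "Target_Child").
    """
    entries = ", ".join(f"{code} {role}" for code, role in participants)
    return f"@Participants:\t{entries}"


def media_label(session_code, part, media_type):
    """Join the assumed label components with dots (layout is a guess)."""
    return ".".join([session_code, part, media_type])


header = participants_header([("CHI", "Target_Child"), ("MOT", "Mother")])
label = media_label("FM025", "A", "DV")
```

Generating these strings from the metadata database, rather than typing them by hand, keeps transcript headers and physical labels consistent with the session records.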
    12. Synchronisation
      • Client -> Server
        • SQL query identifies all changed data since last sync
        • Export and serialise as XML
        • Compress, checksum
        • Transfer over HTTP
        • Checksum, uncompress
        • Deserialise XML to SQL
        • Import SQL into database
      • Server -> Client is this process in reverse
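The synchronisation steps above can be sketched end to end, leaving out the HTTP transfer. This is a minimal illustration under stated assumptions, not the project's PHP implementation: the XML layout, the SHA-256 checksum, the zlib compression and the `REPLACE INTO` statements are all choices made for the example, and the function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET
import zlib


def pack_changes(rows):
    """Sender side: serialise changed rows as XML, compress, then checksum."""
    root = ET.Element("changes")
    for row in rows:
        record = ET.SubElement(root, "record", table=row["table"])
        for name, value in row["fields"].items():
            ET.SubElement(record, "field", name=name).text = str(value)
    payload = zlib.compress(ET.tostring(root, encoding="utf-8"))
    digest = hashlib.sha256(payload).hexdigest()
    return payload, digest


def unpack_changes(payload, digest):
    """Receiver side: verify checksum, uncompress, turn XML back into SQL."""
    if hashlib.sha256(payload).hexdigest() != digest:
        raise ValueError("checksum mismatch: payload corrupted in transfer")
    root = ET.fromstring(zlib.decompress(payload))
    statements = []
    for record in root.findall("record"):
        table = record.get("table")
        fields = {f.get("name"): f.text for f in record.findall("field")}
        columns = ", ".join(fields)
        placeholders = ", ".join("?" for _ in fields)
        sql = f"REPLACE INTO {table} ({columns}) VALUES ({placeholders})"
        statements.append((sql, list(fields.values())))
    return statements


# Hypothetical changed row, as an SQL query for "changed since last sync" might return it.
changed = [{"table": "participant", "fields": {"code": "P001", "gender": "F"}}]
payload, digest = pack_changes(changed)
statements = unpack_changes(payload, digest)
```

Computing the checksum over the compressed payload lets the receiver reject a damaged transfer before it attempts to uncompress or import anything. The `?` placeholders follow the sqlite3 convention; a MySQL driver would typically use `%s`.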
    13.  
    14. Administration
      • User-facilitated editing of
        • System data
          • Synchronisation – server settings
        • Extensible controlled vocabularies
          • Languages – linked to Ethnologue and AIATSIS codes
          • Locations – geographical metadata
          • Activities/tasks – both locally and globally defined
        • User administration
          • Access (personal metadata)
          • Roles (fieldworker, administrator …)
        • Project administration
          • Fieldworker activity
    15. Conclusion
      • Feature of note is complete online and offline operation
      • Research methodology is indicative of many field linguistics projects
      • Available for other interested parties to build on and extend
      • http://www.cs.mu.oz.au/research/lt/projects/acla-db
    16. Acknowledgements
      • The research reported here is supported by the Australian Research Council Discovery Project Grant DP0343189.

    Paper at LREC2004 (May 2004, Lisbon)

    © All Rights Reserved
