Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project

Paper at LREC2004 (May 2004, Lisbon)

  1. Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
     Baden Hughes (1), David Penton (1), Steven Bird (1), Catherine Bow (1), Gillian Wigglesworth (1), Patrick McConvell (2) and Jane Simpson (3)
     (1) University of Melbourne, (2) AIATSIS, (3) University of Sydney
  2. Overview
     • Introduction
     • Requirements
     • Data Model
     • Implementation
       • Data Entry
       • Reports, Queries and Searches
       • Exports
       • Synchronisation
       • Administration
     • Conclusion
  3. Introduction
     • A metadata creation and management tool for a multiple fieldworker, longitudinal, child language acquisition research project
     • Addressing the need for principled metadata creation as well as best practice data creation
     • Challenging deployment scenario which is typical of numerous field-oriented linguistic research and language data collection projects
  4. Requirements
     • Data Management
       • Metadata for complex multimodal data
       • Relational data for participants
       • Delineation between participant roles
       • Not just collection, but reports and queries
     • Research Methodology
       • Integration with tool of choice for analysis
       • Two-stage enquiry process: metadata, then data
       • Extensible controlled vocabularies
       • User-defined fields (particularly lists)
     • Technology
       • Full support for data entry and enquiry in both online and offline modes
       • Metadata collection with maximum utility to the project without precluding other renderings, e.g. as an OLAC or IMDI catalogue
       • Easy to install and use on multiple platforms
  5. Data Model
     • Tools for modelling
       • DBDesigner (open source, XML based, multi-platform)
     • Challenges for modelling
       • Multiple interlinked media, sessions, and transcripts
       • Differentiating between participants and focus children in multiple contexts
       • Incomplete personal data, e.g. no date of birth
       • Non-linear progression through the educational system
       • Multiple types of anthropological relations
       • Non-standardised linguistic classification and nomenclature
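
Two of the modelling challenges above (incomplete personal data, and a participant who is a focus child in one context but not in another) can be illustrated with a minimal sketch. The class and field names below are hypothetical and are not taken from the ACLA schema; the sketch only assumes that the date of birth is optional and that a role is attached to a participant per session rather than to the participant record itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    # Hypothetical record: date_of_birth may be unknown in the field.
    participant_id: int
    name: str
    date_of_birth: Optional[date] = None


@dataclass
class SessionRole:
    # A role belongs to a (session, participant) pair, so the same person
    # can be a focus child in one session and a bystander in another.
    session_id: str
    participant_id: int
    role: str  # e.g. "focus_child", "caregiver", "sibling" (assumed labels)


def age_in_years(participant: Participant, on: date) -> Optional[int]:
    """Return the age in whole years, or None when the DOB is missing."""
    dob = participant.date_of_birth
    if dob is None:
        return None
    # Subtract one year if the birthday has not yet occurred in that year.
    return on.year - dob.year - ((on.month, on.day) < (dob.month, dob.day))


def focus_children(roles: List[SessionRole], session_id: str) -> List[int]:
    """List the participant ids recorded as focus children in one session."""
    return [
        r.participant_id
        for r in roles
        if r.session_id == session_id and r.role == "focus_child"
    ]
```

Keeping the role on the link record rather than on the participant is one common relational way to handle "multiple contexts"; the actual project may model this differently.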
  6. Implementation
     • Architecture
       • (Fully independent) networked client-server
       • Single line of code difference between client and server installation
       • Underlying requirement to provide full functionality in both online and offline environments
     • Technology Platform
       • PHP scripting language, with PEAR
       • MySQL database engine
       • Apache HTTP server
       • Fundamentally open source, cross-platform
  7. Data Entry (slide 8)
     • Forms-based data entry
       • Participant Form
       • Session Form
     • Both forms feature a “build your own list” interface, which lets the end user construct a list of parameters and then apply instances of those parameters within the parent form
       • educational progress
       • session-media-transcript
  8. Reports, Queries and Searches (slide 9)
     • Simple Reports
       • for frequently used two-dimensional queries
         • e.g. participants by fieldworker
         • e.g. participants by gender
     • Advanced Reports
       • design-your-own-query interface
     • Full Text Query
       • Boolean support
       • full database index query
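
A simple report such as "participants by gender" amounts to grouping and counting over one column. The sketch below is illustrative only: it uses Python's built-in `sqlite3` module as a stand-in for the MySQL engine named in the slides, and the `participant` table and its `gender` and `fieldworker` columns are assumed names, not the project's schema.

```python
import sqlite3
from typing import List, Tuple

# Columns a simple report may group by (assumed names). Checking against a
# whitelist keeps the column name safe to interpolate into the SQL text.
REPORT_COLUMNS = {"gender", "fieldworker"}


def participants_by(conn: sqlite3.Connection, column: str) -> List[Tuple[str, int]]:
    """Count participants grouped by one whitelisted column."""
    if column not in REPORT_COLUMNS:
        raise ValueError(f"unsupported report column: {column}")
    query = (
        f"SELECT {column}, COUNT(*) FROM participant "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()
```

Calling `participants_by(conn, "fieldworker")` on the same table would give the other example report from the slide.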
  9. Exports (slide 11)
     • Generate headers for CLAN
       • e.g. @participants
     • Generate Physical Media Labels
       • e.g. FM025.A.DV, FM025.A.MD
     • Generate File Names for Transcriptions
       • e.g. DEV00012004049.trn
     • XML-based database dump
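
As a rough illustration of the first two exports, the sketch below builds a CHAT-style `@Participants` header line of the kind CLAN reads, and a dotted media label shaped like `FM025.A.DV`. The function names are hypothetical, and reading the label as "session code, part, media type" is an assumption made only for this example; the slides do not define the labelling scheme, and the transcription file name scheme is left out for the same reason.

```python
from typing import Iterable, Tuple


def participants_header(participants: Iterable[Tuple[str, str, str]]) -> str:
    """Build a CHAT-style @Participants line.

    Each participant is (speaker_code, name, role); the name may be empty,
    in which case only the code and the role are written.
    """
    entries = []
    for code, name, role in participants:
        parts = [code, name, role] if name else [code, role]
        entries.append(" ".join(parts))
    # The header keyword is separated from its value by a tab.
    return "@Participants:\t" + ", ".join(entries)


def media_label(session_code: str, part: str, media_type: str) -> str:
    """Join three components with dots (assumed structure of the label)."""
    return ".".join([session_code, part, media_type])
```

For example, `media_label("FM025", "A", "MD")` reproduces the second label shown on the slide.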
  10. Synchronisation (slide 12)
      • Client -> Server
        • SQL query identifies all data changed since the last sync
        • Export and serialise as XML
        • Compress, checksum
        • Transfer over HTTP
        • Checksum, uncompress
        • Convert XML back to SQL
        • Import SQL into database
      • Server -> Client is this process in reverse
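
The client-to-server steps above can be sketched end to end in a few functions. This is a minimal illustration under stated assumptions, not the project's code: it uses `sqlite3` in place of MySQL, a single hypothetical `participant` table with a `modified` timestamp column stored as ISO 8601 text, gzip for compression and SHA-256 for the checksum. The HTTP transfer itself is omitted; the payload and checksum returned by `pack` are what would be sent.

```python
import gzip
import hashlib
import sqlite3
import xml.etree.ElementTree as ET
from typing import Tuple


def export_changes(conn: sqlite3.Connection, last_sync: str) -> bytes:
    """Select rows modified after last_sync and serialise them as XML."""
    root = ET.Element("changes")
    rows = conn.execute(
        "SELECT id, name, modified FROM participant "
        "WHERE modified > ? ORDER BY id",
        (last_sync,),
    )
    for row_id, name, modified in rows:
        ET.SubElement(
            root, "participant", id=str(row_id), name=name, modified=modified
        )
    return ET.tostring(root, encoding="utf-8")


def pack(xml_bytes: bytes) -> Tuple[bytes, str]:
    """Compress the XML and compute a checksum over the compressed bytes."""
    payload = gzip.compress(xml_bytes)
    return payload, hashlib.sha256(payload).hexdigest()


def unpack(payload: bytes, checksum: str) -> bytes:
    """Verify the checksum first, then uncompress."""
    if hashlib.sha256(payload).hexdigest() != checksum:
        raise ValueError("checksum mismatch: payload corrupted in transfer")
    return gzip.decompress(payload)


def import_changes(conn: sqlite3.Connection, xml_bytes: bytes) -> int:
    """Turn the XML back into SQL statements and apply them."""
    root = ET.fromstring(xml_bytes)
    for element in root.iter("participant"):
        conn.execute(
            "INSERT OR REPLACE INTO participant (id, name, modified) "
            "VALUES (?, ?, ?)",
            (int(element.get("id")), element.get("name"), element.get("modified")),
        )
    conn.commit()
    return len(root)  # number of rows carried in this sync
```

Running the same four functions with the two connections swapped gives the server-to-client direction. A real deployment would also need a policy for rows edited on both sides since the last sync, which the slides do not describe.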
  11. Administration (slide 14)
      • User-facilitated editing of
        • System data
          • Synchronisation: server settings
        • Extensible controlled vocabularies
          • Languages: linked to Ethnologue and AIATSIS codes
          • Locations: geographical metadata
          • Activities/tasks: both locally and globally defined
        • User administration
          • Access (personal metadata)
          • Roles (fieldworker, administrator …)
        • Project administration
          • Fieldworker activity
  12. Conclusion (slide 15)
      • A notable feature is complete online and offline operation
      • The research methodology is representative of many field linguistics projects
      • Available for other interested parties to build on and extend
      • http://www.cs.mu.oz.au/research/lt/projects/acla-db
  13. Acknowledgements (slide 16)
      • The research reported here is supported by the Australian Research Council Discovery Project Grant DP0343189.
