Making Inter-operability Visible


Published in: Technology
  1. 1. Making Inter-operability Visible Visualising Interoperability: ARH, Aggregation, Rationalisation and Harmonisation Michael Currie, Meigan Geileskey, Liddy Nevile, Richard Woodman A Project in the Victorian Department of Premier & Cabinet
  2. 2. Summary <ul><li>Introduction </li></ul><ul><ul><li>disclaimer, context </li></ul></ul><ul><ul><li>purpose, process </li></ul></ul><ul><li>The Case Study </li></ul><ul><li>Some after-thoughts... </li></ul>
  3. 3. Disclaimer <ul><li>Just what it is … </li></ul><ul><li>Using information management to teach maths </li></ul><ul><li>Not a scientific paper ... </li></ul><ul><li>Information scientists, systems architects, administrators, DC worshippers, ….. </li></ul>
  4. 4. Context <ul><li>Dept PC developing Intranet </li></ul><ul><li>Dept SD developing VOG (brochure-net) </li></ul><ul><li>Most depts developing document management systems </li></ul><ul><li>departmental agencies have (sometimes) implemented, more or less, the DC-based Australian Government Locator Service (AGLS) standard </li></ul><ul><li>the Government needs discovery for all resources and information interoperability </li></ul>
  5. 5. Context (2) <ul><li>a visualisation of interoperability to assist real-world deployment of metadata </li></ul><ul><li>a few metadata records, but the future of the Whole-of-Government Intranet and all departments of government </li></ul><ul><li>my role </li></ul>
  6. 6. Purpose of Project <ul><li>With minimal interference </li></ul><ul><li>To accommodate work on extranet </li></ul><ul><li>Influence new document management systems to prepare for intranet </li></ul><ul><li>To maintain local specificity and global discovery </li></ul><ul><li>To increase interest and effort. </li></ul>
  7. 7. Previous Process <ul><li>Departmental mandate to manage information </li></ul><ul><li>Monthly meetings of those interested in metadata from various departments </li></ul><ul><li>Only collaboration, cooperation, … </li></ul><ul><li>Weak government “policy”. </li></ul>
  8. 8. Case Study Process <ul><li>Literature review - </li></ul><ul><ul><li>what is inter-operability </li></ul></ul><ul><li>Metadata review </li></ul><ul><ul><li>Emergent phenomena </li></ul></ul><ul><li>Bricolage development </li></ul><ul><ul><li>Aggregation, rationalisation, harmonisation </li></ul></ul><ul><li>Registry, repository, WOG search... </li></ul>
  9. 9. Interoperating <ul><li>- an action when one tries to mix and match across domains </li></ul><ul><ul><li>Agencies use many different strategies to make this happen…. </li></ul></ul>
  10. 10. Interoperability <ul><li>- a state where there is interchange and exchange without difficulty </li></ul><ul><ul><li>Government agencies need complete or absolute interoperability - every document must be discoverable </li></ul></ul>
  11. 11. Mapping for interoperation
  12. 12. What is good enough? <ul><li>Moderate inter-operability is good enough for information re-use </li></ul><ul><ul><li>Departments like to maintain independence, operate in ‘silos’ </li></ul></ul><ul><li>Only perfect is good enough for Parliament </li></ul><ul><ul><li>public accountability is forcing the issue </li></ul></ul>
  13. 13. Aggregation <ul><li>First list </li></ul><ul><ul><li>who has metadata, for what </li></ul></ul><ul><li>Second list - </li></ul><ul><ul><li>who uses what elements </li></ul></ul><ul><li>List all elements currently used </li></ul><ul><ul><li>is there a significant difference </li></ul></ul>
  14. 14. Record review Live record
  15. 15. List review
  16. 16. Spreadsheet review Live spreadsheet
  17. 17. Aggregation - review <ul><li>Looking at the list, are there problems? </li></ul><ul><ul><li>It’s way too big </li></ul></ul><ul><ul><li>Too many things in the list </li></ul></ul><ul><ul><ul><li>Variations in application profiles, errors, and variation in use of elements, formats, etc... </li></ul></ul></ul><ul><ul><li>Not all metadata is useful for discovery </li></ul></ul><ul><ul><ul><li>eg. random use of ‘DPC’ vs ‘Department of Premier and Cabinet’ </li></ul></ul></ul><ul><ul><ul><li>searchers may miss some documents. </li></ul></ul></ul>
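The two aggregation lists (who has metadata, and who uses which elements) can be sketched as a simple tally. This is a minimal illustration only: the agency names, records and the `aggregate_elements` helper are hypothetical, not taken from the project's spreadsheet.

```python
from collections import defaultdict

# Hypothetical sample data: agency name -> list of metadata records.
# Each record maps an element name (as the agency wrote it) to a value.
records_by_agency = {
    "DPC": [{"DC.Title": "Annual report", "DC.Creator": "DPC"}],
    "DoJ": [{"TITLE": "Court services", "DC.Date": "2002"}],
}


def aggregate_elements(records_by_agency):
    """Return element name -> sorted list of agencies that use it."""
    usage = defaultdict(set)
    for agency, records in records_by_agency.items():
        for record in records:
            for element in record:
                usage[element].add(agency)
    return {element: sorted(agencies) for element, agencies in usage.items()}


element_usage = aggregate_elements(records_by_agency)
for element in sorted(element_usage):
    print(element, element_usage[element])
```

Because names are tallied exactly as written, `DC.Title` and `TITLE` appear as separate entries, which is the kind of list growth the review slide complains about.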
  18. 18. Emergent goals - 1 <ul><ul><li>Illuminate limitations in inter-operability resulting from existing metadata practices </li></ul></ul><ul><ul><ul><li>Articulate the cause of the problems </li></ul></ul></ul><ul><ul><ul><li>Develop a shared strategy for improving the inter-operability </li></ul></ul></ul>
  19. 19. Emergent goals - 2 <ul><li>Encourage data managers to develop a single, comprehensive metadata application profile </li></ul><ul><ul><li>derived from the current requirements and foci of all users </li></ul></ul><ul><ul><li>with high local specificity and deep interoperability. </li></ul></ul>
  20. 20. Analysis <ul><li>Look at the list of elements and decide which have material differences </li></ul><ul><li>**Remembering that each agency values all its metadata content </li></ul>
  21. 21. Material differences <ul><li>Misuse of available elements, qualifiers, etc. </li></ul><ul><li>Different expression of the same type of information </li></ul><ul><li>Different granularity </li></ul><ul><li>Different element name for the same information, … </li></ul><ul><li>-> need to rationalise </li></ul>
  22. 22. Element Name Variants <ul><li>Inconsistent case </li></ul><ul><ul><li>eg. DC.Title/TITLE/title EDNA.Userlevel/UserLevel </li></ul></ul><ul><li>Non-standard names eg. DC.Keywords </li></ul><ul><li>Non-standard qualifiers </li></ul><ul><ul><li>eg. DC.Description.Abstract </li></ul></ul><ul><li>Non-standard abbreviations </li></ul><ul><ul><li>eg. DC.Lang </li></ul></ul>
  23. 23. Field Selection <ul><li>Standard and non-standard element names </li></ul><ul><ul><li>eg. 'description' and DC.Description </li></ul></ul><ul><li>Locally created element names </li></ul><ul><ul><li>eg. Custodian </li></ul></ul>
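The name variants on the two slides above could be folded onto canonical element names with a lookup table. The sketch below is an assumption-laden illustration: the canonical list is a small subset, and the alias mappings (for example `DC.Keywords` to `DC.Subject`) are choices that agencies would have to agree on, not decisions recorded in the slides.

```python
# Canonical names keyed by their lower-cased form (illustrative subset only).
CANONICAL = {
    name.lower(): name
    for name in ("DC.Title", "DC.Subject", "DC.Description", "DC.Language")
}

# Assumed alias table; every entry is a hypothetical harmonisation decision.
ALIASES = {
    "title": "DC.Title",
    "description": "DC.Description",
    "dc.keywords": "DC.Subject",
    "dc.lang": "DC.Language",
}


def normalise_element(name):
    """Return (element_name, known).

    Case variants and listed aliases map to a canonical name with known=True.
    Anything else (e.g. a locally created element) is passed through
    unchanged with known=False so it can be reviewed by hand.
    """
    key = name.strip().lower()
    if key in CANONICAL:
        return CANONICAL[key], True
    if key in ALIASES:
        return ALIASES[key], True
    return name.strip(), False
```

Locally created names such as `Custodian` are deliberately not guessed at; they come back flagged as unknown, in keeping with the point that each agency values all its metadata content.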
  24. 24. Value string Variants <ul><li>Despite DCMES recommendations … </li></ul><ul><li>DC.Identifier: </li></ul><ul><ul><li>other id numbers without qualifiers. </li></ul></ul><ul><li>DC.Date: </li></ul><ul><ul><li>also used yyyy, yyyy/m/d, yyyy-dd-mm </li></ul></ul><ul><li>DC.Format: </li></ul><ul><ul><li>Non-standard terms eg. VHS (PAL) </li></ul></ul><ul><ul><li>Incorrect case eg. text/HTML </li></ul></ul>
  25. 25. <ul><li>DC.Language: also used en, en-au, en-AU </li></ul><ul><li>Qualifiers embedded in values: </li></ul><ul><ul><li>DC.Publisher CONTENT="corporateName=State..." </li></ul></ul><ul><li>Non-standard proper names </li></ul><ul><ul><li>DPC for Department of Premier and Cabinet </li></ul></ul><ul><li>->Generally inconsistent use of capitalisation and punctuation </li></ul>
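Some of the value-string variants above can be cleaned mechanically, while others can only be flagged. The following is a rough sketch with hypothetical function names. It assumes the target forms are yyyy or yyyy-mm-dd for dates, lower-case media types for DC.Format, and a lower-case language code with an upper-case region for DC.Language.

```python
import re


def normalise_date(value):
    """Return a yyyy or yyyy-mm-dd string, or None if the value needs review.

    Only the field ranges are checked (month 1-12, day 1-31), not whether
    the day actually exists in that month.
    """
    value = value.strip()
    if re.fullmatch(r"\d{4}", value):
        return value
    match = re.fullmatch(r"(\d{4})/(\d{1,2})/(\d{1,2})", value)
    if match:
        year, month, day = (int(part) for part in match.groups())
        if 1 <= month <= 12 and 1 <= day <= 31:
            return f"{year:04d}-{month:02d}-{day:02d}"
        return None
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if match:
        year, first, second = (int(part) for part in match.groups())
        if 12 < first <= 31 and 1 <= second <= 12:
            # First field cannot be a month, so this is yyyy-dd-mm.
            return f"{year:04d}-{second:02d}-{first:02d}"
        if 1 <= first <= 12 and 12 < second <= 31:
            # Second field cannot be a month, so this is already yyyy-mm-dd.
            return value
        # Both fields could be a month: the order is ambiguous.
        return None
    return None


def normalise_format(value):
    """Lower-case a media type such as text/HTML."""
    return value.strip().lower()


def normalise_language(value):
    """Lower-case the language code and upper-case a two-letter region."""
    parts = value.strip().split("-")
    parts[0] = parts[0].lower()
    if len(parts) > 1 and len(parts[1]) == 2:
        parts[1] = parts[1].upper()
    return "-".join(parts)
```

A value like `2002-03-04` is returned as `None` on purpose: once yyyy-dd-mm is known to be in use, the intended order cannot be recovered from the string alone and has to be settled with the owning agency. Non-standard terms such as `VHS (PAL)` and embedded qualifiers are likewise left for human harmonisation.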
  26. 26. Observation <ul><li>Most element variants due to non-standard use of capitals, punctuation, spelling </li></ul><ul><li>Users seem to act independently of Application Profiles </li></ul><ul><li>Little use of collection specific qualifiers to enhance specificity </li></ul>
  27. 27. Rationalise! <ul><li>Add qualifiers to each type of element? </li></ul><ul><ul><ul><li>no fewer elements, but significantly increased semantic interoperability </li></ul></ul></ul><ul><li>Dumb-down for interoperation? </li></ul><ul><ul><ul><li>Loss of precision: searches return too many documents, yet not everything is found </li></ul></ul></ul><ul><li>Cross-map ontologies, one to another </li></ul><ul><ul><li>Too cumbersome </li></ul></ul><ul><li>Map everything to a new ontology </li></ul><ul><ul><ul><li>Blanchi, ‘Harmony’, etc. </li></ul></ul></ul>
  28. 28. Rationalise (a process?) <ul><li>Choose strategy </li></ul><ul><li>Decide what is disposable/worth saving given chosen strategy </li></ul><ul><li>Delete disposable elements </li></ul><ul><li>-> The list gets shorter, semantic inter-operability is improved. </li></ul>
  29. 29. Harmonisation <ul><li>Work together to choose appropriate formats and definitions for elements and qualifiers </li></ul><ul><li>**Remembering that, locally, departments will want more and less precision </li></ul>
  30. 30. Harmonisation (process?) <ul><li>Eg </li></ul><ul><li>in - maybe just a different format so it’s obvious what to do </li></ul><ul><li>in DC.subject - some use AGIFT, some use SCIS, some use NRE’s geo-spatial thesaurus, etc. - consensus work to be done </li></ul>
  31. 31. Results from ARH - HA? <ul><li>DPC Project is working towards an Intranet, harmonising the application profiles </li></ul>
  32. 32. Supporting granularity - technologies <ul><li>Registries - lead to a well-defined full WOG ontology supporting evolving granularity </li></ul><ul><li>Federated Metadata repositories preserve local control and remote interoperability </li></ul>
  33. 33. Key service providers - VOG VOG DoJ DoI DNRE
  34. 34. Key service providers - WOG WOG AP VOG AP
  35. 35. The grammar of DC metadata
  36. 36. The grammar of DC metadata <ul><li>Resource has property (subject): </li></ul><ul><ul><li>‘about land’ </li></ul></ul><ul><ul><li>‘about rural land’ </li></ul></ul><ul><ul><li>‘about Victorian rural land’ </li></ul></ul><ul><ul><li>‘about Victorian rural land in section 43’ </li></ul></ul><ul><ul><li>‘about Victorian rural land in section 43 in January 2002’ </li></ul></ul>
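One way to read the ladder of subject statements above is as a base topic plus an ordered list of refinements, where a general query should still find the more specific records. The sketch below is an interpretation only: the `SubjectStatement` class and the way the qualifiers are split are hypothetical, not part of DC or AGLS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectStatement:
    """A subject topic with ordered refinements (hypothetical model)."""

    topic: str
    qualifiers: tuple = ()

    def dumb_down(self, depth=0):
        """Keep only the first `depth` refinements."""
        return SubjectStatement(self.topic, self.qualifiers[:depth])

    def matches(self, query):
        """True when this statement is at least as specific as the query."""
        return (
            self.topic == query.topic
            and self.qualifiers[: len(query.qualifiers)] == query.qualifiers
        )


full = SubjectStatement(
    "land", ("rural", "Victorian", "section 43", "January 2002")
)
general_query = SubjectStatement("land", ("rural",))

print(full.matches(general_query))  # the detailed record answers a general query
print(full.dumb_down())             # the same record reduced to 'about land'
```

The design choice here is that refinement only ever adds to the end of the statement, so a record kept at full local precision can always be reduced to a coarser form for whole-of-government discovery without a separate mapping.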
  37. 37. Does your resource speak Dublin Core (AGLS)? <ul><li>“… Pidgins are inherently limited in what they can express, but they are easy to learn and enormously useful. In real life, we talk one way to our professional colleagues and another way to visitors from other cultures. Our digital library applications need to do this as well. Simplicity and complexity are both appropriate, depending on context. If Dublin Core is too simple or generic to use as the native idiom of a particular application, pidgin statements may be extracted or translated from richer idioms that exist for specialized domains. This output should also be filtered to keep the fifteen buckets clear of encoding debris and semantic silt. One should treat digital tourists with courtesy and hide from them the complexities of a local application vocabulary or grammar. However sophisticated its local idiom may be, an application might also speak a pidgin that general users and generic search engines will understand. Simple, semantically clean, computationally obvious values will help us negotiate our way through a splendidly diverse and heterogeneous Internet.” (Tom Baker, </li></ul>
  38. 38. Pidgin vs Symbolic Languages <ul><li>Pidgin languages serve for good enough communication </li></ul><ul><li>Symbolic languages serve for complete communication in abbreviated form </li></ul><ul><ul><li>structure </li></ul></ul><ul><ul><li>dictionaries </li></ul></ul>
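The idea of extracting pidgin statements from a richer local idiom can be sketched as stripping qualifiers from element names so that every value lands in its base element. This is a minimal illustration under assumptions: the helper names are hypothetical, and element names are assumed to follow the dotted `DC.Element.Qualifier` pattern seen in the earlier slides.

```python
def dumb_down_element(name):
    """Drop any qualifier: 'DC.Description.Abstract' -> 'DC.Description'."""
    parts = name.split(".")
    return ".".join(parts[:2]) if len(parts) > 2 else name


def dumb_down_record(record):
    """Return base element name -> list of values, in original order."""
    simple = {}
    for name, value in record.items():
        simple.setdefault(dumb_down_element(name), []).append(value)
    return simple


# Hypothetical richer local record.
local_record = {
    "DC.Title": "Rural land report",
    "DC.Description.Abstract": "Summary of rural land use.",
    "DC.Description": "Prepared for the Intranet project.",
}
simple_record = dumb_down_record(local_record)
print(simple_record)
```

The local record keeps its full precision; only the exported copy is simplified, which matches the aim of maintaining local specificity alongside global discovery.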